# Spatial Analysis of Population Shifts: A Raster-based Exploration

In this tutorial, we’ll walk through a geospatial analysis of population shifts using raster data. We’ll be comparing census data from two years, 2006 and 2021, to identify significant changes in population.

### Why Rasterize the 1KM Grid Census Data?

When working with geospatial data, especially for analysis over large areas or with many polygons, operations can become computationally expensive and time-consuming. The census data we have is in the form of 1KM grid polygons. While this format is detailed and precise, performing operations like overlays, intersections, and calculations on such polygon data can be slow.

By converting this polygon data into raster format, we can:

• Speed up computations: Raster operations, especially on regular grids, can be much faster than equivalent vector operations.
• Simplify data: Rasters represent data as a matrix of cells, which can be more straightforward to work with for certain analyses.
• Facilitate integration: If we want to combine our census data with other raster datasets (like satellite imagery or elevation data), having everything in raster format can make the process smoother.

However, it’s important to note that raster analysis is not a one-size-fits-all solution. While it offers advantages in certain scenarios, there are cases where vector analysis might be more appropriate, especially when dealing with intricate spatial details or when precise boundaries are essential.

### Prerequisites:

• Basic knowledge of Python and geospatial analysis.
• Libraries: `geopandas`, `pandas`, `geocube`, `xarray`, and `numpy`.

### Steps:

#### 1. Downloading the Data:

Before diving into the analysis, make sure you have downloaded the data from https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/population-distribution-demography/geostat. For this tutorial, the data has already been downloaded and stored in a GitHub repo.
It consists of 1KM grid census polygons for Slovakia for the years 2006 and 2021.

#### 2. Rasterizing Vector Data:

We’ll rasterize the vector data using the `make_geocube` function from the `geocube` library. This converts each year's polygons into a regular grid of cells, making the two years easier to compare.

#### 3. Aligning Rasters:

Since the rasters from 2006 and 2021 might have different dimensions, we need to align them.
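One possible way to do this with `xarray` is `xr.align`, sketched below on two small synthetic rasters. The coordinate values and the choice of `join="outer"` (keep the union of both extents) are assumptions for the example; the original analysis may align the grids differently.

```python
import numpy as np
import xarray as xr

# Two synthetic rasters with different extents.
# Coordinates are cell centers in meters (illustrative values).
pop_2006 = xr.DataArray(
    np.array([[10.0, 20.0], [30.0, 40.0]]),
    coords={"y": [1500.0, 500.0], "x": [500.0, 1500.0]},
    dims=("y", "x"),
)
pop_2021 = xr.DataArray(
    np.array([[15.0, 25.0, 5.0], [35.0, 30.0, 8.0]]),
    coords={"y": [1500.0, 500.0], "x": [500.0, 1500.0, 2500.0]},
    dims=("y", "x"),
)

# join="outer" keeps every cell present in either year;
# cells missing from one year are filled with NaN.
pop_2006_aligned, pop_2021_aligned = xr.align(pop_2006, pop_2021, join="outer")

print(pop_2006_aligned.shape, pop_2021_aligned.shape)
```

After alignment both arrays share the same coordinates, so cell-by-cell arithmetic between the two years is well defined.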

#### 4. Calculating Relative Change:

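Subtract the 2006 raster from the 2021 raster to get the change in population for each cell relative to the 2006 baseline. Note that this is a difference in population counts, not a percentage.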
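A minimal sketch of the subtraction on plain NumPy arrays follows. Treating cells with no record in one year as having zero population is an assumption made here so that the difference is defined everywhere; whether that is appropriate depends on how the source data encodes empty cells.

```python
import numpy as np

# Aligned synthetic rasters; NaN marks a cell with no record in that year
pop_2006 = np.array([[10.0, 20.0], [30.0, np.nan]])
pop_2021 = np.array([[15.0, 18.0], [30.0, 12.0]])

# Assumption: a missing cell is treated as unpopulated (0)
pop_2006_filled = np.nan_to_num(pop_2006, nan=0.0)
pop_2021_filled = np.nan_to_num(pop_2021, nan=0.0)

# Positive values are gains since 2006, negative values are losses
change = pop_2021_filled - pop_2006_filled
print(change)
```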

#### 5. Exporting data to raster:

The rasters for 2021, 2006, and the relative difference are saved as TIFF files for future use or visualization.

#### 6. Identifying Top Changes:

We want to identify the areas with the most significant population changes, so this step extracts the 5 cells with the highest relative population change.
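The selection can be sketched with NumPy by ranking the flattened raster and converting the winning positions back to row and column indices. The array values below are made up for illustration.

```python
import numpy as np

# Synthetic change raster; NaN marks cells without data
change = np.array([
    [5.0, -2.0, 40.0],
    [np.nan, 12.0, 7.0],
    [100.0, 3.0, -50.0],
])

k = 5
flat = change.ravel()

# Replace NaN with -inf so that empty cells can never rank among the top values
ranked = np.where(np.isnan(flat), -np.inf, flat)

# Indices of the k largest values, in descending order
top_flat_idx = np.argsort(ranked)[::-1][:k]

# Convert flat indices back to (row, col) positions in the raster
rows, cols = np.unravel_index(top_flat_idx, change.shape)
top_values = change[rows, cols]

for r, c, v in zip(rows, cols, top_values):
    print(f"row={r}, col={c}, change={v}")
```

This ranks by the largest increases. To rank by magnitude regardless of sign, the same approach could be applied to `np.abs(change)`.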

#### 7. Extracting Centroid Coordinates:

For each of the top changes, we convert the raster pixel to its centroid coordinates. This gives us the central point of each grid cell that experienced significant change.
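For a north-up raster with square cells, the centroid follows directly from the pixel indices and the raster's top-left corner. The helper below is a hypothetical sketch of that arithmetic; `pixel_to_centroid`, `x_min`, and `y_max` are names introduced here, and the corner coordinates in the usage line are invented.

```python
def pixel_to_centroid(row, col, x_min, y_max, cell_size=1000.0):
    """Return the (x, y) center of a pixel in a north-up raster.

    x_min and y_max are the coordinates of the raster's top-left corner,
    and cell_size is the pixel width and height in CRS units (meters here).
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size  # rows count downward from the top edge
    return x, y


# Hypothetical top-left corner of the raster, in projected meters
print(pixel_to_centroid(row=2, col=1, x_min=4800000.0, y_max=3000000.0))
```

When the raster is held as an `xarray` object produced by `geocube`, the `x` and `y` coordinate labels typically already refer to cell centers, so in that case the centroids can usually be read straight from the coordinates of the selected cells.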

#### 8. Creating a DataFrame:

Organize the extracted data into a DataFrame.
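A small sketch of this step with `pandas` is shown below. The column names `x`, `y`, and `change`, and all values, are illustrative assumptions.

```python
import pandas as pd

# Illustrative records: centroid coordinates (meters) and population change per cell
records = [
    {"x": 4801500.0, "y": 2997500.0, "change": 40.0},
    {"x": 4800500.0, "y": 2999500.0, "change": 100.0},
    {"x": 4802500.0, "y": 2998500.0, "change": 12.0},
]

# One row per cell, ordered from the largest change to the smallest
top_changes = (
    pd.DataFrame(records)
    .sort_values("change", ascending=False)
    .reset_index(drop=True)
)
print(top_changes)
```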

#### 9. Reprojecting Data:

Spatial data can exist in various coordinate reference systems (CRS). To ensure compatibility with other datasets and tools, we convert our centroid coordinates to the commonly used EPSG:4326 projection, which expresses positions as longitude and latitude.
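The conversion can be sketched with `pyproj`, which is installed alongside `geopandas`. The source CRS is assumed here to be EPSG:3035 (ETRS89-LAEA); check the CRS of the actual data before reusing this.

```python
from pyproj import Transformer

# Assumption: the centroids are in EPSG:3035.
# always_xy=True fixes the axis order to (x, y) in and (lon, lat) out.
transformer = Transformer.from_crs("EPSG:3035", "EPSG:4326", always_xy=True)

# The false origin of EPSG:3035, which lies at roughly 10 degrees E, 52 degrees N
lon, lat = transformer.transform(4321000.0, 3210000.0)
print(lon, lat)
```

`transformer.transform` also accepts sequences or arrays of x and y values, so a whole column of centroids can be converted in one call.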

#### 10. Saving Results:

Save the results to a CSV file.
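Writing the table out is a single `pandas` call. In the sketch below, the file name `top_population_changes.csv`, the column names, and the values are placeholders.

```python
import pandas as pd

# Illustrative results: reprojected centroids and population change per cell
top_changes = pd.DataFrame(
    {
        "lon": [17.11, 21.26, 18.74],
        "lat": [48.15, 48.72, 49.22],
        "change": [100.0, 40.0, 12.0],
    }
)

# index=False keeps the row index out of the file
top_changes.to_csv("top_population_changes.csv", index=False)
```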