If you are new to GeoPandas, it is a geospatial Python library used for analyzing and processing location-based data.
In this tutorial, we will learn how to merge polygons in GeoPandas. By merging, we mean taking the union of polygons.
We will perform two kinds of merging:
- Union of all polygons
- Dissolving multiple polygons into one based on column value
Before merging multiple polygons into one, we will set up a conda environment and install the Python GIS packages required for this task. Make sure conda and Jupyter Notebook are installed on your system. Use the commands below to create a conda environment and install the Python libraries.
This tutorial includes some geospatial visualization, and we use Jupyter Notebook for it because some of the code is notebook-specific.
Along with GeoPandas, we will also install the PyGEOS library, which speeds up vectorized operations in GeoPandas. Note that PyGEOS has since been merged into Shapely 2.0, so recent GeoPandas releases get the same speedups without it.
```shell
(base) geoknight@pop-os:~$ conda create -n spatial-dev.guru python=3.10
(base) geoknight@pop-os:~$ conda activate spatial-dev.guru
(spatial-dev.guru) geoknight@pop-os:~$ conda install -c conda-forge geopandas
(spatial-dev.guru) geoknight@pop-os:~$ conda install -c conda-forge pygeos
```
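Before moving on, it can help to confirm that the environment works. The snippet below is a minimal sketch, not part of the original tutorial: it imports the libraries and prints their versions. The variable names are only illustrative.

```python
# Quick sanity check that the new environment can import the libraries.
import geopandas as gpd
import shapely

# Both packages expose their version as a string.
gpd_version = gpd.__version__
shapely_version = shapely.__version__

print("GeoPandas:", gpd_version)
print("Shapely:", shapely_version)
```

If either import fails, the conda environment is probably not activated in the session running the notebook.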
Once you have successfully installed the required libraries, we can merge the polygons.
The complete source code is given below:
```python
# import geopandas library
import geopandas as gpd

# Read shapefile
county_subs = gpd.read_file('files/CA_County_Sub/CA_County_Sub.shp').to_crs('EPSG:3857')

# Print columns
county_subs.columns

# complete union of polygons
county_subs_union = county_subs.unary_union
county_subs_union = gpd.GeoDataFrame(geometry=[county_subs_union], crs='EPSG:3857')
county_subs_union.explore(cmap="Set1", style_kwds=dict(color="black"))

# Dissolve polygons based on column value
county_subs_dissolve = county_subs.dissolve(by='COUNTYFP', as_index=False)
county_subs_dissolve.explore(column='COUNTYFP', cmap="Set1", style_kwds=dict(color="black"))
```
We will divide this tutorial into three parts:
- Load spatial dataset
- unary_union of dataset
- Dissolve polygons in dataset based on condition
1. Load Spatial Dataset
We will start by loading the spatial dataset, which contains the county subdivisions (sub-counties) of the US state of California. We import the geopandas library and then read the shapefile with the read_file method, passing it the path to the file. This returns a GeoDataFrame.
```python
import geopandas as gpd
county_subs = gpd.read_file('files/CA_County_Sub/CA_County_Sub.shp')
```
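If you do not have the shapefile at hand, you can follow along with a small in-memory stand-in. The sketch below is only an illustration: the squares, the `NAME` column and the `001` county code are made up, and only the `COUNTYFP` column name is taken from the tutorial's dataset.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the sub-county shapefile:
# four unit squares, two per county code.
toy = gpd.GeoDataFrame(
    {
        "COUNTYFP": ["001", "001", "093", "093"],
        "NAME": ["a", "b", "c", "d"],  # made-up attribute column
    },
    geometry=[
        box(0, 0, 1, 1),
        box(1, 0, 2, 1),
        box(5, 0, 6, 1),
        box(6, 0, 7, 1),
    ],
    crs="EPSG:3857",
)

# Like read_file, this gives a GeoDataFrame with attribute columns
# plus a geometry column and an attached CRS.
print(type(toy).__name__)
print(list(toy.columns))
print(toy.crs)
```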
To visualize the shapefile we just read, we use the explore method. This method automatically creates an interactive map and places the data on it.
```python
county_subs.explore(column="COUNTYFP", cmap="Set1", style_kwds=dict(color="black"))
```
2. unary_union of dataset
To perform a complete union of the spatial data in GeoPandas, we use the unary_union attribute as shown below. It returns a single geometry that is the union of all polygons.
```python
county_subs.unary_union
```
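To see what the union returns, here is a minimal sketch on hypothetical toy squares rather than the California data. It assumes you may be on GeoPandas 1.0 or later, where `union_all()` was introduced and `unary_union` was deprecated, so it falls back to `unary_union` only when the newer method is missing.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: two touching squares plus one isolated square.
squares = gpd.GeoSeries(
    [box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 0, 6, 1)],
    crs="EPSG:3857",
)

# Prefer union_all() where it exists (GeoPandas >= 1.0),
# otherwise use the older unary_union attribute.
if hasattr(squares, "union_all"):
    merged = squares.union_all()
else:
    merged = squares.unary_union

# The result is a bare shapely geometry, not a GeoDataFrame.
# Touching squares fuse into one polygon; the isolated square stays
# a separate part of the same multi-part geometry.
print(merged.geom_type)
print(len(merged.geoms))
print(merged.area)

# Wrap the geometry in a GeoDataFrame, as the complete listing does,
# to keep the CRS attached and use GeoPandas methods on it.
merged_gdf = gpd.GeoDataFrame(geometry=[merged], crs=squares.crs)
```

The union carries no attribute columns, because it collapses every row into a single shape.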
3. Dissolve polygons in dataset based on condition
In this step, we will union polygons based on a column value. In the GeoDataFrame below, the column “COUNTYFP” acts as a categorical column: it tells us which county each row belongs to. We will dissolve the polygons that belong to the same county into one.
As shown in the image below, multiple polygons belong to the same county. In this case, the polygons belonging to county number 093 are dissolved into one. In the same way, we will dissolve the polygons of every other county.
To dissolve polygons, we use the dissolve method and pass the column name through the by parameter. Then we use the explore method to visualize the result on a map. As you can see in the image below, the polygons belonging to each county are merged into one.
```python
county_subs_dissolve = county_subs.dissolve(by='COUNTYFP', as_index=False)
county_subs_dissolve.explore(column='COUNTYFP', cmap="Set1", style_kwds=dict(color="black"))
```
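The grouping behaviour is easier to check on a tiny example. The sketch below uses hypothetical toy squares, with a made-up `POP` column and a made-up `001` county code. It also passes `aggfunc="sum"` to show how the non-geometry columns can be combined; the tutorial's own call leaves `aggfunc` at its default, which keeps the first value of each group.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data; only the COUNTYFP column name comes from the tutorial.
toy = gpd.GeoDataFrame(
    {
        "COUNTYFP": ["001", "001", "093", "093"],
        "POP": [10, 20, 30, 40],  # made-up numeric attribute
    },
    geometry=[
        box(0, 0, 1, 1),
        box(1, 0, 2, 1),
        box(5, 0, 6, 1),
        box(6, 0, 7, 1),
    ],
    crs="EPSG:3857",
)

# One output row per distinct COUNTYFP: the geometries in each group are
# unioned, and aggfunc decides how the remaining columns are combined.
dissolved = toy.dissolve(by="COUNTYFP", as_index=False, aggfunc="sum")

print(dissolved[["COUNTYFP", "POP"]])
print(dissolved.geometry.area.tolist())
```

With `as_index=False`, `COUNTYFP` stays a regular column instead of becoming the index, so it can still be passed to `explore(column='COUNTYFP')`.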
I hope this tutorial gives you a good foundation. If you would like tutorials on another GIS topic or have any questions, please send an email to contact@spatial-dev.guru.