GeoPandas is great python library to process your vector data. It has direct support for postgis database where you can directly read/write from/to postgis database. GeoPandas internally uses SqlAlchemy API to manage database specific operations like reading from PostGIS, writing to PostGIS etc. SqlAlchemy can be used for almost all the databases out there like PostgreSQL, MySQL, Oracle Database, SQL Server.
GeoPandas provides read_postgis and to_postgis methods to read/write from/to PostGIS database. These methods take SqlAlchemy connection object to read/write from/to database. These methods provides very simple, elegant method to read and write from/to PostGIS database.
Psycopg2 is used only with PostgreSQL database. GeoPandas read_postgis method also works with Psycopg2 connection object. But to_postgis doesn’t work with Psycopg2 connection object.
GeoPandas to_postgis don’t support PsycoPG2. We have created multiple methods to support this functionality and are self explanatory. See code snippet below. We will talk mainly about to_postgis_using_psycopg2 method defined at line number 59 at below code snippet.
We will focus mainly at line number 74 where we are converting GeoDataframe geometries to WKB(Web Known Binary text) hexadecimal string using to_wkb method of PYGEOS module.
WKB hexadecimal string is a type of encoding that PostGIS use to store geometrical data.
By converting geometries to WKB hexadecimal string, we are making easy for postgis to insert the database from GeoDataFrame. Once we have converted the geometries to WKB hexadecimal string, we are converting the GeoDatafame to tuple and then finally inserting it to Database. See line number 81.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 |
import geopandas as gpd import psycopg2 import psycopg2.extras as extras from pygeos import to_wkb, set_srid # Create connection def create_connection(credentials): conn = psycopg2.connect(**credentials) cur = conn.cursor() return [conn, cur] # Close connection def close_connection(conn, cur): conn.close() cur.close() def does_table_exist(cur, name, schema): """ Check if table exists in the database. """ sql = f""" SELECT * from pg_tables where schemaname = '{schema}' and tablename = '{name}' """ cur.execute(sql) status = cur.fetchone() if status is None: return False else: return True def createTableFromGDF(gdf, conn, cur, name, schema, geom_name): """ Create table based on geopandas dataframe. """ if_exists = does_table_exist(cur, name, schema) if if_exists is False: srid = gdf.crs.to_epsg() table_name = schema + "." + name # Create a table query for geopandas file based on the columns list create_table_query = 'GID SERIAL PRIMARY KEY' for column in gdf.columns: column_type = str(gdf.dtypes[column]) if column == geom_name: create_table_query += f', {geom_name} GEOMETRY(GEOMETRY, {srid})' elif column_type.find('int') != -1: create_table_query += ', ' + column + ' INTEGER' elif column_type.find('float') != -1: create_table_query += ', ' + column + ' NUMERIC' else: create_table_query += ', ' + column + ' TEXT' create_table_query = 'CREATE TABLE ' + table_name + '(' + create_table_query + ')' cur.execute(create_table_query) conn.commit() def to_postgis_using_psycopg2(gdf, conn, cur, name, schema = "public", geom_name="geom"): """ Using psycopg2 to export geopandas to postgis database. pygeos method to_wkb is used to convert geometries to wkb(well known binary) format. hex=true will return Hexadecimal string of the wkb which can be stored in postgis geometry column. set_srid is used to set the srid of the geometries. """ if geom_name not in gdf.columns: gdf = gdf.rename(columns={gdf.geometry.name: geom_name}).set_geometry(geom_name, crs=gdf.crs) createTableFromGDF(gdf, conn, cur, name, schema, geom_name) srid = gdf.crs.to_epsg() #convert geom to wkb hex string geom = to_wkb( set_srid(gdf[geom_name].values.data, srid=srid), hex=True, include_srid=True ) gdf[geom_name] = geom tuples = [tuple(x) for x in gdf.to_numpy()] cols = ','.join(list(gdf.columns)) query = "INSERT INTO %s(%s) VALUES %%s" % (name, cols) extras.execute_values(cur, query, tuples) conn.commit() if __name__ == "__main__": credentials = {"user":"postgres", "password":"admin", "host":"localhost", "port":"5432", "database":"postgres"} [conn, cur] = create_connection(credentials) counties = gpd.read_file("shps/germany_counties.shp") name = "counties" schema = "public" to_postgis_using_psycopg2(counties, conn, cur, name, schema, geom_name="geom") close_connection(conn, cur) |
I hope this tutorial will create a good foundation for you. If you want tutorials on another GIS topic or you have any queries, please send an email at contact@spatial-dev.guru.
Pingback: Building a Custom Geocoding Service with Autocomplete using Python, PostGIS, and OpenLayers for Address Lookup - Spatial Dev Guru