Extracting Data from OSM PBF Files Using the ogr2ogr Tool


Problem Background

OpenStreetMap (OSM) is a widely used open-data mapping project whose geographic data is distributed in several formats. One of the most common formats for downloading large datasets is the PBF (Protocolbuffer Binary Format) file, which is compact and efficient for storing geospatial data. However, PBF files can be hard to work with because of their size and complexity.

For example:

  • Opening large PBF files directly in GIS software like QGIS can cause the application to freeze or crash because of memory constraints.
  • Extracting specific regions or features from the PBF file requires additional tools and techniques to efficiently filter and process the data.

In this tutorial, we will address these challenges by using the powerful command-line tool ogr2ogr to extract and convert data from an OSM PBF file into a more manageable GeoJSON format. We’ll also demonstrate how to clip the data to a specific region and filter specific layers (e.g., multipolygons).


Tools and Prerequisites

Before we begin, ensure you have the following installed on your system:

  1. GDAL/OGR: A geospatial data processing library that includes the ogr2ogr utility. You can install it with a package manager such as apt or brew, or download it from GDAL’s official website.
     • On Ubuntu: sudo apt install gdal-bin
     • On macOS: brew install gdal
  2. OSM PBF File: Download the PBF file for your region of interest from Geofabrik or another source.
  3. GeoJSON Clip File: A GeoJSON file defining the boundary of the region you want to extract (e.g., tricity.json in this example).
  4. Basic Command-Line Knowledge: Familiarity with running commands in a terminal or command prompt.

Step-by-Step Guide

Step 1: Understand the Input Data

  • Input PBF File: This is the raw OSM data file (e.g., northern-zone-latest.osm.pbf).
  • Clip File: A GeoJSON file (tricity.json) that defines the spatial extent of the region you want to extract.
  • Target Layer: The layer you want to extract (e.g., multipolygons).

Step 2: Prepare Your Environment

Ensure that ogr2ogr is installed and accessible from the command line. Test it by running:
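A minimal version check, assuming the GDAL utilities are on your PATH, might look like this (the fallback message is only illustrative):

```shell
# Print the GDAL/OGR version if ogr2ogr is on the PATH.
if command -v ogr2ogr >/dev/null 2>&1; then
    gdal_version="$(ogr2ogr --version)"
    echo "$gdal_version"
else
    gdal_version=""
    echo "ogr2ogr not found; install GDAL first" >&2
fi
```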

You should see the version number of GDAL/OGR.

Place all required files (northern-zone-latest.osm.pbf, tricity.json) in a single directory for convenience.

Step 3: Run the ogr2ogr Command

The extraction is done with a single ogr2ogr command. Let’s break it down and explain each part:
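Pieced together from the parameters explained below, the command would look roughly like this. The file names are the ones used in this tutorial; the existence check around the call is an added safeguard, not part of ogr2ogr itself.

```shell
# File names from this tutorial; adjust them to your own data.
pbf="northern-zone-latest.osm.pbf"   # input OSM extract
clip="tricity.json"                  # GeoJSON boundary to clip to
out="multipolygons.geojson"          # output file

# Added safeguard: only run when both inputs are in the current directory.
if [ -f "$pbf" ] && [ -f "$clip" ]; then
    ogr2ogr -f "GeoJSON" "$out" "$pbf" \
        -clipsrc "$clip" \
        -sql "SELECT * FROM multipolygons"
else
    echo "Missing $pbf or $clip in $(pwd)" >&2
fi
```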

Explanation of Parameters:

  1. -f "GeoJSON": Specifies the output format as GeoJSON.
  2. multipolygons.geojson: The name of the output file where the extracted data will be saved.
  3. northern-zone-latest.osm.pbf: The input PBF file containing the raw OSM data.
  4. -clipsrc tricity.json: Clips features to the boundary geometry stored in tricity.json.
  5. -sql "SELECT * FROM multipolygons": Filters the data to include only features from the multipolygons layer.

Step 4: Execute the Command

  1. Open your terminal or command prompt.
  2. Navigate to the directory containing your input files, for example with cd.
  3. Run the ogr2ogr command from Step 3.

Step 5: Verify the Output

After the command completes, you should see a new file named multipolygons.geojson in your directory. This file contains the clipped and filtered data in GeoJSON format.

You can now open this file in QGIS or any other GIS software to visualize and analyze the extracted data.
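For a quick check without opening a GIS application, ogrinfo (installed alongside ogr2ogr) can summarize the result. This is a sketch that assumes the output file name used in this tutorial:

```shell
out="multipolygons.geojson"   # output file name from this tutorial

if [ -f "$out" ]; then
    # -so prints a summary only (no feature dump); -al covers all layers.
    ogrinfo -so -al "$out"
else
    echo "$out not found; check the ogr2ogr output for errors" >&2
fi
```

The summary includes the feature count and extent, which is usually enough to confirm that the clip produced data for the expected area.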


Additional Tips

1. Handling Large PBF Files

If your PBF file is extremely large, consider splitting it into smaller regions using tools like osmium-tool or osmosis. For example:
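As a sketch with osmium-tool, assuming it is installed and that tricity.json holds a single polygon or multipolygon boundary, a smaller PBF could be cut like this (tricity.osm.pbf is a hypothetical output name):

```shell
pbf="northern-zone-latest.osm.pbf"   # large input file from this tutorial
clip="tricity.json"                  # boundary file from this tutorial
small="tricity.osm.pbf"              # hypothetical name for the smaller extract

if [ -f "$pbf" ] && [ -f "$clip" ]; then
    # -p reads the clip polygon from a file; -o names the output file.
    osmium extract -p "$clip" "$pbf" -o "$small"
else
    echo "Missing $pbf or $clip in $(pwd)" >&2
fi
```

The resulting smaller file can then be passed to ogr2ogr in place of the full PBF.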

2. Exploring Available Layers

To list all available layers in your PBF file, use:
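A minimal sketch using ogrinfo, which ships with GDAL, and the PBF file name from this tutorial:

```shell
pbf="northern-zone-latest.osm.pbf"   # input file from this tutorial

if [ -f "$pbf" ]; then
    # With no layer name given, ogrinfo lists the layers in the data source.
    ogrinfo "$pbf"
else
    echo "$pbf not found in $(pwd)" >&2
fi
```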

This will display the layers exposed by GDAL’s OSM driver: points, lines, multilinestrings, multipolygons, and other_relations.

3. Optimizing Performance

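  • Use the -gt option to set how many features are grouped into each transaction (e.g., -gt 10000). This mainly helps when writing to outputs with transaction support, such as databases, and has little effect on plain GeoJSON output.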
  • If memory usage is an issue, consider increasing the swap space on your system.
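To illustrate -gt, here is a sketch that writes to a GeoPackage instead of GeoJSON. The GeoPackage output and its file name are assumptions made for this example, chosen because that format supports transactions; the tutorial itself writes GeoJSON.

```shell
pbf="northern-zone-latest.osm.pbf"   # input file from this tutorial
clip="tricity.json"                  # boundary file from this tutorial
out="multipolygons.gpkg"             # hypothetical GeoPackage output

if [ -f "$pbf" ] && [ -f "$clip" ]; then
    # -gt 10000 groups 10,000 features per transaction.
    # The trailing "multipolygons" selects that layer from the source file.
    ogr2ogr -f "GPKG" -gt 10000 -clipsrc "$clip" "$out" "$pbf" multipolygons
else
    echo "Missing $pbf or $clip in $(pwd)" >&2
fi
```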

4. Automating Workflows

You can write shell scripts to automate repetitive tasks. For example:
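One possible script, sketched under the assumption that you want several layers clipped to the same boundary. The extract_layer helper and the choice of layers are illustrative; the file names are the ones used in this tutorial.

```shell
#!/bin/sh
# Extract several OSM layers, each clipped to the same boundary,
# into separate GeoJSON files named after the layer.
pbf="northern-zone-latest.osm.pbf"
clip="tricity.json"

# Illustrative helper: $1 is the layer name; output goes to <layer>.geojson.
extract_layer() {
    ogr2ogr -f "GeoJSON" "$1.geojson" "$pbf" \
        -clipsrc "$clip" \
        -sql "SELECT * FROM $1"
}

if [ -f "$pbf" ] && [ -f "$clip" ]; then
    for layer in points lines multipolygons; do
        echo "Extracting $layer ..."
        extract_layer "$layer" || echo "Failed on $layer" >&2
    done
else
    echo "Missing $pbf or $clip in $(pwd)" >&2
fi
```

Each pass re-reads the PBF file, so on very large inputs it may be worth cutting a smaller extract first.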


Conclusion

By following this tutorial, you can efficiently extract and process geospatial data from large OSM PBF files using ogr2ogr. This approach avoids the performance issues associated with opening PBF files directly in GIS software and allows you to focus on specific regions and layers of interest. With these skills, you can streamline your geospatial workflows and unlock the full potential of OpenStreetMap data.

I hope this tutorial gives you a good foundation. If you want tutorials on another GIS topic or have any queries, please send an email to contact@spatial-dev.guru.
