GitHub - praszuk/osm-replication-osc-poly-filter: Simplifies setting up an OSM db in PostGIS with osm2pgsql, including regional PBF import with metadata, and handling minutely replication. (original) (raw)

This project simplifies creating and updating a PostGIS database for OSM data using osm2pgsql.

The target DB state:

Requirements

You will need:

How to use

The example is based on a Poland dump with metadata (February 2026), but I recommend trying this first with a smaller region before importing the entire country.

Load pbf dump

  1. Download the example project files:
  2. Prepare an OSM PBF file or download ready extract e.g. the latest Geofabrik region dump
    • Go to: https://download.geofabrik.de/europe/poland.html
    • Without metadata (recommended):
      * Download poland-latest.osm.pbf (1.8 GB)
    • With metadata (use it only if you really need to process all user ids, changeset ids, or usernames)
      * Click on Extracts with full metadata
      * Log in via an OSM account
      * Download poland-latest-internal.osm.pbf (2.3 GB)
    • Download the .poly file
    • Move both files to the data/ project directory
  3. Run containers:
    Copy or rename example.env to .env and change POSTGRES_PASSWORD.
    If you want to include metadata, you need to also uncomment/change OSM2PGSQL_USE_METADATA=true.
    Important:
    GeoFabrik and osm2pgsql metadata option – they are not the same thing:
    • OSM2PGSQL_USE_METADATA=true with GeoFabrik extracts without metadata, the db will contain all columns including: username, uid and changesetid, but with null values. The replication process will start filling them when an object is changed in OSM.
    • OSM2PGSQL_USE_METADATA=false (default) – osm2pgsql will not create columns: version, timestamp, username, uid and changeset. Replication will also not create them automatically.
      After changing the .env file, run:
      docker compose build
      docker compose up -d
      This will build containers and create an empty PostGIS database.
  4. Init the osm2pgsql schema and load the dump
    Loading the Poland .pbf took about 1.5 hours on my machine and about 4 hours in a Proxmox container with 4 GB of memory and 1 CPU. The DB size is 46 GB (from a 2.3 GB .pbf file), but during the import, the db grew over 70 GB in the post-processing/indexing step.
    Edit the last argument to match your downloaded filename!
    docker compose run --rm import create /data/poland-xxxx.osm.pbf

Replication

In general, this project tries to perform replication by checking if every object is in the polygon (poly), but to keep it efficient and due to limitations of the osc diff structure, there may still be unwanted objects imported to the db. For example:

These cases usually occur near borders or due to anomalies/vandalism. If this is important, you can always recreate the db from a dump periodically.

To understand how the filter works with different cases, you can check the integration osc filter tests.


To perform replication, run the command below. It will run endlessly. Initially, it won't sleep between iterations to allow faster sync. After catching up the latest state, it will sleep between each iteration.

Replace poland.poly with the downloaded .poly filename.

docker compose run --rm -it import replicate /data/poland.poly

Schema

If you want to change the schema and create your own tables, you can find examples in the osm2pgsql repo.

To modify the schema, edit file schemas/schema.lua.

If you want to remove all db data to recreate it with the new schema, use:

docker compose down --volumes

and repeat section Load pbf dump.

Explore

The database is running on localhost with the default port 5432. You can connect with your client/app or join via docker exec.

docker compose exec -it db bash -c 'psql -d POSTGRESDB−UPOSTGRES_DB -U POSTGRESDBUPOSTGRES_USER -p $POSTGRES_PORT'

Meta cheatsheet:

-- show databases with sizes \l+

-- list all tables (add + to show size) \d

-- show columns/schema for table \d table_name

Tag queries cheatsheet:

-- nodes with highway=bus_stop tag SELECT * FROM nodes WHERE tags @> '{"highway": "bus_stop"}' ORDER BY node_id DESC LIMIT 5;

-- nodes with highway=bus_stop and lit=yes SELECT * FROM nodes WHERE tags @> '{"highway": "bus_stop", "lit": "yes"}' ORDER BY node_id DESC LIMIT 3;

-- nodes with highway=bus_stop without public_transport=platform SELECT * FROM nodes WHERE tags @> '{"highway": "bus_stop"}' AND NOT tags @> '{"public_transport": "platform"}' ORDER BY node_id DESC LIMIT 3;

-- nodes with highway=bus_stop in 'powiat opolski' area based on teryt:terc tag SELECT * FROM nodes n WHERE tags @> '{"highway": "bus_stop"}' AND ST_Contains((SELECT r.geom FROM relations r WHERE tags @> '{"teryt:terc": "1609"}' LIMIT 1), n.geom);

-- ways with highway=residential and their road names or none if they don't have SELECT way_id, tags->'name' AS name FROM ways WHERE tags @> '{"highway": "residential"}' ORDER BY way_id DESC LIMIT 3;

-- nodes with created_by key SELECT * FROM nodes WHERE tags ? 'created_by' ORDER BY node_id DESC LIMIT 3;

Note: Columns tags and members use jsonb type. You can find more in postgres docs.

Read only user

If you want to create a read-only database user (for example, to share access across multiple apps), run:

docker compose exec -it db bash -c "psql -d $POSTGRES_DB -U $POSTGRES_USER -p $POSTGRES_PORT -c "CREATE ROLE osm_ro WITH LOGIN PASSWORD 'osm_ro'; GRANT CONNECT ON DATABASE $POSTGRES_DB TO osm_ro; GRANT USAGE ON SCHEMA public TO osm_ro; GRANT SELECT ON ALL TABLES IN SCHEMA public TO osm_ro; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO osm_ro;""

Login / Pass: osm_ro

License

MIT