AWS Database Blog

Replicate spatial data using AWS DMS and Amazon RDS for PostgreSQL

In this post, we show you how to migrate spatial (geospatial) data from self-managed PostgreSQL, Amazon RDS for PostgreSQL, or Amazon Aurora PostgreSQL-Compatible Edition to Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition using AWS Database Migration Service (AWS DMS). Spatial data is useful for applications such as mapping, routing, asset tracking, and geographic visualization. We walk through setting up your environment, configuring AWS DMS, and validating the successful migration of spatial datasets.

Geospatial data represents physical network paths, building locations, and elements that cross space. It includes critical data that helps identify geographic locations and enables insight generation for various applications. This type of data can be represented using a 2D coordinate system with latitude and longitude coordinates. The data can take multiple forms including:

  • Points (such as GPS coordinates)
  • Lines (such as roads or rivers)
  • Polygons (such as borders and land use zones)
  • Collections of these features

AWS DMS treats these columns as large objects (LOBs) rather than native spatial types. Without configuring LOB handling and enabling PostGIS support on the target database, replication tasks may fail or lead to incomplete data migration.

Amazon RDS for PostgreSQL spatial data

PostgreSQL databases can use the PostGIS extension to handle geospatial data types and functions effectively. This extension enables storage and manipulation of geometry and geography spatial types while providing specialized spatial functions. When working with spatial data, PostGIS allows operations such as ST_Within for performing radius searches and distance calculations between geometries. The extension supports various geometric objects and can perform complex spatial operations such as checking if a polygon is contained within another polygon using simple SQL function such as ST_Contains since it’s a PostGIS spatial SQL function.

Amazon Relational Database Service (Amazon RDS) for PostgreSQL provides full integration with PostGIS functionality, making it suitable for applications requiring geospatial operations.

Solution overview

This solution uses AWS DMS to replicate geospatial data managed by the PostGIS extension. The source and target databases are both Amazon RDS for PostgreSQL. While the source and target can be different PostgreSQL versions, ensure that the PostGIS extension versions are compatible between two databases. AWS DMS supports cross-version PostgreSQL migrations.

Prerequisites

Before setting up the PostGIS extension for RDS for Postgres in both source and target, make sure you have the following:

  1. AWS setup:
    1. AWS account with permissions to create and manage AWS DMS resources
    2. Amazon RDS for PostgreSQL instances as source and target. For testing purposes, you can use a single RDS instance and replicate data between different databases within that instance to reduce costs.
    3. AWS DMS replication instance, endpoints, and AWS DMS task created
  2. Database configuration:
    1. PostGIS installed on both source and target databases
    2. To install PostGIS, connect to each database using your SQL client of preference, and enter the following command:
      CREATE EXTENSION postgis;
      

      Note: Without the PostGIS extension on the target database, the AWS DMS task will fail with errors such as “type geometry does not exist.”

    3. Tables must have primary key or unique key constraints
      CREATE TABLE public.province_boundary (
          province_id SERIAL PRIMARY KEY,  -- Required: Primary key for DMS replication
          name TEXT NULL,
          osm_id INT4 NULL,
          country_code TEXT NULL,
          concave_hull GEOMETRY(GEOMETRY, 3857) NULL,  -- Nullable for LOB handling
          admin_level INT4 NULL
      );

      Note: Geometry columns must be nullable on target tables for LOB handling.

      Note: AWS DMS will remove LOB columns if tables don’t have a primary key or unique constraint. This is critical for capturing LOB changes during CDC. AWS DMS requires geometry columns to be nullable on target tables for LOB handling. If your source table has NOT NULL constraints on geometry columns, migration will fail. You can reapply NOT NULL constraints after migration completes.

  3. Additional considerations:
    1. Review Managing spatial data with the PostGIS extension in the Amazon RDS documentation
    2. Evaluate which additional PostGIS extensions might be needed for your use case
    3. Ensure the database instance has sufficient resources for PostGIS operations. Monitor key Amazon CloudWatch metrics: CPUUtilization (<60%), FreeableMemory (>25% available), ReadIOPS/WriteIOPS, and FreeStorageSpace.

Best practices when configuring PostGIS extension

Consider the following best practices when configuring PostGIS:

  1. Validate PostgreSQL and PostGIS version compatibility. To validate the extension was created, use the following command:
    SELECT * FROM pg_extension where extname= 'postgis';
  2. Evaluate additional PostGIS extensions based on workload requirements.
  3. Monitor resource usage when executing complex spatial queries.
  4. Plan appropriate backup strategies for spatially enabled databases.

Implementation steps

Step 1: Begin by creating a spatial table in the source Amazon RDS for PostgreSQL database. To define spatial tables and load sample data, complete the following steps. The following example includes a geometry column defined using the 3857 spatial reference system (SRID):

  1. Create the sample table in your source database. Using the sample DDL provided in the prerequisites section (2c), create the table in your source Amazon RDS for PostgreSQL database.
  2. Insert the sample data in the province_boundary table by using the following query:
    INSERT INTO province_boundary (
        name,
        osm_id,
        country_code,
        concave_hull,
        admin_level
    )
    SELECT
        'lionkingbaby',
        123456,
        'TN',
        ST_SetSRID(
            ST_MakeLine(
                ARRAY(
                    SELECT ST_MakePoint(x * 0.01, sin(x * 0.01)) -- creates curved path
                    FROM generate_series(1, 400000) AS x          -- adjust this for size ~400KB
                )
            ),
            3857
        ),
        4;
  3. Enter the following query to measure the geometry size (in bytes):
    SELECT name,  pg_column_size(concave_hull) AS hull_size_bytes
    FROM province_boundary;

Step 2: Prepare Target Database and Configure AWS DMS

  1. Before starting the AWS DMS task, connect to your target Amazon RDS for PostgreSQL database and create the PostGIS extension:
    CREATE EXTENSION postgis;
    

    Note: AWS DMS creates the target table automatically if it doesn’t exist, regardless of the table preparation mode selected. However, the PostGIS extension must be installed beforehand.

  2. Configure the AWS DMS replication task Create a new AWS DMS task with the following configuration
    1. Migration type: Full load and CDC
    2. Target table preparation mode: Do nothing
    3. LOB mode: Limited LOB mode
    4. Maximum LOB size: 2048 KB
      Note: “The maximum LOB size for geospatial data cannot exceed 980,000 KB.”
    5. To determine the appropriate LOB size for your spatial data, use the queries provided in the AWS DMS documentation for determining LOB column sizes.
    6. Turn on CloudWatch logs: Essential for monitoring migration progress, troubleshooting issues, and validating successful data transfer.
    7. Enable log context for detailed monitoring
    8. Add selection rules to include your spatial tables (e.g., schema: public, table: province_boundary)
    9. After configuring all the settings, start the AWS DMS task to begin the migration.
  3. Monitor the migration process

Once the task starts, monitor the CloudWatch logs to verify successful data transfer. You should see log entries similar to the following:

2025-08-08T12:06:55 [SOURCE_UNLOAD] I: Unload finished for table 'public'.'province_boundary' (Id = 1). 7 rows sent. (streamcomponent.c:4010)

2025-08-08T12:06:55 [TARGET_LOAD] I: Load finished for table 'public'.'province_boundary' (Id = 1). 7 rows received. 0 rows skipped. Volume transferred 29203256. {connectionId:22438} (streamcomponent.c:4303)

Common errors that we encounter during the replication

During spatial data migration, you may encounter errors related to LOB handling or PostGIS data types. The following example shows a typical error when AWS DMS fails to process geometry data

00220859: 2025-04-03T16:51:27 [TASK_MANAGER] W: Table 'public'.'cluster' was errored/suspended (subtask 3 thread 1). Command failed to load data with exit error code 0 and exitwhy 1. Please check target database logs for more information.; Failed to wait for previous run; Failed to start load process for file '2'; Failed to load file '2' (replicationtask.c:3068)

Upon reviewing the database logs, an ‘invalid input json’ error was identified.

Generate test data on the source database

To test the spatial data migration with a larger dataset, use the following procedure to generate synthetic spatial data:

  1. Connect to your source Amazon RDS for PostgreSQL database using your preferred SQL client (psql, pgAdmin, and more)
  2. Run the following script to generate test data. You can change the value of `v_rows` to control how many rows are generated (the current script generates 10 rows)
    BEGIN;
    DO $$
     DECLARE
              
        v_pts_per_row  integer := 20000;  
        v_hull_pct     float   := 0.90;   
        i              integer;
         g              geometry;
     BEGIN
         FOR i IN 1..v_rows LOOP
             SELECT ST_SetSRID(
                      ST_ConcaveHull(
                        ST_Collect(ARRAY(
                          SELECT ST_MakePoint(
                                   i * 20000 + x * 2 + (random() * 50.0),
                                   i * 20000 + sin(x / 20.0) * 300.0 + (random() * 50.0)
                                )
                          FROM generate_series(1, v_pts_per_row) AS x
                        )),
                        v_hull_pct,
                        TRUE
                      ),
                      3857
                    )
             INTO g;
    INSERT INTO public.province_boundary(name, osm_id, country_code, concave_hull, admin_level)
             VALUES (
                 format('synthetic_%s', i),
                 900000 + i,
                 'TN',
                 g,
                 4
             );
         END LOOP;
     END$$;
    CREATE INDEX IF NOT EXISTS idx_province_boundary_geom
     ON public.province_boundary
     USING GIST (concave_hull);
    COMMIT;
  3. Verify the data was created successfully by checking the row count
    SELECT COUNT (*) FROM public.provience_boundary;
    

Post-migration validation

To validate your data after migration, follow these steps:

  1. Compare row counts between source and target tables.
  2. Verify that geometry data types are preserved correctly.
    Run the following query on both source and target databases to compare geometry columns

     SELECT
           f_table_name,
           f_geometry_column,
           srid,
           type
       FROM geometry_columns
       WHERE f_table_name = 'province_boundary';

    Ensure the SRID and geometry type match between source and target.

  3. Rebuild spatial indexes as needed.
  4. Validate the functionality of PostGIS spatial functions (for example, ST_Within, ST_Contains).
  5. Reapply NOT NULL constraints to geometry columns if they were temporarily made nullable.

Consider the following best practices for your migration:

  1. Implement table parallelism for efficient data loading.
  2. Monitor task progress through CloudWatch logs.

Clean up

To avoid ongoing charges, clean up all resources after migration including AWS DMS tasks, the AWS DMS replication instance, endpoints, and databases. For information about deleting AWS DMS tasks, refer to delete-replication-task. To delete the replication instance, refer to Deleting a replication instance. If you created test databases specifically for this migration and no longer need them, delete the Amazon RDS for PostgreSQL instances.

Conclusion

In this post, we walked through how to successfully migrate spatial data between Amazon RDS for PostgreSQL instances using AWS Database Migration Service. We covered the critical configuration steps for handling geospatial data as LOBs, including PostGIS extension setup, primary key requirements, LOB mode configuration, and CloudWatch monitoring. By following these guidelines ensuring tables have primary keys, making geometry columns nullable, and properly sizing LOB parameters you can avoid common pitfalls and achieve reliable replication of spatial datasets. Understanding these requirements is essential because spatial data presents unique challenges in database migration, requiring careful attention to both AWS DMS configuration and PostGIS-specific considerations.

We encourage you to try this solution in your own environment, starting with a test instance to validate the configuration before migrating production spatial data. If you have questions or insights from your spatial data migrations, please share them in the comments below.
 


About the authors

Ramdas Gutlapalli

Ramdas Gutlapalli

Ramdas is a Database Engineer at AWS. He is also a Subject Matter Expert in AWS DMS and Amazon RDS for Oracle. He holds 11 AWS Certifications, He helps enterprise customers optimize their databases on AWS, providing expert guidance for cloud migrations and technical improvements.

Rishika Kasani

Rishika Kasani

Rishika is a Cloud Support Engineer at AWS. She is a Subject Matter Expert in Amazon RDS PostgreSQL and AWS DMS, holding three AWS certifications. Rishika works closely with customers to troubleshoot and resolve database-related issues.