Skip to content

How do I resolve schema change and performance issues when I replicate data in AWS DMS?

7 minute read
0

When I use AWS Database Migration Service (AWS DMS) to replicate data, tables are suspended from replication after data definition language (DDL) changes on the source. New columns are missing on the target, or data mismatches occur between source and target.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Configure DDL handling policy settings

AWS DMS replicates supported DDL statements during change data capture (CDC).

If your DDL operation runs on the source and the change doesn't replicate to the target, then AWS DMS suspends the table from replication.

To resolve this issue, configure the ChangeProcessingDdlHandlingPolicy task settings to control how AWS DMS handles DDL changes.

Manually apply DDL statements to the target

If AWS DMS doesn't automatically replicate your DDL statements to the target, then manually add them.

Complete the following steps:

  1. Stop the replication task.
  2. Use a SQL client to connect to your target database, and then run the same DDL statements that you ran on the source. For example, if you made the following DDL changes on the source, then apply the same DDL changes on the target:
    ALTER TABLE your-table-name ADD COLUMN your-column-name VARCHAR (100); 
    ALTER TABLE your-table-name DROP COLUMN your-column-name; 
    CREATE INDEX your_index_name ON your-table-name; 
    Note: Replace your-table-name with your table name, your-column-name with your column name and your-index-name with your index name. The syntax for DDL statements might vary depending on the source and target database engine.
  3. To identify the DDL statements that failed to replicate, check the AWS DMS task logs in Amazon CloudWatch. In the TARGET_APPLY component, search for error entries that have the E: prefix.
  4. To restart the replication task, run the following start-replication-task AWS CLI command:
    aws dms start-replication-task --replication-task-arn --start-replication-task-type resume-processing
    Note: Replace your-task-arn with your Amazon Resource Name (ARN).
  5. Run the missing DDL statements on the target database.

For more information, see When do I resume or restart my AWS DMS task that's in the Stopped or Failed status?

Resolve missing columns on your target after DDL changes

If you use one of the following AWS services as your target, then you must manually restart your task:

  • Amazon Simple Storage Service (Amazon S3)
  • Amazon Kinesis Data Streams
  • Amazon Managed Streaming for Apache Kafka (Amazon MSK)

After schema changes on your S3 target, data might be missing or duplicated because Amazon S3 doesn't support primary keys, in-place updates, or referential integrity. To resolve this issue, see How do I resolve inconsistent data when I use an Amazon S3 bucket as the target for AWS DMS migration?

When you add a column on the source, AWS DMS captures the table structure as the task starts and doesn't refresh the task. As a result, the new column doesn't appear in the target output files or streams.

To resolve this issue, stop the replication task. Then, run the following start-replication-task AWS CLI command to restart the task:

aws dms start-replication-task --replication-task-arn --start-replication-task-type reload-target

Note: Replace your-task-ARN with your ARN.

The restart forces AWS DMS to reload the table structure from the source, including new columns.

Resolve DDL scenarios that cause table suspension

Column and table modification issues

You're experiencing one of the following issues:

  • You add a column to the source, but the table is suspended on the target. The target doesn't support the data type.
  • You drop a column on the source, and the table is suspended on the target. The target has dependent objects on the dropped column.
  • You rename a table on the source, but the target still uses the old name. The RENAME TABLE statement doesn't support your source engine.
  • The TRUNCATE TABLE command causes data mismatch. 

To resolve these issues, first stop the replication task

If the target doesn't support the data type, then check the AWS DMS task logs in CloudWatch to identify the DDL statements that failed to replicate. In the TARGET_APPLY component, search for error entries that have the E: prefix.

Run the following ALTER command to manually create the column on the target:

ALTER TABLE your-table-name ADD your-column-name datatype;

Note: Replace your-table-name with your table name, your-column-name with your column name, and datatype with a supported data type. The syntax for DDL statements might vary depending on the source and target database engine.

If the target has dependent objects on the dropped column, then remove dependent objects on the target that reference the column. Then, run the following command on the target to manually drop the column:

ALTER TABLE your-table-name DROP COLUMN your-column-name;

Note: Replace your-table-name with your table name and your-column name with your column name. The syntax for DDL statements might vary depending on the source and target database engine.

If the RENAME TABLE statement doesn't support your source engine, then run the following command to manually rename the table on the target:

ALTER TABLE old_table_name RENAME TO new_table_name;

Note: Replace old_table_name with your table name and new_table_name with your new table name. The syntax for DDL statements may vary depending on the source and target database engine.

If the TRUNCATE TABLE command causes a data mismatch, then run the following command to manually remove stale data from the target:

TRUNCATE TABLE your_table_name;

Note: Replace your_table_name with your table name.

To automatically replicate future TRUNCATE operations to the target, set the HandleSourceTableTruncated parameter to true in your task settings.

After you make the necessary column or table changes, run the following start-replication-task command to restart the task: 

aws dms start-replication-task --replication-task-arn --start-replication-task-type resume-processing

Note: Replace your-task-arn with your ARN.

For more information, see How do I troubleshoot data mismatch issues between the source and target databases in AWS DMS?

New table not replicated

When you create a new table on the source after the task started and the table isn't on the target, AWS DMS doesn't detect the table.

To resolve this issue, add the following new table to the task table for mapping rules: 

{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "schema_name",
                "table-name": "table_name"
            },
            "rule-action": "explicit"
        }
    ]
}

Note: Replace schema_name with your schema name and table_name with your new table name.

Then, reload the table or restart the task.

AWS DMS misses changes from consecutive DDL sequences

When the target schema doesn't match the source after you consecutively run multiple DDL statements, AWS DMS might not correctly parse the source log.

To resolve this issue, wait for AWS DMS to apply each change to the target before you run the next DDL on the source.

Resolve data inconsistencies caused by DDL changes that didn't replicate

If AWS DMS doesn't replicate your DDL change to the target, then data manipulation language (DML) operations fail because the target table structure doesn't match the source.

To identify and resolve data inconsistencies, complete the following steps:

  1. Set the EnableValidation parameter to true in your data validation task settings.

  2. Run the following query on the target validation failures table to view specific mismatched rows:

    SELECT * FROM awsdms_validation_failures_v1;
  3. Turn on AWS DMS Data resync to automatically resolve inconsistencies.
    Note: Data resync supports only Oracle and Microsoft SQL Server as a source and PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition as a target.

  4. If you review the output and there's mismatch of rows or affected tables, then stop the task and apply the missing DDL on the target.

  5. Reload the affected tables.

Related information

Selection rules and actions

Troubleshooting migration tasks in AWS DMS

Performing bidirectional replication