How DMS Handles LOB Columns
AWS Database Migration Service uses Oracle LogMiner (or the Binary Reader for on-premises Oracle) to capture change data from the Oracle redo log stream. For most column types — VARCHAR2, NUMBER, DATE, TIMESTAMP — DMS reads the changed value directly from the redo log entry. Efficient, low-overhead, keeps up with high transaction rates easily.
LOB columns (CLOB, BLOB, NCLOB, XMLTYPE) are different. Oracle does not store the full LOB value inline in redo log entries the same way it stores scalar values. The redo log contains a LOB locator — a pointer to where the LOB data lives in the table's LOB segment — not the actual LOB content. To capture the new LOB value after an UPDATE, DMS cannot read it from the redo log alone. It has to issue a supplemental query back to the source Oracle database to fetch the current LOB value for the changed row.
This back-query pattern is the source of every LOB-related DMS problem. It turns a streaming replication process into a point-in-time query workload against your production source database, with all the implications that entails.
The Three LOB Modes and Their Trade-offs
DMS offers three LOB handling modes, configured in the task settings:
Limited LOB mode (default): DMS truncates LOB values to a configurable maximum size (default 32KB). Any LOB larger than the limit is silently truncated at the target. The task does not error. You get no warning unless you're specifically monitoring LOB column sizes. This mode is fast — no back-queries — but produces corrupted data for any application storing LOBs larger than the limit.
Full LOB mode: DMS fetches the full LOB value via back-query regardless of size. No truncation. But the task processes each LOB update synchronously — fetch the full LOB, then apply it to the target. For a batch job that updates 50,000 rows with LOB columns, DMS issues 50,000 individual LOB fetch queries against the source Oracle database. CDC throughput drops significantly. If the source Oracle database is under load during the batch window, DMS falls behind on redo log consumption — which is why tasks stall specifically at 2AM when the nightly batch runs.
Inline LOB mode: Available in newer DMS versions, enabled by combining FullLobMode with an InlineLobMaxSize value. DMS transfers LOBs up to the configured inline size without a separate fetch, falling back to full-LOB back-queries only for values that exceed the threshold. Best of both modes for workloads with mixed small and large LOBs, but requires careful calibration of the inline size.
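As a rough decision aid, the choice between the three modes can be sketched as a function of measured LOB sizes. The thresholds below are illustrative assumptions, not official AWS guidance; only the setting names come from the DMS task-settings schema:

```python
# Sketch: pick DMS LOB settings from measured column sizes.
# Thresholds (100KB cap, 16KB inline size) are illustrative assumptions.
def choose_lob_settings(max_lob_bytes: int, avg_lob_bytes: int) -> dict:
    KB = 1024
    if max_lob_bytes <= 100 * KB:
        # Every LOB fits under a known cap: limited mode with headroom.
        # No back-queries, no truncation risk.
        return {
            "LimitedSizeLobMode": True,
            "FullLobMode": False,
            "LobMaxSize": max_lob_bytes // KB + 32,  # value is in KB
        }
    if avg_lob_bytes <= 16 * KB:
        # Mostly small LOBs with a large tail: inline mode, so only
        # the tail pays the back-query cost.
        return {"FullLobMode": True, "InlineLobMaxSize": 16}
    # Large LOBs throughout: full LOB mode, accept the back-query load.
    return {"FullLobMode": True, "LimitedSizeLobMode": False}
```

The point of the headroom term is that the measured maximum is only as good as the day you measured it; padding the limit buys time before growth causes silent truncation.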
-- DMS Task settings JSON for LOB handling
-- Configure via the AWS Console task settings or CLI
{
"TargetMetadata": {
"TargetSchema": "",
"SupportLobs": true,
"FullLobMode": false,
"LobChunkSize": 0,
"LimitedSizeLobMode": true,
"LobMaxSize": 102400
}
}
-- Limited LOB mode: LimitedSizeLobMode=true, LobMaxSize=102400 (100KB limit)
-- Full LOB mode: FullLobMode=true, LimitedSizeLobMode=false
-- Inline LOB mode: LimitedSizeLobMode=true, LobChunkSize=16 (16KB chunks)
-- Check actual LOB sizes in source Oracle before choosing a limit
-- Run once per LOB column; DOCUMENTS and DOC_BODY are placeholders for
-- your own table name and CLOB/BLOB column name
SELECT
ROUND(AVG(DBMS_LOB.GETLENGTH(doc_body)), 0) AS avg_lob_bytes,
MAX(DBMS_LOB.GETLENGTH(doc_body)) AS max_lob_bytes,
COUNT(*) AS row_count
FROM documents;
Before configuring any LOB mode, audit your actual LOB column sizes. Most teams skip this step and discover after cutover that their "limited" mode was silently truncating 15% of LOB values that exceeded the default 32KB limit.
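Once the per-column measurements exist, the at-risk columns fall out mechanically. A minimal sketch (the column-dict shape and helper are hypothetical, not part of DMS):

```python
# Sketch: flag LOB columns whose largest value exceeds the configured
# LobMaxSize (given in KB, matching the DMS task setting's unit).
def truncation_risk(columns: list[dict], lob_max_size_kb: int) -> list[str]:
    limit_bytes = lob_max_size_kb * 1024
    return [
        f"{c['table']}.{c['column']}"
        for c in columns
        if c["max_bytes"] > limit_bytes
    ]
```

Running this with the default limit of 32 makes the "silent truncation" risk visible before cutover instead of after.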
The Supplemental Logging Requirement
DMS requires Oracle supplemental logging to capture change data correctly. For tables with LOB columns, the supplemental logging requirement is more specific than for non-LOB tables.
-- Minimum supplemental logging for DMS (required for all tables)
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
-- For tables with LOB columns, you need ALL COLUMNS supplemental logging
-- (not just PRIMARY KEY logging) to ensure DMS can correctly identify
-- which row to back-query when a LOB update occurs
-- Option 1: Enable ALL COLUMNS supplemental logging database-wide
-- (high redo overhead -- only appropriate for small databases)
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
-- Option 2: Enable per-table for tables with LOB columns
-- (preferred approach -- targeted redo overhead)
ALTER TABLE documents ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
ALTER TABLE contracts ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
ALTER TABLE audit_trail ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
-- Verify supplemental logging is configured
SELECT
log_mode,
supplemental_log_data_min AS min_supplemental,
supplemental_log_data_pk AS pk_supplemental,
supplemental_log_data_ui AS ui_supplemental,
supplemental_log_data_fk AS fk_supplemental,
supplemental_log_data_all AS all_supplemental
FROM v$database;
Without ALL COLUMNS supplemental logging on LOB tables, DMS may fail to issue correct back-queries during CDC — it can't reliably construct the WHERE clause to identify the specific row whose LOB changed. The symptom is task errors like ORA-01403: no data found during LOB fetch operations, which cause the CDC task to stop and require a restart.
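If the LOB-table list is long, the per-table statements are better generated than hand-written. A minimal sketch, assuming the table names have already been pulled from the catalog:

```python
# Sketch: emit ALL COLUMNS supplemental logging DDL for each LOB table.
def supplemental_logging_ddl(lob_tables: list[str]) -> list[str]:
    return [
        f"ALTER TABLE {t} ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;"
        for t in lob_tables
    ]
```

Generating the DDL from the catalog also means a newly added LOB table cannot be quietly missed the next time the script runs.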
Why Tasks Stall During Batch Windows
The 2AM stall pattern is specific and predictable. Here's the sequence:
- Nightly batch job starts. It updates 100,000 rows across several tables, many of which have CLOB columns storing document content, notes, or XML payloads.
- Oracle generates redo log entries for all 100,000 updates. DMS consumes these via LogMiner, identifying 100,000 LOB-containing rows that need back-queries.
- DMS begins issuing back-queries to Oracle: SELECT clob_column FROM documents WHERE rowid = 'AAA...', one per updated row.
- Oracle is simultaneously running the batch job, generating new redo, and now also handling DMS's 100,000 supplemental queries. Source database load spikes.
- DMS back-queries slow down because Oracle is under load. DMS falls behind on redo log consumption — it's spending more time fetching LOBs than consuming new redo entries.
- DMS redo lag accumulates. The task's "CDC latency" metric climbs from near-zero to minutes to hours.
- If the redo log retention window is shorter than the accumulated lag, DMS loses access to the redo it needs to resume. The task fails with a redo log access error and requires a full reload from scratch.
-- Monitor DMS task lag from CloudWatch (CLI)
aws cloudwatch get-metric-statistics \
--namespace AWS/DMS \
--metric-name CDCLatencySource \
--dimensions Name=ReplicationInstanceIdentifier,Value=my-dms-instance Name=ReplicationTaskIdentifier,Value=my-task-id \
--start-time 2022-10-17T02:00:00Z \
--end-time 2022-10-17T06:00:00Z \
--period 300 \
--statistics Average \
--query 'Datapoints[*].[Timestamp,Average]' \
--output table
-- CDCLatencySource: seconds between event time on source and DMS processing
-- CDCLatencyTarget: seconds between DMS processing and target apply
-- A spike in CDCLatencySource during batch windows confirms LOB back-query pressure
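With the datapoints in hand (the CLI call above returns them as JSON), detecting the batch-window spike is straightforward. A sketch against the datapoint shape CloudWatch returns; the threshold is an assumption to tune per workload:

```python
# Sketch: summarize CDCLatencySource datapoints from CloudWatch.
# Datapoint shape: {"Timestamp": "...", "Average": seconds}.
def max_lag(datapoints: list[dict]) -> float:
    return max((p["Average"] for p in datapoints), default=0.0)

def lag_breaches(datapoints: list[dict], threshold_s: float = 300.0) -> list[dict]:
    # Return threshold-exceeding datapoints in time order.
    return sorted(
        (p for p in datapoints if p["Average"] > threshold_s),
        key=lambda p: p["Timestamp"],
    )
```

Wiring `lag_breaches` into a scheduled check (or simply setting a CloudWatch alarm on the metric) turns the 2AM stall from a morning surprise into a paged event.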
The Fix: Per-Table LOB Mode Configuration
The most effective solution is to configure LOB handling per-table rather than globally for the entire task. Tables without LOB columns should use the default (non-LOB) handling. Tables with LOB columns should use Full LOB mode if LOBs can exceed 32KB, or calibrated Limited LOB mode if you can confirm all LOBs are smaller than a known limit.
-- DMS table mapping rules with per-table LOB settings
-- Configure in the table mapping JSON for the DMS task
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "select-all",
"object-locator": {
"schema-name": "APPSCHEMA",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "lob-table-full-mode",
"rule-action": "convert-uppercase",
"rule-target": "table",
"object-locator": {
"schema-name": "APPSCHEMA",
"table-name": "DOCUMENTS"
},
"lob-settings": {
"bulk-max-size": "0",
"mode": "unlimited"
}
}
]
}
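Because a LOB table without an explicit rule silently falls back to the task-level defaults, it is worth validating the mapping against your known LOB-table list before starting the task. The rule shape below follows the DMS table-mapping format; the validation helper itself is hypothetical:

```python
import json

# Sketch: report known LOB tables that lack an explicit table-settings
# rule with lob-settings in the DMS table-mapping JSON.
def tables_missing_lob_rules(mapping_json: str, lob_tables: list[str]) -> list[str]:
    rules = json.loads(mapping_json)["rules"]
    covered = {
        r["object-locator"]["table-name"]
        for r in rules
        if r.get("rule-type") == "table-settings" and "lob-settings" in r
    }
    return sorted(set(lob_tables) - covered)
```

Any table this returns will be handled with the global task defaults, which is exactly the misconfiguration the per-table approach is meant to prevent.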
XMLTYPE: A Special Case
Oracle XMLTYPE columns deserve separate mention. XMLTYPE is stored internally as either CLOB or as an object-relational structure depending on the storage option used when the column was defined. DMS handles XMLTYPE as a LOB, but with additional complexity:
- XMLTYPE stored as CLOB can be migrated to CLOB or TEXT on the target with appropriate LOB settings
- XMLTYPE stored as OBJECT-RELATIONAL cannot be replicated via CDC at all — DMS will error or skip these columns. Object-relational XMLTYPE requires a different migration strategy (export to CLOB, then migrate)
- XMLTYPE with SECUREFILE storage behaves differently from BASICFILE storage in terms of redo generation and back-query behavior
-- Identify XMLTYPE column storage type on the source Oracle database
-- (dba_xml_tab_cols carries the storage type directly; no joins needed)
SELECT
owner,
table_name,
column_name,
storage_type -- 'OBJECT-RELATIONAL', 'CLOB', or 'BINARY'
FROM dba_xml_tab_cols
WHERE owner = 'APPSCHEMA'
ORDER BY table_name, column_name;
Back-Queries Against Deleted Rows: The Dependency That Bites
Full LOB mode back-queries require the row data to still exist on the source database at the time DMS issues the query. For INSERT and UPDATE operations, this is generally fine — the row exists. For DELETE operations where you need to capture the LOB value before deletion (for audit trail or CDC targets that need the deleted value), DMS cannot back-query a deleted row. The LOB value for deleted rows is simply not available via back-query.
If your application requires capturing deleted LOB values in the CDC stream, you need to implement Oracle before-image supplemental logging — which stores the old LOB value in the redo log itself, allowing DMS to capture it without a back-query. This has significant redo log volume implications and requires careful testing against your redo log storage and archiving capacity.
For most migration projects, the practical recommendation is simpler: identify all tables with LOB columns, measure actual LOB sizes, configure appropriate LOB mode per table, size your DMS replication instance with enough CPU and memory to handle the back-query load during batch windows (at least r5.2xlarge for heavy LOB workloads), and set Oracle's redo log retention to cover at least 24 hours beyond your maximum expected CDC lag.
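The retention rule of thumb from that last point can be made concrete. A hypothetical helper, with the 24-hour buffer taken from the recommendation above:

```python
# Sketch: check redo/archive log retention against observed CDC lag
# plus a safety buffer (24h, per the recommendation above).
def redo_retention_sufficient(
    retention_hours: float,
    max_observed_cdc_lag_hours: float,
    buffer_hours: float = 24.0,
) -> bool:
    return retention_hours >= max_observed_cdc_lag_hours + buffer_hours
```

Feed it the worst CDCLatencySource reading you have seen in a batch window, not the average; the failure mode is losing redo during the one night the lag spikes.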
AWS DMS task falling behind or failing during batch windows?
We diagnose DMS CDC configurations for Oracle source databases and fix the LOB handling, supplemental logging, and redo retention issues causing lag. Free assessment.