Understanding Real-Time Change Data Capture
Change Data Capture (CDC) is a pattern that tracks row-level changes (inserts, updates, and deletes) in a source database and delivers them downstream in near real time. It underpins analytics architectures that can't afford to wait for nightly batch loads.
How Traditional ETL Falls Short
In a traditional batch ETL pipeline, data is extracted on a schedule, say once every hour. That means:
- Stale data: dashboards can lag reality by up to an hour, plus however long the job itself takes to run.
- Full-table scans: without a reliable "last modified" column, the only safe option is to re-read the entire table every run.
- Missing deletes: a row deleted between two runs leaves no record of the delete. A full reload drops it from the warehouse without a trace, and a timestamp-based incremental extract never sees it at all, so the stale row lingers downstream.
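The delete problem is easiest to see with a timestamp-based incremental extract, where a deleted row simply never shows up in the next run. The following is a minimal sketch under stated assumptions: the in-memory tables, the `updated_at` column, and the `incremental_extract` helper are all hypothetical and stand in for a real source and warehouse.

```python
# Hypothetical source table keyed by id; updated_at is an assumed "last modified" column.
source = {
    1: {"name": "alice", "updated_at": 100},
    2: {"name": "bob", "updated_at": 100},
}
warehouse = {}


def incremental_extract(source, warehouse, since):
    """Copy rows modified after `since` from the source into the warehouse."""
    for row_id, row in source.items():
        if row["updated_at"] > since:
            warehouse[row_id] = dict(row)


# First run loads everything.
incremental_extract(source, warehouse, since=0)

# Between runs: row 1 is updated and row 2 is deleted at the source.
source[1] = {"name": "alice2", "updated_at": 200}
del source[2]

# Second run only sees rows with a newer timestamp.
incremental_extract(source, warehouse, since=100)

# The update arrived, but the deleted row is still present downstream.
stale_ids = sorted(set(warehouse) - set(source))
print(stale_ids)
```

The extract has nothing to compare against, so it cannot tell "unchanged" apart from "gone". Detecting the delete would take a full key comparison on every run, which brings back the full-table scan.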
Enter CDC
CDC addresses all three problems by reading the database's own transaction log (WAL in Postgres, binlog in MySQL, redo log in Oracle). Because the log is append-only and ordered, log-based CDC offers:
- Low latency: changes can be captured as soon as they're committed, often within a second.
- Incremental reads: only the rows that changed are processed, drastically reducing load on the source.
- Delete tracking: deletes appear as explicit events, so the warehouse can mark records accordingly.
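As an illustration of how a consumer might use such a stream, the sketch below applies an ordered list of change events to an in-memory replica and soft-deletes rows instead of dropping them. The `ChangeEvent` shape, the integer `lsn`, and the `_deleted` flag are assumptions made for this example; real log formats differ by database and by connector.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    lsn: int                    # position in the log (an integer here for simplicity)
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


def apply_event(table: dict, event: ChangeEvent) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    if event.op in ("insert", "update"):
        table[event.key] = {**event.row, "_deleted": False}
    elif event.op == "delete":
        # Soft delete: keep the row but flag it, so the delete stays visible.
        if event.key in table:
            table[event.key]["_deleted"] = True
    else:
        raise ValueError(f"unknown op: {event.op}")


log = [
    ChangeEvent(lsn=1, op="insert", key=1, row={"name": "alice"}),
    ChangeEvent(lsn=2, op="insert", key=2, row={"name": "bob"}),
    ChangeEvent(lsn=3, op="update", key=1, row={"name": "alice2"}),
    ChangeEvent(lsn=4, op="delete", key=2),
]

replica = {}
for event in log:  # events are applied in log order
    apply_event(replica, event)

print(replica)
```

Only four events are processed regardless of table size, and the delete of row 2 is recorded explicitly rather than inferred from a missing row.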
How Planasonix Implements CDC
Planasonix's CDC engine is built on a lightweight Go-based connector framework:
- Log reader — a per-database module that connects to the source's replication slot (Postgres) or binlog stream (MySQL) and emits a normalized change-event stream.
- Schema tracker — automatically detects DDL changes (column adds, type changes) and propagates them to the destination without manual intervention.
- Exactly-once delivery — each change event carries a monotonically increasing LSN/offset that is checkpointed in the destination, preventing duplicates even across restarts.
- Backfill orchestrator — for initial loads, Planasonix snapshots the table in parallel chunks and then seamlessly switches to the live log stream once the snapshot is consistent.
Getting Started
Enable CDC for any supported source in three steps:
- Toggle Real-time sync in the pipeline settings.
- Confirm that the source database has log-based replication enabled (for example, wal_level = logical in Postgres, or typically binlog_format = ROW in MySQL).
- Hit Start and Planasonix handles the rest.
For a full walkthrough, see our CDC setup guide.