Three Replication Topologies
Why Replication Exists
Replication solves three distinct problems simultaneously: fault tolerance (if one node fails, others serve requests), read scalability (multiple nodes can serve reads), and reduced latency (serve reads from nodes geographically close to users). The topology you choose determines which of these you optimize for — and which you sacrifice.
Single-Leader (Primary-Replica)
One node — the primary or leader — is designated to accept all write operations. Replicas (followers) maintain a copy of the data and serve read-only queries. The primary streams its transaction log to replicas, which apply changes asynchronously.
- Synchronous replication: Primary waits for at least one replica to confirm receipt before acknowledging the write. Strong durability; slightly higher write latency.
- Asynchronous replication: Primary acknowledges the write immediately without waiting for replicas. Lower write latency; replication lag introduces a staleness window.
- Use when: Write volume fits comfortably on one node; read-heavy workload; strong consistency requirements. Examples: PostgreSQL with streaming replicas, MySQL with binlog replication.
Multi-Leader (Active-Active)
Multiple nodes simultaneously accept writes and replicate changes to each other. Common in multi-datacenter deployments where each datacenter has its own local leader, eliminating the cross-datacenter write round trip of the single-leader topology.
- Key advantage: Writes land in the nearest datacenter — no cross-region write latency.
- Key challenge: Write conflicts. When two clients update the same record on different leaders simultaneously, those writes are incompatible. The system must have a defined conflict resolution strategy.
Leaderless (Dynamo-Style)
There is no designated primary. Writes are sent to multiple nodes simultaneously and are considered successful once W nodes acknowledge. Reads are sent to R nodes and the client takes the most recent value (using version numbers or timestamps).
- Tunable parameters: N = total replicas, W = write quorum, R = read quorum. When W + R > N, you achieve strong consistency because any read set must overlap with any write set.
- Use when: Maximum write availability is required; multi-datacenter; eventual consistency is acceptable. Examples: Cassandra, DynamoDB, Riak.