In distributed databases, replica consistency ensures that multiple copies (replicas) of data remain synchronized, providing clients with a uniform view regardless of which replica they access.
Depending on the consistency model a database adopts, a consensus algorithm such as Paxos or Raft may be needed to coordinate the replicas so that they agree on a single state.
Consistency models and consensus
In a distributed system, strong consistency means that all replicas reflect the same value after a write. This requires coordination among replicas to agree on the value and its order in the sequence of writes.
In eventually consistent models, updates propagate asynchronously. While consensus isn’t required for every update, it might be used for specific cases, such as:
- Deciding the order of conflicting updates (e.g., merging divergent states in causal consistency).
- Electing a leader or coordinating key events in eventual consistency (as in DynamoDB).
- Applying schema updates (Cassandra).
Consensus is generally not required when implementing causal consistency. Instead, systems use mechanisms like Lamport clocks or vector clocks to capture the causal relationships between operations.
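To make the vector-clock idea concrete, here is a minimal sketch of how two replicas might compare clocks to tell whether one update causally precedes another or whether the two are concurrent. The function names (`increment`, `merge`, `compare`) and the replica identifiers are illustrative assumptions, not the API of any particular database.

```python
from typing import Dict

# A vector clock maps a replica id to the number of events seen from it.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, applied when a replica receives a remote clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the causal relationship between two clocks."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # a happened after b
    return "concurrent"      # neither dominates: a conflict to resolve


# Hypothetical replicas r1, r2, r3
a = increment({}, "r1")      # write on r1
b = increment(a, "r2")       # r2 saw a, then wrote
c = increment(a, "r3")       # r3 saw a, then wrote independently of r2
print(compare(a, b))         # before
print(compare(b, c))         # concurrent
```

No replica has to ask the others for agreement here: each one derives the ordering locally from the clocks attached to the updates. A "concurrent" result still needs some merge policy.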
| Consistency Model | Role of Consensus |
|---|---|
| Strong Consistency | Required to maintain linearizability and ensure agreement among replicas. |
| Eventual Consistency | Used selectively, e.g., for leader election or conflict resolution in specific cases. |
| Causal Consistency | Rarely needed, as causality is tracked with logical timestamps (Lamport clocks) or vector clocks. |
Strongly Consistent Systems
Consensus is critical in strongly consistent systems to guarantee:
- linearizability, i.e. all operations appear to execute atomically in a single global order
- fault tolerance, e.g. surviving a replica crash or a network partition

For example, if the master node crashes after it has started replicating a write, the remaining replicas need to agree on whether to include that write in their future reads.
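The crash scenario above can be illustrated with a deliberately simplified majority rule: a write is kept only if a majority of the full cluster holds it. This is a sketch under that assumption, not an implementation of Paxos or Raft, which also track terms or ballot numbers and handle partially replicated writes more subtly. The names `majority`, `surviving_writes`, and the replica and write identifiers are hypothetical.

```python
from typing import Dict, Set


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def surviving_writes(replica_logs: Dict[str, Set[str]], cluster_size: int) -> Set[str]:
    """Return the writes held by at least a majority of the full cluster."""
    counts: Dict[str, int] = {}
    for log in replica_logs.values():
        for write_id in log:
            counts[write_id] = counts.get(write_id, 0) + 1
    needed = majority(cluster_size)
    return {w for w, n in counts.items() if n >= needed}


# Hypothetical 5-node cluster: the master crashed after sending "w2"
# to only one of the four remaining replicas.
reachable = {
    "replica-1": {"w1", "w2"},
    "replica-2": {"w1"},
    "replica-3": {"w1"},
    "replica-4": {"w1"},
}
print(surviving_writes(reachable, cluster_size=5))  # {'w1'}
```

Counting against the full cluster size rather than the number of reachable nodes matters here. It prevents two partitioned groups from each reaching a different decision about the same write.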