Consistency Models: From Strong to Eventual
Understanding the spectrum of consistency models in distributed systems, from linearizability and sequential consistency to causal and eventual consistency, and the trade-offs each model makes.
Terminology
- Consistency model: a contract between a distributed system and its clients that defines which values a read operation may return after a write, and in what order operations appear to execute
- Linearizability: the strongest consistency model; every operation appears to take effect atomically at some point between its invocation and its response, and all operations are ordered consistently with real time
- Sequential consistency: all operations appear to execute in some sequential order, and each process's operations appear in the order they were issued, but the global order need not respect real-time ordering
- Causal consistency: operations that are causally related (one depends on the result of another) are seen in the same order by all nodes; concurrent (causally unrelated) operations may be seen in different orders
- Eventual consistency: if no new writes occur, all replicas will eventually converge to the same value; there is no bound on how long convergence takes
- Strong consistency: an informal term often used interchangeably with linearizability; every read returns the most recent write
- Read-your-writes consistency: after a client writes a value, that same client's subsequent reads will always see the written value or something newer
- Monotonic reads: if a client reads a value at time $t_1$, any subsequent read at $t_2 > t_1$ will return the same value or a newer one, never an older one
- Monotonic writes: writes by a single client are applied in the order they were issued; a later write is never applied before an earlier write from the same client
- Happens-before: a partial ordering of events in a distributed system; event $A$ happens before event $B$ if $A$ could have influenced $B$ (same process order, or message send before receive)
- Concurrent events: two events are concurrent if neither happens before the other; the system cannot determine which occurred first
- Stale read: a read that returns an outdated value because the replica has not yet received the latest write
- Total order: an ordering where every pair of operations is comparable; one definitively comes before the other
- Partial order: an ordering where some pairs of operations are comparable (causally related) and others are not (concurrent)
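The happens-before and concurrency relations defined above can be checked mechanically when events carry vector clocks; a minimal Python sketch (function names are illustrative, not from any library):

```python
def happens_before(a, b):
    """True if the event with vector clock a happened before the event with
    clock b: a is component-wise <= b, and a != b."""
    keys = set(a) | set(b)
    le = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    return le and a != b

def concurrent(a, b):
    """Two events are concurrent if neither happened before the other."""
    return not happens_before(a, b) and not happens_before(b, a)

# A wrote at clock {A: 1}; B read that value and then wrote, producing {A: 1, B: 1}.
assert happens_before({"A": 1}, {"A": 1, "B": 1})
# Independent writes on A and B, neither aware of the other, are concurrent.
assert concurrent({"A": 2, "B": 0}, {"A": 1, "B": 1})
```

Because `happens_before` is only a partial order, `concurrent` is exactly the set of event pairs the system cannot order, which is why weaker models must tolerate them.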
What & Why
When data is replicated across multiple nodes, a fundamental question arises: what does a client see when it reads? If a write just happened on node A, does a read on node B see it immediately, eventually, or never until some condition is met?
Consistency models answer this question by defining the rules of the game. They are a contract: the system promises certain behaviors, and the application can rely on those promises. Stronger models (like linearizability) are easier to program against because they behave like a single-node system. Weaker models (like eventual consistency) are harder to reason about but allow the system to be faster and more available.
The trade-off is real. Linearizability requires coordination between nodes on every operation, which adds latency and reduces availability during network partitions. Eventual consistency requires no coordination, so it is fast and always available, but the application must handle the possibility of reading stale or conflicting data.
Most real systems do not pick one extreme. They offer tunable consistency or use different models for different operations. Understanding the full spectrum lets you choose the right model for each part of your application.
How It Works
The Consistency Spectrum
Linearizability
The gold standard. Every operation appears to happen at a single, atomic point in time between when the client sends the request and receives the response. If client A's write completes before client B's read begins (in real time), client B is guaranteed to see client A's write.
Linearizability requires coordination. Typically a consensus protocol (Raft, Paxos) enforces the ordering, sometimes aided by tightly synchronized clocks (Google's TrueTime). This adds at least one round trip of latency to every operation and, per the CAP theorem, makes at least some nodes unavailable during network partitions.
Sequential Consistency
Slightly weaker than linearizability. All operations appear to execute in some total order, and each client's operations appear in the order they were issued. However, the global order does not need to match real-time ordering. Two clients might disagree about which of two concurrent writes happened first, as long as they both see the same order.
The practical difference: with linearizability, if you call your friend on the phone and say "I just updated the record," your friend is guaranteed to see the update. With sequential consistency, there is a window where your friend might not see it yet, even though your write has completed.
Causal Consistency
Only causally related operations must be seen in the same order by all nodes. If operation A causes operation B (e.g., A writes a value that B reads), then every node sees A before B. But if A and B are concurrent (neither caused the other), different nodes may see them in different orders.
Causal consistency is attractive because it is the strongest model that does not require synchronous coordination. It can be implemented with vector clocks or dependency tracking, and it remains available during network partitions.
Eventual Consistency
The weakest useful model. The system only guarantees that if writes stop, all replicas will eventually converge to the same state. There is no bound on how long this takes, and during convergence, different clients may see different values.
Eventual consistency is the default for many distributed databases (Cassandra, DynamoDB with default settings) because it allows maximum availability and minimum latency. The application must be designed to tolerate stale reads and potential conflicts.
Session Guarantees
Between causal and eventual consistency, several session-level guarantees help applications cope with weak consistency:
Read-your-writes: After you write a value, your subsequent reads always see it. Other clients may not see it yet, but you always see your own writes.
Monotonic reads: Once you read a value, you never see an older value in a subsequent read. Your view of the data only moves forward.
Monotonic writes: Your writes are applied in the order you issued them. If you write A then B, no replica applies B before A.
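These session guarantees can be enforced client-side by tracking timestamps in the session; a minimal sketch, assuming a toy `Replica` that stores a timestamped value (all class and method names are illustrative):

```python
class Replica:
    """Toy replica: current value plus the timestamp of the write that set it."""
    def __init__(self):
        self.value, self.ts = None, 0

    def write(self, value, ts):
        if ts > self.ts:               # ignore out-of-date replication
            self.value, self.ts = value, ts

    def read(self):
        return self.value, self.ts


class SessionClient:
    """Enforces read-your-writes and monotonic reads over weak replicas."""
    def __init__(self, primary, replicas):
        self.primary = primary         # node that accepts this client's writes
        self.replicas = replicas       # nodes this client may read from
        self.last_write_ts = 0         # highest timestamp this client wrote
        self.last_read_ts = 0          # highest timestamp this client saw

    def write(self, value, ts):
        self.primary.write(value, ts)  # replication elsewhere is asynchronous
        self.last_write_ts = max(self.last_write_ts, ts)

    def read(self):
        # A replica is acceptable only if it is at least as fresh as
        # everything this session has already written or read.
        floor = max(self.last_write_ts, self.last_read_ts)
        for r in self.replicas:
            value, ts = r.read()
            if ts >= floor:
                self.last_read_ts = ts
                return value
        raise RuntimeError("no replica fresh enough; retry or read the primary")
```

The `floor` check gives read-your-writes (never below `last_write_ts`) and monotonic reads (never below `last_read_ts`) in one comparison; stale replicas are simply skipped.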
Complexity Analysis
| Model | Coordination Cost | Availability | Read Latency |
|---|---|---|---|
| Linearizable | $O(n)$ messages per op | Unavailable during partition | $O(\text{RTT})$ (quorum contacted in parallel) |
| Sequential | $O(n)$ messages per op | Unavailable during partition | $O(\text{RTT})$ |
| Causal | $O(1)$ metadata per op | Available during partition | $O(1)$ local |
| Read-your-writes | $O(1)$ session tracking | Available during partition | $O(1)$ with sticky sessions |
| Eventual | None (no coordination) | Always available | $O(1)$ local |
For linearizability over $n$ replicas with majority quorums, each read or write needs at least $M_{\min} = 2(\lfloor n/2 \rfloor + 1)$ messages: one request to, and one response from, each member of a majority quorum. In practice, Raft-based systems require the leader to contact a majority for every write.
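Quorum reads stay consistent with quorum writes because any read quorum of size $R$ and write quorum of size $W$ with $R + W > n$ must overlap in at least one node; a small sketch of the arithmetic (function names are illustrative):

```python
def majority(n):
    """Smallest majority quorum among n replicas."""
    return n // 2 + 1

def quorums_intersect(n, r, w):
    """Read quorums of size r and write quorums of size w always overlap
    iff r + w > n (two disjoint sets of those sizes cannot fit in n nodes)."""
    return r + w > n

def min_messages_per_op(n):
    """One request plus one response per member of a majority quorum."""
    return 2 * majority(n)

# Five replicas: majority is 3, and 3 + 3 > 5, so quorum reads see quorum writes.
assert majority(5) == 3 and quorums_intersect(5, 3, 3)
# R = W = 2 with n = 5 can miss: the two quorums may be disjoint (stale reads).
assert not quorums_intersect(5, 2, 2)
```

This is the condition behind Cassandra-style tunable consistency: choosing `QUORUM` for both reads and writes satisfies $R + W > n$, while `ONE`/`ONE` does not.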
For causal consistency, each operation carries a vector clock with one entry per node, i.e., $O(n)$ metadata for $n$ nodes. This can be compressed using techniques like dotted version vectors, reducing the practical overhead to $O(k)$, where $k$ is the number of concurrent writers.
For eventual consistency with maximum replication lag $\delta$, replicas converge within roughly $\delta$ of the last write; under continuous writes there is no quiescent point, so any replica may trail the freshest one by up to $\delta$ at all times.
Implementation
ALGORITHM LinearizableRead(key, replicas, quorum)
INPUT: key: data key, replicas: list of N nodes, quorum: majority count
OUTPUT: value
BEGIN
// Phase 1: Read from quorum
responses <- empty list
FOR EACH replica IN replicas DO
response <- SEND_READ(replica, key)
IF response is successful THEN
APPEND response TO responses
END IF
IF LENGTH(responses) >= quorum THEN
BREAK
END IF
END FOR
IF LENGTH(responses) < quorum THEN
RETURN error("quorum not reached")
END IF
latest <- response IN responses WITH MAX(timestamp)
// Phase 2: Write-back to ensure linearizability
// (ensures a subsequent read from any quorum sees this value)
writeBackCount <- 0
FOR EACH replica IN replicas DO
response <- SEND_WRITE(replica, key, latest.value, latest.timestamp)
IF response is successful THEN
writeBackCount <- writeBackCount + 1
END IF
IF writeBackCount >= quorum THEN
BREAK
END IF
END FOR
RETURN latest.value
END
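The pseudocode above can be rendered as a runnable sketch against in-memory replicas (the `KVReplica` class and its failure handling are illustrative assumptions, not a real RPC layer):

```python
class KVReplica:
    """In-memory replica: key -> (value, timestamp), with a liveness flag."""
    def __init__(self):
        self.data = {}
        self.alive = True

    def read(self, key):
        if not self.alive:
            raise ConnectionError("replica down")
        return self.data.get(key, (None, 0))

    def write(self, key, value, ts):
        if not self.alive:
            raise ConnectionError("replica down")
        if ts >= self.data.get(key, (None, 0))[1]:
            self.data[key] = (value, ts)


def linearizable_read(key, replicas):
    quorum = len(replicas) // 2 + 1

    # Phase 1: read from a quorum and pick the highest-timestamped value.
    responses = []
    for r in replicas:
        try:
            responses.append(r.read(key))
        except ConnectionError:
            continue
        if len(responses) >= quorum:
            break
    if len(responses) < quorum:
        raise RuntimeError("quorum not reached")
    value, ts = max(responses, key=lambda vt: vt[1])

    # Phase 2: write the winner back to a quorum so that any later
    # quorum read is guaranteed to observe it (the write-back phase).
    acks = 0
    for r in replicas:
        try:
            r.write(key, value, ts)
            acks += 1
        except ConnectionError:
            continue
        if acks >= quorum:
            break
    if acks < quorum:
        raise RuntimeError("write-back quorum not reached")
    return value
```

The write-back phase is what distinguishes this from a plain quorum read: skipping it lets two sequential reads observe the value flickering backward, which violates linearizability.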
ALGORITHM CausalConsistencyWrite(node, key, value, causalContext)
INPUT: node: this node, key: data key, value: data to write, causalContext: vector clock from client
OUTPUT: updated vector clock
BEGIN
// Merge client's causal context with node's clock
FOR EACH nodeID IN causalContext DO
node.vectorClock[nodeID] <- MAX(node.vectorClock[nodeID], causalContext[nodeID])
END FOR
// Increment this node's entry
node.vectorClock[node.id] <- node.vectorClock[node.id] + 1
// Store the value with the vector clock
node.store.put(key, value, COPY(node.vectorClock))
// Replicate asynchronously to peers
FOR EACH peer IN node.peers DO
ASYNC_SEND_REPLICATE(peer, key, value, COPY(node.vectorClock))
END FOR
RETURN COPY(node.vectorClock)
END
ALGORITHM CausalConsistencyRead(node, key, causalContext)
INPUT: node: this node, key: data key, causalContext: client's vector clock
OUTPUT: value and updated context, or wait
BEGIN
(storedValue, storedClock) <- node.store.get(key)
// Check if this node has seen all writes the client depends on
FOR EACH nodeID IN causalContext DO
IF node.vectorClock[nodeID] < causalContext[nodeID] THEN
// This node is behind; wait for replication or redirect
WAIT_UNTIL node.vectorClock[nodeID] >= causalContext[nodeID]
END IF
END FOR
RETURN (storedValue, storedClock)
END
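The causal write and read paths above can be condensed into a runnable single-process sketch (class and method names are illustrative; replication is simulated rather than sent over a network):

```python
class CausalNode:
    """One replica in a causally consistent store, identified by an index."""
    def __init__(self, node_id, n_nodes):
        self.id = node_id
        self.clock = [0] * n_nodes       # vector clock, one entry per node
        self.store = {}                  # key -> (value, clock at write time)

    def write(self, key, value, context):
        # Merge the client's causal context, then advance our own entry.
        self.clock = [max(a, b) for a, b in zip(self.clock, context)]
        self.clock[self.id] += 1
        self.store[key] = (value, list(self.clock))
        return list(self.clock)          # new causal context for the client

    def can_serve(self, context):
        # True iff this node has seen every write the client depends on.
        return all(a >= b for a, b in zip(self.clock, context))

    def read(self, key, context):
        if not self.can_serve(context):
            raise RuntimeError("replica behind client's context; wait or redirect")
        return self.store.get(key, (None, [0] * len(self.clock)))
```

A client carries the returned context from each write into its next read, so a lagging replica either waits (as in the pseudocode's WAIT_UNTIL) or rejects the read rather than violate causality.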
ALGORITHM ReadYourWritesSession(client, replicas, key)
INPUT: client: client with session state, replicas: list of nodes, key: data key
OUTPUT: value
BEGIN
  lastWriteTimestamp <- client.session.lastWriteTimestamp[key]  // 0 if this client has never written key
FOR EACH replica IN replicas DO
response <- SEND_READ(replica, key)
IF response.timestamp >= lastWriteTimestamp THEN
RETURN response.value
END IF
END FOR
// Fallback: read from the node that accepted the write
response <- SEND_READ(client.session.lastWriteNode, key)
RETURN response.value
END
ALGORITHM EventualConvergenceCheck(replicas, key)
INPUT: replicas: list of all replica nodes, key: data key
OUTPUT: converged (boolean)
BEGIN
values <- empty set
FOR EACH replica IN replicas DO
response <- SEND_READ(replica, key)
ADD response.value TO values
END FOR
IF SIZE(values) = 1 THEN
RETURN TRUE // all replicas agree
ELSE
RETURN FALSE // still divergent
END IF
END
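A convergence check alone does not make replicas converge; they also need a deterministic conflict rule so every node picks the same winner. A common choice is last-writer-wins with node ID as tiebreaker; a minimal anti-entropy sketch (illustrative, not any particular database's protocol):

```python
def lww_merge(a, b):
    """Pick the version with the higher (timestamp, node_id) pair.
    Deterministic, so every replica converges on the same winner."""
    return a if (a["ts"], a["node"]) >= (b["ts"], b["node"]) else b

def anti_entropy(replicas):
    """One full gossip round: fold every replica's version into one winner,
    then install that winner everywhere."""
    merged = replicas[0]
    for r in replicas[1:]:
        merged = lww_merge(merged, r)
    return [dict(merged) for _ in replicas]

# Divergent versions: two share a timestamp, so the node ID breaks the tie.
versions = [
    {"value": "x", "ts": 5, "node": 1},
    {"value": "y", "ts": 5, "node": 2},
    {"value": "z", "ts": 3, "node": 3},
]
converged = anti_entropy(versions)
assert all(v == converged[0] for v in converged)   # a single value everywhere
assert converged[0]["value"] == "y"                # ts 5, highest node ID wins
```

LWW is simple but lossy: the losing concurrent write ("x" above) is silently discarded, which is why systems that must keep both sides use version vectors or CRDTs instead.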
Real-World Applications
- Google Spanner: provides linearizability using TrueTime (GPS and atomic clocks) to assign globally consistent timestamps, enabling strong consistency across datacenters with bounded clock uncertainty
- Apache Cassandra: defaults to eventual consistency but supports tunable consistency levels per query (ONE, QUORUM, ALL), letting applications choose the trade-off per operation
- Amazon DynamoDB: offers eventual consistency by default for reads, with an option for strongly consistent reads that route to the leader replica at higher latency
- MongoDB causal sessions: provides causal consistency within a client session, ensuring read-your-writes, monotonic reads, and monotonic writes without requiring global coordination
- CockroachDB: implements serializable isolation using hybrid logical clocks and Raft-based replication; operations on a single key are linearizable via Raft, though strict serializability (serializability plus real-time ordering, the transactional analogue of linearizability) is not guaranteed across all transactions
- Redis: single-node Redis is linearizable; Redis Cluster with asynchronous replication provides eventual consistency, with the risk of losing acknowledged writes if the primary fails before replication
Key Takeaways
- Consistency models define what values a read may return after a write; stronger models are easier to program against but require more coordination
- Linearizability is the strongest model: operations appear atomic and respect real-time ordering, but it requires consensus and is unavailable during partitions
- Sequential consistency relaxes real-time ordering but still provides a total order; causal consistency only orders causally related operations
- Eventual consistency provides no ordering guarantees during operation but promises convergence when writes stop; it enables maximum availability and minimum latency
- Session guarantees (read-your-writes, monotonic reads) provide useful middle ground between causal and eventual consistency
- Most production systems offer tunable consistency, letting applications choose the right model per operation based on their specific requirements