Linearizability

Prerequisites

Intuition

Imagine a shared bank account with one balance. Two people try to withdraw at the same time. A linearizable system behaves as if one withdrawal happened first, then the other saw the updated balance. Even though both requests were in flight simultaneously, the system never lets both see the old balance and both succeed.

Linearizability is the strongest consistency guarantee for concurrent systems. It means: the system behaves as though there is a single copy of the data, and every operation takes effect atomically at some instant between when it was invoked and when it returned.

Formal definition

A system is linearizable (Herlihy & Wing, 1990) if for every concurrent execution, there exists a sequential ordering of all operations such that:

  1. Atomicity: each operation appears to take effect at a single indivisible point in time — its linearization point.
  2. Real-time ordering: if operation A’s response comes before operation B’s invocation, then A must appear before B in the sequential order.
  3. Sequential specification: the resulting sequential history is valid according to the object’s specification (e.g., a register returns the last value written to it).

The key subtlety is what happens with concurrent operations — operations whose invocation and response intervals overlap in time. Linearizability allows these to be ordered in either direction, because neither provably happened “before” the other.

Client 1:  |--- write(42) ---|
Client 2:       |--- read() -> ? ---|

The read overlaps the write. A linearizable register can return either the old value or 42 — both are valid linearizations. But if the read starts after the write completes:

Client 1:  |--- write(42) ---|
Client 2:                         |--- read() -> must return 42 ---|

The read must return 42. The write’s response preceded the read’s invocation, so the real-time ordering constraint applies.

The “real-time before” problem

The formal definition talks about real-time ordering, but in a distributed system there is no global clock. Two requests arriving from different clients on different machines have no observable real-time ordering unless one causally depends on the other.

The only ways to establish a “before” relationship are:

  • Same client, sequential calls: the client sends A, waits for the response, then sends B. The client serialized them.
  • Causal chain: Client 1 completes A, communicates the result to Client 2, and Client 2 then sends B.
  • Synchronized clocks: systems like Google Spanner use TrueTime (GPS + atomic clocks) to bound clock uncertainty and establish ordering across nodes.

Two independent users submitting requests at roughly the same time? There is no “before.” Network jitter, load balancer routing, and thread scheduling all destroy any notion of ordering. And that’s fine — linearizability explicitly permits either ordering for truly concurrent operations.

How to achieve it

Single-node: it’s just a transaction

A single-primary database (MySQL/InnoDB, PostgreSQL) where all reads and writes go to the primary is linearizable for single-object operations by default. SELECT ... FOR UPDATE within a transaction gives you:

  • Mutual exclusion: the row-level lock blocks concurrent access
  • Atomicity: check-then-act happens within one transaction
  • Durability and visibility: after commit, every subsequent operation sees the updated state

No consensus protocol, no distributed coordination, no clock synchronization. The database engine’s concurrency control is sufficient.

BEGIN;
SELECT balance FROM account WHERE id = ? FOR UPDATE;
-- application checks: balance >= withdrawal_amount
UPDATE account SET balance = balance - ? WHERE id = ?;
COMMIT;

The second concurrent transaction blocks at FOR UPDATE until the first commits, then sees the correct balance. This is linearizable.

Multi-node: you need consensus

On a single node, a mutex serializes operations. In a distributed system with replicated state, you need all nodes to agree on the same order of operations. A consensus protocol (like Paxos or Raft) achieves this by electing a single leader that serializes all writes. The leader is effectively the distributed equivalent of the mutex holder — all operations pass through it, and it decides the order.

The mechanism:

  1. A client sends a write to the leader
  2. The leader assigns the write a position in its log (this is the linearization point)
  3. The leader replicates the log entry to a majority of followers (quorum)
  4. Once a quorum acknowledges, the leader commits and responds to the client
  5. Any subsequent read through the leader sees the committed write

The leader serializes writes the same way FOR UPDATE does on a single node — it’s the single point of ordering. The quorum replication ensures that if the leader crashes, a new leader can be elected without losing committed writes.

Reads must also go through the leader

Consensus alone doesn’t guarantee linearizable reads. If a client reads from a follower, it may see stale data (the follower might not have applied the latest committed entries). To get linearizable reads, you need one of:

  • Leader reads: all reads go through the leader, which confirms it’s still the leader before responding
  • Read leases: the leader holds a time-bounded lease; as long as the lease is valid, it can serve reads without re-confirming
  • Quorum reads: read from a majority of nodes and take the most recent value
SystemMechanismTrade-off
Google SpannerPaxos + TrueTime (GPS/atomic clocks)Global linearizability, but requires specialized hardware for TrueTime
etcdRaft consensusLinearizable reads and writes, leader-based
ZooKeeperZAB (Zookeeper Atomic Broadcast)Writes are linearizable, reads are not by default (must use sync + read)
CockroachDBRaft per rangeLinearizable within each range

The cost is latency — every write must wait for a quorum of replicas to acknowledge before returning. By the CAP theorem, a linearizable system must sacrifice availability during network partitions.

Multiple services: linearizability doesn’t compose across databases

Linearizability is composable for objects within the same system — but not across independent systems with separate databases. Each service is linearizable in isolation, but the combination is not.

Example: an inventory service and a payment service.

The inventory service uses MySQL with FOR UPDATE — linearizable. The payment service uses its own MySQL with FOR UPDATE — also linearizable. A checkout operation needs to atomically reserve inventory AND charge payment:

Checkout request arrives:

1. Inventory service: SELECT ... FOR UPDATE → reserve last item → COMMIT  ✓
2. Payment service:   SELECT ... FOR UPDATE → charge card → FAILS (declined)

Now: inventory is reserved, but payment failed. The item is locked
and no one else can buy it. The system is in an inconsistent state.

Each service individually is linearizable — no concurrent operation can see stale state within that service. But the cross-service operation is not atomic. Between steps 1 and 2, another request could observe the inventory as reserved but the payment as absent.

The fundamental problem: there is no shared lock across the two databases. FOR UPDATE on Service A’s MySQL says nothing about Service B’s MySQL. Each mutex operates in its own scope.

None of these solutions give you cross-service linearizability — simultaneous visibility of both changes to all observers. The only way to get linearizability is a single lock scope (a single database). Everything else is a trade-off between how bad the inconsistency window is and how much latency/coupling you accept:

ApproachHow it worksWhat it guaranteesWhat it does NOT guarantee
Two-phase commit (2PC)A coordinator ensures all participants commit or all rollback (see below)Atomic outcome — you’ll never end up permanently with inventory reserved and payment absentSimultaneous visibility — there is a brief window where one DB has committed and the other hasn’t
Saga patternEach service commits independently; if a later step fails, earlier steps are compensated (e.g., release the reservation)Eventual correctness — the system converges to a consistent stateAnything about the intermediate states — they are fully visible to concurrent readers
Single databasePut both resources in the same database; use a single transactionLinearizability — one lock, one commit, atomic visibilityIndependent scaling — the services are coupled at the data layer
Outbox + single writerOne service writes its local change + an event to an outbox in the same transaction; downstream services consume the eventAtomic local-change-plus-event-publicationTimeliness — downstream services see the change after consuming the event, not at the moment of commit

What 2PC actually provides (and what it doesn’t)

Sequence diagram showing the two phases. Red bars on the DB lifelines represent row-level locks. During the prepare phase, locks are held and nothing is committed — no external reader can see any change. The problem is what happens during the commit phase.

The prepare phase works as expected. The coordinator asks each participant to prepare: acquire the lock, validate the operation, write a durable prepare record to the WAL, but do NOT commit. During this phase, external readers either see the old state or block on the lock. No partial state is visible. If any participant fails to prepare, the coordinator sends ROLLBACK to all — locks released, nothing changed.

The commit phase is where linearizability breaks down. Once all participants respond “prepared,” the coordinator sends COMMIT to each participant. But those COMMIT messages arrive at different times:

Coordinator sends COMMIT to Inventory DB  ──→  Inventory commits, releases lock
                                                 ↓
                                                 Reader sees: inventory reserved ✓
Coordinator sends COMMIT to Payment DB    ──→  (still in transit...)
                                                 ↓
                                                 Reader sees: no payment yet ✗

Between the first participant committing and the last participant committing, a concurrent reader CAN observe the partial state: inventory reserved, payment absent. This window is brief (typically milliseconds), but it exists. The system is not linearizable across the two databases during this window.

What 2PC does guarantee: the outcome is atomic. You will never end up permanently in a state where inventory is reserved and payment failed. Either both will commit (because the coordinator decided COMMIT and will retry until all participants acknowledge) or both will rollback. But “both will eventually be consistent” is not the same as “both are always consistent” — which is what linearizability requires.

2PC solves a different problem than linearizability

2PC answers: “will the system end up in a consistent state?” (yes — all commit or all abort). Linearizability answers: “can any observer ever see an inconsistent state?” (with 2PC — yes, briefly, during the commit phase).

For many applications the brief inconsistency window is acceptable. A user checking their order status might see “reserved” for a few milliseconds before “paid” appears. But for systems that require strict linearizability (e.g., a trading engine that must never show a partial fill), 2PC is insufficient — you need a single database.

Coordinator crash — the blocking problem

If the coordinator crashes after participants have responded “prepared” but before it sends COMMIT or ROLLBACK, every participant is stuck: they hold locks and cannot unilaterally decide to commit (the other participant might have voted to abort) or abort (the coordinator might have decided to commit). The participants must wait for the coordinator to recover. Until it does, those rows remain locked — blocking all other transactions that touch them. This can last seconds, minutes, or indefinitely if the coordinator’s disk is lost. This is why 2PC is called a “blocking protocol” and why it is unsuitable for systems that need high availability.

This is why microservice architectures fundamentally trade linearizability for availability and independence. Each service is internally consistent, but cross-service operations cannot be made linearizable without putting the data in the same lock scope. There is no protocol that makes two independent databases behave as one — you either accept a window of inconsistency (2PC, sagas, outbox) or you put the data together (single database).

When you actually need it

Linearizability matters when two concurrent operations on the same resource can produce an incorrect outcome if they both see stale state. The classic examples:

  • Resource reservation: two requests try to reserve the last available unit (budget, seat, inventory). Without mutual exclusion, both succeed and the resource is overcommitted.
  • Leader election: two nodes both believe they are the leader. Without linearizable compare-and-swap, split-brain occurs.
  • Unique constraint enforcement: two users try to register the same username concurrently.

In all these cases, the pattern is the same: read current state, decide, write new state — and the decision is only correct if no one else changed the state between the read and the write. This is the TOCTOU (time-of-check-to-time-of-use) problem, and linearizability eliminates it.

When you don’t need distributed linearizability

If your system has a single primary database and all reads/writes go to that primary, you already have linearizability through database transactions. Invoking distributed systems terminology (consensus, linearizability, CAP) is technically correct but adds no value — the guarantee comes from FOR UPDATE, not from Raft.

The situations where single-node linearizability breaks down:

  • Reading from replicas: a replica may not have applied the latest committed transaction. A write on the primary followed by a read on a replica can return stale data, violating the real-time ordering property. Semi-synchronous replication (where the primary waits for the replica to receive the transaction log, but not apply it) does not fix this.
  • Derived data from eventually consistent sources: if your decision depends on data from an ETL pipeline or data warehouse with inherent lag, the composite operation is not linearizable even if each individual data source is. The decision is made against stale input.
  • Multiple databases without coordination: if an operation must atomically update two different databases, you need distributed transactions (2PC) or a saga pattern to maintain consistency across them.

Composability

A rare and powerful property: linearizability is composable. If every individual object in a system is linearizable, the entire system is linearizable. No other consistency model has this property.

This means you can reason about each resource (each account, each inventory item, each budget) independently. If each one is protected by its own lock or CAS, the system as a whole is correct — you don’t need a global lock or a global transaction across all resources.

Linearizability vs mutual exclusion

Linearizability is a correctness property — it describes what the system looks like from the outside. A mutex (or SELECT ... FOR UPDATE) is an implementation mechanism — one way to achieve it.

ConceptLevelWhat it answers
LinearizabilitySpecification”Does the system behave as if operations happened one at a time, respecting real-time?”
Mutual exclusion (mutex, row lock)Implementation”How do I prevent two operations from interleaving on this resource?”

A mutex always gives you linearizability (assuming correct usage). But linearizability can also be achieved without a mutex — compare-and-swap (CAS) loops, consensus protocols, and lock-free data structures all provide linearizable operations without holding a lock.

The practical intuition is close though: linearizability means that when you need to perform a read-decide-write sequence on shared state with invariants (e.g., “balance must not go negative”), no one else can perform a conflicting operation between your read and your write. Whether that’s enforced by a mutex, a CAS, or a Raft leader is an implementation choice.

Linearizability vs total ordering

Linearizability guarantees atomicity — no two operations see the same stale state. It does NOT guarantee that operations are processed in the order they were submitted by clients.

If multiple threads or processes independently submit requests, the order in which those requests reach the serialization point (the lock, the Raft leader, the CAS) is non-deterministic. Thread scheduling, network latency, and GC pauses all affect arrival order. The system is still linearizable — whichever request wins the lock first, the next one sees updated state — but you don’t control which one wins.

This matters when the business logic requires FIFO processing (e.g., “process operations in submission order”). Linearizability doesn’t give you that. To get both linearizability and submission ordering, you need a single-writer pattern:

Multiple producers → Ordered queue → Single consumer → Database (FOR UPDATE)

The single consumer dequeues in order and writes sequentially. You get linearizability from the database lock AND total ordering from the single writer.

Without the single writer, multiple threads race to the database and whichever acquires the lock first wins — still linearizable, but not FIFO. For most use cases (concurrent resource reservation, concurrent withdrawals), any valid serialization is acceptable and FIFO is unnecessary. But when operations have causal dependencies (“the budget cut must apply before the hold that depends on the new balance”), submission ordering matters, and a single writer is the simplest way to achieve it.

Linearizability vs other consistency models

The key difference between linearizability and weaker models is whether the system respects real-time ordering — the constraint that if operation A’s response arrives before operation B’s invocation (i.e., A provably finished before B started, as observed by clients), then the system must order A before B.

ModelWhat it guaranteesWhat it allows
LinearizabilityOperations appear sequential AND respect real-time: if A finished before B started, A is ordered firstConcurrent operations (overlapping in time) may be ordered either way
Sequential consistencyOperations appear sequential AND each process’s operations appear in program orderThe global ordering may violate real-time: A can finish before B starts, yet B appears first globally — as long as each individual process’s operations stay in order
Causal consistencyIf A causally precedes B (B read A’s result, or A and B are from the same process), A is ordered firstUnrelated operations across processes may appear in different orders to different observers
Eventual consistencyAll replicas will converge to the same state eventuallyAnything goes in the short term — stale reads, out-of-order observations, temporary divergence

Linearizability vs sequential consistency — the subtle difference

Both produce a valid sequential ordering of all operations. The difference is one constraint: does the ordering have to match wall-clock reality?

Process 1:  |--- write(x=1) ---|
                                    (time passes, write is done)
Process 2:                              |--- read(x) -> 0 ---|
  • Linearizable: illegal. The write completed before the read started, so the read must see x=1.
  • Sequentially consistent: legal. The system can order the read before the write in the global sequence, as long as each process’s own operations remain in order. Process 2 only issued one operation, so there’s nothing to violate.

This sounds academic, but it matters when processes communicate outside the system. If Process 1 writes x=1, then sends a Slack message to Process 2 saying “I wrote x,” and Process 2 reads x and gets 0 — that’s a violation of real-time ordering that linearizability forbids but sequential consistency permits. The system has no knowledge of the Slack message, so it considers the operations concurrent from its perspective.

See also