CRDT

CRDTs: what they buy you (and what they don’t)

A CRDT is a replicated data type whose update/merge rules guarantee convergence under asynchronous replication. Messages can arrive out of order or be duplicated, and replicas can accept local writes while partitioned/offline; once replicas have received the same set of updates, they end in the same state.

The critical separation is: CRDTs solve replica convergence (a distributed-systems problem). They do not automatically solve semantic conflict resolution (a product/domain problem). If you want humans to decide between concurrent values, CRDTs can still be the right tool: you choose a CRDT whose meaning is “preserve the conflict as data” rather than “auto-pick a winner”.

![important] A useful mental model is “two layers”: (1) a convergent replication substrate (CRDT), and (2) an application policy that interprets the convergent state (including possibly showing a conflict UI).

Total order vs partial order (the real fork in the road)

When multiple replicas accept writes concurrently, the system must choose how it will define “what happened”.

A total order means every update is placed into one global sequence: for any two operations op₁ and op₂, either op₁ happens before op₂ or op₂ happens before op₁. This is what you get if you run a single sequencer (or per-key leader) and make everyone submit updates through it, or if you run a consensus protocol (Raft/Paxos-family) to replicate an ordered log. Total order is powerful because it removes ambiguity: replaying the same ordered log yields the same state everywhere, and “last write wins” is literally “the one later in the log”. The cost is coordination: under partitions, someone must stop accepting writes, or you accept divergent histories that must be reconciled later.

A partial order is what you get in the “real” offline/multi-writer setting: some operations are ordered by causality (if op₂ was produced after observing op₁, then op₁ → op₂), but many operations are concurrent (neither observed the other). In this world there is no single correct linear history to replay unless you add coordination after the fact. This is where naive event-log replication breaks: two replicas can legitimately replay the same set of operations in different orders and end up with different results.
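A tiny illustration of that breakage, as a sketch: the `Op` shape and the `replay` helper are hypothetical names invented here, and "last applied wins" is assumed as the naive replay rule.

```typescript
// Naive event-log replay: each op overwrites the field, so the result
// depends on the order in which ops happen to be applied.
type Op = { field: string; value: string };
type State = Record<string, string>;

function replay(ops: Op[]): State {
  const state: State = {};
  for (const op of ops) state[op.field] = op.value; // later op overwrites earlier
  return state;
}

// Two concurrent writes: neither author saw the other's op.
const fromAlice: Op = { field: "status", value: "draft" };
const fromBob: Op = { field: "status", value: "published" };

const replica1 = replay([fromAlice, fromBob]); // Bob's op arrived last
const replica2 = replay([fromBob, fromAlice]); // Alice's op arrived last

console.log(replica1.status, replica2.status); // published draft
```

Both replicas hold the same set of operations, yet they disagree, and nothing in the log says which order is "right".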

CRDTs are a disciplined way to live in the partial-order world: they define a deterministic interpretation of concurrency so replicas converge without first converting the partial order into a total order.

The “two-layer” view: convergence vs meaning

In a multi-writer, offline-capable system, you cannot rely on delivery order to define meaning. There are only two general ways to prevent divergence:

(1) Impose a single total order on updates (consensus / sequencer / single-writer per key) and replay that order everywhere.
(2) Make the datatype’s merge semantics order-independent for concurrency (CRDT approach).

CRDTs pick (2). That means you must decide what “concurrent updates” mean for each field/type. This is exactly why CRDTs are a “datatype thing”: the merge meaning is encoded at the type boundary.

A concrete way to think about it: the replication system delivers a set of updates plus causal relationships (“op₂ saw op₁”), not a single linear history. A CRDT is the rulebook that turns that partially ordered information into a deterministic state.
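One common way to carry “op₂ saw op₁” is a version vector per operation. The sketch below assumes that representation; `compareVV` and the `Relation` labels are illustrative names, not part of any particular library.

```typescript
// Version vector: for each replica, how many of its updates have been observed.
type ReplicaId = string;
type VV = Map<ReplicaId, number>;
type Relation = "equal" | "before" | "after" | "concurrent";

// Classifies two version vectors under the causal partial order.
function compareVV(a: VV, b: VV): Relation {
  let aAhead = false;
  let bAhead = false;
  const replicas = new Set<ReplicaId>();
  for (const rid of a.keys()) replicas.add(rid);
  for (const rid of b.keys()) replicas.add(rid);
  for (const rid of replicas) {
    const x = a.get(rid) ?? 0;
    const y = b.get(rid) ?? 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // each side saw something the other did not
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

const op1: VV = new Map([["a", 1]]);
const op2: VV = new Map([["a", 1], ["b", 1]]); // produced after observing op1
const op3: VV = new Map([["a", 2]]);           // produced without observing op2

console.log(compareVV(op1, op2)); // before
console.log(compareVV(op2, op3)); // concurrent
```

A CRDT’s rulebook only has to say what happens in the "concurrent" case; the other cases are already ordered by causality.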

Where CRDTs shine: edits whose meaning is naturally mergeable

CRDTs feel compelling when the domain already has a commutative meaning under concurrency, so “conflict resolution” is not a human-level dispute.

Collaborative text editing (sequence CRDT family)

Text editing is naturally expressed as “insert this atom at this position” and “delete this atom”, where atoms have stable identities. With stable identities, concurrent inserts/deletes can be merged deterministically.

At a high level, the trick is: an insert doesn’t mean “insert at numeric index 5” (indexes are not stable under concurrency); it means “insert new atom X with id=… after atom Y (or between two ids)”. Deletes target ids, not positions. Concurrency becomes “two inserts relative to the same neighborhood” which can be made deterministic with a tie-break on ids.
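A deliberately small sketch of that idea, not a production sequence CRDT: the document is a set of atoms that each name the atom they were inserted after, plus a set of deleted ids, and the text is a deterministic projection of that set. The id format (`counter@replica`), the names `TextDoc`, `insertAfter`, `removeAtom`, `mergeDocs`, `render`, and the "newer sibling first" tie-break are all assumptions made for illustration.

```typescript
type AtomId = string; // assumed format: "<counter>@<replica>"
type Atom = { id: AtomId; after: AtomId | null; ch: string };
type TextDoc = { atoms: Map<AtomId, Atom>; deleted: Set<AtomId> };

const emptyDoc = (): TextDoc => ({ atoms: new Map(), deleted: new Set() });

// Insert targets a stable id ("after"), never a numeric index.
function insertAfter(d: TextDoc, id: AtomId, after: AtomId | null, ch: string): TextDoc {
  const atoms = new Map(d.atoms);
  atoms.set(id, { id, after, ch });
  return { atoms, deleted: new Set(d.deleted) };
}

// Delete targets an id and leaves a tombstone so later merges cannot resurrect it.
function removeAtom(d: TextDoc, id: AtomId): TextDoc {
  const deleted = new Set(d.deleted);
  deleted.add(id);
  return { atoms: new Map(d.atoms), deleted };
}

// Merge is set union on both components.
function mergeDocs(a: TextDoc, b: TextDoc): TextDoc {
  const atoms = new Map(a.atoms);
  for (const [id, atom] of b.atoms) atoms.set(id, atom);
  const deleted = new Set(a.deleted);
  for (const id of b.deleted) deleted.add(id);
  return { atoms, deleted };
}

// Total order on ids: counter first, replica name as tie-break.
function compareIds(x: AtomId, y: AtomId): number {
  const [cx = "0", rx = ""] = x.split("@");
  const [cy = "0", ry = ""] = y.split("@");
  return Number(cx) - Number(cy) || (rx < ry ? -1 : rx > ry ? 1 : 0);
}

// Render by walking the "inserted after" tree; siblings are ordered by id, newest first.
function render(d: TextDoc): string {
  const ROOT = "";
  const children = new Map<string, Atom[]>();
  for (const atom of d.atoms.values()) {
    const key = atom.after ?? ROOT;
    const list = children.get(key) ?? [];
    list.push(atom);
    children.set(key, list);
  }
  let out = "";
  const visit = (key: string): void => {
    const kids = (children.get(key) ?? []).sort((p, q) => compareIds(q.id, p.id));
    for (const kid of kids) {
      if (!d.deleted.has(kid.id)) out += kid.ch;
      visit(kid.id);
    }
  };
  visit(ROOT);
  return out;
}

const base = insertAfter(emptyDoc(), "1@a", null, "H");
const atA = insertAfter(base, "2@a", "1@a", "i"); // replica a types "i" after "H"
const atB = insertAfter(base, "2@b", "1@a", "!"); // replica b concurrently types "!" after "H"

console.log(render(mergeDocs(atA, atB)), render(mergeDocs(atB, atA))); // H!i H!i
```

Because rendering depends only on the set of atoms and tombstones, the order in which replicas exchange state cannot change the resulting text. Real sequence CRDTs add a lot on top of this (efficient indexing, interleaving control, compaction).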

This is why CRDTs are a good fit for “Docs-like” editors: you want always-editable, low-latency local operations, and a deterministic merge that produces a single text.

![tip] This is also where people confuse CRDTs with “automatic conflict resolution”. The merge is automatic, but it only feels like resolution because the datatype’s meaning (a single merged text) is acceptable to users most of the time.

Counters, reactions, analytics-style increments

If the operation is “increment by δ”, then concurrent updates commute. A CRDT counter gives you availability without coordination and still converges.

A minimal state-based counter keeps one component per replica; each replica increments only its own component, and merge takes the per-replica max:

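// G-Counter (grow-only): one monotonically increasing component per replica
type ReplicaId = string;
type GCounter = Map<ReplicaId, number>;

// Local update: a replica bumps only its own component.
function inc(s: GCounter, me: ReplicaId, by = 1): GCounter {
  if (by < 0) throw new RangeError("G-Counter increments must be non-negative");
  const next = new Map(s);
  next.set(me, (next.get(me) ?? 0) + by);
  return next;
}

// Merge: per-replica max, so applying the same state twice or out of order is harmless.
function merge(a: GCounter, b: GCounter): GCounter {
  const out = new Map(a);
  for (const [rid, v] of b) out.set(rid, Math.max(out.get(rid) ?? 0, v));
  return out;
}

// Read: the counter's value is the sum of all components.
function value(s: GCounter): number {
  let sum = 0;
  for (const v of s.values()) sum += v;
  return sum;
}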

Replication can be dumb: ship snapshots/deltas whenever; duplicates and reordering don’t matter because merge is commutative, associative, and idempotent.
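Reactions can also be withdrawn, which a grow-only counter cannot express. A common extension is the PN-Counter: two grow-only counters, one for increments and one for decrements. The sketch below is self-contained and assumes that construction; the names `PNCounter`, `addDelta`, `mergePN`, and `valuePN` are illustrative.

```typescript
type ReplicaId = string;
type GCounter = Map<ReplicaId, number>;
type PNCounter = { incs: GCounter; decs: GCounter };

const emptyPN = (): PNCounter => ({ incs: new Map(), decs: new Map() });

// Bumps one replica's component of a grow-only counter.
function bump(g: GCounter, me: ReplicaId, by: number): GCounter {
  const next = new Map(g);
  next.set(me, (next.get(me) ?? 0) + by);
  return next;
}

function mergeG(a: GCounter, b: GCounter): GCounter {
  const out = new Map(a);
  for (const [rid, v] of b) out.set(rid, Math.max(out.get(rid) ?? 0, v));
  return out;
}

function sumG(g: GCounter): number {
  let total = 0;
  for (const v of g.values()) total += v;
  return total;
}

// Positive deltas go to `incs`, negative deltas to `decs`; both halves only ever grow.
function addDelta(c: PNCounter, me: ReplicaId, delta: number): PNCounter {
  return delta >= 0
    ? { incs: bump(c.incs, me, delta), decs: c.decs }
    : { incs: c.incs, decs: bump(c.decs, me, -delta) };
}

function mergePN(a: PNCounter, b: PNCounter): PNCounter {
  return { incs: mergeG(a.incs, b.incs), decs: mergeG(a.decs, b.decs) };
}

function valuePN(c: PNCounter): number {
  return sumG(c.incs) - sumG(c.decs);
}

const likesAtA = addDelta(emptyPN(), "a", 2);  // replica a: two likes
const likesAtB = addDelta(emptyPN(), "b", -1); // replica b: one un-like, concurrently

console.log(valuePN(mergePN(likesAtA, likesAtB))); // 1
```

Note that nothing here stops the value from going negative; a floor is a global invariant and needs coordination.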

Sets and “membership” data (tags, read receipts, membership lists)

Sets get tricky with remove/add concurrency. CRDT sets work by attaching identity to adds, and making removes target observed identities. This removes ambiguity without a global order.
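A minimal sketch of that idea in the style of an observed-remove set: every add carries a unique tag, and a remove tombstones only the tags it has observed, so a concurrent add (with a tag the remover never saw) survives. The names `ORSet`, `addElement`, `removeElement`, `mergeSets`, `hasElement`, and the `replica:counter` tag format are assumptions for illustration, and tombstones are never compacted here.

```typescript
type Tag = string; // assumed unique per add, e.g. "replica:counter"
type ORSet<T> = { adds: Map<T, Set<Tag>>; removed: Set<Tag> };

const emptySet = <T>(): ORSet<T> => ({ adds: new Map(), removed: new Set() });

function cloneSet<T>(s: ORSet<T>): ORSet<T> {
  const adds = new Map<T, Set<Tag>>();
  for (const [el, tags] of s.adds) adds.set(el, new Set(tags));
  return { adds, removed: new Set(s.removed) };
}

// Each add attaches a fresh identity to the element.
function addElement<T>(s: ORSet<T>, el: T, tag: Tag): ORSet<T> {
  const next = cloneSet(s);
  const tags = next.adds.get(el) ?? new Set<Tag>();
  tags.add(tag);
  next.adds.set(el, tags);
  return next;
}

// A remove tombstones only the add-tags this replica has observed.
function removeElement<T>(s: ORSet<T>, el: T): ORSet<T> {
  const next = cloneSet(s);
  const tags = next.adds.get(el);
  if (tags) for (const tag of tags) next.removed.add(tag);
  return next;
}

// Merge is a union of add-tags and a union of tombstones.
function mergeSets<T>(a: ORSet<T>, b: ORSet<T>): ORSet<T> {
  const out = cloneSet(a);
  for (const [el, tags] of b.adds) {
    const merged = out.adds.get(el) ?? new Set<Tag>();
    for (const tag of tags) merged.add(tag);
    out.adds.set(el, merged);
  }
  for (const tag of b.removed) out.removed.add(tag);
  return out;
}

// An element is present if at least one of its add-tags has not been tombstoned.
function hasElement<T>(s: ORSet<T>, el: T): boolean {
  const tags = s.adds.get(el);
  if (!tags) return false;
  for (const tag of tags) if (!s.removed.has(tag)) return true;
  return false;
}

const shared = addElement(emptySet<string>(), "urgent", "a:1");
const removedAtA = removeElement(shared, "urgent");     // a removes what it has seen
const reAddedAtB = addElement(shared, "urgent", "b:1"); // b concurrently adds again

console.log(hasElement(mergeSets(removedAtA, reAddedAtB), "urgent")); // true
```

The concurrent add wins because its tag was never observed by the remove; a remove that has seen every add still removes the element.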

The exact case you’re worried about: “I want a human to choose”

If a field is a single scalar and concurrent assignments are a semantic conflict (“two people set different values; we should decide”), CRDTs are still useful because you can encode a datatype that preserves the conflict and still converges.

Multi-Value Register: conflict as first-class state

A multi-value register is: “the value is a set of concurrently-written versions; if writes are causally ordered, newer overwrites older; if concurrent, keep both.”

You do not need consensus to do this; you need causal metadata so replicas agree on what is concurrent vs causally-after.

A minimal shape looks like this:

// Sketch: MV-Register with dotted version vectors (conceptual)
type ReplicaId = string;
type Dot = `${ReplicaId}:${number}`;
 
type VV = Map<ReplicaId, number>; // causal context frontier
 
type Version<T> = { dot: Dot; ctx: VV; value: T };
 
type MVRegister<T> = {
  versions: Map<Dot, Version<T>>; // live concurrent versions
  ctx: VV; // "I've seen up to ..." (for compaction)
};

Operationally:

A write creates a fresh dot (replica-local counter).
The write carries a causal context (what it has seen).
Merge keeps versions that are not “dominated” (i.e., not causally overwritten by something that happened-after seeing them).

The UI policy is then straightforward: if the register has 1 version, show it; if it has >1, show a conflict picker. This matches your “no automated resolution” desire while keeping replicas convergent.
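The operational rules and the UI policy can be sketched end to end. This is a simplified variant of the shape above, under stated assumptions: versions do not carry their own context; instead the register-level `ctx` is used to decide whether a missing version was overwritten or simply not yet seen. It also assumes replicas only exchange whole register states and that each replica writes on top of its own latest state. All function names are illustrative.

```typescript
type ReplicaId = string;
type VV = Map<ReplicaId, number>;
type Version<T> = { replica: ReplicaId; counter: number; value: T };
type MVRegister<T> = { versions: Map<string, Version<T>>; ctx: VV };

function emptyRegister<T>(): MVRegister<T> {
  return { versions: new Map(), ctx: new Map() };
}

// A write gets a fresh dot and replaces every version this replica has seen.
function write<T>(r: MVRegister<T>, me: ReplicaId, value: T): MVRegister<T> {
  const counter = (r.ctx.get(me) ?? 0) + 1;
  const ctx = new Map(r.ctx);
  ctx.set(me, counter);
  const versions = new Map<string, Version<T>>();
  versions.set(`${me}:${counter}`, { replica: me, counter, value });
  return { versions, ctx };
}

// True when `ctx` has already observed the dot of `v`.
function covers<T>(ctx: VV, v: Version<T>): boolean {
  return (ctx.get(v.replica) ?? 0) >= v.counter;
}

// Keep a version if both sides still hold it, or if the other side has never seen it.
// A version the other side has seen but no longer holds was causally overwritten.
function mergeRegisters<T>(a: MVRegister<T>, b: MVRegister<T>): MVRegister<T> {
  const versions = new Map<string, Version<T>>();
  for (const [dot, v] of a.versions) {
    if (b.versions.has(dot) || !covers(b.ctx, v)) versions.set(dot, v);
  }
  for (const [dot, v] of b.versions) {
    if (a.versions.has(dot) || !covers(a.ctx, v)) versions.set(dot, v);
  }
  const ctx = new Map(a.ctx);
  for (const [rid, n] of b.ctx) ctx.set(rid, Math.max(ctx.get(rid) ?? 0, n));
  return { versions, ctx };
}

// Live values in a deterministic order (sorted by dot) for display.
function candidates<T>(r: MVRegister<T>): T[] {
  const entries: Array<[string, Version<T>]> = [];
  for (const entry of r.versions) entries.push(entry);
  entries.sort(([x], [y]) => (x < y ? -1 : x > y ? 1 : 0));
  return entries.map(([, v]) => v.value);
}

// UI policy: more than one live version means "show the conflict picker".
function needsHumanChoice<T>(r: MVRegister<T>): boolean {
  return r.versions.size > 1;
}

const statusAtA = write(emptyRegister<string>(), "a", "draft");
const statusAtB = write(emptyRegister<string>(), "b", "final"); // concurrent with a's write
const conflicted = mergeRegisters(statusAtA, statusAtB);
console.log(candidates(conflicted), needsHumanChoice(conflicted)); // [ 'draft', 'final' ] true

// The human's pick is just another write made after seeing both versions.
const resolved = write(conflicted, "a", "final");
console.log(candidates(mergeRegisters(statusAtB, resolved))); // [ 'final' ]
```

The resolution needs no special protocol: because it was written with both versions in its context, it dominates them on every replica it reaches.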

![important] This is the key move: CRDT does not force you to auto-resolve. It lets you implement deterministic convergence while deferring semantic choice to a human or to domain logic.

Why this is not equivalent to a naive event log

You can replicate an event log, but without a total order you must still answer: which events are concurrent vs overwriting? If you don’t encode causality, replay order differences reintroduce divergence.

MV-register is essentially “event log replication + explicit causal semantics + deterministic projection into state”. It’s not magical; it’s disciplined.

Composing CRDTs: the “document is a map of CRDT fields” pattern

Most real systems don’t use “one CRDT”. They use a CRDT map whose values are themselves CRDTs:

document = Map(field → CRDT-value)
“title” = text CRDT
“likes” = counter CRDT
“tags” = set CRDT
“status” = MV-register (human picks if concurrent)
“updatedAt” = LWW-register (if you accept that semantics)

This is where CRDTs stop being academic: you get a uniform replication substrate and choose per-field semantics that match your product.
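A sketch of the pattern with three of the fields: the document merge simply delegates to each field’s own merge. The `Crdt<S>` interface, the `Doc` shape, and the use of a grow-only set for tags are simplifying assumptions; the text and MV-register fields are left out for brevity.

```typescript
// Anything with an order-independent merge can be a field.
interface Crdt<S> {
  merge(a: S, b: S): S;
}

type GCounter = Map<string, number>;
type Lww<T> = { value: T; ts: number; replica: string };

const counterCrdt: Crdt<GCounter> = {
  merge(a, b) {
    const out = new Map(a);
    for (const [rid, v] of b) out.set(rid, Math.max(out.get(rid) ?? 0, v));
    return out;
  },
};

// Grow-only set: enough for a sketch, but it cannot express removal.
const growSetCrdt: Crdt<Set<string>> = {
  merge(a, b) {
    const out = new Set(a);
    for (const el of b) out.add(el);
    return out;
  },
};

// LWW: highest timestamp wins; replica id breaks ties deterministically.
const lwwCrdt: Crdt<Lww<string>> = {
  merge(a, b) {
    if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
    return a.replica >= b.replica ? a : b;
  },
};

type Doc = { likes: GCounter; tags: Set<string>; updatedAt: Lww<string> };

// The document is convergent because every field is.
function mergeDoc(a: Doc, b: Doc): Doc {
  return {
    likes: counterCrdt.merge(a.likes, b.likes),
    tags: growSetCrdt.merge(a.tags, b.tags),
    updatedAt: lwwCrdt.merge(a.updatedAt, b.updatedAt),
  };
}

const docAtA: Doc = {
  likes: new Map([["a", 2]]),
  tags: new Set(["crdt"]),
  updatedAt: { value: "t1", ts: 1, replica: "a" },
};
const docAtB: Doc = {
  likes: new Map([["b", 1]]),
  tags: new Set(["sync"]),
  updatedAt: { value: "t2", ts: 2, replica: "b" },
};

const mergedDoc = mergeDoc(docAtA, docAtB);
console.log(mergedDoc.tags.size, mergedDoc.updatedAt.value); // 2 t2
```

Swapping a field’s semantics (say, LWW to MV-register) changes one entry in the document type and its merge, not the replication layer.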

Where CRDTs are the wrong tool (or need a second tool)

CRDTs don’t give you global invariants. If you need properties like “only one user can claim this handle” or “don’t exceed inventory”, you need a coordination mechanism (consensus/transactions/escrow techniques) at the boundary where the invariant matters.

A useful division of labor is:

CRDTs for high-availability, mergeable state
consensus/transactions for global uniqueness / caps / strict ordering
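To see why a cap like “don’t exceed inventory” cannot be enforced by a convergent counter alone, here is a small sketch. `STOCK`, `trySell`, and the per-replica "units sold" counter are hypothetical names; each replica checks the cap against only what it knows locally.

```typescript
// Units sold, tracked as a grow-only counter with one component per replica.
type Sold = Map<string, number>;
const STOCK = 1;

function totalSold(s: Sold): number {
  let n = 0;
  for (const v of s.values()) n += v;
  return n;
}

// The local check can only use local knowledge.
function trySell(s: Sold, me: string): Sold {
  if (totalSold(s) >= STOCK) return s; // looks sold out from here: refuse
  const next = new Map(s);
  next.set(me, (next.get(me) ?? 0) + 1);
  return next;
}

function mergeSold(a: Sold, b: Sold): Sold {
  const out = new Map(a);
  for (const [rid, v] of b) out.set(rid, Math.max(out.get(rid) ?? 0, v));
  return out;
}

// Two partitioned replicas each sell the last unit.
const soldAtA = trySell(new Map(), "a");
const soldAtB = trySell(new Map(), "b");
const mergedSold = mergeSold(soldAtA, soldAtB);

console.log(totalSold(mergedSold) > STOCK); // true
```

The replicas converge perfectly, and they converge on an oversold state. Preventing that requires coordinating before the sale, not merging after it.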

Practical costs you should expect (the real engineering trade)

CRDTs “pay” for avoiding coordination by carrying metadata that makes concurrency well-defined.

The two costs that dominate real deployments are:

Metadata growth (tombstones, per-insert identifiers, causal contexts). If you never compact, state grows without bound for sets/sequences.
Garbage collection requires causal stability: you can only safely drop tombstones/old versions once you know all replicas that matter have observed them. That usually means tracking a notion of “everyone’s causal frontier” (version vectors or a server-mediated stability mechanism).
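One way to sketch the stability check, assuming a fixed, known replica set where each replica periodically reports its version vector: the stable frontier is the pointwise minimum of those vectors, and a tombstone may be dropped only once its dot is at or below that frontier. `stableFrontier`, `compactTombstones`, and the `Tombstone` shape are illustrative names.

```typescript
type ReplicaId = string;
type VV = Map<ReplicaId, number>;
type Tombstone = { replica: ReplicaId; counter: number }; // dot of the removed item

// Pointwise minimum over every replica's last reported version vector.
// Assumption: `reported` contains an entry for every replica that matters.
function stableFrontier(reported: Map<ReplicaId, VV>): VV {
  const writers = new Set<ReplicaId>();
  for (const vv of reported.values()) for (const rid of vv.keys()) writers.add(rid);
  const stable: VV = new Map();
  for (const writer of writers) {
    let min = Infinity;
    for (const vv of reported.values()) min = Math.min(min, vv.get(writer) ?? 0);
    stable.set(writer, min);
  }
  return stable;
}

// Keep only tombstones that some replica may not have observed yet.
function compactTombstones(tombstones: Tombstone[], stable: VV): Tombstone[] {
  return tombstones.filter((t) => (stable.get(t.replica) ?? 0) < t.counter);
}

const reported = new Map<ReplicaId, VV>([
  ["a", new Map([["a", 3], ["b", 2]])],
  ["b", new Map([["a", 2], ["b", 2]])],
  ["c", new Map([["a", 3], ["b", 1]])],
]);
const frontier = stableFrontier(reported); // a -> 2, b -> 1

const remaining = compactTombstones(
  [
    { replica: "a", counter: 2 }, // seen by everyone: safe to drop
    { replica: "a", counter: 3 }, // replica b has not seen it yet
    { replica: "b", counter: 2 }, // replica c has not seen it yet
  ],
  frontier,
);
console.log(remaining.length); // 2
```

A replica missing from `reported` silently breaks the guarantee, which is why membership tracking tends to be the hard part of compaction.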

![warning] If you design compaction wrong, you can get the worst failure mode: replicas converge, then later “un-converge” via resurrection (a removed element reappears) or data loss (a live version is dropped).

How to sanity-check a CRDT implementation (before trusting it)

A CRDT should be testable as a pure distributed-systems object:

Generate random replicas and random local updates.
Deliver messages in randomized orders with duplication.
Ensure that once all replicas have received the same information, their states are identical (or equivalent under a canonicalization).

For state-based CRDTs, you can also property-test merge algebra directly: merge must be commutative, associative, and idempotent over reachable states. If it isn’t, you don’t have a CRDT; you have “best-effort sync”.
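Both checks can be sketched without a testing library, here against a G-Counter-style merge. The seeded generator, `randomState`, `checkMergeLaws`, and `checkDelivery` are helper names made up for this example; a real suite would more likely use a property-testing library and cover your actual datatypes.

```typescript
type GCounter = Map<string, number>;

function merge(a: GCounter, b: GCounter): GCounter {
  const out = new Map(a);
  for (const [rid, v] of b) out.set(rid, Math.max(out.get(rid) ?? 0, v));
  return out;
}

// Canonical form so that two Maps with the same entries compare equal.
function canonical(s: GCounter): string {
  const entries: Array<[string, number]> = [];
  for (const entry of s) entries.push(entry);
  entries.sort(([x], [y]) => (x < y ? -1 : x > y ? 1 : 0));
  return JSON.stringify(entries);
}

// Small seeded linear congruential generator so failures are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

function randomState(rand: () => number): GCounter {
  const s: GCounter = new Map();
  for (const rid of ["a", "b", "c"]) {
    if (rand() < 0.7) s.set(rid, 1 + Math.floor(rand() * 5));
  }
  return s;
}

// Merge algebra: commutative, associative, idempotent.
function checkMergeLaws(trials: number, seed = 1): boolean {
  const rand = makeRng(seed);
  for (let i = 0; i < trials; i++) {
    const a = randomState(rand);
    const b = randomState(rand);
    const c = randomState(rand);
    const commutative = canonical(merge(a, b)) === canonical(merge(b, a));
    const associative =
      canonical(merge(merge(a, b), c)) === canonical(merge(a, merge(b, c)));
    const idempotent = canonical(merge(a, a)) === canonical(a);
    if (!(commutative && associative && idempotent)) return false;
  }
  return true;
}

// Delivers every state twice, in a random order, and merges as it goes.
function deliver(states: GCounter[], rand: () => number): GCounter {
  const inbox = [...states, ...states]
    .map((s) => ({ s, key: rand() }))
    .sort((x, y) => x.key - y.key);
  return inbox.reduce((acc, m) => merge(acc, m.s), new Map<string, number>());
}

// Two replicas that receive the same information in different orders must agree.
function checkDelivery(trials: number, seed = 2): boolean {
  const rand = makeRng(seed);
  for (let i = 0; i < trials; i++) {
    const states = [randomState(rand), randomState(rand), randomState(rand)];
    if (canonical(deliver(states, rand)) !== canonical(deliver(states, rand))) return false;
  }
  return true;
}

console.log(checkMergeLaws(200), checkDelivery(200)); // true true
```

Replacing `Math.max` with something order-sensitive (for example, "take the right-hand value") makes `checkMergeLaws` fail, which is exactly the signal you want before trusting an implementation.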

Concrete use-case mapping (to resolve the “why would I use this?” feeling)

CRDTs are compelling when at least one of these is true:

You want offline-first or highly partition-tolerant UX, where waiting for a coordinator is unacceptable.
Your data contains a lot of operations that are naturally mergeable (text edits, counters, sets, independent fields).
You want to represent “conflict” as data (MV-register) and resolve it with UI/domain logic later, but you still require deterministic convergence and no data loss.

If your domain is mostly “single scalar overwrites where humans must decide immediately”, CRDTs will still help you avoid divergence, but they won’t remove the need for a human decision; they mainly let you implement “show conflict” without building a bespoke reconciliation protocol per feature.
