Skip to main content

Consistency Models

Understanding consistency models is fundamental to designing distributed systems. This module covers the entire consistency spectrum, from the strongest (linearizability) to the weakest (eventual consistency).
Module Duration: 12-16 hours
Key Topics: Linearizability, Serializability, Causal Consistency, Eventual Consistency, Session Guarantees
Interview Focus: Trade-offs between consistency levels, real-world examples

The Consistency Spectrum

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CONSISTENCY SPECTRUM                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  STRONGEST ◄──────────────────────────────────────────────► WEAKEST        │
│                                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │Linearizable  │  │ Sequential   │  │   Causal     │  │  Eventual    │    │
│  │              │  │ Consistency  │  │ Consistency  │  │ Consistency  │    │
│  └──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘    │
│        ▲                  ▲                 ▲                 ▲            │
│        │                  │                 │                 │            │
│  Real-time         Global order       Preserves         Eventually        │
│  ordering          (some order)       causality         converges         │
│                                                                              │
│  ────────────────────────────────────────────────────────────────────────   │
│  TRADE-OFF:                                                                 │
│  ────────────────────────────────────────────────────────────────────────   │
│                                                                              │
│  Consistency  ─────────────────────────────────────────► Higher             │
│  Availability ◄───────────────────────────────────────── Lower              │
│  Latency      ─────────────────────────────────────────► Higher             │
│  Throughput   ◄───────────────────────────────────────── Lower              │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Linearizability (Strict Consistency)

The Gold Standard: Linearizability is the strongest single-object consistency model. Systems behave as if there’s a single copy of data.

Definition

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LINEARIZABILITY                                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  DEFINITION:                                                                │
│  ───────────                                                                │
│  Every operation appears to execute instantaneously at some point           │
│  between its invocation and response (linearization point).                 │
│                                                                              │
│  PROPERTIES:                                                                │
│  ───────────                                                                │
│  1. Operations are atomic (happen at a single point in time)                │
│  2. Real-time ordering is preserved                                         │
│  3. All clients see the same order of operations                            │
│                                                                              │
│  EXAMPLE - Linearizable:                                                    │
│  ────────────────────────                                                   │
│  Time ────────────────────────────────────────────────────►                 │
│                                                                              │
│  Client A: [──write(x=1)──]                                                 │
│                       ↓ linearization point                                 │
│  Client B:              [──read(x)──] → returns 1 ✓                        │
│                                                                              │
│  After A's write completes, B must see x=1                                 │
│                                                                              │
│  EXAMPLE - NOT Linearizable:                                                │
│  ──────────────────────────                                                 │
│  Time ────────────────────────────────────────────────────►                 │
│                                                                              │
│  Client A: [──write(x=1)──]                                                 │
│  Client B:              [──read(x)──] → returns 0 ✗                        │
│  Client C:                  [──read(x)──] → returns 1                       │
│                                                                              │
│  B reads after A completes but sees old value - VIOLATION!                 │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Real-World Linearizable Systems

Linearizable writes:
- All writes go through the leader
- Leader assigns sequential zxid
- Write is linearized when majority acknowledge

Reads:
- sync() + read for linearizable reads
- Regular reads may be stale (sequential consistency)

Cost of Linearizability

┌─────────────────────────────────────────────────────────────────────────────┐
│                    COST OF LINEARIZABILITY                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  LATENCY:                                                                   │
│  ────────                                                                   │
│  - Writes must be acknowledged by majority                                  │
│  - Reads may need to contact leader or do sync                             │
│  - Cross-region: 50-200ms per operation                                    │
│                                                                              │
│  AVAILABILITY:                                                              │
│  ─────────────                                                              │
│  - Network partition → one side cannot make progress                       │
│  - Leader failure → election delay (seconds)                               │
│  - CAP theorem: Cannot have C and A during partition                       │
│                                                                              │
│  THROUGHPUT:                                                                │
│  ───────────                                                                │
│  - All operations serialized through leader                                │
│  - Consensus overhead per write                                            │
│  - Read scalability limited without compromising consistency               │
│                                                                              │
│  WHEN TO USE:                                                               │
│  ─────────────                                                              │
│  ✓ Leader election                                                         │
│  ✓ Distributed locks                                                       │
│  ✓ Unique ID generation                                                    │
│  ✓ Configuration management                                                │
│  ✓ Financial transactions                                                  │
│                                                                              │
│  WHEN TO AVOID:                                                             │
│  ──────────────                                                             │
│  ✗ High-throughput read-heavy workloads                                    │
│  ✗ Geographically distributed with low-latency requirements                │
│  ✗ Systems where availability is paramount                                 │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Sequential Consistency

Definition and Examples

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SEQUENTIAL CONSISTENCY                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  DEFINITION:                                                                │
│  ───────────                                                                │
│  - Operations from each client appear in program order                     │
│  - All clients see the SAME global order                                   │
│  - BUT: Order may not respect real-time                                    │
│                                                                              │
│  DIFFERENCE FROM LINEARIZABLE:                                              │
│  ─────────────────────────────                                              │
│                                                                              │
│  Linearizable:  Real-time order matters                                    │
│  Sequential:    Only program order per client matters                      │
│                                                                              │
│  EXAMPLE - Sequentially Consistent but NOT Linearizable:                   │
│  ─────────────────────────────────────────────────────────                  │
│                                                                              │
│  Real-time:                                                                 │
│  Client A: write(x=1) ────────────────────────────────────                 │
│  Client B: ──────────────────── read(x) → 0                                │
│                                                                              │
│  Possible sequential order: B.read(x), A.write(x=1)                        │
│  This is valid because B's read started before A's write completed!        │
│                                                                              │
│  For LINEARIZABILITY: B must see x=1 if read starts after write ends      │
│                                                                              │
│  USE CASES:                                                                 │
│  ───────────                                                                │
│  - Multi-core CPU memory models                                            │
│  - Zookeeper reads (without sync)                                          │
│  - Some distributed databases                                              │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Serializability

Database Focus: Serializability is about transactions, not individual operations. It’s the isolation property in ACID.

Definition

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SERIALIZABILITY                                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  DEFINITION:                                                                │
│  ───────────                                                                │
│  Transactions appear to execute in SOME serial order.                      │
│  The actual order may differ from real-time order.                         │
│                                                                              │
│  EXAMPLE:                                                                   │
│  ─────────                                                                  │
│  T1: R(A) W(A)           Concurrent execution:                             │
│  T2: R(A) W(A)                                                              │
│                          T1: ──R(A)────────W(A)──                          │
│                          T2: ────────R(A)────────W(A)                      │
│                                                                              │
│  Serializable if equivalent to: T1→T2 or T2→T1                             │
│                                                                              │
│  SERIALIZABLE vs LINEARIZABLE:                                              │
│  ─────────────────────────────                                              │
│  ┌────────────────┬──────────────────┬──────────────────┐                  │
│  │                │ Linearizable     │ Serializable     │                  │
│  ├────────────────┼──────────────────┼──────────────────┤                  │
│  │ Scope          │ Single object    │ Multi-object txn │                  │
│  │ Real-time      │ Yes              │ No               │                  │
│  │ Ordering       │ Real-time        │ Any serial order │                  │
│  │ Domain         │ Distributed sys  │ Databases        │                  │
│  └────────────────┴──────────────────┴──────────────────┘                  │
│                                                                              │
│  STRICT SERIALIZABILITY:                                                   │
│  ────────────────────────                                                   │
│  = Serializable + Linearizable                                             │
│  = Transactions respect real-time order                                    │
│  = What Spanner provides                                                   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Isolation Levels

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SQL ISOLATION LEVELS                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  WEAKEST ◄─────────────────────────────────────────────► STRONGEST          │
│                                                                              │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │
│  │ Read        │ │ Read        │ │ Repeatable  │ │ Serializable│           │
│  │ Uncommitted │ │ Committed   │ │ Read        │ │             │           │
│  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘           │
│                                                                              │
│  ANOMALIES PREVENTED:                                                       │
│  ─────────────────────                                                      │
│  ┌─────────────────────┬───────┬───────┬───────┬───────────────┐           │
│  │ Anomaly             │ RU    │ RC    │ RR    │ Serializable  │           │
│  ├─────────────────────┼───────┼───────┼───────┼───────────────┤           │
│  │ Dirty Read          │ ✗     │ ✓     │ ✓     │ ✓             │           │
│  │ Non-repeatable Read │ ✗     │ ✗     │ ✓     │ ✓             │           │
│  │ Phantom Read        │ ✗     │ ✗     │ ✗     │ ✓             │           │
│  │ Write Skew          │ ✗     │ ✗     │ ✗     │ ✓             │           │
│  └─────────────────────┴───────┴───────┴───────┴───────────────┘           │
│                                                                              │
│  DIRTY READ:                                                                │
│  ────────────                                                               │
│  T1: W(x=10)         ← not committed                                       │
│  T2: R(x)=10         ← reads uncommitted data                              │
│  T1: ROLLBACK        ← T2 has phantom data!                                │
│                                                                              │
│  WRITE SKEW:                                                                │
│  ────────────                                                               │
│  Constraint: x + y >= 0                                                    │
│  Initial: x=5, y=5                                                         │
│  T1: if (x+y >= 5) then x = x - 5                                         │
│  T2: if (x+y >= 5) then y = y - 5                                         │
│  Both read x+y=10, both proceed, result: x=0, y=0, violates constraint!    │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Causal Consistency

Definition and Intuition

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CAUSAL CONSISTENCY                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  DEFINITION:                                                                │
│  ───────────                                                                │
│  If operation A causally precedes operation B,                              │
│  then all nodes see A before B.                                            │
│                                                                              │
│  CAUSALITY RULES:                                                           │
│  ────────────────                                                           │
│  1. Program order: In same thread, A before B → A causes B                 │
│  2. Message passing: Send before receive → Send causes receive             │
│  3. Transitivity: A→B and B→C implies A→C                                  │
│                                                                              │
│  EXAMPLE - Causally Consistent:                                            │
│  ───────────────────────────────                                            │
│                                                                              │
│  Alice posts: "I got the job!"                                             │
│  Bob (seeing Alice's post): "Congratulations!"                             │
│                                                                              │
│  Carol must see Alice's post before Bob's reply (causal order)             │
│  Carol may see unrelated posts in any order (concurrent = no causal order) │
│                                                                              │
│  CAUSALLY RELATED vs CONCURRENT:                                           │
│  ────────────────────────────────                                           │
│                                                                              │
│  Client A: W(x=1) ────────────────────────────                             │
│                  ↘                                                          │
│  Client B:         R(x=1) ── W(y=2)                                        │
│                              ↓ causally related (B read A's write)         │
│                                                                              │
│  Client A: W(x=1) ────────────────────────────                             │
│  Client B: W(y=2) ────────────────────────────                             │
│            ↑ concurrent (no communication)                                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Eventual Consistency

Definition and Variations

┌─────────────────────────────────────────────────────────────────────────────┐
│                    EVENTUAL CONSISTENCY                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  DEFINITION:                                                                │
│  ───────────                                                                │
│  If no new updates are made, eventually all replicas converge              │
│  to the same value.                                                        │
│                                                                              │
│  THE WEAKEST USEFUL GUARANTEE:                                             │
│  ─────────────────────────────                                              │
│  - No guarantee about WHEN convergence happens                             │
│  - No guarantee about order of updates                                     │
│  - During updates, different replicas may return different values          │
│                                                                              │
│  TIMELINE:                                                                  │
│  ─────────                                                                  │
│  Update ─────────────────────────────────────► Convergence                 │
│          ↑ inconsistency window ↑                                          │
│                                                                              │
│  Replica A: [x=1] ─────────────────── [x=5] ─── [x=5]                      │
│  Replica B: [x=1] ── [x=5] ──────────────────── [x=5]                      │
│  Replica C: [x=1] ──────────────────────── [x=5][x=5]                      │
│                                                                              │
│  During inconsistency window:                                              │
│  - Read from A: x=1                                                        │
│  - Read from B: x=5                                                        │
│  - Application must handle this!                                           │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Session Guarantees

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SESSION GUARANTEES                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  SESSION GUARANTEES provide consistency within a single client session.    │
│  Weaker than linearizability but practical for many applications.          │
│                                                                              │
│  1. READ YOUR WRITES (RYW)                                                 │
│     ─────────────────────                                                   │
│     After a write, subsequent reads by same client see the write.          │
│                                                                              │
│  2. MONOTONIC READS                                                        │
│     ────────────────────                                                    │
│     If client reads value v, subsequent reads won't return older values.  │
│                                                                              │
│  3. MONOTONIC WRITES                                                       │
│     ─────────────────────                                                   │
│     Writes by a client are applied in order on all replicas.              │
│                                                                              │
│  4. WRITES FOLLOW READS                                                    │
│     ────────────────────────                                                │
│     A write following a read is ordered after that read on all replicas.  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

8.1 Implementing Session Guarantees

In high-scale systems (like DynamoDB or Cassandra), session guarantees are often implemented using Client-Side Metadata or Version Vectors.

Read Your Writes (RYW)

To ensure a client sees their own update immediately, even if the database is eventually consistent:
  1. Write: The client performs a write and receives a Version Token (or Timestamp/LSN) in the response.
  2. Storage: The client stores this token in their session (e.g., a cookie or local storage).
  3. Read: When reading, the client sends this token. The server ensures the read replica has at least reached that version before responding. If the replica is lagging, the server can:
    • Wait for the replica to catch up.
    • Route the request to a fresher replica.
    • Read from the Leader.

Monotonic Reads

To prevent the “Time Travel” bug (where a user sees a post, refreshes, and it’s gone because they hit a lagging replica):
  1. The client tracks the Max Version it has seen so far.
  2. Every read request includes this min_version filter.
  3. The load balancer or database proxy ensures that the request only hits replicas that are at or beyond this version.
class SessionClient:
    """
    Pseudo-code for a client providing Read-Your-Writes 
    and Monotonic Reads.
    """
    def __init__(self, db_cluster):
        self.db = db_cluster
        self.last_version_seen = 0

    def write(self, key, value):
        response = self.db.put(key, value)
        # Update session metadata from write response
        self.last_version_seen = max(self.last_version_seen, response.version)
        return response

    def read(self, key):
        # Pass the last version seen to ensure monotonic progress
        response = self.db.get(key, min_version=self.last_version_seen)
        self.last_version_seen = max(self.last_version_seen, response.version)
        return response.data
Staff Tip: While session guarantees improve the user experience, they can create Hot Spots. If a single user is extremely active, they may effectively “pin” themselves to a single replica, preventing the load balancer from distributing traffic effectively.

9. Advanced Theoretical Frameworks

To truly master consistency, one must look beyond simple definitions and understand the mathematical foundations.

9.1 The CALM Theorem

CALM stands for Consistency As Logical Monotonicity. This theorem provides a formal boundary for the CAP theorem: it identifies exactly which programs can be consistent and available without coordination.

The Fundamental Insight

Distributed consistency is hard because nodes disagree on the order and absence of events. The CALM theorem states that:
Consistency can be achieved without coordination if and only if the program is logically monotonic.

Monotonic vs. Non-Monotonic Operations

TypeDescriptionExamplesCoordination?
MonotonicAdding information never invalidates previous conclusions. “More is better.”Set union, Reachability, Maximum, Logical OR/ANDNo (Coordination-free)
Non-MonotonicAdding information can change a previous “True” to “False.” “Absence matters.”Set difference, Negation (NOT), Aggregation (Count/Sum), Garbage CollectionYes (Requires locks/consensus)

Why Monotonicity Matters

  1. Order Independence: In a monotonic system, messages can arrive in any order, be delayed, or be duplicated, and the final result will be the same. This is why CRDTs (Module 15) work—they are mathematically monotonic.
  2. Deterministic Convergence: Multiple replicas receiving different subsets of updates will always “eventually converge” to the same state as they receive more information.
  3. Availability: Monotonic programs are AP (Available under Partitions) because they don’t need to ask other nodes “Do you know anything that would make my current conclusion false?”

Practical Application: Garbage Collection

Consider a distributed system where you want to delete a file.
  • Problem: Deletion is non-monotonic. If Node A deletes file1, but Node B hasn’t seen the delete yet and re-replicates it, file1 “resurrects.”
  • Solution: Use a monotonic approximation. Instead of deleting, add a “Tombstone” (a record that says ‘this is deleted’). Adding a tombstone is monotonic (you are adding info). The actual space reclamation (purging) is non-monotonic and requires coordination or a background GC process with a grace period.
Staff+ Tip: When designing a high-scale system, always ask: “Can I refactor this non-monotonic operation into a monotonic one?” If you can, you remove the need for expensive Paxos/Raft coordination.

9.2 Linearizability vs. Sequential Consistency (The Proof of Non-Composability)

A critical (and often asked) property of linearizability is that it is composable.
  • If object XX is linearizable and object YY is linearizable, then the combined system (X,Y)(X, Y) is also linearizable.
  • Sequential consistency is NOT composable. You can have two sequentially consistent objects that, when used together, violate sequential consistency. This is why multi-core memory models (which are often sequentially consistent) are so difficult to reason about at scale.

9.3 Formal Verification with TLA+

How do we know a consistency model is actually implemented correctly?
  • TLA+ (Temporal Logic of Actions): A language for modeling concurrent systems.
  • You define your system’s state and allowed transitions (actions).
  • You define Safety Invariants (e.g., “no two nodes are leader in the same term”).
  • The model checker explores all possible interleavings to find violations.

10. Testing Consistency in the Wild: The Jepsen Framework

Created by Kyle Kingsbury, Jepsen is the industry standard for testing distributed systems.

10.1 How Jepsen Works

  1. Setup: Spins up a cluster of nodes (e.g., 5 nodes).
  2. Client: A set of clients perform operations (reads, writes, CAS) on the cluster.
  3. Nemesis: A special process that causes “havoc”:
    • Network partitions (iptables drops).
    • Clock skew (ntpdate jumps).
    • Process crashes (kill -9).
  4. Checker: After the test, Jepsen analyzes the history of operations to see if they violate the claimed consistency model (e.g., using Knossos for linearizability).

10.2 Famous Jepsen Findings

  • MongoDB: Found that “Strong Consistency” wasn’t actually strong in many edge cases (later fixed with WiredTiger and Raft).
  • Cassandra: Found that lightweight transactions (LWT) could lose data during partitions.
  • Redis (Redlock): Kingsbury’s critique of Redlock showed that without fencing tokens, distributed locks are not safe under clock skew.

11. Consistency in Practice: The Decision Matrix

ModelCoordinationAvailabilityLatencyTypical Use Case
LinearizableHigh (Quorum)Low (CP)HighLeader Election, Locks
SequentialModerateLow (CP)MediumMemory models, CPU caches
CausalLow (Metadata)High (AP)LowSocial feeds, comments
EventualNoneHigh (AP)Ultra-lowAnalytics, background jobs
SEC (CRDTs)NoneHigh (AP)LowCollaborative editing

12. Interview Playbook: “The Deep Dive”

“When discussing consistency, I distinguish between single-object models like linearizability and multi-object models like serializability. Linearizability provides the strongest recency guarantee but at the cost of availability during partitions—a trade-off described by the CAP theorem. For high-availability systems, I look towards Causal Consistency, which is the strongest model achievable without global coordination. I also apply the CALM theorem to identify non-monotonic operations that strictly require coordination. Finally, I verify these systems using frameworks like Jepsen to ensure that under network partitions or clock skew, the safety invariants of the chosen model still hold.”

13. Key Takeaways

Consistency is a Spectrum

From linearizable (strongest) to eventual (weakest). Choose based on your requirements.

Stronger = Slower

Strong consistency requires coordination, which adds latency and reduces availability.

CAP is About Partitions

During partitions, choose consistency (reject writes) or availability (accept divergence).

Session Guarantees Help

Read-your-writes and monotonic reads provide practical consistency within a session.

14. Next Steps