
Module 3: Transactions & ACID

Understanding transactions is what separates developers who write code that “usually works” from those who build systems that are reliable under all conditions. This module gives you deep intuition for how transactions work.
Estimated Time: 8-10 hours
Hands-On: Concurrency simulation labs
Key Skill: Choosing the right isolation level for your use case

3.1 What is a Transaction? (Understanding the Fundamentals)

The Real-World Analogy: All-or-Nothing Operations

Think of a transaction like a purchase at a store: When you buy something, multiple things must happen together:
  1. Your payment is processed
  2. Money is deducted from your account
  3. Money is added to the store’s account
  4. Inventory is reduced by 1
  5. Receipt is printed
What if step 3 fails? You’d have paid, but the store didn’t receive the money! That’s inconsistent and wrong. With transactions: Either ALL steps succeed, or ALL steps are rolled back (undone). You can’t have a partial purchase.

What is a Transaction in Databases?

A transaction is a sequence of database operations that are treated as a single, indivisible unit of work. Think of it as a “package deal” - either everything in the package happens, or nothing does.
Key Concept: Atomicity
  • Atomic = “indivisible” (like an atom, which can’t be split)
  • A transaction is atomic: it can’t be partially completed
  • It’s like a light switch: it’s either ON or OFF, never halfway
Simple Example:
-- Transfer $100 from Alice to Bob
BEGIN;  -- Start transaction (like opening a package)

-- Step 1: Deduct from Alice
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Step 2: Add to Bob
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;  -- Complete transaction (close the package, make it permanent)
What happens:
  • If BOTH updates succeed → COMMIT makes them permanent
  • If EITHER update fails → ROLLBACK undoes everything (Alice keeps her money, Bob doesn’t get it)
Without transactions (the problem):
-- Step 1 succeeds
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- ✅ Works

-- Step 2 fails (maybe Bob's account doesn't exist)
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- ❌ Error!

-- Result: Alice lost $100, but Bob didn't get it! Money disappeared! 💸
With transactions (the solution):
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- ✅ Works
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- ❌ Error!
ROLLBACK;  -- Ends the failed transaction and undoes step 1

-- Result: Alice keeps her $100. No money lost! ✅

Why Do We Need Transactions?

The Problem: Real-World Scenarios
  1. Bank Transfer: Money must leave one account AND arrive in another (both or neither)
  2. E-commerce Order: Order must be created AND inventory reduced AND payment processed (all or nothing)
  3. User Registration: User account created AND welcome email sent AND profile initialized (all or nothing)
Without Transactions:
  • Partial failures leave data inconsistent
  • Money can disappear
  • Orders can be created without inventory being reduced
  • Users can be created but profiles missing
With Transactions:
  • Either everything succeeds, or everything is rolled back
  • Data stays consistent
  • No “half-completed” operations

3.2 MVCC Internals: How Isolation Works

PostgreSQL uses Multi-Version Concurrency Control (MVCC) to achieve high concurrency. “Readers never block writers, and writers never block readers.”

The Hidden Metadata: xmin and xmax

Every row (tuple) in a PostgreSQL table contains hidden system columns used for visibility checks:
  • xmin: The Transaction ID (XID) that created this row version.
  • xmax: The XID that deleted or updated this row version. If 0, the row is not deleted.
  • ctid: The physical location of this row version (block number, tuple index within the block).
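These system columns are hidden from SELECT * but can be queried by name. A quick way to look at them, using the accounts table from the earlier examples (the values shown are only illustrative):
-- Inspect the hidden MVCC columns alongside the user columns
SELECT xmin, xmax, ctid, id, balance
FROM accounts
WHERE id = 1;
-- Example result: xmin = 742, xmax = 0, ctid = (0,1)   (your values will differ)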

The Update “Append-Only” Flow

When you execute an UPDATE, PostgreSQL performs a Delete + Insert:
  1. The existing row version’s xmax is set to the current transaction ID (marking it as “logically deleted”).
  2. A new row version is inserted with its xmin set to the current transaction ID.
This results in “Dead Tuples”—old versions of rows that are no longer visible to any active transaction but still take up disk space.
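You can watch this flow directly; a minimal experiment on the same accounts table (transaction IDs shown are illustrative):
BEGIN;
UPDATE accounts SET balance = balance + 1 WHERE id = 1;

-- Inside our own transaction, the new row version carries our XID as xmin
-- and sits at a new physical location (ctid), e.g. (0,1) -> (0,5)
SELECT txid_current(), xmin, xmax, ctid FROM accounts WHERE id = 1;

COMMIT;
-- The previous version is now a dead tuple that VACUUM will eventually reclaim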

Transaction Isolation via Snapshots

A Snapshot defines which row versions are visible to a transaction. It contains:
  • xmin: The lowest XID that was still in progress when the snapshot was taken (every XID below it has already committed or aborted).
  • xmax: The first as-yet-unassigned XID.
  • xip_list: A list of active XIDs between xmin and xmax.
Visibility Rule: A row is visible to your transaction if:
  1. The row’s xmin has committed.
  2. The row’s xmin is not in the snapshot’s “active” list.
  3. The row’s xmax is either 0, has not committed, or is in the snapshot’s “active” list.
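You can inspect your own snapshot. On PostgreSQL 13+ the function is pg_current_snapshot(); older versions expose the same information via txid_current_snapshot():
BEGIN ISOLATION LEVEL REPEATABLE READ;

SELECT pg_current_snapshot();
-- Example output: 748:752:748,750   (format: xmin : xmax : xip_list)

COMMIT;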

Vacuuming & Bloat

To prevent the database from growing indefinitely, the Auto-Vacuum process runs in the background to:
  1. Reclaim Space: Mark dead tuples as reusable for new data.
  2. Freeze XIDs: Transaction IDs are 32-bit integers (~4 billion). To prevent Wraparound (where IDs wrap around and old data becomes invisible), Vacuum “freezes” old XIDs, marking them as permanently visible.
  3. Update Statistics: Refresh pg_statistic for the Query Planner.
-- Monitor bloat and dead tuples
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum 
FROM pg_stat_user_tables;

Understanding ACID: The Four Guarantees

Transactions provide four critical guarantees, known as ACID:
A - Atomicity: All or nothing
  • Either all operations in a transaction succeed, or all are rolled back
  • No partial updates
C - Consistency: Valid state transitions
  • Database moves from one valid state to another
  • Constraints are never violated
  • Example: Account balance can’t go negative (if you have a CHECK constraint)
I - Isolation: Concurrent transactions don’t interfere
  • Multiple transactions can run simultaneously
  • Each transaction sees a consistent view of data
  • One transaction’s changes don’t affect others until committed
D - Durability: Committed changes survive crashes
  • Once a transaction commits, the changes are permanent
  • Even if the database crashes, committed data is safe
  • Changes are written to disk before COMMIT returns
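Durability in PostgreSQL comes from the write-ahead log (WAL): changes are flushed to the WAL on disk before COMMIT returns, so a crash can only lose uncommitted work. The settings that control this are standard configuration parameters (values shown are the usual defaults):
-- Durability-related settings
SHOW wal_level;           -- 'replica': changes are recorded in the WAL
SHOW fsync;               -- 'on': WAL writes are physically flushed to disk
SHOW synchronous_commit;  -- 'on': COMMIT waits for the WAL flush before returning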

Consistency: Valid State Transitions

What is Consistency? Consistency ensures the database moves from one valid state to another. Transactions cannot violate database constraints or rules.
Real-World Analogy: Think of consistency like business rules:
  • “Account balance cannot be negative” (constraint)
  • “Every order must have a customer” (foreign key)
  • “Email addresses must be unique” (unique constraint)
How Consistency Works:
-- Constraints enforce consistency
CREATE TABLE accounts (
    id SERIAL PRIMARY KEY,
    balance DECIMAL(12, 2) NOT NULL,
    CONSTRAINT positive_balance CHECK (balance >= 0)  -- Business rule!
);
Example: What happens with consistency
BEGIN;
UPDATE accounts SET balance = balance - 150 WHERE id = 1;
-- If account 1 has $100, this would make the balance -$50.
-- The CHECK constraint rejects the UPDATE; the statement fails and the
-- transaction is marked as aborted, so no further statements are accepted.
ROLLBACK;  -- End the failed transaction (no changes were kept)
Consistency is enforced by:
  1. Constraints: CHECK, NOT NULL, UNIQUE, FOREIGN KEY
  2. Data types: Can’t put text in an integer column
  3. Business rules: Enforced by application logic or triggers
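The third item can be made concrete with a trigger. A minimal sketch (the rule itself is invented for illustration; EXECUTE FUNCTION requires PostgreSQL 11+, older versions use EXECUTE PROCEDURE):
-- Business rule: an account may only be deleted once its balance is zero
CREATE OR REPLACE FUNCTION forbid_deleting_funded_account()
RETURNS trigger AS $$
BEGIN
    IF OLD.balance <> 0 THEN
        RAISE EXCEPTION 'Cannot delete account % with non-zero balance', OLD.id;
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_block_funded_delete
BEFORE DELETE ON accounts
FOR EACH ROW EXECUTE FUNCTION forbid_deleting_funded_account();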
Important: Consistency is about the database rules, not about concurrent access (that’s Isolation).

3.3 Isolation Levels

Isolation Levels Comparison
┌─────────────────────────────────────────────────────────────────────────────┐
│                          ISOLATION LEVELS                                   │
├────────────────────┬─────────────┬──────────────────┬─────────────────────┤
│ Isolation Level    │ Dirty Read  │ Non-Repeatable   │ Phantom Read        │
│                    │             │ Read             │                     │
├────────────────────┼─────────────┼──────────────────┼─────────────────────┤
│ READ UNCOMMITTED   │ Possible    │ Possible         │ Possible            │
│ READ COMMITTED     │ Prevented   │ Possible         │ Possible            │
│ REPEATABLE READ    │ Prevented   │ Prevented        │ Possible*           │
│ SERIALIZABLE       │ Prevented   │ Prevented        │ Prevented           │
├────────────────────┴─────────────┴──────────────────┴─────────────────────┤
│ * PostgreSQL's REPEATABLE READ also prevents phantoms (uses MVCC)         │
│                                                                            │
│ Higher isolation = More protection = More overhead/locking                 │
└────────────────────────────────────────────────────────────────────────────┘

READ COMMITTED (PostgreSQL Default)

-- Transaction 1
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT balance FROM accounts WHERE id = 1;  -- Returns 1000
-- (Transaction 2 updates and commits here)
SELECT balance FROM accounts WHERE id = 1;  -- Returns 1200 (changed!)
COMMIT;
Behavior:
  • Each statement sees a fresh snapshot of committed data
  • Same query can return different results within a transaction
  • No locks held between statements
Use when:
  • Most OLTP workloads
  • You don’t need consistent reads across multiple queries
  • Maximum concurrency is important

REPEATABLE READ

-- Transaction 1
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;  -- Returns 1000
-- (Transaction 2 updates and commits here)
SELECT balance FROM accounts WHERE id = 1;  -- Still returns 1000!
-- We see the snapshot from when our transaction started
COMMIT;
Behavior:
  • Snapshot taken at first query in transaction
  • All reads see that same snapshot
  • Write conflicts cause serialization errors
-- Write conflict example
-- Transaction 1                    -- Transaction 2
BEGIN;                              BEGIN;
SET TRANSACTION ISOLATION           SET TRANSACTION ISOLATION
  LEVEL REPEATABLE READ;              LEVEL REPEATABLE READ;
  
UPDATE accounts                     
SET balance = balance + 100         
WHERE id = 1;                       
                                    UPDATE accounts
                                    SET balance = balance + 50
                                    WHERE id = 1;
                                    -- ERROR: could not serialize access
                                    -- due to concurrent update
COMMIT;
                                    ROLLBACK;  -- Must retry
Use when:
  • Reports that need consistent point-in-time data
  • Complex transactions reading related data
  • When you can handle retry logic for write conflicts

SERIALIZABLE

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Transactions behave as if run one after another
-- Any anomaly causes serialization failure
COMMIT;
Behavior:
  • Strongest isolation: transactions appear to run serially
  • Uses Serializable Snapshot Isolation (SSI) in PostgreSQL
  • Detects anomalies and aborts conflicting transactions
Use when:
  • Financial calculations
  • When correctness is more important than performance
  • Complex business logic with multiple dependencies
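The classic anomaly that only SERIALIZABLE catches is write skew: two transactions each read the same data, make a decision based on it, and then write to different rows. A sketch using the accounts table (assume accounts 1 and 2 each hold 100 and the application rule is that their combined balance must stay non-negative):
-- Session A
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts WHERE id IN (1, 2);  -- sees 200

-- Session B
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts WHERE id IN (1, 2);  -- also sees 200

-- Session A: "200 available, withdrawing 150 keeps the total >= 0"
UPDATE accounts SET balance = balance - 150 WHERE id = 1;
COMMIT;  -- succeeds

-- Session B: makes the same decision from its own (now stale) read
UPDATE accounts SET balance = balance - 150 WHERE id = 2;
COMMIT;
-- ERROR: could not serialize access due to read/write dependencies among transactions
-- Under REPEATABLE READ both commits would succeed and the rule would be broken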

3.4 Locking Mechanisms

PostgreSQL uses locks to manage concurrent access.

Row-Level Locks

-- FOR UPDATE: Lock rows for modification
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Other transactions block on UPDATE to this row
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- FOR SHARE: Lock rows for reading (prevents updates)
BEGIN;
SELECT * FROM products WHERE id = 10 FOR SHARE;
-- Allows other SELECTs, blocks UPDATEs
COMMIT;

-- SKIP LOCKED: Don't wait, skip locked rows
SELECT * FROM jobs 
WHERE status = 'pending' 
ORDER BY created_at 
LIMIT 1 
FOR UPDATE SKIP LOCKED;
-- Perfect for job queues: take next available job

-- NOWAIT: Fail immediately if locked
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
-- ERROR: could not obtain lock on row

Advisory Locks

Application-controlled locks for custom synchronization.
-- Session-level advisory lock (held until session ends or explicitly released)
SELECT pg_advisory_lock(12345);  -- Acquire lock on key 12345
-- ... do work ...
SELECT pg_advisory_unlock(12345);

-- Transaction-level advisory lock (released at COMMIT/ROLLBACK)
SELECT pg_advisory_xact_lock(12345);
-- No explicit unlock needed

-- Try lock (don't wait)
SELECT pg_try_advisory_lock(12345);  -- Returns true/false

-- Use case: Prevent duplicate processing
SELECT pg_try_advisory_lock(hashtext('process-order-' || order_id::text));

Deadlock Prevention

Deadlock Scenario

PostgreSQL detects deadlocks automatically and aborts one of the waiting transactions:

ERROR:  deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 67890; blocked by process 67891.
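How such a deadlock typically arises: two concurrent transfers touch the same two accounts but lock the rows in opposite order (a sketch; each comment notes what the session is waiting for):
-- Session A
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- locks row 1

-- Session B
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 2;  -- locks row 2

-- Session A
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- waits for Session B

-- Session B
UPDATE accounts SET balance = balance + 100 WHERE id = 1;  -- waits for Session A: deadlock!
-- After deadlock_timeout (1 second by default) PostgreSQL aborts one of the two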

Preventing Deadlocks:
-- Strategy 1: Lock in consistent order
-- Always lock lower ID first
BEGIN;
UPDATE accounts SET balance = balance - 100 
WHERE id = LEAST(1, 2);  -- Lock id=1 first
UPDATE accounts SET balance = balance + 100 
WHERE id = GREATEST(1, 2);  -- Then id=2
COMMIT;

-- Strategy 2: Lock all rows at once
BEGIN;
SELECT * FROM accounts 
WHERE id IN (1, 2) 
ORDER BY id 
FOR UPDATE;  -- Lock both in consistent order

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Strategy 3: Use short transactions
-- Don't hold locks while doing slow operations

3.5 Practical Transaction Patterns

Pattern 1: Retry on Serialization Failure

-- PostgreSQL function with retry logic
-- Caveat: the EXCEPTION block only rolls back to an internal savepoint. At
-- REPEATABLE READ / SERIALIZABLE the transaction keeps its original snapshot,
-- so serialization_failure usually has to be retried by the client in a
-- brand-new transaction; the in-function loop mainly helps with deadlock_detected.
CREATE OR REPLACE FUNCTION transfer_money(
    p_from_account INTEGER,
    p_to_account INTEGER,
    p_amount DECIMAL
) RETURNS BOOLEAN AS $$
DECLARE
    retry_count INTEGER := 0;
    max_retries INTEGER := 3;
BEGIN
    LOOP
        BEGIN
            -- Check sufficient balance
            IF (SELECT balance FROM accounts WHERE id = p_from_account) < p_amount THEN
                RAISE EXCEPTION 'Insufficient funds';
            END IF;
            
            -- Perform transfer
            UPDATE accounts SET balance = balance - p_amount WHERE id = p_from_account;
            UPDATE accounts SET balance = balance + p_amount WHERE id = p_to_account;
            
            -- Log transfer
            INSERT INTO transfers (from_account, to_account, amount, transferred_at)
            VALUES (p_from_account, p_to_account, p_amount, NOW());
            
            RETURN TRUE;
            
        EXCEPTION 
            WHEN serialization_failure OR deadlock_detected THEN
                retry_count := retry_count + 1;
                IF retry_count > max_retries THEN
                    RAISE;
                END IF;
                -- Exponential backoff
                PERFORM pg_sleep(0.1 * power(2, retry_count));
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Pattern 2: Idempotent Operations

Design operations that can be safely retried.
-- Use idempotency key to prevent duplicate processing
CREATE TABLE payments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    idempotency_key VARCHAR(255) UNIQUE,  -- Client-provided key
    user_id INTEGER NOT NULL,
    amount DECIMAL(12, 2) NOT NULL,
    status VARCHAR(20) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Insert with conflict handling
INSERT INTO payments (idempotency_key, user_id, amount, status)
VALUES ('client-req-12345', 1, 100.00, 'completed')
ON CONFLICT (idempotency_key) DO NOTHING
RETURNING *;

-- If duplicate, return existing record
-- Application checks if RETURNING gave a row or needs to SELECT

Pattern 3: Optimistic Locking

Check version before updating, no locks held during read.
-- Add version column
ALTER TABLE products ADD COLUMN version INTEGER NOT NULL DEFAULT 1;

-- Read current state
SELECT id, name, price, stock, version FROM products WHERE id = 123;
-- Returns: version = 5

-- Update with version check
UPDATE products 
SET price = 99.99, 
    version = version + 1
WHERE id = 123 AND version = 5;  -- Expected version

-- Check if update succeeded
-- If 0 rows affected: someone else updated, retry
-- If 1 row affected: success

Pattern 4: Queue Processing with SELECT FOR UPDATE SKIP LOCKED

-- Job queue table
CREATE TABLE jobs (
    id SERIAL PRIMARY KEY,
    payload JSONB NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0,
    locked_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_jobs_pending ON jobs(created_at) WHERE status = 'pending';

-- Worker: Get next job (doesn't block other workers)
BEGIN;

UPDATE jobs 
SET status = 'processing', 
    locked_at = NOW(),
    attempts = attempts + 1
WHERE id = (
    SELECT id FROM jobs 
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED  -- Skip jobs other workers are processing
)
RETURNING *;

-- Process the job...

-- Mark complete
UPDATE jobs SET status = 'completed', completed_at = NOW() WHERE id = ?;

COMMIT;

3.6 Common Transaction Mistakes

Mistake 1: Transaction Too Long

# ❌ BAD: Long transaction holding locks
with db.transaction():
    order = db.query("SELECT * FROM orders WHERE id = 1 FOR UPDATE")
    
    # External API call - could take seconds!
    payment_result = stripe.charge(order.total)
    
    if payment_result.success:
        db.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
    
# Locks held for entire Stripe call duration!

# ✅ GOOD: Short transactions
order = db.query("SELECT * FROM orders WHERE id = 1")

# External call outside transaction
payment_result = stripe.charge(order.total)

if payment_result.success:
    with db.transaction():
        db.execute("""
            UPDATE orders 
            SET status = 'paid', payment_id = %s 
            WHERE id = 1 AND status = 'pending'
        """, [payment_result.id])

Mistake 2: Not Handling Conflicts

# ❌ BAD: Assumes success
db.execute("UPDATE inventory SET stock = stock - 1 WHERE product_id = 1")

# ✅ GOOD: Check result and handle conflicts
result = db.execute("""
    UPDATE inventory 
    SET stock = stock - 1 
    WHERE product_id = 1 AND stock > 0
    RETURNING stock
""")

if result.rowcount == 0:
    raise OutOfStockError("Product is out of stock")

Mistake 3: Ignoring Isolation Level

# ❌ BAD: Inconsistent read without proper isolation
with db.transaction():
    total = db.query("SELECT SUM(amount) FROM line_items WHERE order_id = 1")
    # Another transaction adds a line item here!
    items = db.query("SELECT * FROM line_items WHERE order_id = 1")
    # items and total don't match!

# ✅ GOOD: Use REPEATABLE READ for consistent snapshot
with db.transaction(isolation_level='REPEATABLE READ'):
    total = db.query("SELECT SUM(amount) FROM line_items WHERE order_id = 1")
    items = db.query("SELECT * FROM line_items WHERE order_id = 1")
    # Both queries see the same snapshot

3.7 Practice Exercises

Exercise 1: Money Transfer

Implement a safe money transfer function that:
  • Validates sufficient balance
  • Handles concurrent transfers correctly
  • Logs all transfers
  • Prevents negative balances
CREATE OR REPLACE FUNCTION safe_transfer(
    p_from INTEGER,
    p_to INTEGER,
    p_amount DECIMAL
) RETURNS TABLE(success BOOLEAN, message TEXT) AS $$
DECLARE
    v_from_balance DECIMAL;
BEGIN
    -- Lock both accounts in consistent order to prevent deadlock
    PERFORM * FROM accounts 
    WHERE id IN (p_from, p_to) 
    ORDER BY id 
    FOR UPDATE;
    
    -- Check balance
    SELECT balance INTO v_from_balance 
    FROM accounts WHERE id = p_from;
    
    IF v_from_balance IS NULL OR v_from_balance < p_amount THEN
        RETURN QUERY SELECT false, 'Account not found or insufficient funds';
        RETURN;
    END IF;
    
    -- Perform transfer
    UPDATE accounts SET balance = balance - p_amount WHERE id = p_from;
    UPDATE accounts SET balance = balance + p_amount WHERE id = p_to;
    
    -- Log transfer
    INSERT INTO transfer_log (from_account, to_account, amount)
    VALUES (p_from, p_to, p_amount);
    
    RETURN QUERY SELECT true, 'Transfer successful';
END;
$$ LANGUAGE plpgsql;

Exercise 2: Inventory Reservation

Design a system to reserve inventory for a shopping cart that:
  • Holds items for 15 minutes
  • Releases reservation if cart is abandoned
  • Prevents overselling
  • Handles concurrent checkout
-- Reservations table
CREATE TABLE inventory_reservations (
    id SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id),
    cart_id UUID NOT NULL,
    quantity INTEGER NOT NULL,
    expires_at TIMESTAMPTZ NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (product_id, cart_id)
);

-- Reserve inventory
CREATE OR REPLACE FUNCTION reserve_inventory(
    p_cart_id UUID,
    p_product_id INTEGER,
    p_quantity INTEGER
) RETURNS BOOLEAN AS $$
DECLARE
    v_stock INTEGER;
    v_reserved INTEGER;
BEGIN
    -- Lock the product row first.
    -- (FOR UPDATE cannot be combined with GROUP BY / aggregates, so the lock
    --  and the reservation sum are done in two separate queries.)
    SELECT stock INTO v_stock
    FROM products
    WHERE id = p_product_id
    FOR UPDATE;

    IF v_stock IS NULL THEN
        RETURN FALSE;  -- Unknown product
    END IF;

    -- Sum active reservations while holding the product lock
    SELECT COALESCE(SUM(quantity), 0) INTO v_reserved
    FROM inventory_reservations
    WHERE product_id = p_product_id
      AND expires_at > NOW();

    IF v_stock - v_reserved < p_quantity THEN
        RETURN FALSE;
    END IF;
    
    -- Create or update reservation
    INSERT INTO inventory_reservations (product_id, cart_id, quantity, expires_at)
    VALUES (p_product_id, p_cart_id, p_quantity, NOW() + INTERVAL '15 minutes')
    ON CONFLICT (product_id, cart_id) DO UPDATE
    SET quantity = EXCLUDED.quantity,
        expires_at = EXCLUDED.expires_at;
    
    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;

-- Cleanup expired reservations (run periodically)
DELETE FROM inventory_reservations WHERE expires_at < NOW();

-- On checkout: Convert reservation to order
CREATE OR REPLACE FUNCTION checkout(p_cart_id UUID) RETURNS INTEGER AS $$
DECLARE
    v_order_id INTEGER;
BEGIN
    -- Verify all reservations still valid
    IF EXISTS (
        SELECT 1 FROM inventory_reservations 
        WHERE cart_id = p_cart_id AND expires_at < NOW()
    ) THEN
        RAISE EXCEPTION 'Some reservations have expired';
    END IF;
    
    -- Create order and reduce stock atomically
    INSERT INTO orders (cart_id, status) VALUES (p_cart_id, 'confirmed')
    RETURNING id INTO v_order_id;
    
    -- Reduce stock
    UPDATE products p
    SET stock = stock - r.quantity
    FROM inventory_reservations r
    WHERE r.product_id = p.id AND r.cart_id = p_cart_id;
    
    -- Remove reservations
    DELETE FROM inventory_reservations WHERE cart_id = p_cart_id;
    
    RETURN v_order_id;
END;
$$ LANGUAGE plpgsql;

Next Module

Module 4: Indexing Deep Dive

Master the art of database indexing for blazing-fast queries