Documentation Index
Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
Use this file to discover all available pages before exploring further.
Module 3: Transactions & ACID
Understanding transactions is what separates developers who write code that “usually works” from those who build systems that are reliable under all conditions. This module gives you deep intuition for how transactions work.Estimated Time: 8-10 hours
Hands-On: Concurrency simulation labs
Key Skill: Choosing the right isolation level for your use case
Hands-On: Concurrency simulation labs
Key Skill: Choosing the right isolation level for your use case
3.1 What is a Transaction? (Understanding the Fundamentals)
The Real-World Analogy: All-or-Nothing Operations
Think of a transaction like a purchase at a store: When you buy something, multiple things must happen together:- Your payment is processed
- Money is deducted from your account
- Money is added to the store’s account
- Inventory is reduced by 1
- Receipt is printed
What is a Transaction in Databases?
A transaction is a sequence of database operations that are treated as a single, indivisible unit of work. Think of it as a “package deal” - either everything in the package happens, or nothing does. Key Concept: Atomicity- Atomic = “indivisible” (like an atom, which can’t be split)
- A transaction is atomic: it can’t be partially completed
- It’s like a light switch: it’s either ON or OFF, never halfway
- If BOTH updates succeed → COMMIT makes them permanent
- If EITHER update fails → ROLLBACK undoes everything (Alice keeps her money, Bob doesn’t get it)
Why Do We Need Transactions?
The Problem: Real-World Scenarios- Bank Transfer: Money must leave one account AND arrive in another (both or neither)
- E-commerce Order: Order must be created AND inventory reduced AND payment processed (all or nothing)
- User Registration: User account created AND welcome email sent AND profile initialized (all or nothing)
- Partial failures leave data inconsistent
- Money can disappear
- Orders can be created without inventory being reduced
- Users can be created but profiles missing
- Either everything succeeds, or everything is rolled back
- Data stays consistent
- No “half-completed” operations
Why Transactions Matter
Without transactions, partial failures leave your database in an inconsistent state.3.2 MVCC Internals: How Isolation Works
PostgreSQL uses Multi-Version Concurrency Control (MVCC) to achieve high concurrency. “Readers never block writers, and writers never block readers.”The Hidden Metadata: xmin and xmax
Every row (tuple) in a PostgreSQL table contains hidden system columns used for visibility checks:xmin: The Transaction ID (XID) that created this row version.xmax: The XID that deleted or updated this row version. If0, the row is not deleted.ctid: The physical location (page number, index) of the row version.
The Update “Append-Only” Flow
When you execute anUPDATE, PostgreSQL performs a Delete + Insert:
- The existing row version’s
xmaxis set to the current transaction ID (marking it as “logically deleted”). - A new row version is inserted with its
xminset to the current transaction ID.
UPDATE to a 1KB row creates a new 1KB row. If you run UPDATE users SET last_login = NOW() on 10 million rows, you have just written 10 million new tuples, doubling the table size temporarily until VACUUM cleans up. Always consider the write amplification cost of broad UPDATE statements.
Transaction Isolation via Snapshots
A Snapshot defines which row versions are visible to a transaction. It contains:xmin: The earliest XID that is still active.xmax: The first as-yet-unassigned XID.xip_list: A list of active XIDs betweenxminandxmax.
- The row’s
xminhas committed. - The row’s
xminis not in the snapshot’s “active” list. - The row’s
xmaxis either0, has not committed, or is in the snapshot’s “active” list.
Vacuuming & Bloat
To prevent the database from growing indefinitely, the Auto-Vacuum process runs in the background to:- Reclaim Space: Mark dead tuples as reusable for new data.
- Freeze XIDs: Transactions IDs are 32-bit integers (~4 billion). To prevent Wraparound (where IDs reset and old data becomes invisible), Vacuum “freezes” old XIDs, marking them as permanently visible.
- Update Statistics: Refresh
pg_statisticfor the Query Planner.
Understanding ACID: The Four Guarantees
Transactions provide four critical guarantees, known as ACID: A - Atomicity: All or nothing- Either all operations in a transaction succeed, or all are rolled back
- No partial updates
- Database moves from one valid state to another
- Constraints are never violated
- Example: Account balance can’t go negative (if you have a CHECK constraint)
- Multiple transactions can run simultaneously
- Each transaction sees a consistent view of data
- One transaction’s changes don’t affect others until committed
- Once a transaction commits, the changes are permanent
- Even if the database crashes, committed data is safe
- Changes are written to disk before COMMIT returns
Consistency: Valid State Transitions
What is Consistency? Consistency ensures the database moves from one valid state to another. Transactions cannot violate database constraints or rules. Real-World Analogy: Think of consistency like business rules:- “Account balance cannot be negative” (constraint)
- “Every order must have a customer” (foreign key)
- “Email addresses must be unique” (unique constraint)
- Constraints: CHECK, NOT NULL, UNIQUE, FOREIGN KEY
- Data types: Can’t put text in an integer column
- Business rules: Enforced by application logic or triggers
3.3 Isolation Levels
READ COMMITTED (PostgreSQL Default)
- Each statement sees a fresh snapshot of committed data
- Same query can return different results within a transaction
- No locks held between statements
- Most OLTP workloads
- You don’t need consistent reads across multiple queries
- Maximum concurrency is important
BEGIN/COMMIT gives them a frozen view of the database. It does not at READ COMMITTED. If you read a value, do some computation, then read it again, the value may have changed. If your logic depends on values staying stable across multiple queries, you need REPEATABLE READ.
REPEATABLE READ
- Snapshot taken at first query in transaction
- All reads see that same snapshot
- Write conflicts cause serialization errors
- Reports that need consistent point-in-time data
- Complex transactions reading related data
- When you can handle retry logic for write conflicts
SERIALIZABLE
- Strongest isolation: transactions appear to run serially
- Uses Serializable Snapshot Isolation (SSI) in PostgreSQL
- Detects anomalies and aborts conflicting transactions
- Financial calculations where any anomaly is unacceptable
- When correctness is more important than performance
- Complex business logic with multiple dependencies
3.4 Locking Mechanisms
PostgreSQL uses locks to manage concurrent access.Row-Level Locks
Advisory Locks
Application-controlled locks for custom synchronization.Deadlock Prevention
3.5 Practical Transaction Patterns
Pattern 1: Retry on Serialization Failure
Pattern 2: Idempotent Operations
Design operations that can be safely retried.Pattern 3: Optimistic Locking
Check version before updating, no locks held during read.Pattern 4: Queue Processing with SELECT FOR UPDATE SKIP LOCKED
3.6 Common Transaction Mistakes
Mistake 1: Transaction Too Long
Mistake 2: Not Handling Conflicts
Mistake 3: Ignoring Isolation Level
3.7 Practice Exercises
Exercise 1: Money Transfer
Implement a safe money transfer function that:- Validates sufficient balance
- Handles concurrent transfers correctly
- Logs all transfers
- Prevents negative balances
Solution
Solution
Exercise 2: Inventory Reservation
Design a system to reserve inventory for a shopping cart that:- Holds items for 15 minutes
- Releases reservation if cart is abandoned
- Prevents overselling
- Handles concurrent checkout
Solution
Solution
Next Module
Module 4: Indexing Deep Dive
Master the art of database indexing for blazing-fast queries
Interview Deep-Dive
Explain the difference between REPEATABLE READ in PostgreSQL and REPEATABLE READ in MySQL/InnoDB. Why does this distinction matter in production?
Explain the difference between REPEATABLE READ in PostgreSQL and REPEATABLE READ in MySQL/InnoDB. Why does this distinction matter in production?
Strong Answer:
- PostgreSQL’s REPEATABLE READ uses true Snapshot Isolation (SI) via MVCC. When your transaction starts, you get a frozen snapshot of the database at that point in time. All reads within the transaction see this same snapshot. Crucially, PostgreSQL’s SI also prevents phantom reads — if another transaction inserts a new row that would match your query’s WHERE clause, you will not see it. This is stronger than the SQL standard requires for REPEATABLE READ.
- MySQL/InnoDB’s REPEATABLE READ also uses MVCC snapshots for SELECT statements, but it uses gap locking for write operations to prevent phantoms. The subtle difference is that InnoDB’s approach can cause more lock contention because it physically locks ranges of index records, while PostgreSQL’s approach detects conflicts at commit time and raises serialization errors.
- This matters in production because developers migrating from MySQL to PostgreSQL (or vice versa) often assume identical behavior. In PostgreSQL, concurrent updates to the same row under REPEATABLE READ will cause the second transaction to fail with
ERROR: could not serialize access due to concurrent update. In MySQL, the second transaction would block until the first commits, then proceed. The PostgreSQL approach requires application-level retry logic but avoids long lock waits.
A production system is experiencing frequent deadlocks. Walk me through your diagnosis and resolution process.
A production system is experiencing frequent deadlocks. Walk me through your diagnosis and resolution process.
Strong Answer:
- Step 1: Enable deadlock logging. Set
log_lock_waits = onanddeadlock_timeout = 1s(the default). PostgreSQL logs the exact queries and lock types involved in every deadlock, including the circular dependency chain. Read the log — it tells you exactly which transactions, which tables, which rows. - Step 2: Identify the pattern. In my experience, 90% of application-level deadlocks fall into two categories: (a) two transactions updating the same rows in different order, and (b) a SELECT FOR UPDATE followed by an UPDATE that creates a gap between lock acquisition and lock escalation.
- Step 3: Fix the ordering. The universal solution is to ensure all transactions acquire locks in a consistent global order. If you are transferring money between accounts, always lock the lower account ID first:
SELECT * FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE. This eliminates the possibility of circular waits. - Step 4: Shorten transactions. Deadlocks are more likely when transactions hold locks for longer. Move any non-database work (API calls, computation) outside the transaction. Use
SET LOCAL statement_timeout = '5s'to prevent runaway transactions from holding locks indefinitely. - Step 5: Consider advisory locks for application-level coordination. If the business logic inherently requires locking across multiple tables in unpredictable order, use
pg_advisory_xact_lock(hash_of_entity)at the beginning of the transaction to serialize access to a logical entity.
deadlock_timeout milliseconds (default 1 second). It builds a wait-for graph by examining all backends that are waiting for locks and the backends that hold the conflicting locks. If it finds a cycle in this graph, it picks a victim — typically the transaction that has done the least work (measured by number of locks held) — and terminates it with ERROR: deadlock detected. The detection is global across all backends and runs in the lock manager’s shared memory. The key insight is that the deadlock_timeout also acts as a damper: short-lived lock waits resolve naturally without invoking the detector, reducing overhead.You need to implement an exactly-once job processing system using PostgreSQL. How would you design it without external message brokers?
You need to implement an exactly-once job processing system using PostgreSQL. How would you design it without external message brokers?
Strong Answer:
- The pattern is SELECT FOR UPDATE SKIP LOCKED combined with idempotency keys. Create a jobs table with columns: id, payload, status (pending/processing/completed/failed), locked_at, attempts, idempotency_key.
- Workers run:
BEGIN; UPDATE jobs SET status = 'processing', locked_at = NOW() WHERE id = (SELECT id FROM jobs WHERE status = 'pending' AND (locked_at IS NULL OR locked_at < NOW() - INTERVAL '5 minutes') ORDER BY created_at LIMIT 1 FOR UPDATE SKIP LOCKED) RETURNING *;This atomically claims a job. SKIP LOCKED means other workers do not block — they grab the next available job instead. - After processing, the worker either sets status to ‘completed’ and COMMITs, or on failure, sets status to ‘failed’ and increments attempts. The
locked_attimeout handles the case where a worker crashes mid-processing — after 5 minutes, the job becomes available for re-claim. - For exactly-once semantics, use the idempotency_key: before processing, check if the output of this job already exists. This is crucial because “at-least-once delivery + idempotent processing = exactly-once semantics.”
- A partial index
CREATE INDEX idx_jobs_pending ON jobs(created_at) WHERE status = 'pending'ensures the worker query remains fast even with millions of completed jobs in the table.
Explain the transaction ID wraparound problem in PostgreSQL. Why is it dangerous, and how do you prevent it?
Explain the transaction ID wraparound problem in PostgreSQL. Why is it dangerous, and how do you prevent it?
Strong Answer:
- PostgreSQL uses a 32-bit transaction ID (XID) counter, which means it can only represent about 4.2 billion transactions before wrapping around. MVCC visibility relies on comparing XIDs — “is this row’s xmin older or newer than my snapshot?” When the counter wraps, old XIDs that were “in the past” suddenly appear to be “in the future,” making previously visible data invisible. If this goes unchecked, the database will proactively shut down to prevent data loss.
- The prevention mechanism is VACUUM’s freeze operation. VACUUM scans tables and replaces old XIDs with a special FrozenTransactionId (2), which is defined as “always in the past” for visibility checks. Once frozen, a row’s visibility is no longer dependent on XID comparison. The parameter
autovacuum_freeze_max_age(default 200 million) forces an aggressive autovacuum when a table’s oldest unfrozen XID approaches the danger zone. - The danger scenario: a long-running transaction holds back the oldest active XID, preventing VACUUM from freezing rows in any table. If this goes on long enough (approaching 2 billion XIDs), PostgreSQL will refuse to start new transactions and log warnings about impending wraparound. The emergency fix is to kill the long-running transaction, then run VACUUM FREEZE on affected tables.
- Production prevention: monitor
age(datfrozenxid)per database andage(relfrozenxid)per table. Alert when any value exceeds 500 million. Kill transactions older than 1 hour. Ensure autovacuum is not being blocked by long-running queries on replicas (hot_standby_feedback can contribute to this).