
Module 16: System Design Case Studies

Apply everything you’ve learned to real-world database design challenges. Each case study covers requirements, schema design, queries, scaling, and trade-offs.
Estimated Time: 10-12 hours
Format: 6 comprehensive case studies
Interview Use: Practice for 45-60 min design rounds
Skill Level: Senior/Staff

Case Study 1: E-Commerce Platform

Requirements

  • Product catalog with categories and attributes
  • User accounts with addresses
  • Shopping cart and checkout
  • Order management with status tracking
  • Inventory management with stock levels
  • Review and rating system

Schema Design

-- Core entities
CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE addresses (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT REFERENCES users(id),
    type VARCHAR(20), -- 'shipping', 'billing'
    street TEXT NOT NULL,
    city VARCHAR(100),
    country VARCHAR(2),
    postal_code VARCHAR(20),
    is_default BOOLEAN DEFAULT false
);

-- Product catalog
CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    parent_id INT REFERENCES categories(id),
    path ltree  -- requires ltree extension
);

CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    sku VARCHAR(50) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    category_id INT REFERENCES categories(id),
    base_price DECIMAL(10,2) NOT NULL,
    attributes JSONB DEFAULT '{}',  -- flexible attributes
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE product_variants (
    id BIGSERIAL PRIMARY KEY,
    product_id BIGINT REFERENCES products(id),
    sku VARCHAR(50) UNIQUE NOT NULL,
    name VARCHAR(100),  -- "Red, Large"
    price_modifier DECIMAL(10,2) DEFAULT 0,
    attributes JSONB NOT NULL  -- {"color": "red", "size": "L"}
);

-- Inventory
CREATE TABLE inventory (
    variant_id BIGINT PRIMARY KEY REFERENCES product_variants(id),
    quantity INT NOT NULL DEFAULT 0,
    reserved INT NOT NULL DEFAULT 0,
    CHECK (quantity >= 0),
    CHECK (reserved >= 0),
    CHECK (reserved <= quantity)
);

-- Orders
CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT REFERENCES users(id),
    status VARCHAR(20) DEFAULT 'pending',
    total DECIMAL(12,2) NOT NULL,
    shipping_address_id BIGINT REFERENCES addresses(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE order_items (
    id BIGSERIAL PRIMARY KEY,
    order_id BIGINT REFERENCES orders(id),
    variant_id BIGINT REFERENCES product_variants(id),
    quantity INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    CHECK (quantity > 0)
);

-- Reviews
CREATE TABLE reviews (
    id BIGSERIAL PRIMARY KEY,
    product_id BIGINT REFERENCES products(id),
    user_id BIGINT REFERENCES users(id),
    rating INT NOT NULL CHECK (rating BETWEEN 1 AND 5),
    title VARCHAR(200),
    body TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(product_id, user_id)
);

Key Queries

-- Product listing with category tree
WITH RECURSIVE cat_tree AS (
    SELECT id, name, parent_id, ARRAY[name] AS path
    FROM categories WHERE id = $category_id
    UNION ALL
    SELECT c.id, c.name, c.parent_id, ct.path || c.name
    FROM categories c
    JOIN cat_tree ct ON c.parent_id = ct.id
)
SELECT p.*, array_agg(pv.id) AS variant_ids
FROM products p
JOIN cat_tree ct ON p.category_id = ct.id
LEFT JOIN product_variants pv ON pv.product_id = p.id
GROUP BY p.id;
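
The categories table also defines an ltree path column that the recursive CTE above doesn't use. As a sketch, and assuming path is kept in sync with parent_id, the same subtree lookup can be written without recursion (a GiST index on categories(path) makes the <@ operator efficient):

-- Alternative: category subtree via ltree instead of a recursive CTE
SELECT p.*, array_agg(pv.id) AS variant_ids
FROM products p
JOIN categories c ON c.id = p.category_id
LEFT JOIN product_variants pv ON pv.product_id = p.id
WHERE c.path <@ (SELECT path FROM categories WHERE id = $category_id)
GROUP BY p.id;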

-- Checkout with inventory reservation
BEGIN;
SELECT * FROM inventory 
WHERE variant_id = $variant_id 
FOR UPDATE;  -- Lock row

UPDATE inventory 
SET reserved = reserved + $quantity
WHERE variant_id = $variant_id
  AND (quantity - reserved) >= $quantity;

-- If the UPDATE affected 0 rows, stock is insufficient: ROLLBACK instead of committing
COMMIT;
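
Reserving stock is only half of the flow. A hedged sketch of the follow-up statements, assuming the order is later either paid or cancelled:

-- On successful payment: convert the reservation into an actual decrement
UPDATE inventory
SET quantity = quantity - $quantity,
    reserved = reserved - $quantity
WHERE variant_id = $variant_id;

-- On cancellation or checkout timeout: release the reservation
UPDATE inventory
SET reserved = reserved - $quantity
WHERE variant_id = $variant_id;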

-- Product with aggregated reviews
SELECT 
    p.*,
    COUNT(r.id) AS review_count,
    AVG(r.rating) AS avg_rating,
    jsonb_build_object(
        '5', COUNT(*) FILTER (WHERE r.rating = 5),
        '4', COUNT(*) FILTER (WHERE r.rating = 4),
        '3', COUNT(*) FILTER (WHERE r.rating = 3),
        '2', COUNT(*) FILTER (WHERE r.rating = 2),
        '1', COUNT(*) FILTER (WHERE r.rating = 1)
    ) AS rating_distribution
FROM products p
LEFT JOIN reviews r ON r.product_id = p.id
WHERE p.id = $product_id
GROUP BY p.id;

Scaling Strategy

┌─────────────────────────────────────────────────────────────────────────────┐
│                    E-COMMERCE SCALING                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Read Path (90% of traffic):                                               │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                  │
│   │   CDN       │────▶│   Redis     │────▶│   Replica   │                  │
│   │ (static)    │     │ (products)  │     │  (queries)  │                  │
│   └─────────────┘     └─────────────┘     └─────────────┘                  │
│                                                                              │
│   Write Path (10% of traffic):                                              │
│   ┌─────────────┐     ┌─────────────┐                                       │
│   │    App      │────▶│   Primary   │                                       │
│   │  (orders)   │     │  (writes)   │                                       │
│   └─────────────┘     └─────────────┘                                       │
│                                                                              │
│   Indexing:                                                                  │
│   • products(category_id, created_at DESC) - listings                       │
│   • products USING GIN(attributes) - attribute search                       │
│   • inventory(variant_id) - stock checks                                    │
│   • orders(user_id, created_at DESC) - order history                        │
│                                                                              │
│   Partitioning:                                                              │
│   • orders by created_at (monthly) - historical data                        │
│   • reviews by product_id hash - even distribution                          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
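
The indexing bullets in the box above translate roughly into the following DDL (index names are illustrative; the inventory stock check is already served by its primary key):

CREATE INDEX idx_products_category_created ON products(category_id, created_at DESC);  -- listings
CREATE INDEX idx_products_attributes ON products USING GIN(attributes);                -- attribute search
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);              -- order history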

Case Study 2: Real-Time Analytics

Requirements

  • Ingest millions of events per day
  • Sub-second dashboard queries
  • Time-series aggregations (hourly, daily, weekly)
  • Retention: raw data 30 days, aggregates forever

Schema Design

-- Time-series events with partitioning
CREATE TABLE events (
    id BIGSERIAL,
    event_time TIMESTAMPTZ NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    user_id BIGINT,
    properties JSONB,
    PRIMARY KEY (event_time, id)
) PARTITION BY RANGE (event_time);

-- Create monthly partitions
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- ... more partitions

-- Pre-aggregated summaries
CREATE TABLE event_stats_hourly (
    hour TIMESTAMPTZ NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    count BIGINT DEFAULT 0,
    unique_users BIGINT DEFAULT 0,
    PRIMARY KEY (hour, event_type)
);

CREATE TABLE event_stats_daily (
    date DATE NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    count BIGINT DEFAULT 0,
    unique_users BIGINT DEFAULT 0,
    properties_agg JSONB,  -- aggregated metrics
    PRIMARY KEY (date, event_type)
);

-- HyperLogLog for unique counts (requires extension)
CREATE EXTENSION IF NOT EXISTS hll;

ALTER TABLE event_stats_hourly 
ADD COLUMN user_hll hll DEFAULT hll_empty();
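
Monthly partitions have to be created ahead of time. One way to automate this, sketched as a DO block that could be run from a scheduler such as cron:

-- Create next month's partition if it doesn't exist yet
DO $$
DECLARE
    start_month DATE := date_trunc('month', NOW() + INTERVAL '1 month');
    end_month   DATE := start_month + INTERVAL '1 month';
    part_name   TEXT := 'events_' || to_char(start_month, 'YYYY_MM');
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF events FOR VALUES FROM (%L) TO (%L)',
        part_name, start_month, end_month
    );
END $$;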

Ingestion Pattern

-- Batch insert with conflict handling
INSERT INTO events (event_time, event_type, user_id, properties)
VALUES 
    ($1, $2, $3, $4),
    ($5, $6, $7, $8),
    -- ... batch of 1000
ON CONFLICT DO NOTHING;

-- Continuous aggregation (run every hour)
INSERT INTO event_stats_hourly (hour, event_type, count, unique_users, user_hll)
SELECT 
    date_trunc('hour', event_time) AS hour,
    event_type,
    COUNT(*) AS count,
    COUNT(DISTINCT user_id) AS unique_users,
    hll_add_agg(hll_hash_bigint(user_id)) AS user_hll
FROM events
WHERE event_time >= $last_processed_hour
  AND event_time < $current_hour
GROUP BY 1, 2
ON CONFLICT (hour, event_type) DO UPDATE SET
    count = event_stats_hourly.count + EXCLUDED.count,
    user_hll = hll_union(event_stats_hourly.user_hll, EXCLUDED.user_hll),
    -- exact distinct counts can't be summed across batches; derive them from the merged HLL
    unique_users = hll_cardinality(hll_union(event_stats_hourly.user_hll, EXCLUDED.user_hll))::BIGINT;

Query Patterns

-- Dashboard query (uses pre-aggregates)
SELECT 
    date_trunc('day', hour) AS date,
    event_type,
    SUM(count) AS total_events,
    hll_cardinality(hll_union_agg(user_hll)) AS unique_users
FROM event_stats_hourly
WHERE hour >= NOW() - INTERVAL '7 days'
GROUP BY 1, 2
ORDER BY 1, 2;

-- Drill-down to raw events
SELECT 
    event_time,
    event_type,
    properties
FROM events
WHERE event_time >= $start AND event_time < $end
  AND event_type = $type
ORDER BY event_time DESC
LIMIT 100;

Data Lifecycle

-- Drop old partitions (automated)
DROP TABLE events_2023_01;  -- Fast: drops the whole partition without row-by-row deletes (brief ACCESS EXCLUSIVE lock)

-- Compress old data with TimescaleDB (optional)
-- Or move to columnar storage (Citus columnar)
SELECT alter_table_set_access_method('events_2023_12', 'columnar');
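
The retention rule (raw data for 30 days, aggregates forever) can also be scheduled instead of run by hand. A sketch assuming the pg_cron extension is installed; the hard-coded partition name would be computed dynamically in practice:

CREATE EXTENSION IF NOT EXISTS pg_cron;

SELECT cron.schedule(
    'drop-old-event-partitions',
    '0 3 1 * *',  -- 03:00 on the 1st of every month
    $$ DROP TABLE IF EXISTS events_2023_01 $$
);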

Case Study 3: Social Media Feed

Requirements

  • Users follow other users
  • Home feed shows posts from followed users
  • Support for 100M users, 10M DAU
  • Feed must be < 100ms

Schema Design

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE follows (
    follower_id BIGINT NOT NULL REFERENCES users(id),
    following_id BIGINT NOT NULL REFERENCES users(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (follower_id, following_id)
);

CREATE TABLE posts (
    id BIGSERIAL PRIMARY KEY,
    author_id BIGINT NOT NULL REFERENCES users(id),
    content TEXT NOT NULL,
    media_urls TEXT[],
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Fan-out timeline (for < 10K followers)
CREATE TABLE timelines (
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL REFERENCES posts(id),
    author_id BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (user_id, post_id)  -- DESC ordering belongs in an index, not a PK constraint
);

CREATE INDEX idx_timelines_user_time ON timelines(user_id, created_at DESC);

-- Celebrity posts (for > 10K followers)
-- Not fanned out, merged at read time
CREATE INDEX idx_posts_author_time ON posts(author_id, created_at DESC);

Fan-Out Strategy

-- On new post from regular user (< 10K followers)
INSERT INTO timelines (user_id, post_id, author_id, created_at)
SELECT 
    follower_id,
    $post_id,
    $author_id,
    $created_at
FROM follows
WHERE following_id = $author_id;

-- Async worker for large fan-outs
-- Use message queue (RabbitMQ, SQS) for batching
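
The hybrid feed query below joins against a user_stats table that isn't defined in the schema above. A minimal sketch of one way to maintain it with a trigger (the table, function, and trigger names here are illustrative; the count could equally be updated by an async job):

CREATE TABLE user_stats (
    user_id BIGINT PRIMARY KEY REFERENCES users(id),
    follower_count BIGINT NOT NULL DEFAULT 0
);

CREATE OR REPLACE FUNCTION bump_follower_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO user_stats (user_id, follower_count)
        VALUES (NEW.following_id, 1)
        ON CONFLICT (user_id) DO UPDATE SET follower_count = user_stats.follower_count + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE user_stats SET follower_count = follower_count - 1 WHERE user_id = OLD.following_id;
    END IF;
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_follows_count
AFTER INSERT OR DELETE ON follows
FOR EACH ROW EXECUTE FUNCTION bump_follower_count();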

Feed Query (Hybrid)

-- Get user's celebrity followings
WITH celebrities AS (
    SELECT following_id 
    FROM follows f
    JOIN user_stats us ON us.user_id = f.following_id
    WHERE f.follower_id = $user_id 
      AND us.follower_count > 10000
),
-- Get pre-materialized timeline
materialized AS (
    SELECT post_id, author_id, created_at
    FROM timelines
    WHERE user_id = $user_id
      AND created_at > NOW() - INTERVAL '7 days'
    ORDER BY created_at DESC
    LIMIT 100
),
-- Get celebrity posts
celebrity_posts AS (
    SELECT id AS post_id, author_id, created_at
    FROM posts
    WHERE author_id IN (SELECT following_id FROM celebrities)
      AND created_at > NOW() - INTERVAL '7 days'
    ORDER BY created_at DESC
    LIMIT 100
)
-- Merge and return
SELECT p.*
FROM posts p
JOIN (
    SELECT * FROM materialized
    UNION ALL
    SELECT * FROM celebrity_posts
) feed ON feed.post_id = p.id
ORDER BY feed.created_at DESC
LIMIT 20;

Case Study 4: Multi-Tenant SaaS

Requirements

  • Thousands of tenants (companies)
  • Complete data isolation
  • Some tenants are very large (1M+ rows)
  • Compliance requirements (GDPR)

Schema Design (Shared with RLS)

-- Enable RLS
CREATE TABLE tenants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    plan VARCHAR(20) DEFAULT 'free',
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    email VARCHAR(255) NOT NULL,
    role VARCHAR(20) DEFAULT 'member',
    UNIQUE(tenant_id, email)
);

CREATE TABLE projects (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE tasks (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    project_id BIGINT REFERENCES projects(id),
    title VARCHAR(255) NOT NULL,
    status VARCHAR(20) DEFAULT 'open',
    assignee_id BIGINT REFERENCES users(id)
);

-- Row Level Security
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation_projects ON projects
    USING (tenant_id = current_setting('app.tenant_id')::UUID);

CREATE POLICY tenant_isolation_tasks ON tasks
    USING (tenant_id = current_setting('app.tenant_id')::UUID);

-- Set tenant context (in application)
-- SET app.tenant_id = 'uuid-here';
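
One caveat worth raising in an interview: by default, table owners and superusers are not subject to RLS. If the application role owns the tables, force the policies to apply as well:

-- Apply RLS to the table owner too (superusers and BYPASSRLS roles still bypass it)
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
ALTER TABLE tasks FORCE ROW LEVEL SECURITY;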

Connection Management

# Application middleware
from contextlib import contextmanager

@contextmanager
def set_tenant_context(tenant_id):
    """Set tenant context for RLS (parameterized to avoid SQL injection)."""
    with get_db_connection() as conn:
        conn.execute("SELECT set_config('app.tenant_id', %s, false)", (tenant_id,))
        try:
            yield conn
        finally:
            conn.execute("RESET app.tenant_id")

# PgBouncer with tenant routing (advanced)
# Route large tenants to dedicated pools

Tenant Isolation Testing

-- Test: Ensure tenant A can't see tenant B's data
SET app.tenant_id = 'tenant-a-uuid';

SELECT * FROM projects;  -- Should only see tenant A

-- Try to access tenant B (should fail silently - return 0 rows)
SELECT * FROM projects WHERE tenant_id = 'tenant-b-uuid';
-- Returns 0 rows due to RLS

-- Bypass for admin operations
SET ROLE admin_role;  -- Must have BYPASSRLS
SELECT * FROM projects;  -- Sees all tenants

Case Study 5: Event Sourcing System

Requirements

  • All state changes stored as events (audit trail)
  • Ability to rebuild state at any point in time
  • Support for projections (read models)
  • Event replay for debugging

Schema Design

-- Event store (append-only)
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    stream_id UUID NOT NULL,
    stream_type VARCHAR(100) NOT NULL,  -- 'Order', 'User', etc.
    event_type VARCHAR(100) NOT NULL,   -- 'OrderCreated', 'ItemAdded'
    event_data JSONB NOT NULL,
    metadata JSONB DEFAULT '{}',
    version INT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(stream_id, version)  -- Optimistic concurrency
);

-- Example events (stream_id must be a valid UUID; a fixed example value is used here)
INSERT INTO events (stream_id, stream_type, event_type, event_data, version)
VALUES 
    ('f47ac10b-58cc-4372-a567-0e02b2c3d479', 'Order', 'OrderCreated', 
     '{"customer_id": 456, "items": []}', 1),
    ('f47ac10b-58cc-4372-a567-0e02b2c3d479', 'Order', 'ItemAdded', 
     '{"product_id": 789, "quantity": 2}', 2),
    ('f47ac10b-58cc-4372-a567-0e02b2c3d479', 'Order', 'OrderSubmitted', 
     '{"submitted_at": "2024-01-15T10:30:00Z"}', 3);

-- Projection: Materialized current state
CREATE TABLE order_projections (
    order_id UUID PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    items JSONB NOT NULL DEFAULT '[]',
    status VARCHAR(20) NOT NULL,
    total DECIMAL(12,2),
    last_event_id BIGINT NOT NULL,  -- For replay
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Subscription checkpoints
CREATE TABLE subscription_checkpoints (
    subscription_id VARCHAR(100) PRIMARY KEY,
    last_processed_id BIGINT NOT NULL,
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

Event Replay

-- Rebuild projection from events
CREATE OR REPLACE FUNCTION rebuild_order_projection(p_order_id UUID)
RETURNS void AS $$
DECLARE
    evt RECORD;
    current_state JSONB := '{}';
BEGIN
    -- Delete existing projection
    DELETE FROM order_projections WHERE order_id = p_order_id;
    
    -- Replay all events (apply_event is a separate helper that folds one event into the state)
    FOR evt IN 
        SELECT * FROM events 
        WHERE stream_id = p_order_id 
        ORDER BY version
    LOOP
        current_state := apply_event(current_state, evt.event_type, evt.event_data);
    END LOOP;

    IF NOT FOUND THEN
        RETURN;  -- no events exist for this stream; nothing to rebuild
    END IF;

    -- Insert rebuilt projection
    INSERT INTO order_projections (order_id, customer_id, items, status, last_event_id)
    VALUES (
        p_order_id,
        (current_state->>'customer_id')::BIGINT,
        current_state->'items',
        current_state->>'status',
        evt.id
    );
END;
$$ LANGUAGE plpgsql;
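
rebuild_order_projection assumes an apply_event helper that encodes the domain logic. A minimal sketch covering only the three example event types (an illustration, not a complete reducer):

CREATE OR REPLACE FUNCTION apply_event(state JSONB, event_type TEXT, event_data JSONB)
RETURNS JSONB AS $$
BEGIN
    RETURN CASE event_type
        WHEN 'OrderCreated'   THEN state || event_data || '{"status": "created"}'
        WHEN 'ItemAdded'      THEN jsonb_set(state, '{items}',
                                   COALESCE(state->'items', '[]') || jsonb_build_array(event_data))
        WHEN 'OrderSubmitted' THEN state || '{"status": "submitted"}'
        ELSE state
    END;
END;
$$ LANGUAGE plpgsql IMMUTABLE;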

Subscription Processing

# Python async event processor
import asyncio

async def process_events():
    last_id = await get_checkpoint('order_projection')
    
    while True:
        events = await db.fetch("""
            SELECT * FROM events 
            WHERE id > $1 
            ORDER BY id 
            LIMIT 100
        """, last_id)
        
        for event in events:
            await update_projection(event)
            last_id = event['id']
        
        await save_checkpoint('order_projection', last_id)
        await asyncio.sleep(0.1)
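
The get_checkpoint and save_checkpoint helpers map onto the subscription_checkpoints table; the underlying statements are roughly:

-- get_checkpoint: default to 0 when the subscription has never run
SELECT COALESCE(
    (SELECT last_processed_id FROM subscription_checkpoints WHERE subscription_id = $subscription_id),
    0
);

-- save_checkpoint: upsert the latest processed event id
INSERT INTO subscription_checkpoints (subscription_id, last_processed_id, updated_at)
VALUES ($subscription_id, $last_id, NOW())
ON CONFLICT (subscription_id) DO UPDATE SET
    last_processed_id = EXCLUDED.last_processed_id,
    updated_at = NOW();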

Case Study 6: Global Application

Requirements

  • Users in US, EU, Asia
  • Data residency (EU data stays in EU)
  • Low latency reads globally
  • Strong consistency for writes

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                    GLOBAL DATABASE ARCHITECTURE                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   US-EAST (Primary for US users)                                            │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  ┌─────────────┐     ┌─────────────┐                               │   │
│   │  │   Primary   │────▶│   Replica   │ (local reads)                 │   │
│   │  │  (writes)   │     │             │                               │   │
│   │  └──────┬──────┘     └─────────────┘                               │   │
│   └─────────┼───────────────────────────────────────────────────────────┘   │
│             │ async replication (reference data only)                       │
│   ┌─────────┼───────────────────────────────────────────────────────────┐   │
│   │         ▼           EU-WEST (Primary for EU users)                 │   │
│   │  ┌─────────────┐     ┌─────────────┐                               │   │
│   │  │   Primary   │────▶│   Replica   │                               │   │
│   │  │  (writes)   │     │             │                               │   │
│   │  └─────────────┘     └─────────────┘                               │   │
│   │                                                                     │   │
│   │  EU user data ONLY stored here (GDPR)                              │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   Data Routing:                                                             │
│   • User data: Route to home region based on user.region                   │
│   • Reference data: Replicated globally (products, configs)                │
│   • Cross-region queries: Federation layer or application-level           │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Implementation

-- Users partitioned by region
CREATE TABLE users (
    id BIGSERIAL,
    region VARCHAR(10) NOT NULL,  -- 'us', 'eu', 'asia'
    email VARCHAR(255) NOT NULL,
    data JSONB,
    PRIMARY KEY (region, id)
) PARTITION BY LIST (region);

CREATE TABLE users_us PARTITION OF users FOR VALUES IN ('us');
CREATE TABLE users_eu PARTITION OF users FOR VALUES IN ('eu');
CREATE TABLE users_asia PARTITION OF users FOR VALUES IN ('asia');

-- Reference data (replicated everywhere)
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL  -- no region-specific columns
);

-- Foreign data wrapper for cross-region queries
CREATE EXTENSION postgres_fdw;

CREATE SERVER eu_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'eu-db.example.com', port '5432', dbname 'app');

CREATE USER MAPPING FOR app_user
SERVER eu_server
OPTIONS (user 'app_user', password 'secret');

CREATE FOREIGN TABLE users_eu_remote (
    id BIGINT,
    region VARCHAR(10),
    email VARCHAR(255)
) SERVER eu_server OPTIONS (table_name 'users_eu');
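
With the foreign table defined, an occasional cross-region lookup can be hidden behind a view that unions local and remote rows. A sketch only (the view name is illustrative, and data-residency rules constrain which columns may cross regions):

CREATE VIEW users_global AS
SELECT id, region, email FROM users_us
UNION ALL
SELECT id, region, email FROM users_eu_remote;

-- Example: locate the home region for a login email
SELECT region FROM users_global WHERE email = $email;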

How to Use These Case Studies in Interviews

System design interviews for database engineering roles typically follow a 45-60 minute format. Here’s how to leverage these case studies for maximum interview success:

Interview Structure and Timing

Phase 1: Requirements Clarification (5-10 minutes)
  • Ask about scale: How many users? Queries per second? Data volume?
  • Identify read vs write ratio (often 90:10 or higher)
  • Clarify consistency requirements: Strong vs eventual consistency?
  • Understand geographic distribution and latency requirements
  • Confirm data retention and compliance needs
Phase 2: High-Level Design (10-15 minutes)
  • Start with a simple single-server design
  • Identify the main entities and their relationships
  • Draw a basic ER diagram or schema outline
  • Explain your normalization choices
  • Discuss primary access patterns and query shapes
Phase 3: Deep Dive on Specific Components (15-20 minutes)
  • Schema Design: Walk through 2-3 core tables in detail
    • Show primary keys, foreign keys, indexes
    • Explain constraint choices
    • Justify data types and column sizes
  • Query Patterns: Write 2-3 critical queries
    • Show how indexes support these queries
    • Discuss query plan expectations
    • Mention potential optimization techniques
  • Scaling Bottlenecks: Identify the first scaling constraint
    • Connection limits? I/O? CPU?
    • Propose specific solutions (not just “add caching”)
Phase 4: Scaling Discussion (10-15 minutes)
  • Progression from single server to scaled architecture
  • Read scaling: Replicas, caching layers, CDN
  • Write scaling: Partitioning, sharding strategies
  • Handle trade-offs explicitly:
    • “If we denormalize here, we gain X but lose Y”
    • “Caching improves reads but introduces consistency challenges”
Phase 5: Trade-offs and Extensions (5-10 minutes)
  • Discuss monitoring and observability needs
  • Mention backup and disaster recovery
  • Consider operational complexity
  • Address interviewer’s specific concerns

What Interviewers Look For

✅ Strong Signals:
  • Clear, structured thinking with explicit phases
  • Asking clarifying questions before jumping to solutions
  • Justifying design decisions with concrete trade-offs
  • Concrete SQL examples, not just hand-waving
  • Understanding when to denormalize and why
  • Awareness of operational concerns (monitoring, backups)
  • Scaling incrementally (not over-engineering initially)
❌ Red Flags:
  • Immediately jumping to complex distributed systems
  • Generic answers (“we’ll use microservices” without specifics)
  • Not considering query patterns when designing schema
  • Ignoring constraints and data integrity
  • Proposing technologies without understanding their trade-offs
  • Unable to write actual SQL queries
  • No discussion of indexes or query optimization

Practice Strategies

  1. Time Yourself: Set a 45-minute timer and work through a full case study
  2. Write Actual SQL: Don't just describe solutions; write the CREATE TABLE statements
  3. Draw Diagrams: Practice drawing ER diagrams and architecture diagrams quickly
  4. Explain Out Loud: Verbalize your thinking process as if explaining to an interviewer
  5. Compare Your Solution: After completing, review the provided solution and note differences
  6. Focus on Trade-offs: For every design choice, practice stating the pros and cons

Common Interview Questions to Prepare

  • “How would you handle hot partition problems in sharding?”
  • “What’s your strategy for zero-downtime migrations?”
  • “How do you ensure data consistency across microservices?”
  • “Explain your approach to managing database schema changes in production.”
  • “How would you optimize this query?” (given EXPLAIN output)
  • “What monitoring metrics would you track for this system?”

Case Study to Interview Mapping

  • E-Commerce: Tests inventory management, transactions, and consistency
  • Analytics Platform: Tests time-series optimization and aggregation strategies
  • Social Feed: Tests fan-out approaches and caching patterns
  • Multi-Tenant SaaS: Tests isolation strategies and connection management
  • Event Sourcing: Tests immutable data patterns and projections
  • Global Application: Tests geo-distribution and data residency
Use these case studies to build a mental library of patterns you can adapt to any interview scenario.

Design Template: Build Your Own Case Study

Use this template to practice system design for any application domain:

1. Requirements Analysis

Functional Requirements:
  • List 5-7 core features the system must support
  • Identify the main entities and their relationships
  • Define critical user workflows (e.g., checkout flow, post creation)
Non-Functional Requirements:
  • Scale: How many users? Requests/second? Data volume?
  • Read:Write Ratio: Typically 90:10, 95:5, or 50:50?
  • Latency: P95/P99 targets (e.g., less than 100ms for reads, less than 500ms for writes)
  • Consistency: Strong, eventual, or per-operation?
  • Availability: Target uptime (e.g., 99.9%, 99.99%)?

2. Data Model Design

Core Entities Template:
-- Entity 1: [Name]
CREATE TABLE [entity_name] (
    id BIGSERIAL PRIMARY KEY,
    -- Add relevant columns with appropriate types
    -- Include timestamps
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Entity 2: [Name]
-- ...

-- Relationships (junction tables if needed)
-- ...
Schema Design Checklist:
  • All tables have explicit primary keys
  • Foreign keys defined with appropriate ON DELETE behavior
  • Unique constraints on natural keys (email, username, etc.)
  • Check constraints for data validation
  • Appropriate indexes for query patterns (list them explicitly)
  • Consider partitioning strategy if time-series or very large tables
  • JSONB columns only where schema flexibility is truly needed

3. Access Patterns and Queries

Identify Top 5 Queries by Frequency:
  1. [Query description] - Expected QPS: ___
    -- Write the SQL query
    -- Note: This query uses index on [column]
    
  2. [Query description] - Expected QPS: ___
  3. [Query description] - Expected QPS: ___
  4. [Query description] - Expected QPS: ___
  5. [Query description] - Expected QPS: ___
Write Patterns:
  • Transactional writes (e.g., orders, payments)
  • Bulk imports (e.g., data ingestion)
  • Background jobs (e.g., aggregation, cleanup)

4. Indexing Strategy

-- List all indexes with justification
CREATE INDEX idx_[table]_[columns] ON [table]([columns]) 
WHERE [condition if partial];
-- Justification: Supports [specific query pattern]

-- Example:
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);
-- Justification: Supports "recent orders by user" query (ORDER BY created_at DESC)

CREATE INDEX idx_products_category_active ON products(category_id) 
WHERE status = 'active';
-- Justification: Partial index for active product listings by category

5. Scaling Plan

Phase 1: Single Server (0-10K users)
  • Single PostgreSQL instance
  • Application handles connection pooling
  • Basic monitoring and backups
Phase 2: Read Scaling (10K-100K users)
  • Add 2-3 read replicas
  • Implement read/write splitting in application
  • Add Redis for session and hot data caching
  • CDN for static assets
Phase 3: Write Scaling (100K-1M users)
  • Vertical scaling of primary (larger instance)
  • Implement connection pooling (PgBouncer)
  • Consider partitioning large tables by date/hash
  • Asynchronous processing for non-critical writes
Phase 4: Distributed Architecture (1M+ users)
  • Sharding strategy: [by user_id, by geographic region, etc.]
  • Cross-shard query handling approach
  • Multi-region deployment with data residency
  • Event-driven architecture for eventual consistency

6. Trade-offs and Considerations

Consistency vs Performance:
  • Where can you use eventual consistency?
  • Which operations require strong consistency?
  • How do you handle consistency across shards?
Normalization vs Denormalization:
  • Which tables/columns will you denormalize? Why?
  • What’s your strategy for keeping denormalized data in sync?
Complexity vs Maintainability:
  • Is sharding premature optimization for your scale?
  • Can simpler solutions (caching, indexing) solve the problem?

7. Operational Considerations

Monitoring:
  • Key metrics to track (latency, throughput, error rate, saturation)
  • Query performance tracking (pg_stat_statements)
  • Replication lag monitoring
  • Connection pool utilization
Disaster Recovery:
  • Backup strategy and RPO (Recovery Point Objective)
  • Restore and RTO (Recovery Time Objective)
  • WAL archiving and PITR (Point-in-Time Recovery)
Schema Evolution:
  • How will you handle schema migrations with zero downtime?
  • Versioning strategy for breaking changes

Self-Evaluation Rubric

Use this rubric to assess your case study solutions and identify areas for improvement:

Requirements Understanding (Score: /10)

Excellent (9-10):
  • Asked clarifying questions about scale, consistency, and latency requirements
  • Identified all implicit requirements (e.g., multi-tenancy, geographic distribution)
  • Made reasonable assumptions where information was missing
  • Documented assumptions explicitly
Good (7-8):
  • Asked some clarifying questions
  • Covered most critical requirements
  • Made some assumptions but didn’t document all
Needs Improvement (score below 7):
  • Jumped directly to solution without clarifying requirements
  • Missed critical non-functional requirements
  • Made unrealistic assumptions

Schema Design (Score: /20)

Excellent (17-20):
  • All entities properly normalized (or intentionally denormalized with justification)
  • Appropriate data types and constraints
  • Foreign keys with correct ON DELETE behavior
  • Indexes match query patterns precisely
  • Considered partitioning where appropriate
  • JSONB used judiciously
Good (13-16):
  • Mostly normalized schema
  • Some indexes missing or unnecessary indexes included
  • Minor data type issues
  • Constraints mostly appropriate
Needs Improvement (under 13):
  • Normalization violations without justification
  • Missing foreign keys or constraints
  • No indexes or all indexes wrong
  • Inappropriate use of JSONB for structured data
  • Missing primary keys

Query Design (Score: /15)

Excellent (13-15):
  • All queries are efficient and use indexes
  • Proper use of JOINs, CTEs, and window functions
  • Transactions used appropriately for consistency
  • Handles edge cases (NULL values, empty results)
Good (10-12):
  • Queries are functional but not optimally efficient
  • Some missed opportunities for index usage
  • Minor correctness issues
Needs Improvement (under 10):
  • Queries don’t use available indexes
  • N+1 query patterns
  • Missing WHERE clauses or overly broad queries
  • Incorrect JOIN types or conditions

Scaling Strategy (Score: /20)

Excellent (17-20):
  • Clear progression from simple to complex
  • Specific solutions for read and write scaling
  • Justified when to introduce each component
  • Considered operational complexity
  • Addressed specific bottlenecks (connection limits, I/O, CPU)
Good (13-16):
  • Reasonable scaling approach
  • Some components mentioned without detail
  • Missing specific thresholds for scaling triggers
Needs Improvement (under 13):
  • Generic “use microservices” without specifics
  • Over-engineered from the start
  • No consideration of bottlenecks
  • Proposed solutions don’t match requirements

Trade-off Analysis (Score: /15)

Excellent (13-15):
  • Every major decision includes explicit trade-offs
  • Quantified trade-offs where possible (e.g., “10% write overhead for 50% read improvement”)
  • Considered alternative approaches
  • Addressed operational implications
Good (10-12):
  • Some trade-offs discussed
  • Mostly qualitative analysis
  • Missing some alternative approaches
Needs Improvement (under 10):
  • No trade-off discussion
  • Claims solutions are “best” without justification
  • Doesn’t consider downsides

Operational Awareness (Score: /10)

Excellent (9-10):
  • Detailed monitoring plan
  • Backup and disaster recovery strategy
  • Schema migration approach
  • Considered operational complexity and team skills
Good (7-8):
  • Mentioned monitoring and backups
  • Missing some operational details
Needs Improvement (score under 7):
  • No discussion of operations
  • Proposes complex system without considering maintenance

Communication Clarity (Score: /10)

Excellent (9-10):
  • Clear, structured presentation
  • Used diagrams effectively
  • Explained rationale for decisions
  • Easy to follow progression
Good (7-8):
  • Mostly clear
  • Some organization issues
  • Could use more diagrams
Needs Improvement (score under 7):
  • Difficult to follow
  • Jumps around randomly
  • No visual aids

Total Score: ___/100

  • 90-100: Excellent - Ready for staff+ interviews
  • 75-89: Good - Solid senior-level understanding
  • 60-74: Fair - Has foundational knowledge but needs more practice
  • Below 60: Needs Improvement - Review fundamentals and practice more

Action Items Based on Your Score

If you scored below 70:
  1. Review the fundamentals in earlier modules (especially Database Design and Performance)
  2. Practice writing actual SQL, not just describing solutions
  3. Focus on one case study at a time and compare your solution to the provided one
If you scored 70-85:
  1. Deep dive into the areas where you scored lowest
  2. Practice explaining trade-offs out loud
  3. Time yourself on case studies to improve pacing
  4. Write more detailed schemas and queries
If you scored above 85:
  1. Practice more complex variations
  2. Focus on communication and presentation
  3. Review edge cases and failure scenarios
  4. Consider mentoring others to solidify your knowledge

Capstone Project: Build a Complete Database-Backed System

Put everything together by building a production-ready system from scratch:

Project Overview

Design, implement, and document a complete database-backed application that demonstrates mastery of all course concepts.
Duration: 2-4 weeks
Outcome: Portfolio piece suitable for interviews
Format: Working application + comprehensive documentation

Project Requirements

Choose ONE of these domains:
  1. Multi-Tenant SaaS Analytics Platform: Track events, generate reports, support multiple customers
  2. Real-Time Collaborative Tool: Like Google Docs or Figma, with conflict resolution
  3. E-Learning Platform: Courses, enrollments, progress tracking, assessments
  4. Inventory Management System: Multi-warehouse, real-time stock, order fulfillment
  5. Social Media Feed Engine: Posts, follows, timeline generation, notifications
  6. OR propose your own (must be approved based on complexity)

Implementation Checklist

Phase 1: Design and Planning (Week 1)

  • Write a requirements document (functional and non-functional)
  • Create detailed ER diagram with cardinality
  • Design complete schema with all constraints
  • Identify top 10 query patterns
  • Plan indexing strategy
  • Document normalization decisions and intentional denormalizations

Phase 2: Core Implementation (Week 1-2)

  • Set up PostgreSQL database (local or cloud)
  • Implement complete schema with migrations
  • Seed database with realistic test data (use faker libraries)
  • Implement all core queries with EXPLAIN ANALYZE results
  • Add appropriate indexes and verify performance
  • Write SQL-level tests for critical queries

Phase 3: Application Layer (Week 2-3)

  • Build simple API or web app that uses the database
  • Implement connection pooling (PgBouncer or application-level)
  • Add basic authentication and authorization
  • Implement at least one complex transaction (e.g., checkout, enrollment)
  • Handle concurrent access scenarios
  • Add error handling and validation

Phase 4: Performance and Scaling (Week 3)

  • Set up pg_stat_statements for query monitoring
  • Profile top queries and optimize
  • Implement caching layer (Redis or similar) for hot data
  • Add read replica (if using cloud provider)
  • Load test with realistic traffic patterns
  • Document performance baselines

Phase 5: Production Readiness (Week 4)

  • Set up automated backups
  • Implement monitoring dashboards (Grafana or similar)
  • Add alerts for critical metrics
  • Create runbook for common operational tasks
  • Implement graceful degradation for failures
  • Document disaster recovery procedures

Deliverables

1. Code Repository (GitHub/GitLab)
  • Schema migrations (versioned)
  • Seed data scripts
  • Application code
  • Test suite
  • README with setup instructions
2. Design Document (Markdown/PDF)
  • Requirements and assumptions
  • ER diagrams and schema design
  • Query patterns and access patterns
  • Indexing strategy with justifications
  • Scaling plan (current + future)
  • Trade-off analysis
3. Performance Report
  • Query performance benchmarks (EXPLAIN ANALYZE outputs)
  • Load testing results (throughput, latency percentiles)
  • Bottleneck analysis
  • Optimization decisions and their impact
4. Operational Playbook
  • Monitoring setup and key metrics
  • Backup and restore procedures
  • Common troubleshooting scenarios
  • Schema migration workflow
  • Disaster recovery plan

Evaluation Criteria

Technical Depth (40%)
  • Schema design quality and normalization
  • Query optimization and index usage
  • Transaction handling and concurrency
  • Performance under load
Production Readiness (30%)
  • Monitoring and observability
  • Error handling and resilience
  • Backup and recovery capability
  • Security considerations
Documentation (20%)
  • Clear architecture and design decisions
  • Trade-off analysis
  • Operational procedures
  • Code comments and README
Completeness (10%)
  • All phases completed
  • Realistic data and scenarios
  • Working end-to-end

Success Examples

Excellent Projects Include:
  • Thoughtful denormalization with documented trade-offs
  • Creative use of PostgreSQL features (JSONB, CTEs, window functions)
  • Real performance optimization with before/after metrics
  • Production-quality error handling and logging
  • Comprehensive monitoring setup
Red Flags to Avoid:
  • Using MongoDB or MySQL when PostgreSQL was required
  • No indexes or all wrong indexes
  • Hand-waving instead of actual implementation
  • No performance testing
  • Generic solutions without domain-specific optimization

Bonus Challenges (Optional)

For extra depth, consider adding:
  • Multi-region deployment with data residency
  • Sharding implementation for a specific table
  • Custom PostgreSQL extension
  • Advanced full-text search with ranking
  • Event sourcing pattern for audit trail
  • GraphQL API with optimized data loader
  • Real-time subscriptions using PostgreSQL LISTEN/NOTIFY

Getting Help

  • Post questions in course Discord
  • Schedule 1:1 code reviews (if available)
  • Share your design doc for feedback before full implementation
  • Present your project to peers for practice
This capstone project is your opportunity to demonstrate comprehensive mastery of database engineering principles in a realistic, portfolio-worthy implementation.

Summary

You’ve now seen how to apply database design principles to real-world systems:

E-Commerce

Inventory, orders, flexible attributes

Analytics

Time-series, pre-aggregation, partitioning

Social Feed

Fan-out, hybrid approach, caching

Multi-Tenant

RLS, isolation, connection management

Event Sourcing

Append-only, projections, replay

Global

Geo-routing, data residency, federation

Course Completion 🎉

Congratulations! You’ve completed the Database Engineering Mastery course.
You’re now prepared for:
  • Senior/Staff engineering interviews
  • PostgreSQL open-source contributions
  • Production database architecture and operations

What’s Next?