Module 16: System Design Case Studies

Apply everything you’ve learned to real-world database design challenges. Each case study covers requirements, schema design, queries, scaling, and trade-offs.

Estimated Time: 10-12 hours
Format: 6 comprehensive case studies
Interview Use: Practice for 45-60 min design rounds
Skill Level: Senior/Staff

Case Study 1: E-Commerce Platform

Requirements

Product catalog with categories and attributes
User accounts with addresses
Shopping cart and checkout
Order management with status tracking
Inventory management with stock levels
Review and rating system

Schema Design

-- Core entities
CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE addresses (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT REFERENCES users(id),
    type VARCHAR(20), -- 'shipping', 'billing'
    street TEXT NOT NULL,
    city VARCHAR(100),
    country VARCHAR(2),
    postal_code VARCHAR(20),
    is_default BOOLEAN DEFAULT false
);

-- Product catalog
CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    parent_id INT REFERENCES categories(id),
    path ltree  -- requires ltree extension
);

CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    sku VARCHAR(50) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    category_id INT REFERENCES categories(id),
    base_price DECIMAL(10,2) NOT NULL,
    attributes JSONB DEFAULT '{}',  -- flexible attributes
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE product_variants (
    id BIGSERIAL PRIMARY KEY,
    product_id BIGINT REFERENCES products(id),
    sku VARCHAR(50) UNIQUE NOT NULL,
    name VARCHAR(100),  -- "Red, Large"
    price_modifier DECIMAL(10,2) DEFAULT 0,
    attributes JSONB NOT NULL  -- {"color": "red", "size": "L"}
);

-- Inventory
CREATE TABLE inventory (
    variant_id BIGINT PRIMARY KEY REFERENCES product_variants(id),
    quantity INT NOT NULL DEFAULT 0,
    reserved INT NOT NULL DEFAULT 0,
    CHECK (quantity >= 0),
    CHECK (reserved >= 0),
    CHECK (reserved <= quantity)
);

-- Orders
CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT REFERENCES users(id),
    status VARCHAR(20) DEFAULT 'pending',
    total DECIMAL(12,2) NOT NULL,
    shipping_address_id BIGINT REFERENCES addresses(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE order_items (
    id BIGSERIAL PRIMARY KEY,
    order_id BIGINT REFERENCES orders(id),
    variant_id BIGINT REFERENCES product_variants(id),
    quantity INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    CHECK (quantity > 0)
);

-- Reviews
CREATE TABLE reviews (
    id BIGSERIAL PRIMARY KEY,
    product_id BIGINT REFERENCES products(id),
    user_id BIGINT REFERENCES users(id),
    rating INT NOT NULL CHECK (rating BETWEEN 1 AND 5),
    title VARCHAR(200),
    body TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(product_id, user_id)
);

Key Queries

-- Product listing with category tree
WITH RECURSIVE cat_tree AS (
    SELECT id, name, parent_id, ARRAY[name] AS path
    FROM categories WHERE id = $category_id
    UNION ALL
    SELECT c.id, c.name, c.parent_id, ct.path || c.name
    FROM categories c
    JOIN cat_tree ct ON c.parent_id = ct.id
)
SELECT p.*, array_agg(pv.id) AS variant_ids
FROM products p
JOIN cat_tree ct ON p.category_id = ct.id
LEFT JOIN product_variants pv ON pv.product_id = p.id
GROUP BY p.id;

-- Checkout with inventory reservation
BEGIN;
SELECT * FROM inventory 
WHERE variant_id = $variant_id 
FOR UPDATE;  -- Lock row

UPDATE inventory 
SET reserved = reserved + $quantity
WHERE variant_id = $variant_id
  AND (quantity - reserved) >= $quantity;

-- Check affected rows, if 0 then out of stock
COMMIT;

-- Product with aggregated reviews
SELECT 
    p.*,
    COUNT(r.id) AS review_count,
    AVG(r.rating) AS avg_rating,
    jsonb_build_object(
        '5', COUNT(*) FILTER (WHERE r.rating = 5),
        '4', COUNT(*) FILTER (WHERE r.rating = 4),
        '3', COUNT(*) FILTER (WHERE r.rating = 3),
        '2', COUNT(*) FILTER (WHERE r.rating = 2),
        '1', COUNT(*) FILTER (WHERE r.rating = 1)
    ) AS rating_distribution
FROM products p
LEFT JOIN reviews r ON r.product_id = p.id
WHERE p.id = $product_id
GROUP BY p.id;

Scaling Strategy

┌─────────────────────────────────────────────────────────────────────────────┐
│                    E-COMMERCE SCALING                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Read Path (90% of traffic):                                               │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                  │
│   │   CDN       │────▶│   Redis     │────▶│   Replica   │                  │
│   │ (static)    │     │ (products)  │     │  (queries)  │                  │
│   └─────────────┘     └─────────────┘     └─────────────┘                  │
│                                                                              │
│   Write Path (10% of traffic):                                              │
│   ┌─────────────┐     ┌─────────────┐                                       │
│   │    App      │────▶│   Primary   │                                       │
│   │  (orders)   │     │  (writes)   │                                       │
│   └─────────────┘     └─────────────┘                                       │
│                                                                              │
│   Indexing:                                                                  │
│   • products(category_id, created_at DESC) - listings                       │
│   • products USING GIN(attributes) - attribute search                       │
│   • inventory(variant_id) - stock checks                                    │
│   • orders(user_id, created_at DESC) - order history                        │
│                                                                              │
│   Partitioning:                                                              │
│   • orders by created_at (monthly) - historical data                        │
│   • reviews by product_id hash - even distribution                          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Case Study 2: Real-Time Analytics

Requirements

Ingest millions of events per day
Sub-second dashboard queries
Time-series aggregations (hourly, daily, weekly)
Retention: raw data 30 days, aggregates forever

Schema Design

-- Time-series events with partitioning
CREATE TABLE events (
    id BIGSERIAL,
    event_time TIMESTAMPTZ NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    user_id BIGINT,
    properties JSONB,
    PRIMARY KEY (event_time, id)
) PARTITION BY RANGE (event_time);

-- Create monthly partitions
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- ... more partitions

-- Pre-aggregated summaries
CREATE TABLE event_stats_hourly (
    hour TIMESTAMPTZ NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    count BIGINT DEFAULT 0,
    unique_users BIGINT DEFAULT 0,
    PRIMARY KEY (hour, event_type)
);

CREATE TABLE event_stats_daily (
    date DATE NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    count BIGINT DEFAULT 0,
    unique_users BIGINT DEFAULT 0,
    properties_agg JSONB,  -- aggregated metrics
    PRIMARY KEY (date, event_type)
);

-- HyperLogLog for unique counts (requires extension)
CREATE EXTENSION IF NOT EXISTS hll;

ALTER TABLE event_stats_hourly 
ADD COLUMN user_hll hll DEFAULT hll_empty();

Ingestion Pattern

-- Batch insert with conflict handling
INSERT INTO events (event_time, event_type, user_id, properties)
VALUES 
    ($1, $2, $3, $4),
    ($5, $6, $7, $8),
    -- ... batch of 1000
ON CONFLICT DO NOTHING;

-- Continuous aggregation (run every hour)
INSERT INTO event_stats_hourly (hour, event_type, count, unique_users, user_hll)
SELECT 
    date_trunc('hour', event_time) AS hour,
    event_type,
    COUNT(*) AS count,
    COUNT(DISTINCT user_id) AS unique_users,
    hll_add_agg(hll_hash_bigint(user_id)) AS user_hll
FROM events
WHERE event_time >= $last_processed_hour
  AND event_time < $current_hour
GROUP BY 1, 2
ON CONFLICT (hour, event_type) DO UPDATE SET
    count = event_stats_hourly.count + EXCLUDED.count,
    unique_users = EXCLUDED.unique_users,
    user_hll = hll_union(event_stats_hourly.user_hll, EXCLUDED.user_hll);

Query Patterns

-- Dashboard query (uses pre-aggregates)
SELECT 
    date_trunc('day', hour) AS date,
    event_type,
    SUM(count) AS total_events,
    hll_cardinality(hll_union_agg(user_hll)) AS unique_users
FROM event_stats_hourly
WHERE hour >= NOW() - INTERVAL '7 days'
GROUP BY 1, 2
ORDER BY 1, 2;

-- Drill-down to raw events
SELECT 
    event_time,
    event_type,
    properties
FROM events
WHERE event_time >= $start AND event_time < $end
  AND event_type = $type
ORDER BY event_time DESC
LIMIT 100;

Data Lifecycle

-- Drop old partitions (automated)
DROP TABLE events_2023_01;  -- Instant, no locking

-- Compress old data with TimescaleDB (optional)
-- Or move to columnar storage (Citus columnar)
SELECT alter_table_set_access_method('events_2023_12', 'columnar');

Requirements

Users follow other users
Home feed shows posts from followed users
Support for 100M users, 10M DAU
Feed must be < 100ms

Schema Design

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE follows (
    follower_id BIGINT NOT NULL REFERENCES users(id),
    following_id BIGINT NOT NULL REFERENCES users(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (follower_id, following_id)
);

CREATE TABLE posts (
    id BIGSERIAL PRIMARY KEY,
    author_id BIGINT NOT NULL REFERENCES users(id),
    content TEXT NOT NULL,
    media_urls TEXT[],
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Fan-out timeline (for < 10K followers)
CREATE TABLE timelines (
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL REFERENCES posts(id),
    author_id BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (user_id, created_at DESC, post_id)
);

-- Celebrity posts (for > 10K followers)
-- Not fanned out, merged at read time
CREATE INDEX idx_posts_author_time ON posts(author_id, created_at DESC);

Fan-Out Strategy

-- On new post from regular user (< 10K followers)
INSERT INTO timelines (user_id, post_id, author_id, created_at)
SELECT 
    follower_id,
    $post_id,
    $author_id,
    $created_at
FROM follows
WHERE following_id = $author_id;

-- Async worker for large fan-outs
-- Use message queue (RabbitMQ, SQS) for batching

Feed Query (Hybrid)

-- Get user's celebrity followings
WITH celebrities AS (
    SELECT following_id 
    FROM follows f
    JOIN user_stats us ON us.user_id = f.following_id
    WHERE f.follower_id = $user_id 
      AND us.follower_count > 10000
),
-- Get pre-materialized timeline
materialized AS (
    SELECT post_id, author_id, created_at
    FROM timelines
    WHERE user_id = $user_id
      AND created_at > NOW() - INTERVAL '7 days'
    ORDER BY created_at DESC
    LIMIT 100
),
-- Get celebrity posts
celebrity_posts AS (
    SELECT id AS post_id, author_id, created_at
    FROM posts
    WHERE author_id IN (SELECT following_id FROM celebrities)
      AND created_at > NOW() - INTERVAL '7 days'
    ORDER BY created_at DESC
    LIMIT 100
)
-- Merge and return
SELECT p.*
FROM posts p
JOIN (
    SELECT * FROM materialized
    UNION ALL
    SELECT * FROM celebrity_posts
) feed ON feed.post_id = p.id
ORDER BY feed.created_at DESC
LIMIT 20;

Case Study 4: Multi-Tenant SaaS

Requirements

Thousands of tenants (companies)
Complete data isolation
Some tenants are very large (1M+ rows)
Compliance requirements (GDPR)

Schema Design (Shared with RLS)

-- Enable RLS
CREATE TABLE tenants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    plan VARCHAR(20) DEFAULT 'free',
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    email VARCHAR(255) NOT NULL,
    role VARCHAR(20) DEFAULT 'member',
    UNIQUE(tenant_id, email)
);

CREATE TABLE projects (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE tasks (
    id BIGSERIAL PRIMARY KEY,
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    project_id BIGINT REFERENCES projects(id),
    title VARCHAR(255) NOT NULL,
    status VARCHAR(20) DEFAULT 'open',
    assignee_id BIGINT REFERENCES users(id)
);

-- Row Level Security
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation_projects ON projects
    USING (tenant_id = current_setting('app.tenant_id')::UUID);

CREATE POLICY tenant_isolation_tasks ON tasks
    USING (tenant_id = current_setting('app.tenant_id')::UUID);

-- Set tenant context (in application)
-- SET app.tenant_id = 'uuid-here';

Connection Management

# Application middleware
def set_tenant_context(tenant_id):
    """Set tenant context for RLS"""
    with get_db_connection() as conn:
        conn.execute(f"SET app.tenant_id = '{tenant_id}'")
        yield conn
        conn.execute("RESET app.tenant_id")

# PgBouncer with tenant routing (advanced)
# Route large tenants to dedicated pools

Tenant Isolation Testing

-- Test: Ensure tenant A can't see tenant B's data
SET app.tenant_id = 'tenant-a-uuid';

SELECT * FROM projects;  -- Should only see tenant A

-- Try to access tenant B (should fail silently - return 0 rows)
SELECT * FROM projects WHERE tenant_id = 'tenant-b-uuid';
-- Returns 0 rows due to RLS

-- Bypass for admin operations
SET ROLE admin_role;  -- Must have BYPASSRLS
SELECT * FROM projects;  -- Sees all tenants

Case Study 5: Event Sourcing System

Requirements

All state changes stored as events (audit trail)
Ability to rebuild state at any point in time
Support for projections (read models)
Event replay for debugging

Schema Design

-- Event store (append-only)
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    stream_id UUID NOT NULL,
    stream_type VARCHAR(100) NOT NULL,  -- 'Order', 'User', etc.
    event_type VARCHAR(100) NOT NULL,   -- 'OrderCreated', 'ItemAdded'
    event_data JSONB NOT NULL,
    metadata JSONB DEFAULT '{}',
    version INT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(stream_id, version)  -- Optimistic concurrency
);

-- Example events
INSERT INTO events (stream_id, stream_type, event_type, event_data, version)
VALUES 
    ('order-123', 'Order', 'OrderCreated', 
     '{"customer_id": 456, "items": []}', 1),
    ('order-123', 'Order', 'ItemAdded', 
     '{"product_id": 789, "quantity": 2}', 2),
    ('order-123', 'Order', 'OrderSubmitted', 
     '{"submitted_at": "2024-01-15T10:30:00Z"}', 3);

-- Projection: Materialized current state
CREATE TABLE order_projections (
    order_id UUID PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    items JSONB NOT NULL DEFAULT '[]',
    status VARCHAR(20) NOT NULL,
    total DECIMAL(12,2),
    last_event_id BIGINT NOT NULL,  -- For replay
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Subscription checkpoints
CREATE TABLE subscription_checkpoints (
    subscription_id VARCHAR(100) PRIMARY KEY,
    last_processed_id BIGINT NOT NULL,
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

Event Replay

-- Rebuild projection from events
CREATE OR REPLACE FUNCTION rebuild_order_projection(p_order_id UUID)
RETURNS void AS $$
DECLARE
    evt RECORD;
    current_state JSONB := '{}';
BEGIN
    -- Delete existing projection
    DELETE FROM order_projections WHERE order_id = p_order_id;
    
    -- Replay all events
    FOR evt IN 
        SELECT * FROM events 
        WHERE stream_id = p_order_id 
        ORDER BY version
    LOOP
        current_state := apply_event(current_state, evt.event_type, evt.event_data);
    END LOOP;
    
    -- Insert rebuilt projection
    INSERT INTO order_projections (order_id, customer_id, items, status, last_event_id)
    VALUES (
        p_order_id,
        (current_state->>'customer_id')::BIGINT,
        current_state->'items',
        current_state->>'status',
        evt.id
    );
END;
$$ LANGUAGE plpgsql;

Subscription Processing

# Python async event processor
async def process_events():
    last_id = get_checkpoint('order_projection')
    
    while True:
        events = await db.fetch("""
            SELECT * FROM events 
            WHERE id > $1 
            ORDER BY id 
            LIMIT 100
        """, last_id)
        
        for event in events:
            await update_projection(event)
            last_id = event['id']
        
        await save_checkpoint('order_projection', last_id)
        await asyncio.sleep(0.1)

Case Study 6: Global Application

Requirements

Users in US, EU, Asia
Data residency (EU data stays in EU)
Low latency reads globally
Strong consistency for writes

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                    GLOBAL DATABASE ARCHITECTURE                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   US-EAST (Primary for US users)                                            │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  ┌─────────────┐     ┌─────────────┐                               │   │
│   │  │   Primary   │────▶│   Replica   │ (local reads)                 │   │
│   │  │  (writes)   │     │             │                               │   │
│   │  └──────┬──────┘     └─────────────┘                               │   │
│   └─────────┼───────────────────────────────────────────────────────────┘   │
│             │ async replication (reference data only)                       │
│   ┌─────────┼───────────────────────────────────────────────────────────┐   │
│   │         ▼           EU-WEST (Primary for EU users)                 │   │
│   │  ┌─────────────┐     ┌─────────────┐                               │   │
│   │  │   Primary   │────▶│   Replica   │                               │   │
│   │  │  (writes)   │     │             │                               │   │
│   │  └─────────────┘     └─────────────┘                               │   │
│   │                                                                     │   │
│   │  EU user data ONLY stored here (GDPR)                              │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   Data Routing:                                                             │
│   • User data: Route to home region based on user.region                   │
│   • Reference data: Replicated globally (products, configs)                │
│   • Cross-region queries: Federation layer or application-level           │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Implementation

-- Users partitioned by region
CREATE TABLE users (
    id BIGSERIAL,
    region VARCHAR(10) NOT NULL,  -- 'us', 'eu', 'asia'
    email VARCHAR(255) NOT NULL,
    data JSONB,
    PRIMARY KEY (region, id)
) PARTITION BY LIST (region);

CREATE TABLE users_us PARTITION OF users FOR VALUES IN ('us');
CREATE TABLE users_eu PARTITION OF users FOR VALUES IN ('eu');
CREATE TABLE users_asia PARTITION OF users FOR VALUES IN ('asia');

-- Reference data (replicated everywhere)
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    -- No region-specific data
);

-- Foreign data wrapper for cross-region queries
CREATE EXTENSION postgres_fdw;

CREATE SERVER eu_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'eu-db.example.com', port '5432', dbname 'app');

CREATE USER MAPPING FOR app_user
SERVER eu_server
OPTIONS (user 'app_user', password 'secret');

CREATE FOREIGN TABLE users_eu_remote (
    id BIGINT,
    region VARCHAR(10),
    email VARCHAR(255)
) SERVER eu_server OPTIONS (table_name 'users_eu');

How to Use These Case Studies in Interviews

System design interviews for database engineering roles typically follow a 45-60 minute format. Here’s how to leverage these case studies for maximum interview success:

Interview Structure and Timing

Phase 1: Requirements Clarification (5-10 minutes)

Ask about scale: How many users? Queries per second? Data volume?
Identify read vs write ratio (often 90:10 or higher)
Clarify consistency requirements: Strong vs eventual consistency?
Understand geographic distribution and latency requirements
Confirm data retention and compliance needs

Phase 2: High-Level Design (10-15 minutes)

Start with a simple single-server design
Identify the main entities and their relationships
Draw a basic ER diagram or schema outline
Explain your normalization choices
Discuss primary access patterns and query shapes

Phase 3: Deep Dive on Specific Components (15-20 minutes)

Schema Design: Walk through 2-3 core tables in detail
- Show primary keys, foreign keys, indexes
- Explain constraint choices
- Justify data types and column sizes
Query Patterns: Write 2-3 critical queries
- Show how indexes support these queries
- Discuss query plan expectations
- Mention potential optimization techniques
Scaling Bottlenecks: Identify the first scaling constraint
- Connection limits? I/O? CPU?
- Propose specific solutions (not just “add caching”)

Phase 4: Scaling Discussion (10-15 minutes)

Progression from single server to scaled architecture
Read scaling: Replicas, caching layers, CDN
Write scaling: Partitioning, sharding strategies
Handle trade-offs explicitly:
- “If we denormalize here, we gain X but lose Y”
- “Caching improves reads but introduces consistency challenges”

Phase 5: Trade-offs and Extensions (5-10 minutes)

Discuss monitoring and observability needs
Mention backup and disaster recovery
Consider operational complexity
Address interviewer’s specific concerns

What Interviewers Look For

✅ Strong Signals:

Clear, structured thinking with explicit phases
Asking clarifying questions before jumping to solutions
Justifying design decisions with concrete trade-offs
Concrete SQL examples, not just hand-waving
Understanding when to denormalize and why
Awareness of operational concerns (monitoring, backups)
Scaling incrementally (not over-engineering initially)

❌ Red Flags:

Immediately jumping to complex distributed systems
Generic answers (“we’ll use microservices” without specifics)
Not considering query patterns when designing schema
Ignoring constraints and data integrity
Proposing technologies without understanding their trade-offs
Unable to write actual SQL queries
No discussion of indexes or query optimization

Practice Strategies

1. Time Yourself: Set a 45-minute timer and work through a full case study 2. Write Actual SQL: Don’t just describe—write CREATE TABLE statements 3. Draw Diagrams: Practice drawing ER diagrams and architecture diagrams quickly 4. Explain Out Loud: Verbalize your thinking process as if explaining to an interviewer 5. Compare Your Solution: After completing, review the provided solution and note differences 6. Focus on Trade-offs: For every design choice, practice stating the pros and cons

Common Interview Questions to Prepare

“How would you handle hot partition problems in sharding?”
“What’s your strategy for zero-downtime migrations?”
“How do you ensure data consistency across microservices?”
“Explain your approach to managing database schema changes in production.”
“How would you optimize this query?” (given EXPLAIN output)
“What monitoring metrics would you track for this system?”

Case Study to Interview Mapping

E-Commerce: Tests inventory management, transactions, and consistency
Analytics Platform: Tests time-series optimization and aggregation strategies
Social Feed: Tests fan-out approaches and caching patterns
Multi-Tenant SaaS: Tests isolation strategies and connection management
Event Sourcing: Tests immutable data patterns and projections
Global Application: Tests geo-distribution and data residency

Use these case studies to build a mental library of patterns you can adapt to any interview scenario.

Design Template: Build Your Own Case Study

Use this template to practice system design for any application domain:

1. Requirements Analysis

Functional Requirements:

List 5-7 core features the system must support
Identify the main entities and their relationships
Define critical user workflows (e.g., checkout flow, post creation)

Non-Functional Requirements:

Scale: How many users? Requests/second? Data volume?
Read:Write Ratio: Typically 90:10, 95:5, or 50:50?
Latency: P95/P99 targets (e.g., less than 100ms for reads, less than 500ms for writes)
Consistency: Strong, eventual, or per-operation?
Availability: Target uptime (e.g., 99.9%, 99.99%)?

2. Data Model Design

Core Entities Template:

-- Entity 1: [Name]
CREATE TABLE [entity_name] (
    id BIGSERIAL PRIMARY KEY,
    -- Add relevant columns with appropriate types
    -- Include timestamps
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Entity 2: [Name]
-- ...

-- Relationships (junction tables if needed)
-- ...

Schema Design Checklist:

All tables have explicit primary keys
Foreign keys defined with appropriate ON DELETE behavior
Unique constraints on natural keys (email, username, etc.)
Check constraints for data validation
Appropriate indexes for query patterns (list them explicitly)
Consider partitioning strategy if time-series or very large tables
JSONB columns only where schema flexibility is truly needed

3. Access Patterns and Queries

Identify Top 5 Queries by Frequency:

[Query description] - Expected QPS: ___

-- Write the SQL query
-- Note: This query uses index on [column]

[Query description] - Expected QPS: ___
[Query description] - Expected QPS: ___
[Query description] - Expected QPS: ___
[Query description] - Expected QPS: ___

Write Patterns:

Transactional writes (e.g., orders, payments)
Bulk imports (e.g., data ingestion)
Background jobs (e.g., aggregation, cleanup)

4. Indexing Strategy

-- List all indexes with justification
CREATE INDEX idx_[table]_[columns] ON [table]([columns]) 
WHERE [condition if partial];
-- Justification: Supports [specific query pattern]

-- Example:
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);
-- Justification: Supports "recent orders by user" query (ORDER BY created_at DESC)

CREATE INDEX idx_products_category_active ON products(category_id) 
WHERE status = 'active';
-- Justification: Partial index for active product listings by category

5. Scaling Plan

Phase 1: Single Server (0-10K users)

Single PostgreSQL instance
Application handles connection pooling
Basic monitoring and backups

Phase 2: Read Scaling (10K-100K users)

Add 2-3 read replicas
Implement read/write splitting in application
Add Redis for session and hot data caching
CDN for static assets

Phase 3: Write Scaling (100K-1M users)

Vertical scaling of primary (larger instance)
Implement connection pooling (PgBouncer)
Consider partitioning large tables by date/hash
Asynchronous processing for non-critical writes

Phase 4: Distributed Architecture (1M+ users)

Sharding strategy: [by user_id, by geographic region, etc.]
Cross-shard query handling approach
Multi-region deployment with data residency
Event-driven architecture for eventual consistency

6. Trade-offs and Considerations

Consistency vs Performance:

Where can you use eventual consistency?
Which operations require strong consistency?
How do you handle consistency across shards?

Normalization vs Denormalization:

Which tables/columns will you denormalize? Why?
What’s your strategy for keeping denormalized data in sync?

Complexity vs Maintainability:

Is sharding premature optimization for your scale?
Can simpler solutions (caching, indexing) solve the problem?

7. Operational Considerations

Monitoring:

Key metrics to track (latency, throughput, error rate, saturation)
Query performance tracking (pg_stat_statements)
Replication lag monitoring
Connection pool utilization

Disaster Recovery:

Backup strategy and RPO (Recovery Point Objective)
Restore and RTO (Recovery Time Objective)
WAL archiving and PITR (Point-in-Time Recovery)

Schema Evolution:

How will you handle schema migrations with zero downtime?
Versioning strategy for breaking changes

Self-Evaluation Rubric

Use this rubric to assess your case study solutions and identify areas for improvement:

Requirements Understanding (Score: /10)

Excellent (9-10):

Asked clarifying questions about scale, consistency, and latency requirements
Identified all implicit requirements (e.g., multi-tenancy, geographic distribution)
Made reasonable assumptions where information was missing
Documented assumptions explicitly

Good (7-8):

Asked some clarifying questions
Covered most critical requirements
Made some assumptions but didn’t document all

Needs Improvement (score below 7):

Jumped directly to solution without clarifying requirements
Missed critical non-functional requirements
Made unrealistic assumptions

Schema Design (Score: /20)

Excellent (17-20):

All entities properly normalized (or intentionally denormalized with justification)
Appropriate data types and constraints
Foreign keys with correct ON DELETE behavior
Indexes match query patterns precisely
Considered partitioning where appropriate
JSONB used judiciously

Good (13-16):

Mostly normalized schema
Some indexes missing or unnecessary indexes included
Minor data type issues
Constraints mostly appropriate

Needs Improvement (under 13):

Normalization violations without justification
Missing foreign keys or constraints
No indexes or all indexes wrong
Inappropriate use of JSONB for structured data
Missing primary keys

Query Design (Score: /15)

Excellent (13-15):

All queries are efficient and use indexes
Proper use of JOINs, CTEs, and window functions
Transactions used appropriately for consistency
Handles edge cases (NULL values, empty results)

Good (10-12):

Queries are functional but not optimally efficient
Some missed opportunities for index usage
Minor correctness issues

Needs Improvement (under 10):

Queries don’t use available indexes
N+1 query patterns
Missing WHERE clauses or overly broad queries
Incorrect JOIN types or conditions

Scaling Strategy (Score: /20)

Excellent (17-20):

Clear progression from simple to complex
Specific solutions for read and write scaling
Justified when to introduce each component
Considered operational complexity
Addressed specific bottlenecks (connection limits, I/O, CPU)

Good (13-16):

Reasonable scaling approach
Some components mentioned without detail
Missing specific thresholds for scaling triggers

Needs Improvement (under 13):

Generic “use microservices” without specifics
Over-engineered from the start
No consideration of bottlenecks
Proposed solutions don’t match requirements

Trade-off Analysis (Score: /15)

Excellent (13-15):

Every major decision includes explicit trade-offs
Quantified trade-offs where possible (e.g., “10% write overhead for 50% read improvement”)
Considered alternative approaches
Addressed operational implications

Good (10-12):

Some trade-offs discussed
Mostly qualitative analysis
Missing some alternative approaches

Needs Improvement (under 10):

No trade-off discussion
Claims solutions are “best” without justification
Doesn’t consider downsides

Operational Awareness (Score: /10)

Excellent (9-10):

Detailed monitoring plan
Backup and disaster recovery strategy
Schema migration approach
Considered operational complexity and team skills

Good (7-8):

Mentioned monitoring and backups
Missing some operational details

Needs Improvement (score under 7):

No discussion of operations
Proposes complex system without considering maintenance

Communication Clarity (Score: /10)

Excellent (9-10):

Clear, structured presentation
Used diagrams effectively
Explained rationale for decisions
Easy to follow progression

Good (7-8):

Mostly clear
Some organization issues
Could use more diagrams

Needs Improvement (score under 7):

Difficult to follow
Jumps around randomly
No visual aids

Total Score: ___/100

90-100: Excellent - Ready for staff+ interviews 75-89: Good - Solid senior-level understanding 60-74: Fair - Has foundational knowledge but needs more practice Below 60: Needs Improvement - Review fundamentals and practice more

Action Items Based on Your Score

If you scored below 70:

Review the fundamentals in earlier modules (especially Database Design and Performance)
Practice writing actual SQL, not just describing solutions
Focus on one case study at a time and compare your solution to the provided one

If you scored 70-85:

Deepdive into areas where you scored lowest
Practice explaining trade-offs out loud
Time yourself on case studies to improve pacing
Write more detailed schemas and queries

If you scored above 85:

Practice more complex variations
Focus on communication and presentation
Review edge cases and failure scenarios
Consider mentoring others to solidify your knowledge

Capstone Project: Build a Complete Database-Backed System

Put everything together by building a production-ready system from scratch:

Project Overview

Design, implement, and document a complete database-backed application that demonstrates mastery of all course concepts. Duration: 2-4 weeks
Outcome: Portfolio piece suitable for interviews
Format: Working application + comprehensive documentation

Project Requirements

Choose ONE of these domains:

Multi-Tenant SaaS Analytics Platform: Track events, generate reports, support multiple customers
Real-Time Collaborative Tool: Like Google Docs or Figma, with conflict resolution
E-Learning Platform: Courses, enrollments, progress tracking, assessments
Inventory Management System: Multi-warehouse, real-time stock, order fulfillment
Social Media Feed Engine: Posts, follows, timeline generation, notifications
OR propose your own (must be approved based on complexity)

Implementation Checklist

Phase 1: Design and Planning (Week 1)

Write a requirements document (functional and non-functional)
Create detailed ER diagram with cardinality
Design complete schema with all constraints
Identify top 10 query patterns
Plan indexing strategy
Document normalization decisions and intentional denormalizations

Phase 2: Core Implementation (Week 1-2)

Set up PostgreSQL database (local or cloud)
Implement complete schema with migrations
Seed database with realistic test data (use faker libraries)
Implement all core queries with EXPLAIN ANALYZE results
Add appropriate indexes and verify performance
Write SQL-level tests for critical queries

Phase 3: Application Layer (Week 2-3)

Build simple API or web app that uses the database
Implement connection pooling (PgBouncer or application-level)
Add basic authentication and authorization
Implement at least one complex transaction (e.g., checkout, enrollment)
Handle concurrent access scenarios
Add error handling and validation

Phase 4: Performance and Scaling (Week 3)

Set up pg_stat_statements for query monitoring
Profile top queries and optimize
Implement caching layer (Redis or similar) for hot data
Add read replica (if using cloud provider)
Load test with realistic traffic patterns
Document performance baselines

Phase 5: Production Readiness (Week 4)

Set up automated backups
Implement monitoring dashboards (Grafana or similar)
Add alerts for critical metrics
Create runbook for common operational tasks
Implement graceful degradation for failures
Document disaster recovery procedures

Deliverables

1. Code Repository (GitHub/GitLab)

Schema migrations (versioned)
Seed data scripts
Application code
Test suite
README with setup instructions

2. Design Document (Markdown/PDF)

Requirements and assumptions
ER diagrams and schema design
Query patterns and access patterns
Indexing strategy with justifications
Scaling plan (current + future)
Trade-off analysis

3. Performance Report

Query performance benchmarks (EXPLAIN ANALYZE outputs)
Load testing results (throughput, latency percentiles)
Bottleneck analysis
Optimization decisions and their impact

4. Operational Playbook

Monitoring setup and key metrics
Backup and restore procedures
Common troubleshooting scenarios
Schema migration workflow
Disaster recovery plan

Evaluation Criteria

Technical Depth (40%)

Schema design quality and normalization
Query optimization and index usage
Transaction handling and concurrency
Performance under load

Production Readiness (30%)

Monitoring and observability
Error handling and resilience
Backup and recovery capability
Security considerations

Documentation (20%)

Clear architecture and design decisions
Trade-off analysis
Operational procedures
Code comments and README

Completeness (10%)

All phases completed
Realistic data and scenarios
Working end-to-end

Success Examples

Excellent Projects Include:

Thoughtful denormalization with documented trade-offs
Creative use of PostgreSQL features (JSONB, CTEs, window functions)
Real performance optimization with before/after metrics
Production-quality error handling and logging
Comprehensive monitoring setup

Red Flags to Avoid:

Using MongoDB or MySQL when PostgreSQL was required
No indexes or all wrong indexes
Hand-waving instead of actual implementation
No performance testing
Generic solutions without domain-specific optimization

Bonus Challenges (Optional)

For extra depth, consider adding:

Multi-region deployment with data residency
Sharding implementation for a specific table
Custom PostgreSQL extension
Advanced full-text search with ranking
Event sourcing pattern for audit trail
GraphQL API with optimized data loader
Real-time subscriptions using PostgreSQL LISTEN/NOTIFY

Getting Help

Post questions in course Discord
Schedule 1:1 code reviews (if available)
Share your design doc for feedback before full implementation
Present your project to peers for practice

This capstone project is your opportunity to demonstrate comprehensive mastery of database engineering principles in a realistic, portfolio-worthy implementation.

Summary

You’ve now seen how to apply database design principles to real-world systems:

E-Commerce

Inventory, orders, flexible attributes

Analytics

Time-series, pre-aggregation, partitioning

Social Feed

Fan-out, hybrid approach, caching

Multi-Tenant

RLS, isolation, connection management

Event Sourcing

Append-only, projections, replay

Global

Geo-routing, data residency, federation

Course Completion 🎉

Congratulations! You’ve completed the Database Engineering Mastery course.

You’re now prepared for:

Senior/Staff engineering interviews
PostgreSQL open-source contributions
Production database architecture and operations

What’s Next?

Join the Community

Connect with fellow database engineers

Start Contributing

Make your first PostgreSQL contribution

Overview

Testing & Code Quality

Crash Courses

AI Engineering

Math for ML - Understanding Linear Algebra

Probability & Statistics for ML

Math for ML - Understanding Calculus

ML Mastery

Deep Learning Mastery

NestJS Mastery

Microservices Mastery

Low Level Design

OOP Concepts

SOLID Principles

Design Patterns

LLD Case Studies

System Design (HLD)

Senior Level (L5+/Staff)

HLD Case Studies

Engineering Fundamentals

DevOps & Operations

Azure Cloud Engineering

AWS Cloud

AWS Monitoring & Observability

AWS Security Services

AWS Serverless

AWS Operations

AWS Advanced

AWS Case Studies

GCP Cloud Engineering

DevOps Tools

Database Engineering

HIPAA Compliance Mastery

Operating Systems

Linux Internals

Distributed Systems

Networking Mastery

Build Your Own X

Go Lang Mastery

C Programming

Classic Research Papers

Distributed System Tools

​Module 16: System Design Case Studies

​Case Study 1: E-Commerce Platform

​Requirements

​Schema Design

​Key Queries

​Scaling Strategy

​Case Study 2: Real-Time Analytics

​Requirements

​Schema Design

​Ingestion Pattern

​Query Patterns

​Data Lifecycle

​Case Study 3: Social Media Feed

​Requirements

​Schema Design

​Fan-Out Strategy

​Feed Query (Hybrid)

​Case Study 4: Multi-Tenant SaaS

​Requirements

​Schema Design (Shared with RLS)

​Connection Management

​Tenant Isolation Testing

​Case Study 5: Event Sourcing System

​Requirements

​Schema Design

​Event Replay

​Subscription Processing

​Case Study 6: Global Application

​Requirements

​Architecture

​Implementation

​How to Use These Case Studies in Interviews

​Interview Structure and Timing

​What Interviewers Look For

​Practice Strategies

​Common Interview Questions to Prepare

​Case Study to Interview Mapping

​Design Template: Build Your Own Case Study

Module 16: System Design Case Studies

Case Study 1: E-Commerce Platform

Requirements

Schema Design

Key Queries

Scaling Strategy

Case Study 2: Real-Time Analytics

Requirements

Schema Design

Ingestion Pattern

Query Patterns

Data Lifecycle

Case Study 3: Social Media Feed

Requirements

Schema Design

Fan-Out Strategy

Feed Query (Hybrid)

Case Study 4: Multi-Tenant SaaS

Requirements

Schema Design (Shared with RLS)

Connection Management

Tenant Isolation Testing

Case Study 5: Event Sourcing System

Requirements

Schema Design

Event Replay

Subscription Processing

Case Study 6: Global Application

Requirements

Architecture

Implementation

How to Use These Case Studies in Interviews

Interview Structure and Timing

What Interviewers Look For

Practice Strategies

Common Interview Questions to Prepare

Case Study to Interview Mapping

Design Template: Build Your Own Case Study

1. Requirements Analysis

2. Data Model Design

3. Access Patterns and Queries

4. Indexing Strategy

5. Scaling Plan

6. Trade-offs and Considerations

7. Operational Considerations

Self-Evaluation Rubric

Requirements Understanding (Score: /10)

Schema Design (Score: /20)

Query Design (Score: /15)

Scaling Strategy (Score: /20)

Trade-off Analysis (Score: /15)

Operational Awareness (Score: /10)

Communication Clarity (Score: /10)

Total Score: ___/100

Action Items Based on Your Score

Capstone Project: Build a Complete Database-Backed System

Project Overview

Project Requirements

Implementation Checklist

Phase 1: Design and Planning (Week 1)

Phase 2: Core Implementation (Week 1-2)

Phase 3: Application Layer (Week 2-3)

Phase 4: Performance and Scaling (Week 3)

Phase 5: Production Readiness (Week 4)

Deliverables

Evaluation Criteria

Success Examples

Bonus Challenges (Optional)

Getting Help

Summary

Course Completion 🎉

What’s Next?