
Module 8: Scaling Strategies

When your application grows beyond what a single server can handle, you need scaling strategies. This module covers horizontal scaling, sharding, and partitioning.
Estimated Time: 12-14 hours
Hands-On: Implement table partitioning
Key Skill: Choosing the right scaling strategy
A PostgreSQL scaling playbook: from a single node to sharding

8.1 Scaling Fundamentals

Vertical vs Horizontal Scaling

┌─────────────────────────────────────────────────────────────────────────────┐
│                    VERTICAL VS HORIZONTAL SCALING                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  VERTICAL SCALING (Scale Up):                                               │
│  ─────────────────────────────                                               │
│                                                                              │
│    Before:              After:                                               │
│    ┌─────────┐          ┌─────────────────┐                                 │
│    │ 4 cores │   ──▶    │    64 cores     │                                 │
│    │  16GB   │          │     512GB       │                                 │
│    │  1TB    │          │      20TB       │                                 │
│    └─────────┘          └─────────────────┘                                 │
│                                                                              │
│    Pros: Simple, no code changes                                            │
│    Cons: Hardware limits, expensive, single point of failure               │
│                                                                              │
│  HORIZONTAL SCALING (Scale Out):                                            │
│  ────────────────────────────────                                            │
│                                                                              │
│    Before:              After:                                               │
│    ┌─────────┐          ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐    │
│    │ 4 cores │   ──▶    │ 4 cores │ │ 4 cores │ │ 4 cores │ │ 4 cores │    │
│    │  16GB   │          │  16GB   │ │  16GB   │ │  16GB   │ │  16GB   │    │
│    └─────────┘          └─────────┘ └─────────┘ └─────────┘ └─────────┘    │
│                                                                              │
│    Pros: Linear scaling, fault tolerant, cost effective                     │
│    Cons: Complex, requires application changes, CAP theorem                 │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

When to Scale

Signal                    | What It Means               | Action
--------------------------|-----------------------------|---------------------------------
CPU > 80% sustained       | Query processing bottleneck | Read replicas, optimize queries
Memory pressure           | Working set > RAM           | More RAM, or shard data
Disk I/O saturated        | Too many reads/writes       | Faster disks, caching, sharding
Connection limits         | Too many clients            | Connection pooling
Response time degradation | All of the above            | Profile and optimize
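
Most of these signals can be checked from inside PostgreSQL before reaching for OS-level tooling. A minimal sketch against pg_stat_activity (the 60-character query truncation and the 10-row limit are arbitrary choices):

-- How close are we to the connection limit?
SELECT count(*) AS connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;

-- Which statements have been running the longest right now?
SELECT pid,
       now() - query_start AS runtime,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC
LIMIT 10;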

8.2 Table Partitioning

What is Partitioning?

┌─────────────────────────────────────────────────────────────────────────────┐
│                       TABLE PARTITIONING                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Without Partitioning:                                                       │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         orders                                      │    │
│  │  500 million rows - all in one table                               │    │
│  │  Queries and maintenance slow down as indexes bloat               │    │
│  │  Maintenance operations lock everything                            │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│  With Partitioning:                                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                    orders (parent table)                           │    │
│  ├─────────────┬─────────────┬─────────────┬─────────────────────────┤    │
│  │ orders_2023 │ orders_2024 │ orders_2025 │         ...            │    │
│  │   50M rows  │   150M rows │   200M rows │                         │    │
│  └─────────────┴─────────────┴─────────────┴─────────────────────────┘    │
│                                                                              │
│  Benefits:                                                                   │
│  • Query "WHERE order_date >= '2024-01-01'" only scans 2024/2025           │
│  • VACUUM, REINDEX operate on individual partitions                        │
│  • Can drop old partitions instantly (vs DELETE + VACUUM)                  │
│  • Different storage for hot/cold data                                      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Partition Types

Range partitioning. Best for: time-series data, continuous values. (Sketches of list and hash partitioning follow the range example below.)
-- Create partitioned table
CREATE TABLE orders (
    id BIGINT,
    customer_id INT,
    order_date DATE,
    total DECIMAL(10,2),
    status VARCHAR(20)
) PARTITION BY RANGE (order_date);

-- Create partitions
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- PostgreSQL automatically routes inserts
INSERT INTO orders VALUES (1, 100, '2024-06-15', 99.99, 'completed');
-- Goes to orders_2024

-- Query only scans relevant partitions
EXPLAIN SELECT * FROM orders WHERE order_date >= '2024-06-01';
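
The example above covers range partitioning only. For completeness, here is a hedged sketch of the other two native partition types, list and hash, using hypothetical customers and events tables:

-- LIST partitioning: discrete categories (region, tenant, status)
CREATE TABLE customers (
    id BIGINT,
    region TEXT NOT NULL,
    name TEXT
) PARTITION BY LIST (region);

CREATE TABLE customers_us PARTITION OF customers FOR VALUES IN ('us');
CREATE TABLE customers_eu PARTITION OF customers FOR VALUES IN ('eu', 'uk');
CREATE TABLE customers_other PARTITION OF customers DEFAULT;

-- HASH partitioning: spread writes evenly when no natural range/list key exists
CREATE TABLE events (
    id BIGINT,
    user_id BIGINT NOT NULL,
    payload JSONB
) PARTITION BY HASH (user_id);

CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);

Hash partitioning balances writes but gives up pruning on date predicates, so choose the type that matches your dominant query pattern.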

Partition Maintenance

-- Add new partition for next year
CREATE TABLE orders_2026 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');

-- Detach old partition (keeps data, removes from parent)
ALTER TABLE orders DETACH PARTITION orders_2020;

-- Then archive or drop
-- Option 1: Archive to cold storage
\copy orders_2020 TO 'orders_2020.csv' CSV HEADER
DROP TABLE orders_2020;

-- Option 2: Drop immediately
DROP TABLE orders_2020;

-- Automatic partition creation with pg_partman
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
    p_parent_table := 'public.orders',
    p_control := 'order_date',
    p_type := 'native',
    p_interval := 'monthly'
);
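
-- If pg_cron is also installed, maintenance can be scheduled in-database.
-- (Extension availability and exact function arguments vary by pg_partman
-- and pg_cron version; treat this as a sketch.)
SELECT cron.schedule('0 3 * * *', $$SELECT partman.run_maintenance()$$);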

8.3 Database Sharding

Sharding Concepts

┌─────────────────────────────────────────────────────────────────────────────┐
│                         DATABASE SHARDING                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Partitioning: Split data within ONE database                               │
│  Sharding: Split data across MULTIPLE databases                             │
│                                                                              │
│                          ┌───────────────┐                                   │
│                          │   Application │                                   │
│                          └───────┬───────┘                                   │
│                                  │                                           │
│                          ┌───────▼───────┐                                   │
│                          │ Shard Router  │                                   │
│                          └───────┬───────┘                                   │
│                                  │                                           │
│           ┌──────────────────────┼──────────────────────┐                   │
│           │                      │                      │                   │
│           ▼                      ▼                      ▼                   │
│   ┌───────────────┐      ┌───────────────┐      ┌───────────────┐          │
│   │   Shard 0     │      │   Shard 1     │      │   Shard 2     │          │
│   │ users 0-999K  │      │ users 1M-2M   │      │ users 2M-3M   │          │
│   │  Server: A    │      │  Server: B    │      │  Server: C    │          │
│   └───────────────┘      └───────────────┘      └───────────────┘          │
│                                                                              │
│  Each shard is a complete PostgreSQL instance with full functionality      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Sharding Strategies

Hash sharding: the application hashes the shard key to pick a shard.
# Application determines the shard from the key
NUM_SHARDS = 4  # hypothetical fixed shard count

def get_shard(user_id):
    return user_id % NUM_SHARDS

# All user data for user 12345
shard = get_shard(12345)  # e.g., shard 1
conn = get_connection(shard)

Pros: even distribution, simple.
Cons: resharding is expensive (data migration).

Range sharding: contiguous key ranges map to shards.
# User IDs 0-1M on shard 0, 1M-2M on shard 1, etc.
def get_shard(user_id):
    if user_id < 1_000_000:
        return 0
    elif user_id < 2_000_000:
        return 1
    else:
        return 2

Pros: range queries possible, easy to add shards.
Cons: hotspots if ranges are accessed unevenly.

Directory-based sharding: a lookup service maps keys to shards.
# Lookup table maps keys to shards
# shard_directory table:
# | user_id_range | shard_id |
# | 0-500000      | 0        |
# | 500001-1000000| 1        |

def get_shard(user_id):
    return lookup_service.get_shard_for_user(user_id)

Pros: flexible, can rebalance easily.
Cons: lookup adds latency, the directory is a single point of failure.

Cross-Shard Challenges

Challenge    | Problem                     | Solution
-------------|-----------------------------|-------------------------------------------
Joins        | Can’t join across shards    | Denormalize, or application-level joins
Transactions | No cross-shard ACID         | Two-phase commit, or eventual consistency
Aggregations | SUM/COUNT across shards     | Scatter-gather queries
Sequences    | IDs must be unique globally | UUID, or dedicated ID service
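
For the aggregation row above, one scatter-gather approach is postgres_fdw on a coordinator node: each shard computes its partial aggregate and the coordinator combines the partials. A sketch with placeholder hostnames and credentials:

-- One foreign server + foreign table per shard (repeat for shard_1, shard_2, ...)
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard_0 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard-0.db.example.com', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard_0
    OPTIONS (user 'app', password 'secret');
CREATE FOREIGN TABLE orders_shard_0 (
    id BIGINT, customer_id INT, total NUMERIC(12,2)
) SERVER shard_0 OPTIONS (table_name 'orders');

-- Scatter: each shard computes its own SUM (pushed down by postgres_fdw)
-- Gather: the coordinator adds up the per-shard partials
SELECT SUM(shard_total) AS global_total
FROM (
    SELECT SUM(total) AS shard_total FROM orders_shard_0
    UNION ALL
    SELECT SUM(total) AS shard_total FROM orders_shard_1
) AS per_shard;

Dedicated sharding layers such as Citus automate this routing, but the underlying pattern is the same.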

8.4 Caching Strategies

Caching Layers

┌─────────────────────────────────────────────────────────────────────────────┐
│                         CACHING ARCHITECTURE                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│                          ┌─────────────┐                                     │
│                          │    Client   │                                     │
│                          └──────┬──────┘                                     │
│                                 │                                            │
│                          ┌──────▼──────┐                                     │
│  CDN Cache ─────────────▶│     CDN     │◀──── Static assets, API responses  │
│                          └──────┬──────┘                                     │
│                                 │                                            │
│                          ┌──────▼──────┐                                     │
│  App Cache ─────────────▶│    Redis    │◀──── Session, hot data, results    │
│                          └──────┬──────┘                                     │
│                                 │ Miss                                       │
│                          ┌──────▼──────┐                                     │
│  Query Cache ────────────▶│ PostgreSQL │                                     │
│                          │   Buffer    │◀──── Recently accessed pages        │
│                          │    Pool     │                                     │
│                          └──────┬──────┘                                     │
│                                 │                                            │
│                          ┌──────▼──────┐                                     │
│  OS Cache ──────────────▶│  OS Page    │◀──── Filesystem cache               │
│                          │   Cache     │                                     │
│                          └──────┬──────┘                                     │
│                                 │                                            │
│                          ┌──────▼──────┐                                     │
│                          │    Disk     │                                     │
│                          └─────────────┘                                     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
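
Before adding an external cache, check how well the layers PostgreSQL already provides are working. A quick look at the buffer pool via pg_stat_database (the ~99% target is a common rule of thumb, not a hard threshold):

-- Buffer-cache hit ratio per database; sustained values well below ~0.99
-- suggest the working set no longer fits in shared_buffers
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS hit_ratio
FROM pg_stat_database
WHERE datname NOT LIKE 'template%'
ORDER BY blks_read DESC;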

Cache Patterns

Cache-aside (lazy loading): the application manages the cache explicitly.
import json

# Assumes `redis` is a connected Redis client (e.g. redis.Redis()) and
# `db` is your application's database access layer.

def get_user(user_id):
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)

    # 2. Cache miss - fetch from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # 3. Populate cache (expire after 1 hour)
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))

    return user

def update_user(user_id, data):
    # 1. Update database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)

    # 2. Invalidate cache
    redis.delete(f"user:{user_id}")

Cache Invalidation Strategies

-- PostgreSQL LISTEN/NOTIFY for cache invalidation
CREATE OR REPLACE FUNCTION notify_cache_invalidation()
RETURNS TRIGGER AS $$
BEGIN
    PERFORM pg_notify('cache_invalidation', 
        json_build_object(
            'table', TG_TABLE_NAME,
            'id', NEW.id,
            'operation', TG_OP
        )::text
    );
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER user_cache_trigger
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION notify_cache_invalidation();
# Python listener
import json
import select

import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT

# Assumes `redis` is a connected Redis client, as in the cache-aside example above.
conn = psycopg2.connect(...)
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
cur.execute("LISTEN cache_invalidation;")

while True:
    if select.select([conn], [], [], 5) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        data = json.loads(notify.payload)
        redis.delete(f"{data['table']}:{data['id']}")

8.5 Connection Scaling

Connection Pooling with PgBouncer

┌─────────────────────────────────────────────────────────────────────────────┐
│                    MULTI-TIER CONNECTION POOLING                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Tier 1: Application Pool (e.g., HikariCP)                                  │
│  ┌────────────────────────────────────────────────────────────────────┐     │
│  │  App Server 1      App Server 2      App Server N                 │     │
│  │  ┌───────────┐     ┌───────────┐     ┌───────────┐                │     │
│  │  │ 10 conns  │     │ 10 conns  │     │ 10 conns  │                │     │
│  │  └─────┬─────┘     └─────┬─────┘     └─────┬─────┘                │     │
│  └────────┼─────────────────┼───────────────┼────────────────────────┘     │
│           │                 │               │                               │
│           └────────────┬────┴───────────────┘                              │
│                        │                                                    │
│  Tier 2: PgBouncer (1000 client conns → 50 server conns)                   │
│  ┌─────────────────────┼────────────────────────────────────────────┐      │
│  │                ┌────▼────┐                                       │      │
│  │                │PgBouncer│                                       │      │
│  │                │pool=50  │                                       │      │
│  │                └────┬────┘                                       │      │
│  └─────────────────────┼────────────────────────────────────────────┘      │
│                        │                                                    │
│                        ▼                                                    │
│  ┌──────────────────────────────────────────────────────────────────┐      │
│  │                     PostgreSQL                                    │      │
│  │                  (50 backends)                                    │      │
│  └──────────────────────────────────────────────────────────────────┘      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
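
A minimal PgBouncer configuration for the topology above might look like the sketch below; the hostnames, pool sizes, and auth setup are placeholders to adapt:

; pgbouncer.ini (sketch)
[databases]
app = host=db.example.com port=5432 dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; reuse server connections between transactions
max_client_conn = 1000       ; client side: many app connections accepted
default_pool_size = 50       ; server side: few PostgreSQL backends

Transaction pooling gives the best reuse, but session-level features (SET, advisory locks, LISTEN/NOTIFY) assume a stable backend and need session pooling instead.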

Optimizing PostgreSQL for Connections

# postgresql.conf

# Connection limits
max_connections = 200              # Keep low! Use pooler
superuser_reserved_connections = 3

# Memory per connection
work_mem = 64MB                    # Careful: per sort/hash operation
maintenance_work_mem = 256MB

# Connection overhead
# Each connection uses:
# - ~10MB for backend process
# - work_mem per operation
# - Session state

# Rule of thumb:
# max_connections × work_mem < 50% of available RAM
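# Example: 200 connections × 64MB work_mem ≈ 12.8GB of potential sort/hash
# memory, so the settings above assume a server with roughly 26GB+ of RAM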

8.6 Global Distribution

Multi-Region Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                    MULTI-REGION DATABASE                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                          US-EAST                                     │    │
│  │  ┌─────────────┐           ┌─────────────┐                          │    │
│  │  │   Primary   │──sync────▶│   Standby   │                          │    │
│  │  │  (writes)   │           │  (failover) │                          │    │
│  │  └──────┬──────┘           └─────────────┘                          │    │
│  │         │                                                            │    │
│  └─────────┼────────────────────────────────────────────────────────────┘    │
│            │                                                                 │
│      async │ replication                                                    │
│            │                                                                 │
│  ┌─────────┼────────────────────────────────────────────────────────────┐    │
│  │         ▼                    EU-WEST                                 │    │
│  │  ┌─────────────┐                                                     │    │
│  │  │   Replica   │◀─────────────── EU users (reads)                   │    │
│  │  │ (read-only) │                                                     │    │
│  │  └─────────────┘                                                     │    │
│  └──────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────────┐    │
│  │                        ASIA-PACIFIC                                  │    │
│  │  ┌─────────────┐                                                     │    │
│  │  │   Replica   │◀─────────────── APAC users (reads)                 │    │
│  │  │ (read-only) │                                                     │    │
│  │  └─────────────┘                                                     │    │
│  └──────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│  Challenge: Write latency for non-US users (cross-Atlantic round trip)     │
│  Solutions: Accept latency, or use multi-master with conflict resolution   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
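
Because the cross-region replicas are asynchronous, reads served from them can be stale, so monitor lag before routing traffic to a region. A minimal check (the pg_stat_replication lag columns exist in PostgreSQL 10+):

-- On the primary: per-replica replication lag
SELECT application_name, client_addr, state,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

-- On a replica: how stale is the data being served?
SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;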

Geo-Sharding

# Route users to their nearest shard based on region
REGION_SHARDS = {
    'us': 'postgres://shard-us.db.example.com/app',
    'eu': 'postgres://shard-eu.db.example.com/app',
    'asia': 'postgres://shard-asia.db.example.com/app',
}

def get_db_connection(user):
    region = determine_user_region(user)
    return connect(REGION_SHARDS[region])

# User data stays in their region (data residency compliance)
# Cross-region queries require federation layer

8.7 Scaling Decision Framework

Decision Tree

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SCALING DECISION FRAMEWORK                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  1. Is your database actually the bottleneck?                               │
│     ├── No → Fix application code first                                    │
│     └── Yes ↓                                                               │
│                                                                              │
│  2. Have you optimized queries and indexes?                                 │
│     ├── No → Do that first (cheapest fix)                                  │
│     └── Yes ↓                                                               │
│                                                                              │
│  3. Is it a read-heavy workload?                                            │
│     ├── Yes → Add read replicas + caching                                  │
│     └── No (write-heavy) ↓                                                 │
│                                                                              │
│  4. Can you scale vertically?                                               │
│     ├── Yes → Upgrade hardware (simpler)                                   │
│     └── No (maxed out) ↓                                                   │
│                                                                              │
│  5. Can you partition within one database?                                  │
│     ├── Yes → Table partitioning                                           │
│     └── No (too big) ↓                                                     │
│                                                                              │
│  6. Can you shard by a natural key?                                         │
│     ├── Yes → Implement sharding                                           │
│     └── No → Consider different database architecture                      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Scaling Complexity Ladder

Stage | Size       | Strategy                             | Complexity
------|------------|--------------------------------------|-----------
1     | < 10GB     | Single server, good indexes          | ⭐
2     | 10-100GB   | Vertical scaling, query optimization | ⭐⭐
3     | 100GB-1TB  | Read replicas, connection pooling    | ⭐⭐⭐
4     | 1-10TB     | Table partitioning, caching layer    | ⭐⭐⭐⭐
5     | > 10TB     | Sharding, dedicated infrastructure   | ⭐⭐⭐⭐⭐

8.8 Practice: Implement Partitioning

Lab Exercise

Create a time-series orders table with monthly partitions and automated maintenance.
-- 1. Create partitioned orders table
CREATE TABLE orders (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    customer_id INT NOT NULL,
    order_date TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    total DECIMAL(12,2) NOT NULL,
    status VARCHAR(20) DEFAULT 'pending',
    items JSONB,
    PRIMARY KEY (id, order_date)
) PARTITION BY RANGE (order_date);

-- 2. Create monthly partitions for 2024
DO $$
DECLARE
    start_date DATE := '2024-01-01';
    end_date DATE;
    partition_name TEXT;
BEGIN
    FOR i IN 0..11 LOOP
        start_date := DATE '2024-01-01' + (i || ' months')::INTERVAL;
        end_date := start_date + '1 month'::INTERVAL;
        partition_name := 'orders_' || TO_CHAR(start_date, 'YYYY_MM');
        
        EXECUTE format(
            'CREATE TABLE %I PARTITION OF orders 
             FOR VALUES FROM (%L) TO (%L)',
            partition_name, start_date, end_date
        );
    END LOOP;
END $$;

-- 3. Create indexes on partitions (automatically applied to all)
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_status ON orders(status);

-- 4. Insert test data
INSERT INTO orders (customer_id, order_date, total, status, items)
SELECT 
    (random() * 10000)::INT,
    '2024-01-01'::TIMESTAMPTZ + (random() * 365 * INTERVAL '1 day'),
    (random() * 1000)::DECIMAL(12,2),
    (ARRAY['pending', 'processing', 'shipped', 'delivered'])[1 + (random()*3)::INT],
    jsonb_build_object('item_count', (random() * 10)::INT)
FROM generate_series(1, 1000000);

-- 5. Verify partition pruning
EXPLAIN (ANALYZE, COSTS OFF) 
SELECT * FROM orders 
WHERE order_date >= '2024-06-01' AND order_date < '2024-07-01';
-- Should only scan orders_2024_06

-- 6. Maintenance: Drop old partition
ALTER TABLE orders DETACH PARTITION orders_2024_01;
DROP TABLE orders_2024_01;

-- 7. Add new partition for next period
CREATE TABLE orders_2025_01 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

Summary

You’ve learned how to scale PostgreSQL from a single instance to a globally distributed system. Key takeaways:

  • Optimize first: indexes and queries before hardware
  • Partition early: plan for growth from the start
  • Cache strategically: right data, right layer

Course Completion

Congratulations! You’ve completed the Database Engineering course.
You now know how to:
  • Write efficient SQL and design schemas
  • Understand transactions and isolation
  • Optimize queries and indexes
  • Navigate PostgreSQL internals
  • Build highly available systems
  • Scale to millions of users

Next Steps