> ## Documentation Index
> Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Performance Engineering

> Production performance analysis for database engineers at PlanetScale, Supabase, and beyond

# Performance Engineering: The Bottleneck Analysis Guide

Performance engineering is not about "tuning parameters" -- it is a systematic process of identifying the **bottleneck** in a complex, layered system and applying the minimum change required to shift that bottleneck to a more scalable resource.

**Real-world analogy:** Think of a database system like a factory assembly line. The overall throughput is determined by the slowest station. If the painting station can only process 10 units per hour but assembly handles 100, making assembly faster is a total waste of effort. Performance engineering is the discipline of finding the painting station, improving it, and then finding the next bottleneck. The moment you start optimizing without measuring, you are guessing which station to fix -- and guessing has about a 10% hit rate.

<Info>
  **Target Audience**: Staff/Principal Database Engineers\
  **Core Methodology**: The USE Method + Wait Event Profiling\
  **Key Goal**: Move from "guessing" to "evidential analysis"
</Info>

***

## The Performance Engineering Mindset

A principal engineer view of performance follows three laws:

1. **The Law of Amdahl**: The speedup of a system is limited by its serial components.
2. **The Law of Little**: Average number of items in a system = arrival rate × average time spent in system (L = λW).
3. **The Law of Bottlenecks**: Performance is always limited by the **most saturated resource**. Optimizing anything else is a waste of time.

***

## Phase 0: Triage & The Four Golden Signals

Before diving into profiles, you must identify the **symptom**. We use the "Four Golden Signals" to determine if the system is healthy.

| Signal         | Metric                   | Principal Question                                           |
| :------------- | :----------------------- | :----------------------------------------------------------- |
| **Latency**    | P99 Response Time        | Is the database slow, or is it just a few "outlier" queries? |
| **Throughput** | Queries Per Second (QPS) | Are we handling the load, or is the engine stalling?         |
| **Errors**     | Rollback/Conflict Rate   | Are we failing due to deadlocks or serialization issues?     |
| **Saturation** | Resource % (CPU/IO)      | How much "headroom" is left before a total collapse?         |

### Triage Query: The 30-Second Snapshot

```sql theme={null}
-- Identify the "Current Reality" of the engine.
-- This is the single most important query in your performance toolkit.
-- Run it first, every time. It tells you WHERE the database is spending time.
SELECT 
    state,
    wait_event_type,          -- NULL = actively using CPU; non-NULL = waiting
    wait_event,               -- Specific lock/IO/event the backend is stalled on
    count(*) as count,
    round(100.0 * count(*) / sum(count(*)) over(), 2) as pct  -- percentage of active backends
FROM pg_stat_activity
WHERE state != 'idle'         -- Exclude connections doing nothing
GROUP BY 1, 2, 3
ORDER BY 4 DESC;
```

**How to read the results:** If `wait_event_type` is NULL for most sessions, you are **CPU Bound** -- the database is actively computing, not waiting. If it is `IO`, you are **Disk Bound** -- backends are stalled waiting for pages from disk. If it is `LWLock`, you are **Contention Bound** -- multiple backends are fighting over the same internal resource. Each diagnosis leads to a fundamentally different remediation path, which is why this query must always be your first step.

***

## Phase 1: Resource Saturation (The USE Method)

Developed by Brendan Gregg, the **USE Method** should be applied to every resource (CPU, Memory, Disk, Network).

### 1.1 CPU: Computation vs. Context Switching

**Utilization**: `top` or `mpstat`. High utilization (>80%) is expected in high-performance DBs.
**Saturation**: Look at the **Run Queue Length** (load average). If Load > Number of Cores, tasks are waiting for a CPU slot.

**Principal Observation**: High CPU with low throughput often indicates **Spinlock Contention** or **Excessive Context Switching**.

```bash theme={null}
# Check context switches per second
vmstat 1
# 'cs' column > 100k/sec usually indicates a problem (e.g., too many connections)
```

### 1.2 Memory: The Buffer Hit Ratio Myth

**Utilization**: `free -m`.
**Saturation**: **Scanning/Paging**. Look at `si/so` (swap in/out) in `vmstat`.

**Staff Engineer Insight**: A "99% Buffer Cache Hit Ratio" can be a trap. If you have 1,000,000 hits and 10,000 reads, your ratio is 99%. But if those 10,000 reads are random I/Os on a slow disk, your latency will be terrible. **Focus on the absolute number of reads/sec, not the ratio.**

**Real-world analogy:** A 99% cache hit ratio is like saying "99% of our customers were served in under a second." That sounds great until you learn you have 1 million customers per hour -- meaning 10,000 people per hour waited 50ms+ for a disk read. In an OLTP system with a P99 latency SLA, those 10,000 slow operations are your entire problem, and the ratio hides them completely.

### 1.3 Disk: IOPS vs. Throughput

**Utilization**: `%util` in `iostat`.
**Saturation**: `await` (Average Wait Time).

```bash theme={null}
iostat -xy 1
# If await > 5ms on an NVMe drive, the storage is saturated.
# If %util is 100% but rkB/s is low, you have a Random I/O bottleneck.
```

***

## Phase 2: Database Internal Waits

## Phase 2: Database Internal Waits

If resources are not saturated but latency is high, the database is likely waiting on internal synchronization.

### 2.1 The Hierarchy of Waits

| Wait Class | Principal Archetype | Root Cause                                                                                  |
| :--------- | :------------------ | :------------------------------------------------------------------------------------------ |
| **LWLock** | `buffer_content`    | **Shared Buffer Contention**. Too many backends trying to access the same page (Hot Block). |
| **LWLock** | `WALWriteLock`      | **WAL Serializer Bottleneck**. Too many small transactions; slow fsync to WAL disk.         |
| **Lock**   | `relation`          | **Catalog Contention**. An uncommitted DDL (e.g., `ALTER TABLE`) is blocking all DML.       |
| **IO**     | `DataFileRead`      | **Cache Miss**. Query is pulling cold data from disk into the buffer pool.                  |

### 2.2 Advanced: ProcArrayLock Contention

In high-concurrency systems (1000+ TPS), `ProcArrayLock` often becomes the global bottleneck. This lock is used when a transaction needs to compute a "Snapshot" of the system (visibility).

**Staff Engineer Insight**: If you see `ProcArrayLock` as a top wait event, scaling CPU won't help. You must reduce the transaction rate, use a connection pooler (PgBouncer) to reduce backend count, or move to a version of Postgres (14+) with snapshot caching optimizations.

***

## Phase 3: Code-Level Profiling (Flamegraphs)

When you are **CPU Bound** and cannot find a simple query fix, you must look at what the CPU is actually doing inside the Postgres binary.

### 3.1 Capturing the Flamegraph

**Real-world analogy:** A flamegraph is like a time-lapse photo of a kitchen during a dinner rush. Each flame shows where the cooks (CPU) are spending their time. A wide plateau means a lot of time is spent in one activity (e.g., chopping onions). Jagged towers mean the work is fragmented across many small tasks. You optimize by making the widest plateaus narrower.

```bash theme={null}
# Profile all postgres backends for 60 seconds.
# -F 99: Sample at 99 Hz (not 100, to avoid aliasing with periodic tasks).
# -g: Capture full call stacks (essential for useful flamegraphs).
# -p: Target only postgres backend processes.
perf record -F 99 -g -p $(pgrep -d',' postgres) -- sleep 60

# Generate the SVG (requires Brendan Gregg's FlameGraph tools)
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > db_profile.svg
# Open db_profile.svg in a browser. Click on frames to zoom in.
```

### 3.2 Interpreting "Staff-Level" Patterns

| Pattern           | Code Path           | Meaning                                                                                                                                                            |
| :---------------- | :------------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Wide Plateau**  | `hash_search`       | The Hash Table (for Joins or Aggs) is huge and spending time in bucket lookup. Consider `work_mem` tuning.                                                         |
| **Wide Plateau**  | `slot_deform_tuple` | **The Defragmentation Tax**. Postgres spends massive CPU "deforming" rows to find a column at the end of a wide table. Move heavily accessed columns to the front. |
| **Jagged Towers** | `ExecQual`          | Massive expression evaluation. Likely a complex `WHERE` clause with non-indexed functions.                                                                         |
| **Wide Tower**    | `LWLockAcquire`     | **Lock Contention**. The engine is spending more time waiting for metadata than doing work.                                                                        |

***

## Phase 4: Query-Level Optimization (The 80/20 Rule)

You cannot optimize every query. A principal engineer identifies the 20% of queries causing 80% of the load.

### 4.1 The pg\_stat\_statements Strategy

```sql theme={null}
-- Identify queries with the highest I/O impact.
-- This is your "top offenders" list. Fix these first.
SELECT 
    query,
    calls,                        -- How often this query runs
    round(total_exec_time::numeric, 2) as total_ms,  -- Cumulative wall-clock time
    shared_blks_read + shared_blks_written as io_blocks,  -- Total disk I/O in 8KB blocks
    round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as hit_ratio
FROM pg_stat_statements
ORDER BY io_blocks DESC          -- Sort by I/O impact, not just time
LIMIT 10;
```

**Staff Engineer Insight**: Look for queries with high `calls` but low `rows`. This indicates **index-thrashing** or **unnecessary repeat work**. A query called 10,000 times a second that takes 1ms is often more dangerous than a 10s query called once an hour.

**Why sort by `io_blocks` instead of `total_exec_time`?** Time-based sorting conflates CPU-bound and I/O-bound work. A CPU-bound query on a fast machine might show high time but zero disk pressure. An I/O-bound query might show modest time but be saturating your disk, degrading every other query on the system. I/O blocks reveal the queries that are poisoning the shared resource.

### 4.2 The "Temp File" Trap

If `work_mem` is too small, Postgres spills sorts and hashes to disk. This is a silent performance killer -- you will not see an error or warning; the query just silently becomes 10-100x slower because in-memory operations become disk I/O operations.

**Real-world analogy:** Imagine sorting a deck of 1,000 cards on a table that only fits 100 cards. You would have to sort in batches of 100, write each batch to a pile on the floor, then merge the piles. The floor (disk) is orders of magnitude slower than the table (RAM). That is exactly what an "external merge sort" does.

```sql theme={null}
-- Find queries that are spilling to disk.
-- Any row in this result is a query that could be dramatically faster
-- with more work_mem or a query rewrite that reduces intermediate results.
SELECT query, temp_blks_read, temp_blks_written
FROM pg_stat_statements
WHERE temp_blks_read > 0
ORDER BY 2 DESC;
```

**Solution:** Increase `work_mem` for specific users or queries (not globally -- see the memory management pitfall about per-operation limits), or optimize the join to reduce the intermediate result set size so it fits in memory.

***

## Phase 5: Concurrency & Lock Contention

### 5.1 The Blocker Tree

Use this recursive query to find the "Root Blocker" causing a pile-up.

```sql theme={null}
WITH RECURSIVE blockers AS (
  SELECT 
    pid, 
    pg_blocking_pids(pid) AS pids 
  FROM pg_stat_activity 
  WHERE cardinality(pg_blocking_pids(pid)) > 0
  UNION ALL
  SELECT 
    s.pid, 
    pg_blocking_pids(s.pid) 
  FROM pg_stat_activity s 
  JOIN blockers b ON s.pid = ANY(b.pids)
)
SELECT 
  pid, 
  state, 
  wait_event, 
  query 
FROM pg_stat_activity 
WHERE pid IN (SELECT unnest(pids) FROM blockers)
  AND pid NOT IN (SELECT pid FROM blockers);
```

### 5.2 Common Contention Archetypes

1. **The Hot Row**: Millions of updates to a single row (e.g., a counter). *Solution: Use batching or CRDT-style additive counters.*
2. **The Foreign Key Lock**: Updating a child table requires a shared lock on the parent. If many children update simultaneously, the parent becomes a bottleneck.
3. **The Index Build**: Creating an index without `CONCURRENTLY` locks the whole table for writes.

***

## Phase 6: Production Troubleshooting Runbook

| Symptom             | Initial Check      | Probable Root Cause                                                      |
| :------------------ | :----------------- | :----------------------------------------------------------------------- |
| **Spiking Latency** | `pg_stat_activity` | Lock pile-up or a "Plan Flip" (CBO chose a bad plan due to stale stats). |
| **High Load Avg**   | `vmstat` / `top`   | High context switching or Spinlock contention.                           |
| **I/O Saturation**  | `iostat`           | Checkpoint spike or a huge Sequential Scan.                              |
| **Out of Memory**   | `dmesg`            | `work_mem` is set too high for the connection count.                     |

***

## Summary: The Principal's Workflow

1. **Start at the OS**: Is there a resource (CPU/IO) at 100%? (USE Method)
2. **Look at the Waits**: Is the database waiting on itself? (`pg_stat_activity`)
3. **Find the Culprit**: Which queries are causing the most impact? (`pg_stat_statements`)
4. **Profile if Needed**: If CPU is high but queries look fine, use `perf` and Flamegraphs.
5. **Optimize Evidentially**: Change one variable, measure the throughput shift, and repeat.

### 5.2 PgBouncer Configuration

**Why connection pooling matters:** Each PostgreSQL connection is a separate OS process consuming approximately 5-10MB of RAM. A naive setup with 500 application threads each holding a database connection wastes 2.5-5GB of RAM just on connection overhead, and PostgreSQL performs poorly beyond \~100-200 active connections due to lock contention in the ProcArray. PgBouncer solves this by multiplexing thousands of client connections onto a small pool of real database connections.

**Real-world analogy:** PgBouncer is like a restaurant host who manages a waiting list. The restaurant (PostgreSQL) has 50 tables (connections). When 500 customers (application threads) show up, the host does not try to seat them all at once. Instead, each customer gets a table only while they are actively eating (executing a transaction), and gives it up between courses. This is `transaction` pooling mode.

```ini theme={null}
# pgbouncer.ini for high performance

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
# Connection pooling mode
# "transaction" returns connection to pool after each transaction completes.
# "session" holds connection for entire client session (like no pooling).
# "statement" returns after each statement (breaks multi-statement transactions).
pool_mode = transaction  # Best for most workloads; use "session" if you need prepared statements

# Pool sizing -- the most important tuning parameter
default_pool_size = 20      # Connections per user/database pair (start low, increase if needed)
min_pool_size = 5           # Keep some connections warm (avoids cold-start latency)
reserve_pool_size = 5       # Extra for traffic bursts
reserve_pool_timeout = 3    # Seconds to wait before dipping into the reserve pool

# Connection limits
max_client_conn = 1000      # Max client connections to pgbouncer (can be very high)
max_db_connections = 50     # Max connections to actual database (this is the real constraint)

# Timeouts
query_timeout = 30          # Kill query if longer than 30s
client_idle_timeout = 600   # Close idle clients after 10min
server_idle_timeout = 60    # Close idle server connections

# Performance
tcp_keepalive = 1
tcp_keepidle = 60
tcp_keepintvl = 10
tcp_keepcnt = 3

# Logging
log_connections = 0         # Reduce log spam in production
log_disconnections = 0
log_pooler_errors = 1

# Admin
admin_users = pgbouncer_admin
stats_users = pgbouncer_stats
```

### 5.3 Memory Configuration

**Practical tip:** The settings below are starting points, not final answers. Every workload is different. The right approach is to start with these values, monitor with `pg_stat_bgwriter` (for checkpoint tuning), `pg_stat_statements` (for temp file usage), and `pg_stat_activity` (for memory-related wait events), then adjust based on evidence.

```sql theme={null}
-- Memory settings for different workload types

-- OLTP workload (many small queries, many connections)
shared_buffers = '8GB'              -- ~25% of RAM. Above 8-16GB, diminishing returns
                                    -- because the OS page cache handles the rest.
effective_cache_size = '24GB'       -- Tell the planner how much total cache exists
                                    -- (shared_buffers + OS cache). Does NOT allocate memory.
work_mem = '64MB'                   -- Conservative for OLTP. Many connections = many
                                    -- concurrent operations, each consuming work_mem.
maintenance_work_mem = '2GB'        -- For VACUUM, CREATE INDEX. Only a few run at once,
                                    -- so this can be much larger than work_mem.
wal_buffers = '64MB'                -- WAL insertion buffer. Default is too small for
                                    -- write-heavy workloads. 64MB is a safe upper bound.

-- OLAP workload (few large queries, few connections)
shared_buffers = '32GB'             -- Larger because analytics scans more data.
effective_cache_size = '96GB'       
work_mem = '2GB'                    -- Much larger because OLAP queries have complex
                                    -- sorts and hash joins. Acceptable because there
                                    -- are only 5-20 concurrent queries, not 500.
maintenance_work_mem = '4GB'
max_parallel_workers_per_gather = 8 -- Parallelism is critical for large scans/aggregations.

-- Work_mem calculation for hash joins:
/*
 * Memory needed = (inner_rows × tuple_width × hash_overhead)
 * 
 * For 1M rows, 100 bytes/row:
 * Memory = 1,000,000 × 100 × 1.5 ≈ 150MB
 * 
 * Set work_mem = 256MB to fit with margin
 * 
 * Warning: work_mem is PER OPERATION
 * Complex query with 5 sorts/hashes = 5 × work_mem
 * With 100 connections: potential 5 × 256MB × 100 = 128GB!
 */
```

***

## Part 6: Lock Contention Analysis

### 6.1 Lock Monitoring

```sql theme={null}
-- Current locks (who's blocking whom)
SELECT 
    blocked_locks.pid AS blocked_pid,
    blocked_activity.usename AS blocked_user,
    blocking_locks.pid AS blocking_pid,
    blocking_activity.usename AS blocking_user,
    blocked_activity.query AS blocked_statement,
    blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity 
    ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks 
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity 
    ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

-- Lock types and their conflicts
/*
┌─────────────────────────────────────────────────────────────────────────────┐
│                      LOCK CONFLICT MATRIX                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ Requested →   ACCESS  ROW    ROW    SHARE  SHARE  SHARE   EXCL   ACCESS    │
│ Held ↓        SHARE   SHARE  EXCL   UPDATE ROW    UPDATE         EXCL      │
│                                            EXCL   EXCL                       │
│ ─────────────────────────────────────────────────────────────────────────── │
│ ACCESS SHARE    -       -      -      -      -      -       -      X        │
│ ROW SHARE       -       -      -      -      -      -       X      X        │
│ ROW EXCLUSIVE   -       -      -      -      -      X       X      X        │
│ SHARE UPDATE EX -       -      -      -      X      X       X      X        │
│ SHARE           -       -      -      -      X      X       X      X        │
│ SHARE ROW EXCL  -       -      X      X      X      X       X      X        │
│ EXCLUSIVE       -       X      X      X      X      X       X      X        │
│ ACCESS EXCL     X       X      X      X      X      X       X      X        │
│                                                                              │
│ X = Conflict (must wait)                                                    │
│ - = No conflict (can proceed)                                               │
│                                                                              │
│ Common operations and their locks:                                          │
│ • SELECT: ACCESS SHARE                                                      │
│ • INSERT/UPDATE/DELETE: ROW EXCLUSIVE                                       │
│ • CREATE INDEX CONCURRENTLY: SHARE UPDATE EXCLUSIVE                         │
│ • VACUUM FULL, REINDEX: ACCESS EXCLUSIVE                                    │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
*/
```

### 6.2 Deadlock Analysis

```sql theme={null}
-- Enable deadlock logging
-- In postgresql.conf:
-- log_lock_waits = on
-- deadlock_timeout = 1s

-- Check logs for deadlock patterns:
/*
LOG:  process 12345 detected deadlock while waiting for ShareLock on 
      transaction 987654 after 1000.123 ms
DETAIL:  Process holding the lock: 12346. Wait queue: .
CONTEXT:  while updating tuple (0,42) in relation "orders"
STATEMENT:  UPDATE orders SET status = 'shipped' WHERE id = 100

LOG:  process 12345 still waiting for ShareLock on transaction 987654 
      after 1000.456 ms
DETAIL:  Process holding the lock: 12346. Wait queue: 12345.
CONTEXT:  while updating tuple (0,42) in relation "orders"
STATEMENT:  UPDATE orders SET status = 'shipped' WHERE id = 100
*/

-- Analyze deadlock from logs:
/*
 Deadlock scenario:
 
 Transaction A:                  Transaction B:
 UPDATE orders WHERE id=100      UPDATE orders WHERE id=200
 (holds lock on row 100)         (holds lock on row 200)
 ...                             ...
 UPDATE orders WHERE id=200      UPDATE orders WHERE id=100
 (waits for B's lock)            (waits for A's lock)
 
 ↓ DEADLOCK
 
 Solution: Consistent lock ordering
 - Always acquire locks in same order (e.g., by id ASC)
 - Or use SELECT ... FOR UPDATE with explicit ordering
*/
```

***

## Part 7: Production Troubleshooting Runbook

### 7.1 High CPU Usage

```sql theme={null}
-- Step 1: Identify CPU-heavy queries
SELECT 
    pid,
    usename,
    state,
    wait_event_type,
    wait_event,
    left(query, 100)
FROM pg_stat_activity
WHERE state = 'active'
  AND wait_event_type IS NULL  -- Not waiting = actually running
ORDER BY backend_start;

-- Step 2: Check pg_stat_statements
SELECT 
    query,
    calls,
    mean_exec_time,
    total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Step 3: Profile with perf
-- perf record -g -p <pid> -- sleep 30
-- perf report

-- Common causes:
-- • Missing index (ExecSeqScan in profile)
-- • Complex expression evaluation (ExecEvalExpr)
-- • Hash table operations (hash_search)
-- • Sort operations (tuplesort_*)
```

### 7.2 High I/O Wait

```sql theme={null}
-- Step 1: Check wait events
SELECT 
    wait_event_type,
    wait_event,
    count(*)
FROM pg_stat_activity
WHERE state = 'active'
  AND wait_event_type IN ('IO', 'BufferPin')
GROUP BY wait_event_type, wait_event;

-- Step 2: Check buffer cache hit ratio
SELECT 
    sum(heap_blks_read) as heap_read,
    sum(heap_blks_hit) as heap_hit,
    sum(heap_blks_hit) / 
        nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) as ratio
FROM pg_statio_user_tables;
-- Should be > 0.99 for OLTP

-- Step 3: Check checkpoint frequency
SELECT 
    checkpoints_timed,
    checkpoints_req,  -- Requested = forced = BAD
    buffers_checkpoint,
    buffers_backend   -- Backend writes = BAD, should be 0
FROM pg_stat_bgwriter;

-- Step 4: System level
-- iostat -xz 1
-- iotop

-- Common causes:
-- • shared_buffers too small
-- • Checkpoint too aggressive (checkpoint_completion_target low)
-- • seq scans on large tables
-- • Bloated tables need VACUUM
```

### 7.3 Lock Contention

```sql theme={null}
-- Step 1: Check current locks
SELECT 
    locktype,
    relation::regclass,
    mode,
    granted,
    pid,
    left(query, 50) as query
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE NOT granted
ORDER BY relation;

-- Step 2: Find blocking queries
SELECT * FROM pg_blocking_pids(12345);  -- Replace with waiting PID

-- Step 3: Check for long-running transactions
SELECT 
    pid,
    now() - xact_start as xact_duration,
    now() - query_start as query_duration,
    state,
    left(query, 60)
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;

-- Step 4: Kill if necessary
SELECT pg_terminate_backend(12345);  -- Last resort

-- Prevention:
-- • Use statement_timeout and idle_in_transaction_session_timeout
-- • Use lock_timeout
-- • Avoid long transactions in OLTP
```

***

## Part 8: Interview Questions

### Staff/Principal Level

<AccordionGroup>
  <Accordion title="Q: How would you diagnose a sudden 10x latency increase in production?">
    **Systematic Approach**:

    1. **Immediate checks** (\< 1 min)
       * `pg_stat_activity`: What's running, what's waiting?
       * `pg_locks`: Any blocking?
       * System metrics: CPU, I/O, memory via monitoring

    2. **Compare to baseline**
       * pg\_stat\_statements: Which queries slowed down?
       * Was there a deploy? Schema change?
       * Traffic pattern change?

    3. **Drill down**
       * EXPLAIN ANALYZE the slow queries
       * Check for plan changes (pg\_stat\_plans if available)
       * Statistics stale? Run ANALYZE

    4. **System level if needed**
       * iostat: I/O saturation?
       * vmstat: Memory pressure?
       * perf/flamegraph: CPU profile

    5. **Common root causes**
       * Autovacuum not keeping up
       * Statistics out of date → bad plans
       * Connection saturation
       * Lock contention from long transaction
       * Checkpoint storm
  </Accordion>

  <Accordion title="Q: Design a monitoring system for PostgreSQL at scale">
    **Architecture**:

    1. **Metrics Collection**
       * pg\_stat\_\* views polled every 10-60s
       * pg\_stat\_statements for query patterns
       * OS metrics (node\_exporter)
       * Custom metrics via pgbouncer stats

    2. **Storage**
       * Time-series database (Prometheus, InfluxDB, TimescaleDB)
       * Retention: 15 days high-res, 1 year aggregated

    3. **Dashboards**
       * Golden signals: latency, throughput, errors, saturation
       * Per-query latency percentiles
       * Replication lag
       * Vacuum progress
       * Connection pool stats

    4. **Alerting**
       * Latency P99 > threshold
       * Replication lag > 30s
       * Connections > 80% max
       * Disk usage > 80%
       * Long-running transactions > 10 min

    5. **Query Insights**
       * Track query fingerprints over time
       * Detect plan regressions
       * Identify optimization opportunities
  </Accordion>

  <Accordion title="Q: How would you scale PostgreSQL to handle 1M QPS?">
    **Strategy**:

    1. **Read Scaling**
       * Read replicas (streaming replication)
       * Load balancer (HAProxy, pgpool) with read/write splitting
       * Connection pooling (PgBouncer) in front of each node
       * Caching layer (Redis) for hot data

    2. **Write Scaling** (harder)
       * Vertical scaling first (bigger instance)
       * Application-level sharding
       * Citus for distributed PostgreSQL
       * Partitioning for time-series data

    3. **Connection Management**
       * PgBouncer with transaction pooling
       * Aim for \~50 actual connections per node
       * 1000s of clients → PgBouncer → 50 DB connections

    4. **Query Optimization**
       * Prepared statements (reduce parsing)
       * Efficient indexes
       * Minimize lock contention

    5. **Infrastructure**
       * NVMe storage for IOPS
       * Sufficient RAM for working set
       * Network: 10+ Gbps between components

    **Reality check**: 1M QPS usually requires:

    * Many read replicas (10-50+)
    * Aggressive caching
    * Sharding for writes
    * Not all queries hit the database
  </Accordion>
</AccordionGroup>

***

## Production Troubleshooting Scenario 1: High-CPU Query Analysis

### Incident Report

**Symptoms**: Database server showing 95%+ CPU utilization. Application reporting 5x normal query latency.\
**Impact**: User-facing API timing out, customer complaints escalating.\
**Time to Resolve**: Target \< 15 minutes

### Investigation Workflow

**Step 1: Identify CPU-Intensive Queries (2 minutes)**

```sql theme={null}
-- Find queries consuming the most CPU time
SELECT 
    left(query, 100) as query_preview,
    calls,
    total_exec_time::numeric(12,2) as total_exec_ms,
    mean_exec_time::numeric(10,2) as mean_exec_ms,
    (total_exec_time / sum(total_exec_time) OVER ()) * 100 as pct_total_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Check currently running expensive queries
SELECT 
    pid,
    usename,
    application_name,
    now() - query_start as duration,
    state,
    left(query, 100) as query_preview
FROM pg_stat_activity
WHERE state = 'active' 
  AND query NOT LIKE '%pg_stat_activity%'
ORDER BY query_start;
```

**What to Look For**:

* Queries with high `total_exec_time` but reasonable `mean_exec_time` → high call frequency issue
* Queries with high `mean_exec_time` → individual query performance issue
* Long-running queries that started recently → may be holding locks

**Step 2: Analyze the Problematic Query (3 minutes)**

```sql theme={null}
-- Get the full query text if truncated
SELECT query FROM pg_stat_statements WHERE queryid = <identified_queryid>;

-- Check execution plan
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) 
<paste the problematic query here>;
```

**Common High-CPU Patterns**:

1. **Full Table Scans on Large Tables**
   * Look for `Seq Scan` on tables with millions of rows
   * Missing index or statistics out of date

2. **Nested Loop Joins with High Row Estimates**
   * `Nested Loop` on large result sets is CPU-intensive
   * May need hash join or merge join instead

3. **Complex Aggregations Without Indexes**
   * `GROUP BY` on non-indexed columns
   * Window functions scanning entire tables

4. **Regular Expression or String Operations**
   * `LIKE '%pattern%'` (can't use indexes)
   * `regexp_match` on large text fields

**Step 3: Check for Statistics Issues (2 minutes)**

```sql theme={null}
-- Check when tables were last analyzed
SELECT 
    schemaname,
    relname,
    last_analyze,
    last_autoanalyze,
    n_live_tup,
    n_dead_tup
FROM pg_stat_user_tables
WHERE relname IN (
    -- List tables from the problematic query
    'users', 'orders', 'products'
)
ORDER BY last_analyze NULLS FIRST;

-- If statistics are stale, run ANALYZE
ANALYZE users;
ANALYZE orders;
-- Check if plan improves
```

**Step 4: Immediate Mitigation (3 minutes)**

**Option A: Kill the Offending Queries**

```sql theme={null}
-- Identify specific PIDs
SELECT 
    pid,
    usename,
    left(query, 80),
    now() - query_start as duration
FROM pg_stat_activity
WHERE query ILIKE '%<pattern from step 1>%'
  AND state = 'active';

-- Terminate them
SELECT pg_terminate_backend(pid) FROM pg_stat_activity 
WHERE query ILIKE '%<pattern>%' AND state = 'active';
```

**Option B: Add Missing Index**

```sql theme={null}
-- If Seq Scan is the issue and WHERE clause is clear
CREATE INDEX CONCURRENTLY idx_users_created_at ON users(created_at);
-- CONCURRENTLY prevents table locking
```

**Option C: Query Rewrite (if possible)**

```sql theme={null}
-- Example: Replace full table scan with indexed lookup
-- Bad:
SELECT * FROM orders WHERE status IN ('pending', 'processing');

-- Better:
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE status = 'processing';
-- If there's an index on status, this uses it twice instead of seq scan
```

**Step 5: Monitor and Verify (5 minutes)**

```sql theme={null}
-- Watch CPU and query stats
SELECT 
    sum(total_exec_time) as total_time,
    sum(calls) as total_calls
FROM pg_stat_statements;

-- Check every 30 seconds, verify total_time growth rate decreased
-- Monitor system CPU with `top` or monitoring dashboard
```

### Post-Incident Analysis

**Root Cause Categories**:

1. **Missing Index**: Application added new query pattern without proper indexing
2. **Statistics Drift**: Large data changes made planner choose wrong plan
3. **Query Regression**: Application deployed with inefficient query
4. **Load Spike**: Sudden traffic increase exposed existing slow query

**Preventive Measures**:

* Enable `auto_explain` with `log_min_duration` to catch slow queries before they become incidents
* Set up alerts on CPU >80% sustained for >2 minutes
* Run `ANALYZE` after bulk data imports
* Review new queries in development with `EXPLAIN ANALYZE`
* Implement query review process before production deployment

***

## Production Troubleshooting Scenario 2: I/O-Bound Workload with Low Cache Hit Ratio

### Incident Report

**Symptoms**: Response times degraded 10x. Disk I/O is saturated at 100%. Cache hit ratio dropped from 99% to 85%.\
**Impact**: All database-dependent services experiencing timeouts.\
**Time to Resolve**: Target \< 20 minutes

### Investigation Workflow

**Step 1: Confirm I/O Saturation (1 minute)**

```bash theme={null}
# Check disk I/O utilization (Linux)
iostat -x 1 5

# Look for:
# - %util near 100% (disk saturated)
# - await > 10ms (high latency)
# - r/s and w/s (read/write operations per second)

# PostgreSQL wait events
```

```sql theme={null}
SELECT 
    wait_event_type,
    wait_event,
    count(*) as waiting_backends
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
GROUP BY wait_event_type, wait_event
ORDER BY count(*) DESC;

-- Look for: DataFileRead, DataFileWrite, WALWrite
```

**Step 2: Identify Cache-Unfriendly Queries (3 minutes)**

```sql theme={null}
-- Queries with worst buffer hit ratios
SELECT 
    left(query, 100) as query_preview,
    calls,
    shared_blks_read as disk_reads,
    shared_blks_hit as cache_hits,
    shared_blks_read + shared_blks_hit as total_blks,
    round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_pct
FROM pg_stat_statements
WHERE shared_blks_read > 1000  -- Significant disk activity
ORDER BY shared_blks_read DESC
LIMIT 20;

-- Identify specific tables being read from disk
SELECT 
    schemaname,
    relname as table_name,
    heap_blks_read as disk_reads,
    heap_blks_hit as cache_hits,
    heap_blks_read + heap_blks_hit as total_accesses,
    round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) as hit_ratio
FROM pg_statio_user_tables
WHERE heap_blks_read > 0
ORDER BY heap_blks_read DESC
LIMIT 20;
```

**Step 3: Analyze Query Plans for I/O Issues (4 minutes)**

```sql theme={null}
-- Get explain plan with buffer statistics
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
<paste problematic query>;

-- Look for:
-- • "Buffers: shared read=XXXXX" - high numbers mean disk reads
-- • Seq Scans on large tables
-- • Index scans that read too many pages
-- • Sorts or aggregations spilling to disk
```

**Common I/O Problem Patterns**:

1. **Large Sequential Scans**
   * Reading entire large tables from disk
   * Solution: Add appropriate indexes or partition data

2. **Index Scans with Poor Selectivity**
   * Index scan fetching >10% of table (seq scan might be better)
   * Solution: Consider covering index or change query

3. **Work Memory Exhaustion**
   * Sorts/hashes spilling to temp files
   * Look for "external merge disk" in plan
   * Solution: Increase `work_mem`

4. **Table Bloat**
   * Dead tuples force reading more pages
   * Solution: Run VACUUM or VACUUM FULL

**Step 4: Check for Bloat and Fragmentation (3 minutes)**

```sql theme={null}
-- Identify bloated tables
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as total_size,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) as dead_tup_pct,
    last_vacuum,
    last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

-- Check if autovacuum is running
SELECT * FROM pg_stat_progress_vacuum;
```

**Step 5: Immediate Mitigation (5 minutes)**

**Option A: Reduce Query Load**

```sql theme={null}
-- Kill or throttle the I/O-heavy queries
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query ILIKE '%<identified pattern>%'
  AND state = 'active'
  AND backend_type = 'client backend';

-- Or set a statement timeout for that session
ALTER DATABASE mydb SET statement_timeout = '5s';
```

**Option B: Emergency VACUUM**

```sql theme={null}
-- If bloat is the issue
VACUUM (ANALYZE, VERBOSE) problematic_table;

-- Monitor progress
SELECT * FROM pg_stat_progress_vacuum;
```

**Option C: Add Targeted Index**

```sql theme={null}
-- If missing index is clear from EXPLAIN
CREATE INDEX CONCURRENTLY idx_orders_user_created 
ON orders(user_id, created_at) 
INCLUDE (status, total);  -- Covering index to avoid heap lookups
```

**Option D: Temporarily Increase Shared Buffers**

```sql theme={null}
-- If you have available RAM and can restart
-- postgresql.conf:
-- shared_buffers = 8GB  (increase from current value)
-- effective_cache_size = 24GB  (hint for planner)

-- Or for immediate session-level help:
SET work_mem = '256MB';  -- For current session only
```

**Step 6: Monitor Recovery (4 minutes)**

```sql theme={null}
-- Watch cache hit ratio improve
SELECT 
    sum(heap_blks_hit) as cache_hits,
    sum(heap_blks_read) as disk_reads,
    round(100.0 * sum(heap_blks_hit) / 
          nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) as hit_ratio
FROM pg_statio_user_tables;

-- Check every 30 seconds, target >99%

-- Monitor I/O wait events
SELECT 
    wait_event_type,
    count(*) as waiting
FROM pg_stat_activity
WHERE wait_event_type = 'IO'
GROUP BY wait_event_type;
-- Should decrease significantly
```

### Post-Incident Analysis

**Root Cause Categories**:

1. **Working Set Exceeds Cache**: Data accessed frequently doesn't fit in shared\_buffers
2. **Sudden Query Change**: New query pattern reads cold data from disk
3. **Bloat Accumulation**: VACUUM not keeping up, forcing reads of many dead-tuple pages
4. **Cache Eviction Storm**: Massive seq scan pushed frequently-used data out of cache

**Preventive Measures**:

* **Right-size shared\_buffers**: Aim for 25-40% of system RAM
* **Monitor bloat proactively**: Set up alerts when dead\_tup\_pct >20%
* **Tune autovacuum aggressively**: Lower `autovacuum_vacuum_scale_factor`
* **Implement query review**: Catch full table scans in development
* **Set up I/O alerts**: Alert when disk await >10ms sustained or cache hit ratio under 95%
* **Consider read replicas**: Offload analytical queries to dedicated instances

***

## Module Mastery Checklist

Complete this checklist to confirm you've mastered Performance Engineering:

### Performance Investigation Framework

* [ ] Apply the 6-layer performance stack to diagnose issues systematically
* [ ] Implement the four golden signals for database monitoring
* [ ] Conduct root cause analysis using appropriate metrics at each layer
* [ ] Triage production incidents within the first 5 minutes
* [ ] Communicate performance findings to technical and non-technical stakeholders

### Statistics and Monitoring

* [ ] Configure and query pg\_stat\_statements for query performance analysis
* [ ] Interpret wait event statistics to identify resource bottlenecks
* [ ] Use pg\_stat\_activity to monitor active connections and queries
* [ ] Analyze pg\_stat\_user\_tables for table-level performance metrics
* [ ] Build custom monitoring queries for production dashboards

### CPU Profiling and System Analysis

* [ ] Use perf and flamegraphs to profile PostgreSQL processes
* [ ] Identify hot code paths and CPU-intensive operations
* [ ] Interpret flamegraph patterns for different workload types
* [ ] Analyze system-level CPU metrics and context switches
* [ ] Apply profiling insights to optimize queries

### I/O Performance

* [ ] Monitor and optimize buffer cache hit ratios
* [ ] Identify queries causing excessive physical I/O
* [ ] Diagnose disk saturation using iostat and wait events
* [ ] Configure shared\_buffers and effective\_cache\_size appropriately
* [ ] Implement strategies to reduce I/O amplification

### Query Optimization

* [ ] Read and interpret EXPLAIN and EXPLAIN ANALYZE output
* [ ] Identify suboptimal scan types and join algorithms
* [ ] Recognize when statistics are stale and run ANALYZE
* [ ] Force better query plans when necessary
* [ ] Optimize queries to use appropriate indexes

### Concurrency and Locking

* [ ] Diagnose lock contention using pg\_locks and wait events
* [ ] Identify and resolve deadlocks
* [ ] Understand lock mode compatibility
* [ ] Optimize transaction scope to minimize lock duration
* [ ] Debug serialization failures in high-concurrency scenarios

### Production Troubleshooting

* [ ] Diagnose and resolve high-CPU query workloads
* [ ] Fix I/O-bound performance degradation
* [ ] Respond to sudden query slowdowns systematically
* [ ] Debug connection pool exhaustion
* [ ] Identify and remediate bloat-related performance issues

### Configuration Tuning

* [ ] Tune shared\_buffers based on workload and available RAM
* [ ] Configure work\_mem to prevent disk spills
* [ ] Optimize checkpoint settings for write performance
* [ ] Tune autovacuum for production workloads
* [ ] Apply connection pooling strategies effectively

### Practical Application

* [ ] Complete production troubleshooting scenarios
* [ ] Build a monitoring dashboard for PostgreSQL
* [ ] Conduct performance reviews of new queries before deployment
* [ ] Implement proactive alerting for performance metrics
* [ ] Document performance baselines and SLOs for production systems

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Storage Engine Deep Dive" icon="hard-drive" href="/courses/database-engineering/storage-deep-dive">
    MVCC internals, vacuum, WAL, and recovery
  </Card>

  <Card title="Distributed Systems" icon="network-wired" href="/courses/database-engineering/distributed-systems">
    Replication, consensus, and sharding
  </Card>
</CardGroup>

***

## Interview Deep-Dive

<AccordionGroup>
  <Accordion title="Production is slow. You have 5 minutes before the CEO's all-hands meeting. Walk me through your triage process.">
    **Strong Answer:**

    * **First 30 seconds**: Run the 30-second snapshot query: `SELECT state, wait_event_type, wait_event, count(*) FROM pg_stat_activity WHERE state != 'idle' GROUP BY 1, 2, 3 ORDER BY count(*) DESC`. This tells you where backends are spending time. If most are `active` with NULL wait\_event, the database is CPU-bound on query execution. If most show `Lock` wait events, you have contention. If `IO` wait events dominate, storage is the bottleneck.
    * **Next 60 seconds**: Check pg\_stat\_statements for the top offenders: `SELECT query, calls, mean_exec_time, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5`. This identifies whether one specific query is causing the problem or the load is distributed.
    * **Next 60 seconds**: If a specific query is the culprit, run `EXPLAIN (ANALYZE, BUFFERS)` on it. If lock contention is the issue, check `SELECT * FROM pg_locks WHERE NOT granted` to find blocked queries, then find what is holding the lock.
    * **Action (remaining 2 minutes)**: For a runaway query: `SELECT pg_terminate_backend(pid)` to kill it. For lock contention: identify and terminate the blocking transaction. For systemic slowness: check if autovacuum is running aggressively (anti-wraparound vacuum cannot be interrupted -- you may need to wait), or if a large COPY/DELETE is consuming I/O bandwidth.
    * The key principle: measure before acting. The difference between a junior and a senior responding to an incident is that the senior runs three diagnostic queries before touching anything, while the junior starts restarting services.

    **Follow-up: The snapshot shows 200 backends all waiting on LWLock:BufferMapping. What does this mean?**

    BufferMapping lock contention means too many backends are simultaneously trying to look up or insert pages in the buffer pool's hash table. This typically happens during high-concurrency sequential scans on large tables where many backends are fetching the same pages. Solutions: reduce the number of concurrent backends hitting the same table (queue at the application level), add indexes to convert sequential scans into index scans (fewer page lookups), or if the workload is legitimate, increase the number of buffer partitions (requires PostgreSQL recompilation -- not a quick fix).
  </Accordion>

  <Accordion title="Explain Amdahl's Law and Little's Law in the context of database performance. Give a concrete example for each.">
    **Strong Answer:**

    * **Amdahl's Law**: The maximum speedup of a system is limited by the fraction that cannot be parallelized. If 80% of a query's time is spent in a parallel hash join and 20% in a serial sort, making the hash join infinitely fast only gives you a 5x speedup (1 / 0.2). Concrete example: you enable parallel query with `max_parallel_workers_per_gather = 8`. The parallel sequential scan portion scales nearly linearly, but the final aggregation is serial. If aggregation is 30% of the total time, your 8 workers can only achieve \~3.3x speedup at best. This is why throwing more parallelism at a query with a large serial component shows diminishing returns.
    * **Little's Law**: L = lambda \* W. Average number of items in a system = arrival rate \* average time in system. Concrete example: your database receives 1000 queries/sec (lambda) and each query takes an average of 50ms (W = 0.05s). L = 1000 \* 0.05 = 50. You need 50 concurrent connections to sustain this throughput. If max\_connections is 30, you will see queuing. If a slow query increases average latency from 50ms to 500ms, you suddenly need 500 concurrent connections -- your connection pool explodes. This is why a single slow query can cascade into a full system outage.
    * These laws inform capacity planning. Before the CEO asks "can we handle 10x traffic?", I use Little's Law to calculate the required concurrency, Amdahl's Law to predict the benefit of adding parallel workers, and the bottleneck analysis to identify which resource (CPU, I/O, connections) will saturate first.

    **Follow-up: How do you apply the USE method (Utilization, Saturation, Errors) to PostgreSQL specifically?**

    For each resource: CPU -- utilization from OS metrics, saturation from `pg_stat_activity` showing backends in `active` state with NULL wait\_event (CPU-bound), errors from query timeouts. Disk I/O -- utilization from iostat, saturation from `pg_stat_activity` wait\_event\_type = 'IO', errors from `LOG: could not write to file`. Connections -- utilization from `SELECT count(*) FROM pg_stat_activity` vs max\_connections, saturation from application-side connection pool queue depth, errors from `FATAL: too many connections`. Locks -- utilization not directly measurable, saturation from `pg_stat_activity` wait\_event\_type = 'Lock', errors from deadlock\_detected counts.
  </Accordion>

  <Accordion title="How do you establish performance baselines and SLOs for a PostgreSQL database, and what happens when you violate them?">
    **Strong Answer:**

    * **Baseline establishment**: Run the production workload for 2 weeks and capture: P50/P95/P99 query latency (from pg\_stat\_statements), QPS (queries per second), buffer cache hit ratio, replication lag, connection count distribution, and disk I/O utilization. These form the baseline. Capture during peak hours and off-peak separately.
    * **SLO definition**: SLOs should be based on user-facing impact, not raw metrics. Example: "P99 query latency for the checkout flow will not exceed 100ms" (derived from the 200ms total API latency budget with 100ms allocated to database). "Replication lag will not exceed 5 seconds" (derived from the acceptable staleness for read-replica-served pages). "Buffer cache hit ratio will remain above 99.5%" (below this, the working set does not fit in memory and disk I/O degrades latency).
    * **Violation response**: Tier the response based on severity. P99 > 100ms for 5 minutes: page the on-call DBA, investigate via pg\_stat\_statements for new slow queries. P99 > 500ms for 1 minute: auto-scale read replicas, enable query circuit breakers in the application. Cache hit ratio \< 95%: emergency investigation for unexpected table growth or working set change. Replication lag > 60s: consider promoting a fresher replica.
    * The meta-principle: SLOs are contracts between the database team and the application team. They must be derived from user experience requirements, not from arbitrary database metrics. A 50% CPU utilization SLO is meaningless if the application is happy at 80% CPU. A 10ms P99 latency SLO is unnecessarily expensive if the application adds 200ms of its own processing.

    **Follow-up: How do you handle SLO erosion -- where latency gradually increases by 5% per month?**

    This is the most insidious performance problem because no single change triggers an alert. Automated regression detection: track pg\_stat\_statements mean\_exec\_time per query fingerprint over time. If the 30-day moving average increases by more than 20%, flag it for investigation. Common causes: data growth (table/index size increases, scans take longer), statistics drift (planner makes increasingly poor choices), index bloat (gradual degradation), or cumulative schema changes that add overhead. The fix is usually ANALYZE + REINDEX + reviewing recently added columns or constraints.
  </Accordion>
</AccordionGroup>