Documentation Index
Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
Use this file to discover all available pages before exploring further.
Module 4: Indexing Deep Dive
Indexing is the single most impactful skill for database performance. A well-placed index can turn a query from minutes to milliseconds. This module gives you deep understanding of how indexes work and when to use them.Estimated Time: 12-15 hours
Hands-On: Index analysis labs with real query plans
Key Skill: Knowing exactly which index to create for any query
Hands-On: Index analysis labs with real query plans
Key Skill: Knowing exactly which index to create for any query
4.1 How Indexes Work: Understanding the Fundamentals
What is an Index? (The Big Picture)
Real-World Analogy: Book Index vs. Table of Contents Think of a database table like a book:- Without an index: To find a topic, you’d read every page from start to finish (full table scan)
- With an index: You look at the index at the back, find the page number, and jump directly there
- The index is a separate structure that stores values in sorted order
- Each value points to where the actual data lives (like page numbers)
- Finding data becomes much faster!
- Names are sorted alphabetically (the index)
- Each name points to a phone number and address (the data)
- To find “Smith, John”, you don’t read every entry - you jump to the “S” section!
- An index is a separate data structure that stores column values in sorted order
- Each value points to the actual row location (like a page number)
- Queries can use the index to find rows quickly instead of scanning everything
Why Do We Need Indexes?
The Performance Problem: Imagine a table with 10 million users:- Database starts at row 1
- Checks: Is email = ‘john@example.com’? No → Move to row 2
- Checks: Is email = ‘john@example.com’? No → Move to row 3
- … continues checking all 10 million rows …
- Finally finds it at row 5,234,567
- Time: 5-10 seconds (must check millions of rows!)
- Database looks at the index (sorted by email)
- Uses binary search to find ‘john@example.com’
- Finds it in ~25 comparisons (log₂(10,000,000) ≈ 24)
- Index says: “The row is at location X”
- Database jumps directly to location X
- Time: 1-5 milliseconds (only ~25 comparisons!)
- Without index: O(n) - Linear time (check every row)
- With index: O(log n) - Logarithmic time (binary search)
- Without index: Up to 10 million checks
- With index: ~25 checks
The Problem: Full Table Scans
Without indexes, every query must scan every row:Understanding B-Tree: The Most Common Index Type
What is a B-Tree? (Before the technical details) A B-Tree is like a multi-level directory structure: Real-World Analogy: Library Organization Think of a library:- Top level (Root): “Books A-H” go left, “Books I-P” go middle, “Books Q-Z” go right
- Middle level (Branches): “Books A-C” go left, “Books D-F” go middle, “Books G-H” go right
- Bottom level (Leaves): Actual books sorted alphabetically
- Root: “D” is in “A-H” section → Go left
- Branch: “D” is in “D-F” section → Go middle
- Leaf: Find “Database Design” → Done!
- Root node: Contains ranges that point to branch nodes
- Branch nodes: Contain ranges that point to leaf nodes
- Leaf nodes: Contain actual values with pointers to table rows
- Balanced: All paths from root to leaves are the same length (guaranteed fast lookups)
- Sorted: Values are in order (enables range queries like “find all prices between 20”)
- Wide: Each node can hold many values (reduces tree height, fewer disk reads)
- Root: 23 < 50 → Go left
- Branch: 23 > 25? No, 23 is in left branch → Go to [13-24] leaf
- Leaf: Find 23 → Get pointer to actual row
- Total: 3 steps (instead of checking all 100 rows!)
B-Tree Index Structure
PostgreSQL’s default index type. Perfect for equality and range queries. Key Properties Explained:-
Balanced: All leaf nodes are at the same depth
- Why? Ensures every lookup takes the same number of steps
- Example: Finding ID 1 takes same steps as finding ID 1,000,000
-
Sorted: Values are stored in order
- Why? Enables range queries (“find all orders between Jan 1 and Jan 31”)
- Example: Leaf nodes are linked, so you can scan ranges efficiently
-
Dense: Every value in the table has an entry in the index
- Why? Can find any value quickly
- Trade-off: Index takes up space (but it’s worth it for speed!)
-
Multi-level: Tree has multiple levels (root → branches → leaves)
- Why? Keeps tree height small (fewer disk reads)
- Example: 10 million rows might only need 3-4 levels
What Indexes Support
4.2 Creating and Managing Indexes
Basic Index Creation
Index Naming Convention
Dropping and Rebuilding
4.3 Multi-Column Indexes
Understanding column order in composite indexes is crucial.The Leftmost Prefix Rule
- ✅
(user_id)- First column - ✅
(user_id, status)- First two columns - ✅
(user_id, status, created_at)- All columns - ❌
(status)- Not leftmost column! - ❌
(status, created_at)- Not leftmost! - ❌
(user_id, created_at)- Skipsstatus
GIN (Generalized Inverted Index)
Best for: Array containment, JSONB, full-text searchGiST (Generalized Search Tree)
Best for: Geometric data, ranges, full-text with phrasesBRIN (Block Range Index)
Best for: Very large tables with naturally ordered data (time series)4.5 Advanced Index Techniques
Partial Indexes
Index only the rows you actually query.Expression Indexes
Index computed values.Covering Indexes (Index-Only Scans)
Include extra columns to avoid table lookups entirely.Unique Indexes for Constraints
4.4 Internal Mechanics: B-Trees vs. LSM-Trees
While B-Trees are the standard for most relational databases, modern systems often choose between B-Trees and LSM-Trees depending on the read/write profile.B-Trees: The Read-Optimized Classic
A B-Tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. B-Tree Leaf Splits (The Write Penalty): When you insert a new key into a full node (typically 8KB or 16KB page), the database must perform a leaf split:- A new page is allocated.
- Half the keys from the full node are moved to the new page.
- The parent node is updated to include a pointer to the new page.
- If the parent is also full, the split propagates up the tree.
LSM-Trees: The Write-Optimized Modernist
Log-Structured Merge-Trees (LSM-Trees) are used in write-heavy systems like Cassandra, Bigtable, and RocksDB (and as an option in some relational extensions).- Memtable: All writes are first buffered in an in-memory sorted structure (the memtable). This is fast because it’s purely in-RAM.
- SSTables: Once the memtable is full, it is flushed to disk as a sorted, immutable file called an SSTable (Sorted String Table).
- Compaction: Background processes merge smaller SSTables into larger ones, removing deleted entries and resolving updates.
- Write Performance: Superior to B-Trees because writes are always sequential flushes.
- Read Performance: Potentially slower than B-Trees because a single read might have to check multiple SSTables (Read Amplification).
| Feature | B-Tree (PostgreSQL/MySQL) | LSM-Tree (Bigtable/Cassandra) |
|---|---|---|
| Write Type | Random / In-place update | Sequential / Append-only |
| Write Efficiency | Lower (Leaf splits/Locking) | Very High |
| Read Efficiency | Very High (Direct path) | Lower (May scan multiple files) |
| Space Efficiency | Lower (Fragmented pages) | Higher (Compressed SSTables) |
4.5 Advanced Indexing Strategies
Partial Indexes for High Cardinality
A partial index is built over a subset of a table defined by a conditional expression. This is exceptionally powerful for optimizing queries that target a specific lifecycle state of a record.Functional and Expression Indexes
You can index the result of a function or expression, rather than just column values. This is essential when the query logic involves transformations.Covering Indexes and the Visibility Map
A Covering Index uses theINCLUDE clause to add payload columns to a B-Tree’s leaf nodes. This enables Index-Only Scans, but there’s a catch: The Visibility Map.
For an index-only scan to be truly fast, the database must verify that the data in the index is “visible” to the current transaction without checking the table heap.
- If the page is marked as “all-visible” in the Visibility Map, the DB skips the heap.
- If the page has recently changed and isn’t marked yet, the DB must still fetch the row from disk, negating the speed benefit.
VACUUM (or aggressive autovacuum) is required to keep the Visibility Map fresh and ensure Index-Only Scans stay fast.
4.6 Reading Execution Plans
Understanding EXPLAIN output is crucial for index optimization.EXPLAIN Basics
Reading the Plan
Key Metrics to Watch
| Metric | Good Sign | Bad Sign |
|---|---|---|
Index Scan | ✅ Using index | |
Seq Scan on large table | ❌ Missing index | |
Buffers: shared hit | ✅ Data in cache | |
Buffers: read | ⚠️ Disk I/O | |
actual rows ≈ estimated rows | ✅ Good statistics | |
actual rows >> estimated rows | ❌ Run ANALYZE | |
Sort Method: quicksort Memory | ✅ Fits in memory | |
Sort Method: external merge Disk | ❌ Needs more work_mem |
Common Scan Types
4.7 Index Maintenance
Monitoring Index Usage
Index Bloat
Indexes accumulate dead entries over time, becoming larger and slower.Keeping Statistics Fresh
4.8 Practice Exercises
Exercise 1: Index Design
Given these queries, design optimal indexes:Solutions
Solutions
Exercise 2: Analyze Slow Query
Analysis and Solution
Analysis and Solution
Next Module
Module 5: Query Optimization
Learn systematic approaches to optimize any slow query
Interview Deep-Dive
You have a composite index on (user_id, created_at, status). A query filters on user_id and status but not created_at. Can this index be used? Explain the mechanics.
You have a composite index on (user_id, created_at, status). A query filters on user_id and status but not created_at. Can this index be used? Explain the mechanics.
Strong Answer:
- The index can be partially used. PostgreSQL will use the B-Tree to navigate to the correct user_id value (the leading column), but then it must scan all entries for that user_id and apply the status filter as a “filter condition” rather than an “index condition.” The created_at column in the middle breaks the ability to skip directly to the status values.
- This is the “leftmost prefix” rule in action. A B-Tree index on (A, B, C) is effectively three indexes: (A), (A, B), and (A, B, C). It is NOT usable as (A, C) with a skip scan — PostgreSQL does not currently implement B-Tree skip scans (though MySQL 8.0 does in limited cases, and it is under discussion for PostgreSQL).
- In EXPLAIN output, you will see the query use an Index Scan with
Index Cond: (user_id = 123)andFilter: (status = 'active'). The “Filter” line means rows are being fetched from the index and then discarded — wasted I/O. If the user has 100,000 entries but only 500 are active, you are reading 100,000 index entries to return 500 rows. - The fix depends on query patterns: if filtering by (user_id, status) is common, create a separate index on (user_id, status) or reorder the composite to (user_id, status, created_at). If both patterns are common, the cost of maintaining two indexes may be justified.
Explain when and why you would choose a BRIN index over a B-Tree index. What are the failure modes of BRIN?
Explain when and why you would choose a BRIN index over a B-Tree index. What are the failure modes of BRIN?
Strong Answer:
- BRIN (Block Range Index) stores the minimum and maximum values for a range of physical table blocks (default 128 blocks per range). Instead of indexing every row, it indexes block ranges. This makes BRIN indexes extremely small — a BRIN index on a 100GB table might be 200KB versus 2GB for a B-Tree.
- BRIN is ideal when the physical order of rows on disk strongly correlates with the indexed column’s logical order. The canonical example is a time-series table where rows are always inserted in chronological order. The BRIN index can say “blocks 0-127 contain timestamps from Jan 1-7, blocks 128-255 contain Jan 8-14” and skip irrelevant blocks entirely.
- The failure mode is data that is not physically ordered. If you do bulk loads out of order, or if UPDATE operations create new tuple versions on random pages (because PostgreSQL’s MVCC appends updated tuples wherever there is free space), the BRIN ranges become very wide. A range that should say “Jan 1-7” instead says “Jan 1 to Dec 31” because an updated row landed in that block range. At that point, BRIN cannot eliminate any blocks and degrades to a sequential scan.
- To maintain BRIN effectiveness: use append-only tables (INSERT only, no UPDATE), CLUSTER the table periodically if updates are unavoidable, and set a smaller pages_per_range for finer granularity at the cost of a slightly larger index.
Your team added 12 indexes to a table to cover various query patterns. Performance has degraded. What went wrong and how do you fix it?
Your team added 12 indexes to a table to cover various query patterns. Performance has degraded. What went wrong and how do you fix it?
Strong Answer:
- Over-indexing has three costs that compound: (1) Write amplification — every INSERT must update all 12 indexes, every UPDATE to an indexed column must update the affected indexes, and every DELETE must mark entries in all indexes for cleanup. If the table has high write throughput, the cumulative index maintenance can easily double or triple the write latency. (2) Vacuum overhead — autovacuum must clean dead tuples from the table AND all 12 indexes. If vacuum cannot keep up, index bloat causes further degradation in a vicious cycle. (3) Planner overhead — the query planner must evaluate all available indexes for every query, increasing planning time. With 12 indexes and complex queries, planning time can exceed execution time.
- The diagnostic process: query pg_stat_user_indexes to find indexes with idx_scan = 0 (never used) or very low scan counts. Drop those first. Then identify redundant indexes — an index on (user_id) is redundant if you also have (user_id, created_at), since the composite index covers single-column lookups. Use the pg_stat_user_tables view to check if n_dead_tup is growing faster than autovacuum can clean it.
- The target: a well-tuned table typically has 3-5 indexes. Each index should justify its existence with measurable query performance improvement that outweighs its write cost. I use the rule: if an index is not used in at least 1% of the queries hitting that table (measured over 30 days), it should be dropped.
UPDATE pg_index SET indisvalid = false WHERE indexrelid = 'idx_name'::regclass. This tells the planner to stop using the index, but the index still exists and is still maintained (so you can re-enable it quickly). Monitor for 1-2 weeks. If no query regressions appear, drop the index with DROP INDEX CONCURRENTLY idx_name. The CONCURRENTLY flag avoids taking an ACCESS EXCLUSIVE lock on the table. If you see regressions, re-enable with UPDATE pg_index SET indisvalid = true — instant rollback.