Module 15: Senior Interview Mastery
This module prepares you for database-focused questions in senior, staff, and principal engineering interviews. You’ll learn to explain complex concepts, discuss trade-offs, and demonstrate deep expertise.Estimated Time: 12-14 hours
Interview Level: Senior/Staff/Principal
Format: Deep dives + System design + Behavioral
Goal: Demonstrate mastery, not just knowledge
Interview Level: Senior/Staff/Principal
Format: Deep dives + System design + Behavioral
Goal: Demonstrate mastery, not just knowledge
15.1 Interview Types for Databases
15.2 Deep Dive Questions
MVCC Explanation
Q: Explain MVCC and why PostgreSQL uses it
Q: Explain MVCC and why PostgreSQL uses it
Strong Answer Structure:
- What it is: Multi-Version Concurrency Control - readers don’t block writers and vice versa by keeping multiple versions of data.
-
How it works in PostgreSQL:
- Each row has hidden columns: xmin (creating transaction), xmax (deleting transaction)
- New version created on UPDATE (not in-place modification)
- Visibility rules based on transaction’s snapshot
- Dead tuples cleaned up by VACUUM
-
Trade-offs:
- ✅ Better concurrency than locking
- ✅ Consistent snapshots for long queries
- ❌ Storage overhead (dead tuples)
- ❌ Needs VACUUM maintenance
- Comparison: Oracle uses undo logs (rollback segments), PostgreSQL uses in-place versioning.
Q: Walk me through what happens when you run a SELECT query
Q: Walk me through what happens when you run a SELECT query
Strong Answer (follow the pipeline):
- Parse: SQL text → tokens → parse tree
- Validates syntax
- Identifies keywords, identifiers, literals
- Analyze: Parse tree → Query tree
- Resolves table/column names to OIDs
- Type checking and coercion
- Function resolution
- Rewrite: Query tree → Rewritten tree(s)
- View expansion
- Rule processing
- Plan: Query tree → Plan tree
- Generate paths (seq scan, index scans, joins)
- Estimate costs using statistics
- Choose cheapest path
- Execute: Plan tree → Results
- Initialize plan nodes (iterator model)
- Fetch tuples through pull-based execution
- Apply filters, projections
Q: Explain B-Tree internals and how index lookups work
Q: Explain B-Tree internals and how index lookups work
Strong Answer:
-
Structure:
- Balanced tree: all leaf nodes at same depth
- Internal nodes contain keys and child pointers
- Leaf nodes contain keys and TIDs (row pointers)
- Nodes are 8KB pages
-
Lookup process:
- Range scans: Leaf nodes linked for efficient range traversal
-
PostgreSQL specifics:
- Lehman-Yao algorithm (concurrent-safe)
- High key in each page for sibling navigation
- Right-link for concurrent splits
Q: How does PostgreSQL handle concurrent transactions?
Q: How does PostgreSQL handle concurrent transactions?
Strong Answer:
- Isolation Levels:
- Read Committed: Each statement sees committed data as of statement start
- Repeatable Read: Entire transaction sees snapshot from start
- Serializable: Full isolation, SSI (Serializable Snapshot Isolation)
- Locking hierarchy:
- Row-level locks (shared, exclusive)
- Table-level locks (8 modes from ACCESS SHARE to ACCESS EXCLUSIVE)
- Predicate locks for serializable
- Deadlock handling:
- Detection via wait-for graph
- One transaction aborted as victim
- Application should retry
- Concurrency patterns:
- Optimistic: Read, compute, try update, retry on conflict
- Pessimistic: Lock early, update, release
SELECT FOR UPDATEfor row locking
15.3 System Design Questions
Design a Multi-Tenant SaaS Database
Design Twitter’s Timeline Database
Full Solution
Full Solution
Requirements Clarification:❌ O(n) per request where n = # followed
❌ Slow for users following many accountsApproach 2: Push Model (Fan-out on Write)✅ O(1) read
❌ Celebrity problem: millions of followers = slow writesApproach 3: Hybrid (Real Production Architecture)Database Choices:
- Users follow other users
- Home timeline shows posts from followed users
- Profile shows user’s own posts
- Scale: 500M users, 200M daily active, 400M tweets/day
- User/follows: PostgreSQL (relational, ACID)
- Tweets: PostgreSQL with partitioning by time
- Timeline cache: Redis sorted sets
- Full-text search: Elasticsearch
- Shard by user_id (follows, timelines)
- Shard by date (tweets - time-series)
- Read replicas for timeline reads
- Connection pooling: PgBouncer
15.4 SQL Live Coding
Common Patterns to Master
15.5 Troubleshooting Scenarios
Scenario: Production Database is Slow
Initial Investigation
Initial Investigation
Resource Analysis
Resource Analysis
Common Fixes
Common Fixes
15.6 Behavioral Questions
Framework: STAR-T (Situation, Task, Action, Result, Technical)
Example: 'Tell me about a database issue you solved'
Example: 'Tell me about a database issue you solved'
Situation:
At [Company], our e-commerce checkout was timing out during Black Friday, with 90th percentile response times exceeding 10 seconds.Task:
As the senior backend engineer, I needed to identify and fix the bottleneck before peak traffic.Action:
- Enabled pg_stat_statements and identified a slow query in the inventory check
- Analyzed with EXPLAIN ANALYZE - seq scan on 50M row products table
- Found the issue: implicit cast from VARCHAR to INT in WHERE clause, bypassing index
- Created proper index, fixed application code to use correct types
- Added query performance alerts to prevent regression
- Query time dropped from 3s to 5ms
- Checkout success rate increased from 85% to 99.5%
- No issues during peak traffic (10x normal load)
- “The implicit cast was from the ORM using string interpolation…”
- “We also added connection pooling with PgBouncer to handle the connection surge…“
15.7 Questions to Ask Interviewers
About the Stack
- “What databases do you use and why those choices?”
- “How do you handle schema migrations?”
- “What’s your read/write ratio?”
About Scale
- “What’s your data volume and growth rate?”
- “How do you handle peak load?”
- “Have you encountered scaling challenges?”
About Operations
- “How do you handle database incidents?”
- “What’s your backup and recovery strategy?”
- “How is database work prioritized?”
About Team
- “Is there a dedicated DBA team or embedded?”
- “How do engineers learn about database topics?”
- “What’s the biggest database project coming up?“
15.8 Mock Interview Practice
Practice Schedule
| Week | Focus | Time |
|---|---|---|
| 1 | Deep dive explanations (record yourself) | 2 hrs |
| 2 | SQL live coding (LeetCode, HackerRank) | 3 hrs |
| 3 | System design with database focus | 3 hrs |
| 4 | Troubleshooting scenarios | 2 hrs |
| 5 | Mock interviews with peers | 3 hrs |
| 6 | Full mock (all types) | 4 hrs |
Self-Evaluation Rubric
| Criterion | Junior | Senior | Staff |
|---|---|---|---|
| Correctness | Gets basics right | Handles edge cases | Anticipates issues |
| Depth | Surface level | Explains why | Discusses trade-offs |
| Breadth | Single approach | Multiple options | Cross-system thinking |
| Communication | Needs prompting | Clear structure | Teaches effectively |
| Experience | Theoretical | Real examples | War stories |
15.9 Cheat Sheet for Interview Day
Next Module
Module 16: System Design Case Studies
Real-world database design problems