Module 15: Senior Interview Mastery

This module prepares you for database-focused questions in senior, staff, and principal engineering interviews. You’ll learn to explain complex concepts, discuss trade-offs, and demonstrate deep expertise.
Estimated Time: 12-14 hours
Interview Level: Senior/Staff/Principal
Format: Deep dives + System design + Behavioral
Goal: Demonstrate mastery, not just knowledge

15.1 Interview Types for Databases

┌─────────────────────────────────────────────────────────────────────────────┐
│                    DATABASE INTERVIEW CATEGORIES                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   1. TECHNICAL DEEP DIVE (30-45 min)                                        │
│      "Explain how MVCC works"                                               │
│      "Walk me through query execution"                                       │
│      "Describe B-Tree internals"                                            │
│                                                                              │
│   2. SYSTEM DESIGN - DATA LAYER (45-60 min)                                │
│      "Design a database for Twitter"                                        │
│      "How would you scale a multi-tenant SaaS?"                             │
│      "Design an event sourcing system"                                      │
│                                                                              │
│   3. SQL LIVE CODING (30-45 min)                                            │
│      Write complex queries under pressure                                   │
│      Optimize given slow queries                                            │
│      Design schema for requirements                                         │
│                                                                              │
│   4. TROUBLESHOOTING (30-45 min)                                            │
│      "Production is slow, walk me through diagnosis"                        │
│      "We're seeing lock contention, what do you check?"                     │
│      "Replication lag is increasing, how do you fix it?"                    │
│                                                                              │
│   5. BEHAVIORAL + TECHNICAL (30-45 min)                                     │
│      "Tell me about a database issue you solved"                            │
│      "How did you handle a production incident?"                            │
│      "Describe a decision between SQL and NoSQL"                            │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

15.2 Deep Dive Questions

MVCC Explanation

Strong Answer Structure:
  1. What it is: Multi-Version Concurrency Control keeps multiple versions of each row so that readers don’t block writers and writers don’t block readers.
  2. How it works in PostgreSQL:
    • Each row has hidden columns: xmin (creating transaction), xmax (deleting transaction)
    • New version created on UPDATE (not in-place modification)
    • Visibility rules based on transaction’s snapshot
    • Dead tuples cleaned up by VACUUM
  3. Trade-offs:
    • ✅ Better concurrency than locking
    • ✅ Consistent snapshots for long queries
    • ❌ Storage overhead (dead tuples)
    • ❌ Needs VACUUM maintenance
  4. Comparison: Oracle updates rows in place and reconstructs old versions from undo logs (rollback segments); PostgreSQL writes new row versions into the heap and leaves old ones behind for VACUUM to reclaim.
Bonus Points: Mention HOT updates, the visibility map, and anti-wraparound (freeze) vacuums.
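A quick demo makes this concrete in a live interview (a minimal sketch, assuming a hypothetical accounts table with id and balance columns):
-- xmin/xmax are hidden system columns; select them explicitly to see them
SELECT xmin, xmax, ctid, * FROM accounts LIMIT 5;

-- An UPDATE writes a new row version; the old one becomes a dead tuple
UPDATE accounts SET balance = balance + 1 WHERE id = 1;

-- Dead tuples accumulate until (auto)vacuum reclaims them
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'accounts';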

Query Execution Walkthrough

Strong Answer (follow the pipeline):
  1. Parse: SQL text → tokens → parse tree
    • Validates syntax
    • Identifies keywords, identifiers, literals
  2. Analyze: Parse tree → Query tree
    • Resolves table/column names to OIDs
    • Type checking and coercion
    • Function resolution
  3. Rewrite: Query tree → Rewritten tree(s)
    • View expansion
    • Rule processing
  4. Plan: Query tree → Plan tree
    • Generate paths (seq scan, index scans, joins)
    • Estimate costs using statistics
    • Choose cheapest path
  5. Execute: Plan tree → Results
    • Initialize plan nodes (iterator model)
    • Fetch tuples through pull-based execution
    • Apply filters, projections
Senior-level additions: Mention prepared statement caching, JIT compilation, parallel query.
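It helps to tie the pipeline to something observable: plain EXPLAIN stops after planning, while EXPLAIN (ANALYZE, BUFFERS) also executes the plan (illustrative queries on hypothetical tables):
-- Plan only: parse → analyze → rewrite → plan, no execution
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Plan plus actual execution statistics and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '1 day';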

B-Tree Internals

Strong Answer:
  1. Structure:
    • Balanced tree: all leaf nodes at same depth
    • Internal nodes contain keys and child pointers
    • Leaf nodes contain keys and TIDs (row pointers)
    • Nodes are 8KB pages
  2. Lookup process:
    Root → Compare key, descend to child
    Internal nodes → Binary search, follow pointer
    Leaf node → Binary search for key(s)
    Heap → Fetch tuple using TID
    
  3. Range scans: Leaf nodes linked for efficient range traversal
  4. PostgreSQL specifics:
    • Lehman-Yao algorithm (concurrent-safe)
    • High key in each page for sibling navigation
    • Right-link for concurrent splits
Bonus: Discuss index-only scans (visibility map), partial indexes, covering indexes.
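If pressed on internals, the pageinspect contrib extension lets you show the structure directly (the index name here is hypothetical):
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Tree metadata: root block number and tree depth ("level")
SELECT root, level FROM bt_metap('idx_orders_customer');

-- Per-page stats: live items, free space, right-link to the sibling page
SELECT live_items, free_size, btpo_next
FROM bt_page_stats('idx_orders_customer', 1);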

Transaction Isolation and Locking

Strong Answer:
  1. Isolation Levels:
    • Read Committed: Each statement sees committed data as of statement start
    • Repeatable Read: Entire transaction sees snapshot from start
    • Serializable: Full isolation, SSI (Serializable Snapshot Isolation)
  2. Locking hierarchy:
    • Row-level locks (shared, exclusive)
    • Table-level locks (8 modes from ACCESS SHARE to ACCESS EXCLUSIVE)
    • Predicate locks for serializable
  3. Deadlock handling:
    • Detection via wait-for graph
    • One transaction aborted as victim
    • Application should retry
  4. Concurrency patterns:
    • Optimistic: Read, compute, try update, retry on conflict
    • Pessimistic: Lock early, update, release
    • SELECT FOR UPDATE for row locking
Demonstrate depth: Discuss SSI vs 2PL, write skew anomaly, SKIP LOCKED.
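SKIP LOCKED is worth being able to write on the spot, since it turns a table into a safe multi-worker job queue (a minimal sketch, assuming a hypothetical jobs table):
BEGIN;

-- Each worker grabs one unclaimed job; rows locked by others are skipped, not waited on
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
FOR UPDATE SKIP LOCKED
LIMIT 1;

-- ... process the job, then mark it done (id from the row above)
UPDATE jobs SET status = 'done' WHERE id = $job_id;

COMMIT;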

15.3 System Design Questions

Design a Multi-Tenant SaaS Database

┌─────────────────────────────────────────────────────────────────────────────┐
│          MULTI-TENANT SAAS DATABASE DESIGN                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Approach 1: Shared Schema with tenant_id                                  │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                        DATABASE                                     │   │
│   │  users (id, tenant_id, name, ...)                                  │   │
│   │  orders (id, tenant_id, ...)                                       │   │
│   │                                                                     │   │
│   │  + Row Level Security:                                             │   │
│   │  CREATE POLICY tenant_isolation ON users                           │   │
│   │    USING (tenant_id = current_setting('app.tenant_id'));           │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│   ✅ Simple, efficient                                                      │
│   ✅ Easy cross-tenant analytics                                            │
│   ❌ Noisy neighbor risk                                                    │
│   ❌ Accidental data leaks if RLS misconfigured                            │
│                                                                              │
│   Approach 2: Schema per Tenant                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                        DATABASE                                     │   │
│   │  tenant_001.users   tenant_002.users   tenant_003.users            │   │
│   │  tenant_001.orders  tenant_002.orders  tenant_003.orders           │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│   ✅ Good isolation                                                         │
│   ✅ Per-tenant backup/restore                                              │
│   ❌ Schema migrations more complex                                         │
│   ❌ Connection management (SET search_path)                               │
│                                                                              │
│   Approach 3: Database per Tenant                                           │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                        │
│   │ tenant_001  │  │ tenant_002  │  │ tenant_003  │                        │
│   │    users    │  │    users    │  │    users    │                        │
│   │   orders    │  │   orders    │  │   orders    │                        │
│   └─────────────┘  └─────────────┘  └─────────────┘                        │
│   ✅ Complete isolation                                                     │
│   ✅ Easy compliance (GDPR deletion)                                       │
│   ❌ Operational overhead                                                   │
│   ❌ Connection pooling complexity                                         │
│                                                                              │
│   RECOMMENDATION:                                                           │
│   • < 100 tenants: Schema per tenant                                       │
│   • 100-10K tenants: Shared schema with RLS                                │
│   • Enterprise needs: Database per tenant                                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
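A fuller version of the RLS setup from Approach 1, since interviewers often probe how the tenant context actually gets set (a sketch; assumes tenant_id is BIGINT and uses the app.tenant_id setting from the diagram):
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE users FORCE ROW LEVEL SECURITY;  -- apply even to the table owner

CREATE POLICY tenant_isolation ON users
    USING (tenant_id = current_setting('app.tenant_id')::bigint);

-- Application sets the tenant once per connection or transaction
SET app.tenant_id = '42';
SELECT * FROM users;  -- only tenant 42's rows are visible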

Design Twitter’s Timeline Database

Requirements Clarification:
  • Users follow other users
  • Home timeline shows posts from followed users
  • Profile shows user’s own posts
  • Scale: 500M users, 200M daily active, 400M tweets/day
Approach 1: Pull Model (Fan-in on Read)
-- Schema
CREATE TABLE users (id BIGINT PRIMARY KEY, ...);
CREATE TABLE follows (follower_id BIGINT, following_id BIGINT);
CREATE TABLE tweets (id BIGINT, author_id BIGINT, content TEXT, created_at TIMESTAMPTZ);

-- Home timeline query (SLOW!)
SELECT t.* FROM tweets t
JOIN follows f ON t.author_id = f.following_id
WHERE f.follower_id = $current_user
ORDER BY t.created_at DESC
LIMIT 20;
❌ O(n) work per request, where n = number of followed accounts
❌ Slow for users who follow many accounts

Approach 2: Push Model (Fan-out on Write)
-- Materialized timeline per user
CREATE TABLE timelines (
    user_id BIGINT,
    tweet_id BIGINT,
    created_at TIMESTAMPTZ,
    PRIMARY KEY (user_id, tweet_id)
);

-- PostgreSQL primary keys can't specify sort order; put DESC in an index
CREATE INDEX idx_timelines_user_time ON timelines (user_id, created_at DESC);

-- On new tweet: fan out to all followers
INSERT INTO timelines (user_id, tweet_id, created_at)
SELECT follower_id, $tweet_id, NOW()
FROM follows WHERE following_id = $author_id;

-- Home timeline query (FAST!)
SELECT t.* FROM tweets t
JOIN timelines tl ON t.id = tl.tweet_id
WHERE tl.user_id = $current_user
ORDER BY tl.created_at DESC
LIMIT 20;
✅ O(1) read
❌ Celebrity problem: millions of followers = slow writes

Approach 3: Hybrid (Real Production Architecture)
Normal users → Fan-out on write to followers' timelines
Celebrities (>10K followers) → Fan-in on read, merged at read time

Timeline = merge(pre-materialized + real-time celebrity fetch)
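To make the merge concrete, a sketch of the hybrid read (the celebrities table is hypothetical, marking accounts above the fan-out threshold):
-- Pre-materialized part + real-time fetch of celebrity tweets, merged
(SELECT tweet_id, created_at
 FROM timelines
 WHERE user_id = $current_user)
UNION ALL
(SELECT t.id, t.created_at
 FROM tweets t
 JOIN follows f ON f.following_id = t.author_id
 JOIN celebrities c ON c.user_id = t.author_id
 WHERE f.follower_id = $current_user)
ORDER BY created_at DESC
LIMIT 20;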
Database Choices:
  • User/follows: PostgreSQL (relational, ACID)
  • Tweets: PostgreSQL with partitioning by time (sketch after the scaling list)
  • Timeline cache: Redis sorted sets
  • Full-text search: Elasticsearch
Scaling Strategy:
  • Shard by user_id (follows, timelines)
  • Shard by date (tweets - time-series)
  • Read replicas for timeline reads
  • Connection pooling: PgBouncer
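The time-based partitioning referenced above, sketched with declarative partitioning (PostgreSQL 10+; the monthly bounds are illustrative, and this re-states the tweets schema from Approach 1):
CREATE TABLE tweets (
    id BIGINT,
    author_id BIGINT,
    content TEXT,
    created_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (id, created_at)  -- the partition key must be part of the PK
) PARTITION BY RANGE (created_at);

CREATE TABLE tweets_2024_01 PARTITION OF tweets
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');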

15.4 SQL Live Coding

Common Patterns to Master

-- 1. Running totals / Cumulative sums
SELECT 
    date,
    amount,
    SUM(amount) OVER (ORDER BY date) AS running_total
FROM transactions;

-- 2. Top N per group
SELECT * FROM (
    SELECT 
        category,
        product,
        sales,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn
    FROM products
) t WHERE rn <= 3;

-- 3. Gap and island problems
WITH numbered AS (
    SELECT 
        id, 
        date,
        date - (ROW_NUMBER() OVER (ORDER BY date))::int AS grp
    FROM events
)
SELECT MIN(date) AS start_date, MAX(date) AS end_date, COUNT(*) AS length
FROM numbered
GROUP BY grp;

-- 4. Hierarchical queries (recursive CTE)
WITH RECURSIVE org_tree AS (
    SELECT id, name, manager_id, 1 AS level
    FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, e.manager_id, t.level + 1
    FROM employees e
    JOIN org_tree t ON e.manager_id = t.id
)
SELECT * FROM org_tree;

-- 5. Pivot / Cross-tab
SELECT 
    product,
    SUM(CASE WHEN month = 1 THEN sales END) AS jan,
    SUM(CASE WHEN month = 2 THEN sales END) AS feb,
    SUM(CASE WHEN month = 3 THEN sales END) AS mar
FROM sales_data
GROUP BY product;

-- 6. Deduplication (keeps the row with the highest id per user_id/event_type)
DELETE FROM events a
USING events b
WHERE a.id < b.id 
  AND a.user_id = b.user_id 
  AND a.event_type = b.event_type;

15.5 Troubleshooting Scenarios

Scenario: Production Database is Slow

-- 1. Check active queries
SELECT pid, state, query_start, query 
FROM pg_stat_activity 
WHERE state != 'idle'
ORDER BY query_start;

-- 2. Check for blocking
SELECT 
    blocked.pid AS blocked_pid,
    blocked.query AS blocked_query,
    blocking.pid AS blocking_pid,
    blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid
JOIN pg_locks l ON l.locktype = bl.locktype 
    AND l.relation = bl.relation
    AND l.pid != bl.pid
    AND l.granted              -- the blocker actually holds the lock
JOIN pg_stat_activity blocking ON l.pid = blocking.pid
WHERE NOT bl.granted;
-- (Postgres 9.6+: SELECT pg_blocking_pids(pid) is a simpler starting point)

-- 3. Check wait events
SELECT pid, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE wait_event IS NOT NULL;

-- Buffer cache hit ratio (should be > 99%)
SELECT 
    sum(blks_hit) / nullif(sum(blks_hit + blks_read), 0) AS ratio
FROM pg_stat_database;

-- Slowest queries (need pg_stat_statements)
SELECT 
    substring(query, 1, 80) AS query,
    calls,
    mean_exec_time,
    total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Table bloat
SELECT 
    schemaname, tablename,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS size,
    n_dead_tup, n_live_tup,
    round(n_dead_tup::numeric / nullif(n_live_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Kill a long-running query (pid from pg_stat_activity)
SELECT pg_terminate_backend($pid);

-- Force vacuum on bloated table
VACUUM (VERBOSE, ANALYZE) big_table;

-- Check/fix autovacuum
ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);

-- Add missing index
CREATE INDEX CONCURRENTLY idx_orders_customer 
ON orders(customer_id);

-- Tune work_mem for complex query
SET work_mem = '256MB';

15.6 Behavioral Questions

Framework: STAR-T (Situation, Task, Action, Result, Technical)

Example Answer:

Situation: At [Company], our e-commerce checkout was timing out during Black Friday, with 90th percentile response times exceeding 10 seconds.

Task: As the senior backend engineer, I needed to identify and fix the bottleneck before peak traffic.

Action:
  1. Enabled pg_stat_statements and identified a slow query in the inventory check
  2. Analyzed with EXPLAIN ANALYZE - seq scan on 50M row products table
  3. Found the issue: implicit cast from VARCHAR to INT in WHERE clause, bypassing index
  4. Created proper index, fixed application code to use correct types
  5. Added query performance alerts to prevent regression
Result:
  • Query time dropped from 3s to 5ms
  • Checkout success rate increased from 85% to 99.5%
  • No issues during peak traffic (10x normal load)
Technical Depth (add if they want more):
  • “The implicit cast was from the ORM using string interpolation…”
  • “We also added connection pooling with PgBouncer to handle the connection surge…”
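If the interviewer wants to see the bug class from that story, a reproducible sketch (hypothetical products table with a VARCHAR product_code and an index on it; the cast is written explicitly here, though ORMs can generate it implicitly; plan output abbreviated):
-- Cast applied to the indexed column: the planner cannot use the index
EXPLAIN SELECT * FROM products WHERE product_code::int = 12345;
--   Seq Scan on products  Filter: ((product_code)::integer = 12345)

-- Compare with a matching type: the index becomes usable
EXPLAIN SELECT * FROM products WHERE product_code = '12345';
--   Index Scan using idx_products_code on products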

15.7 Questions to Ask Interviewers

About the Stack

  • “What databases do you use and why those choices?”
  • “How do you handle schema migrations?”
  • “What’s your read/write ratio?”

About Scale

  • “What’s your data volume and growth rate?”
  • “How do you handle peak load?”
  • “Have you encountered scaling challenges?”

About Operations

  • “How do you handle database incidents?”
  • “What’s your backup and recovery strategy?”
  • “How is database work prioritized?”

About Team

  • “Is there a dedicated DBA team or embedded?”
  • “How do engineers learn about database topics?”
  • “What’s the biggest database project coming up?”

15.8 Mock Interview Practice

Practice Schedule

Week   Focus                                        Time
1      Deep dive explanations (record yourself)     2 hrs
2      SQL live coding (LeetCode, HackerRank)       3 hrs
3      System design with database focus            3 hrs
4      Troubleshooting scenarios                    2 hrs
5      Mock interviews with peers                   3 hrs
6      Full mock (all types)                        4 hrs

Self-Evaluation Rubric

Criterion       Junior              Senior               Staff
Correctness     Gets basics right   Handles edge cases   Anticipates issues
Depth           Surface level       Explains why         Discusses trade-offs
Breadth         Single approach     Multiple options     Cross-system thinking
Communication   Needs prompting     Clear structure      Teaches effectively
Experience      Theoretical         Real examples        War stories

15.9 Cheat Sheet for Interview Day

┌─────────────────────────────────────────────────────────────────────────────┐
│                    INTERVIEW DAY QUICK REFERENCE                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   BEFORE ANSWERING:                                                          │
│   1. Clarify requirements (don't assume)                                    │
│   2. State assumptions explicitly                                           │
│   3. Outline approach before diving in                                      │
│                                                                              │
│   FOR DEEP DIVES:                                                            │
│   • What → Why → How → Trade-offs                                           │
│   • "The reason PostgreSQL does X is..."                                    │
│   • "The alternative would be Y, but that has..."                           │
│                                                                              │
│   FOR SYSTEM DESIGN:                                                         │
│   • Requirements → Estimation → Schema → Queries → Scaling                  │
│   • Always discuss indexing strategy                                        │
│   • Mention consistency vs availability trade-offs                          │
│                                                                              │
│   FOR TROUBLESHOOTING:                                                       │
│   • Start with symptoms → gather data → hypothesize → verify                │
│   • Mention tools: EXPLAIN, pg_stat_*, logs                                 │
│   • Prioritize: quick wins first, root cause after stabilizing             │
│                                                                              │
│   POWER PHRASES:                                                             │
│   • "In my experience at [Company]..."                                      │
│   • "The trade-off here is..."                                              │
│   • "One approach is X, but we could also Y depending on..."                │
│   • "I'd want to measure before optimizing..."                              │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Next Module

Module 16: System Design Case Studies

Real-world database design problems