Database Interview Questions (50+ Detailed Q&A)
1. Fundamentals & SQL
1. ACID Properties
1. ACID Properties
- Atomicity: All or Nothing. (Transaction rolls back on error).
- Consistency: DB remains in valid state (Constraints satisfied).
- Isolation: Transactions don’t see each other’s partial writes.
- Durability: Committed data is saved to disk (WAL) and survives crash.
2. Isolation Levels (Read Phenomena)
2. Isolation Levels (Read Phenomena)
- Read Uncommitted: Dirty Read possible.
- Read Committed: No Dirty Read. Phantom/Non-repeatable read possible. (Default Postgres).
- Repeatable Read: No Non-repeatable read. Phantom possible. (Default MySQL).
- Serializable: Strict serial execution. Slowest.
3. Indexing: Clustered vs Non-Clustered
3. Indexing: Clustered vs Non-Clustered
- Clustered (Primary Key): Data rows stored inside the index leaf nodes. Sorts data on disk. Only 1 per table.
- Non-Clustered: Leaf nodes contain a Pointer to the data row (Heap). Multiple allowed.
4. B-Tree vs B+Tree
4. B-Tree vs B+Tree
- B-Tree: Data stored in internal nodes and leaf nodes.
- B+Tree: Data stored ONLY in leaf nodes. Internal nodes are keys. Leaf nodes linked (Linked List) -> Fast range scan. Used by MySQL/Postgres.
5. Normalization (1NF, 2NF, 3NF)
5. Normalization (1NF, 2NF, 3NF)
- 1NF: Atomic values (No lists).
- 2NF: 1NF + No Partial dependency (Composite key).
- 3NF: 2NF + No Transitive dependency (City depends on Zip, not UserID).
- Denormalization: Intentionally duplicating data for read performance (Star Schema).
6. SQL Joins
6. SQL Joins
- Inner: Match in both.
- Left: All from Left + Match Right (Null if missing).
- Right: All from Right.
- Full Outer: All from both.
- Cross: Cartesian product (NxM).
7. View vs Materialized View
7. View vs Materialized View
- View: Saved Query. Runs every time you select from it. Virtual.
- Materialized View: Snapshot on disk. Fast read. Stale data. Needs refresh (
REFRESH MATERIALIZED VIEW).
8. Stored Procedures vs Functions
8. Stored Procedures vs Functions
- Function: Returns value. Can use in SELECT. No transaction control usually.
- Proc: Returns void/cursor. Can manage transactions (COMMIT/ROLLBACK).
9. Constraint Types
9. Constraint Types
age > 0).10. Cursor
10. Cursor
2. Internals & Optimization
11. WAL (Write Ahead Log)
11. WAL (Write Ahead Log)
12. MVCC (Multi-Version Concurrency Control)
12. MVCC (Multi-Version Concurrency Control)
xmin, xmax in Postgres).
Old versions cleaned by VACUUM.13. Query Execution Plan
13. Query Execution Plan
EXPLAIN ANALYZE select *...
Look for: Seq Scan (Bad on large table), Index Scan (Good), Hash Join.14. Index Scan / Seek
14. Index Scan / Seek
- Seek: Jump to location B-Tree. O(log N).
- Scan: Read all leaf nodes. O(N).
15. Covering Index
15. Covering Index
SELECT name FROM users WHERE age = 10. Index on (age, name).
No Heap lookup required. Super fast.16. Hash Index
16. Hash Index
=).
No Range queries (>).
Postgres supports it but rarely used over B-Tree.17. Database Sharding Logic
17. Database Sharding Logic
18. Partitioning (Table)
18. Partitioning (Table)
Logs_2023_01, Logs_2023_02.
Optimizer skips partitions (WHERE date = ...).19. Vacuum (Postgres)
19. Vacuum (Postgres)
20. Connection Pooling
20. Connection Pooling
3. NoSQL (Mongo/Cassandra/Redis)
21. Document vs Column-Family vs Key-Value
21. Document vs Column-Family vs Key-Value
- Doc (Mongo): JSON. Flexible schema.
- Key-Value (Redis): Cache. Simple.
- Column (Cassandra): Wide column. High write throughput.
- Graph (Neo4j): Relations.
22. MongoDB Replication (Replica Set)
22. MongoDB Replication (Replica Set)
23. Cassandra Architecture
23. Cassandra Architecture
24. Redis Persistence
24. Redis Persistence
- RDB: Snapshot every X minutes. Fast restart. Last X mins data loss.
- AOF: Log every write. Slower restart. No data loss (fsync every sec).
25. Bloom Filter in NoSQL
25. Bloom Filter in NoSQL
26. CAP Theorem in practice
26. CAP Theorem in practice
- Mongo: CP (Consistency). Break partition -> Unavailability during election.
- Cassandra: AP (Availability). Network split -> Nodes accept writes -> Eventual consistency.
27. Geo-Spatial Index
27. Geo-Spatial Index
$near operator.
Store points in 2d plane.28. Write Concern (Mongo)
28. Write Concern (Mongo)
w=1: Ack by Primary.
w=majority: Ack by >50%. Safe.
j=true: Written to Journal (Disk).29. Redis Single Threaded?
29. Redis Single Threaded?
30. Graph DB Use Cases
30. Graph DB Use Cases
4. Advanced SQL Scenarios
31. Window Functions (`OVER`)
31. Window Functions (`OVER`)
RANK() OVER (PARTITION BY dept ORDER BY salary DESC).
Running Total, Moving Average.32. CTE (Common Table Expression)
32. CTE (Common Table Expression)
WITH cte AS (...) SELECT ....
Readable reusable subquery.
Recursive CTE: Used for Hierarchical data (Org Chart).33. Self Join
33. Self Join
ManagerID. Find Manager name for Emp.34. N+1 Problem in SQL
34. N+1 Problem in SQL
IN (...), JOIN, or Batching.35. Optimistic vs Pessimistic Locking
35. Optimistic vs Pessimistic Locking
- Pessimistic:
SELECT ... FOR UPDATE. Locks row. Usage: High conflict. - Optimistic: Version column. Check
ver=1on update. If row changed (ver=2), retry. Usage: Low conflict (Web).
36. Indexing a JSON column
36. Indexing a JSON column
37. Full Text Search
37. Full Text Search
tsvector (Tokens) and tsquery.
Inverted Index (GIN).
More powerful than LIKE %...%.38. Schema Migration Strategies
38. Schema Migration Strategies
39. SQL Injection
39. SQL Injection
' OR '1'='1.
Fix: Prepared Statements (Parametrized Queries).40. Soft Delete
40. Soft Delete
deleted_at timestamp column.
Pros: Recovery. Audit.
Cons: Queries needing WHERE deleted_at IS NULL, Index bloat.5. Operations & Scaling
41. Backup Types
41. Backup Types
- Full: Whole DB.
- Differential: Changes since last Full.
- Transaction Log: Stream of ops (Point in Time Recovery).
42. CDC (Change Data Capture)
42. CDC (Change Data Capture)
43. Read Replicas
43. Read Replicas
44. Deadlocks
44. Deadlocks
45. Vertical Partitioning (Column split)
45. Vertical Partitioning (Column split)
46. Database Federation
46. Database Federation
47. Time Series DB (Influx/Timescale)
47. Time Series DB (Influx/Timescale)
48. Ledger Database (QLDB)
48. Ledger Database (QLDB)
49. Row-Level Security (RLS)
49. Row-Level Security (RLS)
50. Two-Phase Commit (2PC)
50. Two-Phase Commit (2PC)
- Prepare: All nodes vote “Yes/No” (Lock resources).
- Commit: If all Yes, persist. Slow. Blocked if Coordinator dies.