Database Interview Questions (50+ Detailed Q&A)

1. Fundamentals & SQL

1. ACID Properties

Answer:

Atomicity: All or nothing. A transaction either commits fully or rolls back on error.
Consistency: DB moves from one valid state to another (constraints satisfied).
Isolation: Concurrent transactions don’t see each other’s partial writes.
Durability: Committed data is persisted (WAL flushed to disk) and survives a crash.
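Atomicity can be seen in a small sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine; the accounts table and the transfer helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# The debit would make the balance negative, so the CHECK constraint fails
# and the already-applied credit is rolled back as well.
ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```

After the failed transfer both balances are unchanged, which is the "all or nothing" behavior described above.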

2. Isolation Levels (Read Phenomena)

Answer:

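Read Uncommitted: Dirty Read possible. (Postgres treats this level as Read Committed.)
Read Committed: No Dirty Read. Non-repeatable Read and Phantom possible. (Default in Postgres).
Repeatable Read: No Non-repeatable Read. Phantom possible per the SQL standard (the snapshot-based implementation in Postgres prevents it). (Default in MySQL InnoDB).
Serializable: Result is equivalent to some serial order of the transactions (they may still run concurrently). Strictest and usually slowest; transactions may abort and need a retry.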

3. Indexing: Clustered vs Non-Clustered

Answer:

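Clustered: Data rows stored inside the index leaf nodes, so the table is physically ordered by the index key. Only 1 per table. In MySQL InnoDB the Primary Key is the clustered index; in SQL Server it is by default. Postgres has no clustered index (tables are heaps).
Non-Clustered: Leaf nodes contain a pointer to the data row (a heap row location, or the clustered key in InnoDB/SQL Server). Multiple allowed.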

4. B-Tree vs B+Tree

Answer:

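B-Tree: Keys and data stored in both internal nodes and leaf nodes.
B+Tree: Data stored ONLY in leaf nodes. Internal nodes hold only keys (higher fan-out, shallower tree). Leaf nodes are linked (Linked List) -> Fast range scan. Used by MySQL InnoDB; the default "btree" index in Postgres is a similar variant with linked leaf pages.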

5. Normalization (1NF, 2NF, 3NF)

Answer:

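1NF: Atomic values (no lists or repeating groups in a column).
2NF: 1NF + No partial dependency (no non-key column depends on only part of a composite key).
3NF: 2NF + No transitive dependency (e.g. City depends on Zip, and Zip on UserID, so City belongs in a Zip table).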
Denormalization: Intentionally duplicating data for read performance (Star Schema).

6. SQL Joins

Answer:

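Inner: Only rows with a match in both tables.
Left: All from Left + matching Right (NULL if missing).
Right: All from Right + matching Left (NULL if missing).
Full Outer: All from both, NULL-filled where there is no match.
Cross: Cartesian product (N x M rows).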
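The join types above can be compared on a tiny data set. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the users and orders tables are made up for the example. RIGHT and FULL OUTER joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT);
INSERT INTO users  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 'book');
""")

# Inner join: Bob has no order, so he does not appear.
inner = conn.execute("""
    SELECT u.name, o.item
    FROM users u INNER JOIN orders o ON o.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Left join: every user appears; missing order columns come back as NULL (None).
left = conn.execute("""
    SELECT u.name, o.item
    FROM users u LEFT JOIN orders o ON o.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Cross join: every user paired with every order (2 x 1 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM users CROSS JOIN orders"
).fetchone()[0]
```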

7. View vs Materialized View

Answer:

View: Saved query. Runs every time you select from it. Virtual (no data stored).
Materialized View: Query result stored on disk. Fast read, but data is stale until refreshed (REFRESH MATERIALIZED VIEW in Postgres).

8. Stored Procedures vs Functions

Answer:

Function: Returns a value. Can be used in SELECT. Usually no transaction control.
Procedure: Invoked with CALL/EXEC. No return value required (may use OUT params or return result sets). Can manage transactions (COMMIT/ROLLBACK) in most engines.

9. Constraint Types

Answer: Primary Key, Foreign Key, Unique, Not Null, Check (age > 0).

10. Cursor

Answer: Pointer to the current row of a result set. Allows iterating row-by-row. Expensive (per-row overhead, plus network roundtrips for client-side cursors). Avoid if a set-based operation is possible.

2. Internals & Optimization

11. WAL (Write Ahead Log)

Answer: Changes are written to an append-only log before they are applied to the data files. Why? Sequential writes are fast, and the log ensures Durability: after a crash, the DB replays the log.
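The log-then-apply order can be sketched with a toy key-value store. This is an illustration under simplifying assumptions, not how any real engine lays out its WAL: TinyStore, its JSON-lines log format, and the file name are all hypothetical.

```python
import json
import os
import tempfile

class TinyStore:
    """Toy key-value store: every write hits the log before memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recovery: rebuild state from the log

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only then apply it to the in-memory "data file".
        self.data[key] = value

log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_path)
store.put("a", 1)
store.put("a", 2)
store.put("b", 3)

# Simulate a crash and restart: a fresh instance replays the log.
recovered = TinyStore(log_path)
```

The recovered instance ends up with the same contents as the original because every acknowledged write was in the log first.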

12. MVCC (Multi-Version Concurrency Control)

Answer: Readers don’t block Writers. Writers don’t block Readers. Each transaction sees a “Snapshot”. Implemented via Tuple Versioning (xmin, xmax in Postgres). Old versions cleaned by VACUUM.
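The snapshot idea can be illustrated with a toy visibility check. This is a deliberately simplified sketch: real Postgres snapshots also track in-progress transactions, and the TupleVersion class, the visible function, and the transaction ids below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TupleVersion:
    value: str
    xmin: int                   # transaction id that created this version
    xmax: Optional[int] = None  # transaction id that deleted/replaced it

def visible(version, snapshot_xid, committed):
    """Visible if created by a transaction committed before the snapshot
    and not deleted by a transaction committed before the snapshot."""
    created = version.xmin in committed and version.xmin < snapshot_xid
    deleted = (
        version.xmax is not None
        and version.xmax in committed
        and version.xmax < snapshot_xid
    )
    return created and not deleted

# Transaction 10 inserted the row; transaction 20 updated it, which marks
# the old version with xmax=20 and writes a new version with xmin=20.
versions = [
    TupleVersion("v1", xmin=10, xmax=20),
    TupleVersion("v2", xmin=20),
]
committed = {10, 20}

def read(snapshot_xid):
    return [v.value for v in versions if visible(v, snapshot_xid, committed)]

old_reader = read(15)  # snapshot taken before the update
new_reader = read(25)  # snapshot taken after the update
```

The reader with the older snapshot keeps seeing v1 while the newer one sees v2, so neither has to wait for the writer. Once no snapshot can see v1 any more, it becomes a dead tuple for VACUUM to clean up.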

13. Query Execution Plan

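Answer: Parser -> Optimizer (Planner) -> Executor. EXPLAIN shows the estimated plan; EXPLAIN ANALYZE SELECT ... actually runs the query and reports real timings. Look for: Seq Scan (costly on a large table when only a few rows are needed), Index Scan (good for selective filters), and the join strategy (Nested Loop, Hash Join, Merge Join).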

14. Index Scan / Seek

Answer:

Seek: Traverse the B-Tree straight to the matching key. O(log N).
Scan: Read all leaf nodes of the index. O(N).

15. Covering Index

Answer: Index contains ALL columns required by the query. Ex: SELECT name FROM users WHERE age = 10 with an index on (age, name). No heap lookup required (an Index Only Scan in Postgres), so it is very fast.
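The effect shows up directly in a query plan. The sketch below assumes SQLite (through Python's sqlite3 module), whose EXPLAIN QUERY PLAN output labels this case as a "COVERING INDEX"; other engines word their plans differently. The email column and the index name are hypothetical additions for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, email TEXT)"
)
conn.execute("CREATE INDEX idx_users_age_name ON users (age, name)")

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Both the filter column (age) and the selected column (name) are in the
# index, so the table itself is never touched.
covered = plan("SELECT name FROM users WHERE age = 10")

# email is not in the index, so each match needs a lookup into the table.
not_covered = plan("SELECT email FROM users WHERE age = 10")

print(covered)
print(not_covered)
```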

16. Hash Index

Answer: O(1) average lookup. Equality only (=). No range queries (>, <) or ordering. Postgres supports it, but it is rarely chosen over B-Tree.

17. Database Sharding Logic

Answer: Distributing rows across servers by a shard key. Criteria: ID range, hash modulo N, geography. Challenges: cross-shard joins and transactions, hot shards, rebalancing when shards are added.
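The modulo-hash criterion and the rebalancing challenge can both be seen in a few lines. This is a sketch with hypothetical names (shard_for, the user-N keys); SHA-256 is used only because it is a stable hash available in the standard library.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash, then modulo."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

keys = [f"user-{i}" for i in range(1000)]

# Placement with 4 shards, then again after adding a fifth shard.
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}

# With plain modulo hashing, most keys land on a different shard once the
# shard count changes, which is why rebalancing is painful.
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys changed shard")
```

Schemes such as consistent hashing exist to reduce how many keys move when the shard count changes.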

18. Partitioning (Table)

Answer: Splitting a huge table into smaller physical tables (partitions) on the SAME server. Ex: Logs_2023_01, Logs_2023_02. The optimizer skips irrelevant partitions when the query filters on the partition key (WHERE date = ...); this is called partition pruning.

19. Vacuum (Postgres)

Answer: Reclaims space from dead tuples (old row versions left by UPDATE/DELETE) and prevents Transaction ID wraparound. Normally run by the autovacuum daemon.

20. Connection Pooling

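Answer: Opening a connection is expensive (TCP/TLS handshake, authentication, and a process per connection in Postgres or a thread per connection in MySQL). A pool (PgBouncer) keeps connections open and reuses them.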

3. NoSQL (Mongo/Cassandra/Redis)

21. Document vs Column-Family vs Key-Value

Answer:

Doc (Mongo): JSON-like documents (BSON). Flexible schema.
Key-Value (Redis): Lookup by key. Simple and fast; typical use is caching.
Column (Cassandra): Wide column. High write throughput.
Graph (Neo4j): Nodes and relationships.

22. MongoDB Replication (Replica Set)

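Answer: Primary node (reads and writes) + Secondaries (read-only; they serve reads only if the read preference allows it). Async replication via the Oplog. Automatic failover by election.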

23. Cassandra Architecture

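Answer: Masterless (Ring). Peer-to-peer. Tunable Consistency (per-query consistency level, e.g. ONE, QUORUM, ALL). Data placement by Tokens/VNodes. Hinted Handoff (if a replica is down, the coordinator node stores the write as a hint and replays it when the replica returns).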

24. Redis Persistence

Answer:

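RDB: Point-in-time snapshot at configured intervals. Compact file, fast restart. Writes since the last snapshot are lost on crash.
AOF: Logs every write. Slower restart (log replay). With the default fsync every second (appendfsync everysec), up to about one second of writes can be lost; appendfsync always avoids that at a throughput cost.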

25. Bloom Filter in NoSQL

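Answer: Used in Cassandra/HBase to quickly check whether a key might exist in an SSTable before reading it from disk. "No" is definite; "Yes" can be a false positive, so the filter saves disk reads for absent keys.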
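A Bloom filter is small enough to sketch in full. The class below is a minimal illustration, not the implementation used by Cassandra or HBase; the bit-array size, the number of hashes, and the salted SHA-256 hashing are all assumptions chosen for simplicity.

```python
import hashlib

class BloomFilter:
    """Set-membership filter: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # If any bit is unset, the key was definitely never added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )

bf = BloomFilter()
for i in range(100):
    bf.add(f"user:{i}")

# Every key that was added is reported as possibly present.
all_found = all(bf.might_contain(f"user:{i}") for i in range(100))
```

A storage engine would consult might_contain before opening a file on disk and skip the read entirely when it returns False.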

26. CAP Theorem in practice

Answer:

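Mongo: CP (Consistency). Network partition -> No writes accepted until a new primary is elected (and never on the minority side).
Cassandra: AP (Availability). Network split -> Nodes on both sides accept writes -> Eventual consistency (depending on the consistency level chosen).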

27. Geo-Spatial Index

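Answer: Built on QuadTree / Geohash style structures that map 2D points to sortable cells. Ex: Mongo 2dsphere index with the $near operator to return points ordered by distance.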

28. Write Concern (Mongo)

Answer: w=1: Ack by Primary only. w=majority: Ack by a majority of voting members; safe against rollback on failover. j=true: Ack only after the write is in the on-disk journal.

29. Redis Single Threaded?

Answer: Yes, for command execution. Avoids context switches and locks. CPU is rarely the bottleneck (memory and network are). Since v6, optional I/O threads handle socket reads/writes; commands still execute on one thread.

30. Graph DB Use Cases

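Answer: Fraud detection (rings of accounts). Social network (friends of friends). Recommendation engine. With index-free adjacency, each hop costs about O(1) per relationship followed, independent of total graph size. In SQL each hop is a join: O(log N) per lookup with an index, O(N) without one.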

4. Advanced SQL Scenarios

31. Window Functions (`OVER`)

Answer: Perform calculations across a set of rows related to the current row, without collapsing them like GROUP BY. Ex: RANK() OVER (PARTITION BY dept ORDER BY salary DESC). Uses: running total, moving average.
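A ranking and a running total can be tried out locally. This sketch assumes SQLite 3.25 or newer (the first version with window functions) via Python's sqlite3 module; the employees table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO employees VALUES
  ('Ann', 'eng', 120), ('Bob', 'eng', 100), ('Cid', 'eng', 100),
  ('Dee', 'ops', 90),  ('Eve', 'ops', 80);
""")

rows = conn.execute("""
    SELECT name, dept, salary,
           -- rank within each department; ties share a rank
           RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk,
           -- running total within each department, in salary order
           SUM(salary) OVER (PARTITION BY dept
                             ORDER BY salary DESC, name) AS running_total
    FROM employees
    ORDER BY dept, rnk, name
""").fetchall()

for row in rows:
    print(row)
```

Every input row is still present in the output, each with its rank and running total attached, which is the difference from GROUP BY.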

32. CTE (Common Table Expression)

Answer: WITH cte AS (...) SELECT .... A readable, named subquery that can be referenced more than once in the statement. Recursive CTE (WITH RECURSIVE): used for hierarchical data (Org Chart).
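The org-chart case can be sketched as follows, assuming SQLite through Python's sqlite3 module. The employees table, its rows, and the chain CTE name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES
  (1, 'CEO', NULL), (2, 'VP', 1), (3, 'Lead', 2), (4, 'Dev', 3), (5, 'CFO', 1);
""")

rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        -- anchor: start at the root of the hierarchy
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- recursive step: add the direct reports of rows found so far
        SELECT e.id, e.name, c.depth + 1
        FROM employees e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, id
""").fetchall()
```

Each row comes back with its depth in the hierarchy, without the query needing to know how many levels exist.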

33. Self Join

Answer: Joining a table to itself using aliases. Ex: Employee table has ManagerID. Join Employee e to Employee m ON e.ManagerID = m.ID to find the manager name for each employee.

34. N+1 Problem in SQL

Answer: 1 query loads N parent objects, then a loop runs 1 more query per object (N+1 total). Common with ORM lazy loading. Fix: IN (...), JOIN, or batching (eager loading).
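Counting queries makes the problem visible. The sketch assumes SQLite via Python's sqlite3 module; the authors and books tables, the run helper, and the counter are hypothetical scaffolding for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books   (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO books   VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

def books_n_plus_one():
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        # One extra query per author: this is the "+N".
        titles = run(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [t for (t,) in titles]
    return result

def books_single_join():
    result = {}
    rows = run("""
        SELECT a.name, b.title
        FROM authors a LEFT JOIN books b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """)
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result

counter["queries"] = 0
slow = books_n_plus_one()
n_plus_one_queries = counter["queries"]  # 1 for authors + 3 for books

counter["queries"] = 0
fast = books_single_join()
join_queries = counter["queries"]        # a single JOIN
```

Both functions return the same data; only the number of roundtrips differs, and that gap grows with the number of authors.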

35. Optimistic vs Pessimistic Locking

Answer:

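Pessimistic: SELECT ... FOR UPDATE. Locks the row until the transaction ends. Usage: High conflict.
Optimistic: Version column. UPDATE ... WHERE version = 1 and bump the version. If the row changed (version = 2), 0 rows are updated, so re-read and retry. Usage: Low conflict (Web).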
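The optimistic variant comes down to a conditional UPDATE and a check of the affected row count. A minimal sketch, assuming SQLite via Python's sqlite3 module; the items table and the update_stock helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 10, 1)")
conn.commit()

def update_stock(conn, item_id, new_stock, expected_version):
    """Apply the update only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE items SET stock = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_stock, item_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows means we lost the race

# Two clients both read the row at version 1.
first = update_stock(conn, 1, 9, expected_version=1)  # wins, row is now v2
stale = update_stock(conn, 1, 8, expected_version=1)  # loses, must re-read

final = conn.execute("SELECT stock, version FROM items WHERE id = 1").fetchone()
```

The losing client sees that no row was updated, re-reads the row to get the new version, and decides whether to retry.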

36. Indexing a JSON column

Answer: Postgres GIN index on a JSONB column. An inverted index over the keys and values, which speeds up containment (@>) and key-existence (?) queries.

37. Full Text Search

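Answer: tsvector (normalized tokens of the document) and tsquery (the search expression), matched with @@. Backed by an inverted index (GIN). More powerful than LIKE '%...%' (stemming, ranking) and able to use an index.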

38. Schema Migration Strategies

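Answer: Add Column (fast in PG 11+ even with a non-volatile default; no table rewrite). Remove Column (first deploy code that no longer uses it, then drop). Rename (breaks running code; to avoid downtime, add the new column, backfill, switch the code, then drop the old one). Tools: Liquibase, Flyway.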

39. SQL Injection

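Answer: Attacker input such as ' OR '1'='1 is concatenated into the SQL string and changes the query logic. Fix: Prepared Statements (Parameterized Queries), which send the values separately from the SQL text.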
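The difference between concatenation and parameters can be demonstrated safely against an in-memory database. The sketch assumes SQLite via Python's sqlite3 module; the users table and its credentials are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (name TEXT, password TEXT);
INSERT INTO users VALUES ('alice', 's3cret'), ('bob', 'hunter2');
""")

payload = "' OR '1'='1"  # attacker-supplied "password"

# UNSAFE: the payload becomes part of the SQL text. The WHERE clause turns
# into  name = 'alice' AND password = '' OR '1'='1'  which is always true.
unsafe_sql = (
    "SELECT name FROM users WHERE name = 'alice' AND password = '"
    + payload
    + "'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# SAFE: placeholders keep the payload as plain data, so it is compared
# against the password column literally and matches nothing.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ? AND password = ?",
    ("alice", payload),
).fetchall()
```

The concatenated query returns every user, while the parameterized one returns no rows.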

40. Soft Delete

Answer: deleted_at timestamp column instead of a physical DELETE. Pros: Recovery, audit. Cons: Every query needs WHERE deleted_at IS NULL, table and index bloat, unique constraints must account for deleted rows.

5. Operations & Scaling

41. Backup Types

Answer:

Full: Whole DB.
Differential: Changes since last Full.
Transaction Log: Stream of operations since the last backup; enables Point in Time Recovery.

42. CDC (Change Data Capture)

Answer: Capture writes from the transaction log (e.g. Debezium reading the Postgres WAL) -> Stream to Kafka -> Data Warehouse. Low impact on query performance because it reads the log instead of querying tables.

43. Read Replicas

Answer: Offload read traffic (reporting/analytics queries) from the primary. Replication is usually async, so replicas can lag and serve slightly stale data.

44. Deadlocks

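Answer: A cycle of transactions, each waiting for a lock held by the next. The DB detects the cycle and aborts one transaction (the victim). App must retry. Prevention: acquire locks in the same order, keep transactions short.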
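The "same order" rule can be illustrated with in-process locks standing in for row locks. This is an analogy rather than database code: the Account class, the transfer function, and ordering by account id are assumptions made for the sketch.

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the account with the lower id first, regardless of the
    # transfer direction. Two opposite transfers then compete for the same
    # first lock instead of each holding one and waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

a = Account(1, 1000)
b = Account(2, 1000)

def a_to_b():
    for _ in range(500):
        transfer(a, b, 1)

def b_to_a():
    for _ in range(500):
        transfer(b, a, 1)

threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If each transfer locked src first and dst second instead, the two threads could each hold one lock while waiting for the other, which is exactly the cycle described above.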

45. Vertical Partitioning (Column split)

Answer: Moving large or rarely read columns (BLOB: image, long text) to a separate table with the same key. Keeps main table rows small, so more rows fit per page and in the buffer cache.

46. Database Federation

Answer: Treating multiple DBs as one. Ex: a Postgres Foreign Data Wrapper (FDW) exposes a remote table as a local one, so you can query Mongo from SQL.

47. Time Series DB (Influx/Timescale)

Answer: Optimized for append-only timestamped data. Compression (delta-of-delta encoding), downsampling (rollups), retention policies.

48. Ledger Database (QLDB)

Answer: Immutable, cryptographically verifiable log of changes. Use: Financial / supply chain audit.

49. Row-Level Security (RLS)

Answer: DB enforces “User can only see their own rows”. Policy defined in SQL (CREATE POLICY in Postgres) and applied to every query on the table.

50. Two-Phase Commit (2PC)

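Answer: Protocol for committing a distributed transaction atomically across nodes.

Prepare: Coordinator asks all nodes to vote “Yes/No”. A node voting Yes locks resources and promises it can commit.
Commit: If all vote Yes, the coordinator tells everyone to commit; otherwise everyone rolls back. Slow (two roundtrips, locks held throughout). Participants block if the Coordinator dies after Prepare.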
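The two phases can be sketched as an in-memory simulation. It leaves out everything that makes 2PC hard in practice (network failures, timeouts, durable logging of votes), and the Participant class and two_phase_commit function are hypothetical names.

```python
class Participant:
    """A node taking part in the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A Yes vote means resources are locked and the
        # node promises it will be able to commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1 (Prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (Commit or Abort): commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"

happy = [Participant("orders"), Participant("payments")]
happy_result = two_phase_commit(happy)

unhappy = [Participant("orders"), Participant("payments", can_commit=False)]
unhappy_result = two_phase_commit(unhappy)
```

In the second run a single No vote is enough to roll back every participant, including the one that voted Yes.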
