> ## Documentation Index > Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt > Use this file to discover all available pages before exploring further. # Database Engineering Mastery > Senior-level PostgreSQL expertise — from internals to open-source contributions # Database Engineering Mastery A comprehensive curriculum designed for engineers aiming for **senior/staff roles** and those interested in **contributing to PostgreSQL**. This isn't another SQL tutorial — it's a deep dive into how databases really work. **Course Duration**: 12-16 weeks (self-paced)\ **Target Outcome**: Senior Database Engineer / PostgreSQL Contributor\ **Prerequisites**: Basic programming, some SQL exposure\ **Primary Focus**: PostgreSQL (with universal RDBMS concepts) *** ## Why This Course? System design, deep dives, and trade-off discussions that senior roles demand Understand PostgreSQL codebase structure for meaningful contributions Patterns from systems handling billions of transactions Understand WHY things work, not just HOW to use them *** ## Course Structure The curriculum is organized into **4 tracks** progressing from foundations to contribution-ready expertise: Database Engineering course structure overview

Database Engineering course structure overview

``` ┌─────────────────────────────────────────────────────────────────────────────┐ │ DATABASE ENGINEERING MASTERY │ ├─────────────────────────────────────────────────────────────────────────────┤ │ │ │ TRACK 1: FOUNDATIONS TRACK 2: PERFORMANCE │ │ ───────────────────── ────────────────────── │ │ □ SQL Mastery □ Indexing Deep Dive │ │ □ Database Design □ Query Optimization │ │ □ Transactions & Concurrency □ Performance Tuning │ │ │ │ TRACK 3: INTERNALS TRACK 4: PRODUCTION & OSS │ │ ────────────────── ───────────────────────── │ │ □ Storage Engine □ Replication & HA │ │ □ Query Processing □ Scaling Strategies │ │ □ PostgreSQL Architecture □ Operational Excellence │ │ □ PostgreSQL Source Code │ │ □ Contributing to PostgreSQL │ │ │ │ CAPSTONE │ │ ──────── │ │ □ Senior Interview Preparation │ │ □ System Design Case Studies │ │ │ └─────────────────────────────────────────────────────────────────────────────┘ ``` *** ## How to Use This Course Treat this as a **12-16 week roadmap** rather than a reference you skim once. The most common mistake learners make is trying to rush through the material without hands-on practice. Database engineering is a craft -- you develop intuition by running queries, breaking things, and observing the results yourself. * **Step 1 -- Foundations first**: Complete SQL Foundations, Database Design, and Transactions & Concurrency before touching performance or internals. These three modules build the mental model that everything else depends on. Skipping them is like trying to debug a distributed system without understanding networking. * **Step 2 -- Parallel practice**: For each concept, run the commands yourself on a local PostgreSQL instance (Docker is fine) and tweak examples until you can predict the outcome. If you can predict what `EXPLAIN ANALYZE` will show before you run it, you have internalized the concept. * **Step 3 -- Performance & internals**: Once you’re comfortable writing queries, move into Indexing, Query Optimization, Performance Tuning, and the Internals track. This is where you transition from "developer who uses a database" to "engineer who understands databases." * **Step 4 -- Production systems**: Finish with Replication, Scaling, Operational Excellence, and Case Studies; apply them to a project at work or a personal side project. Real production experience, even on a small scale, is worth more than reading about it. * **Step 5 -- Interview & OSS**: Use the Interview Prep, Source Code, and Contributing modules in the last 2-3 weeks to polish your narrative and ship at least one small contribution (docs patch, bug report, or extension experiment). Come back to specific modules (e.g., indexing, query plans, vacuum) whenever you hit related issues in real systems -- this course is meant to be a **long-term reference** as well as a linear curriculum. *** ## Track 1: Foundations Build unshakeable fundamentals that senior engineers rely on. **Duration**: 8-10 hours Not just syntax — understanding query semantics and execution. * Query execution order (FROM → WHERE → GROUP BY → SELECT → ORDER BY) * JOIN algorithms: Nested Loop, Hash Join, Merge Join * Subqueries vs CTEs vs derived tables — when to use each * Window functions for analytics and running calculations * Advanced: Recursive CTEs, LATERAL joins, JSON operations **Interview Focus**: Live SQL coding, query optimization challenges **Duration**: 10-12 hours Design schemas that scale and evolve gracefully. * Logical vs physical data modeling * Normalization theory (1NF through BCNF) — and when to break rules * Temporal data patterns (SCD Type 1, 2, 3) * Polymorphic associations and inheritance strategies * Schema evolution: migrations, backward compatibility, zero-downtime changes **Interview Focus**: Design a schema for X, trade-off discussions **Duration**: 12-14 hours The hardest topic in databases — master it. * ACID deep dive: what each property really guarantees * Isolation levels: Read phenomena, snapshot isolation, serializable * PostgreSQL's MVCC implementation details * Locking: row-level, table-level, advisory locks * Deadlock detection and prevention strategies * Optimistic vs pessimistic concurrency control **Interview Focus**: Concurrency bugs, isolation level selection *** ## Track 2: Performance The skills that differentiate senior from mid-level engineers. **Duration**: 10-12 hours Indexes are your primary performance tool — understand them completely. * B-Tree internals: structure, splits, fill factor * Hash indexes: when they outperform B-Trees * GIN indexes for full-text and JSONB * GiST and SP-GiST for geometric and range data * BRIN for time-series and append-only tables * Index-only scans and covering indexes * Partial indexes and expression indexes * Index bloat, reindexing, and maintenance **Interview Focus**: Index selection, composite index ordering **Duration**: 12-14 hours Systematic approach to making queries fast. * EXPLAIN ANALYZE mastery: reading every field * Query planner internals: cost estimation, cardinality * Statistics: pg\_statistic, extended statistics, manual adjustment * Common anti-patterns: implicit casts, functions on columns, OR chains * Join order optimization * Parallel query execution * Query plan caching and prepared statements **Interview Focus**: Given a slow query, how do you fix it? **Duration**: 8-10 hours Holistic system performance optimization. * Memory configuration: shared\_buffers, work\_mem, maintenance\_work\_mem * Checkpoint tuning and WAL configuration * Connection management and pooling strategies * Identifying bottlenecks: CPU, I/O, locks, network * Benchmarking methodology: pgbench, custom workloads * Performance regression testing **Interview Focus**: Tune this database for X workload *** ## Track 3: Internals Understand how PostgreSQL actually works — essential for senior roles and OSS contribution. **Duration**: 10-12 hours How data is stored, retrieved, and managed on disk. * Page layout: header, line pointers, tuples * Heap files and TOAST (oversized attributes) * Tuple structure: system columns, null bitmap, alignment * Free Space Map (FSM) and Visibility Map (VM) * Tablespaces and file organization * Heap-only tuples (HOT) optimization **OSS Prep**: Reading storage-related source code (src/backend/access/heap) **Duration**: 12-14 hours The journey from SQL text to execution results. * Parser: lexical analysis, syntax tree * Analyzer: semantic analysis, name resolution * Rewriter: rule system, view expansion * Planner: path generation, cost estimation, plan selection * Executor: node types, pipeline execution * Expression evaluation and JIT compilation **OSS Prep**: Understanding src/backend/parser, optimizer, executor **Duration**: 10-12 hours System architecture and component interaction. * Process model: postmaster, backends, auxiliary processes * Shared memory architecture * Buffer pool management and replacement algorithms * WAL: structure, write-ahead logging protocol * Checkpoint process and recovery * Background writer and checkpointer * Autovacuum subsystem **OSS Prep**: Navigating src/backend directory structure ### Advanced Internals Deep Dives For engineers targeting staff+ roles or PostgreSQL contributions, these modules provide production-grade depth: Spinlocks, LWLocks, heavyweight locks, deadlock detection algorithms, and lock monitoring. Clock sweep algorithm, buffer descriptors, ring buffers, dirty page handling, and pg\_prewarm. Memory contexts, palloc/pfree, AllocSet internals, shared memory, NUMA, and huge pages. pg\_class, pg\_attribute, pg\_type, syscache, relcache, and cache invalidation. Building extensions with hooks, custom types, operators, FDWs, and background workers. Repository structure, key data structures, coding conventions, and patch workflow. *** ## Track 4: Production & Open Source Real-world operations and contribution skills. **Duration**: 10-12 hours Build systems that survive failures. * Physical replication: streaming, WAL shipping * Replication slots and conflict handling * Synchronous vs asynchronous trade-offs * Logical replication: publications, subscriptions * Failover automation: Patroni, repmgr * Split-brain prevention * Connection pooling: PgBouncer, Pgpool-II **Interview Focus**: Design HA architecture, RTO/RPO discussions **Duration**: 12-14 hours From one server to global distribution. * Vertical vs horizontal scaling decision framework * Table partitioning: range, list, hash * Partition pruning and constraint exclusion * Sharding strategies and implementation patterns * Citus for distributed PostgreSQL * Read scaling with replicas * Caching layers: application, Redis, pg\_prewarm * Multi-region architectures **Interview Focus**: Scale this system 100x **Duration**: 10-12 hours Run databases like a senior SRE. * Monitoring: pg\_stat\_\* views, pg\_stat\_statements * Alerting strategies and runbooks * Backup strategies: pg\_dump, pg\_basebackup, pgBackRest * Point-in-time recovery (PITR) * Upgrade strategies: pg\_upgrade, logical replication * Security: roles, RLS, SSL, audit logging * Disaster recovery planning and testing **Interview Focus**: Incident response, DR scenarios **Duration**: 14-16 hours Navigate and understand the PostgreSQL codebase. * Repository structure and build system * Code style and conventions * Key data structures: List, Node, Plan, Path * Memory contexts and palloc * Error handling and elog/ereport * Catalog tables and system catalogs * Extension API and hooks * Reading existing patches **OSS Prep**: Setting up development environment **Duration**: 8-10 hours Become a PostgreSQL contributor. * Development workflow: mailing lists, commitfest * Finding contribution opportunities * Writing quality patches * Documentation contributions * Bug reporting and triage * Code review process * Community interaction etiquette * Building your contributor reputation **Outcome**: First accepted patch or documentation contribution *** ## Capstone: Senior Interview Preparation **Duration**: 12-14 hours Ace database questions in senior engineering interviews. **Technical Deep Dives**: * "Explain MVCC and its trade-offs" * "How would you debug a slow query in production?" * "Describe B-Tree internals" * "What happens when you run a SELECT query?" **System Design Integration**: * Database selection for different use cases * Scaling strategies and trade-offs * Consistency vs availability decisions * Data modeling for complex domains **Behavioral with Technical Context**: * "Tell me about a database issue you solved" * "How do you make decisions about denormalization?" * "Describe a time you optimized a critical query" **Duration**: 10-12 hours Real-world database design problems. * **E-Commerce Platform**: inventory, orders, payments * **Social Media Feed**: fan-out, timeline assembly * **Real-Time Analytics**: time-series, aggregations * **Multi-Tenant SaaS**: isolation, noisy neighbor * **Event Sourcing System**: append-only, projections * **Global Application**: multi-region, conflict resolution *** ## Learning Outcomes ### After Completing This Course, You Will: Deep understanding of database internals that enables you to make informed architectural decisions and debug complex production issues. Ability to discuss trade-offs, explain internals, and design data systems that impress senior/staff level interviewers. Understanding of PostgreSQL source code structure and development workflow to make meaningful contributions. Skills to operate, scale, and troubleshoot PostgreSQL in high-stakes production environments. *** ## Who Should Take This Course? | Profile | What You'll Gain | | ------------------------ | ---------------------------------- | | **Mid-Level Engineers** | Skills to reach senior level | | **Senior Engineers** | Depth for staff/principal roles | | **Backend Developers** | Database expertise that stands out | | **Platform Engineers** | Production PostgreSQL mastery | | **OSS Enthusiasts** | Path to PostgreSQL contribution | | **Interview Candidates** | Deep dive preparation | *** ## Course Materials 60+ hours of deep technical content Docker-based practice environments PostgreSQL codebase exploration 100+ senior-level questions Quick reference for interviews Discord + office hours *** ## Start Your Journey Begin with SQL Mastery Jump to internals if you're experienced *** ## Interview Deep-Dive **Strong Answer:** * The first question is the expected write throughput and data volume at maturity. If the dataset will comfortably fit on a single powerful node for the foreseeable future (say, under 2-5 TB with moderate write rates), PostgreSQL is almost always the right starting point. Its ecosystem, tooling, extension model (PostGIS, pg\_trgm, pgvector), and MVCC implementation are battle-tested for OLTP workloads. MySQL is a reasonable alternative if the team already has deep MySQL operational expertise, but PostgreSQL wins on standards compliance, JSON support, and advanced features like partial indexes and CTEs. * CockroachDB or similar distributed SQL enters the picture when you have a hard requirement for multi-region writes with strong consistency, or when your write volume will exceed what a single node can handle even after vertical scaling. The cost is real: higher write latency due to Raft consensus, operational complexity of a distributed cluster, and the inability to use many PostgreSQL-specific features. * The decision framework I use: (1) Can a single PostgreSQL node handle this for the next 2-3 years? If yes, start there. (2) Do I need multi-region active-active writes? If yes, CockroachDB or Spanner. (3) Is this a read-heavy workload that can tolerate slight staleness? PostgreSQL with read replicas. The worst decision is premature distribution -- you pay the complexity tax on day one but only need the scalability on day 500. **Follow-up: What if leadership pushes back and says "we need to build for scale from day one"?** This is a common organizational pressure. I would quantify the cost: distributed databases add 3-5x operational overhead (monitoring Raft groups, managing range splits, debugging distributed transaction anomalies). I would propose starting with PostgreSQL using a clean sharding-ready schema -- every table includes a tenant\_id or natural partition key in the primary key from day one. This makes a future migration to Citus or CockroachDB straightforward without paying the distributed tax prematurely. Instagram ran on a single PostgreSQL instance to 30 million users. Notion used PostgreSQL to well past Series B. Starting simple is not technical debt -- premature complexity is. **Strong Answer:** * **Not tuning default configuration.** PostgreSQL ships with conservative defaults designed to run on a Raspberry Pi. A team deploying on a 64GB server with default shared\_buffers of 128MB is leaving 95% of their hardware on the table. At minimum, set shared\_buffers to 25% of RAM, effective\_cache\_size to 75%, and random\_page\_cost to 1.1 for SSDs. * **Ignoring connection management.** Each PostgreSQL connection spawns an OS process consuming 5-10MB. Teams running 500 direct connections from microservices will hit memory walls fast. PgBouncer in transaction mode should be deployed from day one. * **Not monitoring autovacuum.** Teams assume it "just works" and then discover their 50GB table has bloated to 200GB because autovacuum could not keep up with the default scale factor of 0.2 on a high-churn table. Per-table autovacuum tuning is not optional for tables with heavy UPDATE/DELETE patterns. * **Treating PostgreSQL like an application datastore only.** Teams miss that PostgreSQL is also a job queue (SELECT FOR UPDATE SKIP LOCKED), a pub/sub system (LISTEN/NOTIFY), and a full-text search engine (tsvector/tsquery). Using Redis or Elasticsearch for capabilities PostgreSQL already provides adds unnecessary infrastructure. * **Schema migrations without considering locks.** Running ALTER TABLE ADD COLUMN with a DEFAULT on a 100M-row table in PostgreSQL versions before 11 would rewrite the entire table under an exclusive lock. Even in modern versions, adding a NOT NULL constraint without VALID requires a full table scan lock. **Follow-up: How do you set up monitoring for a new PostgreSQL deployment?** The essentials are pg\_stat\_statements for query-level performance tracking, pg\_stat\_user\_tables for vacuum and dead tuple monitoring, pg\_stat\_replication for replica lag, and pg\_stat\_activity for connection state distribution. I export these to Prometheus via postgres\_exporter and set alerts on: buffer cache hit ratio below 99%, replication lag above 10 seconds, long-running transactions over 5 minutes, and autovacuum failing to keep dead tuple ratios under 10%. **Strong Answer:** * The CAP theorem states that in the presence of a network partition, a distributed system must choose between consistency (every read returns the most recent write) and availability (every request receives a response). You cannot have both simultaneously when nodes cannot communicate. * The practical implication is not really "pick 2 of 3" -- network partitions are not optional, they will happen. So the real choice is CP (consistent but some requests fail during partition) versus AP (available but some reads may return stale data during partition). * PostgreSQL with synchronous streaming replication is a CP system: during a network partition, the primary refuses to commit if the synchronous standby is unreachable, preserving consistency at the cost of availability. With asynchronous replication, it behaves more like a single-node system that is not distributed at all -- the standby may lag, so reads from replicas are eventually consistent. * CockroachDB and Spanner are CP: they will reject writes to ranges whose Raft groups cannot achieve quorum. Cassandra and DynamoDB are AP: they continue accepting writes during partitions and resolve conflicts later via last-write-wins or vector clocks. * In system design interviews, I frame this as: "For financial data, I choose CP because showing stale account balances is unacceptable. For a social media feed, AP is fine because a user seeing a post 2 seconds late is tolerable, and feed unavailability is not." **Follow-up: Is the CAP theorem still relevant with modern databases that claim to offer "both"?** No database actually breaks CAP -- they make different tradeoffs along the spectrum. Spanner appears to offer both C and A by minimizing the window of unavailability using TrueTime and globally distributed Paxos groups, but during a true partition it will still sacrifice availability for consistency. What modern systems do well is minimize the practical impact: Spanner's 5-nines availability means the CAP tradeoff only manifests during extremely rare multi-zone failures. CockroachDB similarly tolerates single-node failures seamlessly via Raft, but a majority failure in a range still causes unavailability. The theorem is a useful framing for interviews, but the real question is always "what is the blast radius and duration of your tradeoff?"