Module 13: PostgreSQL Source Code
This module prepares you to read, understand, and contribute to the PostgreSQL codebase. You’ll learn the code structure, key patterns, and how to set up a development environment.
Estimated Time: 14-16 hours
Difficulty: Expert
Prerequisite: Complete Modules 7-9 (internals)
Outcome: Ready for first contribution
13.1 Repository Structure
┌─────────────────────────────────────────────────────────────────────────────┐
│ POSTGRESQL SOURCE STRUCTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ postgresql/ │
│ ├── src/ # Main source code │
│ │ ├── backend/ # Server-side code (THE CORE) │
│ │ │ ├── access/ # Table/index access methods │
│ │ │ ├── catalog/ # System catalog handling │
│ │ │ ├── commands/ # SQL command implementations │
│ │ │ ├── executor/ # Query executor │
│ │ │ ├── optimizer/ # Query planner/optimizer │
│ │ │ ├── parser/ # SQL parser │
│ │ │ ├── postmaster/ # Process management │
│ │ │ ├── replication/ # Streaming/logical replication │
│ │ │ ├── storage/ # Buffer pool, locks, shared memory, files │
│ │ │ ├── tcop/ # Traffic cop (query dispatch) │
│ │ │ └── utils/ # Utilities, memory, caching │
│ │ ├── include/ # Header files │
│ │ ├── interfaces/ # Client libraries (libpq) │
│ │ ├── bin/ # Command-line tools │
│ │ │ ├── psql/ # Interactive terminal │
│ │ │ ├── pg_dump/ # Backup utility │
│ │ │ └── initdb/ # Database initialization │
│ │ ├── pl/ # Procedural languages │
│ │ └── test/ # Regression tests │
│ ├── contrib/ # Optional modules/extensions │
│ ├── doc/ # Documentation (SGML) │
│ └── config/ # Build configuration │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Key Directories Deep Dive
SQL Parsing (src/backend/parser)
parser/
├── scan.l # Flex lexer (tokenization)
├── gram.y # Bison grammar (syntax)
├── parser.c # Parser entry point
├── analyze.c # Semantic analysis
├── parse_expr.c # Expression parsing
├── parse_clause.c # FROM, WHERE, etc.
├── parse_func.c # Function call resolution
├── parse_relation.c # Table reference resolution
└── parse_type.c # Type handling
Query Planning (src/backend/optimizer)
optimizer/
├── plan/
│ ├── planner.c # Main planner entry
│ ├── createplan.c # Convert paths to plans
│ └── subselect.c # Subquery handling
├── path/
│ ├── allpaths.c # Path generation
│ ├── indxpath.c # Index paths
│ ├── joinpath.c # Join paths
│ └── costsize.c # Cost estimation
└── util/
  ├── clauses.c # Expression utilities
  └── pathnode.c # Path node creation
Query Execution (src/backend/executor)
executor/
├── execMain.c # Executor entry point
├── execProcnode.c # Node dispatch
├── nodeSeqscan.c # Sequential scan
├── nodeIndexscan.c # Index scan
├── nodeBitmapHeapscan.c # Bitmap heap scan
├── nodeHashjoin.c # Hash join
├── nodeMergejoin.c # Merge join
├── nodeNestloop.c # Nested loop join
├── nodeAgg.c # Aggregation
└── nodeSort.c # Sorting
Storage Layer (src/backend/storage)
storage/
├── buffer/
│ ├── bufmgr.c # Buffer manager
│ └── freelist.c # Buffer replacement
├── file/
│ └── fd.c # File descriptor mgmt
├── ipc/
│ └── shmem.c # Shared memory
├── lmgr/
│ └── lock.c # Lock manager
└── smgr/
  └── md.c # Magnetic disk storage
13.2 Development Environment Setup
Building from Source
# Clone repository
git clone https://git.postgresql.org/git/postgresql.git
cd postgresql
# Configure build (debug mode for development)
./configure \
  --enable-debug \
  --enable-cassert \
  --enable-tap-tests \
  --prefix=$HOME/pg-dev \
  CFLAGS="-O0 -g3"
# Build (use available cores)
make -j$(nproc)
# Install
make install
# Initialize database cluster
$HOME/pg-dev/bin/initdb -D $HOME/pg-dev/data
# Start server
$HOME/pg-dev/bin/pg_ctl -D $HOME/pg-dev/data -l logfile start
# Verify
$HOME/pg-dev/bin/psql -d postgres -c "SELECT version();"
Build Options Explained
Option               Purpose
--enable-debug       Include debugging symbols
--enable-cassert     Enable assertion checks
--enable-tap-tests   Enable Perl TAP tests
CFLAGS="-O0"         Disable optimization (better debugging)
CFLAGS="-g3"         Maximum debug info
IDE Setup
// .vscode/c_cpp_properties.json
{
  "configurations": [{
    "name": "PostgreSQL",
    "includePath": [
      "${workspaceFolder}/src/include/**",
      "${workspaceFolder}/src/backend/**"
    ],
    // FRONTEND is omitted: it marks client-side code and hides backend-only declarations
    "defines": [
      "HAVE_CONFIG_H"
    ],
    "compilerPath": "/usr/bin/gcc",
    "cStandard": "c11"
  }]
}
13.3 Code Conventions
Naming Conventions
/* Types: CamelCase */
typedef struct BufferDesc { ... } BufferDesc;
typedef struct RelationData *Relation;
/* Functions: lowercase_with_underscores or CamelCase, depending on the subsystem */
extern void heap_insert(Relation relation, HeapTuple tup, ...);
extern void BufferGetTag(Buffer buffer, RelFileNode *rnode, ...);
/* Macros: usually UPPER_CASE; some constant-like macros are CamelCase */
#define BUFFER_LOCK_SHARE 1
#define InvalidOid ((Oid) 0)
/* Global variables: GUC-backed settings are lowercase_with_underscores; others are often CamelCase */
bool enable_seqscan = true;
int work_mem = 4096;
/* Local variables: short, lowercase */
int i, j, ntuples;
char *ptr;
Common Patterns
/* PostgreSQL uses memory contexts for allocation */
MemoryContext oldcontext;
/* Switch to appropriate context */
oldcontext = MemoryContextSwitchTo(CacheMemoryContext);
/* Allocate memory (auto-freed when context is reset/deleted) */
ptr = palloc(size);
str = pstrdup("string");
/* Restore previous context */
MemoryContextSwitchTo(oldcontext);
/* Reset context (free all memory in it) */
MemoryContextReset(mycontext);
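To make the switch/allocate/reset cycle concrete, here is a minimal stand-alone model of a memory context. It is only a sketch: every `toy_` and `Toy` name is hypothetical, chunks come straight from `malloc`, and PostgreSQL's real allocator (block-based, hierarchical contexts) is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef union ToyChunk {
    union ToyChunk *next;   /* next chunk owned by the same context */
    max_align_t     align;  /* keeps the payload after the header aligned */
} ToyChunk;

typedef struct ToyContext {
    ToyChunk *chunks;       /* linked list of live allocations */
    size_t    nchunks;      /* number of live allocations */
} ToyContext;

/* Plays the role of the "current" context that allocations are charged to. */
static ToyContext *toy_current = NULL;

/* Make ctx current and return the previous one so the caller can restore it. */
static ToyContext *toy_switch_to(ToyContext *ctx)
{
    ToyContext *old = toy_current;

    toy_current = ctx;
    return old;
}

/* Allocate from the current context; the header links the chunk to its owner. */
static void *toy_alloc(size_t size)
{
    ToyChunk *chunk;

    assert(toy_current != NULL);
    chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        abort();
    chunk->next = toy_current->chunks;
    toy_current->chunks = chunk;
    toy_current->nchunks++;
    return (void *) (chunk + 1);
}

/* Copy a string into the current context. */
static char *toy_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char  *copy = toy_alloc(len);

    memcpy(copy, s, len);
    return copy;
}

/* Free every allocation made in ctx with one call. */
static void toy_reset(ToyContext *ctx)
{
    while (ctx->chunks != NULL) {
        ToyChunk *next = ctx->chunks->next;

        free(ctx->chunks);
        ctx->chunks = next;
    }
    ctx->nchunks = 0;
}
```

The point of the design is that callers never free individual pointers: whoever owns the context resets it once, which is why leaks in short-lived contexts are usually harmless.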
/* ereport for errors and logging */
ereport(ERROR,
        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
         errmsg("invalid value for parameter \"%s\": %d",
                name, value),
         errdetail("Value must be between %d and %d.",
                   min, max),
         errhint("Try using a smaller value.")));
/* Log levels: DEBUG5-DEBUG1, LOG, INFO, NOTICE, WARNING, ERROR, FATAL, PANIC */
elog(DEBUG1, "entering function %s", __func__);
elog(LOG, "checkpoint starting");
elog(WARNING, "skipping invalid entry");
/* ERROR aborts transaction, FATAL terminates connection */
/* PANIC crashes the server (only for catastrophic issues) */
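Reporting at ERROR level does not return to the caller; control jumps to the nearest recovery point instead. The sketch below models that behavior with the standard `setjmp`/`longjmp` pair. All `toy_` names are hypothetical and the model ignores cleanup, message formatting, and transaction abort, so treat it as an illustration of the control flow only.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef enum ToyLevel { TOY_LOG, TOY_WARNING, TOY_ERROR } ToyLevel;

static jmp_buf    *toy_handler = NULL;       /* innermost recovery point */
static const char *toy_last_message = NULL;  /* most recent report */

/* Record the message; at TOY_ERROR, jump to the recovery point and never return. */
static void toy_report(ToyLevel level, const char *msg)
{
    toy_last_message = msg;
    if (level >= TOY_ERROR) {
        assert(toy_handler != NULL);
        longjmp(*toy_handler, 1);
    }
}

/* Example callee: rejects values outside 0..100 by raising an error. */
static int toy_parse_percent(int value)
{
    if (value < 0 || value > 100)
        toy_report(TOY_ERROR, "value out of range");
    return value;
}

/* Run the callee under a recovery point. Returns 1 on success, 0 on error. */
static int toy_run_guarded(int value, int *result)
{
    jmp_buf  local;
    jmp_buf *saved = toy_handler;

    if (setjmp(local) == 0) {
        toy_handler = &local;
        *result = toy_parse_percent(value);
        toy_handler = saved;    /* normal exit: restore outer handler */
        return 1;
    }
    toy_handler = saved;        /* error exit: restore outer handler */
    return 0;
}
```

Because the error path skips the rest of the callee, `*result` is left untouched when the guarded call fails, which mirrors why code after an ERROR-level report is never reached.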
/* All parse/plan nodes inherit from Node */
typedef struct Node {
    NodeTag type;
} Node;
/* Check node type */
if (IsA(node, SeqScan))
    process_seqscan((SeqScan *) node);
/* Safe casting with nodeTag check */
SeqScan *scan = castNode(SeqScan, node);
/* Copy nodes (deep copy) */
Node *copy = copyObject(original);
/* Node comparison */
if (equal(node1, node2))
    ...
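The "inheritance" here is just a shared first field. As a rough stand-alone illustration, the sketch below tags each struct with an enum as its first member and dispatches on it. The `Toy` types and macros are hypothetical stand-ins, not the real `Node`, `IsA`, or `castNode` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef enum ToyTag { T_ToyInvalid = 0, T_ToySeqScan, T_ToySort } ToyTag;

/* Every node type starts with the tag, so any node pointer can be inspected. */
typedef struct ToyNode    { ToyTag type; } ToyNode;
typedef struct ToySeqScan { ToyTag type; unsigned scanrelid; } ToySeqScan;
typedef struct ToySort    { ToyTag type; int numCols; } ToySort;

/* Read the tag through a pointer to the first member. */
#define toyNodeTag(ptr)       (*(const ToyTag *) (ptr))
#define ToyIsA(ptr, _type_)   (toyNodeTag(ptr) == T_##_type_)

/* Checked downcast: asserts the tag before converting the pointer. */
static const ToySeqScan *toy_cast_seqscan(const void *node)
{
    assert(ToyIsA(node, ToySeqScan));
    return (const ToySeqScan *) node;
}

/*
 * Dispatch on the tag: returns the relation index for a scan,
 * the column count for a sort, and -1 for anything else.
 */
static int toy_summarize(const void *node)
{
    switch (toyNodeTag(node)) {
        case T_ToySeqScan:
            return (int) toy_cast_seqscan(node)->scanrelid;
        case T_ToySort:
            return ((const ToySort *) node)->numCols;
        default:
            return -1;
    }
}
```

A tag switch like `toy_summarize` is the same shape as the large dispatch functions you will meet in the executor and in the node copy/compare code.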
/* PostgreSQL uses its own List type (pg_list.h) rather than raw C arrays */
List *mylist = NIL; /* Empty list */
/* Add to list */
mylist = lappend(mylist, item); /* Append */
mylist = lcons(item, mylist); /* Prepend */
mylist = list_concat(list1, list2); /* Concatenate */
/* Iterate */
ListCell *lc;
foreach(lc, mylist) {
    Node *item = lfirst(lc);
    /* or lfirst_int(lc) for int lists */
}
/* Get by index */
Node *third = list_nth(mylist, 2);
/* List length */
int len = list_length(mylist);
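The calling pattern above (an empty list is a null pointer, and every append returns the possibly reallocated list) is easy to model on its own. The following sketch uses a resizable pointer array, an approach newer PostgreSQL releases also take internally. The `toy_` names are hypothetical and none of this is the real `pg_list.h` code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef struct ToyList {
    int    length;      /* number of elements in use */
    int    capacity;    /* allocated slots */
    void **elements;    /* pointer payloads */
} ToyList;

#define TOY_NIL ((ToyList *) NULL)   /* the empty list is a null pointer */

static int toy_list_length(const ToyList *list)
{
    return list ? list->length : 0;
}

/* Append datum; callers must use the returned pointer, as with lappend(). */
static ToyList *toy_lappend(ToyList *list, void *datum)
{
    if (list == TOY_NIL) {
        list = malloc(sizeof(ToyList));
        if (list == NULL)
            abort();
        list->length = 0;
        list->capacity = 4;
        list->elements = malloc(list->capacity * sizeof(void *));
        if (list->elements == NULL)
            abort();
    } else if (list->length == list->capacity) {
        list->capacity *= 2;
        list->elements = realloc(list->elements,
                                 list->capacity * sizeof(void *));
        if (list->elements == NULL)
            abort();
    }
    list->elements[list->length++] = datum;
    return list;
}

/* Zero-based element access with a bounds check. */
static void *toy_list_nth(const ToyList *list, int n)
{
    assert(n >= 0 && n < toy_list_length(list));
    return list->elements[n];
}

static void toy_list_free(ToyList *list)
{
    if (list != TOY_NIL) {
        free(list->elements);
        free(list);
    }
}

/* Index-based iteration helper, loosely analogous to foreach(). */
#define toy_foreach(i, list) \
    for ((i) = 0; (i) < toy_list_length(list); (i)++)
```

Forgetting to assign the result of an append back to the list variable is a classic bug with this style of API, in the toy and in the real one.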
13.4 Key Data Structures
Relation (Table)
/* RelationData represents an open relation (table/index).
 * Simplified here; the full definition is in src/include/utils/rel.h. */
typedef struct RelationData {
    RelFileNode rd_node; /* Physical file identifier (rd_locator in PostgreSQL 16+) */
    Form_pg_class rd_rel; /* pg_class tuple */
    TupleDesc rd_att; /* Tuple descriptor */
    Oid rd_id; /* Relation OID */
    /* Index info (if index) */
    Form_pg_index rd_index;
    /* Cached info */
    bytea *rd_options; /* Parsed reloptions */
    /* ... many more fields */
} RelationData;
typedef RelationData *Relation;
/* Open/close relations (table_open is available from PostgreSQL 12) */
Relation rel = table_open(relid, AccessShareLock);
/* ... use relation ... */
table_close(rel, AccessShareLock);
HeapTuple
/* HeapTuple = pointer to tuple data + header */
typedef struct HeapTupleData {
    uint32 t_len; /* Length of tuple */
    ItemPointerData t_self; /* TID (block, offset) */
    Oid t_tableOid; /* Table OID */
    HeapTupleHeader t_data; /* -> tuple header+data */
} HeapTupleData;
typedef HeapTupleData *HeapTuple;
/* Access tuple attributes */
bool isnull;
Datum value = heap_getattr(tuple, attnum, tupdesc, &isnull);
/* Build new tuple */
HeapTuple newtup = heap_form_tuple(tupdesc, values, nulls);
Buffer
/* Buffer = index into shared buffer pool */
typedef int Buffer;
/* Buffer operations (read path) */
Buffer buf = ReadBuffer(rel, blocknum);
LockBuffer(buf, BUFFER_LOCK_SHARE);
Page page = BufferGetPage(buf);
/* ... access page data ... */
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
ReleaseBuffer(buf);
/* For writes: lock with BUFFER_LOCK_EXCLUSIVE, modify the page, and call
 * MarkBufferDirty() while the buffer is still pinned and exclusively locked */
MarkBufferDirty(buf);
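The ordering rules for buffers (pin first, then lock, dirty only under an exclusive lock, unlock before unpinning) can be captured in a tiny single-process model. This is a sketch with hypothetical `toy_` names; real buffers are shared between backends and use atomic state plus lightweight locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef enum ToyLockMode { TOY_UNLOCKED, TOY_SHARE, TOY_EXCLUSIVE } ToyLockMode;

typedef struct ToyBuffer {
    int         pin_count;  /* >0 means the page must stay in memory */
    ToyLockMode lock;       /* content lock held by the single user */
    bool        dirty;      /* page differs from its on-disk copy */
} ToyBuffer;

/* Pinning keeps the page from being evicted while we use it. */
static void toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

/* A content lock may only be taken or released on a pinned buffer. */
static void toy_lock(ToyBuffer *buf, ToyLockMode mode)
{
    assert(buf->pin_count > 0);
    buf->lock = mode;
}

/* Marking dirty is refused unless the buffer is pinned and exclusively locked. */
static bool toy_mark_dirty(ToyBuffer *buf)
{
    if (buf->pin_count == 0 || buf->lock != TOY_EXCLUSIVE)
        return false;
    buf->dirty = true;
    return true;
}

/* Release the pin; the content lock must already be released. */
static void toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0 && buf->lock == TOY_UNLOCKED);
    buf->pin_count--;
}

/* Only unpinned buffers are candidates for replacement. */
static bool toy_can_evict(const ToyBuffer *buf)
{
    return buf->pin_count == 0;
}
```

In this model a writer pins, locks exclusively, changes the page, marks it dirty, unlocks, and only then unpins.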
13.5 Tracing Code Flow
Example: SELECT Query Path
/* Entry point: tcop/postgres.c */
exec_simple_query(query_string)
│
├── pg_parse_query() /* parser/parser.c */
│ └── raw_parser()
│ └── base_yyparse() /* gram.y generated */
│
├── pg_analyze_and_rewrite()
│ ├── parse_analyze() /* parser/analyze.c */
│ └── pg_rewrite_query() /* rewrite/rewriteHandler.c */
│
├── pg_plan_queries()
│ └── planner() /* optimizer/plan/planner.c */
│ ├── subquery_planner()
│ └── create_plan()
│
└── PortalRun()
  └── ExecutorRun() /* executor/execMain.c */
    └── ExecutePlan()
      └── ExecProcNode() /* executor/execProcnode.c */
Using GDB
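# The postmaster forks one backend per connection, so attach GDB
# to the backend that serves your session rather than to the postmaster
# In a psql session, find the backend PID
psql -d postgres
postgres=# SELECT pg_backend_pid();
# In another terminal, attach GDB to that PID
gdb -p <pid>
# Set breakpoints
(gdb) break heap_insert
(gdb) break costsize.c:200
# Resume the backend
(gdb) continue
# Back in the psql session, run SQL
postgres=# INSERT INTO test VALUES (1);
# Back in GDB (stopped in heap_insert)
(gdb) backtrace # Show call stack
(gdb) print *tup # Inspect the HeapTuple argument
(gdb) p *relation->rd_rel # Inspect relation metadata
(gdb) next # Step over
(gdb) step # Step into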
13.6 Adding a Simple Feature
Example: Add New GUC Parameter
/* Step 1: Define the variable in src/backend/utils/misc/guc.c
 * (not static, because Step 3 exports it to other files) */
int my_new_setting = 100; /* default value */
/* Step 2: Add to ConfigureNamesInt array */
{
    {"my_new_setting", PGC_USERSET, CUSTOM_OPTIONS,
        gettext_noop("Description of my setting."),
        NULL
    },
    &my_new_setting,
    100, /* default */
    0, /* min */
    10000, /* max */
    NULL, NULL, NULL
},
/* Step 3: Declare extern in src/include/utils/guc.h */
extern int my_new_setting;
/* Step 4: Use in code */
if (my_new_setting > threshold) {
    /* do something */
}
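The table entry in Step 2 ties a name to a C variable together with its default and bounds. A stand-alone sketch of that idea follows. The struct layout, the `toy_` names, and the bounds shown for the second entry are all hypothetical; the real GUC machinery also handles contexts, units, hooks, and several value types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef struct ToyIntSetting {
    const char *name;       /* name used in SET / the config file */
    int        *variable;   /* C variable that holds the live value */
    int         boot_val;   /* default */
    int         min;        /* inclusive lower bound */
    int         max;        /* inclusive upper bound */
} ToyIntSetting;

static int toy_my_new_setting = 100;
static int toy_work_mem = 4096;

/* Illustrative table; the bounds for toy_work_mem are made up for the example. */
static ToyIntSetting toy_settings[] = {
    { "my_new_setting", &toy_my_new_setting, 100, 0, 10000 },
    { "work_mem", &toy_work_mem, 4096, 64, 1000000 },
};

/*
 * Look the setting up by name and apply the value if it is in range.
 * Returns 1 if applied, 0 for an unknown name or an out-of-range value
 * (in which case the variable is left unchanged).
 */
static int toy_set_int(const char *name, int value)
{
    size_t i;

    for (i = 0; i < sizeof(toy_settings) / sizeof(toy_settings[0]); i++) {
        ToyIntSetting *s = &toy_settings[i];

        if (strcmp(s->name, name) != 0)
            continue;
        if (value < s->min || value > s->max)
            return 0;
        *s->variable = value;
        return 1;
    }
    return 0;
}
```

Code elsewhere reads the plain C variable directly, which is why Step 4 in the walkthrough is just an ordinary comparison.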
Example: Add New SQL Function
/* Step 1: Create function in src/backend/utils/adt/myfunc.c */
#include "postgres.h"
#include "fmgr.h"
/* Built-in functions do not need PG_FUNCTION_INFO_V1; that macro is for loadable modules */
Datum
my_add_one(PG_FUNCTION_ARGS)
{
    int32 input = PG_GETARG_INT32(0);
    PG_RETURN_INT32(input + 1);
}
/* Step 2: Add to src/include/catalog/pg_proc.dat */
{ oid => '9999', proname => 'my_add_one',
  prorettype => 'int4', proargtypes => 'int4',
  prosrc => 'my_add_one' },
/* Step 3: Add to the OBJS list in src/backend/utils/adt/Makefile */
OBJS += myfunc.o
/* Step 4: Rebuild and run initdb to load the new catalog entry, or register it
 * manually on an existing cluster (STRICT so that NULL input yields NULL): */
CREATE FUNCTION my_add_one(int) RETURNS int AS 'my_add_one' LANGUAGE internal STRICT;
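Behind the `PG_GETARG_*` and `PG_RETURN_*` macros is a uniform calling convention: arguments and results travel as opaque word-sized values, and NULL handling for strict functions happens before the function body runs. The sketch below imitates that convention in isolation. Every `Toy` and `toy_` name is hypothetical, and the real `FunctionCallInfo` structure differs in layout and detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef uintptr_t ToyDatum;     /* opaque, word-sized value */

typedef struct ToyCallInfo {
    int      nargs;
    ToyDatum args[4];
    bool     argnull[4];        /* true if the matching argument is NULL */
    bool     isnull;            /* set when the result is NULL */
} ToyCallInfo;

typedef ToyDatum (*ToyFunction)(ToyCallInfo *fcinfo);

/* Conversions between a 32-bit integer and the opaque value. */
static inline ToyDatum toy_int32_get_datum(int32_t x)
{
    return (ToyDatum) (uint32_t) x;
}

static inline int32_t toy_datum_get_int32(ToyDatum d)
{
    return (int32_t) (uint32_t) d;
}

/* Counterpart of my_add_one: fetch argument 0, add one, wrap the result. */
static ToyDatum toy_add_one(ToyCallInfo *fcinfo)
{
    int32_t input = toy_datum_get_int32(fcinfo->args[0]);

    return toy_int32_get_datum(input + 1);
}

/*
 * Call fn with strict semantics: if any argument is NULL, skip the
 * function entirely and report a NULL result.
 */
static ToyDatum toy_call_strict(ToyFunction fn, ToyCallInfo *fcinfo)
{
    int i;

    fcinfo->isnull = false;
    for (i = 0; i < fcinfo->nargs; i++) {
        if (fcinfo->argnull[i]) {
            fcinfo->isnull = true;
            return (ToyDatum) 0;
        }
    }
    return fn(fcinfo);
}
```

With strict calling, the function body can assume its arguments are present, which is why `my_add_one` reads argument 0 without checking for NULL.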
13.7 Running Tests
# Core regression test suite
make check
# Specific test file
make check-tests TESTS="select"
# All test suites (core, contrib, TAP, isolation), run in parallel
make check-world -j4
# TAP tests (for utilities)
make check -C src/bin/psql
# Isolation tests (concurrency)
make check -C src/test/isolation
# Code coverage
./configure --enable-coverage
make check
make coverage-html
# View coverage/index.html
Writing Tests
-- src/test/regress/sql/mytest.sql
-- Test my_add_one function
SELECT my_add_one(5);
SELECT my_add_one(-1);
SELECT my_add_one(NULL);
-- Expected output in src/test/regress/expected/mytest.out
 my_add_one
------------
          6
(1 row)
 my_add_one
------------
          0
(1 row)
 my_add_one
------------

(1 row)
13.8 Key Source Files to Study
Area           Files                          Purpose
Query Entry    tcop/postgres.c                Main query loop
Parsing        parser/gram.y, scan.l          SQL syntax
Planning       optimizer/plan/planner.c       Plan generation
Cost Model     optimizer/path/costsize.c      Cost estimation
Execution      executor/execMain.c            Query execution
Buffer Pool    storage/buffer/bufmgr.c        Page caching
WAL            access/transam/xlog.c          Write-ahead log
Transactions   access/transam/xact.c          Transaction mgmt
Heap Access    access/heap/heapam.c           Table operations
Index Access   access/nbtree/nbtree.c         B-tree index
13.9 Practice Exercise
Exercise: Add Planning Time to EXPLAIN
Goal: Add “Planning Time: X ms” output to regular EXPLAIN (not just ANALYZE)
Steps:
1. Find where EXPLAIN output is generated (commands/explain.c)
2. Study the ExplainOnePlan() function
3. Add timing capture before/after planning in ExplainOneQuery()
4. Output the timing in ExplainPrintPlan()
5. Write regression tests
6. Submit the patch to pgsql-hackers
Files to modify:
src/backend/commands/explain.c
src/test/regress/sql/explain.sql
src/test/regress/expected/explain.out
Next Module
Module 14: Contributing to PostgreSQL. Submit your first patch to PostgreSQL.