Skip to main content
Catalog System Deep Dive - System catalogs and caching

Catalog System Deep Dive

This module explores PostgreSQL’s system catalog — the metadata that describes all database objects. Understanding the catalog is essential for extension development, performance tuning, and contributing to PostgreSQL.
Target Audience: Extension developers, DBAs, core contributors
Prerequisites: SQL foundations, basic PostgreSQL internals
Source Directories: src/backend/catalog/, src/include/catalog/
Interview Relevance: Staff+ database roles, extension development

Part 1: Catalog Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SYSTEM CATALOG ARCHITECTURE                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   System catalogs are regular tables stored in pg_catalog schema.           │
│   They describe ALL database objects: tables, columns, functions, etc.      │
│                                                                              │
│   Key Catalogs:                                                              │
│   ┌───────────────────────────────────────────────────────────────────────┐ │
│   │                                                                       │ │
│   │   pg_class ───────► Describes tables, indexes, views, sequences      │ │
│   │       │                                                               │ │
│   │       └───► pg_attribute ───► Columns of each relation               │ │
│   │       │                                                               │ │
│   │       └───► pg_index ───► Index information                          │ │
│   │                                                                       │ │
│   │   pg_type ───────► All data types                                    │ │
│   │       │                                                               │ │
│   │       └───► pg_operator ───► Operators for types                     │ │
│   │                                                                       │ │
│   │   pg_proc ───────► Functions and procedures                          │ │
│   │       │                                                               │ │
│   │       └───► pg_aggregate ───► Aggregate function details             │ │
│   │                                                                       │ │
│   │   pg_namespace ──► Schemas                                           │ │
│   │   pg_database ───► Databases                                         │ │
│   │   pg_tablespace ─► Tablespaces                                       │ │
│   │   pg_authid ─────► Roles/users                                       │ │
│   │                                                                       │ │
│   └───────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
│   Catalogs → Cache → Backend Access                                         │
│   ─────────────────────────────────────────────────────────────────────────  │
│   Direct catalog access is slow (disk I/O).                                 │
│   PostgreSQL caches catalog data in memory:                                 │
│                                                                              │
│   ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐        │
│   │   pg_class      │───►│  RelCache       │───►│  Relation       │        │
│   │   (on disk)     │    │  (per-backend)  │    │  (in memory)    │        │
│   └─────────────────┘    └─────────────────┘    └─────────────────┘        │
│                                                                              │
│   ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐        │
│   │   pg_type       │───►│  TypeCache      │───►│  TypeCacheEntry │        │
│   │   pg_operator   │    │  (per-backend)  │    │  (in memory)    │        │
│   └─────────────────┘    └─────────────────┘    └─────────────────┘        │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Part 2: Key System Catalogs

2.1 pg_class — Relations

-- pg_class stores ALL "relations": tables, indexes, views, sequences, etc.

SELECT 
    oid,                    -- Object ID (unique identifier)
    relname,                -- Relation name
    relnamespace,           -- Schema OID (pg_namespace.oid)
    reltype,                -- Row type OID (pg_type.oid)
    reloftype,              -- Typed table's type (0 if not typed)
    relowner,               -- Owner (pg_authid.oid)
    relam,                  -- Access method (pg_am.oid for indexes)
    relfilenode,            -- Physical file name
    reltablespace,          -- Tablespace (0 = default)
    relpages,               -- Size in pages (estimated by ANALYZE)
    reltuples,              -- Row count (estimated by ANALYZE)
    relallvisible,          -- Pages marked all-visible
    reltoastrelid,          -- TOAST table OID (if any)
    relhasindex,            -- Has indexes?
    relisshared,            -- Shared across all databases?
    relpersistence,         -- 'p'=permanent, 'u'=unlogged, 't'=temp
    relkind,                -- 'r'=table, 'i'=index, 'v'=view, etc.
    relnatts,               -- Number of columns
    relchecks,              -- Number of CHECK constraints
    relhasrules,            -- Has rules?
    relhastriggers,         -- Has triggers?
    relhassubclass,         -- Has child tables?
    relrowsecurity,         -- RLS enabled?
    relforcerowsecurity,    -- Force RLS even for owner?
    relreplident,           -- Replica identity
    relispartition,         -- Is a partition?
    relrewrite,             -- During ALTER TABLE rewrite
    relfrozenxid,           -- Oldest XID for vacuum
    relminmxid,             -- Oldest MultiXact
    relacl,                 -- Access privileges
    reloptions              -- Storage options (WITH clause)
FROM pg_class
WHERE relname = 'users';

-- relkind values:
-- 'r' = ordinary table
-- 'i' = index
-- 'S' = sequence
-- 't' = TOAST table
-- 'v' = view
-- 'm' = materialized view
-- 'c' = composite type
-- 'f' = foreign table
-- 'p' = partitioned table
-- 'I' = partitioned index

2.2 pg_attribute — Columns

-- pg_attribute describes columns of ALL relations

SELECT 
    attrelid,           -- Table OID (pg_class.oid)
    attname,            -- Column name
    atttypid,           -- Data type OID (pg_type.oid)
    attstattarget,      -- Statistics target (-1 = use default)
    attlen,             -- Type length (-1 = variable)
    attnum,             -- Column number (1-based for user cols)
    attndims,           -- Array dimensions (0 if not array)
    attcacheoff,        -- Cached offset (filled at runtime)
    atttypmod,          -- Type modifier (e.g., varchar(100))
    attbyval,           -- Passed by value? (else by reference)
    attstorage,         -- Storage strategy (TOAST)
    attalign,           -- Alignment requirement
    attnotnull,         -- NOT NULL constraint?
    atthasdef,          -- Has DEFAULT?
    atthasmissing,      -- Has missing value for ALTER?
    attidentity,        -- 'a'=always, 'd'=by default, ''=none
    attgenerated,       -- 's'=stored generated, ''=none
    attisdropped,       -- Column was dropped (invisible)
    attislocal,         -- Defined locally (not inherited)
    attinhcount,        -- Number of inheritance ancestors
    attcollation,       -- Collation OID
    attacl,             -- Column-level privileges
    attoptions,         -- Attribute options
    attfdwoptions       -- FDW-specific options
FROM pg_attribute
WHERE attrelid = 'users'::regclass
  AND attnum > 0        -- Skip system columns
  AND NOT attisdropped  -- Skip dropped columns
ORDER BY attnum;

-- System columns (attnum < 0):
-- -1 = ctid (tuple ID)
-- -2 = xmin (insert transaction)
-- -3 = cmin (insert command)
-- -4 = xmax (delete transaction)
-- -5 = cmax (delete command)
-- -6 = tableoid (table OID)

2.3 pg_type — Data Types

-- pg_type defines ALL data types

SELECT 
    oid,
    typname,            -- Type name
    typnamespace,       -- Schema
    typowner,           -- Owner
    typlen,             -- Size (-1=variable, -2=cstring)
    typbyval,           -- Pass by value?
    typtype,            -- 'b'=base, 'c'=composite, 'd'=domain, etc.
    typcategory,        -- Category for parser (S=string, N=numeric, etc.)
    typispreferred,     -- Preferred in category?
    typisdefined,       -- Is defined?
    typdelim,           -- Array delimiter
    typrelid,           -- For composite: pg_class OID
    typelem,            -- For array: element type
    typarray,           -- Array type of this type
    typinput,           -- Input function
    typoutput,          -- Output function
    typreceive,         -- Binary input
    typsend,            -- Binary output
    typmodin,           -- Type modifier input
    typmodout,          -- Type modifier output
    typanalyze,         -- Custom analyze function
    typalign,           -- Alignment requirement
    typstorage,         -- TOAST strategy
    typnotnull,         -- For domains: NOT NULL?
    typbasetype,        -- For domains: base type
    typtypmod,          -- For domains: typmod
    typndims,           -- For domains: array dimensions
    typcollation,       -- Collation
    typdefaultbin,      -- Default expression (nodeToString)
    typdefault,         -- Default as text
    typacl              -- Access privileges
FROM pg_type
WHERE typname = 'int4';

-- typtype values:
-- 'b' = base type
-- 'c' = composite type (row type)
-- 'd' = domain
-- 'e' = enum
-- 'p' = pseudo-type (any, record, trigger, etc.)
-- 'r' = range type
-- 'm' = multirange type

2.4 pg_proc — Functions

-- pg_proc stores functions, procedures, and aggregates

SELECT 
    oid,
    proname,            -- Function name
    pronamespace,       -- Schema
    proowner,           -- Owner
    prolang,            -- Language (pg_language.oid)
    procost,            -- Estimated execution cost
    prorows,            -- Estimated rows (for set-returning)
    provariadic,        -- Variadic argument type (0=none)
    prosupport,         -- Planner support function
    prokind,            -- 'f'=function, 'p'=procedure, 'a'=aggregate, 'w'=window
    prosecdef,          -- Security definer?
    proleakproof,       -- Leak-proof?
    proisstrict,        -- Strict (NULL in = NULL out)?
    proretset,          -- Returns set?
    provolatile,        -- 'i'=immutable, 's'=stable, 'v'=volatile
    proparallel,        -- 's'=safe, 'r'=restricted, 'u'=unsafe
    pronargs,           -- Number of arguments
    pronargdefaults,    -- Number of arguments with defaults
    prorettype,         -- Return type OID
    proargtypes,        -- Argument types (oidvector)
    proallargtypes,     -- All arg types (including OUT)
    proargmodes,        -- Argument modes (i=IN, o=OUT, b=INOUT, v=VARIADIC)
    proargnames,        -- Argument names
    proargdefaults,     -- Default expressions
    protrftypes,        -- Transform types
    prosrc,             -- Source code or link symbol
    probin,             -- Shared library path
    proconfig,          -- proconfig options
    proacl              -- Access privileges
FROM pg_proc
WHERE proname = 'array_agg';

Part 3: Catalog Caching

3.1 System Cache (SysCache)

/* src/backend/utils/cache/syscache.c */

/*
 * SysCache provides hash-based lookup of catalog tuples.
 * Each cache is keyed by one or more columns.
 */

/* Cache identifiers */
typedef enum SysCacheIdentifier
{
    AMNAME,             /* pg_am by name */
    AMOID,              /* pg_am by OID */
    ATTNAME,            /* pg_attribute by (relid, name) */
    ATTNUM,             /* pg_attribute by (relid, attnum) */
    AUTHMEMROLEMEM,     /* pg_auth_members by (role, member) */
    AUTHNAME,           /* pg_authid by name */
    AUTHOID,            /* pg_authid by OID */
    CLAOID,             /* pg_class by OID */
    CLARELNAME,         /* pg_class by (namespace, name) */
    PROCNAMEARGSNSP,    /* pg_proc by (name, argtypes, namespace) */
    PROCOID,            /* pg_proc by OID */
    TYPEOID,            /* pg_type by OID */
    TYPENAME,           /* pg_type by (namespace, name) */
    /* ... many more ... */
} SysCacheIdentifier;

/* Looking up in syscache */
HeapTuple
SearchSysCache1(int cacheId, Datum key1)
{
    /* Returns cached copy of catalog tuple */
    /* Caller must call ReleaseSysCache() when done */
}

/* Example: Look up type by OID */
HeapTuple tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typeId));
if (HeapTupleIsValid(tuple))
{
    Form_pg_type typeForm = (Form_pg_type) GETSTRUCT(tuple);
    char *typname = NameStr(typeForm->typname);
    /* ... use type info ... */
    ReleaseSysCache(tuple);  /* Must release! */
}

/* Convenience macros */
Form_pg_class classForm = (Form_pg_class) GETSTRUCT(
    SearchSysCache1(CLAOID, ObjectIdGetDatum(relid)));

/* Or use GetSysCacheOid which returns just the OID */
Oid typeId = GetSysCacheOid2(TYPENAME,
                             Anum_pg_type_oid,
                             CStringGetDatum(typname),
                             ObjectIdGetDatum(namespaceId));

3.2 Relation Cache (RelCache)

/* src/backend/utils/cache/relcache.c */

/*
 * RelCache stores open relation descriptors.
 * Much more than just pg_class tuple - includes all relation metadata.
 */

typedef struct RelationData
{
    /* Basic info */
    Oid         rd_id;              /* Relation OID */
    Form_pg_class rd_rel;           /* COPY of pg_class row */
    TupleDesc   rd_att;             /* Tuple descriptor */
    
    /* Storage info */
    RelFileNode rd_node;            /* Physical storage identifiers */
    SMgrRelation rd_smgr;           /* Storage manager handle */
    
    /* Indexes */
    Oid        *rd_indexlist;       /* OIDs of indexes on relation */
    int         rd_indexvalid;      /* Is index list valid? */
    
    /* Constraints */
    Bitmapset  *rd_keyattr;         /* Key column numbers */
    
    /* Partition info */
    PartitionKey rd_partkey;        /* Partition key */
    PartitionDesc rd_partdesc;      /* Partition descriptor */
    
    /* Trigger info */
    TriggerDesc *trigdesc;          /* Trigger data */
    
    /* RLS info */
    RowSecurityDesc *rd_rsdesc;     /* Row security policies */
    
    /* Statistics */
    int32       rd_rel->relpages;   /* Size estimate */
    float4      rd_rel->reltuples;  /* Row estimate */
    
    /* ... many more fields ... */
} RelationData;

/* Opening a relation */
Relation
table_open(Oid relationId, LOCKMODE lockmode)
{
    Relation    rel;
    
    /* Look up in relcache */
    rel = RelationIdGetRelation(relationId);
    
    if (!RelationIsValid(rel))
        ereport(ERROR, ...);
    
    /* Acquire the requested lock */
    LockRelationOid(relationId, lockmode);
    
    return rel;
}

/* Closing a relation */
void
table_close(Relation relation, LOCKMODE lockmode)
{
    /* Release lock if we acquired one */
    if (lockmode != NoLock)
        UnlockRelationOid(relation->rd_id, lockmode);
    
    /* Decrement reference count */
    RelationClose(relation);
}

3.3 Cache Invalidation

/*
 * When catalog changes, all backends must invalidate cached copies.
 * PostgreSQL uses "sinval" (shared invalidation) messages.
 */

/* Invalidation message types */
typedef enum
{
    SHAREDINVALCATALOG_ID,      /* Catalog tuple changed */
    SHAREDINVALRELCACHE_ID,     /* Relation cache entry invalid */
    SHAREDINVALSMGR_ID,         /* Physical file changed */
    SHAREDINVALRELMAP_ID,       /* Relfilenode mapping changed */
    SHAREDINVALSNAPSHOT_ID      /* Snapshot-related invalidation */
} SharedInvalidationMessageType;

/* Sending invalidation */
void
CacheInvalidateHeapTuple(Relation relation, HeapTuple oldtuple, HeapTuple newtuple)
{
    /* Called when a catalog tuple is modified */
    
    /* Build invalidation message */
    SharedInvalidationMessage msg;
    msg.id = SHAREDINVALCATALOG_ID;
    msg.cat.catId = relation->rd_id;
    msg.cat.hashValue = GetTupleHashValue(...);
    
    /* Queue message for other backends */
    SendSharedInvalidMessages(&msg, 1);
}

/* Receiving invalidation */
void
InvalidateSystemCaches(void)
{
    /* Called at safe points (query boundaries, etc.) */
    
    /* Process pending invalidation messages */
    AcceptInvalidationMessages();
    
    /* Each message triggers cache cleanup */
    /* e.g., syscache entries removed, relcache entries rebuilt */
}

/*
 * Invalidation timing:
 *
 * 1. Transaction modifies pg_class
 * 2. At COMMIT: invalidation message queued
 * 3. Other backends check for messages at:
 *    - Start of each command
 *    - Lock acquisition
 *    - Explicit AcceptInvalidationMessages()
 * 4. Backend processes message, invalidates cache
 * 5. Next access rebuilds cache from disk
 */

Part 4: Catalog Access Patterns

4.1 Reading Catalog Information

-- Using information_schema (SQL standard, but slow)
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'users';

-- Using pg_catalog (faster, more complete)
SELECT 
    a.attname,
    pg_catalog.format_type(a.atttypid, a.atttypmod) as data_type,
    NOT a.attnotnull as is_nullable
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = 'users'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped;

-- Type resolution with regtype
SELECT 'int4'::regtype::oid;           -- 23
SELECT 23::regtype;                     -- integer
SELECT 'int4[]'::regtype;               -- integer[]

-- Relation resolution with regclass
SELECT 'users'::regclass::oid;          -- Relation OID
SELECT 'public.users'::regclass;        -- Schema-qualified

-- Function resolution with regprocedure
SELECT 'array_agg(anynonarray)'::regprocedure::oid;

4.2 Catalog Modification

/* Extension creating a new type */

/* Step 1: Build tuple for pg_type */
Datum values[Natts_pg_type];
bool nulls[Natts_pg_type];

memset(nulls, false, sizeof(nulls));

values[Anum_pg_type_typname - 1] = NameGetDatum(&typeName);
values[Anum_pg_type_typnamespace - 1] = ObjectIdGetDatum(namespace);
values[Anum_pg_type_typowner - 1] = ObjectIdGetDatum(GetUserId());
values[Anum_pg_type_typlen - 1] = Int16GetDatum(sizeof(MyType));
values[Anum_pg_type_typbyval - 1] = BoolGetDatum(true);
values[Anum_pg_type_typtype - 1] = CharGetDatum(TYPTYPE_BASE);
/* ... set all required fields ... */

/* Step 2: Insert into pg_type */
Relation typeRel = table_open(TypeRelationId, RowExclusiveLock);

HeapTuple tuple = heap_form_tuple(RelationGetDescr(typeRel),
                                  values, nulls);

/* Assign OID */
Oid typeOid = GetNewOidWithIndex(typeRel, TypeOidIndexId,
                                  Anum_pg_type_oid);
HeapTupleSetOid(tuple, typeOid);

CatalogTupleInsert(typeRel, tuple);

table_close(typeRel, RowExclusiveLock);

/* Step 3: Create dependency records */
ObjectAddress myself, referenced;

myself.classId = TypeRelationId;
myself.objectId = typeOid;
myself.objectSubId = 0;

referenced.classId = NamespaceRelationId;
referenced.objectId = namespace;
referenced.objectSubId = 0;

recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);

4.3 Using Catalog for Query Planning

/* Planner accessing catalog info */

/* Get relation statistics */
void
get_relation_info(PlannerInfo *root, Oid relationOid,
                  bool inhparent, RelOptInfo *rel)
{
    Relation    relation;
    
    relation = table_open(relationOid, NoLock);  /* Already locked */
    
    /* Get size estimates */
    rel->pages = relation->rd_rel->relpages;
    rel->tuples = relation->rd_rel->reltuples;
    rel->allvisfrac = (double) relation->rd_rel->relallvisible / 
                      (double) relation->rd_rel->relpages;
    
    /* Get index info */
    if (relation->rd_rel->relhasindex)
    {
        List *indexoidlist = RelationGetIndexList(relation);
        
        foreach(l, indexoidlist)
        {
            Oid indexoid = lfirst_oid(l);
            /* Build IndexOptInfo for each index */
            get_index_info(root, indexoid, inhparent, rel);
        }
    }
    
    table_close(relation, NoLock);
}

/* Get column statistics */
void
examine_variable(PlannerInfo *root, Node *node, int varRelid,
                 VariableStatData *vardata)
{
    /* Look up pg_statistic for the column */
    HeapTuple statstuple = SearchSysCache3(STATRELATTINH,
                                           ObjectIdGetDatum(relid),
                                           Int16GetDatum(attnum),
                                           BoolGetDatum(false));
    
    if (HeapTupleIsValid(statstuple))
    {
        /* Extract selectivity estimates, histograms, etc. */
        Form_pg_statistic stats = (Form_pg_statistic) GETSTRUCT(statstuple);
        vardata->stadistinct = stats->stadistinct;
        /* ... */
        ReleaseSysCache(statstuple);
    }
}

Part 5: Statistics (pg_statistic)

5.1 Understanding pg_statistic

-- pg_statistic stores column statistics for query planning

-- Raw access (complex structure)
SELECT * FROM pg_statistic WHERE starelid = 'users'::regclass;

-- Better: Use pg_stats view
SELECT 
    attname,
    null_frac,          -- Fraction of NULL values
    avg_width,          -- Average column width in bytes
    n_distinct,         -- Number of distinct values (or ratio if negative)
    most_common_vals,   -- Most common values
    most_common_freqs,  -- Frequencies of most common values
    histogram_bounds,   -- Bounds for histogram buckets
    correlation         -- Physical vs logical ordering correlation
FROM pg_stats
WHERE tablename = 'users';

-- n_distinct interpretation:
-- > 0: Estimated number of distinct values
-- -1: All values are distinct (unique)
-- < 0: abs(n_distinct) is fraction of rows that are distinct

-- correlation interpretation:
-- 1.0: Perfect correlation (values in physical order)
-- 0.0: No correlation (random order)
-- -1.0: Perfect anti-correlation (reverse order)
-- High correlation means index range scans are efficient

5.2 Extended Statistics

-- For correlated columns, single-column stats aren't enough

-- Create extended statistics for correlation
CREATE STATISTICS users_city_country_stats (dependencies, ndistinct, mcv)
ON city, country FROM users;

-- Force ANALYZE to compute
ANALYZE users;

-- View extended statistics
SELECT 
    stxname,
    stxkeys,            -- Column numbers
    stxkind,            -- Statistic kinds
    stxndistinct,       -- N-distinct for column combinations
    stxdependencies,    -- Functional dependencies
    stxexprs            -- Expression stats
FROM pg_statistic_ext_data sed
JOIN pg_statistic_ext se ON sed.stxoid = se.oid
WHERE se.stxrelid = 'users'::regclass;

-- Why extended stats matter:
-- Query: WHERE city = 'Paris' AND country = 'France'
-- Without extended stats: Planner assumes independence
--   selectivity = sel(city='Paris') * sel(country='France')
--   Might estimate 0.01 * 0.05 = 0.0005 (too low!)
-- With extended stats: Planner knows they're correlated
--   Correctly estimates rows where city=Paris implies country=France

Part 6: Dependency Tracking

6.1 pg_depend

-- pg_depend tracks dependencies between database objects

SELECT 
    classid::regclass as dependent_type,
    objid,
    objsubid,           -- 0=whole object, >0=column number
    refclassid::regclass as referenced_type,
    refobjid,
    refobjsubid,
    deptype             -- Dependency type
FROM pg_depend
WHERE refobjid = 'users'::regclass::oid
LIMIT 10;

-- deptype values:
-- 'n' = NORMAL: Drop of referenced obj drops dependent obj
-- 'a' = AUTO: Automatic dependency (created by system)
-- 'i' = INTERNAL: Dependent obj is component of referenced
-- 'P' = PARTITION_PRI: Partition primary dependency
-- 'S' = PARTITION_SEC: Partition secondary dependency
-- 'e' = EXTENSION: Dependent obj is member of extension

-- Find what depends on a table
SELECT 
    CASE classid
        WHEN 'pg_constraint'::regclass THEN 'constraint'
        WHEN 'pg_trigger'::regclass THEN 'trigger'
        WHEN 'pg_rewrite'::regclass THEN 'view/rule'
        WHEN 'pg_class'::regclass THEN 'index/sequence'
        ELSE classid::regclass::text
    END as type,
    objid::regclass::text as dependent_object
FROM pg_depend
WHERE refclassid = 'pg_class'::regclass
  AND refobjid = 'users'::regclass::oid
  AND deptype = 'n';

6.2 Using Dependencies in Code

/* Recording a dependency */
void
recordDependencyOn(const ObjectAddress *depender,
                   const ObjectAddress *referenced,
                   DependencyType behavior)
{
    /* Insert into pg_depend */
    Relation dependDesc = table_open(DependRelationId, RowExclusiveLock);
    
    Datum values[Natts_pg_depend];
    bool nulls[Natts_pg_depend];
    
    values[Anum_pg_depend_classid - 1] = ObjectIdGetDatum(depender->classId);
    values[Anum_pg_depend_objid - 1] = ObjectIdGetDatum(depender->objectId);
    values[Anum_pg_depend_objsubid - 1] = Int32GetDatum(depender->objectSubId);
    values[Anum_pg_depend_refclassid - 1] = ObjectIdGetDatum(referenced->classId);
    values[Anum_pg_depend_refobjid - 1] = ObjectIdGetDatum(referenced->objectId);
    values[Anum_pg_depend_refobjsubid - 1] = Int32GetDatum(referenced->objectSubId);
    values[Anum_pg_depend_deptype - 1] = CharGetDatum((char) behavior);
    
    CatalogTupleInsert(dependDesc, heap_form_tuple(...));
    table_close(dependDesc, RowExclusiveLock);
}

/* Checking dependencies before DROP */
ObjectAddresses *
findDependentObjects(const ObjectAddress *object,
                     int flags,
                     ObjectAddresses *pendingObjects,
                     const ObjectAddresses *alreadyDeleted)
{
    /* Recursively find all dependent objects */
    /* Returns list for cascade delete or error for restrict */
}

/* Example: Why DROP TABLE CASCADE works */
/*
 * 1. Find all objects depending on table
 * 2. For each NORMAL dependency: plan to drop
 * 3. Recursively find dependencies of those
 * 4. Sort by dependency order
 * 5. Drop in order (dependents first)
 */

Part 7: Catalog Access in Extensions

7.1 Safe Catalog Reading

/* Extension reading catalog safely */

#include "catalog/pg_type.h"
#include "utils/syscache.h"
#include "utils/lsyscache.h"

/* Use lsyscache.h convenience functions when possible */
char *
get_type_name(Oid typid)
{
    /* Convenience function - handles cache internally */
    return format_type_be(typid);
}

/* Or use syscache directly */
Oid
get_type_input_function(Oid typid)
{
    HeapTuple   tp;
    Form_pg_type typtup;
    Oid         result;
    
    tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
    if (!HeapTupleIsValid(tp))
        elog(ERROR, "cache lookup failed for type %u", typid);
    
    typtup = (Form_pg_type) GETSTRUCT(tp);
    result = typtup->typinput;
    
    ReleaseSysCache(tp);  /* Don't forget! */
    
    return result;
}

/* Scanning a catalog table */
void
scan_user_tables(void)
{
    Relation    rel;
    TableScanDesc scan;
    HeapTuple   tuple;
    
    rel = table_open(RelationRelationId, AccessShareLock);
    scan = table_beginscan_catalog(rel, 0, NULL);
    
    while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
        
        if (classForm->relkind == RELKIND_RELATION &&
            classForm->relnamespace != PG_CATALOG_NAMESPACE)
        {
            elog(NOTICE, "Found table: %s", NameStr(classForm->relname));
        }
    }
    
    table_endscan(scan);
    table_close(rel, AccessShareLock);
}

7.2 Catalog Modification in Extensions

/* Creating a custom catalog table (advanced) */

/* 
 * Extensions typically use regular tables, not catalogs.
 * But some need true catalog tables for:
 * - Integration with DROP CASCADE
 * - Visibility in pg_dump
 * - Participation in cache invalidation
 */

void
CreateMyExtensionCatalog(void)
{
    Relation    rel;
    TupleDesc   tupdesc;
    Oid         relid;
    
    /* Define columns */
    tupdesc = CreateTemplateTupleDesc(3);
    TupleDescInitEntry(tupdesc, 1, "id", OIDOID, -1, 0);
    TupleDescInitEntry(tupdesc, 2, "name", TEXTOID, -1, 0);
    TupleDescInitEntry(tupdesc, 3, "value", INT4OID, -1, 0);
    
    /* Create the table */
    relid = heap_create_with_catalog(
        "my_catalog",           /* Table name */
        PG_CATALOG_NAMESPACE,   /* Put in pg_catalog */
        InvalidOid,             /* Tablespace */
        InvalidOid,             /* Type OID (auto-assign) */
        InvalidOid,             /* Array type OID (auto-assign) */
        GetUserId(),            /* Owner */
        TableAmOid,             /* Heap access method */
        tupdesc,                /* Tuple descriptor */
        NIL,                    /* Column defaults */
        RELKIND_RELATION,       /* Relation kind */
        RELPERSISTENCE_PERMANENT,
        false,                  /* Not shared */
        false,                  /* Not mapped */
        ONCOMMIT_NOOP,          /* On commit action */
        NULL,                   /* Reloptions */
        false,                  /* Use user's acl */
        true,                   /* Allow system columns */
        false,                  /* Is internal */
        InvalidOid,             /* relrewrite */
        NULL                    /* typaddress */
    );
    
    /* Create index on id */
    /* ... */
    
    /* Record in pg_extension_config_dump for pg_dump */
    /* ... */
}

Part 8: Interview Questions

Answer:Catalog changes:
  1. Insert new row into pg_attribute with new column info
  2. Update pg_class.relnatts (increment column count)
  3. If column has default: update pg_attrdef
  4. If column has constraint: update pg_constraint
Cache invalidation: 5. Send invalidation message for the relation 6. All backends will rebuild their RelCache entryData changes (depends on column type):
  • No default, nullable: No data changes! Just metadata.
  • With constant default (PG11+): Store default in catalog, compute on read
  • With volatile default or old versions: Rewrite entire table
The key insight: Adding a nullable column without default is nearly instant regardless of table size, because it’s metadata-only.
Answer:
  1. Query pg_depend: Find all objects with NORMAL dependency on the table
SELECT * FROM pg_depend 
WHERE refobjid = 'mytable'::regclass 
AND deptype = 'n';
  1. Recursive expansion: For each dependent object, find its dependents
  2. Topological sort: Order objects so dependents are dropped before what they depend on
  3. Execute drops: Drop in sorted order
Dependency types:
  • Views depending on table
  • Foreign key constraints
  • Triggers on the table
  • Indexes
  • Policies (RLS)
  • Sequences (if SERIAL)
Without CASCADE: Report error listing dependents
Answer:Per-backend caching advantages:
  1. No locking overhead: Reading cached data requires no locks. Shared cache would need read locks for every access.
  2. Transaction isolation: Each backend sees catalog as of its snapshot. Shared cache would need version management.
  3. Simplicity: No complex shared data structure. Just hash tables in local memory.
  4. Cache warmth: Each backend caches what it needs. Working sets often differ.
Trade-offs:
  1. Memory usage: Same data cached multiple times
    • Mitigated: Catalog data is small compared to data pages
  2. Cache invalidation: Must propagate changes
    • Solved: Shared invalidation message queue (sinval)
  3. Cold start: New backends start with empty cache
    • Mitigated: Relation cache init file preloads common entries
The key insight: For catalog data, avoiding lock contention is worth the memory duplication.

Next Steps