Catalog System Deep Dive
This module explores PostgreSQL’s system catalog — the metadata that describes all database objects. Understanding the catalog is essential for extension development, performance tuning, and contributing to PostgreSQL.Target Audience: Extension developers, DBAs, core contributors
Prerequisites: SQL foundations, basic PostgreSQL internals
Source Directories:
Interview Relevance: Staff+ database roles, extension development
Prerequisites: SQL foundations, basic PostgreSQL internals
Source Directories:
src/backend/catalog/, src/include/catalog/Interview Relevance: Staff+ database roles, extension development
Part 1: Catalog Architecture Overview
Copy
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYSTEM CATALOG ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ System catalogs are regular tables stored in pg_catalog schema. │
│ They describe ALL database objects: tables, columns, functions, etc. │
│ │
│ Key Catalogs: │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ pg_class ───────► Describes tables, indexes, views, sequences │ │
│ │ │ │ │
│ │ └───► pg_attribute ───► Columns of each relation │ │
│ │ │ │ │
│ │ └───► pg_index ───► Index information │ │
│ │ │ │
│ │ pg_type ───────► All data types │ │
│ │ │ │ │
│ │ └───► pg_operator ───► Operators for types │ │
│ │ │ │
│ │ pg_proc ───────► Functions and procedures │ │
│ │ │ │ │
│ │ └───► pg_aggregate ───► Aggregate function details │ │
│ │ │ │
│ │ pg_namespace ──► Schemas │ │
│ │ pg_database ───► Databases │ │
│ │ pg_tablespace ─► Tablespaces │ │
│ │ pg_authid ─────► Roles/users │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
│ Catalogs → Cache → Backend Access │
│ ───────────────────────────────────────────────────────────────────────── │
│ Direct catalog access is slow (disk I/O). │
│ PostgreSQL caches catalog data in memory: │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ pg_class │───►│ RelCache │───►│ Relation │ │
│ │ (on disk) │ │ (per-backend) │ │ (in memory) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ pg_type │───►│ TypeCache │───►│ TypeCacheEntry │ │
│ │ pg_operator │ │ (per-backend) │ │ (in memory) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Part 2: Key System Catalogs
2.1 pg_class — Relations
Copy
-- pg_class stores ALL "relations": tables, indexes, views, sequences, etc.
SELECT
oid, -- Object ID (unique identifier)
relname, -- Relation name
relnamespace, -- Schema OID (pg_namespace.oid)
reltype, -- Row type OID (pg_type.oid)
reloftype, -- Typed table's type (0 if not typed)
relowner, -- Owner (pg_authid.oid)
relam, -- Access method (pg_am.oid for indexes)
relfilenode, -- Physical file name
reltablespace, -- Tablespace (0 = default)
relpages, -- Size in pages (estimated by ANALYZE)
reltuples, -- Row count (estimated by ANALYZE)
relallvisible, -- Pages marked all-visible
reltoastrelid, -- TOAST table OID (if any)
relhasindex, -- Has indexes?
relisshared, -- Shared across all databases?
relpersistence, -- 'p'=permanent, 'u'=unlogged, 't'=temp
relkind, -- 'r'=table, 'i'=index, 'v'=view, etc.
relnatts, -- Number of columns
relchecks, -- Number of CHECK constraints
relhasrules, -- Has rules?
relhastriggers, -- Has triggers?
relhassubclass, -- Has child tables?
relrowsecurity, -- RLS enabled?
relforcerowsecurity, -- Force RLS even for owner?
relreplident, -- Replica identity
relispartition, -- Is a partition?
relrewrite, -- During ALTER TABLE rewrite
relfrozenxid, -- Oldest XID for vacuum
relminmxid, -- Oldest MultiXact
relacl, -- Access privileges
reloptions -- Storage options (WITH clause)
FROM pg_class
WHERE relname = 'users';
-- relkind values:
-- 'r' = ordinary table
-- 'i' = index
-- 'S' = sequence
-- 't' = TOAST table
-- 'v' = view
-- 'm' = materialized view
-- 'c' = composite type
-- 'f' = foreign table
-- 'p' = partitioned table
-- 'I' = partitioned index
2.2 pg_attribute — Columns
Copy
-- pg_attribute describes columns of ALL relations
SELECT
attrelid, -- Table OID (pg_class.oid)
attname, -- Column name
atttypid, -- Data type OID (pg_type.oid)
attstattarget, -- Statistics target (-1 = use default)
attlen, -- Type length (-1 = variable)
attnum, -- Column number (1-based for user cols)
attndims, -- Array dimensions (0 if not array)
attcacheoff, -- Cached offset (filled at runtime)
atttypmod, -- Type modifier (e.g., varchar(100))
attbyval, -- Passed by value? (else by reference)
attstorage, -- Storage strategy (TOAST)
attalign, -- Alignment requirement
attnotnull, -- NOT NULL constraint?
atthasdef, -- Has DEFAULT?
atthasmissing, -- Has missing value for ALTER?
attidentity, -- 'a'=always, 'd'=by default, ''=none
attgenerated, -- 's'=stored generated, ''=none
attisdropped, -- Column was dropped (invisible)
attislocal, -- Defined locally (not inherited)
attinhcount, -- Number of inheritance ancestors
attcollation, -- Collation OID
attacl, -- Column-level privileges
attoptions, -- Attribute options
attfdwoptions -- FDW-specific options
FROM pg_attribute
WHERE attrelid = 'users'::regclass
AND attnum > 0 -- Skip system columns
AND NOT attisdropped -- Skip dropped columns
ORDER BY attnum;
-- System columns (attnum < 0):
-- -1 = ctid (tuple ID)
-- -2 = xmin (insert transaction)
-- -3 = cmin (insert command)
-- -4 = xmax (delete transaction)
-- -5 = cmax (delete command)
-- -6 = tableoid (table OID)
2.3 pg_type — Data Types
Copy
-- pg_type defines ALL data types
SELECT
oid,
typname, -- Type name
typnamespace, -- Schema
typowner, -- Owner
typlen, -- Size (-1=variable, -2=cstring)
typbyval, -- Pass by value?
typtype, -- 'b'=base, 'c'=composite, 'd'=domain, etc.
typcategory, -- Category for parser (S=string, N=numeric, etc.)
typispreferred, -- Preferred in category?
typisdefined, -- Is defined?
typdelim, -- Array delimiter
typrelid, -- For composite: pg_class OID
typelem, -- For array: element type
typarray, -- Array type of this type
typinput, -- Input function
typoutput, -- Output function
typreceive, -- Binary input
typsend, -- Binary output
typmodin, -- Type modifier input
typmodout, -- Type modifier output
typanalyze, -- Custom analyze function
typalign, -- Alignment requirement
typstorage, -- TOAST strategy
typnotnull, -- For domains: NOT NULL?
typbasetype, -- For domains: base type
typtypmod, -- For domains: typmod
typndims, -- For domains: array dimensions
typcollation, -- Collation
typdefaultbin, -- Default expression (nodeToString)
typdefault, -- Default as text
typacl -- Access privileges
FROM pg_type
WHERE typname = 'int4';
-- typtype values:
-- 'b' = base type
-- 'c' = composite type (row type)
-- 'd' = domain
-- 'e' = enum
-- 'p' = pseudo-type (any, record, trigger, etc.)
-- 'r' = range type
-- 'm' = multirange type
2.4 pg_proc — Functions
Copy
-- pg_proc stores functions, procedures, and aggregates
SELECT
oid,
proname, -- Function name
pronamespace, -- Schema
proowner, -- Owner
prolang, -- Language (pg_language.oid)
procost, -- Estimated execution cost
prorows, -- Estimated rows (for set-returning)
provariadic, -- Variadic argument type (0=none)
prosupport, -- Planner support function
prokind, -- 'f'=function, 'p'=procedure, 'a'=aggregate, 'w'=window
prosecdef, -- Security definer?
proleakproof, -- Leak-proof?
proisstrict, -- Strict (NULL in = NULL out)?
proretset, -- Returns set?
provolatile, -- 'i'=immutable, 's'=stable, 'v'=volatile
proparallel, -- 's'=safe, 'r'=restricted, 'u'=unsafe
pronargs, -- Number of arguments
pronargdefaults, -- Number of arguments with defaults
prorettype, -- Return type OID
proargtypes, -- Argument types (oidvector)
proallargtypes, -- All arg types (including OUT)
proargmodes, -- Argument modes (i=IN, o=OUT, b=INOUT, v=VARIADIC)
proargnames, -- Argument names
proargdefaults, -- Default expressions
protrftypes, -- Transform types
prosrc, -- Source code or link symbol
probin, -- Shared library path
proconfig, -- proconfig options
proacl -- Access privileges
FROM pg_proc
WHERE proname = 'array_agg';
Part 3: Catalog Caching
3.1 System Cache (SysCache)
Copy
/* src/backend/utils/cache/syscache.c */
/*
* SysCache provides hash-based lookup of catalog tuples.
* Each cache is keyed by one or more columns.
*/
/* Cache identifiers */
typedef enum SysCacheIdentifier
{
AMNAME, /* pg_am by name */
AMOID, /* pg_am by OID */
ATTNAME, /* pg_attribute by (relid, name) */
ATTNUM, /* pg_attribute by (relid, attnum) */
AUTHMEMROLEMEM, /* pg_auth_members by (role, member) */
AUTHNAME, /* pg_authid by name */
AUTHOID, /* pg_authid by OID */
CLAOID, /* pg_class by OID */
CLARELNAME, /* pg_class by (namespace, name) */
PROCNAMEARGSNSP, /* pg_proc by (name, argtypes, namespace) */
PROCOID, /* pg_proc by OID */
TYPEOID, /* pg_type by OID */
TYPENAME, /* pg_type by (namespace, name) */
/* ... many more ... */
} SysCacheIdentifier;
/* Looking up in syscache */
HeapTuple
SearchSysCache1(int cacheId, Datum key1)
{
/* Returns cached copy of catalog tuple */
/* Caller must call ReleaseSysCache() when done */
}
/* Example: Look up type by OID */
HeapTuple tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typeId));
if (HeapTupleIsValid(tuple))
{
Form_pg_type typeForm = (Form_pg_type) GETSTRUCT(tuple);
char *typname = NameStr(typeForm->typname);
/* ... use type info ... */
ReleaseSysCache(tuple); /* Must release! */
}
/* Convenience macros */
Form_pg_class classForm = (Form_pg_class) GETSTRUCT(
SearchSysCache1(CLAOID, ObjectIdGetDatum(relid)));
/* Or use GetSysCacheOid which returns just the OID */
Oid typeId = GetSysCacheOid2(TYPENAME,
Anum_pg_type_oid,
CStringGetDatum(typname),
ObjectIdGetDatum(namespaceId));
3.2 Relation Cache (RelCache)
Copy
/* src/backend/utils/cache/relcache.c */
/*
* RelCache stores open relation descriptors.
* Much more than just pg_class tuple - includes all relation metadata.
*/
typedef struct RelationData
{
/* Basic info */
Oid rd_id; /* Relation OID */
Form_pg_class rd_rel; /* COPY of pg_class row */
TupleDesc rd_att; /* Tuple descriptor */
/* Storage info */
RelFileNode rd_node; /* Physical storage identifiers */
SMgrRelation rd_smgr; /* Storage manager handle */
/* Indexes */
Oid *rd_indexlist; /* OIDs of indexes on relation */
int rd_indexvalid; /* Is index list valid? */
/* Constraints */
Bitmapset *rd_keyattr; /* Key column numbers */
/* Partition info */
PartitionKey rd_partkey; /* Partition key */
PartitionDesc rd_partdesc; /* Partition descriptor */
/* Trigger info */
TriggerDesc *trigdesc; /* Trigger data */
/* RLS info */
RowSecurityDesc *rd_rsdesc; /* Row security policies */
/* Statistics */
int32 rd_rel->relpages; /* Size estimate */
float4 rd_rel->reltuples; /* Row estimate */
/* ... many more fields ... */
} RelationData;
/* Opening a relation */
Relation
table_open(Oid relationId, LOCKMODE lockmode)
{
Relation rel;
/* Look up in relcache */
rel = RelationIdGetRelation(relationId);
if (!RelationIsValid(rel))
ereport(ERROR, ...);
/* Acquire the requested lock */
LockRelationOid(relationId, lockmode);
return rel;
}
/* Closing a relation */
void
table_close(Relation relation, LOCKMODE lockmode)
{
/* Release lock if we acquired one */
if (lockmode != NoLock)
UnlockRelationOid(relation->rd_id, lockmode);
/* Decrement reference count */
RelationClose(relation);
}
3.3 Cache Invalidation
Copy
/*
* When catalog changes, all backends must invalidate cached copies.
* PostgreSQL uses "sinval" (shared invalidation) messages.
*/
/* Invalidation message types */
typedef enum
{
SHAREDINVALCATALOG_ID, /* Catalog tuple changed */
SHAREDINVALRELCACHE_ID, /* Relation cache entry invalid */
SHAREDINVALSMGR_ID, /* Physical file changed */
SHAREDINVALRELMAP_ID, /* Relfilenode mapping changed */
SHAREDINVALSNAPSHOT_ID /* Snapshot-related invalidation */
} SharedInvalidationMessageType;
/* Sending invalidation */
void
CacheInvalidateHeapTuple(Relation relation, HeapTuple oldtuple, HeapTuple newtuple)
{
/* Called when a catalog tuple is modified */
/* Build invalidation message */
SharedInvalidationMessage msg;
msg.id = SHAREDINVALCATALOG_ID;
msg.cat.catId = relation->rd_id;
msg.cat.hashValue = GetTupleHashValue(...);
/* Queue message for other backends */
SendSharedInvalidMessages(&msg, 1);
}
/* Receiving invalidation */
void
InvalidateSystemCaches(void)
{
/* Called at safe points (query boundaries, etc.) */
/* Process pending invalidation messages */
AcceptInvalidationMessages();
/* Each message triggers cache cleanup */
/* e.g., syscache entries removed, relcache entries rebuilt */
}
/*
* Invalidation timing:
*
* 1. Transaction modifies pg_class
* 2. At COMMIT: invalidation message queued
* 3. Other backends check for messages at:
* - Start of each command
* - Lock acquisition
* - Explicit AcceptInvalidationMessages()
* 4. Backend processes message, invalidates cache
* 5. Next access rebuilds cache from disk
*/
Part 4: Catalog Access Patterns
4.1 Reading Catalog Information
Copy
-- Using information_schema (SQL standard, but slow)
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'users';
-- Using pg_catalog (faster, more complete)
SELECT
a.attname,
pg_catalog.format_type(a.atttypid, a.atttypmod) as data_type,
NOT a.attnotnull as is_nullable
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = 'users'::regclass
AND a.attnum > 0
AND NOT a.attisdropped;
-- Type resolution with regtype
SELECT 'int4'::regtype::oid; -- 23
SELECT 23::regtype; -- integer
SELECT 'int4[]'::regtype; -- integer[]
-- Relation resolution with regclass
SELECT 'users'::regclass::oid; -- Relation OID
SELECT 'public.users'::regclass; -- Schema-qualified
-- Function resolution with regprocedure
SELECT 'array_agg(anynonarray)'::regprocedure::oid;
4.2 Catalog Modification
Copy
/* Extension creating a new type */
/* Step 1: Build tuple for pg_type */
Datum values[Natts_pg_type];
bool nulls[Natts_pg_type];
memset(nulls, false, sizeof(nulls));
values[Anum_pg_type_typname - 1] = NameGetDatum(&typeName);
values[Anum_pg_type_typnamespace - 1] = ObjectIdGetDatum(namespace);
values[Anum_pg_type_typowner - 1] = ObjectIdGetDatum(GetUserId());
values[Anum_pg_type_typlen - 1] = Int16GetDatum(sizeof(MyType));
values[Anum_pg_type_typbyval - 1] = BoolGetDatum(true);
values[Anum_pg_type_typtype - 1] = CharGetDatum(TYPTYPE_BASE);
/* ... set all required fields ... */
/* Step 2: Insert into pg_type */
Relation typeRel = table_open(TypeRelationId, RowExclusiveLock);
HeapTuple tuple = heap_form_tuple(RelationGetDescr(typeRel),
values, nulls);
/* Assign OID */
Oid typeOid = GetNewOidWithIndex(typeRel, TypeOidIndexId,
Anum_pg_type_oid);
HeapTupleSetOid(tuple, typeOid);
CatalogTupleInsert(typeRel, tuple);
table_close(typeRel, RowExclusiveLock);
/* Step 3: Create dependency records */
ObjectAddress myself, referenced;
myself.classId = TypeRelationId;
myself.objectId = typeOid;
myself.objectSubId = 0;
referenced.classId = NamespaceRelationId;
referenced.objectId = namespace;
referenced.objectSubId = 0;
recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
4.3 Using Catalog for Query Planning
Copy
/* Planner accessing catalog info */
/* Get relation statistics */
void
get_relation_info(PlannerInfo *root, Oid relationOid,
bool inhparent, RelOptInfo *rel)
{
Relation relation;
relation = table_open(relationOid, NoLock); /* Already locked */
/* Get size estimates */
rel->pages = relation->rd_rel->relpages;
rel->tuples = relation->rd_rel->reltuples;
rel->allvisfrac = (double) relation->rd_rel->relallvisible /
(double) relation->rd_rel->relpages;
/* Get index info */
if (relation->rd_rel->relhasindex)
{
List *indexoidlist = RelationGetIndexList(relation);
foreach(l, indexoidlist)
{
Oid indexoid = lfirst_oid(l);
/* Build IndexOptInfo for each index */
get_index_info(root, indexoid, inhparent, rel);
}
}
table_close(relation, NoLock);
}
/* Get column statistics */
void
examine_variable(PlannerInfo *root, Node *node, int varRelid,
VariableStatData *vardata)
{
/* Look up pg_statistic for the column */
HeapTuple statstuple = SearchSysCache3(STATRELATTINH,
ObjectIdGetDatum(relid),
Int16GetDatum(attnum),
BoolGetDatum(false));
if (HeapTupleIsValid(statstuple))
{
/* Extract selectivity estimates, histograms, etc. */
Form_pg_statistic stats = (Form_pg_statistic) GETSTRUCT(statstuple);
vardata->stadistinct = stats->stadistinct;
/* ... */
ReleaseSysCache(statstuple);
}
}
Part 5: Statistics (pg_statistic)
5.1 Understanding pg_statistic
Copy
-- pg_statistic stores column statistics for query planning
-- Raw access (complex structure)
SELECT * FROM pg_statistic WHERE starelid = 'users'::regclass;
-- Better: Use pg_stats view
SELECT
attname,
null_frac, -- Fraction of NULL values
avg_width, -- Average column width in bytes
n_distinct, -- Number of distinct values (or ratio if negative)
most_common_vals, -- Most common values
most_common_freqs, -- Frequencies of most common values
histogram_bounds, -- Bounds for histogram buckets
correlation -- Physical vs logical ordering correlation
FROM pg_stats
WHERE tablename = 'users';
-- n_distinct interpretation:
-- > 0: Estimated number of distinct values
-- -1: All values are distinct (unique)
-- < 0: abs(n_distinct) is fraction of rows that are distinct
-- correlation interpretation:
-- 1.0: Perfect correlation (values in physical order)
-- 0.0: No correlation (random order)
-- -1.0: Perfect anti-correlation (reverse order)
-- High correlation means index range scans are efficient
5.2 Extended Statistics
Copy
-- For correlated columns, single-column stats aren't enough
-- Create extended statistics for correlation
CREATE STATISTICS users_city_country_stats (dependencies, ndistinct, mcv)
ON city, country FROM users;
-- Force ANALYZE to compute
ANALYZE users;
-- View extended statistics
SELECT
stxname,
stxkeys, -- Column numbers
stxkind, -- Statistic kinds
stxndistinct, -- N-distinct for column combinations
stxdependencies, -- Functional dependencies
stxexprs -- Expression stats
FROM pg_statistic_ext_data sed
JOIN pg_statistic_ext se ON sed.stxoid = se.oid
WHERE se.stxrelid = 'users'::regclass;
-- Why extended stats matter:
-- Query: WHERE city = 'Paris' AND country = 'France'
-- Without extended stats: Planner assumes independence
-- selectivity = sel(city='Paris') * sel(country='France')
-- Might estimate 0.01 * 0.05 = 0.0005 (too low!)
-- With extended stats: Planner knows they're correlated
-- Correctly estimates rows where city=Paris implies country=France
Part 6: Dependency Tracking
6.1 pg_depend
Copy
-- pg_depend tracks dependencies between database objects
SELECT
classid::regclass as dependent_type,
objid,
objsubid, -- 0=whole object, >0=column number
refclassid::regclass as referenced_type,
refobjid,
refobjsubid,
deptype -- Dependency type
FROM pg_depend
WHERE refobjid = 'users'::regclass::oid
LIMIT 10;
-- deptype values:
-- 'n' = NORMAL: Drop of referenced obj drops dependent obj
-- 'a' = AUTO: Automatic dependency (created by system)
-- 'i' = INTERNAL: Dependent obj is component of referenced
-- 'P' = PARTITION_PRI: Partition primary dependency
-- 'S' = PARTITION_SEC: Partition secondary dependency
-- 'e' = EXTENSION: Dependent obj is member of extension
-- Find what depends on a table
SELECT
CASE classid
WHEN 'pg_constraint'::regclass THEN 'constraint'
WHEN 'pg_trigger'::regclass THEN 'trigger'
WHEN 'pg_rewrite'::regclass THEN 'view/rule'
WHEN 'pg_class'::regclass THEN 'index/sequence'
ELSE classid::regclass::text
END as type,
objid::regclass::text as dependent_object
FROM pg_depend
WHERE refclassid = 'pg_class'::regclass
AND refobjid = 'users'::regclass::oid
AND deptype = 'n';
6.2 Using Dependencies in Code
Copy
/* Recording a dependency */
void
recordDependencyOn(const ObjectAddress *depender,
const ObjectAddress *referenced,
DependencyType behavior)
{
/* Insert into pg_depend */
Relation dependDesc = table_open(DependRelationId, RowExclusiveLock);
Datum values[Natts_pg_depend];
bool nulls[Natts_pg_depend];
values[Anum_pg_depend_classid - 1] = ObjectIdGetDatum(depender->classId);
values[Anum_pg_depend_objid - 1] = ObjectIdGetDatum(depender->objectId);
values[Anum_pg_depend_objsubid - 1] = Int32GetDatum(depender->objectSubId);
values[Anum_pg_depend_refclassid - 1] = ObjectIdGetDatum(referenced->classId);
values[Anum_pg_depend_refobjid - 1] = ObjectIdGetDatum(referenced->objectId);
values[Anum_pg_depend_refobjsubid - 1] = Int32GetDatum(referenced->objectSubId);
values[Anum_pg_depend_deptype - 1] = CharGetDatum((char) behavior);
CatalogTupleInsert(dependDesc, heap_form_tuple(...));
table_close(dependDesc, RowExclusiveLock);
}
/* Checking dependencies before DROP */
ObjectAddresses *
findDependentObjects(const ObjectAddress *object,
int flags,
ObjectAddresses *pendingObjects,
const ObjectAddresses *alreadyDeleted)
{
/* Recursively find all dependent objects */
/* Returns list for cascade delete or error for restrict */
}
/* Example: Why DROP TABLE CASCADE works */
/*
* 1. Find all objects depending on table
* 2. For each NORMAL dependency: plan to drop
* 3. Recursively find dependencies of those
* 4. Sort by dependency order
* 5. Drop in order (dependents first)
*/
Part 7: Catalog Access in Extensions
7.1 Safe Catalog Reading
Copy
/* Extension reading catalog safely */
#include "catalog/pg_type.h"
#include "utils/syscache.h"
#include "utils/lsyscache.h"
/* Use lsyscache.h convenience functions when possible */
char *
get_type_name(Oid typid)
{
/* Convenience function - handles cache internally */
return format_type_be(typid);
}
/* Or use syscache directly */
Oid
get_type_input_function(Oid typid)
{
HeapTuple tp;
Form_pg_type typtup;
Oid result;
tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for type %u", typid);
typtup = (Form_pg_type) GETSTRUCT(tp);
result = typtup->typinput;
ReleaseSysCache(tp); /* Don't forget! */
return result;
}
/* Scanning a catalog table */
void
scan_user_tables(void)
{
Relation rel;
TableScanDesc scan;
HeapTuple tuple;
rel = table_open(RelationRelationId, AccessShareLock);
scan = table_beginscan_catalog(rel, 0, NULL);
while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
{
Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
if (classForm->relkind == RELKIND_RELATION &&
classForm->relnamespace != PG_CATALOG_NAMESPACE)
{
elog(NOTICE, "Found table: %s", NameStr(classForm->relname));
}
}
table_endscan(scan);
table_close(rel, AccessShareLock);
}
7.2 Catalog Modification in Extensions
Copy
/* Creating a custom catalog table (advanced) */
/*
* Extensions typically use regular tables, not catalogs.
* But some need true catalog tables for:
* - Integration with DROP CASCADE
* - Visibility in pg_dump
* - Participation in cache invalidation
*/
void
CreateMyExtensionCatalog(void)
{
Relation rel;
TupleDesc tupdesc;
Oid relid;
/* Define columns */
tupdesc = CreateTemplateTupleDesc(3);
TupleDescInitEntry(tupdesc, 1, "id", OIDOID, -1, 0);
TupleDescInitEntry(tupdesc, 2, "name", TEXTOID, -1, 0);
TupleDescInitEntry(tupdesc, 3, "value", INT4OID, -1, 0);
/* Create the table */
relid = heap_create_with_catalog(
"my_catalog", /* Table name */
PG_CATALOG_NAMESPACE, /* Put in pg_catalog */
InvalidOid, /* Tablespace */
InvalidOid, /* Type OID (auto-assign) */
InvalidOid, /* Array type OID (auto-assign) */
GetUserId(), /* Owner */
TableAmOid, /* Heap access method */
tupdesc, /* Tuple descriptor */
NIL, /* Column defaults */
RELKIND_RELATION, /* Relation kind */
RELPERSISTENCE_PERMANENT,
false, /* Not shared */
false, /* Not mapped */
ONCOMMIT_NOOP, /* On commit action */
NULL, /* Reloptions */
false, /* Use user's acl */
true, /* Allow system columns */
false, /* Is internal */
InvalidOid, /* relrewrite */
NULL /* typaddress */
);
/* Create index on id */
/* ... */
/* Record in pg_extension_config_dump for pg_dump */
/* ... */
}
Part 8: Interview Questions
Q: What happens when you ALTER TABLE ADD COLUMN?
Q: What happens when you ALTER TABLE ADD COLUMN?
Answer:Catalog changes:
- Insert new row into
pg_attributewith new column info - Update
pg_class.relnatts(increment column count) - If column has default: update
pg_attrdef - If column has constraint: update
pg_constraint
- No default, nullable: No data changes! Just metadata.
- With constant default (PG11+): Store default in catalog, compute on read
- With volatile default or old versions: Rewrite entire table
Q: How does PostgreSQL know what to drop when you DROP TABLE CASCADE?
Q: How does PostgreSQL know what to drop when you DROP TABLE CASCADE?
Answer:
- Query pg_depend: Find all objects with NORMAL dependency on the table
Copy
SELECT * FROM pg_depend
WHERE refobjid = 'mytable'::regclass
AND deptype = 'n';
- Recursive expansion: For each dependent object, find its dependents
- Topological sort: Order objects so dependents are dropped before what they depend on
- Execute drops: Drop in sorted order
- Views depending on table
- Foreign key constraints
- Triggers on the table
- Indexes
- Policies (RLS)
- Sequences (if SERIAL)
Q: Why does PostgreSQL cache catalog data per-backend instead of using shared memory?
Q: Why does PostgreSQL cache catalog data per-backend instead of using shared memory?
Answer:Per-backend caching advantages:
- No locking overhead: Reading cached data requires no locks. Shared cache would need read locks for every access.
- Transaction isolation: Each backend sees catalog as of its snapshot. Shared cache would need version management.
- Simplicity: No complex shared data structure. Just hash tables in local memory.
- Cache warmth: Each backend caches what it needs. Working sets often differ.
- Memory usage: Same data cached multiple times
- Mitigated: Catalog data is small compared to data pages
- Cache invalidation: Must propagate changes
- Solved: Shared invalidation message queue (sinval)
- Cold start: New backends start with empty cache
- Mitigated: Relation cache init file preloads common entries