pg_dump Reference
Use this skill when implementing or debugging pgschema features that involve extracting schema information from PostgreSQL databases. pg_dump is the canonical PostgreSQL schema dumping tool and serves as a reference implementation for how to query system catalogs correctly.
When to Use This Skill
Invoke this skill when:
- •Adding support for new PostgreSQL schema objects
- •Debugging system catalog queries in
ir/inspector.go - •Understanding how PostgreSQL represents objects internally
- •Handling version-specific PostgreSQL features (versions 14-17)
- •Learning correct DDL formatting patterns
- •Understanding object dependency relationships
Source Code Locations
Main pg_dump repository: https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/
Key files to reference:
- •
pg_dump.c- Main implementation with system catalog queries - •
pg_dump.h- Data structures and function declarations - •
pg_dump_sort.c- Dependency sorting logic - •
pg_backup_archiver.c- Output formatting - •
common.c- Shared utility functions for querying system catalogs
Step-by-Step Workflow
1. Identify the Schema Object
Determine which PostgreSQL object type you're working with:
- •Tables and columns
- •Constraints (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK)
- •Indexes (regular, unique, partial, functional)
- •Triggers (including WHEN conditions, constraint triggers)
- •Views and materialized views
- •Functions and procedures
- •Sequences
- •Types (enum, composite, domain)
- •Policies (row-level security)
- •Aggregates
- •Comments
2. Find the Relevant pg_dump Function
Search pg_dump.c for the function that handles your object type:
| Object Type | pg_dump Function | System Catalogs Used |
|---|---|---|
| Tables & Columns | getTables() | pg_class, pg_attribute, pg_type |
| Indexes | getIndexes() | pg_index, pg_class |
| Triggers | getTriggers() | pg_trigger, pg_proc |
| Functions | getFuncs() | pg_proc |
| Procedures | getProcs() | pg_proc |
| Views | getViews() | pg_class, pg_rewrite |
| Materialized Views | getMatViews() | pg_class |
| Sequences | getSequences() | pg_sequence, pg_class |
| Constraints | getConstraints() | pg_constraint |
| Policies | getPolicies() | pg_policy |
| Aggregates | getAggregates() | pg_aggregate, pg_proc |
| Types | getTypes() | pg_type |
| Comments | getComments() | pg_description |
3. Analyze the System Catalog Query
Examine the SQL query used by pg_dump:
- •Which system catalog tables are joined
- •Which columns are selected
- •How version-specific features are handled
- •How PostgreSQL internal functions are used (
pg_get_expr,pg_get_constraintdef, etc.)
Example - Extracting trigger WHEN conditions:
-- pg_dump's approach (from getTriggers):
SELECT t.tgname,
pg_get_expr(t.tgqual, t.tgrelid, false) as when_clause
FROM pg_catalog.pg_trigger t
WHERE t.tgqual IS NOT NULL
Note: information_schema.triggers.action_condition is NOT reliable for WHEN clauses. Always use pg_get_expr(t.tgqual, ...) from pg_catalog.pg_trigger.
4. Check for Special Cases
Look for how pg_dump handles:
- •Version compatibility: Different queries for different PostgreSQL versions
- •NULL handling: How missing values are interpreted
- •Default values: System vs. user-defined defaults
- •Internal objects: Filtering out system-generated objects
- •Dependencies: How object relationships are tracked
5. Adapt for pgschema
Apply the pattern to pgschema's codebase:
For database introspection (ir/inspector.go):
- •Adapt the system catalog query for Go/pgx
- •Use pgx parameter binding for safety
- •Handle NULL values appropriately
- •Add proper error handling
For SQL parsing (ir/parser.go):
- •Understand how pg_dump formats DDL
- •Use pg_query_go to parse SQL statements
- •Extract relevant fields into IR structures
For DDL generation (internal/diff/*.go):
- •Follow pg_dump's quoting rules
- •Use PostgreSQL functions for formatting complex expressions
- •Handle version-specific syntax
Key System Catalog Tables
Core Tables
- •
pg_class- Tables, indexes, views, sequences - •
pg_attribute- Table columns - •
pg_type- Data types - •
pg_constraint- Constraints (PK, FK, UNIQUE, CHECK) - •
pg_index- Index definitions
Functions & Triggers
- •
pg_proc- Functions, procedures, trigger functions - •
pg_trigger- Trigger definitions - •
pg_aggregate- Aggregate function definitions
Access Control
- •
pg_policy- Row-level security policies
Metadata
- •
pg_description- Comments on database objects - •
pg_depend- Object dependencies
Helper Functions
- •
pg_get_expr(expr, relation, pretty)- Deparse expressions - •
pg_get_constraintdef(constraint_oid, pretty)- Get constraint definition - •
pg_get_indexdef(index_oid, column, pretty)- Get index definition - •
pg_get_triggerdef(trigger_oid, pretty)- Get trigger definition
Important Considerations
pgschema is NOT pg_dump
Key differences:
- •Format: pgschema outputs declarative schema files for editing, pg_dump creates archive dumps for restore
- •Scope: pgschema focuses on single-schema objects, pg_dump handles entire databases
- •Workflow: pgschema supports plan/apply (Terraform-style), pg_dump is dump/restore only
- •Normalization: pgschema normalizes for comparison, pg_dump preserves exact format
When NOT to Copy pg_dump
Don't blindly copy pg_dump for:
- •Output formatting (pgschema has different conventions)
- •Archive/restore logic (not applicable to pgschema)
- •Full database dumps (pgschema is schema-focused)
- •Ancient version support (pgschema supports PostgreSQL 14+)
When pg_dump is Authoritative
Always reference pg_dump for:
- •System catalog query patterns
- •Understanding PostgreSQL internals
- •Correct use of
pg_get_*functions - •Version-specific feature detection
- •Object dependency tracking
Examples
Example 1: Extracting Generated Column Information
pg_dump approach:
SELECT a.attname,
a.attgenerated,
pg_get_expr(ad.adbin, ad.adrelid) as generation_expr
FROM pg_attribute a
LEFT JOIN pg_attrdef ad ON (a.attrelid = ad.adrelid AND a.attnum = ad.adnum)
WHERE a.attgenerated != ''
pgschema adaptation (in ir/inspector.go):
query := `
SELECT a.attname,
a.attgenerated,
pg_get_expr(ad.adbin, ad.adrelid) as generation_expr
FROM pg_attribute a
LEFT JOIN pg_attrdef ad ON (a.attrelid = ad.adrelid AND a.attnum = ad.adnum)
WHERE a.attrelid = $1 AND a.attgenerated != ''
`
rows, err := conn.Query(ctx, query, tableOID)
Example 2: Handling Partial Indexes
pg_dump extracts WHERE clauses:
SELECT pg_get_expr(i.indpred, i.indrelid, true) as index_predicate FROM pg_index i WHERE i.indpred IS NOT NULL
pgschema stores in IR (ir/ir.go):
type Index struct {
Name string
Columns []string
Predicate string // WHERE clause for partial indexes
// ...
}
Tips for Success
- •
Search strategically: Clone postgres repo and use grep/ag to search for specific system catalog columns or keywords
- •
Check git history: Use
git log -por GitHub blame to see when features were added and understand the evolution - •
Read comments carefully: pg_dump.c contains valuable comments explaining PostgreSQL internals and edge cases
- •
Cross-reference documentation: Always combine pg_dump source with official PostgreSQL documentation:
- •System catalogs: https://www.postgresql.org/docs/current/catalogs.html
- •Functions: https://www.postgresql.org/docs/current/functions-info.html
- •
Test incrementally: After adapting from pg_dump, test against real PostgreSQL instances using pgschema's embedded-postgres integration tests
- •
Version awareness: Check how pg_dump handles version differences - pgschema supports PostgreSQL 14-17, so you may need conditional logic
Verification Checklist
After consulting pg_dump and implementing in pgschema:
- • System catalog query correctly extracts all necessary fields
- • NULL values are handled appropriately
- • Version-specific features are detected and handled
- • Internal/system objects are filtered out
- • Dependencies are tracked correctly
- • Integration test added in
testdata/diff/ - • Test passes with
go test -v ./internal/diff -run TestDiffFromFiles - • Test passes with
go test -v ./cmd -run TestPlanAndApply - • Tested against multiple PostgreSQL versions (14-17)