PostgreSQL Deep Dive¶
PostgreSQL is Warmwind's primary data store for relational container metadata and
semi-structured AI agent event logs in JSONB. This chapter walks through real
query plans, index selection rationale, JSONB operator semantics, connection
pooling, Row-Level Security for multi-tenant isolation, LISTEN/NOTIFY for
real-time event propagation, table partitioning with partition pruning, and
vacuum tuning -- all illustrated with EXPLAIN ANALYZE output you can reproduce
on a local instance.
1. Reading EXPLAIN ANALYZE -- A Complete Walkthrough¶
Coming from JPA
In JPA/Hibernate you enable SQL logging with hibernate.show_sql=true and
maybe add hibernate.generate_statistics=true. That shows what SQL ran
but not how the planner executed it. EXPLAIN ANALYZE is the PostgreSQL
equivalent of -XX:+PrintCompilation -- it reveals the execution engine's
actual decisions. Learn to read plans and you will debug performance issues
faster than any ORM-level logging allows.
1.1 The Query¶
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ae.id, ae.payload->>'tool' AS tool_name,
(ae.payload->>'duration_ms')::int AS duration_ms
FROM agent_events ae
WHERE ae.container_id = 'c9f0a1b2-3d4e-5f67-8901-abcdef012345'
AND ae.created_at > NOW() - INTERVAL '1 hour'
AND ae.payload @> '{"type": "tool_call"}'
ORDER BY ae.created_at DESC
LIMIT 50;
1.2 Annotated Plan Output¶
Limit (cost=12.45..12.47 rows=50 width=80)
(actual time=0.187..0.201 rows=50 loops=1)
-> Sort (cost=12.45..12.52 rows=73 width=80)
(actual time=0.185..0.193 rows=50 loops=1)
Sort Key: ae.created_at DESC
Sort Method: top-N heapsort Memory: 32kB -- ❶
-> Bitmap Heap Scan on agent_events_2026_03 ae -- ❷
(cost=8.53..11.92 rows=73 width=80)
(actual time=0.098..0.142 rows=73 loops=1)
Recheck Cond: (container_id = 'c9f0a1b2-...'::uuid)
Filter: (payload @> '{"type": "tool_call"}'::jsonb)
Rows Removed by Filter: 12 -- ❸
Heap Blocks: exact=5
Buffers: shared hit=8 -- ❹
-> Bitmap Index Scan on idx_events_container
(cost=0.00..8.51 rows=85 width=0)
(actual time=0.071..0.071 rows=85 loops=1)
Index Cond: (container_id = 'c9f0a1b2-...'::uuid)
Buffers: shared hit=3
Planning Time: 0.254 ms
Execution Time: 0.228 ms
| Annotation | What It Tells You |
|---|---|
| ❶ | top-N heapsort means PostgreSQL only keeps the N smallest/largest rows in memory instead of sorting everything -- triggered by LIMIT. |
| ❷ | The planner pruned to agent_events_2026_03 -- only the March 2026 partition was scanned (partition pruning in action). |
| ❸ | 12 rows matched the B-tree index on container_id but were eliminated by the JSONB @> filter. If this ratio grows, add a GIN index on payload. |
| ❹ | shared hit=8 means all 8 buffer pages came from PostgreSQL's shared buffer cache -- zero disk I/O. |
1.3 Estimated vs Actual Rows¶
When these diverge by more than 10x, statistics are stale. Fix it:
-- Per-table refresh (non-blocking)
ANALYZE agent_events;
-- Increase sampling for a specific column
ALTER TABLE agent_events ALTER COLUMN container_id SET STATISTICS 1000;
ANALYZE agent_events;
Stale statistics are the #1 cause of bad plans
Autovacuum runs ANALYZE implicitly, but on high-churn tables the default
thresholds may be too conservative. See Vacuum Tuning
below.
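The 10x rule above can be mechanized when post-processing plan rows. A minimal sketch (function and parameter names are illustrative, not a PostgreSQL API):

```typescript
// Flag plan nodes whose estimated row count diverges from the actual
// row count by more than `factor` in either direction -- the usual
// signal that ANALYZE is overdue on the underlying table.
function statsAreStale(
  estimatedRows: number,
  actualRows: number,
  factor = 10,
): boolean {
  const est = Math.max(estimatedRows, 1); // guard against divide-by-zero
  const act = Math.max(actualRows, 1);
  return est / act > factor || act / est > factor;
}

console.log(statsAreStale(73, 73));    // false -- the plan above is healthy
console.log(statsAreStale(100, 5000)); // true  -- time to ANALYZE
```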
2. Index Decision Tree¶
Use this decision tree when adding an index:
flowchart TD
START["New slow query"] --> Q1{"What operator?"}
Q1 -->|"=, <, >, BETWEEN, IN, LIKE 'prefix%'"| BTREE["B-tree (default)"]
Q1 -->|"@>, ?, ?&, ?|, full-text @@"| GIN["GIN"]
Q1 -->|"geometric &&, <<, range overlap"| GIST["GiST"]
Q1 -->|"time column, append-only"| BRIN["BRIN"]
Q1 -->|"equality only, no range"| HASH["Hash"]
BTREE --> PARTIAL{"High-selectivity subset?"}
PARTIAL -->|Yes| PBTREE["Partial B-tree\nWHERE status = 'ACTIVE'"]
PARTIAL -->|No| COVER{"Covering index\nneeded?"}
COVER -->|Yes| INCL["INCLUDE (col1, col2)"]
COVER -->|No| DONE["CREATE INDEX"]
GIN --> JSONB_OPS{"Querying keys\nor paths?"}
JSONB_OPS -->|"@> path only"| PATHOPS["jsonb_path_ops\n(3x smaller)"]
JSONB_OPS -->|"?, ?|, ?& key existence"| DEFOPS["default jsonb_ops"]
Index Creation Examples¶
-- B-tree: equality and range scans on container_id
CREATE INDEX idx_events_container
ON agent_events (container_id);
-- B-tree partial: only index active sessions (high selectivity)
CREATE INDEX idx_sessions_active
ON agent_sessions (user_id, started_at DESC)
WHERE ended_at IS NULL;
-- GIN with jsonb_path_ops: only supports @> containment
CREATE INDEX idx_events_payload
ON agent_events USING GIN (payload jsonb_path_ops);
-- GIN default ops: supports @>, ?, ?|, ?& operators
CREATE INDEX idx_events_payload_keys
ON agent_events USING GIN (payload);
-- BRIN: excellent for append-only time-series (tiny index, ~0.1% of B-tree size)
CREATE INDEX idx_events_created
ON agent_events USING BRIN (created_at)
WITH (pages_per_range = 32);
-- Covering index: index-only scan avoids heap access
CREATE INDEX idx_events_cover
ON agent_events (container_id, created_at DESC)
INCLUDE (payload);
-- Hash: pure equality, slightly faster than B-tree for = only
CREATE INDEX idx_events_hash_container
ON agent_events USING HASH (container_id);
Verify Index Usage¶
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM agent_events WHERE container_id = 'c9f0a1b2-...';
-- Check index hit rate across all tables
SELECT schemaname, relname, idx_scan, seq_scan,
ROUND(100.0 * idx_scan / NULLIF(idx_scan + seq_scan, 0), 1) AS idx_pct
FROM pg_stat_user_tables
ORDER BY seq_scan DESC;
3. JSONB Deep Dive¶
3.1 Operators Reference¶
| Operator | Description | Example | Uses GIN? |
|---|---|---|---|
| `->` | Get JSON object field (as jsonb) | `payload->'meta'` | No |
| `->>` | Get JSON object field (as text) | `payload->>'tool'` | No |
| `#>` | Get nested path (as jsonb) | `payload#>'{meta,version}'` | No |
| `#>>` | Get nested path (as text) | `payload#>>'{meta,version}'` | No |
| `@>` | Contains (left contains right) | `payload @> '{"type":"tool_call"}'` | Yes |
| `<@` | Contained by | `'{"type":"tool_call"}' <@ payload` | Yes |
| `?` | Key exists | `payload ? 'error'` | Yes (default ops) |
| `?\|` | Any key exists | `payload ?\| array['error','warning']` | Yes (default ops) |
| `?&` | All keys exist | `payload ?& array['tool','duration_ms']` | Yes (default ops) |
| `@?` | JSON path exists (PG12+) | `payload @? '$.meta ? (@.version > 2)'` | Yes |
| `@@` | JSON path predicate (PG12+) | `payload @@ '$.duration_ms > 5000'` | Yes |
Coming from JPA
JPA's @Column(columnDefinition = "jsonb") gives you storage but no query
support. Hibernate 6 added JsonNode mapping but still lacks operator
support, so you end up writing native queries. In TypeORM, JSONB queries go
through QueryBuilder with raw WHERE fragments. The key insight:
a jsonb_path_ops GIN index supports only @> (plus the jsonpath operators
@? and @@). If you need ?, ?|, or ?&, use the default jsonb_ops operator
class (larger index, broader operator support).
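To build intuition for @> semantics, here is a small TypeScript sketch that mirrors PostgreSQL's containment rules: for objects, every key/value pair on the right must be contained in the left; for arrays, every element on the right must be contained by some element on the left; scalars compare by equality. (Illustrative only; it omits the one special case where a top-level jsonb array contains a bare scalar.)

```typescript
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

function isObj(v: Json): v is { [k: string]: Json } {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function jsonbContains(left: Json, right: Json): boolean {
  // Objects: every key/value in right must be contained in left
  if (isObj(left) && isObj(right)) {
    const l = left, r = right;
    return Object.entries(r).every(([k, v]) => k in l && jsonbContains(l[k], v));
  }
  // Arrays: each element of right must be contained by some element of left
  if (Array.isArray(left) && Array.isArray(right)) {
    const l = left, r = right;
    return r.every((rv) => l.some((lv) => jsonbContains(lv, rv)));
  }
  // Scalars: plain equality
  return left === right;
}

// Mirrors: WHERE payload @> '{"type": "tool_call"}'
const payload: Json = { type: 'tool_call', tool: 'browser', meta: { v: 2 } };
console.log(jsonbContains(payload, { type: 'tool_call' })); // true
console.log(jsonbContains(payload, { type: 'error' }));     // false
```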
3.2 Practical Queries¶
-- Find all tool calls that took over 5 seconds
SELECT id,
payload->>'tool' AS tool_name,
(payload->>'duration_ms')::int AS duration_ms,
payload#>>'{meta,model}' AS model
FROM agent_events
WHERE payload @> '{"type": "tool_call"}' -- GIN index hit
AND (payload->>'duration_ms')::int > 5000 -- filter after index
ORDER BY (payload->>'duration_ms')::int DESC;
-- Aggregate tool usage per session
SELECT s.id AS session_id,
jsonb_object_agg(
ae.payload->>'tool',
ae.payload->>'duration_ms'
) AS tool_durations
FROM agent_sessions s
JOIN agent_events ae ON ae.session_id = s.id
WHERE ae.payload @> '{"type": "tool_call"}'
GROUP BY s.id;
-- PostgreSQL 17: JSON_TABLE for normalized output
SELECT jt.*
FROM agent_events ae,
JSON_TABLE(ae.payload, '$'
COLUMNS (
tool_name TEXT PATH '$.tool',
duration INT PATH '$.duration_ms',
model TEXT PATH '$.meta.model'
)
) AS jt
WHERE ae.container_id = 'c9f0a1b2-...'
AND ae.payload @> '{"type": "tool_call"}';
3.3 Indexing Strategy for JSONB¶
-- Strategy 1: jsonb_path_ops -- 3x smaller, only supports @>
CREATE INDEX idx_payload_path ON agent_events USING GIN (payload jsonb_path_ops);
-- Strategy 2: expression index on a frequently extracted field
CREATE INDEX idx_payload_tool ON agent_events ((payload->>'tool'))
WHERE payload @> '{"type": "tool_call"}';
-- Strategy 3: generated column + B-tree (PG12+)
ALTER TABLE agent_events
ADD COLUMN tool_name TEXT GENERATED ALWAYS AS (payload->>'tool') STORED;
CREATE INDEX idx_tool_name ON agent_events (tool_name);
4. Connection Pooling with PgBouncer¶
4.1 Modes Compared¶
| Mode | Connection Returned | Session State Preserved | Use Case |
|---|---|---|---|
| session | On client disconnect | Yes (SET, PREPARE, LISTEN) | LISTEN/NOTIFY consumers, long-lived workers |
| transaction | After COMMIT/ROLLBACK | No | API request handlers (Warmwind default) |
| statement | After each statement | No | Simple read-only queries, connection-starved setups |
4.2 Production Configuration¶
;; /etc/pgbouncer/pgbouncer.ini
[databases]
warmwind = host=127.0.0.1 port=5432 dbname=warmwind_production
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
;; Pool sizing: (CPU cores * 2) + effective_spindle_count
;; For 8-core + NVMe: (8 * 2) + 1 = 17, round to 20
pool_mode = transaction
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
;; Timeouts
server_idle_timeout = 300
client_idle_timeout = 600
query_timeout = 30
query_wait_timeout = 120
;; Monitoring
stats_period = 60
log_connections = 0
log_disconnections = 0
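The pool-sizing rule of thumb from the comments above can be written down as a tiny helper (a sketch; the function name is illustrative, not a PgBouncer API):

```typescript
// (CPU cores * 2) + effective spindle count; NVMe counts as one "spindle".
// Round the result up to a comfortable margin when setting default_pool_size.
function suggestedPoolSize(cpuCores: number, effectiveSpindles: number): number {
  return cpuCores * 2 + effectiveSpindles;
}

// 8-core host with a single NVMe drive:
console.log(suggestedPoolSize(8, 1)); // 17 -- rounded to 20 in pgbouncer.ini
```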
4.3 Monitoring Pool Health¶
-- Connect to PgBouncer admin console
-- psql -p 6432 -U pgbouncer pgbouncer
SHOW POOLS;
-- database | user | cl_active | cl_waiting | sv_active | sv_idle | sv_used
-- warmwind | app | 45 | 0 | 12 | 8 | 0
SHOW STATS;
-- total_xact_count | total_query_count | avg_xact_time | avg_query_time
Transaction-mode gotchas
SET statements, prepared statements, LISTEN/NOTIFY, and advisory locks
do not survive transaction boundaries in transaction mode. If your
NestJS code uses SET app.current_tenant = '...' for RLS, wrap it in the
same transaction as your queries.
5. Row-Level Security for Multi-Tenant Isolation¶
5.1 Complete Working Example¶
-- Step 1: Create the tenant-scoped tables
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE containers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
name TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'PENDING',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE agent_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
container_id UUID NOT NULL REFERENCES containers(id),
tenant_id UUID NOT NULL REFERENCES tenants(id),
user_id UUID NOT NULL,
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
ended_at TIMESTAMPTZ
);
-- Step 2: Create a non-superuser application role
CREATE ROLE app_user LOGIN PASSWORD 'strong-random-password';
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO app_user;
-- Step 3: Enable RLS (superuser/table owner bypasses by default)
ALTER TABLE containers ENABLE ROW LEVEL SECURITY;
ALTER TABLE agent_sessions ENABLE ROW LEVEL SECURITY;
-- Force RLS even for table owner (important for testing)
ALTER TABLE containers FORCE ROW LEVEL SECURITY;
ALTER TABLE agent_sessions FORCE ROW LEVEL SECURITY;
-- Step 4: Define policies
CREATE POLICY tenant_isolation_containers ON containers
FOR ALL
TO app_user
USING (tenant_id = current_setting('app.current_tenant')::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation_sessions ON agent_sessions
FOR ALL
TO app_user
USING (tenant_id = current_setting('app.current_tenant')::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant')::uuid);
-- Step 5: Test isolation
INSERT INTO tenants (id, name) VALUES
('aaaa0000-0000-0000-0000-000000000001', 'Acme Corp'),
('bbbb0000-0000-0000-0000-000000000002', 'Globex Inc');
INSERT INTO containers (tenant_id, name) VALUES
('aaaa0000-0000-0000-0000-000000000001', 'acme-agent-1'),
('bbbb0000-0000-0000-0000-000000000002', 'globex-agent-1');
-- As app_user:
SET ROLE app_user;
SET app.current_tenant = 'aaaa0000-0000-0000-0000-000000000001';
SELECT * FROM containers;
-- Returns: acme-agent-1 only
SET app.current_tenant = 'bbbb0000-0000-0000-0000-000000000002';
SELECT * FROM containers;
-- Returns: globex-agent-1 only
RESET ROLE;
5.2 NestJS Integration¶
// middleware/tenant.middleware.ts
@Injectable()
export class TenantMiddleware implements NestMiddleware {
constructor(private dataSource: DataSource) {}
async use(req: Request, _res: Response, next: NextFunction) {
const tenantId = req.user?.tenantId; // from JWT payload
if (!tenantId) throw new ForbiddenException('No tenant context');
    // Set tenant for the duration of this request's connection.
    // Note: SET does not accept bind parameters ($1) -- use set_config() instead.
    const queryRunner = this.dataSource.createQueryRunner();
    await queryRunner.connect();
    await queryRunner.query(`SELECT set_config('app.current_tenant', $1, false)`, [tenantId]);
// Attach queryRunner to request so services use the same connection
req['queryRunner'] = queryRunner;
next();
}
}
RLS + PgBouncer transaction mode
SET app.current_tenant must happen inside every transaction because
PgBouncer in transaction mode recycles connections between requests.
Use TypeORM's queryRunner.startTransaction() to guarantee the SET
and your queries share the same connection.
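One way to honor that constraint is to build the whole statement sequence up front, with the tenant GUC applied via set_config(..., true) (the SET LOCAL form, which resets at COMMIT/ROLLBACK). A sketch only; tenantTransactionStatements is a hypothetical helper, not a TypeORM API, and the UUID check exists because the tenant id is interpolated into SQL:

```typescript
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the statements to run, in order, on ONE pooled connection.
// Keeping SET and the queries inside one transaction means PgBouncer's
// transaction mode cannot hand the tenant setting to another client.
function tenantTransactionStatements(
  tenantId: string,
  queries: string[],
): string[] {
  if (!UUID_RE.test(tenantId)) {
    throw new Error(`Invalid tenant id: ${tenantId}`);
  }
  return [
    'BEGIN',
    // true = "is_local": behaves like SET LOCAL, cleared at transaction end
    `SELECT set_config('app.current_tenant', '${tenantId}', true)`,
    ...queries,
    'COMMIT',
  ];
}
```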
6. LISTEN/NOTIFY for Real-Time Events¶
6.1 Trigger-Based Notification¶
-- Trigger function: emit JSON payload on status change
CREATE OR REPLACE FUNCTION notify_container_status()
RETURNS TRIGGER AS $$
BEGIN
IF OLD.status IS DISTINCT FROM NEW.status THEN
PERFORM pg_notify(
'container_status',
json_build_object(
'container_id', NEW.id,
'tenant_id', NEW.tenant_id,
'old_status', OLD.status,
'new_status', NEW.status,
'changed_at', NOW()
)::text
);
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_container_status_change
AFTER UPDATE OF status ON containers
FOR EACH ROW
EXECUTE FUNCTION notify_container_status();
6.2 NestJS Listener (pg driver)¶
import { Client } from 'pg';
@Injectable()
export class ContainerNotificationService implements OnModuleInit, OnModuleDestroy {
private pgClient: Client;
constructor(
private config: ConfigService,
private eventEmitter: EventEmitter2,
) {}
async onModuleInit() {
// Dedicated connection -- NOT pooled (LISTEN requires session mode)
this.pgClient = new Client({
connectionString: this.config.get('DATABASE_URL'),
});
await this.pgClient.connect();
await this.pgClient.query('LISTEN container_status');
this.pgClient.on('notification', (msg) => {
const payload = JSON.parse(msg.payload!);
this.eventEmitter.emit('container.status_changed', payload);
});
}
async onModuleDestroy() {
await this.pgClient.query('UNLISTEN *');
await this.pgClient.end();
}
}
6.3 Limitations¶
| Constraint | Detail |
|---|---|
| Payload limit | 8000 bytes per notification. For large payloads, send the ID and let the listener query the row. |
| No persistence | If no listener is connected, the notification is lost. For guaranteed delivery, use a pending_events table + polling or an external queue (Redis Streams, NATS). |
| Connection | LISTEN requires a dedicated, non-pooled connection (PgBouncer session mode or a raw pg.Client). |
| Throughput | pg_notify is synchronous within the trigger. At >1000 notifications/sec, consider batching or an outbox pattern. |
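The payload limit in the table above can also be enforced on the application side when code (rather than a trigger) emits the notification. A sketch, assuming the convention from the table: oversized payloads fall back to sending just the id so the listener can re-query the row. notifyPayload is a hypothetical helper:

```typescript
// Serialize an event for pg_notify, respecting the ~8000-byte limit.
// If the full payload is too large, send only the container_id plus a
// marker so the listener knows to fetch the row itself.
function notifyPayload(event: {
  container_id: string;
  [k: string]: unknown;
}): string {
  const full = JSON.stringify(event);
  // TextEncoder gives the UTF-8 byte length, not the JS string length
  if (new TextEncoder().encode(full).length < 8000) return full;
  return JSON.stringify({ container_id: event.container_id, truncated: true });
}
```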
7. Table Partitioning¶
7.1 Schema¶
CREATE TABLE agent_events (
id UUID DEFAULT gen_random_uuid(),
container_id UUID NOT NULL,
session_id UUID,
tenant_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, created_at) -- PK must include partition key
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE agent_events_2026_01 PARTITION OF agent_events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE agent_events_2026_02 PARTITION OF agent_events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE agent_events_2026_03 PARTITION OF agent_events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Default partition catches anything outside defined ranges
CREATE TABLE agent_events_default PARTITION OF agent_events DEFAULT;
-- Indexes are created per-partition automatically when defined on the parent
CREATE INDEX idx_events_container ON agent_events (container_id);
CREATE INDEX idx_events_payload ON agent_events USING GIN (payload jsonb_path_ops);
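Hand-writing one CREATE TABLE per month gets old; before reaching for pg_partman, the DDL above can be generated mechanically. A sketch that follows this chapter's naming convention (agent_events_YYYY_MM); adapt the names to your schema:

```typescript
// Generate the monthly range-partition DDL for agent_events,
// handling the December -> January rollover.
function monthlyPartitionDDL(year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  const from = `${year}-${pad(month)}-01`;
  const [nextYear, nextMonth] = month === 12 ? [year + 1, 1] : [year, month + 1];
  const to = `${nextYear}-${pad(nextMonth)}-01`;
  const name = `agent_events_${year}_${pad(month)}`;
  return (
    `CREATE TABLE ${name} PARTITION OF agent_events\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}

console.log(monthlyPartitionDDL(2026, 12));
// CREATE TABLE agent_events_2026_12 PARTITION OF agent_events
//   FOR VALUES FROM ('2026-12-01') TO ('2027-01-01');
```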
7.2 EXPLAIN Showing Partition Pruning¶
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM agent_events
WHERE created_at >= '2026-03-10' AND created_at < '2026-03-20'
AND container_id = 'c9f0a1b2-...';
Append (actual time=0.032..0.051 rows=24 loops=1)
Subplans Removed: 3 -- ❶ 3 partitions pruned
-> Index Scan using agent_events_2026_03_container_id_idx
on agent_events_2026_03 ae -- ❷ only March scanned
(actual time=0.030..0.047 rows=24 loops=1)
Index Cond: (container_id = 'c9f0a1b2-...'::uuid)
Filter: (created_at >= ... AND created_at < ...)
Planning Time: 0.412 ms
Execution Time: 0.073 ms
❶ Subplans Removed: 3 -- January, February, and the default partition were pruned at plan time.
❷ Only agent_events_2026_03 was scanned, and within that partition, the B-tree on container_id was used.
7.3 Automating Partition Creation¶
-- pg_partman extension (recommended for production)
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
p_parent_table := 'public.agent_events',
p_control := 'created_at',
p_type := 'native',
p_interval := 'monthly',
p_premake := 3 -- create 3 future partitions
);
-- Run maintenance periodically (cron or pg_cron)
SELECT partman.run_maintenance('public.agent_events');
7.4 Partition Detach for Archival¶
-- Non-blocking detach (PG14+)
ALTER TABLE agent_events DETACH PARTITION agent_events_2025_01 CONCURRENTLY;
-- Archive to S3 / cold storage, then drop
-- pg_dump -t agent_events_2025_01 | gzip | aws s3 cp - s3://warmwind-archive/...
DROP TABLE agent_events_2025_01;
8. Vacuum Tuning¶
8.1 Why Vacuum Matters (MVCC Recap)¶
PostgreSQL never overwrites rows in place. An UPDATE writes a new row version
and marks the old one as dead. DELETE only marks the row dead. Dead tuples
accumulate and cause:
- Table bloat: heap pages fill with dead rows, sequential scans slow down.
- Index bloat: indexes still point to dead tuples.
- Transaction ID wraparound: after ~2 billion XIDs PostgreSQL must freeze old tuples or it shuts down to prevent data corruption.
8.2 Per-Table Autovacuum Tuning¶
-- High-write table: agent_events
ALTER TABLE agent_events SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead (default 20%)
autovacuum_vacuum_cost_delay = 1, -- less sleeping between cost cycles (default 2ms)
autovacuum_vacuum_cost_limit = 1000, -- more pages per wake cycle (default 200)
autovacuum_analyze_scale_factor = 0.005 -- re-analyze at 0.5% changes
);
-- Low-write lookup table: templates
ALTER TABLE templates SET (
autovacuum_vacuum_scale_factor = 0.2, -- default is fine
autovacuum_enabled = true
);
8.3 Monitoring Dead Tuples¶
-- Top bloated tables
SELECT relname,
n_live_tup,
n_dead_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
-- Check for impending wraparound (should stay below 200M for comfort)
SELECT relname,
age(relfrozenxid) AS xid_age,
pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
8.4 Manual Vacuum for Emergencies¶
-- Regular vacuum (non-blocking, reclaims space for reuse)
VACUUM (VERBOSE) agent_events;
-- VACUUM FULL: rewrites the entire table (EXCLUSIVE LOCK -- blocks all queries)
-- Only use for extreme bloat when table is e.g. 50% dead tuples
VACUUM FULL agent_events;
-- pg_repack: online alternative to VACUUM FULL (no long lock)
-- CREATE EXTENSION pg_repack;
-- $ pg_repack -t agent_events -d warmwind_production
9. Replication Overview¶
| Type | Replicates | Latency | Use Case |
|---|---|---|---|
| Streaming (physical) | Entire WAL byte stream | Sub-second | HA failover, read replicas with identical schema |
| Logical | Decoded row changes per publication | Seconds | Selective replication, CDC to external systems, major-version upgrades |
-- Logical replication: publish agent_events to analytics cluster
CREATE PUBLICATION warmwind_events FOR TABLE agent_events;
-- On subscriber:
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=primary port=5432 dbname=warmwind_production'
PUBLICATION warmwind_events;
10. PostgreSQL 17 Highlights¶
What's new (2025--2026)
- Incremental backup (`pg_basebackup --incremental`): recovery from 78 min to 4 min in benchmarks.
- `JSON_TABLE`: SQL-standard function for normalizing JSON into rows/columns (replaces `jsonb_to_recordset` gymnastics).
- Improved BRIN: better handling of out-of-order inserts reduces false positives.
- `MERGE` improvements: `RETURNING` clause, better performance on partitioned targets.
- Bulk `COPY` speedup: 2x faster for large imports via optimized WAL insertion.
- `pg_stat_checkpointer`: new view replacing `pg_stat_bgwriter` for checkpoint-specific monitoring.
Glossary¶
EXPLAIN ANALYZE
:   Executes a query and returns the actual execution plan with timing, row counts, and buffer statistics. Without `ANALYZE`, PostgreSQL shows only the estimated plan.

B-tree Index
:   Default index type. Supports `=`, `<`, `>`, `BETWEEN`, `IN`, `LIKE 'prefix%'`, `IS NULL`, and `ORDER BY`. Tree height is O(log N), typically 3-4 levels for millions of rows.

GIN Index
:   Generalized Inverted Index. Maps each key (JSONB key, array element, tsvector lexeme) to a posting list of heap TIDs. Fast reads, slower writes (use `fastupdate` to batch insertions).

BRIN Index
:   Block Range INdex. Stores min/max values per block range (default 128 pages). Tiny on disk (~0.1% of B-tree size). Only effective if physical row order correlates with the indexed column (e.g., `created_at` on an append-only table).

GiST Index
:   Generalized Search Tree. Supports geometric types (`&&` overlap, `<->` distance), range types, and full-text search (an alternative to GIN with different trade-offs).

PgBouncer
:   Lightweight connection pooler sitting between the application and PostgreSQL. Reduces per-connection memory overhead (roughly 10 MB per backend in PostgreSQL) and caps total backend connections.

Partition Pruning
:   Planner optimization that eliminates partitions from the scan based on `WHERE` clause constraints on the partition key. Visible as `Subplans Removed` in `EXPLAIN`.

MVCC (Multi-Version Concurrency Control)
:   PostgreSQL's concurrency model: each transaction sees a snapshot of the database. Writers never block readers. Dead row versions are reclaimed by `VACUUM`.

Row-Level Security (RLS)
:   Database-enforced access control that appends filter predicates to every query based on `USING`/`WITH CHECK` policies. Operates at the planner level, so queries never see rows they are not authorized to access.

Transaction ID Wraparound
:   PostgreSQL uses 32-bit transaction IDs. After ~2 billion transactions the counter wraps around. `VACUUM` freezes old tuples to prevent data loss; failure to vacuum eventually forces PostgreSQL to shut down preventatively.

LISTEN/NOTIFY
:   Lightweight pub/sub built into PostgreSQL. `NOTIFY channel, 'payload'` sends a message to all sessions that have executed `LISTEN channel`. No persistence: if no listener is connected, the message is lost.

pg_partman
:   Extension for automated creation and maintenance of time-based and serial-based partitions. Handles premake, retention, and partition detachment.

Logical Replication
:   Publishes decoded row changes (INSERT, UPDATE, DELETE) from a publication to a subscription. Allows selective replication of specific tables and cross-version replication.