PostgreSQL Deep Dive¶
PostgreSQL is Warmwind's primary data store for relational container metadata and
semi-structured AI agent event logs in JSONB. This chapter walks through real
query plans, index selection rationale, JSONB operator semantics, connection
pooling, Row-Level Security for multi-tenant isolation, LISTEN/NOTIFY for
real-time event propagation, table partitioning with partition pruning, and
vacuum tuning -- all illustrated with EXPLAIN ANALYZE output you can reproduce
on a local instance.
1. Reading EXPLAIN ANALYZE -- A Complete Walkthrough¶
Coming from JPA
In JPA/Hibernate you enable SQL logging with hibernate.show_sql=true and
maybe add hibernate.generate_statistics=true. That shows what SQL ran
but not how the planner executed it. EXPLAIN ANALYZE is the PostgreSQL
equivalent of -XX:+PrintCompilation -- it reveals the execution engine's
actual decisions. Learn to read plans and you will debug performance issues
faster than any ORM-level logging allows.
1.1 The Query¶
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ae.id, ae.payload->>'tool' AS tool_name,
(ae.payload->>'duration_ms')::int AS duration_ms
FROM agent_events ae
WHERE ae.container_id = 'c9f0a1b2-3d4e-5f67-8901-abcdef012345'
AND ae.created_at > NOW() - INTERVAL '1 hour'
AND ae.payload @> '{"type": "tool_call"}'
ORDER BY ae.created_at DESC
LIMIT 50;
1.2 Annotated Plan Output¶
Limit (cost=12.45..12.47 rows=50 width=80)
(actual time=0.187..0.201 rows=50 loops=1)
-> Sort (cost=12.45..12.52 rows=73 width=80)
(actual time=0.185..0.193 rows=50 loops=1)
Sort Key: ae.created_at DESC
Sort Method: top-N heapsort Memory: 32kB -- ❶
-> Bitmap Heap Scan on agent_events_2026_03 ae -- ❷
(cost=8.53..11.92 rows=73 width=80)
(actual time=0.098..0.142 rows=73 loops=1)
Recheck Cond: (container_id = 'c9f0a1b2-...'::uuid)
Filter: (payload @> '{"type": "tool_call"}'::jsonb)
Rows Removed by Filter: 12 -- ❸
Heap Blocks: exact=5
Buffers: shared hit=8 -- ❹
-> Bitmap Index Scan on idx_events_container
(cost=0.00..8.51 rows=85 width=0)
(actual time=0.071..0.071 rows=85 loops=1)
Index Cond: (container_id = 'c9f0a1b2-...'::uuid)
Buffers: shared hit=3
Planning Time: 0.254 ms
Execution Time: 0.228 ms
| Annotation | What It Tells You |
|---|---|
| ❶ | top-N heapsort means PostgreSQL only keeps the N smallest/largest rows in memory instead of sorting everything -- triggered by LIMIT. |
| ❷ | The planner pruned to agent_events_2026_03 -- only the March 2026 partition was scanned (partition pruning in action). |
| ❸ | 12 rows matched the B-tree index on container_id but were eliminated by the JSONB @> filter. If this ratio grows, add a GIN index on payload. |
| ❹ | shared hit=8 means all 8 buffer pages came from PostgreSQL's shared buffer cache -- zero disk I/O. |
1.3 Estimated vs Actual Rows¶
When these diverge by more than 10x, statistics are stale. Fix it:
-- Per-table refresh (non-blocking)
ANALYZE agent_events;
-- Increase sampling for a specific column
ALTER TABLE agent_events ALTER COLUMN container_id SET STATISTICS 1000;
ANALYZE agent_events;
Stale statistics are the #1 cause of bad plans
Autovacuum runs ANALYZE implicitly, but on high-churn tables the default
thresholds may be too conservative. See Vacuum Tuning
below.
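The 10x rule above can be mechanized when post-processing plan rows. A minimal sketch (function and parameter names are illustrative, not a PostgreSQL API):

```typescript
// Flag plan nodes whose estimated row count diverges from the actual
// row count by more than `factor` in either direction -- the usual
// signal that ANALYZE is overdue on the underlying table.
function statsAreStale(
  estimatedRows: number,
  actualRows: number,
  factor = 10,
): boolean {
  const est = Math.max(estimatedRows, 1); // guard against divide-by-zero
  const act = Math.max(actualRows, 1);
  return est / act > factor || act / est > factor;
}

console.log(statsAreStale(73, 73));    // false -- the plan above is healthy
console.log(statsAreStale(100, 5000)); // true  -- time to ANALYZE
```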
2. Index Decision Tree¶
Use this decision tree when adding an index:
flowchart TD
START["New slow query"] --> Q1{"What operator?"}
Q1 -->|"=, <, >, BETWEEN, IN, LIKE 'prefix%'"| BTREE["B-tree (default)"]
Q1 -->|"@>, ?, ?&, ?|, full-text @@"| GIN["GIN"]
Q1 -->|"geometric &&, <<, range overlap"| GIST["GiST"]
Q1 -->|"time column, append-only"| BRIN["BRIN"]
Q1 -->|"equality only, no range"| HASH["Hash"]
BTREE --> PARTIAL{"High-selectivity subset?"}
PARTIAL -->|Yes| PBTREE["Partial B-tree\nWHERE status = 'ACTIVE'"]
PARTIAL -->|No| COVER{"Covering index\nneeded?"}
COVER -->|Yes| INCL["INCLUDE (col1, col2)"]
COVER -->|No| DONE["CREATE INDEX"]
GIN --> JSONB_OPS{"Querying keys\nor paths?"}
JSONB_OPS -->|"@> path only"| PATHOPS["jsonb_path_ops\n(3x smaller)"]
JSONB_OPS -->|"?, ?|, ?& key existence"| DEFOPS["default jsonb_ops"]
Index Creation Examples¶
-- B-tree: equality and range scans on container_id
CREATE INDEX idx_events_container
ON agent_events (container_id);
-- B-tree partial: only index active sessions (high selectivity)
CREATE INDEX idx_sessions_active
ON agent_sessions (user_id, started_at DESC)
WHERE ended_at IS NULL;
-- GIN with jsonb_path_ops: only supports @> containment
CREATE INDEX idx_events_payload
ON agent_events USING GIN (payload jsonb_path_ops);
-- GIN default ops: supports @>, ?, ?|, ?& operators
CREATE INDEX idx_events_payload_keys
ON agent_events USING GIN (payload);
-- BRIN: excellent for append-only time-series (tiny index, ~0.1% of B-tree size)
CREATE INDEX idx_events_created
ON agent_events USING BRIN (created_at)
WITH (pages_per_range = 32);
-- Covering index: index-only scan avoids heap access
CREATE INDEX idx_events_cover
ON agent_events (container_id, created_at DESC)
INCLUDE (payload);
-- Hash: pure equality, slightly faster than B-tree for = only
CREATE INDEX idx_events_hash_container
ON agent_events USING HASH (container_id);
Verify Index Usage¶
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM agent_events WHERE container_id = 'c9f0a1b2-...';
-- Check index hit rate across all tables
SELECT schemaname, relname, idx_scan, seq_scan,
ROUND(100.0 * idx_scan / NULLIF(idx_scan + seq_scan, 0), 1) AS idx_pct
FROM pg_stat_user_tables
ORDER BY seq_scan DESC;
3. JSONB Deep Dive¶
3.1 Operators Reference¶
| Operator | Description | Example | Uses GIN? |
|---|---|---|---|
| `->` | Get JSON object field (as jsonb) | `payload->'meta'` | No |
| `->>` | Get JSON object field (as text) | `payload->>'tool'` | No |
| `#>` | Get nested path (as jsonb) | `payload#>'{meta,version}'` | No |
| `#>>` | Get nested path (as text) | `payload#>>'{meta,version}'` | No |
| `@>` | Contains (left contains right) | `payload @> '{"type":"tool_call"}'` | Yes |
| `<@` | Contained by | `'{"type":"tool_call"}' <@ payload` | Yes |
| `?` | Key exists | `payload ? 'error'` | Yes (default ops) |
| `?\|` | Any key exists | `payload ?\| array['error','warning']` | Yes (default ops) |
| `?&` | All keys exist | `payload ?& array['tool','duration_ms']` | Yes (default ops) |
| `@?` | JSON path exists (PG12+) | `payload @? '$.meta ? (@.version > 2)'` | Yes |
| `@@` | JSON path predicate (PG12+) | `payload @@ '$.duration_ms > 5000'` | Yes |
Coming from JPA
JPA's @Column(columnDefinition = "jsonb") gives you storage but no query
support. Hibernate 6 added JsonNode mapping but still lacks operator
support, so you end up writing native queries. In TypeORM, JSONB queries go
through QueryBuilder with raw WHERE fragments. The key insight:
a jsonb_path_ops GIN index supports only @> (plus the jsonpath operators
@? and @@). If you need ?, ?|, or ?&, use the default jsonb_ops operator
class (larger index, broader operator support).
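To build intuition for @> semantics, here is a small TypeScript sketch that mirrors PostgreSQL's containment rules: for objects, every key/value pair on the right must be contained in the left; for arrays, every element on the right must be contained by some element on the left; scalars compare by equality. (Illustrative only; it omits the one special case where a top-level jsonb array contains a bare scalar.)

```typescript
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

function isObj(v: Json): v is { [k: string]: Json } {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function jsonbContains(left: Json, right: Json): boolean {
  // Objects: every key/value in right must be contained in left
  if (isObj(left) && isObj(right)) {
    const l = left, r = right;
    return Object.entries(r).every(([k, v]) => k in l && jsonbContains(l[k], v));
  }
  // Arrays: each element of right must be contained by some element of left
  if (Array.isArray(left) && Array.isArray(right)) {
    const l = left, r = right;
    return r.every((rv) => l.some((lv) => jsonbContains(lv, rv)));
  }
  // Scalars: plain equality
  return left === right;
}

// Mirrors: WHERE payload @> '{"type": "tool_call"}'
const payload: Json = { type: 'tool_call', tool: 'browser', meta: { v: 2 } };
console.log(jsonbContains(payload, { type: 'tool_call' })); // true
console.log(jsonbContains(payload, { type: 'error' }));     // false
```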
3.2 Practical Queries¶
-- Find all tool calls that took over 5 seconds
SELECT id,
payload->>'tool' AS tool_name,
(payload->>'duration_ms')::int AS duration_ms,
payload#>>'{meta,model}' AS model
FROM agent_events
WHERE payload @> '{"type": "tool_call"}' -- GIN index hit
AND (payload->>'duration_ms')::int > 5000 -- filter after index
ORDER BY (payload->>'duration_ms')::int DESC;
-- Aggregate tool usage per session
SELECT s.id AS session_id,
jsonb_object_agg(
ae.payload->>'tool',
ae.payload->>'duration_ms'
) AS tool_durations
FROM agent_sessions s
JOIN agent_events ae ON ae.session_id = s.id
WHERE ae.payload @> '{"type": "tool_call"}'
GROUP BY s.id;
-- PostgreSQL 17: JSON_TABLE for normalized output
SELECT jt.*
FROM agent_events ae,
JSON_TABLE(ae.payload, '$'
COLUMNS (
tool_name TEXT PATH '$.tool',
duration INT PATH '$.duration_ms',
model TEXT PATH '$.meta.model'
)
) AS jt
WHERE ae.container_id = 'c9f0a1b2-...'
AND ae.payload @> '{"type": "tool_call"}';
3.3 Indexing Strategy for JSONB¶
-- Strategy 1: jsonb_path_ops -- 3x smaller, only supports @>
CREATE INDEX idx_payload_path ON agent_events USING GIN (payload jsonb_path_ops);
-- Strategy 2: expression index on a frequently extracted field
CREATE INDEX idx_payload_tool ON agent_events ((payload->>'tool'))
WHERE payload @> '{"type": "tool_call"}';
-- Strategy 3: generated column + B-tree (PG12+)
ALTER TABLE agent_events
ADD COLUMN tool_name TEXT GENERATED ALWAYS AS (payload->>'tool') STORED;
CREATE INDEX idx_tool_name ON agent_events (tool_name);
4. Connection Pooling with PgBouncer¶
4.1 Modes Compared¶
| Mode | Connection Returned | Session State Preserved | Use Case |
|---|---|---|---|
| session | On client disconnect | Yes (SET, PREPARE, LISTEN) | LISTEN/NOTIFY consumers, long-lived workers |
| transaction | After COMMIT/ROLLBACK | No | API request handlers (Warmwind default) |
| statement | After each statement | No | Simple read-only queries, connection-starved setups |
4.2 Production Configuration¶
;; /etc/pgbouncer/pgbouncer.ini
[databases]
warmwind = host=127.0.0.1 port=5432 dbname=warmwind_production
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
;; Pool sizing: (CPU cores * 2) + effective_spindle_count
;; For 8-core + NVMe: (8 * 2) + 1 = 17, round to 20
pool_mode = transaction
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
;; Timeouts
server_idle_timeout = 300
client_idle_timeout = 600
query_timeout = 30
query_wait_timeout = 120
;; Monitoring
stats_period = 60
log_connections = 0
log_disconnections = 0
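The pool-sizing rule of thumb from the comments above can be written down as a tiny helper (a sketch; the function name is illustrative, not a PgBouncer API):

```typescript
// (CPU cores * 2) + effective spindle count; NVMe counts as one "spindle".
// Round the result up to a comfortable margin when setting default_pool_size.
function suggestedPoolSize(cpuCores: number, effectiveSpindles: number): number {
  return cpuCores * 2 + effectiveSpindles;
}

// 8-core host with a single NVMe drive:
console.log(suggestedPoolSize(8, 1)); // 17 -- rounded to 20 in pgbouncer.ini
```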
4.3 Monitoring Pool Health¶
-- Connect to PgBouncer admin console
-- psql -p 6432 -U pgbouncer pgbouncer
SHOW POOLS;
-- database | user | cl_active | cl_waiting | sv_active | sv_idle | sv_used
-- warmwind | app | 45 | 0 | 12 | 8 | 0
SHOW STATS;
-- total_xact_count | total_query_count | avg_xact_time | avg_query_time
Transaction-mode gotchas
SET statements, prepared statements, LISTEN/NOTIFY, and advisory locks
do not survive transaction boundaries in transaction mode. If your
NestJS code uses SET app.current_tenant = '...' for RLS, wrap it in the
same transaction as your queries.
5. Row-Level Security for Multi-Tenant Isolation¶
5.1 Complete Working Example¶
-- Step 1: Create the tenant-scoped tables
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE containers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
name TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'PENDING',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE agent_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
container_id UUID NOT NULL REFERENCES containers(id),
tenant_id UUID NOT NULL REFERENCES tenants(id),
user_id UUID NOT NULL,
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
ended_at TIMESTAMPTZ
);
-- Step 2: Create a non-superuser application role
CREATE ROLE app_user LOGIN PASSWORD 'strong-random-password';
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO app_user;
-- Step 3: Enable RLS (superuser/table owner bypasses by default)
ALTER TABLE containers ENABLE ROW LEVEL SECURITY;
ALTER TABLE agent_sessions ENABLE ROW LEVEL SECURITY;
-- Force RLS even for table owner (important for testing)
ALTER TABLE containers FORCE ROW LEVEL SECURITY;
ALTER TABLE agent_sessions FORCE ROW LEVEL SECURITY;
-- Step 4: Define policies
CREATE POLICY tenant_isolation_containers ON containers
FOR ALL
TO app_user
USING (tenant_id = current_setting('app.current_tenant')::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation_sessions ON agent_sessions
FOR ALL
TO app_user
USING (tenant_id = current_setting('app.current_tenant')::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant')::uuid);
-- Step 5: Test isolation
INSERT INTO tenants (id, name) VALUES
('aaaa0000-0000-0000-0000-000000000001', 'Acme Corp'),
('bbbb0000-0000-0000-0000-000000000002', 'Globex Inc');
INSERT INTO containers (tenant_id, name) VALUES
('aaaa0000-0000-0000-0000-000000000001', 'acme-agent-1'),
('bbbb0000-0000-0000-0000-000000000002', 'globex-agent-1');
-- As app_user:
SET ROLE app_user;
SET app.current_tenant = 'aaaa0000-0000-0000-0000-000000000001';
SELECT * FROM containers;
-- Returns: acme-agent-1 only
SET app.current_tenant = 'bbbb0000-0000-0000-0000-000000000002';
SELECT * FROM containers;
-- Returns: globex-agent-1 only
RESET ROLE;
5.2 NestJS Integration¶
// middleware/tenant.middleware.ts
@Injectable()
export class TenantMiddleware implements NestMiddleware {
constructor(private dataSource: DataSource) {}
async use(req: Request, _res: Response, next: NextFunction) {
const tenantId = req.user?.tenantId; // from JWT payload
if (!tenantId) throw new ForbiddenException('No tenant context');
    // Set tenant for the duration of this request's connection.
    // Note: SET does not accept bind parameters ($1) -- use set_config() instead.
    const queryRunner = this.dataSource.createQueryRunner();
    await queryRunner.connect();
    await queryRunner.query(`SELECT set_config('app.current_tenant', $1, false)`, [tenantId]);
// Attach queryRunner to request so services use the same connection
req['queryRunner'] = queryRunner;
next();
}
}
RLS + PgBouncer transaction mode
SET app.current_tenant must happen inside every transaction because
PgBouncer in transaction mode recycles connections between requests.
Use TypeORM's queryRunner.startTransaction() to guarantee the SET
and your queries share the same connection.
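One way to honor that constraint is to build the whole statement sequence up front, with the tenant GUC applied via set_config(..., true) (the SET LOCAL form, which resets at COMMIT/ROLLBACK). A sketch only; tenantTransactionStatements is a hypothetical helper, not a TypeORM API, and the UUID check exists because the tenant id is interpolated into SQL:

```typescript
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the statements to run, in order, on ONE pooled connection.
// Keeping SET and the queries inside one transaction means PgBouncer's
// transaction mode cannot hand the tenant setting to another client.
function tenantTransactionStatements(
  tenantId: string,
  queries: string[],
): string[] {
  if (!UUID_RE.test(tenantId)) {
    throw new Error(`Invalid tenant id: ${tenantId}`);
  }
  return [
    'BEGIN',
    // true = "is_local": behaves like SET LOCAL, cleared at transaction end
    `SELECT set_config('app.current_tenant', '${tenantId}', true)`,
    ...queries,
    'COMMIT',
  ];
}
```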
6. LISTEN/NOTIFY for Real-Time Events¶
6.1 Trigger-Based Notification¶
-- Trigger function: emit JSON payload on status change
CREATE OR REPLACE FUNCTION notify_container_status()
RETURNS TRIGGER AS $$
BEGIN
IF OLD.status IS DISTINCT FROM NEW.status THEN
PERFORM pg_notify(
'container_status',
json_build_object(
'container_id', NEW.id,
'tenant_id', NEW.tenant_id,
'old_status', OLD.status,
'new_status', NEW.status,
'changed_at', NOW()
)::text
);
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_container_status_change
AFTER UPDATE OF status ON containers
FOR EACH ROW
EXECUTE FUNCTION notify_container_status();
6.2 NestJS Listener (pg driver)¶
import { Client } from 'pg';
@Injectable()
export class ContainerNotificationService implements OnModuleInit, OnModuleDestroy {
private pgClient: Client;
constructor(
private config: ConfigService,
private eventEmitter: EventEmitter2,
) {}
async onModuleInit() {
// Dedicated connection -- NOT pooled (LISTEN requires session mode)
this.pgClient = new Client({
connectionString: this.config.get('DATABASE_URL'),
});
await this.pgClient.connect();
await this.pgClient.query('LISTEN container_status');
this.pgClient.on('notification', (msg) => {
const payload = JSON.parse(msg.payload!);
this.eventEmitter.emit('container.status_changed', payload);
});
}
async onModuleDestroy() {
await this.pgClient.query('UNLISTEN *');
await this.pgClient.end();
}
}
6.3 Limitations¶
| Constraint | Detail |
|---|---|
| Payload limit | 8000 bytes per notification. For large payloads, send the ID and let the listener query the row. |
| No persistence | If no listener is connected, the notification is lost. For guaranteed delivery, use a pending_events table + polling or an external queue (Redis Streams, NATS). |
| Connection | LISTEN requires a dedicated, non-pooled connection (PgBouncer session mode or a raw pg.Client). |
| Throughput | pg_notify is synchronous within the trigger. At >1000 notifications/sec, consider batching or an outbox pattern. |
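The payload limit in the table above can also be enforced on the application side when code (rather than a trigger) emits the notification. A sketch, assuming the convention from the table: oversized payloads fall back to sending just the id so the listener can re-query the row. notifyPayload is a hypothetical helper:

```typescript
// Serialize an event for pg_notify, respecting the ~8000-byte limit.
// If the full payload is too large, send only the container_id plus a
// marker so the listener knows to fetch the row itself.
function notifyPayload(event: {
  container_id: string;
  [k: string]: unknown;
}): string {
  const full = JSON.stringify(event);
  // TextEncoder gives the UTF-8 byte length, not the JS string length
  if (new TextEncoder().encode(full).length < 8000) return full;
  return JSON.stringify({ container_id: event.container_id, truncated: true });
}
```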
7. Table Partitioning¶
7.1 Schema¶
CREATE TABLE agent_events (
id UUID DEFAULT gen_random_uuid(),
container_id UUID NOT NULL,
session_id UUID,
tenant_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, created_at) -- PK must include partition key
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE agent_events_2026_01 PARTITION OF agent_events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE agent_events_2026_02 PARTITION OF agent_events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE agent_events_2026_03 PARTITION OF agent_events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Default partition catches anything outside defined ranges
CREATE TABLE agent_events_default PARTITION OF agent_events DEFAULT;
-- Indexes are created per-partition automatically when defined on the parent
CREATE INDEX idx_events_container ON agent_events (container_id);
CREATE INDEX idx_events_payload ON agent_events USING GIN (payload jsonb_path_ops);
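Hand-writing one CREATE TABLE per month gets old; before reaching for pg_partman, the DDL above can be generated mechanically. A sketch that follows this chapter's naming convention (agent_events_YYYY_MM); adapt the names to your schema:

```typescript
// Generate the monthly range-partition DDL for agent_events,
// handling the December -> January rollover.
function monthlyPartitionDDL(year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  const from = `${year}-${pad(month)}-01`;
  const [nextYear, nextMonth] = month === 12 ? [year + 1, 1] : [year, month + 1];
  const to = `${nextYear}-${pad(nextMonth)}-01`;
  const name = `agent_events_${year}_${pad(month)}`;
  return (
    `CREATE TABLE ${name} PARTITION OF agent_events\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}

console.log(monthlyPartitionDDL(2026, 12));
// CREATE TABLE agent_events_2026_12 PARTITION OF agent_events
//   FOR VALUES FROM ('2026-12-01') TO ('2027-01-01');
```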
7.2 EXPLAIN Showing Partition Pruning¶
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM agent_events
WHERE created_at >= '2026-03-10' AND created_at < '2026-03-20'
AND container_id = 'c9f0a1b2-...';
Append (actual time=0.032..0.051 rows=24 loops=1)
Subplans Removed: 3 -- ❶ 3 partitions pruned
-> Index Scan using agent_events_2026_03_container_id_idx
on agent_events_2026_03 ae -- ❷ only March scanned
(actual time=0.030..0.047 rows=24 loops=1)
Index Cond: (container_id = 'c9f0a1b2-...'::uuid)
Filter: (created_at >= ... AND created_at < ...)
Planning Time: 0.412 ms
Execution Time: 0.073 ms
❶ Subplans Removed: 3 -- January, February, and the default partition were pruned at plan time.
❷ Only agent_events_2026_03 was scanned, and within that partition, the B-tree on container_id was used.
7.3 Automating Partition Creation¶
-- pg_partman extension (recommended for production)
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
p_parent_table := 'public.agent_events',
p_control := 'created_at',
p_type := 'native',
p_interval := 'monthly',
p_premake := 3 -- create 3 future partitions
);
-- Run maintenance periodically (cron or pg_cron)
SELECT partman.run_maintenance('public.agent_events');
7.4 Partition Detach for Archival¶
-- Non-blocking detach (PG14+)
ALTER TABLE agent_events DETACH PARTITION agent_events_2025_01 CONCURRENTLY;
-- Archive to S3 / cold storage, then drop
-- pg_dump -t agent_events_2025_01 | gzip | aws s3 cp - s3://warmwind-archive/...
DROP TABLE agent_events_2025_01;
8. Vacuum Tuning¶
8.1 Why Vacuum Matters (MVCC Recap)¶
PostgreSQL never overwrites rows in place. An UPDATE writes a new row version
and marks the old one as dead. DELETE only marks the row dead. Dead tuples
accumulate and cause:
- Table bloat: heap pages fill with dead rows, sequential scans slow down.
- Index bloat: indexes still point to dead tuples.
- Transaction ID wraparound: after ~2 billion XIDs PostgreSQL must freeze old tuples or it shuts down to prevent data corruption.
8.2 Per-Table Autovacuum Tuning¶
-- High-write table: agent_events
ALTER TABLE agent_events SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead (default 20%)
autovacuum_vacuum_cost_delay = 1, -- less sleeping between cost cycles (default 2ms)
autovacuum_vacuum_cost_limit = 1000, -- more pages per wake cycle (default 200)
autovacuum_analyze_scale_factor = 0.005 -- re-analyze at 0.5% changes
);
-- Low-write lookup table: templates
ALTER TABLE templates SET (
autovacuum_vacuum_scale_factor = 0.2, -- default is fine
autovacuum_enabled = true
);
8.3 Monitoring Dead Tuples¶
-- Top bloated tables
SELECT relname,
n_live_tup,
n_dead_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
-- Check for impending wraparound (should stay below 200M for comfort)
SELECT relname,
age(relfrozenxid) AS xid_age,
pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
8.4 Manual Vacuum for Emergencies¶
-- Regular vacuum (non-blocking, reclaims space for reuse)
VACUUM (VERBOSE) agent_events;
-- VACUUM FULL: rewrites the entire table (EXCLUSIVE LOCK -- blocks all queries)
-- Only use for extreme bloat when table is e.g. 50% dead tuples
VACUUM FULL agent_events;
-- pg_repack: online alternative to VACUUM FULL (no long lock)
-- CREATE EXTENSION pg_repack;
-- $ pg_repack -t agent_events -d warmwind_production
9. Replication Overview¶
| Type | Replicates | Latency | Use Case |
|---|---|---|---|
| Streaming (physical) | Entire WAL byte stream | Sub-second | HA failover, read replicas with identical schema |
| Logical | Decoded row changes per publication | Seconds | Selective replication, CDC to external systems, major-version upgrades |
-- Logical replication: publish agent_events to analytics cluster
CREATE PUBLICATION warmwind_events FOR TABLE agent_events;
-- On subscriber:
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=primary port=5432 dbname=warmwind_production'
PUBLICATION warmwind_events;
10. PostgreSQL 17 Highlights¶
What's new (2025--2026)
- Incremental backup (`pg_basebackup --incremental`): recovery from 78 min to 4 min in benchmarks.
- `JSON_TABLE`: SQL-standard function for normalizing JSON into rows/columns (replaces `jsonb_to_recordset` gymnastics).
- Improved BRIN: better handling of out-of-order inserts reduces false positives.
- `MERGE` improvements: `RETURNING` clause, better performance on partitioned targets.
- Bulk `COPY` speedup: 2x faster for large imports via optimized WAL insertion.
- `pg_stat_checkpointer`: new view replacing `pg_stat_bgwriter` for checkpoint-specific monitoring.
Glossary¶
EXPLAIN ANALYZE
:   Executes a query and returns the actual execution plan with timing, row counts, and buffer statistics. Without `ANALYZE`, PostgreSQL shows only the estimated plan.

B-tree Index
:   Default index type. Supports `=`, `<`, `>`, `BETWEEN`, `IN`, `LIKE 'prefix%'`, `IS NULL`, and `ORDER BY`. Tree height is O(log N), typically 3-4 levels for millions of rows.

GIN Index
:   Generalized Inverted Index. Maps each key (JSONB key, array element, tsvector lexeme) to a posting list of heap TIDs. Fast reads, slower writes (use `fastupdate` to batch insertions).

BRIN Index
:   Block Range INdex. Stores min/max values per block range (default 128 pages). Tiny on disk (~0.1% of B-tree size). Only effective if physical row order correlates with the indexed column (e.g., `created_at` on an append-only table).

GiST Index
:   Generalized Search Tree. Supports geometric types (`&&` overlap, `<->` distance), range types, and full-text search (an alternative to GIN with different trade-offs).

PgBouncer
:   Lightweight connection pooler sitting between the application and PostgreSQL. Reduces per-connection memory overhead (roughly 10 MB per backend in PostgreSQL) and caps total backend connections.

Partition Pruning
:   Planner optimization that eliminates partitions from the scan based on `WHERE` clause constraints on the partition key. Visible as `Subplans Removed` in `EXPLAIN`.

MVCC (Multi-Version Concurrency Control)
:   PostgreSQL's concurrency model: each transaction sees a snapshot of the database. Writers never block readers. Dead row versions are reclaimed by `VACUUM`.

Row-Level Security (RLS)
:   Database-enforced access control that appends filter predicates to every query based on `USING`/`WITH CHECK` policies. Operates at the planner level, so queries never see rows they are not authorized to access.

Transaction ID Wraparound
:   PostgreSQL uses 32-bit transaction IDs. After ~2 billion transactions the counter wraps around. `VACUUM` freezes old tuples to prevent data loss; failure to vacuum eventually forces PostgreSQL to shut down preventatively.

LISTEN/NOTIFY
:   Lightweight pub/sub built into PostgreSQL. `NOTIFY channel, 'payload'` sends a message to all sessions that have executed `LISTEN channel`. No persistence: if no listener is connected, the message is lost.

pg_partman
:   Extension for automated creation and maintenance of time-based and serial-based partitions. Handles premake, retention, and partition detachment.

Logical Replication
:   Publishes decoded row changes (INSERT, UPDATE, DELETE) from a publication to a subscription. Allows selective replication of specific tables and cross-version replication.