
12 seconds to scan 3,000 rows — oops

Two things saved me this week: EXPLAIN ANALYZE and SET LOCAL enable_nestloop = off. The first is Postgres' way of showing you what the query planner actually did — not what it thinks it'll do, but real execution times, real row counts, real join strategies. The second is a transaction-scoped override that tells the planner "you are not allowed to use nested loop joins for this query." You wrap it in a BEGIN / COMMIT block, it applies only for the rest of that transaction and doesn't leak to anything else on the connection. It's ugly. It's a blunt instrument. It gave me a 16× speedup on a query where three increasingly sophisticated SQL rewrites each made things worse.
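In practice the two combine into a tight loop: open a transaction, flip the planner setting, read the real plan, roll it all back. A sketch (the view and column names are made up for illustration):

```sql
BEGIN;

-- Transaction-scoped: reverts automatically at COMMIT/ROLLBACK,
-- so nothing else sharing this connection is affected.
SET LOCAL enable_nestloop = off;

-- EXPLAIN ANALYZE actually executes the query and reports real
-- timings and row counts next to the planner's estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM latest_order_per_customer  -- hypothetical view over a window function
WHERE region = 'EU';

COMMIT;
```

Compare the `rows=` estimates against the `actual rows=` figures in the output; a wildly low estimate feeding a join is the smoking gun.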

The problem: Postgres' WindowAgg node — the thing that executes your ROW_NUMBER(), RANK(), SUM() OVER (...) — gives the planner nothing useful to estimate with. When you filter on a window function's output (the classic WHERE rn = 1), the planner has no selectivity statistics for that expression, gives up, and estimates rows=1. If that filter sits inside a view that feeds into downstream joins, that single bad estimate cascades into every subsequent join decision.
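The query shape that triggers it looks something like this (all table, view, and column names here are invented for illustration):

```sql
-- A "latest row per group" view built on ROW_NUMBER().
CREATE VIEW latest_order_per_customer AS
SELECT *
FROM (
  SELECT o.*,
         ROW_NUMBER() OVER (PARTITION BY o.customer_id
                            ORDER BY o.created_at DESC) AS rn
  FROM orders o
) ranked
WHERE rn = 1;  -- the planner can't estimate how many rows survive this

-- Joining through the view: downstream joins get planned as if the
-- view produced ~1 row, so nested loops look free.
SELECT c.name, v.total
FROM latest_order_per_customer v
JOIN customers c ON c.id = v.customer_id;
```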

The planner picks nested loops because nested loops are optimal for one row. You don't have one row. You have thousands. Each iteration re-scans a multi-thousand-row view. Millions of intermediate rows are generated and immediately discarded. Your query takes 12 seconds on 3,000 source rows, and you lose an hour of your life trying rewrites that all fail because the underlying estimation bug poisons every query shape you throw at it. SET LOCAL enable_nestloop = off takes nested loops off the table, so the planner falls back to hash joins, which tear through the actual data in under a second. It's a known limitation with no clean workaround on managed Postgres, where you can't patch the planner or necessarily install hint extensions. Sometimes the pragmatic fix is the right one.
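The "doesn't leak" part is easy to verify yourself, since SET LOCAL lives and dies with its transaction:

```sql
BEGIN;
SET LOCAL enable_nestloop = off;
SHOW enable_nestloop;   -- off, inside the transaction
COMMIT;
SHOW enable_nestloop;   -- back to the session default (on)
```

A bare SET (without LOCAL) would instead stick for the whole session, which is exactly what you don't want on a pooled connection.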