Another Avenue of Database Testing

Re: Production query plans without production data.

Radim Marek on 2026-03-08:

PostgreSQL 18 changed that. Two new functions: pg_restore_relation_stats and pg_restore_attribute_stats write numbers directly into the catalog tables. Combined with pg_dump --statistics-only, you can treat optimizer statistics as a deployable artifact. Compact, portable, plain SQL.

This article was both informative and enlightening; I recommend reading it in full. At work, my team works primarily with Postgres, schema migration done with Flyway SQL scripts, and queried from Kotlin REST services. Often the queries are written with JPA Repositories, backed by Hibernate. The article has inspired two intriguing ideas in me.

Manual Query Testing

By company policy, software engineers don't have write access to the production databases, and require business justification to have read access. Infrequently I've run into situations where a migration script or service query in development contain write statements. As a result we can't EXPLAIN the queries (no write access means EXPLAIN UPDATE ... is rejected) or get an idea of its performance characteristics ahead of time. In the past we've had to create a ticket for a database admin to EXPLAIN the queries for us.

With the availability of these statistics, it would be feasible to dump the production database statistics (a read operation) to a developer's machine. Then a script could run Postgres in a docker image locally, apply the known Flyway migrations, and then apply the production statistics. This would result in a locally running Postgres instance with read and write permissions, and with table/column statistics that matches production — perfect for EXPLAINing the in-development write queries.

Automated Query Testing

Less feasible, but similarly intriguing, is using the production database statistics to assert performance characteristics of the known queries used in a service. A similar process to the manual query testing could be implemented, with automated tests written to assert on EXPLAIN results for each of the JPA-defined queries. The only difficulty is in how, and how often, to update the captured production statistics. We don't have much of an avenue for automated capture of these statistics, so I'm not sure how valuable these tests would be with potentially months-stale data.