Skip to main content

Monitor with pg_stat_statements

Beta

Lakebase Postgres (Autoscaling Beta) is the next version of Lakebase, available for evaluation only. For production workloads, use Lakebase Public Preview. See choosing between versions to understand which version is right for you.

pg_stat_statements is a Postgres extension that provides a detailed statistical view of SQL statement execution within your Lakebase Postgres database. It tracks information such as execution counts, total and average execution times, and more, helping you analyze and optimize SQL query performance.

When to use pg_stat_statements

Use pg_stat_statements when you need:

  • Detailed query execution statistics and performance metrics
  • Identification of slow or frequently executed queries
  • Query performance analysis and optimization insights
  • Database workload analysis and capacity planning
  • Integration with custom monitoring tools and dashboards

Enable pg_stat_statements

The pg_stat_statements extension is available in Lakebase Postgres. To enable it:

  1. Connect to your database using the SQL editor or a Postgres client.

  2. Run the following SQL command to create the extension:

    SQL
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
  3. The extension begins collecting statistics immediately after creation.

Data persistence

Statistics collected by the pg_stat_statements extension are stored in memory and aren't retained when your Lakebase compute is suspended or restarted. For example, if your compute scales down due to inactivity, any existing statistics are lost. New statistics are gathered once your compute restarts.

This behavior means that:

  • Statistics reset after compute restarts or suspensions
  • Long-running performance analysis requires consistent compute availability
  • You may want to export important statistics before planned maintenance or restarts
note

Consider running your monitoring queries regularly and storing results externally if you need historical performance data across compute lifecycle events.

Learn more: Postgres extensions

Query execution statistics

After enabling the extension, you can query execution statistics using the pg_stat_statements view. This view contains one row per distinct database query, showing various statistics:

SQL
SELECT * FROM pg_stat_statements LIMIT 10;

The view contains details like:

userid

dbid

queryid

query

calls

16391

16384

-9047282044438606287

SELECT * FROM users;

10

For a complete list of columns and descriptions, see the PostgreSQL documentation.

Key monitoring queries

Use these queries to analyze your database performance:

Find slowest queries

This query identifies queries with the highest average execution time, which can indicate inefficient queries that need optimization:

SQL
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
(total_exec_time / calls) AS avg_time_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;

Find most frequently executed queries

The most frequently executed queries are often critical paths and optimization candidates. This query includes cache hit ratios to help identify queries that might benefit from better indexing:

SQL
SELECT
query,
calls,
total_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;

Find queries with highest I/O

This query identifies queries that perform the most disk I/O operations, which can impact overall database performance:

SQL
SELECT
query,
calls,
shared_blks_read + shared_blks_written AS total_io,
shared_blks_read,
shared_blks_written
FROM pg_stat_statements
ORDER BY (shared_blks_read + shared_blks_written) DESC
LIMIT 20;

Find most time-consuming queries

This query identifies queries that consume the most total execution time across all runs:

SQL
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Find queries that return many rows

This query identifies queries that return large result sets, which might benefit from pagination or filtering:

SQL
SELECT
query,
calls,
rows,
(rows / calls) AS avg_rows_per_call
FROM pg_stat_statements
ORDER BY rows DESC
LIMIT 10;

Reset statistics

To reset the statistics collected by pg_stat_statements:

note

Only databricks_superuser roles have the privilege required to execute this function. The default role created with a Lakebase project and roles created in the Lakebase App are granted membership in the databricks_superuser role.

SQL
SELECT pg_stat_statements_reset();

This function clears all accumulated statistical data, such as execution times and counts for SQL statements, and begins collecting fresh data. It's particularly useful when you want to start fresh with collecting performance statistics.

Resources

Learn more: PostgreSQL documentation