PostgreSQL query tuning checklist

January 19, 2023

In this post, we are going to see how to improve PostgreSQL queries. Performance is one of the most important factors for end users. No one wants to wait for 10 minutes to get the results of the SELECT query. 

There is no fixed formula for improving an PostgreSQL query, as query performance can be affected by many things, such as the hardware of the system (RAM, CPU, Disk type, Disk configurations). Here is a checklist (or hints) of places where we should look to analyze workload or queries that are running very slowly. Performance is a very vast topic so it is also recommended to refer to the official online PostgreSQL documentation for this topic: https://www.postgresql.org/docs/12/performance-tips.html.

 

1. EXPLAIN  

The EXPLAIN command shows the execution plan of a statement and is based on the statistics about the table. Always remember the right plan is very critical for good performance. The EXPLAIN command will break down how PostgreSQL will execute the SQL query, but it is more of an estimation. We can use the ANALYZE keyword in the explain plan, which actually executes the query, and then displays the true row counts and true run time accumulated within each plan node.

 

2. VACUUM

The VACUUM command basically removes the tuples that are deleted or made obsolete by an update but were not physically removed from their table. This is very helpful to avoid bloating. It is good practice to analyze a keyword with VACUUM

Please refer to the PostgreSQL documentation for more details:

https://www.postgresql.org/docs/12/sql-vacuum.html.

 

3. ANALYZE

ANALYZE collects statistics about the contents of tables in the database, which helps the query planner to find the most efficient execution plans for queries.

Please refer to the PostgreSQL documentation for more details:

https://www.postgresql.org/docs/12/sql-analyze.html.

 

4. Database Indexes

It is always recommended to create indexes on a table. Otherwise, PostgreSQL will execute a full table scan, which makes executing a query very slow.   

The types of indexes available in PostgreSQL are B-tree (the default index), hash, GiST, SP-GiST, and GIN. PostgreSQL creates implicit indexes when the table has a primary/unique key. 

Also, we should try to avoid creating unused or unnecessary indexes on a table, as these could also affect performance.

We can also use the REINDEX command to fix unusable indexes or when an index gets bloated.

Please refer to the PostgreSQL documentation for more details: 

https://www.postgresql.org/docs/12/sql-createindex.html.

 

5. auto_explain module

The auto_explain module is used for logging execution plans of slow statements automatically. 

Please refer to the PostgreSQL documentation for more details: https://www.postgresql.org/docs/12/auto-explain.html.

 

6. pg_stat_statements

The pg_stat_statements module is used for tracking the execution statistics of SQL statements to identify the queries that are slow. 

Please refer to the PostgreSQL documentation for more details:

https://www.postgresql.org/docs/12/pgstatstatements.html.

 

7. Logging

log_min_duration_statement (integer) can be helpful in tracking down unoptimized queries.

 

8. pgBadger 

pgBadger is an open source tool that generates a detailed report of activity on the database server, including temp files, slow queries, etc. It can be downloaded from here: https://pgbadger.darold.net.

 

9. PostgreSQL configuration parameters

We may use some of the PostgreSQL configuration parameters, which we can change to get better performance. To do this we need to edit the postgresql.conf file, which resides under $data/ directory of your installation. Parameters we can configure include max_connections, checkpoint_segments, work_mem, and random_page_cost.

Please refer to the PostgreSQL documentation for more details:

https://www.postgresql.org/docs/12/runtime-config-resource.html

https://www.postgresql.org/docs/12/runtime-config-query.html

 

Hope it helps!

 

Share this