With BigAnimal (DBaaS) release, EDB has greatly reduced effort required from DBAs in the following:
- Provisioning system
- Postgres installation and configuration
- Setting up Postgres high availability
- Disaster recovery mitigation using automated backups
- Setting up a proper monitoring system for visibility
- Securing the system so that you could start fast and safe
However, there is still a critical activity that DBAs have to do, i.e., keep an eye on the performance of the database and system.
When we talk about performance, we talk about the system’s throughput. To maintain/increase throughput, DBAs have to monitor and tune a system accordingly.
Most of the time, monitoring and tuning include :
- Monitoring the system performance (CPU load, disk IO and memory usage, etc.)
- Adding resources like CPU/Memory to the system
- Increasing disk IOPs if disk IO is a bottleneck
- Optimizing operating system settings
- Tuning Postgres parameters based on workload monitoring
- Performing maintenance operations on the system based on monitoring like reindexing, creating new indexes, performing vacuum operation
- Identifying slow SQL/Queries in Postgres
In this blog, I will discuss options available for identifying slow queries.
Ways to find SQL slowness
Postgres out of the box gives multiple options to DBAs/developers to identify slow queries. Following is the list of those options in Postgres:
1. Logging of slow queries
log_min_duration_statement – this parameter causes the logging of SQL statements that exceed specified run time. For example, if the parameter is set to 1000ms, all SQL statements that run 1000ms (1 sec) or longer will be logged. This way, DBAs/Developers can track all slow queries for later optimization. Please note DBAs need to determine an appropriate threshold for this parameter based on their workload.
For more information, refer to the following link:
https://www.postgresql.org/docs/current/runtime-config-logging.html
2. EXPLAIN command
The EXPLAIN command allows DBAs to investigate SQL manually. This command shows the execution plan necessary to run the SQL. EXPLAIN with ANALYZE executes the SQL and shows the plan in addition to the planning time and execution time. For more information, please refer to the following link:
https://www.postgresql.org/docs/current/using-explain.html
3. Extensions:
a. pg_stat_statements
The pg_stat_statements extension groups identical queries and aggregates runtime information in a system view.
pg_stat_statements is really like a Swiss army knife. It allows you to understand what is really happening in your system. For more information, please refer to the following link:
https://www.postgresql.org/docs/current/pgstatstatements.html
b. auto_explain
The auto_explain extension provides ways of automatically logging slow queries and their execution plans, without running EXPLAIN by hand. This is very useful for tracking down un-optimized queries by execution plan in large applications. For more information, please refer to the following link:
https://www.postgresql.org/docs/current/auto-explain.html
4. Views
a. pg_stat_activity
This Postgres view helps in identifying slow-running queries or long-running queries. Using this view, DBAs can also investigate queries running on the system and causing performance issues in real-time. This view also gives information about WaitEvent if the backend process is waiting. For more information, please refer to the following link:
https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW
b. pg_locks
This view helps in identifying blocked queries. Using this view, DBAs can find out the queries which are causing performance degradation due to locks on the system. For more information, please refer to the following link:
https://www.postgresql.org/docs/current/view-pg-locks.html
A performance analysis gap to be filled
DBAs can find slow queries or slow running procedures in Postgres with the above methods. However, there is a gap in identifying slow-running queries within a procedure.
All the methods listed above don't give visibility to the statements inside of the procedure, which takes more time.
To solve this problem, EDB had added a package called DBMS_PROFILER to EDB Postgres Advanced Server version (formerly Postgres Plus Advanced Server) 9.1 as part of the Oracle compatibility package.
The DBMS_PROFILER package permits developers and DBAs to profile EDB SPL or PL/pgSQL procedures and functions’ run-time behavior, making it easier to spot performance bottlenecks that may be investigated more closely.
The DBMS_PROFILER package comes with the following procedures that allow you to start observing runtime behavior on-demand
- dbms_profiler.start_profiler – This procedure allows users to start collecting session data.
- dbms_profiler.flush_data – This procedure flushes the data collected in the current session without terminating the profiler session.
- dbms_profiler.stop_profiler – This procedure stops a profiling session and flushes the performance information to the DBMS_PROFILER tables and view.
and it keeps the raw data and data summary in the following tables:
- sys.plsql_profiler_rawdata – This table contains the performance counters and timers for each statement executed within the session.
- sys.plsql_profiler_runs – This table summarizes each run (aggregating the information found in sys.plsql_profiler_rawdata).
- sys.plsql_profiler_units – This table summarizes each code unit (function, procedure, trigger, or anonymous block) executed within a session.
Let’s have a look at how all these work together. First, let’s create a sample table and INSERT some data.
CREATE TABLE user_data (
id bigint,
data text
);
INSERT INTO user_data
SELECT
id,
cash_words(id::money)
FROM
generate_series(1, 100000) gen (id);
Next, we will create the same procedure
CREATE OR REPLACE PROCEDURE spl_perform_transaction (data_id bigint)
IS
found boolean := FALSE;
data_row user_data%ROWTYPE;
BEGIN
-- CHECK if VALUE Exists or not
SELECT CASE WHEN COUNT(1) > 0 THEN TRUE ELSE FALSE END INTO found FROM user_data WHERE id = data_id;
IF NOT found THEN
DBMS_OUTPUT.PUT_LINE ('Data ID doesn’t exists');
DBMS_OUTPUT.PUT_LINE ('Inserting ID '||data_id);
INSERT INTO user_data VALUES (data_id, cash_words(data_id::money));
ELSE
SELECT * INTO data_row FROM user_data WHERE id = data_id;
DBMS_OUTPUT.PUT_LINE ('Data ID already exists');
DBMS_OUTPUT.PUT_LINE ('ID: ' || data_row.id || ' | data: ' || data_row.data);
END IF;
END;
After creating a table and procedure, let’s start profiling the execution of the spl_perform_transaction procedure.
edb=# EXEC dbms_profiler.start_profiler('profile spl_perform_transaction_check');
EDB-SPL Procedure successfully completed
After starting the named profiling of a session, let’s execute the procedure:
edb=# EXEC dbms_profiler.start_profiler('profile spl_perform_transaction_check');
EDB-SPL Procedure successfully completed
edb=# exec spl_perform_transaction(3989890);
Data ID already exists
ID: 3989890 | data: Three million nine hundred eighty nine thousand eight hundred ninety dollars and zero cents
EDB-SPL Procedure successfully completed
edb=# exec spl_perform_transaction(3989899);
Data ID doesn’t exists
Inserting ID 3989899
Now, we can stop the profiler after the execution of the procedure.
edb=# EXEC dbms_profiler.stop_profiler;
EDB-SPL Procedure successfully completed
Now we have data to find total excess runtime in our procedure. To find the excess runtime, we need to start by finding the runid of our session profile. A developer can use the following query to find the runid:
edb=# SELECT runid FROM plsql_profiler_runs WHERE run_comment = 'profile spl_perform_transaction_check';
runid
-------
4
After finding the runid, we could use the following SQL to find the procedure’s line numbers which are slower than we’d like:
SELECT a.runid,
a.run_total_time / 1000000000 run_secs, c.total_occur,
c.total_time / 1000000000 line_total_secs, c.line#, u.text
FROM plsql_profiler_runs a,
plsql_profiler_units b,
plsql_profiler_data c,
user_source u
WHERE a.runid = b.runid
AND a.runid = c.runid
AND UPPER(split_part(b.unit_name,'(',1)) = u.NAME
AND c.line# = u.line
AND c.total_time / 1000000000 > 0
AND a.runid=4;
runid | run_secs | total_occur | line_total_secs | line# | text
-------+----------+-------------+------------------------+-------+----------------------------------------------------------------------------------------------------------
4 | 0 | 2 | 1.3e-14 | 4 | data_row user_data%ROWTYPE;
4 | 0 | 2 | 1.2165e-11 | 6 | -- CHECK if VALUE Exists or not
4 | 0 | 2 | 1.5999999999999998e-14 | 7 | SELECT CASE WHEN COUNT(1) > 0 THEN TRUE ELSE FALSE END INTO found FROM user_data WHERE id = data_id;
4 | 0 | 1 | 1.3200000000000002e-13 | 8 | IF NOT found THEN
4 | 0 | 1 | 9.5e-14 | 9 | DBMS_OUTPUT.PUT_LINE ('Data ID doesn’t exists');
4 | 0 | 1 | 7.8e-14 | 10 | DBMS_OUTPUT.PUT_LINE ('Inserting ID '||data_id);
4 | 0 | 1 | 5.9449999999999994e-12 | 12 | ELSE
4 | 0 | 1 | 5.2e-14 | 13 | SELECT * INTO data_row FROM user_data WHERE id = data_id;
4 | 0 | 1 | 2.4000000000000002e-14 | 14 | DBMS_OUTPUT.PUT_LINE ('Data ID already exists');
(9 rows)
The above output shows the line number with text and its total execution time. Based on this information, a developer and DBA can work on tuning the portions of the procedure that becomes a performance bottleneck.
The following are some more SQL statements that developers and DBAs can use to trace slowness in their procedures.
1. Show procedure name, line_number, line text, total calls, total time and line’s min and max execution time
SELECT
trim(decode(unit_type, 'PACKAGE SPEC', 'PACKAGE', unit_type) || ' ' || trim(pu.unit_owner) || '.' || trim(pu.unit_name)) object_name,
' (line ' || pd.line# || ')' line_number,
src.text,
pd.total_occur,
pd.total_time,
pd.min_time,
pd.max_time
FROM
plsql_profiler_units pu,
plsql_profiler_data pd,
all_source src
WHERE
pu.unit_owner = USER
AND pu.runid = 4
AND pu.runid = pd.runid
AND pu.unit_number = pd.unit_number
AND src.owner = UPPER(pu.unit_owner)
AND src.type = pu.unit_type
AND src.name = UPPER(split_part(pu.unit_name, '(', 1))
AND src.line = pd.line#
2. Show object type, object name with total, min and max execution time
SELECT
u.unit_type,
split_part(u.unit_name, '(', 1),
d.total_time / 10000000 total,
d.total_occur,
d.min_time,
d.max_time
FROM
plsql_profiler_units AS u,
plsql_profiler_data AS d
WHERE
u.runid = 1
AND u.unit_owner = 'enterprisedb'
AND d.runid = u.runid
AND d.unit_number = u.unit_number
AND d.total_time / 1000000000 > 0.00
ORDER BY
d.total_time DESC;
In summary
Postgres has many capabilities that DBAs can use to identify poor performance caused by unoptimized code from developers/applications. However, there is still a gap in identifying slow statements within procedures and functions. To fill this gap, EDB Advanced Server provides the DBMS_PROFILER package, which allows DBAs/developers to trace slow statements in a procedure based on execution time and the number of calls, and assists in optimizing and tuning those SQL calls.