By: Simon Riggs, Marc Linster, and Bruce Momjian
The news is out - PostgreSQL 14 is now available. It’s been a year to celebrate the world’s best database. The leading data technology review site, DB-Engines, named PostgreSQL Database Management System of the Year 2020, continuing to outpace MongoDB, Redis, and MySQL. And in the Stack Overflow 2021 Developer Survey, PostgreSQL leads the pack with 44% of professional developers choosing it as the most popular technology and 77% choosing it as most loved. Developers also selected Postgres as the most wanted database above all others. The Cloud Native Computing Foundation positioned Postgres as the #1 database on its technology radar - a clear endorsement of Postgres in one of the most exciting new strategic technologies.
This release pushes PostgreSQL to new levels of performance for high-end workloads. It handles partition writes faster, index growth better, and idle connections with less overhead. There are also improvements for managing larger databases and analytical systems, as well as improvements for time-series, graph query, and document data management.
We at EDB are once again delighted to be part of the latest release, reflecting our ongoing commitment to helping the Community constantly improve and build on the software. EDB staff in the Community made significant contributions to PostgreSQL 14 in these areas:
Security
The default setting of password_encryption parameter is now SCRAM-SHA-256 to ensure that new users take full advantage of this important security feature, deprecating MD5 for new users. Passwords can now be much longer in both pgbouncer and PostgreSQL, allowing users to take full advantage of the security that SCRAM-SHA-256 provides. In addition, we now allow an SSL certificate's distinguished name (DN) to be matched for client certificate authentication, and we add Server Name Indication (SNI) in SSL connection packets.
Manageability
VACUUM can now skip index vacuuming when the number of removable index entries is insignificant. The vacuum parameter INDEX_CLEANUP has a new default of auto that enables this optimization. New capabilities allow vacuum to be more aggressive in removing dead rows during minimal-locking index operations and if the table is nearing the xid or multixact wraparound thresholds. This is controlled by the parameters vacuum_failsafe_age and vacuum_multixact_failsafe_age.
Parameter changes that prevent replay on standby servers now cause replay to halt, rather than cause a server shutdown, improving server availability. Various additional monitoring features ensure that we know what the server is doing and what exact state it is currently in.
Scalability and performance
PostgreSQL 14 delivers huge gains in the performance of updates and deletes on tables that have a large number of partitions. This change greatly reduces the planner's overhead for such cases, and also allows updates/deletes on partitioned tables to use execution-time partition pruning. Most importantly, we now scale to even more partitions than before, allowing PostgreSQL to cope with much larger databases than was previously possible.
Partitions can now also be detached concurrently, making it easier to manage large time-series data tables. The new date_bin() function allows users to more easily summarize time-series data into date ranges. Commonly used window functions are now easy to use with more flexible data types for defaults.
Applications with larger JSON and XML documents will be enhanced by a new compression option for TOASTed data, LZ4. The new method uses the latest compression technology to greatly improve the speed of compression and decompression of document data.
Application performance tuning and indexes
Application tuning via index management is improved in PostgreSQL 14. Specifically, CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY no longer limit the dead row removal of other relations, nor do they interfere with each other if running multiple commands at once.
Data loading performance has been improved by enhancements to the COPY command with the FREEZE option now acting more effectively by updating page visibility bits. Binary-mode COPY is also now faster.
BRIN indexes have seen several improvements. They have become more flexible since they can record multiple min/max values per range, which helps when there are groups of values in each page range. BRIN indexes can now use bloom filters, which allows BRIN indexes to be used effectively with data that is not well-localized in the heap, or later becomes so because of updates.
Materialized views can now be REFRESH'd using parallel query to improve performance.
Complex query processing improvements using extended statistics
PostgreSQL has sometimes been criticised for not advocating hints that provide short-term tweaks to query performance. The best long-term approach to solving optimizer problems is in a declarative manner, focusing on the data itself. EDB staff pioneered the CREATE STATISTICS command and continue to enhance this declarative approach. PostgreSQL 14 increases the number of places extended statistics can be used for OR clause estimation, and also for expressions. This allows statistics on a group of expressions and columns, rather than only columns like previously. System view pg_stats_ext_exprs reports such statistics.
Standardization
EDB's respect for openness and standardization is shown by the latest round of enhancements to SQL Standard support. Server-side functions and procedures that use LANGUAGE SQL can now use SQL-standard compliant function bodies. Previously only string-literal function bodies were supported. When writing a function or procedure in SQL-standard syntax, the body is parsed immediately and stored as a parse tree. This allows better tracking of function dependencies, which has security benefits.
Graph-query support is enhanced by SEARCH and CYCLE clauses, which improve the ease of writing complex graph queries in PostgreSQL. These are part of the SQL Standard also. Recent developments in the SQL Standard are also being tracked for later inclusion.
Other minor standards compliance features include GROUP BY DISTINCT and GRANTED BY clauses, alias for JOIN .. USING, as well as new functions CURRENT_ROLE, trim_array(), and new substring() syntax.
Enhanced logical replication
Logical replication has been enhanced to allow streaming of long in-progress transactions to Subscribers, or via the API. Previously, transactions larger than logical_decoding_work_mem were written to disk until the transaction completed, which could have significant impact.
Logical replication now handles large numbers of DDL statements more easily. Cache invalidation messages are written to the WAL after each command when using logical replication, so they can be processed by streaming in-progress transactions.
Sharding
A query referencing multiple foreign tables can perform foreign table scans in parallel, significantly improving the performance of postgres_fdw supports for this type of scan if async_capable is set. This is an enabling feature for future multi-node analytics. Connection cacheing also allows connections to stay open across transactions to further improve performance.
Libpq pipeline mode also allows a stream of changes to be sent without waiting for network confirmation on each command, greatly improving performance for distributed databases.
EDB staff were also involved in many minor improvements, including replication recovery control, and diagnostic features such as pg_amcheck to check for corruptions in PostgreSQL databases. And of course there were many other community contributions not mentioned here.
Thanks and kudos to all EDB contributors, as well as to the wider PostgreSQL Community for making PostgreSQL 14 another great release. Be sure to check out the announcement and read the release notes for a detailed list of what’s new.