When developing a database solution architecture, it is important to carefully consider the intended functionality and the business needs behind it, so that you avoid unnecessary complexity and expense while ensuring that all business requirements are met. This is particularly important when designing a database system for high availability. In information technology, high availability is typically achieved by providing redundancy of critical components, and you want your architecture to deliver the necessary level of resilience without incurring excessive costs for components that, one hopes, are actively used only on rare occasions.
What is High Availability?
High availability (HA) refers to a system's ability to remain operational and accessible in the event of a component failure. In the context of databases, this means having strategies in place to maintain data access during planned maintenance and unexpected outages.
Two metrics are commonly used to define the level of availability of a database solution:
- Recovery Time Objective (RTO) determines how long the database system is allowed to remain offline. This is the time you have to bring the database back to an operational state.
- Recovery Point Objective (RPO) determines how far back in time, relative to the moment of failure, the recovered database snapshot can be. This is the amount of committed work immediately preceding the failure that is allowed to be lost during recovery. An RPO of zero means that absolutely no transactions considered committed by the application can be lost.
Each step towards zero for either of these metrics increases the cost and complexity of the solution architecture, requiring these objectives to be weighed against the desired business outcomes.
PostgreSQL offers built-in mechanisms to support high availability, which form the basis of the alternatives described below.
Terminology
Cluster
Traditionally, when speaking about PostgreSQL, the term cluster means a collection of one or more databases managed by a single running PostgreSQL server. It is also sometimes referred to as an instance of PostgreSQL.
In the high availability context, cluster denotes a group of redundant PostgreSQL instances, together with any related components such as failover manager processes, that form a highly available database solution.
To avoid confusion, in this document we will use the term cluster in the latter meaning, while using instance for the former concept.
Node
A node of a cluster is a single server, physical or virtual, where one or more software components of an HA solution run. Typically a node would host a PostgreSQL instance, but this is not always the case: for example, a solution might contain a separate dedicated node for the Barman backup software.
Location
Nodes of a cluster can be deployed in one or more geographic locations: cloud provider’s availability zones (AZs) or regions, or actual data centres (DCs). Geographically distributed database clusters provide resilience against regional infrastructure failures and disasters, although they are slightly more complex to set up and maintain. Choose a design that matches your overall disaster preparedness strategy.
Witness
A witness, also called a tiebreaker, is necessary in a cluster that has an even number of database nodes, to help achieve quorum when a decision to promote a replica needs to be made. Most failover clustering solutions require the majority of nodes to make a promotion decision. If the cluster has an even number of nodes and half of them fail, the majority cannot be achieved without an extra witness.
Achieving High Availability in PostgreSQL
Point in Time Recovery
Point in time recovery (PITR) of the database from a backup can be considered part of the database high availability arsenal, provided it supports the required RTO and RPO goals. This approach requires the failed database to be restored from its backup copy and a series of archived WAL files replayed up to the desired point in time, which makes the recovery time depend on the database size and the number of WAL files to be processed. The recovery point is determined by the frequency of WAL file archiving. PITR may be an acceptable high availability option for databases that change infrequently and applications that can tolerate some downtime.
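As an illustration, the sketch below shows the recovery settings for a server restored from a base backup on PostgreSQL 12 or later; the archive path and target timestamp are placeholders, and a recovery.signal file must be created in the data directory before the server is started.

```
# postgresql.conf on the restored server -- a minimal PITR sketch
# (archive path and target timestamp are placeholders)
restore_command = 'cp /mnt/wal_archive/%f %p'     # fetch archived WAL segments
recovery_target_time = '2024-01-15 14:30:00+00'   # stop WAL replay at this point in time
recovery_target_action = 'promote'                # open the database for writes once the target is reached
```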
Replication
PostgreSQL offers two built-in replication methods: physical and logical. EDB Postgres Distributed (PGD) extends the native PostgreSQL logical replication implementation by allowing multi-directional (active-active) replication and adding enterprise-grade features, making it suitable for high availability applications.
PostgreSQL replication is asynchronous by default, meaning that there is no guarantee that replicas have received or replayed all transactions from the primary node prior to its failure. However, you can configure synchronous replication at different levels, in which case a transaction will not be considered complete until all, or at least some, of the replicas confirm that they have received the corresponding data. Synchronous replication offers lower RPO, often near zero, at the cost of reducing the database transaction throughput.
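For example, quorum-based synchronous replication can be enabled on the primary with two settings; the standby names below are placeholders and must match the application_name used by each replica's connection.

```
# postgresql.conf on the primary -- a sketch of quorum-based synchronous replication
# (standby names are placeholders)
synchronous_standby_names = 'ANY 1 (standby1, standby2)'   # wait for at least one of the listed standbys
synchronous_commit = on                                    # commits wait until the required standbys confirm the WAL
```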
Each replication option has its own strengths and weaknesses, making them suitable for different use cases.
Physical Replication
How it works:
Physical replication works by copying write-ahead log (WAL) records, which describe every change made on the primary server, to the replica servers. Since all databases in a PostgreSQL instance share the same WAL, this replication method moves the data of all databases.
Physical replication can be implemented using two methods, which differ in the RPO they can support; a configuration sketch for both follows this list.
- Log shipping: The primary server regularly archives completed WAL files to a shared location, from which replicas retrieve them and replay the transactions they contain.
- Streaming: Replicas maintain permanent connections to the primary server, which streams WAL records over those connections as they are generated.
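A handful of settings drive either method; the host names and paths in the following sketch are placeholders.

```
# Log shipping -- postgresql.conf on the primary (archive location is a placeholder)
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'   # copy each completed WAL segment to the archive

# Streaming -- postgresql.conf on the replica (connection details are placeholders)
primary_conninfo = 'host=primary.example.com port=5432 user=replicator application_name=standby1'
# The replica's data directory is typically created with pg_basebackup, for example:
#   pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/data -R
```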
Advantages:
- Simple setup: Configuring physical replication requires fewer steps.
- Replicates database schema: Not only changes to the data, but also modifications of the table structures and other database objects are replicated.
- High performance: In some cases it can be faster and more efficient than logical replication.
- Low overhead: Minimal processing is required on the primary server.
Disadvantages:
- Limited flexibility: Does not allow filtering or transformation of data during replication.
- Version compatibility: Requires the same major version of PostgreSQL on both the primary and replica servers.
- Read-only standbys: Replicas are standby servers intended primarily for high availability and disaster recovery; they can serve read-only queries but cannot accept writes.
Logical Replication
How it works:
Logical replication involves capturing logical changes (inserts, updates, deletes) to specific tables or databases and sending them to the replica server as a sequence of data change instructions.
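Native logical replication is configured with SQL commands rather than server settings. A minimal sketch follows, with placeholder table names and connection string; the subscribed tables must already exist on the destination with a compatible schema.

```
-- On the source instance: publish changes to selected tables
CREATE PUBLICATION sales_pub FOR TABLE orders, customers;

-- On the destination instance: subscribe to that publication
-- (the connection string is a placeholder)
CREATE SUBSCRIPTION sales_sub
    CONNECTION 'host=primary.example.com dbname=sales user=replicator'
    PUBLICATION sales_pub;
```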
Advantages:
- Flexibility: Allows for filtering, transforming, and routing data to multiple replicas.
- Version compatibility: Can replicate data between different major PostgreSQL versions.
- Read-write replicas: Allows read-write operations on the replica servers.
Disadvantages:
- Higher overhead: More processing is required on the primary server.
- Potential for data conflicts: Updates on replicas can introduce data divergence or conflicting changes.
- No schema changes: Table structure changes are not replicated and must be applied manually; schema deviations can break replication.
- Not suited for HA: Building a robust high availability architecture on native PostgreSQL logical replication alone is not practical.
EDB Postgres Distributed
How it works:
EDB Postgres Distributed (PGD) improves upon the built-in logical replication by allowing multi-directional replication, adding DDL replication, data conflict resolution, improved consistency controls, and other features.
Advantages over native logical replication:
- Schema changes: DDL commands are replicated, ensuring consistent table structures on all replicas.
- Conflict avoidance and resolution: Allows safe concurrent updates on all replicas.
- Data consistency: Offers finer-grained control over data consistency across a chosen set of replicas.
When to use which:
Physical Replication:
- High-availability setups
- Disaster recovery
- Read-only replicas for load balancing
Logical Replication:
- Data warehousing and analytics
- Data distribution across multiple regions
- Test and development environments
- Data integration with other systems
EDB Postgres Distributed combines the best of the two methods above, providing the robustness and reliability of physical replication while offering the flexibility of logical replication.
Each option allows you to reach certain RTO and RPO goals; the following diagram shows their relative capabilities in those domains.
Failover Clustering
For critical applications, failover clusters provide a comprehensive solution for high availability. Tools like EDB Failover Manager, Patroni, or repmgr can be utilized with physical replication to automatically manage the promotion of standby servers when the primary server fails. These tools simplify cluster management and reduce the complexity involved in failover processes.
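As one example, a Patroni-managed node is described by a YAML file; the abbreviated sketch below uses placeholder names and addresses and omits bootstrap, authentication, and tuning settings.

```
# patroni.yml (abbreviated) -- all names and addresses are placeholders
scope: ha-cluster                 # cluster name shared by all nodes
name: node1                       # unique name of this node
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.1:8008
etcd3:
  hosts: 10.0.0.10:2379,10.0.0.11:2379,10.0.0.12:2379   # distributed configuration store
postgresql:
  data_dir: /var/lib/postgresql/data
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.1:5432
```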
EDB Postgres Distributed implements failover management as part of its base functionality and does not require setting up additional tools to automate failover.
Load Balancing
Load balancing helps distribute database requests across multiple servers. This not only improves performance but also ensures that if one server is down, the others can handle the load. Tools like HAProxy can be configured to manage connections efficiently, especially when working in tandem with a failover management solution.
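For instance, when the failover manager exposes an HTTP health check (Patroni's REST API is assumed below), HAProxy can route client connections to whichever node is currently the primary; the addresses and ports are placeholders.

```
# haproxy.cfg (fragment) -- node addresses and ports are placeholders
listen postgres_primary
    bind *:5000
    mode tcp
    option httpchk GET /primary                # Patroni returns 200 only on the current primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.0.1:5432 maxconn 100 check port 8008
    server node2 10.0.0.2:5432 maxconn 100 check port 8008
```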
Backup and Recovery
Regular backups are an important aspect of high availability. PostgreSQL supports various backup methods, including physical and logical backups, as well as snapshots at the file system or storage device level. Combining these with point-in-time recovery (PITR) capability allows you to restore your database to a specific state in case of failure, application error, or user error. Tools like Barman and pgBackRest manage the recovery objects (base backups and WAL files) and simplify backup and recovery tasks.
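As a brief illustration with Barman (the server name, target time, and destination path are placeholders):

```
# Routine operations for a server configured in Barman as "pg-main"
barman check pg-main         # verify connectivity and WAL archiving
barman backup pg-main        # take a new base backup
barman list-backup pg-main   # list available backups

# Restore the latest backup locally, replaying WAL up to the given time
barman recover --target-time "2024-01-15 14:30:00" pg-main latest /var/lib/postgresql/data
```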
Connection Pooling
Connection poolers like PgBouncer and Pgpool-II help maintain database performance during high-load periods by reusing existing database connections and by redirecting application connections to the new primary after a failover. This can prevent overwhelming the database with new connection requests and improve resilience.
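A minimal PgBouncer sketch follows; the database name, host address, credentials file, and pool sizes are placeholders to be adapted to your environment.

```
; pgbouncer.ini (fragment) -- connection details and pool sizes are placeholders
[databases]
appdb = host=10.0.0.100 port=5432 dbname=appdb   ; point this at the cluster's write endpoint

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; return server connections to the pool at transaction end
max_client_conn = 500
default_pool_size = 20
```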
Best Practices for Implementing High Availability
- Monitoring and Alerts: Regularly monitor your database performance and set up alerts for any issues. Tools like Postgres Enterprise Manager, Prometheus, or Grafana can help track important metrics, provide insights into the health of your database, and notify you of adverse conditions.
- Testing Failover Procedures: Regularly test your failover procedures to ensure they work as expected. Simulating failover scenarios can help identify weak points in your HA strategy.
- Documentation: Maintain clear documentation of your high availability configuration and standard procedures. This can streamline troubleshooting and improve response times during failures when manual intervention is required.
Sample Architecture Patterns
Due to the large number of different software components that can be used for implementing a PostgreSQL high availability solution, and the variety of business requirements and other external factors informing the design, it is not practical to thoroughly describe all possible architectural options within the scope of this document. We provide a few sample solutions that we hope will help you in your own research and planning.
Each of the architecture patterns described here can be implemented using either PostgreSQL physical replication or EDB Postgres Distributed; the choice of the underlying technology is influenced by the availability of certain features in either software product, but also by non-technical criteria, such as cost considerations and tolerance for complexity.
We refer to these architecture patterns by their basic layout, which includes the number of deployment locations and the number of cluster nodes.
One by Three
This architecture comprises only one location, where three cluster nodes are deployed, as shown in the following figure.
In this particular example the cluster has only two database nodes: primary and its physical streaming replica. The third node is shared by two components:
- The failover management software that serves as a witness, or tiebreaker, for automatic promotion of the replica in case of the primary node failure.
- The Barman backup and recovery software.
This is the most economical high availability architecture, containing only two components that potentially require a paid license (two database servers), with the smallest footprint. It protects the system from a single data node failure within its location.
This architecture, however, does not protect you from complete location failure, especially if the backup copies are not stored in a different location.
One by Three Plus One
This pattern extends the architecture described above by adding another location for disaster recovery purposes, as shown below.
In this configuration the cluster can withstand the loss of its primary location, although the surviving offsite replica cannot be automatically promoted due to the lack of quorum. Manual intervention will be required to promote the replica and, possibly, extend the surviving location to a complete One by Three configuration.
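If the pattern is built on physical streaming replication, the promotion itself is a single command on the surviving replica; the data directory path below is a placeholder.

```
# Promote a streaming replica to primary (data directory path is a placeholder)
pg_ctl promote -D /var/lib/postgresql/data
# or, from a SQL session on the replica:
psql -c "SELECT pg_promote();"
```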
Two by Three Plus One
This example of EDB Postgres Distributed deployment offers the ultimate high availability and disaster recovery capabilities. Not only does it provide resilience for a single node failure in the primary location, but it also allows fully automatic, nearly instantaneous disaster recovery if the entire primary location is lost.
The latter capability is made possible by the presence of a witness node in a third location. The witness does not contain any data, but it provides the vital tie-breaking vote for transparent failover between the availability zones when it becomes necessary. Clearly, this level of availability comes at the expense of having to maintain multiple redundant database nodes.