Streaming Replication (SR) is a way of achieving High Availability with Postgres. There are many parameter settings that need to be set correctly for streaming replication. In this blog, we will be discussing the essential parameters and configuration for streaming replication in Postgres. The techniques and parameters in this blog are equally applicable to PostgreSQL 12, the open-source database, and to EDB Postgres Advanced Server 12, the Oracle compatible version of PostgreSQL.
Before we deep dive into the settings, let’s have a look at the Streaming Replication architecture. The below diagram shows the architecture of one master and two standbys. In the picture, we have also listed the important streaming replication parameters and configurations of each server.
In the above diagram, you can see we have a master and two standbys. The two standbys are streaming the data from the master using two different physical replication slots: standby_east and standby_west. To make the standby use physical replication slots, it is necessary to create the physical replication slots on master before standbys connect to the master. To create physical replication slot DBA(s)/user(s) can use the following function:
SELECT pg_create_physical_replication_slot(‘slot_name’);
To create the physical replication slots in the diagram, we have used the following SQLs on the master to create the slots:
SELECT pg_create_physical_replication_slot(‘standby_east’);
SELECT pg_create_physical_replication_slot(‘standby_west’);
Before we get into the understanding of parameters, let’s understand the Replication Slots.
Replication slot provides an automated way to make sure that the master does not remove WAL segments until all standbys have received them. It was introduced in Postgres 9.4. Since then DBAs are relying on the replication slot technology to have a stable streaming replication between a master and its standby(s).
With replication slot streaming replication, it is highly recommended that people should monitor their master server’s pg_wal directory and drop the unused slots or slots which are not getting used due to standby(s) failure, to avoid filling up the disks with WAL that is waiting to be consumed
For dropping a slot, DBA(s)/User(s) can use pg_drop_replication_slot(slot_name) function.
Host-based Access Configuration (pg_hba.conf):
Before setting the streaming replication, it is important to allow the connections between the standby(s) server and the master server. For allowing the connections it is necessary to have an entry in the pg_hba.conf file for replication. Through pg_hba user would be able to control which standby(s) should receive data from the master.
If you are planning to have cascading replication, then it is recommended to have pg_hba configuration on standby(s).
Please refer to the following documentation for more information on pg_hba and supported authentication.
Following is an example of an entry in pg_hba.conf for replication based on the diagram, we have presented:
# TYPE DATABASE USER ADDRESS METHOD
host replication rep_user 10.10.1.8/32 scram-sha-256
host replication rep_user 10.11.1.9/32 scram-sha-256
host replication rep_user 10.12.1.10/32 scram-sha-256
Above you can notice that the database is replication, replication is not a database in the master. However, replication shows that connections matching the entry/address are going to use physical replication connection.
Replication Settings for the Master Server
The following are the details of the parameters
- max_wal_senders - This parameter specifies the maximum number of concurrent connections allowed on the sender from standby(s) or tools like pg_basebackup clients for streaming. The default value is 10. Set this parameter based on number standby(s) and pg_basebackup like tool going to stream the data from a sender. Based on our diagram, we don’t need to make any changes because the default value is giving us enough room for our two standby(s)
- max_replication_slots - This parameter specifies the maximum number of replication slots that a sender can support, and the number of standby(s)/tools can be allowed to use replication slots to stream data. The default value is 10. Set this parameter based on the number of standby(s)/tools is going to use for streaming.
- wal_sender_timeout - With this parameter, a sender/master decides on terminating replication connections that are inactive for longer than the amount of time mentioned for the parameter. This parameter is useful to detect a standby crash or network outage. The default value is 60 seconds. A value of zero disables the timeout mechanism.
- wal_level = replica - With this parameter settings, the master server decides on how much information is written to the WAL. The default value of this parameter is replica. With this setting, the sender writes enough data to enable streaming replication.
- wal_log_hints = on - This parameter makes a master server write each disk page to WAL with the first modification with hint bits. With this setting, you can use pg_rewind to make a master be standby of a new master after the failover.
- synchronous_standby_names - This parameter specifies standby(s) that can support synchronous replication for very high availability and protection against data loss. Synchronous replication in Postgres gives the ability to confirm that all changes made on a master have been transferred to standby(s) servers.
Based on the diagram (mentioned above) if a user wants to make any standby(s) to be synchronous standby, then DBA(s) can use application_name mentioned in the primary_conninfo parameter of standby.
Following are the syntax of the setting of this parameter:
- [FIRST] num_sync ( standby_name [, ...] )
Examples:
FIRST 1 (standby_east, standby_west)
The FIRST keyword with num_sync, specifies a priority-based synchronous replication, and makes transaction commits wait until their WAL records are received by num_sync (e.g. 1) synchronous standbys chosen based on their priorities. The standbys which names appear first in the list are given higher priority and considered as synchronous.
If one of the synchronous standbys disconnects for any reason, it will be replaced immediately with the next-highest-priority standby.
- ANY num_sync ( standby_name [, ...] )
Example:
ANY 3 (standby_east, standby_west, eu_standby_east, eu_standby_west)
ANY keyword with num_sync means a quorum-based synchronous replication. This setting makes the master wait for each transaction commits until their WAL records are received by at least num_sync standby(s) in the list.
- 's1,s2,s3,...' (list of standby(s))
Example:
‘standby_east, standby_west’
This syntax is the same as FIRST 1 (s1, s2,s3...).
For example, FIRST 1 (s1, s2) and 's1, s2' have the same meaning: either s1 or s2 is chosen as a synchronous standby.
Standby Server Parameters
The following are the lists of standby parameters:
- primary_conninfo = ‘connection string of master/sender’
Example:
primary_conninfo = ‘user=postgres host=10.10.1.8
passfile=/var/lib/pgsql/.pgpass port=5432
application_name=standby_east’
This parameter specifies a master server's connection for the standby/receiver server to connect. The connection string format is described here.
If any options in the connection string are empty, then the corresponding environment variable is used for connection to a master/sender. Find more detail on environment variables here.
- primary_slot_name = ‘name of the slot on sender/master’
Example:
primary_slot_name = 'standby_east'
As mentioned in the description of max_replication_slots parameters description, the replication slots provide an automated way to make sure that the master does not remove WAL segments until all standbys have received them. Using this parameter tells the standby to use named replication slots for streaming data from a master/sender.
- promote_trigger_file = /trigger/file/location
Example:
promote_trigger_file = /var/lib/pgsql/12/data/standby.trigger
This parameter specifies a trigger file whose presence will promote a standby to become a master. If the value of this parameter is not set, then a user can use either of the following commands for promoting a standby.
pg_ctl promote --pgdata=DATADIR
SELECT pg_promote();
- hot_standby = on
This parameter configures a standby(s) to allow connections and execute READ queries during recovery. The default value of this parameter is on.
- recovery_target_timeline = latest
This parameter makes Postgres/standby(s) follow the new master in case of the failure of an old master.
The above recommended parameters for streaming replication will help you create a Highly Available cluster with Postgres.
Learn more about how to manage failover and replication for Postgres high availability here.