Delightful, production-grade replication for Postgres

July 30, 2024

You’re looking for production-grade Postgres. You need to keep multiple copies of your data around for high-availability. And this data must stay consistent across the copies. You need to handle rolling upgrades of the cluster. You need to handle taking a node out of the cluster for maintenance. You need to handle DDL changes across nodes in the cluster. And we might be getting ahead of ourselves, but you may also need low-latency read replicas or even need to geo-distribute your data while adhering to data governance policies.

It’s true you can’t do these tasks in vanilla Postgres. Nor even with some common high availability (HA) tools. To understand why though, we need to talk about the two different ways you can do replication in Postgres: physical and logical.

Let's tackle physical replication first. Let’s say we have two Postgres nodes. An "upstream" node that directly receives queries from a user, and a "downstream" node that subscribes to the upstream node for changes. Physical replication takes bytes from disk on the upstream node and copies them over the network to the downstream node. The downstream node then writes the bytes back to disk verbatim.

Physical replication happens at such a low level that neither node has any idea what it is actually copying in terms of data it originally received from a user. So if the upstream node and the downstream node are running different versions of Postgres, there’s no guarantee the downstream node can make any sense of the physical data the upstream node sent.

Many commonly used HA tools are based on physical replication. So any requirements like rolling upgrades (e.g. temporarily replicating from a Postgres 12 node to a Postgres 13 node while upgrading the cluster) automatically rule these tools out.

What about logical replication? Unlike physical replication, logical replication retains awareness of user data. Instead of replicating opaque bytes from an upstream node to a downstream node, it replicates logical changes (inserts, updates, and deletes) from upstream to downstream node. Because it operates at this higher level, logical replication actually allows for a cluster running mixed versions of Postgres.

But logical replication in Postgres has some big limitations. The biggest one to me is that you can’t replicate DDL. This limitation is especially challenging since DDL in Postgres is transactional! We can create a table and then roll back that change. But not easily with logical replication in vanilla Postgres.

My coworker Peter has a great post on the history of replication if you want to learn more.

EDB Postgres Distributed

So, we’ve established that these production-grade Postgres requirements are hard. But I want to show you how we can do all of them pretty easily with EDB Postgres Distributed (PGD). And in particular, one of the coolest things PGD builds on top of vanilla Postgres logical replication is the ability to replicate transactional DDL in addition to DML. So on a single machine we’ll build a three-node Postgres cluster with PGD, running three different versions of Postgres. And we’ll see DDL and DML replicating between these nodes.

While we discourage you from running different versions of Postgres permanently, PGD was explicitly designed to support rolling upgrades. Running a mixed version cluster at all is an interesting feat, so we'll go ahead anyway in this post.

I’m going to show you the manual way to set everything up, so nothing feels hidden. But I think you’ll see that this is pretty easily scriptable.

While we support other Debian-based and RHEL-based Linux distros, the instructions in this post will assume you're on Debian 12. So grab an amd64 machine running Debian 12 and let's go!

Grab Postgres 14, 15, and 16

First we follow the postgresql.org instructions for configuring the package repository.

# Import the repository signing key:
sudo apt install curl ca-certificates
sudo install -d /usr/share/postgresql-common/pgdg
sudo curl -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc --fail https://www.postgresql.org/media/keys/ACCC4CF8.asc

# Create the repository configuration file:
sudo sh -c 'echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

# Update the package lists:
sudo apt update

Now we can install the three versions of Postgres.

sudo apt install -y postgresql-16 postgresql-client-16
sudo apt install -y postgresql-15 postgresql-client-15
sudo apt install -y postgresql-14 postgresql-client-14

Grab PGD

Get your EDB repository token from the EDB Repos 2.0 page. If you don't already have an EDB account, you can sign up for free to get a trial subscription to EDB's software repositories.

With that token, set an environment variable:

export EDB_SUBSCRIPTION_TOKEN=<your-repo-token>

And use our helper script to set up package repositories:

curl -1sLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/postgres_distributed/setup.deb.sh" | sudo -E bash

Now install PGD for each Postgres version:

sudo apt install -y edb-bdr5-pg14
sudo apt install -y edb-bdr5-pg15
sudo apt install -y edb-bdr5-pg16

Starting up three Postgres instances

Now that we’ve got three Postgres versions installed locally, we can start creating databases and starting them. Before we start each database though, we need to edit the database’s postgresql.conf so that it loads the PGD extension, which is called bdr.

Creating the Postgres 14 database

Although the Debian packages set up some directories and ran initdb, let's give ourselves a clean environment.

admin@localhost $ sudo su postgres
postgres@localhost $ rm -rf /var/lib/postgresql/14
postgres@localhost $ /usr/lib/postgresql/14/bin/initdb /var/lib/postgresql/pg14
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/lib/postgresql/pg14 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/14/bin/pg_ctl -D /var/lib/postgresql/pg14 -l logfile start

Configuring postgresql.conf for PGD

Now run the following to set up the database to run PGD.

postgres@localhost $ echo "
shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on

# For bigger clusters you may want to decrease shared buffers or the
# OS may complain you are using too much shared memory.
shared_buffers = 1MB

# These depend on the size of the cluster.
max_worker_processes = 12
max_wal_senders = 6
max_replication_slots = 6" | tee -a /var/lib/postgresql/pg14/postgresql.conf

Start up Postgres 14 database

And start it as suggested, but running on port 8014:

postgres@localhost $ /usr/lib/postgresql/14/bin/pg_ctl -o "-p 8014" -D /var/lib/postgresql/pg14 -l /tmp/logfile14 start
waiting for server to start.... done
server started

Creating the Postgres 15 database

Now let’s create and start the Postgres 15 database.

postgres@localhost $ rm -rf /var/lib/postgresql/15
postgres@localhost $ /usr/lib/postgresql/15/bin/initdb /var/lib/postgresql/pg15
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/lib/postgresql/pg15 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/15/bin/pg_ctl -D /var/lib/postgresql/pg15 -l logfile start

Configuring postgresql.conf for PGD

Now set up the database to run PGD.

postgres@localhost $ echo "
shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on

# For bigger clusters you may want to decrease shared buffers or the
# OS may complain you are using too much shared memory.
shared_buffers = 1MB

# These depend on the size of the cluster.
max_worker_processes = 12
max_wal_senders = 6
max_replication_slots = 6" | tee -a /var/lib/postgresql/pg15/postgresql.conf

Start up Postgres 15 database

Now run the Postgres 15 database.

postgres@localhost $ /usr/lib/postgresql/15/bin/pg_ctl -o "-p 8015" -D /var/lib/postgresql/pg15 -l /tmp/logfile15 start
waiting for server to start.... done
server started

Creating the Postgres 16 database

Finally, create the Postgres 16 database.

postgres@localhost $ rm -rf /var/lib/postgresql/16
postgres@localhost $ /usr/lib/postgresql/16/bin/initdb /var/lib/postgresql/pg16
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/lib/postgresql/pg15 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/16/bin/pg_ctl -D /var/lib/postgresql/pg16 -l logfile start

Configuring postgresql.conf for PGD

Once again run the following to set up the database to run PGD.

postgres@localhost $ echo "
shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on

# For bigger clusters you may want to decrease shared buffers or the
# OS may complain you are using too much shared memory.
shared_buffers = 1MB

# These depend on the size of the cluster.
max_worker_processes = 12
max_wal_senders = 6
max_replication_slots = 6" | tee -a /var/lib/postgresql/pg16/postgresql.conf

Start up Postgres 16 database

Finally, start the Postgres 16 database.

postgres@localhost $ /usr/lib/postgresql/16/bin/pg_ctl -o "-p 8016" -D /var/lib/postgresql/pg16 -l /tmp/logfile16 start
waiting for server to start.... done
server started

Turning three databases into a cluster

We’ve set up three nodes and they are running the PGD extension. But they aren’t connected to each other. Let’s connect them!

First just to remind ourselves where we are, let’s query each running database for its version.

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 -c 'select version()'
                                                        version
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.12 (Debian 14.12-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)

postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 -c 'select version()'
                                                       version
---------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15.7 (Debian 15.7-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)

postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 -c 'select version()'
                                                       version
---------------------------------------------------------------------------------------------------------------------
 PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)

Looks good.

Building the PGD cluster

For each node in the cluster we must call CREATE EXTENSION bdr CASCADE;, then we need to call bdr.create_node on each node in the cluster to register it as a PGD node. Finally, we pick a single node on which to create a node group. And then on the other two nodes we give it details so it can join the node group we created.

So on the Postgres 14 server we’ll create a database in which we want to replicate tables, create the extension, create the PGD node, and create the PGD node group.

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 postgres -c "CREATE DATABASE pgdtest;"
CREATE DATABASE
postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "CREATE EXTENSION bdr CASCADE;"
CREATE EXTENSION
postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "SELECT bdr.create_node(node_name := 'pg14', local_dsn := 'port=8014 dbname=pgdtest host=localhost user=postgres');"
 create_node
-------------
  1151355342
(1 row)

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "SELECT bdr.create_node_group(node_group_name := 'pgdtest-group');"
 create_node_group
-------------------
         117160161
(1 row)

Great! We’ve got a PGD node and a PGD node group.

Now we can do the same thing for the Postgres 15 node. But instead of creating a new node group we join the existing node group.

postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 postgres -c "CREATE DATABASE pgdtest;"
CREATE DATABASE
postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 pgdtest -c "CREATE EXTENSION bdr CASCADE;"
CREATE EXTENSION
postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 pgdtest -c "SELECT bdr.create_node(node_name := 'pg15', local_dsn := 'port=8015 dbname=pgdtest host=localhost user=postgres');"
 create_node
-------------
  2481762504
(1 row)

postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 pgdtest -c "SELECT bdr.join_node_group(join_target_dsn := 'port=8014 dbname=pgdtest host=localhost user=postgres');"
 join_node_group
-----------------

(1 row)

And now the same for the Postgres 16 node!

postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 postgres -c "CREATE DATABASE pgdtest;"
CREATE DATABASE
postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 pgdtest -c "CREATE EXTENSION bdr CASCADE;"
CREATE EXTENSION
postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 pgdtest -c "SELECT bdr.create_node(node_name := 'pg16', local_dsn := 'port=8016 dbname=pgdtest host=localhost user=postgres');"
 create_node
-------------
  1945575858
(1 row)

postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 pgdtest -c "SELECT bdr.join_node_group(join_target_dsn := 'port=8014 dbname=pgdtest host=localhost user=postgres');"
 join_node_group
-----------------

(1 row)

Giving it a go

Let me remind you that we are running a single cluster with a mix of Postgres versions. Now not only can we do that but we can replicate DDL and DML within this cluster. Switching between nodes for reads and writes. (In the real world we suggest using our support for write-leaders and not mixing writes on different nodes.)

Let’s create a table on the Postgres 14 node and add some data.

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "CREATE TABLE x (a int primary key);"
CREATE TABLE
postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "INSERT INTO x VALUES (1), (32), (19), (0);"
INSERT 0 4

And query the same node:

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "SELECT * FROM x;"
 a
----
  1
 32
 19
  0
(4 rows)

It’s normal Postgres! Nothing interesting so far.

Let’s query the Postgres 15 node.

postgres@localhost $ /usr/lib/postgresql/15/bin/psql -p 8015 pgdtest -c "SELECT * FROM x;"
 a
----
  1
 32
 19
  0
(4 rows)

Now we’re talking. We replicated data from a Postgres 14 cluster seamlessly to a Postgres 15 cluster.

What about Postgres 16.

postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 pgdtest -c "SELECT * FROM x;"
 a
----
  1
 32
 19
  0
(4 rows)

And if we add data on the Postgres 16 node?

postgres@localhost $ /usr/lib/postgresql/16/bin/psql -p 8016 pgdtest -c "INSERT INTO x VALUES (10000);"
INSERT 0 1

And read from the Postgres 14 node?

postgres@localhost $ /usr/lib/postgresql/14/bin/psql -p 8014 pgdtest -c "SELECT * FROM x;"
   a
-------
     1
    32
    19
     0
 10000
(5 rows)

That’s pretty nice.

Where from here?

We used PGD’s production-grade replication to replicate data to and from a diverse cluster of Postgres nodes. We can replicate DDL.

We certainly covered a lot of tedious steps. But these steps are trivially scriptable. Once you’ve got these steps scripted, you’ve got what I think is a pretty delightful, production-grade setup for Postgres.

Check out the PGD docs and look forward to future posts covering other PGD features like scaling read-only queries with subscriber-only nodes.

Share this

More Blogs

pgAdmin CI/CD

Almost exactly three years ago I wrote a blog on my personal page entitled Testing pgAdmin which went into great detail discussing how we test pgAdmin prior to releases. Back...
August 24, 2023