Leverage CloudNativePG to ensure smooth transitions and minimal downtime
In a previous blog post, Leverage a New Way to Import an Existing Postgres Database to Kubernetes, we discussed migrating a PostgreSQL 10 database from RDS to a CloudNativePG (CNPG) managed PostgreSQL database cluster within a Kubernetes environment. This utilized the database import capability, leveraging logical backup and restore methods such as pg_dump and pg_restore.
However, this type of database migration has a significant limitation: it does not capture changes made to the origin database once the process begins. Consequently, transitioning from the old to the new database requires you to halt write operations on the origin's primary database until the migration is complete. This approach is known as offline database import, and when upgrading to a new major PostgreSQL version, it becomes an offline major upgrade.
This process allows you to:
- Verify that the import or upgrade processes function correctly on a new cluster, ensuring compatibility with your applications on the updated Postgres version.
- Accurately estimate the cutover time, primarily based on import duration.
The first point assesses readiness for a major version upgrade, while the second considers whether an offline major upgrade aligns with your organization's tolerance for downtime during the database cutover. If cutover timing is suitable, you can proceed with an offline upgrade. Otherwise, you might explore using PostgreSQL's native logical replication to minimize cutover downtime to nearly zero.
This article aims to show how to achieve approximately zero cutover time with CloudNativePG when performing database imports or major version upgrades of PostgreSQL within a Kubernetes cluster. These processes are known as online database import and, when the destination database uses a higher major PostgreSQL version than the origin, major online upgrade — the latter being the main focus of this article.
Major Online Upgrades
Major online upgrades with PostgreSQL have been an essential technique for many years. Originally, these upgrades relied on PostgreSQL's extensibility and triggers. The central concept involved using triggers to route changes in the database to a queue, which consumers would then process and apply to the destination database. One prominent tool for this was Londiste, developed at Skype, which handled data and sequence synchronization, ensuring zero cutover time. However, the trigger-based approach had notable drawbacks, particularly its impact on the primary database: every insert, update, and delete fired a trigger that stored an event in the queue, implemented as a set of rotating tables.
With the introduction of logical decoding in PostgreSQL 9.4, extensions like pglogical 2 became popular for performing major online upgrades. Logical decoding proved to be lighter than the trigger-based method, easing the performance burden on the source database. With the release of PostgreSQL 10, the database introduced native support for logical replication via publications and subscriptions, which has become crucial for CloudNativePG (CNPG) major version upgrade practices. This article will leverage these native PostgreSQL capabilities to perform major online upgrades within CloudNativePG.
Before we continue, it's crucial to acknowledge certain limitations of logical replication in PostgreSQL, including:
- Data Definition Language (DDL), such as CREATE TABLE and ALTER TABLE, is not replicated.
- Sequences are not replicated.
- Large objects (LOBs) are excluded from replication.
In practical terms, you are responsible for ensuring the following during a CloudNativePG major upgrade:
- Managing database schema changes, as DDL is not replicated. Avoid schema modifications during upgrades or apply changes manually if needed.
- Updating sequences as part of the cutover process.
- Migrating large objects, if any exist. Today, binary data in PostgreSQL is typically stored in bytea fields, so this issue may not arise.
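To check whether large objects are actually in use on the origin database (and therefore require manual migration), you can count them in the system catalog — a quick sketch to run via psql on the source:

```sql
-- Large objects that logical replication will NOT carry over;
-- a count of 0 means there is nothing to migrate manually.
SELECT count(*) AS large_objects FROM pg_largeobject_metadata;
```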
The Database of Origin
The source database is the one you aim to migrate to Kubernetes using CloudNativePG. This database could be hosted anywhere, such as in an Amazon RDS instance like the one detailed in Leverage a New Way to Import an Existing Postgres Database to Kubernetes. Alternatively, it could exist within the same Kubernetes ecosystem, perhaps managed by another operator or already harnessed within CloudNativePG, such as when upgrading Postgres 13 to Postgres 16.
The process remains consistent regardless of the initial setup. To proceed, certain prerequisites must be fulfilled:
- A comprehensive list of tables targeted for importation, which can include all tables within a specific database or schema.
- A user with logical replication privileges who can establish a connection to the source database from the newly configured database in Kubernetes, and who has the necessary permissions to access the specified tables.
- A logical replication PUBLICATION configured in the PostgreSQL origin database.
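On a source that is not managed by CloudNativePG (an RDS instance, for example), these prerequisites can be met imperatively with SQL along these lines — a sketch reusing this article's angus user and songs table:

```sql
-- Run on the origin database with sufficient privileges.
CREATE ROLE angus WITH LOGIN REPLICATION PASSWORD 'change-me';
-- (On RDS the REPLICATION attribute is not directly available;
--  use GRANT rds_replication TO angus instead.)
GRANT SELECT ON songs TO angus;              -- needed for the initial data copy
CREATE PUBLICATION migrate FOR TABLE songs;  -- what the destination subscribes to
```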
These steps should typically be automated, employing scripts to simplify execution. For illustration, the following YAML configuration for CloudNativePG demonstrates how to initiate a PostgreSQL 13 cluster named pg13 within your Kubernetes cluster, meeting the above conditions:
apiVersion: v1
data:
  password: QW5ndXNZb3VuZzE5NTU=
  username: YW5ndXM=
kind: Secret
metadata:
  name: angus
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg13
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:13
  enableSuperuserAccess: false
  instances: 1
  storage:
    size: 1Gi
  bootstrap:
    initdb:
      dataChecksums: true
      walSegmentSize: 32
      postInitApplicationSQL:
        - CREATE TABLE songs (id SERIAL PRIMARY KEY, title TEXT)
        - ALTER TABLE songs OWNER TO app
        - INSERT INTO songs(title) VALUES ('Back in black')
        - CREATE PUBLICATION migrate FOR TABLE songs
  managed:
    roles:
      - name: angus
        ensure: present
        comment: User for logical replication connections
        login: true
        replication: true
        inRoles:
          - app
        passwordSecret:
          name: angus
  postgresql:
    pg_hba:
      # Logical replication connections match regular database entries,
      # not the "replication" keyword (which covers physical replication only)
      - hostssl app angus 10.0.0.0/8 md5
The YAML utilizes a standard PostgreSQL 13 image (imageName) from the CloudNativePG community repository. The bootstrap section executes several SQL commands as the postgres user to configure the application database (app). These commands include:
- Setting up a table named songs with a SERIAL type to establish a sequence.
- Passing ownership of the table to the application user (app), who owns the app database.
- Inserting a song entry into the table.
- Creating a publication titled migrate, focused on the songs table (all tables could also be specified – refer to the official "CREATE PUBLICATION" documentation for additional options).
Within the .spec.managed.roles section, the user angus (the name is arbitrary) is established to facilitate connectivity from the target database, ensuring:
- Membership in the app role, thereby granting angus access to read the songs table.
- A password stored in the secret named angus, included at the beginning of the YAML file and generated through:
kubectl create secret \
--dry-run=client -o yaml generic angus \
--from-literal=username=angus \
--from-literal=password=AngusYoung1955
Proceed by applying the provided manifest to your Kubernetes cluster. For those without a Kubernetes setup, consider utilizing KinD (Kubernetes in Docker) as explained in the CloudNativePG quickstart.
After a few seconds, your pg13 cluster should be up:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
pg13-1 1/1 Running 0 29s
You can now verify that the songs table is in the app database and is owned by the app user:
$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\d'
List of relations
Schema | Name | Type | Owner
--------+--------------+----------+-------
public | songs | table | app
public | songs_id_seq | sequence | app
(2 rows)
Let’s now query the songs table:
$ kubectl exec -ti pg13-1 -c postgres -- psql app -c 'SELECT * FROM songs'
id | title
----+---------------
1 | Back in black
(1 row)
As expected, the angus user exists and is affiliated with the app role:
$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\du angus'
List of roles
Role name | Attributes | Member of
-----------------+-------------+-----------
angus | Replication | {app}
Finally, let’s check the publication:
$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\dRp+'
Publication migrate
Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
postgres | f | t | t | t | t | f
Tables:
"public.songs"
Everything is in line with our configuration file.
The Database of Destination
While the source database can vary, the destination database will reliably utilize CloudNativePG, allowing us predictable control over its setup. Our goal here is to create a new PostgreSQL 16 cluster, referred to as pg16. Although PostgreSQL 16 is in beta at the time of writing, using it facilitates robust testing ahead of the anticipated official release by the PostgreSQL Global Development Group (PGDG) later in 2023.
In this example, assume both pg13 and pg16 are housed in the same namespace within a Kubernetes cluster, though this setup is adaptable to different environments.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg16
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:16
  enableSuperuserAccess: false
  instances: 1
  storage:
    size: 1Gi
  bootstrap:
    initdb:
      dataChecksums: true
      walSegmentSize: 32
      import:
        schemaOnly: true
        type: microservice
        databases:
          - app
        source:
          externalCluster: pg13
        postImportApplicationSQL:
          - CREATE SUBSCRIPTION migrate CONNECTION 'host=pg13-rw user=angus dbname=app sslmode=require password=AngusYoung1955' PUBLICATION migrate
  externalClusters:
    - name: pg13
      connectionParameters:
        # Use the correct IP or host name for the source database
        host: pg13-rw
        user: angus
        dbname: postgres
      password:
        name: angus
        key: password
Here, we create a new Postgres instance from scratch and, upon setup, import only the schema of the app database from the pg13 cluster, as detailed under the externalClusters section. This involves connecting to the primary endpoint designated by the pg13-rw service, utilizing the angus user credentials stored in the same secret as used for pg13.
On completing the import process, we request the operator to establish a SUBSCRIPTION to the migrate PUBLICATION on pg13. Note: security-conscious users will observe a plaintext password in the subscription connection string. This will be rectified starting with version 1.20.2 (see issue #2386).
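If you would rather not embed the password in the manifest, the subscription can instead be created imperatively after the import — a sketch, assuming the cluster names used in this article:

```sql
-- Run against pg16's app database, e.g.:
--   kubectl exec -ti pg16-1 -c postgres -- psql app
-- The password is deliberately omitted from the connection string here;
-- it must be supplied by other means (for example a .pgpass file for the
-- postgres user), which is an assumption about your setup.
CREATE SUBSCRIPTION migrate
  CONNECTION 'host=pg13-rw user=angus dbname=app sslmode=require'
  PUBLICATION migrate;
```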
Apply the above manifest to your Kubernetes cluster. Following the import job, the pg16 cluster will become operational:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
pg13-1 1/1 Running 0 10m
pg16-1 1/1 Running 0 2m36s
The import phase successfully creates the database schema:
$ kubectl exec -ti pg16-1 -c postgres -- psql app -c '\d'
List of relations
Schema | Name | Type | Owner
--------+--------------+----------+-------
public | songs | table | app
public | songs_id_seq | sequence | app
(2 rows)
To verify, query the songs table in the pg16 instance's app database:
$ kubectl exec -ti pg16-1 -c postgres -- psql app -c 'SELECT * FROM songs'
id | title
----+---------------
1 | Back in black
(1 row)
The presence of the record confirms the schema import worked. Next, add another song to the pg13 instance and verify its replication to pg16:
$ kubectl exec -ti pg13-1 -c postgres \
-- psql app -c "INSERT INTO songs(title) VALUES('Thunderstruck')"
INSERT 0 1
Then check:
$ kubectl exec -ti pg16-1 -c postgres \
-- psql app -c 'SELECT * FROM songs'
id | title
----+---------------
1 | Back in black
2 | Thunderstruck
(2 rows)
Replication works!
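Beyond spot-checking rows, replication health can be inspected through PostgreSQL's standard statistics views — a quick sketch (these views are plain PostgreSQL, not CNPG-specific):

```sql
-- On pg16 (the subscriber): one row per active subscription worker.
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;

-- On pg13 (the publisher): the replication slot created for the subscription.
SELECT slot_name, active, confirmed_flush_lsn
FROM pg_replication_slots;
```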
Before promoting the pg16 database to handle write operations from your application, ensure the songs_id_seq sequence is properly set using the setval() function to sync with the sequence in pg13.
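A minimal sketch of that sequence alignment, run on pg16 after writes to pg13 have stopped; it derives the value from the replicated data itself:

```sql
-- Set the sequence so the next INSERT on pg16 continues after the
-- highest replicated id; with setval()'s default third argument (true),
-- nextval() will return max(id) + 1.
SELECT setval('songs_id_seq', (SELECT max(id) FROM songs));
```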
Ready to Revolutionize Your Database Management
This article illustrates how, starting with CloudNativePG version 1.20.1, you can leverage the schemaOnly option of the import capability to execute major online upgrades and seamless database imports using a blend of declarative and imperative methodologies.
Here’s the streamlined step-by-step online import and upgrade process:
- Deploy the target database as demonstrated in the pg16 example, executing a schema-only import.
- Implement the subscription either within the post-import section or by executing the relevant SQL statement imperatively, which prompts the start of replication.
- Upon synchronizing the source and target databases, conduct a thorough data verification.
- Initiate the downtime for your application.
- Adjust the sequences by using a script or migration approach, tailored to your specific requirements.
- Once complete, deploy the upgraded version of your application pointing to the new PostgreSQL Cluster.
- Conclude the application downtime.
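Once the application has been switched to pg16, the subscription itself can be cleaned up — a sketch using the objects from this article:

```sql
-- On pg16: stop applying changes, then drop the subscription.
ALTER SUBSCRIPTION migrate DISABLE;
-- Dropping the subscription also removes the associated replication
-- slot on the pg13 publisher (which must still be reachable for this).
DROP SUBSCRIPTION migrate;
```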
We encourage feedback on the current methodology to enhance CloudNativePG, aiming for a fully automated and declarative process. Potential improvements encompass:
- Declarative management of logical replication subscriptions.
- Automated sequence synchronization during the cutover procedure – potentially as part of an execution script.
- An advanced bootstrap strategy coordinating these steps for enhanced import functionality.
With the right application of native logical replication, migrating a PostgreSQL 10+ database into a CloudNativePG-managed cluster of the same or a newer major version can achieve near-zero cutover time. This methodology is equally effective for transitioning to CloudNativePG from other PostgreSQL operators in Kubernetes, minimizing downtime to negligible levels.
While requiring temporary additional resources, this method offers valuable benefits:
- Extended testing time on the new database version, minimizing post-migration issue risks.
- Rollback capabilities if initial testing is unsuccessful.
- Greater flexibility in managing SQL extensions, enabling upgrades of both PostgreSQL and core extensions.
- Minimal cutover downtime.
When CloudNativePG introduces pg_upgrade support (already achievable through hibernation capabilities when applied imperatively), you’ll have three distinct PostgreSQL database upgrade options:
- Major offline upgrades via the import feature.
- Seamless major online upgrades using import and logical replication.
- In-place offline upgrades utilizing pg_upgrade.
Select the approach that aligns optimally with your strategic database management goals.
Join our CloudNativePG community chat and participate in our bi-weekly development meetings to share your insights, contributing to the progression of PostgreSQL solutions on Kubernetes.
For comprehensive support with the open-source stack, EDB, the creators of CloudNativePG and key PostgreSQL project contributors, offer our Community 360 services. We also empower you to deploy EDB Postgres Extended and EDB Postgres Advanced, featuring our Oracle compatibility layer, through EDB CloudNativePG Cluster – compatible with Red Hat OpenShift and available with Long Term Support terms. Engage with us to seamlessly integrate Postgres within your Kubernetes infrastructure!
Featured picture: 2 Container ships alongside in Le Verdon, a subsidiary of Bordeaux port, France, available under CC BY-SA 3.0.
Frequently Asked Questions
What is a CloudNativePG major upgrade?
A CloudNativePG major upgrade involves transitioning a PostgreSQL database to a newer major version within a Kubernetes cluster, delivering new features and improvements without interrupting service.
How does a CNPG major version upgrade minimize downtime?
A CNPG major version upgrade leverages logical replication to synchronize data between the old and new PostgreSQL clusters, allowing you to minimize downtime during the transition.
What role does logical replication play?
Logical replication enables the transfer of real-time changes between databases, facilitating seamless upgrades and minimizing downtime by ensuring data consistency across PostgreSQL databases.
Can you achieve zero downtime during a major upgrade?
Yes: by using CloudNativePG’s logical replication and schema-only import features, you can achieve nearly zero downtime during major version upgrades of PostgreSQL databases.
What are the prerequisites for a seamless major upgrade?
You need a user with logical replication privileges, a publication configured in the source database, and a Kubernetes cluster in which to run the destination.
How are schema changes handled during an upgrade?
CloudNativePG requires manual management of schema changes, as DDL commands are not replicated. It's advisable to avoid schema modifications during the upgrade or apply them manually.
Why run PostgreSQL in a Kubernetes cluster?
A Kubernetes cluster provides a scalable and flexible environment for deploying and managing PostgreSQL databases, enabling efficient resource use and simplified database operations.
Can you roll back an upgrade?
Yes: with this method you can roll back if initial testing is unsuccessful, and the extended testing time on the new version keeps risks minimal.
Which upgrade options does CloudNativePG offer?
With CloudNativePG, you can choose major offline upgrades, seamless major online upgrades using logical replication, or in-place offline upgrades utilizing pg_upgrade.
Why choose CloudNativePG for PostgreSQL in Kubernetes?
CloudNativePG provides a comprehensive framework for managing PostgreSQL databases in Kubernetes, offering features like logical replication, upgrades, and compatibility with cloud-native architectures.
How should large objects be handled?
Since large objects (LOBs) are not replicated, they must be migrated manually. Nowadays, it’s common to store binary data as bytea fields, which may make this step unnecessary.
How can you verify that logical replication is working?
You can verify logical replication by checking that data changes in the source database are reflected in the destination database, using SQL queries within the Kubernetes environment.