Enhancing PostgreSQL in Kubernetes: Seamless Upgrades with CloudNativePG

November 12, 2024

Leverage CloudNativePG to ensure smooth transitions and minimal downtime

In a previous blog post, Leverage a New Way to Import an Existing Postgres Database to Kubernetes, we discussed migrating a PostgreSQL 10 database from RDS to a CloudNativePG (CNPG) managed PostgreSQL database cluster within a Kubernetes environment. This utilized the database import capability, leveraging logical backup and restore methods such as pg_dump and pg_restore.

However, this type of database migration has a significant limitation: it does not capture changes made to the origin database once the process begins. Consequently, transitioning from the old to the new database requires you to halt write operations on the origin's primary database until the migration is complete. This approach is known as offline database import, and when upgrading to a new major PostgreSQL version, it becomes an offline major upgrade.

This process allows you to:

  • Verify that the import or upgrade processes function correctly on a new cluster, ensuring compatibility with your applications on the updated Postgres version.
  • Accurately estimate the cutover time, primarily based on import duration.

The first point assesses readiness for a major version upgrade, while the second considers whether an offline major upgrade aligns with your organization's tolerance for downtime during the database cutover. If cutover timing is suitable, you can proceed with an offline upgrade. Otherwise, you might explore using PostgreSQL's native logical replication to minimize cutover downtime to nearly zero.

This article aims to update on achieving approximately zero cutover time with CloudNativePG when performing database imports or major version upgrades of PostgreSQL within a Kubernetes cluster. Known as online database import and major online upgrades, these processes ensure the destination database uses a higher major PostgreSQL version than the original, which is the main focus of this article.

Major Online Upgrades

Major online upgrades with PostgreSQL have been an essential technique for many years. Originally, these upgrades relied on PostgreSQL's extensibility and triggers. The central concept involved using triggers to route changes in the database to a queue, which consumers would then process and apply to the destination database. One prominent tool for this was Londiste, developed at Skype, which handled data and sequence synchronization, ensuring zero cutover time. However, the trigger-based approach had notable drawbacks, particularly its impact on the primary database. Each change initiated an event stored within the queue (inserts, updates, deletes), represented by rotating tables.

With the introduction of logical decoding in PostgreSQL 9.4, extensions like pglogical 2 became popular for performing major online upgrades. Logical replication decoding proved to be lighter than the trigger-based method, easing performance burdens on pglogical performance. With the release of PostgreSQL 10, the database introduced native support for logical replication via publications and subscriptions, which has become crucial for CloudNativePG (CNPG) major version upgrade practices. This article will leverage these native PostgreSQL capabilities to perform major online upgrades within CloudNativePG.

Before we continue, it's crucial to acknowledge certain limitations of logical replication in PostgreSQL, including:

  • Replication of Data Definition Language (DDL), such as CREATE TABLE and ALTER TABLE, does not occur.
  • Sequences are not replicated.
  • Large objects (LOBs) are excluded from replication.

In practical terms, you are responsible for ensuring the following during a cloudnativepg major upgrade:

  • Managing database schema changes, as DDL is not replicated. Avoid schema modifications during upgrades or apply changes manually if needed.
  • Updating sequences as part of the cutover process.

Migrating large objects if any exist. Today, binary data in PostgreSQL is typically stored as bytea fields, so this issue may not arise.

The Database of Origin

The source database is the one you aim to migrate to Kubernetes using CloudNativePG. This database could be hosted anywhere, such as in an Amazon RDS instance like the one detailed in Leverage a New Way to Import an Existing Postgres Database to Kubernetes. Alternatively, it could exist within the same Kubernetes ecosystem, perhaps managed by another operator or already harnessed within CloudNativePG, such as when upgrading Postgres 13 to Postgres 16.

The process remains consistent regardless of the initial setup. To proceed, certain prerequisites must be fulfilled:

  • A comprehensive list of tables targeted for importation, which can include all tables within a specific database or schema.
  • A user with logical replication privileges who can establish a connection to the source database from the newly configured database in Kubernetes, and who has the necessary permissions to access the specified tables.
  • A logical replication PUBLICATION configured in the PostgreSQL origin database.

These steps should typically be automated, employing scripts to simplify execution. For illustration, the following YAML configuration for CloudNativePG demonstrates how to initiate a PostgreSQL 13 instance named pg13 within your Kubernetes cluster, meeting the above conditions:

apiVersion: v1
data:
  password: QW5ndXNZb3VuZzE5NTU=
  username: YW5ndXM=
kind: Secret
metadata:
  name: angus
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg13
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:13
  enableSuperuserAccess: false
  instances: 1
  storage:
    size: 1Gi
  bootstrap:
    initdb:
      dataChecksums: true
      walSegmentSize: 32
      postInitApplicationSQL:
      - CREATE TABLE songs (id SERIAL PRIMARY KEY, title TEXT)
      - ALTER TABLE songs OWNER TO app
      - INSERT INTO songs(title) VALUES ('Back in black')
      - CREATE PUBLICATION migrate FOR TABLE songs
  managed:
    roles:
    - name: angus
      ensure: present
      comment: User for logical replication connections
      login: true
      replication: true
      inRoles:
      - app
      passwordSecret:
        name: angus
  postgresql:
    pg_hba:
    - hostssl replication angus 10.0.0.0/8 md5

The YAML utilizes a standard PostgreSQL 13 image (imageName) from the CloudNativePG community repository. The bootstrap section executes several SQL commands executed by the postgres user to configure the application database (app). These commands include:

  • Setting up a table named songs with a SERIAL type to establish a sequence.
  • Passing ownership of the table to the application user (app), who owns the app database.
  • Inserting a song entry into the table.
  • Creating a publication titled migrate, focused on the songs table (all tables could also be specified – refer to the official "CREATE PUBLICATION" documentation for additional options).

Within the .spec.managed.roles section, the user angus (name assigned arbitrarily) is established to facilitate connectivity from the target database, ensuring:

  • Membership to the app role, thereby granting angus access to read the songs table.
  • A secret storing the password (labeled angus) is included at the beginning of the YAML file, generated through:
kubectl create secret \
  --dry-run=client -o yaml generic angus \
  --from-literal=username=angus \
  --from-literal=password=AngusYoung1955

Proceed by applying the provided manifest to your Kubernetes cluster. For those without a Kubernetes setup, consider utilizing KinD (Kubernetes in Docker) as explained in CloudNativePG quickstart.

After a few seconds, your pg13 cluster should be up:

$ kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
pg13-1   1/1     Running   0          29s

You can now verify the songs table is in the app database and is owned by the app user:

$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\d'
            List of relations
 Schema |     Name     |   Type   | Owner
--------+--------------+----------+-------
 public | songs        | table    | app
 public | songs_id_seq | sequence | app
(2 rows)

Let’s now query the songs table:

$ kubectl exec -ti pg13-1 -c postgres -- psql app -c 'SELECT * FROM songs'
 id |     title
----+---------------
  1 | Back in black
(1 row)

As expected, the angus user exists and is affiliated with the app role:

$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\du angus'
               List of roles
    Role name    | Attributes  | Member of
-----------------+-------------+-----------
 angus | Replication | {app}

Finally, let’s check the publication:

$ kubectl exec -ti pg13-1 -c postgres -- psql app -c '\dRp+'
                            Publication migrate
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "public.songs"

Everything is in line with our configuration file.

The Database of Destination

While the source database can vary, the destination database will reliably utilize CloudNativePG, allowing us predictable control over its setup. Our goal here is to create a new PostgreSQL 16 instance, referred to as pg16. Although PostgreSQL 16 is in beta at the time of writing, using it facilitates robust testing ahead of the anticipated official release by the PostgreSQL Global Development Group (PGDG) later in 2023.

In this example, assume both pg13 and pg16 are housed in the same namespace within a Kubernetes cluster, though this setup is adaptable to different environments.

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg16
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:16
  enableSuperuserAccess: false
  instances: 1
  storage:
    size: 1Gi
  bootstrap:
    initdb:
      dataChecksums: true
      walSegmentSize: 32
      import:
        schemaOnly: true
        type: microservice
        databases:
        - app
        source:
          externalCluster: pg13
        postImportApplicationSQL:
        - CREATE SUBSCRIPTION migrate CONNECTION 'host=pg13-rw user=angus dbname=app sslmode=require password=AngusYoung1955' PUBLICATION migrate
  externalClusters:
  - name: pg13
    connectionParameters:
      # Use the correct IP or host name for the source database
      host: pg13-rw
      user: angus
      dbname: postgres
    password:
      name: angus
      key: password
As you can see, we create a new 

Here, we create a new Postgres instance from scratch and, upon setup, import only the schema of the app database from the pg13 cluster, as detailed under the externalClusters section. This involves connecting to the primary endpoint designated by the pg13-rw service, utilizing the angus user credentials stored in the same secret as used for pg13.

On completing the import process, we request the operator to establish a SUBSCRIPTION to the migrate PUBLICATION on pg13. Note: Security-conscious users will observe a plaintext password in the subscription connection string. This will be rectified in an upcoming release (issue #2386) from version 1.20.2.

Apply the above manifest to your Kubernetes cluster. Following the import job, the pg16 cluster will become operational:

$ kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
pg13-1   1/1     Running   0          10m
pg16-1   1/1     Running   0          2m36s

The import phase successfully creates the database schema:

$ kubectl exec -ti pg16-1 -c postgres -- psql app -c '\d'
            List of relations
 Schema |     Name     |   Type   | Owner
--------+--------------+----------+-------
 public | songs        | table    | app
 public | songs_id_seq | sequence | app
(2 rows)

To verify, query the songs table in the pg16 instance's app database:

$ kubectl exec -ti pg16-1 -c postgres -- psql app -c 'SELECT * FROM songs'
 id |     title
----+---------------
  1 | Back in black
(1 row)

The presence of the record confirms successful logical replication. Next, add another song to the pg13 instance and verify its replication to pg16:

$ kubectl exec -ti pg13-1 -c postgres \
  -- psql app -c "INSERT INTO songs(title) VALUES('Thunderstruck')"
INSERT 0 1

Then check:

$ kubectl exec -ti pg16-1 -c postgres \
  -- psql app -c 'SELECT * FROM songs'
 id |     title
----+---------------
  1 | Back in black
  2 | Thunderstruck
(2 rows)

Replication works!

Before promoting the pg16 database to handle write operations from your application, ensure the songs_id_seq sequence is properly set using the setval() function to sync with the sequence in pg13.

Ready to Revolutionize Your Database Management

This article illustrates how, starting with CloudNativePG version 1.20.1, you can leverage the schemaOnly feature of the import capability to execute major online upgrades and seamless database imports using a blend of declarative and imperative methodologies.

Here’s the streamlined step-by-step online import and upgrade process:

  • Deploy the target database as demonstrated in the pg16 example, executing a schema-only import.
  • Implement the subscription either within the post-import segment or by executing the relevant SQL statement imperatively, which prompts the start of replication.
  • Upon synchronizing the source and target databases, conduct a thorough data verification.
  • Initiate the downtime for your application.
  • Adjust the sequences by using a script or migration approach, tailored to your specific requirements.
  • Once complete, deploy the upgraded version of your application pointing to the new PostgreSQL Cluster.
  • Conclude the application downtime efficiently.

We encourage feedback on the current methodology to enhance CloudNativePG, aiming for a fully automated and declarative process. Potential improvements encompass:

  • Declarative management of logical replication subscriptions.
  • Automated sequence synchronization during the cutover procedure – potentially as part of an execution script.
  • An advanced bootstrap strategy coordinating these steps for enhanced import functionality.

With the right application of native logical replication, migrating a PostgreSQL 10+ database into a CloudNativePG-managed cluster of the same or a newer major version can achieve near-zero cutover time. This methodology is equally effective for transitioning to CloudNativePG from other PostgreSQL operators in Kubernetes, minimizing downtime to negligible levels.

While requiring temporary additional resources, this method offers valuable benefits:

  • Extended testing time on the new database version, minimizing post-migration issue risks.
  • Rollback capabilities if initial testing is unsuccessful.
  • Greater flexibility in managing SQL extensions, enabling upgrades of both PostgreSQL and core extensions.
  • Minimal cutover downtime.

When CloudNativePG introduces pg_upgrade support (already achievable through hibernation capabilities when applied imperatively), you’ll have three distinct PostgreSQL database upgrade options:

  • Major offline upgrades via the import feature.
  • Seamless major online upgrades using import and logical replication.
  • In-place offline upgrades utilizing pg_upgrade.

Select the approach that aligns optimally with your strategic database management goals.

Join our CloudNativePG community chat and participate in our bi-weekly development meetings to share your insights, contributing to the progression of PostgreSQL solutions on Kubernetes.

For comprehensive support with the open-source stack, EDB, the creators of CloudNativePG and key PostgreSQL project contributors, offer our Community 360 services. We also empower you to deploy EDB Postgres Extended and EDB Postgres Advanced, featuring our Oracle compatibility layer, through EDB CloudNativePG Cluster – compatible with Red Hat OpenShift and available with Long Term Support terms. Engage with us to seamlessly integrate Postgres within your Kubernetes infrastructure!

Featured picture: 2 Container ships alongside in Le Verdon, a subsidiary of Bordeaux port, France, available under CC BY-SA 3.0.

Share this
What is a CloudNativePG major upgrade? chevron_right

A CloudNativePG major upgrade involves transitioning a PostgreSQL database to a newer major version within a Kubernetes cluster, ensuring more features and improvements without interrupting service.

How does a CNPG major version upgrade work? chevron_right

A CNPG major version upgrade leverages logical replication to synchronize data between the old and new PostgreSQL clusters, allowing you to minimize downtime during the transition.

Why is logical replication important for PostgreSQL database upgrades? chevron_right

Logical replication enables the transfer of real-time changes between databases, facilitating seamless upgrades and minimizing downtime by ensuring data consistency across PostgreSQL databases.

Can I perform a PostgreSQL database upgrade without downtime? chevron_right

Yes, by using CloudNativePG’s logical replication and schema-only features, you can achieve nearly zero downtime during major version upgrades of PostgreSQL databases.

What are the prerequisites for performing a major upgrade in a Kubernetes cluster? chevron_right

You need logical replication privileges, a configured publication in the source database, and Kubernetes cluster setup to perform seamless major upgrades with CloudNativePG.

How does CloudNativePG handle PostgreSQL database schema changes during upgrades? chevron_right

CloudNativePG requires manual management of schema changes as DDL commands are not replicated. It's advisable to avoid schema modifications during the upgrade or apply them manually.

What is the role of a Kubernetes cluster in managing PostgreSQL databases? chevron_right

A Kubernetes cluster provides a scalable and flexible environment for deploying and managing PostgreSQL databases, enabling efficient resource use and simplified database operations.

Is it possible to rollback a PostgreSQL upgrade if issues are encountered? chevron_right

Yes, using CloudNativePG, you can rollback an upgrade if necessary. The method allows extended testing on the new version, ensuring minimal risks and flexible management.

What are the options for PostgreSQL database upgrades with CloudNativePG? chevron_right

With CloudNativePG, you can choose major offline upgrades, seamless major online upgrades using logical replication, or in-place offline upgrades utilizing pg_upgrade.

How does CloudNativePG enhance database management in Kubernetes? chevron_right

CloudNativePG provides a comprehensive framework for managing PostgreSQL databases in Kubernetes, offering features like logical replication, upgrades, and compatibility with cloud-native architectures.

What should be done if large objects are present in a PostgreSQL database during upgrades? chevron_right

Since large objects (LOBs) are not replicated, they should be migrated manually. Nowadays, it’s common to store binary data as bytea fields, which may simplify this process.

How can you verify successful logical replication in CloudNativePG? chevron_right

You can verify logical replication by checking if data changes in the source database are reflected in the destination database using SQL queries within the Kubernetes environment.

Transform Your Database Solution with EDB

Unlock the power of PostgreSQL and Kubernetes for optimized data management