How to Create a PostGIS Database in Kubernetes

Gabriele Bartolini

September 22, 2022

One of the main reasons why PostgreSQL has become so popular over the decades is its extensibility: the capability to extend the database engine with new data types and functions.

PostGIS, distributed under GNU GPL v2, is the open source extension that has contributed the most to the adoption of PostgreSQL, in my opinion. PostGIS enhances PostgreSQL by adding geometrical data types like points and polygons, spatial indexes, as well as many functions to query the database using SQL (for example, to calculate distances or the shortest path between two connected points).

However, the goal of this article is not to list the benefits of PostGIS. I am not a PostGIS expert, and there are many resources out there that can better describe it.

This article is about “PostGIS and Kubernetes,” with the dual goal to:

Demonstrate how to create a PostGIS spatial database in a PostgreSQL cluster in Kubernetes using the CloudNativePG operator, of which I am one of the maintainers
Trigger some conversations with PostGIS experts and users to improve the overall spatial experience in Kubernetes through CloudNativePG

In order to fully understand and connect your applications to the PostGIS database, please refer to the amazing article “EDB Postgres for Kubernetes for Application Developers” written by my colleague Leonardo Cecchi. Although it is written primarily for EDB Postgres for Kubernetes, most concepts apply to CloudNativePG as well. In particular, pay close attention to the sections about services, credentials and how to connect for local dev/testing.

The PostGIS operand images

CloudNativePG is designed around the concept of Immutable Application Containers (IAC), as explained in a blog article I wrote. This provides a range of benefits —to security and more — but one standout advantage is that this design ensures that such container images cannot be modified when they are running. Their content must be pre-defined at build time and properly versionized to be used in “infrastructure as code” (IaC) contexts.

For this reason, if you want to run PostGIS, you need its libraries to be installed in the container images that will be used as PostgreSQL operands by the CloudNativePG operator

Fortunately, the CloudNativePG community maintains container images for PostGIS, which are based on the official DockerHub ones.

The same mechanism can be generalized as long as the “Container Image Requirements” are met.

Creating the PostGIS cluster

A PostGIS cluster is essentially a PostgreSQL cluster with a primary and an arbitrary number of standby instances where the running operand image contains all the required PostGIS related extensions. These are installed in the application database.

By default, CloudNativePG is designed to create a single application database (called “app”) owned by a user with the same name (“app”). Although you can create multiple databases and users through the superuser access, our recommendation is to reserve a single cluster for a single database, therefore adopting the microservice approach.

All you need to do is create a new “Cluster” resource using the preferred PostgreSQL/PostGIS combination image from the above catalog, then add the required CREATE EXTENSION statements in the post initialization phase of the application database. Provided you have installed the CloudNativePG open source operator in your Kubernetes cluster, the above translates into the following YAML lines:

apiVersion: postgresql.cnpg.io/v1

kind: Cluster

metadata:

name: postgis-example

spec:

instances: 3

imageName: ghcr.io/cloudnative-pg/postgis:14-3.2

bootstrap:

initdb:

postInitApplicationSQL:

- CREATE EXTENSION postgis;

- CREATE EXTENSION postgis_topology;

- CREATE EXTENSION fuzzystrmatch;

- CREATE EXTENSION postgis_tiger_geocoder;

storage:

size: 10Gi

walStorage:

size: 2Gi

The above manifest creates a 3 node PostgreSQL 14 cluster, with a primary and two replicas, each with 10Gi for PGDATA, 2Gi reserved for WALs, and PostGIS 3.2 installed in the main application database alongside a few more extensions.

Connecting to the PostGIS cluster

CloudNativePG automatically provides a “ClusterIP” type of service to connect to the primary (and updates it after failovers and switchovers), so that applications residing inside the same Kubernetes cluster only need to point to it. Such a service uses the same name of the cluster with an “-rw” suffix. In the above example, applications can connect to the PostGIS cluster using the “postgis-example-rw” host name, the “app” user and the “app”database. The password for the “app” user is in the “postgis-example-app” secret, as explained in the documentation.

Once you have the service, you can also configure access from outside the Kubernetes cluster, using different types of services, depending also on the provider.

In case you are not familiar with some of these Kubernetes practical aspects, the article by Leonardo Cecchi I shared above will be very useful.

Conclusions

As I mentioned at the beginning of this article, I am here to trigger new conversations in the PostGIS and Kubernetes space. We welcome feedback and contributions from PostGIS users and we encourage joining our open and vendor neutral CloudNativePG community. I would love to hear stories of users that run PostGIS applications in Kubernetes using our operator to help us learn more about the goals you are trying to achieve and improve our software.

EDB provides community images for PostgreSQL and PostGIS based on RedHat UBI 8, as well as PostGIS images for our Postgres Advanced server that runs with the EDB Postgres for Kubernetes operator.

Try CloudNativePG now by following the instructions you find on the website. EDB, as the original creator of CloudNativePG and member of the community, delivers professional support through the Community 360 plan on all supported versions of Kubernetes.

EDB will be participating in the next KubeCon NA conference in Detroit as silver sponsors. On October 27, I will be speaking with Chris Milsted from Ondat about “Data On Kubernetes, Deploying And Running PostgreSQL And Patterns For Databases In a Kubernetes Cluster”. Don’t miss this opportunity!

Exploring Postgres's arena allocator by writing an HTTP server from scratch

Postgres manages memory through nested arenas called MemoryContexts. MemoryContexts are convenient because, for the most part, you don't need to worry about explicitly freeing memory because memory allocated within the...

November 06, 2024

Achieving Production-Grade Postgres Replication for High Availability

Learn how EDB Postgres Distributed improves logical replication for seamless upgrades, high availability, and optimized DML performance in PostgreSQL clusters.

October 07, 2024

Resource Feature Callout 1

Create a PostGIS Database in Kubernetes with CloudNativePG and Improve the Spatial Experience