PG Phriday: pglogical and Postgres 10 Partitions

September 22, 2017

During the Postgres Open 2017 conference in San Francisco, someone came to the 2ndQuadrant booth and struck up a conversation with me. During our shameless geeking out over database mechanics, he asked me if pglogical supported the new Postgres 10 partitions. Given my noted expertise in all things Postgres, I answered in the appropriate manner:

"I have no idea. I’ll have to look into that."

Well, after a bit of experimentation, I have a more concrete answer, and it’s reassuringly positive.

The Problem

Given a table on a provider node, is it possible to capture only INSERT traffic such that it accumulates on a subscribed system for archival purposes? It's a fairly common tactic that allows an active OLTP system to regularly purge old data while a reporting OLAP system keeps it available for posterity.

To get this experiment going, it’s necessary to begin with a regular table that might fit this model.

CREATE TABLE sensor_log (
  id            SERIAL PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
);

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, round(random() * 100),
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 1000000) s(id);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'prod_sensors',
    dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pglogical.create_replication_set(
    set_name := 'logging',
    replicate_insert := TRUE, replicate_update := FALSE,
    replicate_delete := FALSE, replicate_truncate := FALSE
);

SELECT pglogical.replication_set_add_table(
    set_name := 'logging', relation := 'sensor_log', 
    synchronize_data := TRUE
);

There’s nothing really surprising here. We create the table, install pglogical, and register the node itself. Next, we create a replication set that captures only INSERT activity. Why just inserts? It’s probably safe to also include UPDATE actions, but for the sake of this demonstration, we have a write-only ledger-style table.

After creating the new replication set, we just need to add any table(s) that fit that insert model. While pglogical provides a default_insert_only replication set that does this for us, we find it’s generally better to be explicit to avoid any unintended (and unexpected) magic.
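If we want to double-check our handiwork, pglogical tracks all of this in its own catalog tables. Assuming the catalog layout of pglogical 2.x, a quick sanity check on the provider might look something like this:

SELECT set_name, replicate_insert, replicate_update,
       replicate_delete, replicate_truncate
  FROM pglogical.replication_set
 WHERE set_name = 'logging';

-- And the tables assigned to each replication set:
SELECT s.set_name, t.set_reloid::regclass AS table_name
  FROM pglogical.replication_set s
  JOIN pglogical.replication_set_table t USING (set_id);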

Proof of Concept

With the provider properly configured, all that remains is to set up the subscriber node. This is very similar to setting up the provider node: create table, install pglogical, create subscription. We can do that now:

CREATE TABLE sensor_log (
  id            INT PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  reading_date  TIMESTAMP NOT NULL
);

CREATE EXTENSION pglogical;

SELECT pglogical.create_node(
    node_name := 'sensor_warehouse',
    dsn := 'host=localhost port=5435 dbname=phriday'
);

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

  count  
---------
 1000000
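Row counts are one thing, but pglogical will also happily report the subscription state itself. According to the pglogical documentation, a healthy subscription shows a status of 'replicating':

SELECT subscription_name, status
  FROM pglogical.show_subscription_status();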

Once again, we err on the side of caution and do a couple of things manually that may not strictly be necessary. By that, we mean manually creating the sensor_log table on the subscriber node.

The create_subscription function has a parameter called synchronize_structure to skip the table-creation step. On the other hand, it uses pg_dump to obtain the table structure DDL, so the import might fail if the recipient database isn't empty. We can skip that whole dance by not using the parameter at all.
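For illustration purposes only, had we started with an empty subscriber database and wanted pglogical to copy the table structure for us, the subscription call would simply have included that parameter:

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday',
    synchronize_structure := true
);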

Once we’ve verified the one-million sample rows have transferred, our job is done, right?

Starting Over

Well, almost. There’s still time to be fancy. While we have proven it’s possible to capture only inserted data, Postgres 10 table partitions are still an unknown quantity in this relationship. It turns out, their implementation under the hood is an extremely relevant detail.

To see just how, we need to tear down the subscription and drop the recipient table on the subscriber:

SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);

DROP TABLE sensor_log;

Don’t worry, our sensor_log table will be back, and better than ever.

Down the Rabbit Hole

We only inserted one million rows into the provider node's copy of sensor_log. As it turns out, the dates we generated never even leave 2017. That's fine though, because with Postgres 10 partitions, even a single partition is sufficient to demonstrate the process.
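A quick check on the provider node confirms the generated readings never stray outside the year:

SELECT min(reading_date), max(reading_date)
  FROM sensor_log;

Since one million rows at ten-second intervals only reach back about 116 days, both values land comfortably within 2017.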

Let’s start with a single table partitioned by the reading_date column:

CREATE TABLE sensor_log (
  id             SERIAL,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
)
PARTITION BY RANGE (reading_date);

CREATE TABLE sensor_log_part_2017
PARTITION OF sensor_log
FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

CREATE UNIQUE INDEX udx_sensor_log_2017_id ON sensor_log_part_2017 (id);
CREATE INDEX idx_sensor_log_2017_location ON sensor_log_part_2017 (location);
CREATE INDEX idx_sensor_log_2017_date ON sensor_log_part_2017 (reading_date);

We lament the inability to use the LIKE syntax to copy any placeholder indexes on the root table, but maybe that’ll show up in Postgres 11 or 12. Regardless, we now have one partitioned table backed by a single partition.
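On the bright side, extending the structure when 2018 arrives is a single statement; only the indexes still need to be created on the new partition by hand:

CREATE TABLE sensor_log_part_2018
PARTITION OF sensor_log
FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');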

Staring into the Abyss

This is where the fun starts! All we need to do is recreate the subscription, and the sensor_log table data should be redirected into the 2017 partition, thus proving pglogical works with Postgres 10 partitions.

Let’s try it out:

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

 count 
-------
     0

SELECT pglogical.drop_subscription(
    subscription_name := 'wh_sensor_data'
);

Wait, what’s going on here? Why isn’t the table being copied at all? Let’s see what the logs have to say…

2017-09-18 14:36:03.065 CDT [4196] LOG:  starting receiver for subscription wh_sensor_data
2017-09-18 14:36:03.111 CDT [4196] ERROR:  pglogical target relation "public.sensor_log" is not a table

Oh…

The Abyss Stares Back

It just so happens that Postgres partitioned tables aren't actually tables. They're more of a table-like structure that allows certain database operations to target the underlying partitions. We can even see this for ourselves by checking out the pg_class system catalog table:

SELECT relname, relkind
  FROM pg_class
 WHERE relname LIKE 'sensor_log%';

       relname        | relkind 
----------------------+---------
 sensor_log           | p
 sensor_log_id_seq    | S
 sensor_log_part_2017 | r

The relkind column tells us which type of object we're looking at. Normal tables in Postgres are marked 'r' for relation. The sensor_log table on the subscriber, however, shows 'p' for partitioned table. That actually matters, because only real relations can store data. When pglogical sees that the partitioned table isn't a relation, it refuses to continue.
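Postgres 10 also records the partitioning metadata in the new pg_partitioned_table catalog, where our range strategy appears as 'r':

SELECT partrelid::regclass AS table_name,
       partstrat AS strategy
  FROM pg_partitioned_table;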

Pglogical's decision to refuse to insert into sensor_log isn't unique. Had we attempted this experiment with Postgres 10's new PUBLICATION / SUBSCRIPTION logical replication system, we would have gotten the same result; partitioned tables are simply too new for either system to target directly.
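For the curious, a sketch of that equivalent native attempt might look like this. The provider side works fine since its sensor_log is a regular table, but the subscriber's apply worker balks at the partitioned target just like pglogical did:

-- On the provider:
CREATE PUBLICATION logging FOR TABLE sensor_log;

-- On the subscriber, where sensor_log is partitioned:
CREATE SUBSCRIPTION wh_sensor_data
    CONNECTION 'host=localhost port=5434 dbname=phriday'
    PUBLICATION logging;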

A Way Out

Despite implementation details causing a bit of a non-intuitive roadblock, there's a way around this: we cheat. Unlike Postgres 10's built-in logical replication, pglogical can apply changes through advanced internal APIs. One of those is the Postgres Server Programming Interface (SPI).

The default behavior of logical decoding is to match Postgres internal objects directly to prevent structural incompatibilities. As such, it matters that sensor_log isn't a regular relation: the partitioned parent is ultimately ephemeral and can't store any data itself.

But what if pglogical could convert the logical decoding into literal INSERT statements instead? Well, the pglogical documentation tells us we can do that by setting these parameters in the subscriber's postgresql.conf:

pglogical.conflict_resolution = error
pglogical.use_spi = true

The first effectively disables conflict resolution by simply raising an error on any conflict; the SPI apply mechanism doesn't support resolving conflicts, and we don't really need it on the subscriber anyway, since it's merely receiving a stream of inserts. The second enables the SPI apply worker, which converts the logical decoding stream directly into actual INSERT statements.
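Since these are ordinary configuration parameters, we can confirm they took effect after reloading (or restarting, if the setting demands it) the subscriber:

SELECT pg_reload_conf();

SHOW pglogical.use_spi;
SHOW pglogical.conflict_resolution;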

Back to Reality

If we try the subscription again, we should see our expected result:

SELECT pglogical.create_subscription(
    subscription_name := 'wh_sensor_data',
    replication_sets := array['logging'],
    provider_dsn := 'host=localhost port=5434 dbname=phriday'
);

SELECT pg_sleep(10);

SELECT COUNT(*) FROM sensor_log;

  count  
---------
 1000000

SELECT COUNT(*) FROM sensor_log_part_2017;

  count  
---------
 1000000

So not only has the partition system worked, but we didn't need any of the triggers that previous inheritance-based attempts at this model required. Postgres 10 worked as advertised, and it's still a beta build at the time of this writing.
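For contrast, here's a hypothetical sketch of what the old inheritance-era approach demanded: a routing trigger on a plain parent table, maintained by hand for every new partition:

CREATE FUNCTION sensor_log_insert()
RETURNS trigger AS $$
BEGIN
  -- Redirect each row into the appropriate child table.
  INSERT INTO sensor_log_part_2017 VALUES (NEW.*);
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_sensor_log_insert
BEFORE INSERT ON sensor_log
FOR EACH ROW EXECUTE PROCEDURE sensor_log_insert();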

While it’s unfortunate we had to jump through a couple of odd hoops to reach our intended destination, we still arrived intact. What’s more, we can see that though Postgres 10 does offer internal logical replication, it’s still an evolving feature that isn’t quite complete yet.

Postgres 11, 12, and future versions will slowly fill those cracks as patches are incorporated. In the meantime, pglogical will continue to leverage the EXTENSION system to add advanced features that Postgres core isn’t quite ready to absorb. And indeed, redirecting logical replication into a partition is a somewhat advanced use case.

I’ve always loved the Postgres EXTENSION system; augmenting Postgres functionality with things like pglogical ensures that even difficult edge cases often have a workable solution.
