Can PgBouncer session survive everything we throw at it?

February 06, 2020

 

This is part four of my technical response to a series of questions about the use of PgBouncer and what you need to look out for. To read the other posts in the series, use the links below. 

Part 1: Installing, configuring and testing persistent PostgreSQL connection pooling

Part 2: PgBouncer Connection Pooling: What to do when persistent connectivity is lost

Part 3: Can PgBouncer handle failover to a new machine?

 

In this post we're going to try something a little more extreme than in the previous blogs. We’ll bring the Master database down completely (in a controlled fashion - we don’t want to break anything, just simulate a complete database outage) and see if the PgBouncer session can survive everything we throw at it!

For the purpose of demonstration, I’ll use EDB Failover Manager™ (EFM) to control failover and management of the Virtual IP (VIP) during this process. EFM detects and controls any deliberate or accidental failover between databases in streaming replication, and passes the VIP between the current Master database to allow the application a single IP address with which to access the cluster currently maintaining the single point of truth in the system.

You don’t need to use EFM; there are other options available with similar capabilities and with varying levels of support. For instance, pgpool-ii can provide connection pooling, replication and load balancing in addition to controlling the VIP. You can find information on the community edition of pgpool-ii (including watchdog VIP management) on this website. And, if you are looking for enterprise support, you can learn about the EDB version of pgpool-ii on the EnterpriseDB website

 

Performing Controlled Switchover

I’ll start by performing a controlled switchover (or “promote” command) using EFM, however, later I’ll bring down the Master database completely to simulate database failure. The VIP in my setup is always set to the reserved address 192.168.15.200, and PgBouncer will always connect to this IP address. The database host holding the VIP is marked with an asterisk * to the right of the VIP column.

# ./efm promote efm -switchover

Promote/switchover command accepted by local agent. Proceeding with promotion and will reconfigure original master. Run the 'cluster-status' command for information about the new cluster state.

 

And checking the EFM cluster status:

# ./efm cluster-status efm

Cluster Status: efm



Agent Type  Address              Agent  DB       VIP

-----------------------------------------------------------------------

Master      192.168.15.6         UP     UP       192.168.15.200*

Standby     192.168.15.8         UP     UP       192.168.15.200

Witness     192.168.15.9         UP     N/A      192.168.15.200



Allowed node host list:

192.168.15.9 192.168.15.6 192.168.15.8



Membership coordinator: 192.168.15.9



Standby priority host list:

192.168.15.8



Promote Status:



DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info

---------------------------------------------------------------------------

Master      192.168.15.6                            0/9000060          

Standby     192.168.15.8         0/9000060          0/9000060          



Standby database(s) in sync with master. It is safe to promote.

 

Bringing Down the Master Database

Let’s bring down the Master, wait for EFM to fail over to the Standby and see if PgBouncer can tell the difference. (A quick test showed that our PgBouncer connection is still active.)

On the Master:

# systemctl stop edb-as-11

 

And, once automatic failover has completed, check the cluster status:

# ./efm cluster-status efm

Cluster Status: efm



Agent Type  Address              Agent  DB       VIP

-----------------------------------------------------------------------

Idle        192.168.15.6         UP     UNKNOWN  192.168.15.200

Master      192.168.15.8         UP     UP       192.168.15.200*

Witness     192.168.15.9         UP     N/A      192.168.15.200



Allowed node host list:

192.168.15.9 192.168.15.6 192.168.15.8



Membership coordinator: 192.168.15.9



Standby priority host list:

(List is empty.)



Promote Status:



DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info

---------------------------------------------------------------------------

Master      192.168.15.8                            0/A0001E0          



No standby databases were found.



Idle Node Status (idle nodes ignored in WAL LSN comparisons):



Address              WAL Received LSN   WAL Replayed LSN   Info

---------------------------------------------------------------

192.168.15.6         UNKNOWN            UNKNOWN            Connection to 192.168.15.6:5444 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.

 

Failover to the New Master Database

The new Master has taken over, has taken control of the VIP, and there is a warning in the Node Status that the old Master is no longer accepting connections.

So what happens on PgBouncer? Still in the same session as before:

pgb=# select * from test;

 x 

---

 1

 2

 3

 4

(4 rows)

 

Which is great news. The PgBouncer session is persistent and has failed over onto the new Master without incident.

Once we reintroduce the failed Master back into the EFM cluster as a Standby, we get:

# ./efm cluster-status efm

Cluster Status: efm



Agent Type  Address              Agent  DB       VIP

-----------------------------------------------------------------------

Standby     192.168.15.6         UP     UP       192.168.15.200

Master      192.168.15.8         UP     UP       192.168.15.200*

Witness     192.168.15.9         UP     N/A      192.168.15.200



Allowed node host list:

192.168.15.9 192.168.15.6 192.168.15.8



Membership coordinator: 192.168.15.9



Standby priority host list:

192.168.15.6



Promote Status:



DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info

---------------------------------------------------------------------------

Master      192.168.15.8                            0/A0001E0          

Standby     192.168.15.6         null               0/A000098          



One or more standby databases are not in sync with the master database. The following node(s) returned a WAL LSN that does not match the master: 192.168.15.6


Resyncing the Master Database and EFM

We now have a DBA task to bring the Master and EFM back into sync, but that’s to be expected.

So let’s do that and take everything back to the way it was before.

# ./efm cluster-status efm

Cluster Status: efm



Agent Type  Address              Agent  DB       VIP

-----------------------------------------------------------------------

Master      192.168.15.6         UP     UP       192.168.15.200*

Standby     192.168.15.8         UP     UP       192.168.15.200

Witness     192.168.15.9         UP     N/A      192.168.15.200



Allowed node host list:

192.168.15.9 192.168.15.6 192.168.15.8



Membership coordinator: 192.168.15.9



Standby priority host list:

192.168.15.8



Promote Status:



DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info

---------------------------------------------------------------------------

Master      192.168.15.6                            0/11000060         

Standby     192.168.15.8         0/11000060         0/11000060         



Standby database(s) in sync with master. It is safe to promote.

 

Finally, what happens if there is a big transaction in flight, and the Master database goes down?

Back to pgbench. I’m dropping the “test” table and initialising the pgbench run, all from PgBouncer.

pgb=# drop table test;

DROP TABLE

pgb=# exit

$ /usr/pgsql-11/bin/pgbench --initialize --scale=10 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

dropping old tables...

creating tables...

generating data...

100000 of 1000000 tuples (10%) done (elapsed 0.02 s, remaining 0.14 s)

200000 of 1000000 tuples (20%) done (elapsed 0.04 s, remaining 0.18 s)

...

900000 of 1000000 tuples (90%) done (elapsed 1.69 s, remaining 0.19 s)

1000000 of 1000000 tuples (100%) done (elapsed 2.30 s, remaining 0.00 s)

vacuuming...

creating primary keys...

done.

 

So, a baseline test with ten users, each with a fresh connection per transaction, on this dataset:

$ /usr/pgsql-11/bin/pgbench --client=10 --connect -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 10

query mode: simple

number of clients: 10

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 100/100

latency average = 43.163 ms

tps = 231.679552 (including connections establishing)

tps = 234.195544 (excluding connections establishing)

 

Increasing Client Connections

But what if we increase the number of clients beyond the PgBouncer maximum connections? Well, we get a lot of these:

...

ERROR:  no more connections allowed (max_client_conn)

client 997 aborted while establishing connection

connection to database "pgb" failed:

ERROR:  no more connections allowed (max_client_conn)

client 998 aborted while establishing connection

connection to database "pgb" failed:

ERROR:  no more connections allowed (max_client_conn)

client 999 aborted while establishing connection

transaction type: <builtin: TPC-B (sort of)>

...

number of transactions actually processed: 1000/10000

latency average = 2801.775 ms

tps = 356.916542 (including connections establishing)

tps = 356.993342 (excluding connections establishing)

 

What if we increase the PgBouncer maximum connections to 1000, while leaving the target database at 100?

;; Total number of clients that can connect

max_client_conn = 1000

 

The answer is, we hit a timeout. And get a lot of:

...

client 990 aborted in command 4 (SQL) of script 0; ERROR:  query_wait_timeout



client 991 aborted in command 4 (SQL) of script 0; ERROR:  query_wait_timeout



client 992 aborted in command 4 (SQL) of script 0; ERROR:  query_wait_timeout



client 993 aborted in command 4 (SQL) of script 0; ERROR:  query_wait_timeout



client 994 aborted in command 4 (SQL) of script 0; ERROR:  query_wait_timeout



transaction type: <builtin: TPC-B (sort of)>

scaling factor: 10

query mode: simple

number of clients: 1000

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 250/10000

latency average = 484472.060 ms

tps = 2.064103 (including connections establishing)

tps = 2.066158 (excluding connections establishing)

 

So, while simply increasing the number of attached clients allows all of the clients to connect, PgBouncer still times out most of that connection activity, as there are simply too many clients. This behaviour can of course be controlled using the PgBouncer timeout configuration parameters. But, this may cause even more problems than it solves if the additional load is anything other than a brief spike in connections. 

PgBouncer can, however, handle dialling down the number of connections to something it can process internally, while still handling more connections than the database itself will accept, like 200:

$ /usr/pgsql-11/bin/pgbench --client=200 --connect -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 10

query mode: simple

number of clients: 200

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 2000/2000

latency average = 452.066 ms

tps = 442.412851 (including connections establishing)

tps = 442.971877 (excluding connections establishing)

 

And, not only are we handling more clients (double!) than the database can accept at any given time, the throughput has almost doubled too.

 

Failover in the Middle of Job Processing

Let’s run a big batch job now, and fail over to a standby in the middle of it. A hundred million rows should do.

$ /usr/pgsql-11/bin/pgbench --initialize --scale=1000 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

dropping old tables...

creating tables...

generating data...

100000 of 100000000 tuples (0%) done (elapsed 0.01 s, remaining 14.52 s)

200000 of 100000000 tuples (0%) done (elapsed 0.05 s, remaining 26.93 s)

...

99900000 of 100000000 tuples (99%) done (elapsed 400.55 s, remaining 0.40 s)

100000000 of 100000000 tuples (100%) done (elapsed 400.97 s, remaining 0.00 s)

vacuuming...

creating primary keys...

done.

 

I’m going to make this more realistic by only using one client connection as if it were a “real” batch process.

Without a failover, running 10,000 transactions should take about 90 seconds:

$ /usr/pgsql-11/bin/pgbench --client=1 --transactions=10000 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 1000

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10000

number of transactions actually processed: 10000/10000

latency average = 14.169 ms

tps = 70.576774 (including connections establishing)

tps = 70.577073 (excluding connections establishing)

 

And, taking the Master database down in the middle of processing:

$ /usr/pgsql-11/bin/pgbench --client=1 --transactions=10000 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

FATAL:  terminating connection due to administrator command

ERROR:  server conn crashed?

client 0 aborted in command 4 (SQL) of script 0; perhaps the backend died while processing

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 1000

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10000

number of transactions actually processed: 356/10000

latency average = 14.251 ms

tps = 70.168951 (including connections establishing)

tps = 70.173780 (excluding connections establishing)

 

Not so good. So if the server dies during a batch job, all is lost. Which is, after all, the point of transaction management. That's the A for Atomicity in ACID…or, maybe not. Remember, we have our pooling mode set to transaction. What happens if we change it to a pooling mode of statement?

$ /usr/pgsql-11/bin/pgbench --client=1 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

client 0 aborted in command 4 (SQL) of script 0; ERROR:  transaction blocks not allowed in statement pooling mode



transaction type: <builtin: TPC-B (sort of)>

scaling factor: 1000

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 0/10

 

We can’t use transactions in statement mode. However, if we run the built-in SELECT only script…

$ /usr/pgsql-11/bin/pgbench --client=1 --select-only -U pgb -h 127.0.0.1 -p 6543 pgb 

Password: 

starting vacuum...end.

transaction type: <builtin: select only>

scaling factor: 1000

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 10/10

latency average = 4.289 ms

tps = 233.178013 (including connections establishing)

tps = 235.642136 (excluding connections establishing)

 

No problem. And a decent throughput.

Removing the transactions, then, by using a custom script without them:

$ cat no_txn.sql

\set aid random(1, 100000 * :scale)

\set bid random(1, 1 * :scale)

\set tid random(1, 10 * :scale)

\set delta random(-5000, 5000)

UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;

SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;

INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);

 

This now works:

$ /usr/pgsql-11/bin/pgbench --client=1 -f no_txn.sql -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

transaction type: no_txn.sql

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10

number of transactions actually processed: 10/10

latency average = 34.313 ms

tps = 29.143190 (including connections establishing)

tps = 29.173952 (excluding connections establishing)

 

Putting all that together, scaling it up, and pulling the database out and failing over in the middle of processing:

$ /usr/pgsql-11/bin/pgbench --client=1 --transactions=10000 -f no_txn.sql -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

client 0 aborted in command 8 (SQL) of script 0; FATAL:  terminating connection due to administrator command



transaction type: no_txn.sql

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

number of transactions per client: 10000

number of transactions actually processed: 122/10000

latency average = 32.393 ms

tps = 30.871287 (including connections establishing)

tps = 30.873907 (excluding connections establishing)

 

Not only did this not succeed overall, some, but not all of the statements did succeed individually! We obviously want to avoid this at all costs, so don’t do it!

 

What Did We Learn?

In summary:

  • PgBouncer connections can survive failover between databases when pointing to the VIP 
  • PgBouncer provides connection pooling and more efficient processing during high periods of rapidly reconnecting sessions 
  • Levels of reconnection that are too high will cause PgBouncer to error out and cause processing errors 
  • PgBouncer connections work best in transaction pool mode during failover to preserve atomicity 
  • PgBouncer in transaction mode does not work effectively for long-running batch jobs during failover
  • PgBouncer in session mode can fail over in batch jobs that do not contain transactions but will not process all commands and should not be used 

In the next part of this series, we’ll evaluate whether PgBouncer can survive Transient Network Events.

 

Share this