Connection pooling is a big deal in Postgres. Connections to the database can take a relatively large amount of overhead, especially if there are a lot of them, and here at EDB Professional Services we often recommend an open-source tool called PgBouncer to reuse and buffer connections between the database and application.
Assuming you are using a flavor of *NIX, it may be worth noting that each new connection that is made uses the UNIX fork, and effectively copies the edb-postmaster process. This alone can take up around 1.3MB. That child process manages the individual connection until it is terminated, meaning that fast-paced connections and disconnections copy nearly 2MB per new connection very regularly, and are then quickly torn down again to release the memory for reuse. This is one reason why there is a good-sized performance overhead for new connections.
However, although (in typical *NIX fashion) PgBouncer is a lightweight tool and it does its one job very well it can be limited to doing that one job well but not doing much else. This is in stark contrast to other tools provided by the community and EDB such as pgpool-ii, which can in addition to pooling also handle replication and load balancing, but are heavier and can be less efficient - both overall and for pooling in particular.
Recently, while hosting a training session, I was asked some very insightful questions, which I decided to pursue further. In essence, having PgBouncer external to a database (or databases), can PgBouncer handle persistent connections during events like network outages, database restarts, and even automated failover or controlled switchover between entirely different machines?
So, making the minimal configuration changes possible, how would PgBouncer handle something as drastic as loss of connection to a database - and even a failover event - while still providing persistent connections (and as a bonus, still provide pooling on reconnection) when the connection to the database comes back up again?
As these questions address a number of points I’ve decided to break up how to work with PgBouncer in several blogs. First, let’s take a look at installing, configuring and testing PgBouncer.
So, first, we install PgBouncer on an external server (or it, or they, would be ideally suited to some kind of a container) and for convenience, it is good to have something running while we bounce or failover our server. For testing purposes, I’m going to install psql for some basic tests, and PgBench for more advanced tests later - both are bundled in the standard PostgreSQL client.
You can compile PgBouncer from source, or you can get it from the PGDG standard repository for your distribution. I’m using CentOS 7, so I’ll install the repository .rpm first, and then PgBouncer. You’ll see I’m also going to install pgbconsole (a kind of resource monitor for PgBouncer) which will let me monitor the internals of PgBouncer in real-time.
$ sudo yum install ./pgdg-redhat-repo-latest.noarch.rpm
...
---> Package pgdg-redhat-repo.noarch 0:42.0-4 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
...
Installed:
pgdg-redhat-repo.noarch 0:42.0-4
...
Complete!
$ sudo yum install pgbouncer pgbconsole
...
--> Finished Dependency Resolution
Dependencies Resolved
...
Installing:
pgbconsole x86_64 0.1.1-1.rhel7
pgdg10 26 k
pgbouncer x86_64 1.11.0-1.rhel7
pgdg10 206 k
...
Transaction Summary
Install 2 Packages (+3 Dependent packages)
...
Installed:
pgbconsole.x86_64 0:0.1.1-1.rhel7
pgbouncer.x86_64 0:1.11.0-1.rhel7
Dependency Installed:
...
Complete!
And then, to get psql and PgBench, I install the postgresql11 client packages:
$ sudo yum install postgresql11
...
--> Finished Dependency Resolution
...
Installed:
postgresql11.x86_64 0:11.5-1PGDG.rhel7
...
Complete!
And, just to make sure that we now have pgbench…
$ find /usr -name pgbench
/usr/pgsql-11/bin/pgbench
So, now we have all our software, we can spin up a new VM with our target database and connect PgBouncer to it.
Once you have PgBouncer installed, and a target database set up and started (don’t forget to configure pg_hba.conf to allow PgBouncer to connect, and open a port in the firewall to allow network traffic between the two boxes) we can then configure PgBouncer.
Note: On my target database I set up the “pgb” database and user like so:
psql.bin (11.5.12)
Type "help" for help.
edb=# create user pgb with superuser login password 'xxx';
CREATE ROLE
edb=# create database pgb owner pgb;
CREATE DATABASE
Then edit the pgbouncer configuration file:
$ nano /etc/pgbouncer/pgbouncer.ini
...And update:
[databases]
pgb = dbname=pgb host=192.168.15.6 port=5432
Enter your own access details as appropriate there.
To allow md5 authentication, add the following to the pgbouncer.ini and then enter your username and password into the auth_file.
;;;
;;; Authentication settings
;;;
;; any, trust, plain, md5, cert, hba, pam
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
Like so:
# cat /etc/pgbouncer/userlist.txt
"pgb" "xxx"
And then we can test connectivity from PgBouncer, which is shown configured as listening on port 6543 on the loopback IP address:
$ psql -U pgb -h 127.0.0.1 -p 6543 pgb
Password for user pgb:
psql (11.5, server 11.5.12)
Type "help" for help.
pgb=# create table test(x numeric);
CREATE TABLE
pgb=# insert into test values (1),(2);
INSERT 0 2
pgb=# select * from test;
x
---
1
2
(2 rows)
pgb=# drop table test;
DROP TABLE
Success!!
Now, let’s see if that connection survives a brief network error by default…
$ psql -U pgb -h 127.0.0.1 -p 6543 pgb
Password for user pgb:
Type "help" for help.
pgb=# create table test(x numeric);
CREATE TABLE
pgb=# insert into test values (1),(2);
INSERT 0 2
On the database box, I’ll bounce the network port:
# ifdown eth0 && ifup eth0
Then, back in PgBouncer:
pgb=# insert into test values (3),(4);
INSERT 0 2
pgb=# select * from test;
x
---
1
2
3
4
(4 rows)
You can use standard community edition Postgresql, or try out EDB Postgres Advanced Server (EPAS) as I have , as I used another EDB product (EDB Failover Manager™ - EFM) to manage and display output from controlled switchover and automated failover later. EFM works just fine with community edition Postgresql, or feel free to manage downtime however you normally would.
And then follow the EPAS installation guide, or use community edition as you prefer.
In the next blog we will look at the scenario if you have downtime and need to reset your PgBouncer connection.