Step 7 - Configure proxies v5

Configure proxies

PGD can use proxies to direct traffic to one of the cluster's nodes, selected automatically by the cluster. There are performance and availability reasons for using a proxy:

  • Performance: By directing all traffic (in particular, write traffic) to one node, the node can resolve write conflicts locally and more efficiently.
  • Availability: When a node is taken down for maintenance or goes offline for other reasons, the proxy can direct new traffic to a new write leader that it selects.

It's best practice to configure PGD Proxy for clusters to enable this behavior.

Configure the cluster for proxies

To set up a proxy, you need to first prepare the cluster and subgroup the proxies will be working with by:

  • Logging in and setting the enable_raft and enable_proxy_routing node group options to true for the subgroup. Use bdr.alter_node_group_option, passing the subgroup name, option name, and new value as parameters.
  • Create as many uniquely named proxies as you plan to deploy using bdr.create_proxy and passing the new proxy name and the subgroup to attach it to. The bdr.create_proxy does not create a proxy, but creates a space for a proxy to register itself with the cluster. The space contains configuration values which can be modified later. Initially it is configured with default proxy options such as setting the listen_address to 0.0.0.0.
  • Configure proxy routes to each node by setting route_dsn for each node in the subgroup. The route_dsn is the connection string that the proxy should use to connect to that node. Use bdr.alter_node_option to set the route_dsn for each node in the subgroup.
  • Create a pgdproxy user on the cluster with a password or other authentication.

Configure each host as a proxy

Once the cluster is ready, you need to configure each host to run pgd-proxy:

  • Create a pgdproxy local user.
  • Create a .pgpass file for that user that allows the user to log into the cluster as pgdproxy.
  • Modify the systemd service file for pgdproxy to use the pgdproxy user.
  • Create a proxy config file for the host that lists the connection strings for all the nodes in the subgroup, specifies the name for the proxy to use when fetching proxy options like listen_address and listen_port.
  • Install that file as /etc/edb/pgd-proxy/pgd-proxy-config.yml.
  • Restart the systemd service and check its status.
  • Log in to the proxy and verify its operation.

Further detail on all these steps is included in the worked example.

Worked example

Preparing for proxies

For proxies to function, the dc1 subgroup must enable Raft and routing.

Log in to any node in the cluster, using psql to connect to the bdrdb database as the enterprisedb user. Execute:

SELECT bdr.alter_node_group_option('dc1', 'enable_raft', 'true');
SELECT bdr.alter_node_group_option('dc1', 'enable_proxy_routing', 'true');

You can use the bdr.node_group_summary view to check the status of options previously set with bdr.alter_node_group_option():

SELECT node_group_name, enable_proxy_routing, enable_raft
            FROM bdr.node_group_summary
           WHERE parent_group_name IS NOT NULL;
Output
 node_group_name | enable_proxy_routing | enable_raft
-----------------+----------------------+-------------
 dc1             | t                    | t
(1 row)

bdrdb=#

Next, create a PGD proxy within the cluster using the bdr.create_proxy function. This function takes two parameters: the proxy's unique name and the group you want it to be a proxy for.

In this example, you want a proxy on each host in the dc1 subgroup:

SELECT bdr.create_proxy('pgd-proxy-one','dc1');
SELECT bdr.create_proxy('pgd-proxy-two','dc1');
SELECT bdr.create_proxy('pgd-proxy-three','dc1');

You can use the bdr.proxy_config_summary view to check that the proxies were created:

SELECT proxy_name, node_group_name
             FROM bdr.proxy_config_summary;
Output
   proxy_name    | node_group_name
-----------------+-----------------
 pgd-proxy-one   | dc1
 pgd-proxy-two   | dc1
 pgd-proxy-three | dc1

 bdrdb=#

Create a pgdproxy user on the database

Create a user named pgdproxy and give it a password. This example uses proxysecret.

On any node, log into the bdrdb database as enterprisedb/postgres.

CREATE USER pgdproxy PASSWORD 'proxysecret';
GRANT bdr_superuser TO pgdproxy;

Configure proxy routes to each node

Once a proxy has connected, it gets its dsn values (connection strings) from the cluster. The cluster needs to know the connection details that a proxy should use for each node in the subgroup. This is done by setting the route_dsn option for each node to a connection string that the proxy can use to connect to that node.

Please note that when a proxy starts, it gets the initial dsn from the proxy's config file. The route_dsn value set in this step and in config file should match.

On any node, log into the bdrdb database as enterprisedb/postgres.

SELECT bdr.alter_node_option('node-one', 'route_dsn', 'host=host-one dbname=bdrdb port=5444 user=pgdproxy');
SELECT bdr.alter_node_option('node-two', 'route_dsn', 'host=host-two dbname=bdrdb port=5444 user=pgdproxy');
SELECT bdr.alter_node_option('node-three', 'route_dsn', 'host=host-three dbname=bdrdb port=5444 user=pgdproxy');

Note that the endpoints in this example specify port=5444. This is necessary for EDB Postgres Advanced Server instances. For EDB Postgres Extended and community PostgreSQL, you can omit this.

Create a pgdproxy user on each host

sudo adduser pgdproxy

This user needs credentials to connect to the server. Create a .pgpass file with the proxysecret password in it. Then lock down the .pgpass file so it's accessible only by its owner.

echo -e "*:*:*:pgdproxy:proxysecret" | sudo tee /home/pgdproxy/.pgpass
sudo chown pgdproxy /home/pgdproxy/.pgpass
sudo chmod 0600 /home/pgdproxy/.pgpass

Configure the systemd service on each host

Switch the service file from using root to using the pgdproxy user.

sudo sed -i s/root/pgdproxy/ /usr/lib/systemd/system/pgd-proxy.service

Reload the systemd daemon.

sudo systemctl daemon-reload

Create a proxy config file for each host

The proxy configuration file is slightly different for each host. It's a YAML file that contains a cluster object. The cluster object has three properties:

  • The name of the PGD cluster's top-level group (as name)
  • An array of endpoints of databases (as endpoints)
  • The proxy definition object with a name and endpoint (as proxy)

The first two properties are the same for all hosts:

cluster:
  name: pgd
  endpoints:
    - "host=host-one dbname=bdrdb port=5444 user=pgdproxy"
    - "host=host-two dbname=bdrdb port=5444 user=pgdproxy"
    - "host=host-three dbname=bdrdb port=5444 user=pgdproxy"

Remember that host-one, host-two, and host-three are the systems on which the cluster nodes (node-one, node-two, node-three) are running. You use the name of the host, not the node, for the endpoint connection.

Again, note that the endpoints in this example specify port=5444. This is necessary for EDB Postgres Advanced Server instances. For EDB Postgres Extended and community PostgreSQL, you can set this to port=5432.

The third property, proxy, has a name property. The name property is a name created with bdr.create_proxy earlier, and it's different on each host. A proxy can't be on the same port as the Postgres server and, ideally, should be on a commonly used port different from direct connections, even when no Postgres server is running on the host. Typically, you use port 6432 for PGD proxies.

  proxy:
    name: pgd-proxy-one

In this case, using localhost in the endpoint specifies that this proxy will listen on the host where the proxy is running.

Install a PGD proxy configuration on each host

For each host, create the /etc/edb/pgd-proxy directory:

sudo mkdir -p /etc/edb/pgd-proxy

Then, on each host, write the appropriate configuration to the pgd-proxy-config.yml file in the /etc/edb/pgd-proxy directory.

For this example, you can run this on host-one to create the file:

cat <<EOF | sudo tee /etc/edb/pgd-proxy/pgd-proxy-config.yml
cluster:
  name: pgd
  endpoints:
    - "host=host-one dbname=bdrdb port=5444 user=pgdproxy"
    - "host=host-two dbname=bdrdb port=5444 user=pgdproxy"
    - "host=host-three dbname=bdrdb port=5444 user=pgdproxy"
  proxy:
    name: pgd-proxy-one
EOF

Restart the service

On each host where the proxy is being installed, restart the pgd-proxy service.

sudo systemctl restart pgd-proxy

Confirm it's running correctly:

sudo systemctl status pgd-proxy

When running, it shows Active: (running) in the opening details.

Test the proxy

At this point, connecting to the PGD Proxy port on any host in the cluster results in the connection being routed to the current write lead node.

For example, and assuming you've installed the proxy on all three hosts, then connecting to the proxy on host-three results in the connection being routed to node-one.

This example passes connection details, using the -d flag of psql with the hostname for the proxy you just configured and the proxy port number:

sudo -iu enterprisedb psql -d "host=host-three dbname=bdrdb port=6432"
Output
psql (16.1.0, server 16.1.0)
Type "help" for help.

bdrdb=#

Once connected to the proxy, you can query the server to find out which node the proxy connected you to:

SELECT node_name FROM bdr.local_node_summary;
Output
 node_name
-----------
 node-one
(1 row)
bdrdb#

You should have connected to the current write leader of the subgroup. You can confirm that by querying which node is the write leader for the subgroup you're connected to:

SELECT node_group_name, write_lead FROM bdr.node_group_routing_summary;
Output
 node_group_name | write_lead
-----------------+------------
 dc1             | node-one
(1 row)

bdrdb=#

And the write_lead is node-one, too, so you confirm you're being proxy-connected to the write leader.