Limitations v5.6
Take these EDB Postgres Distributed (PGD) design limitations into account when planning your deployment.
Nodes
PGD can run hundreds of nodes, assuming adequate hardware and network. However, for mesh-based deployments, we generally don’t recommend running more than 48 nodes in one cluster. If you need extra read scalability beyond the 48-node limit, you can add subscriber-only nodes without adding connections to the mesh network.
The minimum recommended number of nodes in a group is three to provide fault tolerance for PGD's consensus mechanism. With just two nodes, consensus would fail if one of the nodes were unresponsive. Consensus is required for some PGD operations, such as distributed sequence generation. For more information about the consensus mechanism used by EDB Postgres Distributed, see Architectural details.
Multiple databases on single instances
Support for using PGD for multiple databases on the same Postgres instance is deprecated beginning with PGD 5 and will no longer be supported with PGD 6. As we extend the capabilities of the product, the added complexity introduced operationally and functionally is no longer viable in a multi-database design.
It's best practice and we recommend that you configure only one database per PGD instance.
The deployment automation with TPA and the tooling such as the CLI and PGD Proxy already codify that recommendation.
While it's still possible to host up to 10 databases in a single instance, doing so incurs many immediate risks and current limitations:
If PGD configuration changes are needed, you must execute administrative commands for each database. Doing so increases the risk for potential inconsistencies and errors.
You must monitor each database separately, adding overhead.
TPAexec assumes one database. Additional coding is needed by customers or by the EDB Professional Services team in a post-deploy hook to set up replication for more databases.
PGD Proxy works at the Postgres instance level, not at the database level, meaning the leader node is the same for all databases.
Each additional database increases the resource requirements on the server. Each one needs its own set of worker processes maintaining replication, for example, logical workers, WAL senders, and WAL receivers. Each one also needs its own set of connections to other instances in the replication cluster. These needs might severely impact performance of all databases.
Synchronous replication methods, for example, CAMO and Group Commit, won’t work as expected. Since the Postgres WAL is shared between the databases, a synchronous commit confirmation can come from any database, not necessarily in the right order of commits.
CLI and OTEL integration (new with v5) assumes one database.
Durability options (Group Commit/CAMO)
There are various limits on how the PGD durability options work. These limitations are a product of the interactions between Group Commit and CAMO, and how they interact with PGD features such as the WAL decoder and transaction streaming.
Also, there are limitations on interoperability with legacy synchronous replication, interoperability with explicit two-phase commit, and unsupported combinations within commit scope rules.
See Durability limitations for a full and current listing.
Mixed PGD versions
PGD was developed to enable rolling upgrades of PGD by allowing mixed versions of PGD to operate during the upgrade process. We expect users to run mixed versions only during upgrades and, once an upgrade starts, that they complete that upgrade. We don't support running mixed versions of PGD except during an upgrade.
Other limitations
This noncomprehensive list includes other limitations that are expected and are by design. We don't expect to resolve them in the future. Consider these limitations when planning your deployment:
- A
galloc
sequence might skip some chunks if you create the sequence in a rolled back transaction and then create it again with the same name. Skipping chunks can also occur if you create and drop the sequence when DDL replication isn't active and then you create it again when DDL replication is active. The impact of the problem is mild because the sequence guarantees aren't violated. The sequence skips only some initial chunks. Also, as a workaround, you can specify the starting value for the sequence as an argument to thebdr.alter_sequence_set_kind()
function.