Introduction
Evaluating large-scale, multi-system database servers starts with choosing the right tools to validate the performance characteristics of these systems.
HammerDB's TPROC-C workload has been widely used at EDB, in work like Vik Fearing's daily performance testing and Tom Rieger's characterization of cloud-based services. HammerDB has served this work well, but as we test systems with hundreds of gigabytes of data or more, or more complex multi-leader systems, we continue to evaluate tools that help us battle-test these scenarios.
In that regard, we are always looking for tools that can create multi-terabyte databases in a reasonable amount of time and drive complex system architectures. We will compare HammerDB in these two areas against DBT-2, another freely available TPC-C-derived workload that we are considering for our toolset.
Let us evaluate these two aspects using TPC-C-derived workloads: how long it takes to build a large database, and whether we can drive a multi-system database server. These two areas matter because even a small percentage difference can translate into hours saved when generating terabytes of data, and driving a multi-system database server correctly is essential for meaningful results.
Loading the database
We will arbitrarily start with a small scale factor of 1,000 warehouses, which generates about 75 GB of raw text data. Both test kits can parallelize this work and take advantage of all of the system's available processor threads.
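That works out to roughly 75 MB of raw text per warehouse, which makes it easy to estimate the scale factor needed for a target database size. Here is a back-of-the-envelope sketch in Python, using only the measurement above:

    import math

    # Measured: ~75 GB of raw text for 1,000 warehouses.
    GB_PER_WAREHOUSE = 75 / 1000

    def warehouses_for(target_gb):
        """Scale factor (warehouses) needed for a target raw data size."""
        return math.ceil(target_gb / GB_PER_WAREHOUSE)

    print(warehouses_for(1000))  # ~1 TB of raw text -> ~13,334 warehouses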
Each test kit generates data remotely and streams it into the database system. For ease of testing, the measured results include only the table data loading.
Our test setup consists of two systems: one to generate data and another running PostgreSQL. The data generation system has an Intel(R) Xeon(R) Platinum 8259CL with two cores and two threads per core (four threads total). The database server has an Intel(R) Xeon(R) Platinum 8259CL with four cores and two threads per core (eight threads total).
HammerDB takes about 300 minutes to create the database. DBT-2 takes about 20 minutes, an enormous time savings. Extrapolating those numbers to a database with one TB of data, HammerDB would take about 67 hours and DBT-2 about five hours.
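That extrapolation is simple linear scaling from the measured 75 GB runs; a sketch of the arithmetic (actual times will of course vary with hardware and configuration):

    # Linear extrapolation of load time from the measured 75 GB runs.
    MEASURED_GB = 75
    measured_minutes = {"HammerDB": 300, "DBT-2": 20}

    target_gb = 1000  # ~1 TB
    for kit, minutes in measured_minutes.items():
        est_hours = minutes * (target_gb / MEASURED_GB) / 60
        print(f"{kit}: ~{est_hours:.0f} hours for {target_gb} GB")
    # Prints ~67 hours for HammerDB and ~4 hours (call it about five) for DBT-2.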
Driving multi-system database servers
A multi-system database server can have any number of nodes and may have partitioned data. In one scenario, each database node is partitioned such that a specific range of warehouses (using TPC-C terminology) is intended to be serviced by that node. A client (again using TPC-C terminology), which is essentially a database transaction manager, is then set up for each database node. The emulated terminals, representing individual users working in a specific warehouse, connect to the client that services that terminal's warehouse. A minimal sketch of this routing appears below.
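To make the topology concrete, here is an illustrative Python sketch of that routing logic; the node count, warehouse count, and function name are ours for illustration, not part of either test kit:

    # Illustrative only: partition warehouses evenly across database nodes
    # and route each emulated terminal to the client for its warehouse.
    NODES = 4           # number of database nodes (hypothetical)
    WAREHOUSES = 1000   # total warehouses in the database
    PER_NODE = WAREHOUSES // NODES

    def client_for(warehouse_id):
        """Index of the client (one per node) servicing this warehouse."""
        return min((warehouse_id - 1) // PER_NODE, NODES - 1)

    assert client_for(1) == 0      # warehouses 1-250  -> client 0
    assert client_for(251) == 1    # warehouses 251-500 -> client 1
    assert client_for(1000) == 3   # warehouses 751-1000 -> client 3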
HammerDB offers limited control over how terminals are emulated. Its virtual users (HammerDB's term for the emulated terminals) can either randomly select a warehouse for the duration of a test or randomly select a warehouse per transaction. In either case, the selection cannot be restricted to anything narrower than the entire range of available warehouses.
DBT-2, by contrast, can emulate terminals for a given range of warehouses.
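The difference comes down to the range each terminal may draw its warehouse from. A hypothetical Python sketch of the two behaviors (these helpers are ours, not either kit's API):

    import random

    # HammerDB-style: a virtual user draws from the entire warehouse range,
    # either once for the whole test or once per transaction.
    def hammerdb_warehouse(total_warehouses):
        return random.randint(1, total_warehouses)

    # DBT-2-style: a terminal is pinned to a configured sub-range, so it can
    # be restricted to the warehouses its database node actually serves.
    def dbt2_warehouse(w_min, w_max):
        return random.randint(w_min, w_max)

    # Terminals for node 1 of our 4-node example only touch warehouses 251-500.
    print(dbt2_warehouse(251, 500))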
Conclusion
The time to create the database involves several other factors, such as the cost of acquiring the resources to run tests within a reasonable amount of time. If we oversimplify for the moment by ignoring the costs around the database system and assume that the limiting factor is purely the performance of the data generation system, we can extrapolate further. Say we have a system with 128 processor threads, so that data generation is theoretically 32 times faster than what we initially measured. With 50 TB of data, HammerDB could load a database in about 105 hours, or roughly 4.5 days. Similarly, DBT-2 could load the same database in about 8 hours.
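A sketch of that extrapolation, under the stated (oversimplified) assumption that load time scales linearly with data size and inversely with the data generation system's thread count:

    # Oversimplified model: time scales linearly with data size and
    # inversely with the number of data-generation threads.
    measured = {"HammerDB": 300, "DBT-2": 20}  # minutes for 75 GB on 4 threads
    MEASURED_GB, MEASURED_THREADS = 75, 4

    target_gb, threads = 50_000, 128  # 50 TB on a 128-thread generator
    speedup = threads / MEASURED_THREADS  # 32x
    for kit, minutes in measured.items():
        est_hours = minutes * (target_gb / MEASURED_GB) / speedup / 60
        print(f"{kit}: ~{est_hours:.0f} hours")
    # HammerDB: ~104 hours (~4.5 days); DBT-2: ~7 hours (call it about 8).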
The need for finer-grained control over the emulated terminals certainly depends on what is being tested. Data locality, cache coherency, database lock contention, and other effects can be exacerbated in a multi-system database environment. How the workload is driven can alleviate or intensify some of these effects, depending on what and how we are testing.
So which tool is right for you, given the resources available and the complexity of the system being evaluated?
Read more: Benchmarking PostgreSQL: Improving Performance Stability