For the last several months, I have been spending a large percentage of my time trying to bring parallelism to PostgreSQL. Previous blog posts on the future direction of PostgreSQL development have often mentioned this as a priority, although the top spot has usually been reserved for materialized views, a feature which now exists in PostgreSQL 9.3 and which has been improved in PostgreSQL 9.4. My colleague at EnterpriseDB, Kevin Grittner, is continuing to work on further improvements in that area. But my focus is on parallelism. So, how's that going?
So far, the work I've been doing has been focused on creating two major pieces of infrastructure.
1. Dynamic Background Workers. Thanks to some good work by Álvaro Herrera, PostgreSQL 9.3 features the ability to register a background process at the time the server starts. These are basically intended as daemon processes running in the background, unconnected to user sessions; and the intention is that they start up when the server starts up, run for the life time of the server, and then shut down when the server shuts down. For parallel query, we need the same sort of process - something which runs in the background and doesn't connect to a client - but with a shorter lifespan. As it turns out, the infrastructure Álvaro built was relatively easy to adapt to this purpose, and I did, so now we can do that.
2. Dynamic Shared Memory. PostgreSQL uses a process model rather than a thread model; therefore, by default, all data is unshared. To share data between backends, we allocate a fixed-size chunk of shared memory at the time the server is started, and all backends over the lifetime of the server map that chunk of shared memory into their address space. The fixed-size nature of this chunk of shared memory has occasionally been a headache over the years, but parallel query really brings the problem into focus. PostgreSQL 9.4 will be able to perform much larger in-memory sorts than any previous release, so let's suppose that we're sorting a terabyte of data in memory. If we're trying to do that in parallel, we're clearly going to need all of the backends involved to be able to access all of the data involved; doing something like shipping the data to be sorted around via pipes would be hideously inefficient. Equally clearly, we can't get the terabyte of shared memory we need for this purpose from the main shared memory segment: there's not going to be an extra terabyte of space there just lying around waiting to be allocated. We need to get that memory from the operating system, and we need to release it when we're done. To solve this problem, PostgreSQL 9.4 now has the ability to allocate additional shared memory segments after server startup.
Although both of these facilities are committed, I'm still sorting out loose ends. As it turns out, it's not quite good enough to be able to launch a background worker on the fly: you also need to be able to find out whether it's still running, and you need to be able to kill it if you decide to abort the parallel computation. And, while dynamic allocation of shared memory segments is cool, the infrastructure committed so far simply provides one big chunk of undifferentiated bytes. In order to really use shared memory to do interesting things, we'll need data structures that can live within a dynamic shared memory segment with rich APIs that make it easy to code parallel algorithms. More on that in a future blog post.