Tuning shared_buffers and wal_buffers

March 15, 2012
I spend a lot of time answering questions about PostgreSQL, and one of the questions I get asked frequently is: how should I set shared_buffers?  And, a bit less often, how should I set wal_buffers?  I've got canned answers that I can rattle off so fast it'll make your head spin.  Exceptions to my canned answers keep popping up, and it's starting to get hard to give an answer that actually captures all the complexity in this area, so here's a longer explanation.

For shared_buffers, the quick answer is to allocate about 25% of system memory to shared_buffers, as recommended by the official documentation and by the wiki article on Tuning Your PostgreSQL server, but not more than about 8GB on Linux or 512MB on Windows, and sometimes less.  However, I've recently become aware of a number of cases which suggest that higher values may perform much better.  PostgreSQL uses the operating system's memory for caching, but, as recent performance results have shown, there can be a huge performance benefit to having the entire database in shared_buffers as opposed to merely having it in RAM.  Several EnterpriseDB customers or potential customers have reported excellent results from cranking up shared_buffers to very large values, even giving PostgreSQL the lion's share of system memory for its own use.  This flies in the face of some conventional wisdom that PostgreSQL doesn't handle large shared_buffers settings well, but enough people have reported good results that it seems worth taking this strategy seriously.

Of course, the conventional wisdom is not without its reasons.  One of the disadvantages of setting shared_buffers very large is that the memory is nailed down for page caching, and can't be used for other purposes, such as temporary memory for sorts, hashing, and materialization (work_mem) or vacuuming and index builds (maintenance_work_mem).  If your can't comfortably fit the working set inside shared_buffers without leaving a generous amount of memory left over for other purposes, then this strategy is not for you.  Also, as I previously blogged about, if you happen to be running a 32-bit build of PostgreSQL, then you'll find that it's dangerous to set shared_buffers to more than 2-2.5GB, because other memory requests from individual backends may cause them to run out of virtual address space and die.

And, if you can't fit the entire working set inside shared_buffers, then there are a number of reasons to keep it relatively small, in line with the formula above.  If the working set is larger than shared_buffers, most buffer accesses will miss the database buffer cache and fault a page in from the OS; clearly, it makes no sense to allocate a large amount of memory to a cache with a low hit rate.  Furthermore, as dirty pages get pushed out of PostgreSQL's cache, they'll enter the operating system cache.  If that cache is relatively large, the OS has more freedom to reorder writes and thus schedule the I/O more efficiently.   This is especially important around checkpoint time, when the amount of additional dirty data that is pushed out into the OS cache figures to increase with shared_buffers; thus, a higher value for shared_buffers can contribute to latency spikes around checkpoint time.

In tuning wal_buffers, it's helpful to understand how these buffers are used.  PostgreSQL backend processes initially write their write-ahead log records into these buffers, and then the buffers are flushed to disk.  Once the contents of any given 8kB buffer are durably on disk, the buffer can be reused.  Since insertions and writes are both sequential, the WAL buffers are in effect a ring buffer, with insertions filling the buffer and WAL flushes draining it.  Performance suffers when the buffer fills: no more WAL can be inserted until the current flush is complete.  The effects are mitigated by the fact that, when synchronous_commit is not turned off, every transaction commit waits for its WAL record to be flushed to disk; thus, with small transactions at low concurrency levels, a large buffer is not critical.

However,  the standard value in releases prior to PostgreSQL 9.1 was 64kB, which is an amazingly small value for any kind of buffer on a modern system.  In PostgreSQL 9.1, prompted by some good advice from Greg Smith (blog, book), we changed the value to auto-tune to 3% of shared_buffers up to a maximum of 16MB, the size of a single WAL segment.  This results in a substantially higher value than was used in previous releases, and that's a very good thing.  If you are using an older release and haven't set this value manually, it's probably worth plugging your value of shared_buffers into this formula and configuring the resulting value.   Even on versions which use this auto-tuning formula, I've discovered that an even higher value can sometimes improve performance dramatically.  On one test, involving 32 concurrent clients, I found that wal_buffers=64MB doubled performance as compared with wal_buffers=16MB; however, on another system I found that no setting I tried produced more than a 10% improvement over the auto-tuning formula.  Still, especially if you have a system with lots of concurrent activity, higher values may be worth experimenting with, particularly if you have frequent checkpoints: the first write to any given page after a checkpoint will emit a full-page image into WAL to protect against torn pages in the event of a crash, and these full-page writes can chew through WAL buffers very quickly.

If you're serious about tuning your system for optimal performance, it's probably worth trying several different settings for these parameters and determining experimentally which settings work best in your environment.  Far and away, however, the biggest problem with these settings seems to be that people don't tune them at all, or that they raise them only slightly, leaving them still too low.
 
Share this