Trying to go to bed smarter than when I woke up.
This is crossposted from the LendingHome Engineering Blog.
Basic proficiency with PostgreSQL is one of the highest-leverage skills an engineer can have. Even a cursory understanding of its internals and configuration saves significant time: hours that would have otherwise been spent writing complex one-off scripts to restore data and optimize query patterns can often be trumped with a simple configuration change.
In this post, we’ll first go over some of the important configuration options and explain what they do, walk through some common configuration setups, and finally, we’ll link to some helpful resources to guide you further.
The Need for Configuration
PostgreSQL is one of the most popular database servers today. It’s inured from 20-plus years of real-world production deployments, so it’s naturally very performant and reliable. Unfortunately, this ubiquity also means that the default configuration settings are a compromise designed to work on a variety of machines. This standard configuration leaves room for many simple changes that can translate into great performance gains.
The default PostgreSQL config file lives in
postgresql.conf and is pretty long (around 483 lines on a fresh install). Here’s where you can find it on Homebrew and Ubuntu:
Homebrew: /usr/local/var/postgres/postgresql.conf Ubuntu: /etc/postgresql/current/main/postgresql.conf
For many use cases, there are only a few key settings that you need to know about to make a difference:
shared_buffers is the amount of dedicated memory that Postgres can use to cache data before relying on the operating system’s shared kernel buffer. Think of it as Postgres’ dedicated short term memory. It’s important to keep things cached because reading data off of disk is much slower than fetching it from RAM; the higher the value for this setting, the more likely it is the data will be accessed from RAM.
The higher you set this value, the more memory Postgres can use to cache results. While it’s tempting to set this number very high, a rule of thumb is to keep it around 25–40% of your total available RAM. So for example, on a machine with 16GB RAM, a good value is below 6GB.  Keep in mind, if you have a very small dataset, increasing shared_buffers might not make a difference.
If you set shared_buffers to a high value, you might see an error about kernel memory settings, shmmax to be specific. Fear not. Shmmax is the maximum amount of memory that your operating system will allow a process to use. See the section below on how to fix this.
The total amount of memory you think Postgres will be able to use, including both shared_buffers and the amount of memory available by the operating system caches. The query planner uses this number to figure out how much memory will be available for caching results while running queries. Set this too high and the query planner gets too aggressive and memory usage balloons; set it too low, and Postgres might not use the right indexes because they won’t fit into memory.
The maximum number of concurrent connections that Postgres should allow. If your server accepts a lot of traffic with relatively simple queries and you need more throughput, try having a higher number here (keeping in mind constraints from work_mem). On the other hand, if you’re doing complex queries that require a lot of memory, set it lower.
work_mem is the amount of memory allocated to each Postgres operation (see max_connections) This determines how much memory a single Postgres operation can use, and is especially helpful for complex sorts and joins. More memory is better, because that makes it less likely that the OS starts swapping to disk, which is slower than keeping the query in memory.
Be careful — this is the amount of memory allocated per operation. If you have 50 operations and a complex sort going on that requires 100MB per query, you’re suddenly going to be using over 5GB of memory.
This is similar to
work_mem, but for maintenance operations like
CREATE INDEX. If you’re doing a lot of index creation or other types of maintenance tasks, set this high and you’ll save a lot of time. When restoring database dumps, it helps to set this number high to speed up index creation and other maintenance tasks like vacuuming.
Common Configuration Setups
First, it’s important to figure out what kind of workload you are targeting. There’s no one-size-fits-all approach, but here are a couple of general guiding principles based on your type of workload:
How to set up for a restore
If you’re restoring data (maybe through pg_restore from a backup), you’ll want to increase the amount of worker memory (maintenance_work_mem, used to create indexes) and reduce the number of checkpoint segments (see section below on checkpoint_segments), assuming you’re confident that the data you’re loading is intact.
Also, if you are restoring data using pg_restore, leverage its multi-core options through the — jobs option so you can restore with many threads in parallel. This almost divides the amount of time required to restore data by the number of processors you use. Caveat — you have to tell pg_dump to export into the directory format for this to work. Also, see the section below on general write-heavy configurations for more ways to reduce restore time.
If your deployment is heavily focused on reads, you have a few decisions to make. The most important is: does the data you need to access follow the Pareto Principle where 20% of the data is read 80% of the time, or is it a more evenly distributed read distribution? If your reads are Pareto distributed, you can probably increase RAM with hopes of keeping the necessary data in cache for as long as possible.
On the other hand, if you’re in a situation where you have a lot of data and your reads are near-uniformly distributed, it’s good to 1) try and increase shared_buffers so that you can hopefully fit all the data in caches, and if not, 2) upgrade to high-performance storage (like iSCSI 15k drives) or SSDs. If you’re on Heroku, just pick a more performant database plan. Many performance bottlenecks happen at the database layer, so investing in better disks and adding more data caching can really increase performance in read-heavy environments.
Postgres writes new entries to a system called the Write Ahead Log (WAL) in 16MB segments. In the default database configuration, the WAL “flushes” to disk every time 3 of these segments fill up (checkpoint_segments is set to 3 by default), and this process is called a WAL checkpoint. 
Checkpoints are expensive operations, and if you’re in a write-heavy environment, it’s advisable to increase checkpoint_segments to a higher number so that you reduce the frequency at which the WAL flushes to disk. For example, you can set checkpoint_segments to 12 and you will reduce the overall number of checkpoints by a factor of 4.
If you are lucky enough to be in a situation where minor data loss is acceptable, you can disable synchronous commits by setting synchronous_commit to off. Bear in mind that this is a risky approach — you’re putting yourself at risk to losing a small amount of data, but disabling synchronous commits in the right environments can significantly boost write volume.
Extra: Dealing with Shmmax Errors
If you try restarting your server and get an error like this:
IpcMemoryCreate: shmget(key=<some key size>, size=<some size>, 03600) failed: Invalid argument
That means you have to tweak your kernel memory settings. See the Postgres explanation on kernel resources for a quick walkthrough of how to increase allowable memory allocation on your machine. More info on kernel memory settings here.
PostgreSQL is a great database server, and some basic configuration tweaks can help you unlock its performance. It’s very actively maintained with a huge development following, so it’s constantly evolving and keeping up-to-date with user needs.
Thanks to Donovan Bray, Sean Huber, Aman Gupta, and Joe Damato for reading drafts of this.