Sunday, February 15, 2015

Running with scissors mode

DBAs Running with Scissors

Based on some comments in my post about "in-memory" databases, I realized that my post about running Postgres without disk sync was no longer available on Database Soup.  So I'm reposting the instructions here.

This blog post version has corrections and clarifications thanks to Andres Freund, since it first went up yesterday.  Thanks, Andres.

Running PostgreSQL this way was christened "running with scissors mode" by Gavin Roy, because you're operating completely without crash-safety; if something happens to the server, even a power fluctuation, your database contents are untrustworthy and may be corrupt.  However, it can be a useful way to run Postgres for extra, read-only replicas used strictly for load-balancing, or if what you're loading into Postgres is completely disposable/replaceable.

Note that these settings do not, in fact, disable all disk writes.  What they do instead is minimize disk writes, and make all disk writes asynchronous, dependant entirely on the OS's own memory swapping and dirty block flushing for any disk writes.  This is what you want; you don't want the database to halt because, for example, you simply ran out of space in memory.

So, without further ado, here's the settings:

    work_mem =  (RAM - DBsize - shared_buffers)*2 / max_connections
    temp_buffers = (RAM - DBsize - shared_buffers)*4 / max_connections
    temp_file_limit = 0

On the one hand, we want to set work_mem high in order to avoid on-disk sorts. On the other hand, having pinned RAM for sorts push the database out of memory would be counterproductive.  As such, you want to set up work memory to use available RAM you don't need for database caching.  The above assumes that max_connections is set to something sensible for the number of connections you actually need.  You should really be using pgbouncer as well with this setup.


Set temp_file_limit = 0 to cause queries to be cancelled instead of doing disk sorts.

     bgwriter_lru_maxpages = 0
     wal_level = minimal
     fsync = off
     synchronous_commit = off
     full_page_writes = off

     wal_log_hints = off
     wal_buffers = 64MB


Here we're minimizing the amount of writing we do to the transaction log, and making said writing completely asynchronous.  We're also disabling background writing.

     checkpoint_segments = 8 to 64
   checkpoint_timeout = 60min
     checkpoint_completion_target = 0.9

Checkpoint segments is a bit trickier.  On the one hand, you want it to be large enough that it's not cycling a lot and triggering extra disk flushes. On the other hand, you want all the segments to stay cached in RAM.  So something moderate, 256MB to 2GB, depending on how much RAM you have.  Don't set it to more than 1/32nd of RAM.  Ideally, we'd be able to disable checkpoints entirely, but currently there's no reasonable way to do that.

   stats_temp_directory = '/pgramdisk/stats_tmp'

You will also want to move the stats file to a ramdisk so that it's not being written out.  This is a good optimization in general, even outside of running with scissors mode.


Finally, we need to ensure that PostgreSQL will not restart if the system crashes; at that point, you assume your database is corrupt and proceed to recover it from another source.  The first part of doing this is to disable any autostart in your init system configuration.  Secondly, after starting up Postgres, add a line like this to the beginning of postgresql.conf:

     DO_NOT_RESTART=True

The exact parameter you use doesn't matter; what matters is that it's not a recognized parameter, so that Postgres will error out instead of  restarting.  This does mean extra steps when you want to manually restart this node for configuration changes, but there isn't a really good way around that.

Next up, we'll run some performance tests to see how much this benefits us.

Continued in Part II.

Photo of "DBAs running with scissors" taken by Gavin Roy.

2 comments:

  1. This sounds fun. Can you re-post part II if you have it?

    ReplyDelete
  2. I'd love to see the performance tests results.

    ReplyDelete