Tuesday, December 18, 2012

PyPgDay Call For Presentations

We are now ready to accept your submissions to present at PyPgDay.   Please submit as soon as possible; the deadline is January 20.

We are particularly looking for speakers who can present on using PostgreSQL and Python together, including Django, GeoDjango, Pylons, psycopg2, SQLAlchemy, PL/Python, NumPy, and the new PostgreSQL JSON features.  General talks about PostgreSQL administration and performance, pitched at a developer audience, are also welcome, as are how-to talks, case studies, personal experiences, and new software introductions.  We will also have lightning talks.

Submit now!

Saturday, December 15, 2012

Reminder: pgCon Call For Papers

The pgCon Call For Papers has been open since December 1st.  This is part of an effort by the conference committee to get all the speakers notified on time this year, so please submit your proposals before January 15th.  See you at pgCon!

Friday, December 7, 2012

Save The Date: PyPgDay

PostgreSQL has become the "default database" of Python users everywhere.  And Python has become the most popular scripting language for Postgres.  So, to celebrate that, we're throwing a miniconference at PyCon 2013.

PyPgDay will be a full-day PostgreSQL event at PyCon this year, on March 13th (all day), at the Santa Clara Convention Center.  Watch the PyPgDay wiki page and this blog for updates on the event, as we get it organized.

In other event news, the Call For Papers for pgCon is now open.  Please submit your PostgreSQL hacking, development, performance, and advanced technique proposals.

Freezing Your Tuples Off, Part 3

First, read Part 1 and Part 2 of this series.

 

vacuum_freeze_table_age


To understand the purpose of this parameter, you have to understand how vacuum has worked since version 8.4.  Vacuum no longer scans the entire table by default; instead, it consults a bitmap (the visibility map) which tells it which data pages might contain reclaimable space, so that it scans only the portion of the table which is "dirty".  This is a considerable optimization for large tables, where perhaps only 20% of the rows have been updated or deleted since the last vacuum cycle.

This does, however, introduce a problem.  Because old, untouched data pages aren't being vacuumed (they no longer have any reclaimable rows), they are not getting frozen, regardless of the setting of vacuum_freeze_min_age.  This makes it much more likely that you'll hit a wraparound vacuum when you least expect it.
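
To see how exposed you are, you can run a query along these lines (similar in spirit to the one in Part 1) to show how close each table is to a forced wraparound vacuum.  The 200-million divisor is the default autovacuum_freeze_max_age; substitute whatever value you actually run with:

    -- how close is each table to a forced wraparound vacuum?
    SELECT relname,
           age(relfrozenxid) AS xid_age,
           round(age(relfrozenxid) / 200000000.0 * 100) AS pct_toward_forced_vacuum
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY age(relfrozenxid) DESC
    LIMIT 20;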

The idea of vacuum_freeze_table_age is to complement autovacuum_freeze_max_age by telling the database: "If it's almost freeze time for this table, and you were vacuuming the table anyway, then scan the whole table and freeze old tuples."  In theory, this should allow you to hold off wraparound vacuum by running a regular, cancellable vacuum instead.  In practice, though, that's a dubious benefit for large tables, since either way you get a full-table scan and a vacuum which runs for a long time.  Small, heavily updated tables will tend to get most of their pages vacuumed most of the time anyway, and thus be unaffected by this setting.

As such, there's no point in setting it to a particularly low or creative level.  Simply set it to 80% of whatever autovacuum_freeze_max_age is set to (recommended: 800 million).
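
For example, the 800-million figure implies an autovacuum_freeze_max_age of one billion; with those values (yours may differ), the relevant postgresql.conf lines would look something like this:

    # example values only -- scale to whatever autovacuum_freeze_max_age you run
    autovacuum_freeze_max_age = 1000000000   # one billion
    vacuum_freeze_table_age = 800000000      # 80% of autovacuum_freeze_max_age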

 

Flexible Freeze


By now it should have occurred to you that Postgres has a serious problem with vacuum freeze behavior on large tables.  It's an unfortunate side-effect of optimizations in several other areas.  You can tweak the settings as I recommended, but you'll still be in danger of having a wraparound vacuum kick in and saturate your IO at some unexpected time.  So what can you do to get ahead of this and proactively freeze tables before they reach that point?

Well, you have one piece of information PostgreSQL doesn't.  You know when the slow periods in your application usage are.  Try following this "flexible freeze" program in a cron script:
During expected "slow periods", run the query from Part I to get a list of tables, and then:
  1. Set PostgreSQL to "soft vacuum" with a high vacuum_cost_delay (such as 50ms).
  2. Set PostgreSQL to aggressively freeze tuples, with vacuum_freeze_table_age at 50% of autovacuum_freeze_max_age, and vacuum_freeze_min_age set to 10% of its usual value.
  3. Until the slow period is almost over, loop through the list, vacuuming each table.
  4. This will help you avoid wraparound vacuum when you least expect it.

Or, you can use our Flexible Freeze python script.
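
If you'd rather roll your own, here is a minimal sketch of the idea using psycopg2.  The connection string, the two-hour window, and the table-list query are illustrative assumptions only; the real Flexible Freeze script does more, and the query from Part 1 is the better source of candidate tables:

    #!/usr/bin/env python
    # flexible freeze sketch -- run from cron at the start of a slow period
    import time
    import psycopg2

    SLOW_PERIOD_SECONDS = 2 * 60 * 60      # assume a two-hour maintenance window
    end_time = time.time() + SLOW_PERIOD_SECONDS

    conn = psycopg2.connect("dbname=mydb user=postgres")  # hypothetical connection
    conn.autocommit = True        # VACUUM cannot run inside a transaction block
    cur = conn.cursor()

    # "soft vacuum": throttle I/O, and freeze much more aggressively than usual
    cur.execute("SET vacuum_cost_delay = 50")               # 50ms sleep per cost batch
    cur.execute("SET vacuum_freeze_table_age = 100000000")  # ~50% of a 200M autovacuum_freeze_max_age
    cur.execute("SET vacuum_freeze_min_age = 5000000")      # ~10% of the 50M default

    # candidate tables, oldest relfrozenxid first (stand-in for the Part 1 query)
    cur.execute("""SELECT quote_ident(n.nspname) || '.' || quote_ident(c.relname)
                   FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
                   WHERE c.relkind = 'r'
                   ORDER BY age(c.relfrozenxid) DESC""")
    tables = [row[0] for row in cur.fetchall()]

    # vacuum each table until the slow period is almost over
    for table in tables:
        if time.time() > end_time - 600:   # leave a ten-minute margin
            break
        cur.execute("VACUUM ANALYZE " + table)

    conn.close()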

 

Further Work


Obviously the suggestions in this article are workarounds.  Where PostgreSQL really needs to go is finding some way to avoid ever needing to vacuum old, cold data.  The obstacle to doing this is that nobody has figured out how.

One of the first methods suggested was to have an 8-byte XID, which would postpone wraparound by billions of cycles.  However, since there are two XIDs on every row header in the database, this would dramatically increase database sizes for many users, especially the users who need more XIDs in the first place.  It would also increase memory requirements for a lot of Postgres operations.  A second method I suggested during the 9.2 cycle was to have an "XID cycle" counter in the page header of each data page.  This would have the advantage of not increasing the size of rows.  However, it would have the drawback of failing in cases where there were rows from more than one XID cycle in the page.

Further, both of these methods hit a major performance problem: the CLOG.  PostgreSQL's CLOG tracks which XIDs committed and which rolled back, and thus which rows are valid and which are orphaned.  Currently, checking the CLOG is quite rapid because it's kept small.  Any patch which involves increasing the size of XIDs or keeping more of them will cause the CLOG to bloat by multiples, substantially affecting response times on transaction-processing workloads.  That's unacceptable.  We've discussed radically restructuring the CLOG, but that's a major enough change that it would need to come with benefits other than just avoiding freeze.

An alternate approach to never freezing old data at all would be to develop some kind of daemon which did background freezes of chunks of large tables.  This daemon would need to be able to tell when the system was "busy" and stop work.  It would also need to be able to track what old data pages it had visited and which it hadn't, and which ones need to be revisited because they've been dirtied.

Anyone wanting to take up the above -- either by coding or by funding -- let me know!  Until then, happy freezing!