Thursday, March 31, 2016
We've put out an off-schedule update release today because of a bug in one of 9.5's features, which has forced us to partially disable that feature. This is, obviously, not the sort of thing we do lightly.
One of the performance features in 9.5 was an optimization of sorting for text and numeric values, contributed by Peter Geoghegan. This was an awesome feature which sped up sorts across the board by 50% to 2000%, and since databases do a lot of sorting, it amounted to an overall speed increase for PostgreSQL. It was especially effective in speeding up index builds.
That feature depends on a built-in function in glibc, strxfrm(), which can be used to create a sortable binary representation of a string. Now, the POSIX standard says that strxfrm() + strcmp() should produce sorting results identical to the strcoll() function. And in our general tests, it did.
However, there are dozens of versions of glibc in the field, and hundreds of collations, and it's computationally unreasonable to test all combinations. Which is how we missed the problem until a user reported it. It turns out that for certain releases of glibc (particularly anything before 2.22 on Linux or BSD), with certain collations, strxfrm() and strcoll() return different results due to bugs. Which can result in an index lookup failing to find rows which are actually there. In the bug report, for example, an index on a German text column on RedHat Enterprise Linux 6.5 would fail to find many rows in a "between" search.
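To make the symptom concrete, here's a hypothetical sketch of that failure mode -- the table, column contents, and collation are made up, and on a healthy system the two counts will of course match:

-- With an affected glibc and an index built under 9.5.0/9.5.1, an index scan
-- can return fewer rows than a sequential scan over the same predicate.

-- Index scan (the default when an index on "name" is usable):
SELECT count(*) FROM kunden WHERE name BETWEEN 'Mai' AND 'Mu';

-- Force a sequential scan and compare:
SET enable_indexscan = off;
SET enable_bitmapscan = off;
SELECT count(*) FROM kunden WHERE name BETWEEN 'Mai' AND 'Mu';
RESET enable_indexscan;
RESET enable_bitmapscan;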
As a result, we've disabled the feature in 9.5.2 for all indexes which are on collations other than the simplified "C" collation. This sucks.
Also, if you're on 9.5.0 or 9.5.1 and you have indexes on columns with real collations (i.e. not "C" collation), then you should REINDEX (or CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY) each of those indexes. Which really sucks.
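As a sketch, with made-up table and index names, the two rebuild options look like this; the CONCURRENTLY route avoids blocking writes, at the cost of taking longer:

-- Option 1: simple, but blocks writes to the table while it runs:
REINDEX INDEX orders_customer_name_idx;

-- Option 2: build a replacement without blocking writes, then swap it in:
CREATE INDEX CONCURRENTLY orders_customer_name_idx_new
    ON orders (customer_name);
DROP INDEX CONCURRENTLY orders_customer_name_idx;
ALTER INDEX orders_customer_name_idx_new RENAME TO orders_customer_name_idx;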
Of course we're discussing ways to bring back the feature, but nobody has a solution yet. In the meantime, you can read more about the problem on the wiki page.
Thursday, February 5, 2015
Some notes on today's update release
We released a security and cumulative bugfix release to all supported versions today. That means it's update time. What follows is my personal advice on the update.
For the first time in a while, we have a bunch of "low-risk" security fixes in this release, but no "critical" security fixes. The reason I put those terms in quotes is that it doesn't matter how critical the fixes are in general; it matters how critical they are to you. So you should definitely read over the release notes and the CVE notices to check how they affect you.
All five of the security holes patched require prior authentication. Four of the five have not been proven to have an actual privilege escalation vector; they may be only denial-of-service attacks. And the fifth security issue only affects you if you are using per-column privileges for columns with constraints on them. That's why I regard these issues as relatively "low-risk".
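If you're not sure whether that last case applies to you, "per-column privileges" means column-level grants like the one below (table, column, and role names are hypothetical):

-- A column carrying constraints, plus a column-level grant on it:
CREATE TABLE customers (
    id    serial PRIMARY KEY,
    email text NOT NULL UNIQUE
);
GRANT UPDATE (email) ON customers TO support_role;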
There are also some important fixes to performance and replication for versions 9.4 and 9.3, so users of those versions should apply the update soon. For other users, unless you live in the Fiji Islands or other places affected by timezone changes, you can probably wait for your next scheduled maintenance window. You do have scheduled maintenance windows, yes?
Other people who might care to apply this update sooner rather than later include:
- Users who have already had issues with autovacuum
- People using the new logical decoding
- Users who have a single archive which is shared between master and replicas
- Folks who create a bunch of tablespaces
- Developers who use tsquery, xpath(), and/or complex regular expression searches
- JSONB users
- Norwegians who use Postgres on Windows
- Users who have reported bugs with explicit locking and deadlocking in the last few months
Thursday, December 5, 2013
Why you need to apply today's update real soon
So, another month, another PostgreSQL update. This one is a lot more critical than most because it patches up to three data loss issues, depending on what version of PostgreSQL you're on and which features you're using. While all users need to plan to apply this update sometime during the next couple weeks, two kinds of users need to schedule an update downtime for this weekend, if at all possible:
- Users on version 9.3
- Users of binary replication
Annoyingly, you'll have to do some additional stuff after you update:
- If using binary replication, you need to take a new base backup of each replica after updating.
- You should run the following on each production database after updating:
VACUUM; -- optionally, ANALYZE as well
This second step is critical for users on 9.3, and a generally good idea for users on other versions. Note that, while VACUUM is a non-blocking operation, it can create a lot of IO, so it's a good idea to do this during a slow traffic period and monitor it for pile-ups.
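For reference, here's a minimal version of that second step, plus an assumed monitoring query you can run from a second session to watch for pile-ups (the pg_stat_activity column names below are the 9.2-and-later ones):

VACUUM;   -- or VACUUM ANALYZE; if you want planner stats refreshed as well

-- From another session, keep an eye on long-running or piled-up queries:
SELECT pid, state, now() - query_start AS runtime, left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;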
More information about the replication issue is here.
So, how did this happen? Well, all three issues were unexpected side effects of fixes we applied for other issues in earlier versions. For example, the replication issue is the result of the combination of two independent fixes for failover issues, both of which needed to be fixed. Since all of these issues depend on both timing and heavy traffic, and their effect is subtle (as in, a few missing or extra rows), none of our existing testing was capable of showing them.
If anyone wants to devise destruction tests to detect subtle data corruption issues in PostgreSQL before we release code -- if anyone can -- please offer to help!
Thursday, October 10, 2013
How urgent is that Update?
We've released a cumulative bugfix release today for all supported versions of PostgreSQL, in case you somehow missed it. This update doesn't contain anything time-critical for most users. You'll want to add it to your upgrade queue for your next scheduled downtime, but you don't need to schedule an update for this weekend. Unless, of course, you are one of the two dozen or so users who reported the bugs fixed.
However, you don't really need me to tell you that, because it's in the language of the release. More on this below.
A couple of notable fixes in this release:
Hstore-JSON extension fix: if you are using both Hstore and JSON in Postgres 9.3, you should run ALTER EXTENSION hstore UPDATE after updating your server. Due to a mistake in extension versioning, users missed out on some JSON-hstore conversion functions in 9.3.
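That fix is a one-liner which you'll need to run in each database where hstore is installed; you can check which extension version you end up on afterwards:

ALTER EXTENSION hstore UPDATE;
SELECT extname, extversion FROM pg_extension WHERE extname = 'hstore';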
Guarantee transmission of all WAL files: users with replication setups and stringent data retention requirements will want to apply the update sooner rather than later because of this fix. It fixes an issue where under some circumstances a promoted replica might ignore a few transactions it actually received from the master before failover, considering them lost.
Over the last dozen or so update releases, we've standardized the language we use to explain how critical the fixes in the release are for most users.
First, there's the question of what kind of update it is:
Bugfix Release or Update Release: this usually contains two to four months of accumulated bug fixes for various reported bugs. Some of these bugs may be critical data loss bugs; see the release contents to check this.
Security Release: the update contains a patch for one or more security holes. This makes the update at least somewhat urgent, because as soon as the security patch is posted online, the black hats start working on exploits (assuming they didn't before then).
Then we use some standard language about the general criticality of bugs fixed in the release:
All users of the affected versions are strongly urged to apply the update immediately: the release contains a high-exposure security hole or critical data loss bug which affects all users. Stop reading, declare a downtime, and apply the update now. Fortunately, we've only had one of these in the last 6 years.
All users should apply this update as soon as possible: the release contains one or more security holes and/or major data loss bugs expected to affect most users. Schedule a downtime tonight or this weekend and apply the update.
All users of _________ should apply this update immediately/as soon as possible: the same as above, but only for users of a particular feature or extension.
Users should apply this update at the next available opportunity: the release contains some low-risk security holes and/or serious data loss bugs which only occur under certain circumstances. Updating to this release shouldn't be put off for more than a couple weeks, but you can wait for a good low-traffic window to take the downtime.
Users should update at the next regular maintenance window/scheduled downtime: this update contains nothing critical or even serious for most users; it's largely a cumulative collection of bugfixes. The update should be added to the general queue of OS, library and application updates for the production servers.
Note, however, that I said most users. Every update we push out is urgent for at least a few users who are hit by the specific bugs involved. For example, for the current release, users who depend heavily on the following features should consider updating sooner rather than later:
- Range indexes
- SP-GiST
- JSON
- regular expressions
- prepared query plans
- SSL connections
- more than 24GB (!) of work memory
That's why we're reluctant to adopt any system which rates update releases on some absolute scale. It doesn't matter how critical the update is in general; it matters how critical it is to you.
Monday, September 9, 2013
More about my favorite 9.3 Features (video and more)
If you somehow missed it, PostgreSQL 9.3.0 is now available, just in time for your back-to-school shopping. 9.3 includes lots of great stuff, some of which I've already been using in development, and I'll tell you more about my favorites below. There's also a survey and we'll have live video of SFPUG doing 9.3 on Thursday.
No More SHMMAX
We didn't emphasize this in the release announcement -- mainly because it's like removing a wart: you don't really want to talk about it -- but this is the one 9.3 change liable to make life easier for more developers than any other. We've stopped using SysV memory for anything other than the postmaster startup lock, which means that you can now adjust shared_buffers to your heart's content without needing to mess with sysctl.conf. Let alone the satanic incantations you have to go through on the Mac.
This also clears one of the main barriers to writing simple autotuning scripts. Which means I'm out of excuses for not having written one.
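In practice this just means the usual postgresql.conf edit and a restart; if you want to double-check what you're currently running with first, something like this works:

SELECT name, setting, unit, source
FROM pg_settings
WHERE name = 'shared_buffers';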
Custom Background Workers
Need a daemon to do background work alongside Postgres, such as scheduling, queueing, maintenance, or replication? Maybe you want to intercept MongoDB-formatted queries and rewrite them for Postgres? Custom background workers allow you to create your own "autovacuum daemon" which does whatever you want it to.
Michael Paquier will be presenting Background Workers for SFPUG on Thursday the 12th (7:30PM PDT). Details on our Meetup Page, including a link to live video for those of you not in the Bay Area.
Streaming-Only Cascading
This has been my biggest desire since 9.2 came out; we were so close to not needing to worry about archiving, ever, for small databases. And now we're there. You can make chains of replicas, fail over to one of them, remaster, make a replica at a new data center the master, change the direction of replication, and set up lots of other configurations, all without needing to worry about WAL archiving and its overhead.
If you combine this with Heikki's work on pg_rewind, things get even more flexible since you don't have to resnapshot for failback anymore.
I'll be presenting a live demo of this feature at the SFPUG meeting, including connecting replicas in a ring (all replicas -- no master!).
So, what's your favorite 9.3 feature? Vote here!
Tuesday, April 9, 2013
Recap of last week's security release
Rather than writing up a blow-by-blow of last week's security release here, I've written a full article for LWN. Subscription required, I'm afraid (or you can wait 2 weeks).
Thursday, March 29, 2012
Postgres Update Release, Data Loss, and a Million Monkeys
One of the major bugfixes in the most recent PostgreSQL update release (9.1.3 et al.) is one of those annoying timing-based issues which you'd expect to be very rare in production. And it is rare, as measured against the total number of inserts you do in a database. Unfortunately, it's also a "million monkeys" issue, which means that it's not that rare as measured by the number of databases in which it occurs. For that reason, I'd urge all users to update their PostgreSQL releases as soon as they can. Let me explain.
The issue, as I understand it, was that autovacuum has been (since 8.3.0) re-using a pointer for multiple passes over the same leaf page while pruning dead nodes in b-tree indexes. If this happened while a concurrent process was inserting a new index pointer into dead space on the same leaf page, autovacuum could cause that index pointer to be flagged dead and removed. As a result, a row would exist in the table but not in the index -- a form of mild data loss or data corruption.
The above requires split-millisecond timing, which is how we were able to miss the issue for so long. The odds of it happening for any particular insert, even on a busy system, were minuscule, maybe 1 in a million. Further, it required frequent inserts into a table which also had heavy updates and deletes. Not much to worry about overall, yes? Well, no.
This is where the million monkeys come in. Just as the apocryphal mega-primate literary club will, eventually, due to random typewriter-key-mashing, produce War and Peace, with enough database activity your chances of hitting this bug move from "highly improbable" to "nearly certain". Every high-traffic production database has at least one table which receives heavy inserts, updates, and deletes (think queue table), and given a year or so might see a couple billion writes and a million autovacuum runs. Suddenly "one in a million" timing becomes 99.5% probability.
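If you want to check that arithmetic yourself, the chance of at least one hit in N independent opportunities is 1 - (1 - p)^N; the numbers below are illustrative, not measurements from a real system:

-- With a one-in-a-million chance per opportunity, five million opportunities
-- already put you above 99%:
SELECT 1 - power(1 - 1e-6, 5 * 1000000) AS chance_of_at_least_one_hit;
-- roughly 0.993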
Indeed, on examination, we've found this kind of index corruption on four production customer databases. And we haven't checked nearly all of them.
Obviously this means you should update your production databases as soon as you can (downtime this weekend, perhaps?). Folks have also asked me about how to check for this form of index corruption. It's actually quite simple:
First, do a count of the rows in the table by full table scan:
staffos=# SELECT count(*) FROM candidates;
 count
-------
  1449
(1 row)
Then, force an index scan, and do a count(*) using a condition you know to be universally true for an indexed column in the table:
staffos=# set enable_seqscan = false;
SET
staffos=# select count(*) from candidates where id > 0;
 count
-------
  1448
(1 row)
Results like the above would indicate that you've encountered the index corruption bug. Note that you need to test each index to be sure. You fix it by reindexing:
staffos=# REINDEX INDEX candidates_pkey;
REINDEX
staffos=# select count(*) from candidates where id > 0;
 count
-------
  1449
(1 row)
Hopefully this helps with the bug. Our apologies for not catching this issue earlier, but in our defense, PostgreSQL's record on subtle bugs like this is much better than that of competing databases. This does show you why it's vital to stay up to date with the most recent update releases, though!