Thursday, March 31, 2016

9.5.2 update release and corrupt indexes

We've released an off-schedule update release today, because of a bug in one of 9.5's features which has forced us to partially disable the feature.  This is, obviously, not the sort of thing we do lightly.

One of the performance features in 9.5 was an optimization which speeded up sorts across the board for text and numeric values, contributed by Peter Geoghegan.  This was an awesome feature which speeded up sorts across the board by 50% to 2000%, and since databases do a lot of sorting, was an overall speed increase for PostgreSQL.  It was especially effective in speeding up index builds.

That feature depends on a built-in function in glibc, strxfrm(), which could be used to create a sortable hash of strings.  Now, the POSIX standard says that strxfrm() + strcmp() should produce sorting results identical to the strcoll() function.  And in our general tests, it did.

However, there are dozens of versions of glibc in the field, and hundreds of collations, and it's computationally unreasonable to test all combinations.  Which is how we missed the problem until a user reported it.  It turns out that for certain releases of glibc (particularly anything before 2.22 on Linux or BSD), with certain collations, strxfrm() and strcoll() return different results due to bugs.  Which can result in an index lookup failing to find rows which are actually there.  In the bug report, for example, an index on a German text column on RedHat Enterprise Linux 6.5 would fail to find many rows in a "between" search.

As a result, we've disabled the feature in 9.5.2 for all indexes which are on collations other than the simplified "C" collation.  This sucks.

Also, if you're on 9.5.0 or 9.5.1 and you have indexes on columns with real collations (i.e. not "C" collation), then you should REINDEX (or CREATE CONCURRENTLY + DROP CONCURRENTLY) each of those indexes.  Which really sucks.

Of course we're discussing ways to bring back the feature, but nobody has a solution yet. In the meantime, you can read more about the problem on the wiki page.