Wednesday, December 11, 2013

Meet your new NoSQL Database

... Postgres.  You wanna document store?  We got your document store, right here.

Thanks to Craig Kerstiens, I just learned about the pgREST project.  This project turns PostgreSQL into a RESTful JSON document store, capable of running ad-hoc JavaScript code (either LiveScript or V8) on the server in order to do searches and batch modifications.
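
To give a flavor of the server-side JavaScript involved -- this is just a sketch assuming the plv8 extension, not pgREST's actual HTTP API, and the function name is made up -- here's an ad-hoc function that scans a list of JSON documents:

-- Ad-hoc JavaScript running inside Postgres via plv8 (a sketch, not pgREST itself).
CREATE EXTENSION IF NOT EXISTS plv8;

CREATE OR REPLACE FUNCTION count_long_titles(docs text)
RETURNS integer AS $js$
    // Parse a JSON array of documents and count the ones with long titles.
    var parsed = JSON.parse(docs);
    var n = 0;
    for (var i = 0; i < parsed.length; i++) {
        if (parsed[i].title && parsed[i].title.length > 20) n++;
    }
    return n;
$js$ LANGUAGE plv8;

-- SELECT count_long_titles('[{"title":"hi"}, {"title":"a considerably longer title"}]');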

pgBSON, a Postgres extension implementing support for a MongoDB-compatible BSON datatype, also recently went 1.0, signalling that it's ready for production use.
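
As a minimal sketch of what that looks like -- the extension name ("pgbson") and the bson type name here are my assumptions, so check the project docs for the exact spelling and operators:

-- Hypothetical setup; extension and type names are assumptions.
CREATE EXTENSION pgbson;

CREATE TABLE documents (
    id   serial PRIMARY KEY,
    body bson NOT NULL   -- MongoDB-compatible BSON document
);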

These are powerful demonstrations of how adaptable Postgres is.  If it's data, we can find a way to do it.  And unlike a typical document store, you can do relational SQL as well, giving you your choice of interface.
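
For example, with the json type and operators that ship in 9.3, you can pull fields out of documents and join them against ordinary relational tables in a single query (table and column names below are made up for illustration):

-- Extract a field from a JSON document and aggregate over a joined table.
SELECT o.id,
       o.doc ->> 'customer'  AS customer,
       sum(li.amount)        AS total
FROM   orders o
JOIN   line_items li ON li.order_id = o.id
GROUP  BY o.id, o.doc ->> 'customer';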

pgREST is also the first major project to come out of our Taiwanese community, so I'm pretty excited about it.

Of course, we're still waiting for fully indexable, binary-storage JSON in 9.4, but that's on schedule for September.

Friday, December 6, 2013

Checksums! On December 10th

SFPUG will be hosting committer Jeff Davis talking about his 9.3 feature, data page checksums for PostgreSQL, and how to use them.  As usual, we will have live video of his presentation.  The presentation should start around 7:15 PST.

Thursday, December 5, 2013

Why you need to apply today's update real soon

So, another month, another PostgreSQL update.  This one is a lot more critical than most because it patches up to three data loss issues, depending on which version of PostgreSQL you're on and which features you're using.  While all users need to plan to apply this update sometime during the next couple of weeks, two kinds of users need to schedule an update downtime for this weekend, if at all possible:
  1. Users on version 9.3
  2. Users of binary replication
While all three data loss issues depend on timing, and are thus low-probability, all three are impossible to predict in advance and difficult to detect when they occur.  As such, they should be considered high-risk, and the fix should be treated as urgent.  Of the three issues, the replication one affects 9.0, 9.1, 9.2, and 9.3; one of the VACUUM issues affects all versions; and the other affects only 9.3.

Annoyingly, you'll have to do some additional stuff after you update:
  • If using binary replication, you need to take a new base backup of each replica after updating.
  • You should run the following on each production database after updating:
SET vacuum_freeze_table_age = 0;
VACUUM; -- optionally, ANALYZE as well

This second step is critical for users on 9.3, and a generally good idea for users on other versions. Note that, while VACUUM is a non-blocking operation, it can create a lot of IO, so it's a good idea to do this during a slow traffic period and monitor it for pile-ups.
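
One way to keep an eye on it is to watch pg_stat_activity while the VACUUM runs.  Here's a rough monitoring sketch (column names as of 9.2 and later; older releases use procpid and current_query instead):

-- Show the running VACUUM plus anything waiting on a lock.
SELECT pid, state, waiting,
       now() - xact_start AS running_for,
       query
FROM pg_stat_activity
WHERE query ILIKE 'vacuum%' OR waiting
ORDER BY xact_start;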

More information about the replication issue is here.

So, how did this happen?  Well, all three issues were unexpected side effects of fixes we applied for other issues in earlier versions.  For example, the replication issue is the result of the interaction of two independent fixes for failover issues, both of which genuinely needed fixing.  Since all of these issues depend on both timing and heavy traffic, and their effects are subtle (as in, a few missing or extra rows), none of our existing testing was capable of catching them.

If anyone wants to devise destruction tests to detect subtle data corruption issues in PostgreSQL before we release code -- if anyone can -- please offer to help!