Database Soup: June 2012

Friday, June 22, 2012

PostgreSQL Wants Your Submission

So there's three Calls For Papers right now which need/want more PostgreSQL speakers: Postgres Open, PostgreSQL Europe, and LinuxConf AU.

Postgres Open's deadline is June 26th, so please submit something soon. We're looking for case studies, new innovations, howtos and other talks. Postgres Open is for PostgreSQL's business and user community, so commercial forks and products are welcome. You can also sponsor!

PostgreSQL Europe will be in Prague this year. This will mean higher-than-usual attendance; I'm submitting a talk for the first time in 3 years. You should, too!

LinuxConf Australia wants more PostgreSQL talks! Please submit one. If we can get enough PostgreSQL people there -- especially PostgreSQL people from Australia -- we could hold a miniconf. (As a warning, though, travel sponsorships are limited).

Sunday, June 17, 2012

Creating a Table With Exactly One Row

Here's a situation which comes up a lot in database design: the one-row table. That is, a table designed to hold one and only one row, which contains some settings or metadata about the database or application. For example, imagine that we want to have a one-row table which holds the current database revision version and the date it was last upgraded:

CREATE TABLE db_version (

    version NUMERIC NOT NULL,

    upgraded_on TIMESTAMPTZ NOT NULL

);

Since the database is only one specific version at any time, it would cause problems for the upgrade scripts if there were more than one row in this table. Which row should they believe? So you want to ensure that there's never more than one row.

You could, of course, do this using a trigger. However, a trigger is complex, and a BEFORE INSERT trigger needs to either count the rows, or be installed after a sample row is added. Either is more work than it needs to be. Try this instead:

CREATE UNIQUE INDEX db_version_one_row

ON db_version((version IS NOT NULL));

This works because the version column is already declared NOT NULL. Since you've created a unique index on the NOT NULL expression, you can never insert a second row because it would violate the uniqueness condition.

Unfortunately, there's no clever trick to preventing that one row from being deleted; that requires a BEFORE DELETE trigger:

CREATE FUNCTION db_version_no_delete ()

RETURNS trigger

LANGUAGE plpgsql AS $f$

BEGIN

   RAISE EXCEPTION 'You may not delete the DB version!';

END; $f$;

CREATE TRIGGER db_version_no_delete

BEFORE DELETE ON db_version

FOR EACH ROW EXECUTE PROCEDURE db_version_no_delete();

Now this all works as expected; you can insert the first row to start the table, and only update it after that:

postgres=# INSERT INTO db_version VALUES ( '1.0',now());
INSERT 0 1

postgres=# INSERT INTO db_version VALUES ( '1.1',now());
ERROR: duplicate key value violates unique constraint "db_version_one_row"
DETAIL: Key ((version IS NOT NULL))=(t) already exists.

postgres=# UPDATE db_version SET version = '1.1', upgraded_on = now();
UPDATE 1

postgres=# delete from db_version;
ERROR: You may not delete the DB version!
STATEMENT: delete from db_version;

Friday, June 15, 2012

postgresql.conf: max_locks_per_transaction

PostgreSQL, and the postgresql.conf file, has around 150 parameters which you don't need to care about 98% of the time (as opposed to the 50-or-so settings you do need to care about more often), except that every once in a while you run into a situation which requires you to learn about some obscure parameter. That is, after all, why it's a changeable setting and not just hard-coded. max_locks_per_transaction is one setting.

The purpose of max_locks_per_transaction is to determine the size of the virtual locks "table" in memory. By default, it's set to 64, which means that Postgres is prepared to track up to (64 X number of open transactions) locks. For example, if you have it set at the default, and you currently have 10 concurrent sessions with transactions open, you can have up to 640 total locks held between all sessions. The reason to have a limit is to avoid using dedicated shared memory if you don't need more locks than that.

Most of the time for most users, But every once in a while, it's not:

2012-06-11 14:20:05.703 PDT,"processor","breakpad",17155,"[local]",4fd660cd.4303,2,"SELECT",2012-06-11 14:19:09 PDT,86/199551,0,ERROR,53200,"out of shared memory",,"You might need to increase max_locks_per_transaction.",,,,"select j.id, pj.uuid, 1, j.starteddatetime from jobs j right join priority_jobs_2849 pj on j.uuid = pj.uuid",,,""

The above helpful message is from the activity log. Unfortunately, the error which the client gets is just "out of shared memory", which is not that helpful ("what do you mean 'out of shared memory'? I have 4GB!").

The reason why the database above ran out of locks was that a few sessions were holding up to 1800 locks, most of them RowExclusiveLock. Given that a lock in Postgres is usually a lock on an object (like a table or part of a table) and not on a row, holding 1800 locks in one transaction is somewhat unusual. Why so many locks?

Well, the database in question has three tables each of which has over a hundred partitions. One frequent application activity was running an UPDATE against each of these partitioned tables with no partition condition in it, causing the UPDATE to check all partitions of each table. This resulted in RowExclusiveLocks on each partition and each index on each partition ... 1800 locks in total. Note that it needs this lock even though only one partition had rows which were actually updated; despite the name, it's a lock on the table or index, not on a specific row necessarily. This lock prevents certain things, like dropping the table or index while the locking transaction is still running.

So that's one time you probably want to increase max_locks_per_transaction out-of-the-box: if your database has several tables with many partitions, or if you use table inheritance a lot for other purposes. Especially since increasing it requires a restart.

Tuesday, June 12, 2012

Video SFPUG June 13: Build Your Application Inside Postgres

Tommorrow night ( June 13th, 7:15PM PDT ) Leon Starr will be presenting "Building Your Application Inside PostgreSQL" for the San Francisco PostgreSQL User Group. As usual, this SFPUG will be broadcast live on video.

In Leon's words, the presentation will be about:

Leon Starr has built a full featured open source Executable UML model editor entirely in plpgsql. WHY? Leon presents the case for building a tightly constrained schema and letting the RDBMS do the heavy logic lifting. Code patterns will be presented where object and db principles are intertwined with an eye toward deriving the best of both worlds. Naturally, plpgsql has its limitations as does the presenter who is relatively new to the language. You are invited to poke fun at his novice coding techniques and weigh in with your experiences and thoughts regarding the appropriateness of stored procedures vs. conventional code.

We have a new broadcaster, JustinTV, who is also physically hosting the event. JustinTV runs on PostgreSQL so they're going to be our live streaming of choice for the forseeable future. I also have a new webcam, so hopefully video quality will be better than last time.