Monday, August 31, 2015

Lock-polling script for ALTER TABLE

One of PostgreSQL's advantages over many other SQL databases is letting users modify database objects with a minimum of locking and fuss, and do it in a transaction so it can be rolled back.  However, a minimum of locking isn't no locking, so one still has to plan for getting a lock to push through the database migration.

For example, say I wanted to add two new columns to a table.  I already know that by making the column nullable, and not setting a default, I can add them with a metadata-only change which requires only an update to the system catalogs.  However, this table change does require an ACCESS EXCLUSIVE lock for a few milleseconds to go through, which can be tricky.  See, an access exclusive lock blocks everything, including reads and autovacuum, which can be a bit messy if your timing is bad.  You can end up waiting for that lock behind a long-running query, and then all the other database traffic can pile up behind you.

Now, if we had ALTER TABLE CONCURRENTLY it would take care of this for you.  But we don't (yet), so you need to fake it with a DO script.  Here's an example DO script which polls for a lock every 2 seconds until it can get one, then pushes through the table change:

    DO $f$
    DECLARE ntries INT := 10;
        sleepytime INT := 2;
    BEGIN

    FOR get_lock IN 1 .. ntries LOOP
            BEGIN
                    LOCK TABLE mytable
                    IN ACCESS EXCLUSIVE MODE NOWAIT;
                    ALTER TABLE mytable ADD COLUMN new_col1 INT,
                            ADD COLUMN new_col2 VARCHAR;
                    RAISE INFO 'table updated';
                    RETURN;
            EXCEPTION
                    WHEN lock_not_available THEN
                            PERFORM pg_sleep(sleepytime);
            END;
    END LOOP;

    RAISE INFO 'unable to obtain lock after % tries', ntries;

    END;$f$;


The idea here is that you keep trying to LOCK ... NOWAIT, which will throw and error if it can't get the lock immediately.  Then it sleeps and tries again 2 seconds later.  If you're using 9.3 or later, you can take an even better approach, using lock_timeout:

    DO $f$
    DECLARE ntries INT := 10;
        sleepytime INT := 2;
    BEGIN
    SET lock_timeout = '100ms';
    FOR get_lock IN 1 .. ntries LOOP
            BEGIN
                    LOCK TABLE mytable IN ACCESS EXCLUSIVE MODE;
                    ALTER TABLE mytable ADD COLUMN a int,
                            ADD COLUMN b INT,
                            ADD COLUMN c INT;
                    RAISE INFO 'table updated';
                    RETURN;
            EXCEPTION
                    WHEN lock_not_available THEN
                            PERFORM pg_sleep(sleepytime);
            END;
    END LOOP;

    RAISE INFO 'unable to obtain lock after % tries', ntries;

    END;$f$;


The advantage of this is that it doesn't have to get the lock immediately; it waits 100ms, then gives up on getting the lock.  That helps in situations where the table is never completely free of read queries, without risking a serious pile-up of requests.

You should be able to adapt this approach. to the migrations you actually need to push through.

No comments:

Post a Comment