Friday, November 8, 2013

ALTER TABLE and downtime, part II

Now, you'll note in part I that I said that doing certain ALTER operations was "free", in quotes.  That's because these operations aren't actually worry-free; there's the major issue of locking.

Currently, any ALTER TABLE you run requires an ACCESS EXCLUSIVE lock on the altered table in order to commit.  This is a lock which blocks not only writes, but reads as well.  You don't see read locks much in PostgreSQL, so one tends to forget they exist until you do something like running an ALTER TABLE in a database which processes 15,000 queries per second.  Then, this sort of thing can happen:
  1. ALTER TABLE waits behind a long-running reporting or write query.
  2. SELECT queries start to pile-up behind ALTER TABLE statement
  3. server starts performing badly because of the pile-up, and is effectively locked up for a few minutes.
  4. the pile-up takes several more minutes to clear, even after the ALTER has gone through.
This gets even worse if the table being changed has foreign keys referencing it; then any write queries against the referencing tables are also blocked.  The result is something which looks very much like a downtime, even though we're using transactional DDL.

This means that, for deployments which involve ALTER TABLE, you'll need to plan on a brief downtime or at least watching the database closely if you run an extremely-high-transaction-rate database with many multi-statement transactions.  In many cases I get around this by polling for a manual lock on the table, and proceeding with the ALTER once I get one.   This doesn't work well with existing schema management tools, though.

I'm not just talking about ALTER TABLE ADD/DROP COLUMN here; this happens for any ALTER TABLE statement, including ALTER TABLE SET STATISTICS.  Unfortunately, while PostgreSQL DDL statements are transaction-safe, our system tables themselves are not currently MVCC, which means that the only way we can modify anything in the system tables is to take an exclusive lock on it.

Fortunately, thanks to Robert Haas, that just changed in 9.4 development.  In the future, we should be able to do the following operations without taking more than a share lock on the table:
  • SET STATISTICS
  • DROP TRIGGER/CONSTRAINT
  • DROP a foreign key from a referencing table
And these operations should require only a regular write lock on the table:
  • SET DEFAULT
  • SET NOT NULL
  • ADD COLUMN
... which will be really cool, and a boon to DBAs and DevOps everywhere.

There is one special case of ALTER TABLE you need to know about in addition to this, though.  That's ALTER TABLE ALTER COLUMN ... TYPE <new_data_type>.   More on this in a succeeding post.

DDL = Data Definition Language, i.e. CREATE/ALTER/DROP statements.

MVCC = Multi-Version Concurrency Control, the way we ensure that users can see different versions of the database depending on their current transaction context.

7 comments:

  1. > In many cases I get around this by polling for a manual lock on the table, and proceeding with the ALTER once I get one

    How do you do this exactly?

    I usually repeat a kind of the block below to work around it.

    BEGIN;
    SET LOCAL statement_timeout TO '300ms';
    ALTER TABLE table1 ADD COLUMN new_column text;
    END;

    ReplyDelete
    Replies
    1. The problem with that is that it's going to block all queries which came in after it for 300ms. I'll post an example lock-polling function later.

      Delete
    2. Hm... I thought it is inevitable. Eagerly waiting for the lock-pooling example then.

      Delete
  2. BTW, I faced an interesting nuance about NOT NULL and inheritence recently. When you add a column to the parent table and then try to SET NOT NULL on the propagated columns in the child tables then it checks all the partitions every time for each partition. However, it you add columns to the child tables directly and then SET NOT NULL on this columns, then it will check the partition itself only for each partition.

    ReplyDelete
  3. MVCC = Multiversion Concurrency Control

    ReplyDelete
  4. Why do you need a high query frequency in order to trigger the problem ? Wouldn't any long-running transaction + hardware pushed to its limits by other queries exhibit the "queue queries, and then take ages to dequeue" symptom ?

    ReplyDelete