Saturday, November 1, 2014

Finding Foreign Keys with No Indexes

Unlike some other SQL databases, PostgreSQL does not automatically create indexes on the "child" (in formal language "referencing") side of a foreign key.  There are some good reasons for this (see below), but it does give users another opportunity to forget something, which is indexing the foreign keys (FKs) that need it.  But which ones need it?  Well, fortunately, you can interrogate the system catalogs and find out.

I have added a query for finding foreign keys without indexes to pgx_scripts.  Indexes on the "parent", or "referenced" side of the FK are automatically indexed (they have to be, because they need to be unique), so we won't talk about them further.

Now, in order to understand how to interpret this, you have to understand why you would or would not have an index on an FK, and what sort of indexes are valid.  There's two times that indexes on the child side of FKs are used:
  • when doing JOIN and lookup queries using the FK column
  • when updating or deleting a row from the "parent" table
The second occasion is news to some DBAs.  The way it works is this: before letting you delete or change a row in the "parent" table, Postgres has to verify that there are no rows in the "child" table referencing the FK value that might be going away.  If there are, it needs to perform the action you have defined (such as CASCADE, SET NULL or RESTRICT).  If the "child" table is large, this can be substantially speeded up by having an index on the FK column.

This means that it's important to have an index on the child side of the FK if any of the following are true:
  • The child table is large and the parent table gets updates/deletes
  • The parent table is large and the FK is used for JOINs
  • The child table is large and the FK is used to filter (WHERE clause) records on the child table
This means most FKs, but not all of them.  If both tables are small, or if the parent table is small and the FK is used only to prevent bad data entry, then there's no reason to index it.  Also, if the FK is very low cardinality (like, say, only four possible values) then it's probably also a waste of resources to index it.

Now you're ready to run the query on your own database and look at the results. The query tries to filter for the best indexing candidates, but it is just a query and you need to use your judgement on what you know about the tables.  The query also filters for either the parent or child table being larger than 10MB.

Now, you might say "but I have an index on column Y" and wonder why it's appearing on the report.   That's probably because the FK does match the first columns of the index.  For example, an index on ( name, team_id ) cannot be used for an FK on team_id.

You may notice a 2nd section of the report called "questionable indexes".  These are FKs which have an index available, but that index may not be usable for JOINs and constraint enforcement, or may be very inefficient.  This includes:
  • Non-BTree indexes.  Currently other types of indexes can't be used for FK enforcement, although that is likely to change in future Postgres versions.  But they can sometimes be used for joins.
  • Indexes with more than one column in addition to the FK columns.  These indexes can be used for FKs, but they may be very inefficient at it due to the bigger size and extra index levels.
  • Partial indexes (i.e. INDEX ... WHERE).  In general, these cannot be used for FK enforcement, but they can sometimes be used for joins.
It's not quite as clear when you want to add to, or replace those indexes.  My advice is to start with the missing indexes and then move on to the more nuanced cases.

1 comment:

  1. Nice Article !

    Really this will help to people of PostgreSQL Community.
    I have also prepared small demonstration on, to find a missing indexes in PostgreSQL.
    You can visit my article using below link.