Friday, August 28, 2015

Stupid Hacks: Dictionary replace function

I write a lot more PL/pgSQL than I'd like to.  Not that I don't like SQL, but as a language PL/pgSQL really shows its thrown-together origin; it's like 80's primitive.  One thing that PL/pgSQL lacks is good string manipulation tools, which is particularly tragic given that the #1 thing to do in PL/pgSQL is to generate queries from parameters and run EXECUTE.

Postgres has two built-in ways to do string substitution: concatenation and format().  Both have drawbacks.  Let me give you an example:

EXECUTE 'SELECT ' || col1 || ', ' || col2 || ' FROM ' || userschema ||
  '.accounts WHERE ' || filterclause || ' ORDER BY ' || col1 || ',' || col2;

EXECUTE format('SELECT %s, %s FROM %s.accounts WHERE %s ORDER BY %s, %s', col1, col2, userschema, filterclause, col1, col2);

You can see the problem here.  Both formats are hard to read and hard to maintain.  Python and Perl have a good fix for this: dictionary/hash-based string substitution, where you can swap in the dictionary keys for the values.  So I wrote up a quick hack to do this in PL/pgSQL.

Here's the dict-replace function.

Using it, you'd replace the above with:

EXECUTE replace_vars('SELECT ${col1}, ${col2} FROM ${userschema}.accounts
  WHERE ${filterclause} ORDER BY ${col1}, ${col2}', vardict);

Of course, you need to first set up the vardict as a JSON value, in the form:

vardict := '{ "col1" : "username", 'col2' : "branchname", ...}'

Still, much more readable, eh?  No refcounting, no repeating variables, no string breaks.  I used Bash's variable substitution syntax of ${var} because it seemed like the thing least likely to conflict with user text, unlike anything involving %.

There's some caveats, though: it's not performant, and you could probably find strings which will break it, particularly if you're swapping in JSON values.  It's not SQL-injection safe, so don't use it for stuff handling user input.  You still have to do your own quoting of strings.  And if you have access to PL/Perl or PL/Python you don't need this nonsense.

But for the minimal case, it should help.

5 comments:

  1. We have a operator :get_value" "-->" and probably it can be implemented to "format" function - so write patch, please.

    ReplyDelete
    Replies
    1. Pavel,

      Are you suggesting that format() be extended to add #> and similar from JSON[B]?

      Delete
  2. I propose to add named/path placeholders to to format function. Internally it can use #> operator, and it can be used for any type with this operator support. So "format('the name: ${name}', hstore '{name => Pavel})" or "format('the name: ${name}', json '{name: "Pavel"})"

    ReplyDelete
    Replies
    1. Sounds good to me. ${} makes sense to you too?

      Delete