Monday, August 25, 2014

The great JSONB tradeoff

One thing we're waiting on to get 9.4 out the door is making a decision on JSONB storage.  Currently under debate are patches to the JSONB format which would, for users with large numbers (as in 150+) of top-level keys, cause JSONB to take up to 60% less space on disk in exchange for an up to 80% increase (that is slowdown) in response times on key extraction.  We need your feedback as potential users of the feature to make a good decision about what's best.

At stake is the question of whether the key locations should be saved as successive offsets, or as lengths.  The advantage of the former is that it speeds up extraction by making it only two lookups to locate a top-level key regardless of the number of keys you have.  The disadvantage is that a series of increasing offsets isn't very compressible.

For JSONB fields which consist of a few top-level keys and large values, this question makes no difference at all.  However, for the worst case ... 150+ top-level keys with short (under 10 bytes) values, the difference is quite dramatic.  For example, I constructed a test with 183 keys, of which 175 were NUMERIC.  I checked both table size and time to extract key #160 from 100,000 rows:

Table Size1147 MB541 MB
Extract 100K2.8s5.2s

This is a "worst case" scenario for the difference between these two designs.  Note that the extraction slowdown affects only retrieving the value to the client; it does not affect index lookups of JSONB rows, which are speedy no matter which patch is employed.

However, we're undecided on this "fix" because we don't know a couple things:

  • How likely are users to have 150+ top-level keys in one field (or keys on any single level together) with short values?
  • Is up to 60% space savings in return for up to 80% extraction slowdown a good tradeoff?  Or a bad one?
As such, I am turning to you, the users, to help us decide what tradeoff makes sense for 9.4.  Let me know in the comments.


  1. Here's my test results with github archive data as posted in the mailing list.

    The main difference in this dataset is that the PushEvent objects I'm querying in this case have only 8 toplevel keys (thus the minimal overhead).


    Test data: 610MB of Json -- 341969 items

    Index size (jsonb_ops): 331MB

    Test query 1: SELECT data->'url', data->'actor' FROM t_json WHERE data @> '{"type": "PushEvent"}'
    Test query 1 items: 169732

    Test query 2: SELECT data FROM t_json WHERE data @> '{"type": "PushEvent"}'
    Test query 2 items:

    HEAD (aka, all offsets) EXTENDED
    Size: 374MB
    Toast Size: 145MB

    Test query 1 runtime: 680ms
    Test query 2 runtime: 405ms
    HEAD (aka, all offsets) EXTERNAL
    Size: 366MB
    Toast Size: 333MB

    Test query 1 runtime: 505ms
    Test query 2 runtime: 350ms
    All Lengths (Tom Lane patch) EXTENDED
    Size: 379MB
    Toast Size: 108MB

    Test query 1 runtime: 720ms
    Test query 2 runtime: 420ms
    All Lengths (Tom Lane patch) EXTERNAL
    Size: 366MB
    Toast Size: 333MB

    Test query 1 runtime: 525ms
    Test query 2 runtime: 355ms

  2. Disk space is cheap. If you have that many keys, you want speed, not space. /twocents

  3. The cost of disk space is not so important here, it's the likelihood of whether the data needed will be in the filesystem disk cache or not as reading from disk is orders of magnitude slower than reading from memory. If the data takes up less space then more of it can fit in RAM and fewer reads will result in accessing the disk.

  4. I would counter Laurence's rebuttal by saying bith disk and ram are cheap. Speed matters. Hardware changes frequently; architectural designs not so much.

  5. Fixing current pglz deficiencies by making jsonb's storage format less optimal does not sound like a good basis for a data layout design decision.

    I would stick to absolute offsets. If a tuning of the space vs. time tradeoff is needed, this should be handled by pglz compression options (e.g. a delta compression pre stage).

  6. any possibility of making the tradeoff a runtime configuration with the default probably being for speed because storage is generally cheap.

  7. There's some room to improve jsonb format, the question is if it's worth having another beta.
    It's possible for example, to store small integers (up to 28 bits) inline instead of an 8~12 bytes Decimal. This will easily shave a lot of bytes in real world data with negligible runtime cost for most operations.

  8. I'd also be in favour of speed over space. A relational database as it is uses a lot more storage vs what would be absolutely necessary to represent the same information in a more minimal or compact format.

  9. It would be interesting to separate out the slowdown inherent in the new format from the slowdown resulting from decompression now being required, as Arthur's figures do with STORAGE EXTENDED vs EXTERNAL.

    If the majority of that slowdown is due to the decompression (which is certainly conceivable) then people can choose the appropriate column storage for their workload.

  10. Greg,
    > Disk space is cheap

    Not so cheap, in case of SSD (still). But I wonder would be this trade-off picture the same in case of SSD (I assume yes, if those timings were measured for "warm" database)

  11. As a postings-list format, an 80% slowdown in decoding time for a 60% space savings is not a good trade-off.

    Also, I would expect that the format will end up changing in the medium/long-term to some kind of group var-byte encoding, just as practically all other postings-list formats from other systems have done - which will change the storage efficiency picture again (better).

    Also, starting from the standpoint of doing "row-by-row" compression in postgresql is simply Doing Things Wrong, so Don't Force It (DFI). In other words: the problem here is with the pglz compression. Don't try to "force" the format of your data into something uglier or slower just to compensate for pglz.

    Also, IIRC, isn't the current space usage stil- much better than what you'd typically see when comparing to MongoDB? In that case, why all hair-splitting and fuss?

    1. Because it's considerably bigger than the space requirements for JSON-text, and users will be surprised by that.

  12. If the idea is that JSON (text) is very compressible but slow, then it seems almost redundant to make that same trade-off again with JSONB. It would make JSONB less compelling as an alternative. If people bother to choose JSONB instead of JSON, then it's because they're fully intending to optimize for speed rather than space (otherwise just use JSON). If JSONB allows better indexing and that's one reason why you'd use it, then that's already a speed/space trade-off anyway.