Wednesday, November 19, 2014

Good kernel, bad kernel

A month ago I got into an argument on IRC with Sergey about telling people to avoid kernel 3.2.  This turned out to be a very productive argument, because Sergey then went and did a battery of performance tests against various Linux kernels on Ubuntu. Go read it now, I'll wait.

My takeaways from this:

  • Kernel 3.2 is in fact lethally bad.
  • Kernel 3.13 is the best out of kernel 3.X so far.  I hope that this can be credited to the PostgreSQL team's work with the LSF/MM group.
  • No 3.X kernel yet has quite the throughput of 2.6.32, at least at moderate memory sizes and core counts.
  • However, kernel 3.13 has substantially lower write volumes at almost the same throughput.  This means that if you are write-bound on IO, 3.13 will improve your performance considerably.
  • If your database is mostly-reads and highly concurrent, consider enabling kernel.sched_autogroup_enabled.
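
If you want to see where a given box stands on the first and last points, here is a minimal sketch in Python. It assumes a Linux host where the setting is exposed at /proc/sys/kernel/sched_autogroup_enabled (which should be the case when the kernel was built with autogroup support); the function names are my own, purely for illustration, and it only reads the setting rather than changing it.

```python
import platform
import re
from pathlib import Path

# Assumed location of the autogroup switch on Linux; the file is absent
# if the kernel was built without autogroup support.
AUTOGROUP_PATH = Path("/proc/sys/kernel/sched_autogroup_enabled")


def parse_kernel_version(release):
    """Return (major, minor) from a string like '3.13.0-40-generic', or None."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_kernel_3_2(release):
    """True if the release string belongs to the 3.2 series."""
    return parse_kernel_version(release) == (3, 2)


def read_autogroup(path=AUTOGROUP_PATH):
    """Return True/False for the autogroup setting, or None if it cannot be read."""
    try:
        text = path.read_text()
    except OSError:
        return None
    return text.strip() == "1"


if __name__ == "__main__":
    release = platform.release()
    print("kernel release:", release)
    if is_kernel_3_2(release):
        print("warning: this is a 3.2 kernel")
    state = read_autogroup()
    if state is None:
        print("sched_autogroup_enabled: not available on this host")
    else:
        print("sched_autogroup_enabled:", "on" if state else "off")
```

Changing the setting takes root, for example with `sysctl -w kernel.sched_autogroup_enabled=1`. Whether it helps seems to depend on the workload, so test it against your own queries before leaving it on.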

Thanks a lot to Sergey for doing this testing, and thanks even more to the LSF/MM group for improving IO performance so much in 3.13.

6 comments:

  1. https://www.kernel.org/category/releases.html

    3.13 is not a longterm release!

    What about 3.14? Greg Kroah-Hartman seems like a true maintainer.

  2. I would love to see the same test with CentOS/RHEL 6 and 7 included.

  3. It could be very useful to have CentOS 5 included too.

    1. Be my guest! I'll happily link to you if you publish it.

  4. This reminds me of all the threads I started in the performance mailing list regarding 3.2 being terrible. We got around parts of it by adjusting scheduler migration cost, and other kernel knobs, but the memory management was abhorrent and completely impossible to circumvent.

    What's interesting is that we saw an almost 15% improvement by disabling autogrouping on an 80% read server. Of course, that too was on the 3.2 kernel, so who knows what we were actually fixing. :p

    1. Shaun: Yes, and we just tested autogrouping on high-speed replicas, and found no difference at all. I suspect that it depends on the exact nature of the queries and some of your other settings.
