Thursday, April 3, 2014

Simplifying Replication Position

Two years ago I wrote a post explaining how to determine relative log position in order to determine which replica was furthest ahead.  This is so you can choose the "best" replica to fail over.

Thanks to the miracle of PostgreSQL rapid development (Thanks Euler!), those complicated calculations are no longer necessary if you're using PostgreSQL 9.2 or later.  Particularly, this function makes your life much easier:

pg_xlog_location_diff(loc1, loc2)

So, reprising my old post, we want to determine three things, and can now do it in three one-liners:

Determine the absolute position in the transaction log, so that all replicas can be compared:

SELECT pg_xlog_location_diff ( pg_current_xlog_location(), '0/0000000' );

Determine the replay lag in megabytes, to see roughly how long it'll take this replica to "catch up":

SELECT pg_xlog_location_diff ( pg_last_xlog_receive_location(),  pg_last_xlog_replay_location() ) / 1000000;

Determine if a server is completely "caught up":

SELECT pg_xlog_location_diff ( pg_last_xlog_receive_location(),  pg_last_xlog_replay_location() ) = 0;

See, things have gotten much simpler!

BTW, the above has just been incorporated into HandyRep, in the select_replica_furthest_ahead plugin.

Tuesday, April 1, 2014

New New Index Bloat Query

Earlier this week Ioguix posted an excellent overhaul of the well-known Index Bloat Estimation from check_postgres.  However, I felt that we needed several additional changes before the query is ready for me to use in our internal monitoring utilities, and thought I'd post our version here.

Here's the New New Index Bloat Query.

Here's what I changed:
  1. Had it pull btree indexes only, because the calculations don't work for GIN/GiST indexes.
  2. Moved all of the nested subqueries up into WITH clauses for easier readability and maitainability.   Since all supported Postgres versions now do CTEs, I didn't see any reason to work around them.
  3. Removed the \set statements, because they don't work for embedding the query in a script.  And they only need setting dynamically if you've compiled PostgreSQL with nonstandard options, anyway.
  4. Removed a couple of CASE statements aimed at 7.X support; who cares?
  5. Added some extra informational columns for using this in interactive mode: table size, pretty index size, and index scans.  This helps folks figure out whether to rebuild an index, ignore the bloat or to drop it.
  6. In the example query, filtering down to indexes with bloat over 50% and 50MB, which is our threshold for "significant bloat"
Example output:

-[ RECORD 27 ]+-----------------------------
database_name | prod-web

schema_name   | partitions
table_name    | transactions_201308
index_name    | transactions_201308_uuid_idx
bloat_pct     | 52.9
bloat_bytes   | 351649792
bloat_size    | 335 MB
index_bytes   | 664788992
index_size    | 634 MB
table_bytes   | 4570447872
table_size    | 4359 MB
index_scans   | 263

So this index would be a good candidate for deletion, since it's 50% bloated and seldom scanned.

Now, there's still more to do on this.  We need a similar query for GIN/GiST/SPGiST.  The query could use some more cleanup; removing one-letter table aliases, and GROUP BY 1,2,3,4 stuff.  But it's vastly improved for checking which of your indexes need VACUUM/REINDEX.  Thanks, Ioguix!

Wednesday, March 19, 2014

12-Step Program for Performance, with

Few of my readers will be familiar with Wanelo, the social shopping site, because you just aren't in its target market.  But is currently the darling of the dot-com world, or as Bloomberg puts it, "the next big thing in ecommerce".  And most importantly to you: they run PostgreSQL.

Last night Wanelo CTO Konstantin Gredeskoul described their "12-step Program for Application Performance" to a packed crowd at their HQ in San Francisco.  This is kind of a recipe book for scaling a social web site backed by Postgres.  Wanelo's stack includes Rails, PostgreSQL, Joyent, SOLR, Redis, Memcached, Manta, and mobile applications.

Video for the talk is here (sorry about quality, Google Hangouts has ... issues).  Konstantin's slides are here.

Monday, March 17, 2014

SFPUG Live Video: Wanelo's 12-Step Program

Tommorrow for San Francisco PostgreSQL User Group, recent dot-com success and social shopping site Wanelo will be explaining their 12-Step Program for Application Performance.  I can't give away the details, but it will involve replication, sharding, and the Joyent cloud.  If you can't make it in person (and you can't, it's full up), then join us on Google Hangout.

Note: this is our first time using Google Hangout to broadcast SFPUG.  It may or may not go well.  Don't expect any video before around 7:15PM Pacific.  If it works, there will also be archival video on YouTube.

Tuesday, March 11, 2014

Thursday, March 6, 2014

Time for a change of slogans?

So the folks at DatabaseFriends, a blog I was completely unaware of until now, posted a survey whose results surprised a lot of people.  In a "favorite database" survey, PostgreSQL came in ahead of MySQL and MariaDB combined.  And not just a little ahead, but 2X the votes.

These results don't particularly surprise me, at this point.  I've seen MySQL declining in popularity since before the Sun acquisition (and precipitiously after the Oracle acquisition), and even the heroic efforts of MariaDB/SkySQL isn't going to turn that trend around.  I think the booth picture above really says it all: that's in the middle of the show day at SCALE.  The "most popular database" booth is deserted by staff and attendees alike.  (To be fair, the SkySQL booth was somewhat busier).

Now, you can say what you want about the unscientific nature of the survey.  But the core PostgreSQL community didn't organize a write-in campaign; heck, we didn't even know about the survey until the results were posted.   And frankly, it's the same kind of thing I've seen from other sources: the Freshmeat stats from 2005-2010 for click-throughs (sadly, no longer online); the report from 451Research showing former MySQL users migrating to PostgreSQL. 

Now, MySQL still has a huge installed base (still, at this point, something like 4X that of PostgreSQL), so it's not going away any time soon.  And MariaDB is winning new users.  But I think it's time that MySQL had a new slogan, because it's pretty clear that the old one no longer applies.

Mind you, PostgreSQL could use a new slogan too ...

Friday, February 28, 2014

Keep Calm with the PostgreSQL Zazzle Store

The Keep Calm Just Use PostgreSQL shirt is now up at the PostgreSQL Zazzle store.  Didn't know we had a Zazzle store?  Most people don't, we don't advertise it much.

I've put a number of things up there, including:
I've left the customization buttons on, so that you can change style, size, and items you want your PostgreSQL propaganda on.  20% of all purchases is a donation to PostgreSQL; most of that goes towards buying gear for PostgreSQL trade show booth volunteers.  And yes, I've provided a few examples of swag on women's sizes.

Since this is Zazzle, it's really only good for PostgreSQL users in the USA, though.  I understand that the Europeans have their own swag store; hopefully one will post a link in the comments.