We do a lot of high-availablity PostgreSQL deployments for clients. As of 9.1, streaming replication (SR) is excellent for this, and can scale sufficiently to scale a client across and AWS node cluster with some simple tools to help manage it. But where we're keenly feeling the lack is simple load balancing and failover.
We use pgPool a lot for this, and once you've set it up it works. But pgPool suffers from runaway sporkism: it's a load balancer and a failover tool and multimaster replication and data partitioning and a cache and compatible with SR and Slony and Bucardo. If you need all of those things, it's great, but if you only need one of them you still get the complexity of all of them. It also suffers from having been designed around the needs of specific SRA customers, and not really for general use. We've been tinkering with it for a while, and I just don't see a way to "fix" pgPool for the general use case; we need something new, designed to be simple and limited to the 80% use-case from scratch.
What we really need is simple tool, or rather a pair of tools, designed to only do failover and read load-balancing, and only for PostgreSQL streaming replication. These tools should be stackable in any combination the user wants, like pgBouncer (and, for that matter, with pgBouncer). They should provide information via a web service, and be configurable via a web service.
I'll call the first tool "pgFailover". The purpose of this tool is to manage a master-replica group of servers, including managing both planned and unplanned failovers. You would only have one active pgFailover node within a group in order to avoid "split-brain" issues. It will not handle database connections at all.
pgFailover would track a master and several replicas. The status of each server would be monitored by polling both the replication information on each server, and pg_stat_replication on the master. This information would be provided to the network by pgFailover via a web service, and pgFailover would accept commands via the same webservice as well as on the local command line.
Based on user-configurable criterial, pgFailover would carry out any of the following operations: failover to a new master; remaster the other replicas; add a new replica, with or without data sync; resync a replica; or shut down a replica. It would also handle some situations automatically. In the event that user-configurable conditions of nonresponsiveness are met, it would fail over the master to the "best" replica. The failover replica would be decided based on either the configuration or based on which replica is most caught up according to replication timestamps. Likewise, replicas would be dropped from the availability list if they stop replicating for a certain period or become nonresponsive.
The second tool I'll call "pgBalancer", after the unreleased tool from Skype. pgBalancer would just do load-balancing of database connections across the replicated servers. It won't deal with failover or monitoring the servers at all; instead, it relies entirely on pgFailover for that. This allows users to have several separate pgBalancer servers, supporting both high availabilty and complex load-balancing configurations.
Since automated separation of read and write queries is an impossible problem, we won't even try to solve it. Instead, we'll rely on the idea that the application knows whether it's reading or writing, and provide two separate database connection ports, one for read-write (RW) connections, and one for read-only (RO) connections. pgBalancer would obtain information on what servers to connect to by either a configuration file, or by querying a pgFailover server for information via its web service. RO connections would be load-balanced across available servers, based on a simple "least active" plus "round-robin" algorithm.
pgBalancer would also accept a variety of commands via web service, including: suspend a service or all services; disconnect a specific connection or all connections; failover the write node to specific new server; drop a server from the load balancing list; add a server to the load balancing list; give a list of connections; give a list of servers.
If we could build two tools which more-or-less match the specification above, I think they would go a lot further towards supporting the kinds of high-availability, horizontally-scaled PostgreSQL server clusters we want to use for the applications of the future. Time permitting, I'll start work on pgFailover. A geek can dream, can't he?