Over the last decade, Greenplum, Vertica, Everest, ParAccel, and a number of non-public projects all forked from PostgreSQL. In each case, one of the major changes in the fork was a radical change to the data storage structures in order to enable new functionality or much better performance on large data. In general, once a Postgres fork goes through the storage change, its developers stop contributing back to the main project because their codebase is then different enough to make merging very difficult.
Considering the amount of venture capital money poured into these forks, that's a big loss of feature contributions to the community, especially when the startup in question gets bought out by a company that buries it or loots it for IP and then kills the product.
More importantly, we have a number of people who would like to do something interesting and substantially different with PostgreSQL storage, and who will likely be forced to fork PostgreSQL to get their ideas to work. Index-organized tables, fractal trees, JSON trees, EAV-optimized storage, non-MVCC tables, column stores, hash-distributed tables and graphs all require changes to storage which can't currently be fit into the model of index classes and blobs we offer for extensibility of data storage. In the future, transactional RAM and persistent RAM may prompt other incompatible storage changes.
As a community, we want to capture these innovations and make them part of mainstream Postgres, and their users part of the PostgreSQL community. The only way to do this is to have some form of pluggable storage, just like we have pluggable function languages and pluggable index types.
The direct way to do this would be to refactor our code to replace all direct manipulation of storage and data pages with a well-defined API. This would be extremely difficult, and would cause significant performance problems in the first few versions. It would, however, also have the advantage of allowing us to completely solve the problem of binary upgrades across page format changes.
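To make the idea of a well-defined storage API a little more concrete, here is a toy sketch in Python. It is not PostgreSQL code and not a proposal for the real interface: every name in it (`TableStorage`, `RowStorage`, `ColumnStorage`, `count_where`) is hypothetical and invented for illustration. It only shows the shape of the refactoring, where the calling code talks to an abstract interface and never touches the underlying layout.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


class TableStorage(ABC):
    """Hypothetical storage API: callers use only these three methods."""

    @abstractmethod
    def insert(self, row: Row) -> int:
        """Store a row and return an opaque row identifier."""

    @abstractmethod
    def fetch(self, row_id: int) -> Row:
        """Return the row stored under row_id."""

    @abstractmethod
    def scan(self) -> Iterator[Row]:
        """Yield every row in insertion order."""


class RowStorage(TableStorage):
    """Row-oriented layout: each row is kept together."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def insert(self, row: Row) -> int:
        self._rows.append(dict(row))
        return len(self._rows) - 1

    def fetch(self, row_id: int) -> Row:
        return dict(self._rows[row_id])

    def scan(self) -> Iterator[Row]:
        for row in self._rows:
            yield dict(row)


class ColumnStorage(TableStorage):
    """Column-oriented layout: one list of values per column."""

    def __init__(self, columns: List[str]) -> None:
        self._columns: Dict[str, List[Any]] = {name: [] for name in columns}
        self._count = 0

    def insert(self, row: Row) -> int:
        for name, values in self._columns.items():
            values.append(row.get(name))
        self._count += 1
        return self._count - 1

    def fetch(self, row_id: int) -> Row:
        if not 0 <= row_id < self._count:
            raise IndexError(row_id)
        # Reassemble the row from the per-column lists.
        return {name: values[row_id] for name, values in self._columns.items()}

    def scan(self) -> Iterator[Row]:
        for row_id in range(self._count):
            yield self.fetch(row_id)


def count_where(storage: TableStorage, column: str, value: Any) -> int:
    """Stand-in for executor code: it works against any TableStorage."""
    return sum(1 for row in storage.scan() if row.get(column) == value)
```

The point of the sketch is that `count_where` gives the same answer whichever layout sits underneath it. The hard part in the real codebase is that a great deal of existing code manipulates data pages directly rather than going through anything like `TableStorage`.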
A second approach would be to do a MySQL and build up Foreign Data Wrappers (FDWs) to the point where they could perform and behave like local tables. This may be the more feasible route because the work could be done incrementally and FDWs already have a well-defined API. However, having Postgres run administration and maintenance of foreign tables would be a big step and is conceptually difficult.
Either way, this is a problem we need to solve long-term in order to continue expanding the places people can use PostgreSQL.