Just a reminder of work in progress and general background, for those
who might be commenting without being aware of present work...
First, in MediaWiki 1.5 we've made a major schema change, intended to
reduce the number of data rows that have to be changed per operation and
to slim down the amount of data pulled per row when scanning metadata
that isn't bulk text.
Specifically, the 'cur' and 'old' tables are being split into 'page',
'revision', and 'text'. Lists of pages won't be trudging through large
page text fields, and operations like renames of heavily-edited pages
won't have to touch 15000 records. This will also give us the potential
to move the bulk text to a separate replicated object store to keep the
core metadata DBs relatively small and limber (and cacheable).
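The payoff of the split can be sketched with a toy schema; the table and
column names below are simplified stand-ins, not the actual MediaWiki
1.5 schema:

```python
import sqlite3

# Hypothetical, simplified version of the 1.5 split: page metadata
# lives in 'page', per-edit metadata in 'revision', and the bulk
# wikitext in 'text'. (Real MediaWiki column names differ.)
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE page     (page_id INTEGER PRIMARY KEY, title TEXT, latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_text_id INTEGER);
CREATE TABLE text     (text_id INTEGER PRIMARY KEY, body TEXT);
""")

# A heavily-edited page: one page row, many revision and text rows.
db.execute("INSERT INTO page VALUES (1, 'Main_Page', 3)")
for i in range(1, 4):
    db.execute("INSERT INTO text VALUES (?, ?)", (i, "wikitext version %d" % i))
    db.execute("INSERT INTO revision VALUES (?, 1, ?)", (i, i))

# A rename now touches exactly one row in 'page'; under the old
# cur/old layout, every stored revision row carried the title.
cur = db.execute("UPDATE page SET title = 'New_Title' WHERE page_id = 1")
print(cur.rowcount)  # rows touched by the rename
```

A page listing likewise only scans the slim 'page' table and never drags
the text blobs through the buffer pool.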
Talk to icee in #mediawiki if interested in the object store; he's
working on a prototype for us, to use for image uploads and potentially
bulk text storage.
Second, remember that each wiki's database is independent. It's very
likely that at some point we'll want to split out some of the larger
wikis to separate master servers; aside from localizing disk and cache
utilization, this could provide some fault isolation in that a failure
in one master would not affect the wikis running off the other master.
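Because each wiki's database is independent, splitting a big wiki onto
its own master is just a matter of pointing its configuration at a
different server. A minimal sketch, with entirely made-up server and
wiki names (not the actual Wikimedia configuration):

```python
# Hypothetical wiki -> master mapping; any wiki can be repointed at a
# different master without touching the others, since the databases
# share no tables.
DEFAULT_MASTER = "db-master-1"
WIKI_MASTERS = {
    "enwiki": "db-master-2",  # a large wiki split onto its own master
    "dewiki": "db-master-2",
}

def master_for(wiki):
    """Return the master database server for a given wiki."""
    return WIKI_MASTERS.get(wiki, DEFAULT_MASTER)

print(master_for("enwiki"))  # db-master-2
print(master_for("frwiki"))  # db-master-1
```

A failure of db-master-2 in this sketch leaves every wiki on
db-master-1 unaffected, which is the fault-isolation point above.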
Third, we're expecting to have at least two additional data centers
soon, in Europe and the US. Initially these are probably going to be
squid proxies, since that's easy for us to do (we have a small offsite
squid farm in France currently, in addition to the squids in the main
cluster in Florida), but local web server boxen pulling from locally
slaved databases, at least for read-only requests, are something we're
likely to see, to move more of the load off of the central location.
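The read-only routing that makes this work can be sketched as follows;
the region and server names are illustrative assumptions, not real
hostnames:

```python
# Hypothetical request router: read-only requests can be served by a
# replicated slave in the nearest data center, while writes still go
# to the central master. Names below are made up for illustration.
SLAVES = {"europe": "db-slave-eu", "us-east": "db-slave-use"}
MASTER = "db-master-fl"  # central master in Florida

def pick_db(region, read_only):
    """Route reads to a local slave if one exists; everything else
    (writes, or regions with no local slave) hits the master."""
    if read_only and region in SLAVES:
        return SLAVES[region]
    return MASTER

print(pick_db("europe", read_only=True))   # db-slave-eu
print(pick_db("europe", read_only=False))  # db-master-fl
```

Replication lag means a slave can serve slightly stale pages, which is
acceptable for anonymous read traffic but not for edits.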
Finally, people constantly bring up the 'PostgreSQL cures cancer'
bugaboo. 1.4 has experimental PostgreSQL support, which I'd like to see
as a first-class supported configuration for the 1.5 release. This is
only going to happen, though, if people pitch in to help with testing
and bug fixing, and of course run some benchmarks and failure-mode
tests against MySQL! If you ever want Wikimedia to consider switching, the
software needs to be available to make it work and it needs to be
demonstrated as a legitimate improvement with a feasible conversion.
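For anyone wondering what a useful benchmark contribution looks like,
here is one minimal shape for it; the workload callables are stand-ins,
and a real test would issue identical SQL to MySQL and PostgreSQL
through their respective client libraries:

```python
import time
import statistics

def benchmark(run_query, iterations=100):
    """Time a query workload repeatedly and return the median
    latency in seconds. Median resists outliers better than mean."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in workloads for illustration only; substitute real calls
# against each backend with the same schema and data loaded.
mysql_median = benchmark(lambda: sum(range(1000)))
pg_median = benchmark(lambda: sum(range(1000)))
print(mysql_median, pg_median)
```

Failure-mode tests (kill the server mid-write, fill the disk, pull a
slave out of replication) matter at least as much as raw speed numbers.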
Domas is the PostgreSQL partisan on our team and wrote the existing
PostgreSQL support. If you'd like to help you should probably track him
down; in #mediawiki you'll usually find him as 'dammit'.
-- brion vibber (brion @ pobox.com)