There are two projects I am exploring that might make MediaWiki more
robust in handling the immense load being placed on it.
One is a one- or two-pass wiki parser. The current parser makes dozens of
passes over the text, so its cost probably grows with the square of the
file size. I have added some thoughts at
http://meta.wikipedia.org/wiki/One-pass_parser
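
To make the idea concrete, here is a minimal sketch of single-pass
rendering. It is purely illustrative, written in Python rather than PHP,
covers only a tiny subset of the markup, and the name render_single_pass
is mine, not anything in the MediaWiki source. The point is that the text
is scanned once and output is emitted as the scanner goes, so the work
grows roughly linearly with page size instead of with repeated whole-text
passes.

# Illustrative single-pass converter for a tiny subset of wiki markup:
# ''italic'', '''bold''', and [[internal links]]. Not the real parser.
def render_single_pass(text):
    out = []                   # output fragments, joined once at the end
    i = 0
    bold = italic = False
    n = len(text)
    while i < n:
        if text.startswith("'''", i):        # bold toggle
            out.append("</b>" if bold else "<b>")
            bold = not bold
            i += 3
        elif text.startswith("''", i):       # italic toggle
            out.append("</i>" if italic else "<i>")
            italic = not italic
            i += 2
        elif text.startswith("[[", i):       # internal link
            end = text.find("]]", i + 2)
            if end == -1:                    # unterminated: emit literally
                out.append(text[i:])
                break
            title = text[i + 2:end]
            out.append('<a href="/wiki/%s">%s</a>'
                       % (title.replace(" ", "_"), title))
            i = end + 2
        else:                                # ordinary character
            out.append(text[i])
            i += 1
    return "".join(out)

print(render_single_pass("''Hello'' '''world''' and [[Main Page]]"))

Each character is examined only a bounded number of times, which is what
keeps the cost from blowing up on large pages.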
The other is storing diffs in the 'old' table. This would not affect
performance except when loading or comparing old revisions, but it could
drastically reduce the size of the database, which would make it far more
manageable. There is already a differencing engine in the source, though
I'm not sure how reliable it is; it may also degrade with the square of
the size of the difference. Here too, a sequence of diffs can be merged
in one pass (see the sketch below).
Having written such code in the past, I plan to put together a write-up
exploring this idea. Has it been discussed among the developers? What are
the gotchas?
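
For anyone who wants to picture the storage scheme, here is a minimal
sketch (Python again, purely illustrative) of keeping revisions as
line-based deltas and rebuilding a revision by applying them in order.
The names make_delta and apply_delta are mine, not anything in the
MediaWiki source, and this sketch applies the deltas sequentially rather
than doing the one-pass merge described above.

import difflib

def make_delta(old_lines, new_lines):
    """Delta as a list of (start, end, replacement_lines) ops on old_lines."""
    ops = []
    sm = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":                   # only store the changed spans
            ops.append((i1, i2, new_lines[j1:j2]))
    return ops

def apply_delta(old_lines, ops):
    """Rebuild the newer revision from the older one plus the stored ops."""
    result, pos = [], 0
    for i1, i2, replacement in ops:
        result.extend(old_lines[pos:i1])     # copy unchanged prefix
        result.extend(replacement)           # splice in the new text
        pos = i2                             # skip the replaced span
    result.extend(old_lines[pos:])
    return result

rev1 = ["== Intro ==", "Hello world.", "Unchanged line."]
rev2 = ["== Intro ==", "Hello, wiki world.", "Unchanged line.", "New section."]
rev3 = ["== Intro ==", "Hello, wiki world.", "New section."]

deltas = [make_delta(rev1, rev2), make_delta(rev2, rev3)]

# Reconstruct rev3 from rev1 by applying each stored delta in turn; merging
# the deltas into a single delta first is the optimisation mentioned above.
current = rev1
for d in deltas:
    current = apply_delta(current, d)
assert current == rev3

The deltas are far smaller than the full revisions for typical edits,
which is where the database savings would come from.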
Nick Pisarro