Daniel Herding wrote:
but the databases are already fucked up. The XML export special page (which
is used by interwiki.py) gives out crappy XML, which leads to a SAX parse
bug. And my newly created sqldump.py is unusable for these wikis.
So I guess we should ask the MediaWiki developers to help us out. Maybe they
can shut down the wiki for a while, then run over the 'old' database,
replacing every byte that is not valid UTF-8 with a question mark.
Add a quick output filter to Special:Export; Unicode reserves a particular
character for standing in for invalid input, U+FFFD REPLACEMENT CHARACTER
(check the Unicode specs).
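
A rough sketch of what such a filter could look like, written in Python
(the language of interwiki.py) rather than in MediaWiki's own code, and
purely as an illustration. The function names are hypothetical. It assumes
the stored text is meant to be UTF-8, substitutes U+FFFD for every invalid
byte sequence, and also replaces the control characters that XML 1.0
forbids, since those can trip up a SAX parser as well.

```python
import re

# Characters XML 1.0 does not allow in a document: C0 controls other than
# tab, LF and CR, plus the noncharacters U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def clean_for_export(raw: bytes) -> str:
    """Hypothetical output filter: decode as UTF-8, marking bad input."""
    # errors="replace" makes the decoder emit U+FFFD for invalid sequences
    # instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="replace")
    # Valid UTF-8 can still carry characters XML forbids; replace those too.
    return _XML_ILLEGAL.sub(REPLACEMENT, text)


def clean_with_question_marks(raw: bytes) -> bytes:
    """Hypothetical one-off repair: same cleaning, but write '?' instead.

    Note that a U+FFFD already present in the text is also turned into '?'.
    """
    return clean_for_export(raw).replace(REPLACEMENT, "?").encode("utf-8")
```

The first function fits an export-time filter, which leaves the database
untouched; the second is closer to the question-mark repair proposed above
and would change the stored text for good.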
It would be nice if someone could repair the databases on fr:, nds:, es:, and
other affected Wikipedias. You should also consider implementing a filter
that stops users from posting illegal characters. Mail me if you need
additional information.
Patches are welcome, please send them to wikitech-l.
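
As a starting point for such a patch, here is a minimal sketch of the kind
of check an input filter might perform. It is in Python and only
illustrative: find_illegal is a hypothetical name, not an existing
MediaWiki function, and it assumes submissions arrive as bytes that are
supposed to be UTF-8. Unlike an output filter, it rejects bad input
instead of silently rewriting it.

```python
from typing import Optional


def find_illegal(raw: bytes) -> Optional[str]:
    """Hypothetical check: describe the first problem, or return None."""
    try:
        # Strict decoding raises on any byte sequence that is not UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return f"invalid UTF-8 at byte offset {err.start}"

    for index, ch in enumerate(text):
        code = ord(ch)
        # XML 1.0 forbids C0 controls other than tab, LF and CR ...
        if code < 0x20 and ch not in "\t\n\r":
            return f"control character U+{code:04X} at character {index}"
        # ... and the noncharacters U+FFFE and U+FFFF.
        if code in (0xFFFE, 0xFFFF):
            return f"noncharacter U+{code:04X} at character {index}"
    return None
```

A caller would refuse to save the edit whenever the result is not None and
show the message to the user, so nothing invalid reaches the database.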
-- brion vibber (brion @ pobox.com)