If anyone is interested, I have a rudimentary Perl script that can read
the downloadable SQL dump and output all the articles as separate files
in a number of alphabetical directories. It's not very fast, but it works.
What's missing from the script: wikimarkup -> HTML conversion,
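For anyone who wants to experiment before that script is posted, here is a
minimal sketch of the same idea. It assumes the old cur table layout
(id, namespace, title, text as the first four columns) and handles SQL
quoting naively, so a real dump needs more careful parsing:

#!/usr/bin/perl
# Sketch: split a MediaWiki SQL dump into one file per article,
# bucketed into single-letter directories. The column order and the
# quoting rules below are assumptions, not a tested parser.
use strict;
use warnings;
use File::Path qw(mkpath);

while (my $line = <>) {
    next unless $line =~ /^INSERT INTO cur VALUES/;
    # Each row looks roughly like (id,ns,'Title','Text',...).
    while ($line =~ /\((\d+),(\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'/g) {
        my ($ns, $title, $text) = ($2, $3, $4);
        next unless $ns == 0;                    # main namespace only
        $text =~ s/\\(['"\\])/$1/g;              # undo SQL escaping
        (my $fname = $title) =~ s{[^\w.-]}{_}g;  # make a safe filename
        my $dir = uc substr($fname, 0, 1);
        $dir = '_' unless $dir =~ /^[A-Z]$/;     # non-alphabetic bucket
        mkpath($dir) unless -d $dir;
        open my $fh, '>', "$dir/$fname.txt" or next;
        print {$fh} $text;
        close $fh;
    }
}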
Mr David A. Wheeler,
Have you seen my Perl script for converting the SQL dump to a
TomeRaider database? You might find useful code there.
It renders all pages in HTML, checks all hyperlinks, and unlinks half a
million orphaned ones. It edits the wiki code to remove redundant tags,
fixes some badly coded HTML tables, and adds stats and a language-specific
introduction. It replaces HTML tags with extended ASCII (which saves a lot
of space). It resolves redirects, thus making hyperlinks point directly to
the proper article. It removes tables that contained only an image (plus
possibly a single footer text).
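As one illustration of the steps above, here is a minimal sketch of a
redirect-resolution pass. It assumes a %redirect map (title -> target) was
already collected from the dump in an earlier pass; the loop guard and the
link regex are simplifications, not the actual WikiToTome.pl code:

use strict;
use warnings;

# Toy redirect map; the real one is built from #REDIRECT pages in the dump.
my %redirect = (
    'UK'      => 'United Kingdom',
    'Britain' => 'United Kingdom',
);

# Follow redirect chains to the final target, guarding against loops.
sub resolve {
    my ($title) = @_;
    my %seen;
    while (exists $redirect{$title} && !$seen{$title}++) {
        $title = $redirect{$title};
    }
    return $title;
}

# Rewrite [[Target]] and [[Target|Label]] links in a page's wiki text.
sub rewrite_links {
    my ($text) = @_;
    $text =~ s/\[\[([^\]|]+)(\|[^\]]*)?\]\]/'[[' . resolve($1) . (defined $2 ? $2 : '') . ']]'/ge;
    return $text;
}

print rewrite_links("See [[UK]] and [[Britain|British Isles]].\n");
# prints: See [[United Kingdom]] and [[United Kingdom|British Isles]].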
In fact, I think the script could be extended to generate separate HTML
pages in a few hours (Plucker specifics not taken into account).
Script:
http://members.chello.nl/epzachte/Wikipedia/WikiToTome.pl
More info:
http://members.chello.nl/epzachte/Wikipedia
Erik Zachte