If anyone is interested, I have a rudimentary Perl script that can read
the downloadable SQL dump and output all the articles as separate files
in a number of alphabetical directories. It's not very fast, but it works.
What's missing from the script: wikimarkup -> HTML conversion,
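For anyone who wants to experiment before that script is posted, here is a
minimal sketch of the same idea. It assumes the old cur table layout
(id, namespace, title, text as the first four columns) and handles SQL
quoting naively, so a real dump needs more careful parsing:

#!/usr/bin/perl
# Sketch: split a MediaWiki SQL dump into one file per article,
# bucketed into single-letter directories. The column order and the
# quoting rules below are assumptions, not a tested parser.
use strict;
use warnings;
use File::Path qw(mkpath);

while (my $line = <>) {
    next unless $line =~ /^INSERT INTO cur VALUES/;
    # Each row looks roughly like (id,ns,'Title','Text',...).
    while ($line =~ /\((\d+),(\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'/g) {
        my ($ns, $title, $text) = ($2, $3, $4);
        next unless $ns == 0;                    # main namespace only
        $text =~ s/\\(['"\\])/$1/g;              # undo SQL escaping
        (my $fname = $title) =~ s{[^\w.-]}{_}g;  # make a safe filename
        my $dir = uc substr($fname, 0, 1);
        $dir = '_' unless $dir =~ /^[A-Z]$/;     # non-alphabetic bucket
        mkpath($dir) unless -d $dir;
        open my $fh, '>', "$dir/$fname.txt" or next;
        print {$fh} $text;
        close $fh;
    }
}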
Mr David A. Wheeler,
Have you seen my Perl script for converting the SQL dump to a
TomeRaider database? You might find useful code there.
It renders all pages in HTML, checks all hyperlinks, and unlinks half a
million orphaned ones. It edits the wiki code to remove redundant tags,
fixes some badly coded HTML tables, and adds stats and a language-specific
introduction. It replaces HTML tags with extended ASCII (which saves a lot
of space). It resolves redirects, thus making hyperlinks point directly to
the proper article. It removes tables that contained only an image (plus
possibly a single footer text).
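As one illustration of the steps above, here is a minimal sketch of a
redirect-resolution pass. It assumes a %redirect map (title -> target) was
already collected from the dump in an earlier pass; the loop guard and the
link regex are simplifications, not the actual WikiToTome.pl code:

use strict;
use warnings;

# Toy redirect map; the real one is built from #REDIRECT pages in the dump.
my %redirect = (
    'UK'      => 'United Kingdom',
    'Britain' => 'United Kingdom',
);

# Follow redirect chains to the final target, guarding against loops.
sub resolve {
    my ($title) = @_;
    my %seen;
    while (exists $redirect{$title} && !$seen{$title}++) {
        $title = $redirect{$title};
    }
    return $title;
}

# Rewrite [[Target]] and [[Target|Label]] links in a page's wiki text.
sub rewrite_links {
    my ($text) = @_;
    $text =~ s/\[\[([^\]|]+)(\|[^\]]*)?\]\]/'[[' . resolve($1) . (defined $2 ? $2 : '') . ']]'/ge;
    return $text;
}

print rewrite_links("See [[UK]] and [[Britain|British Isles]].\n");
# prints: See [[United Kingdom]] and [[United Kingdom|British Isles]].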
In fact, I think the script could be extended to generate separate HTML
pages in a few hours (Plucker specifics not taken into account).
Script:
http://members.chello.nl/epzachte/Wikipedia/WikiToTome.pl
More info:
http://members.chello.nl/epzachte/Wikipedia
Erik Zachte