Brion Vibber wrote:
> Tim's started a script for this, it's in the maintenance directory in
> CVS. This is the development version of MediaWiki and is not directly
> usable yet on current page download dumps as the database format has
> changed.
I guess I'd better say a few words about it, since the topic keeps
coming up. The script produces quite nice HTML dumps, with HTML files
distributed between directories identified by the first two bytes. The
link URLs are rewritten appropriately. This is useful for English wikis
since it allows you to guess the URL, but it doesn't work so well for
languages with a different orthography. Splitting by character rather
than by byte could be useful; however, that would give a much broader
tree, especially for the CJK languages.
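To illustrate the idea (this is a hypothetical sketch, not the actual layout the script produces): nesting output directories by the first two bytes of the UTF-8 title splits evenly regardless of language, at the cost of unguessable hex directory names for non-ASCII titles.

```python
def dump_path(title: str, root: str = "articles") -> str:
    """Map a page title to an output path, nesting one directory per
    leading byte of the UTF-8 title (a sketch; names and layout are
    assumptions, not what HTMLDump.php does)."""
    name = title.replace(" ", "_")
    raw = name.encode("utf-8")
    # Hex-encode the bytes so multi-byte characters (CJK etc.) still
    # distribute across directories instead of collapsing into a few.
    d1 = format(raw[0], "02x")
    d2 = format(raw[1], "02x") if len(raw) > 1 else d1
    return f"{root}/{d1}/{d2}/{name}.html"

print(dump_path("Apple"))  # → articles/41/70/Apple.html
```

Splitting by character instead would mean deriving `d1`/`d2` from `name[0]` and `name[1]`, which keeps English URLs guessable but gives each CJK character its own branch, hence the much broader tree.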
It's currently clumsy to use, requiring you to move HTMLDump.php from
skins/disabled/ to skins/, which unfortunately enables the skin in the
user interface. Any user who changed their skin to it would find that
there are no user preference links allowing them to get back (the
interface is greatly stripped down). We need to distinguish between
skins that are installed and skins that users can select in preferences.
It rewrites stylesheet and image URLs to relative URLs in hard-coded
paths (../../../images and ../../../skins). This needs to be made more
flexible. It also doesn't yet rewrite URLs for images from Commons, or
provide a way to package those images.
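Making the rewriting flexible mostly means computing the `../` prefix from the page's own nesting depth instead of hard-coding it. A minimal sketch (the function names and the two-level layout are my assumptions, not the script's code):

```python
def relative_prefix(page_path: str) -> str:
    """Return the ../ run needed to climb from a dumped page back to
    the dump root, based on how deeply the page is nested."""
    depth = page_path.count("/")  # number of directories above the file
    return "../" * depth

def rewrite_url(abs_url: str, page_path: str) -> str:
    # Hypothetical helper: turn an absolute /skins/... or /images/...
    # URL into one relative to the page, replacing the hard-coded
    # ../../../skins and ../../../images paths.
    return relative_prefix(page_path) + abs_url.lstrip("/")

print(rewrite_url("/skins/monobook/main.css", "41/70/Apple.html"))
# → ../../skins/monobook/main.css
```

With this, moving pages to a deeper or shallower tree (byte- vs character-based, say) would not require touching the stylesheet rewriting at all.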
So it's a start, but there are still plenty of things to do. Luckily,
porting the parser to a different language isn't one of them. I'm not
working on it at the moment, so I won't mind if someone picks it up
where I left off.
-- Tim Starling