On 21/07/2009, at 6:48 PM, Daniel Schwen wrote:
>>> wouldn't it be faster than to actually create a static HTML dump
>>> the traditional way?
>>
>> The content is wiki-text. It has to be parsed to be turned into
>> HTML. There isn't a more traditional way, because there is no other way.
> Wouldn't it be possible to dump the parser cache instead of dumping
> XML and reparsing? All the parsing work is already done on the
> Wikimedia servers, so why do it again on a slow desktop system?
For a few reasons:
1/ There's no reason to expect that the contents of every page,
revision, et cetera, would be in the parser cache.
2/ Deleted or otherwise private revision content may remain in the
parser cache.
3/ There would be a lot of redundant content in the parser cache,
owing to people browsing with the same options.
4/ None of the useful article metadata is stored in the parser cache.
5/ The parser cache is stored in memcached, a hash-based system that
cannot simply be "dumped", let alone dumped while selectively
excluding everything else stored in memcached (including quite a bit
of private data).
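A minimal sketch of why point 5 holds: a memcached-style store only supports get/set by exact key, with no operation to enumerate keys, so there is nothing to iterate over when "dumping". The class below is a toy stand-in, and the parser-cache key shown is illustrative, not an exact MediaWiki key.

```python
class KeyValueCache:
    """Toy stand-in for a memcached-style cache (illustrative only)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        # Returns None on a miss, like a cache lookup.
        return self._store.get(key)

    # Deliberately no keys()/items(): as with memcached, you can only
    # retrieve a value if you already know its exact key.


cache = KeyValueCache()
cache.set("pcache:idhash:12345-0!canonical", "<p>parsed HTML...</p>")

# To read anything back, you must reconstruct the key yourself:
print(cache.get("pcache:idhash:12345-0!canonical"))
```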
It might, however, be sensible to generate the parsed HTML for every
page, save the files in a directory, and then zip it up.
Oh, wait...
I always thought it would be much more useful to generate the HTML of
action=render for every page, rather than generating action=view
output, with the markup for one specific skin, a million or so times;
that skin markup is a pain to strip out if you want to do anything
other than open the HTML in a browser.
(-:
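The action=render idea above can be sketched as a small script that builds the skin-free URL for each page title. The base URL and page titles here are assumptions for illustration; the network fetch is left commented out so the sketch stays self-contained.

```python
from urllib.parse import quote

# Assumed MediaWiki entry point; adjust for the wiki being dumped.
BASE_URL = "https://en.wikipedia.org/w/index.php"


def render_url(title):
    """URL returning just the parsed article HTML, without skin chrome."""
    return f"{BASE_URL}?title={quote(title)}&action=render"


for title in ["Main Page", "MediaWiki"]:
    print(render_url(title))
    # To actually save the skin-free HTML, one file per page:
    # import urllib.request
    # html = urllib.request.urlopen(render_url(title)).read()
    # open(title.replace(" ", "_") + ".html", "wb").write(html)
```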
Andrew Dunbar (hippietrail)
--
Andrew Garrett
agarrett(a)wikimedia.org
http://werdn.us/
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l