On 04/12/11 12:32, MZMcBride wrote:
> This may be a stupid question as I don't understand the mechanics
> particularly well, but... as far as I understand it, there's a Squid cache
> layer that contains the HTML output of parsed and rendered wikitext pages.
> This stored HTML is what most anonymous viewers receive when they access the
> site. Why can't that be dumped into an output file rather than running
> expensive and time-consuming HTML dump generation scripts?
>
> In other words, it's not as though the HTML doesn't exist already. It's
> served millions and millions of times each day. Why is it so painful to make
> it available as a dump?
Most of the code would be the same either way; it's just a bit more
flexible to do the parsing in the extension. That makes it easier to
change some details of the generated HTML, and it lets you avoid
polluting the caches with rarely-viewed pages. It's not especially
painful either way.
-- Tim Starling
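
[Editor's note: the trade-off Tim describes can be sketched as follows. This is a toy illustration, not MediaWiki code; `render_wikitext` and `Cache` are hypothetical stand-ins for the real parser and the Squid/Varnish HTML cache.]

```python
def render_wikitext(title, wikitext):
    # Stand-in for the real parser; MediaWiki's parser is far more complex.
    return f"<h1>{title}</h1><p>{wikitext}</p>"

class Cache:
    """Toy stand-in for a Squid-style HTML cache (hypothetical)."""
    def __init__(self):
        self.store = {}

    def get(self, title):
        return self.store.get(title)

    def put(self, title, html):
        self.store[title] = html

def dump_via_cache(pages, cache):
    # Cache-based dump: cheap for frequently-viewed pages, but every
    # rarely-viewed page misses, and rendering it just for the dump
    # leaves a cold entry in the cache (the "pollution" Tim mentions).
    out = {}
    for title, wikitext in pages.items():
        html = cache.get(title)
        if html is None:
            html = render_wikitext(title, wikitext)
            cache.put(title, html)  # side effect: cold page now cached
        out[title] = html
    return out

def dump_via_parser(pages):
    # Parser-based dump (the extension approach): re-renders everything,
    # leaving the cache untouched and free to tweak the generated HTML.
    return {title: render_wikitext(title, wikitext)
            for title, wikitext in pages.items()}
```

Both paths produce the same HTML for a given page; the difference is only where the rendering happens and what it does to the shared cache.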