[Xmldatadumps-admin-l] Fwd: logging xml

Tomasz Finc tfinc at wikimedia.org
Fri May 15 18:59:42 UTC 2009


Platonides wrote:
> This should have been sent to the -admin.
> 
> Platonides wrote:
> 
> Looking at the different dumps, no pages-logging.xml.bz2
> <http://download.wikimedia.org/eswiki/20090504/eswiki-20090504-pages-logging.xml.bz2>
> seems to be right. All of them are 14 bytes, the compression of an empty
> file.

It's seemingly busted, and since no bug existed for it I've opened 
one to track progress:

https://bugzilla.wikimedia.org/show_bug.cgi?id=18808
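
The 14-byte figure quoted above is what bzip2 emits for zero bytes of 
input: a 4-byte header, a 6-byte end-of-stream marker and a 4-byte CRC. 
A minimal sketch for checking this locally follows. It assumes Python's 
standard bz2 module, and the helper name is_empty_bz2 is made up for 
illustration; it is not part of the dump scripts.

```python
import bz2

# Size of a bzip2 stream that encodes zero bytes of input:
# 4-byte header + 6-byte end-of-stream marker + 4-byte CRC.
EMPTY_BZ2_SIZE = len(bz2.compress(b""))


def is_empty_bz2(data: bytes) -> bool:
    """Return True if data is a valid bzip2 stream that decompresses to nothing.

    Hypothetical helper for illustration only.
    """
    if not data:
        # Zero bytes is not a bzip2 stream at all.
        return False
    try:
        return bz2.decompress(data) == b""
    except (OSError, ValueError):
        # Corrupt or truncated stream.
        return False


if __name__ == "__main__":
    print(EMPTY_BZ2_SIZE)                    # 14
    print(is_empty_bz2(bz2.compress(b"")))   # True
```

Reading a downloaded pages-logging.xml.bz2 in binary mode and passing 
its contents to is_empty_bz2 would show whether it really is just an 
empty stream.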

> However, there're quite big files logging.xml.gz
> <http://download.wikimedia.org/eswiki/20090504/eswiki-20090504-logging.xml.gz>
> at the 'Creating split stub dumps' section.
> 
> Are those logging.xml.gz
> <http://download.wikimedia.org/eswiki/20090504/eswiki-20090504-logging.xml.gz>
> files really stubs (what's missing?) or they're just misplaced?
> Should pages-logging.xml.bz2 contain something different?

After chatting with Brion on this one, we can't think of any reason 
why that separate step exists. Content-wise, 'logging.xml.gz' has 
everything that the 'Log events to all pages' step claims to provide.


> 
> I suspect that the proper file is the gz and the existence of the bz2
> is a mistake, but the xml logging files are quite new, and not too
> well documented, so I can't be sure.

This seems highly likely but I'm cc'ing Aaron just to make sure as 
according to Brion he wrote that step.

If we can confirm that 'pages-logging.xml.bz2' can be superseded by 
'logging.xml.gz' then I'll move the build steps around and clean up the page.

--tomasz


