[Xmldatadumps-admin-l] Archive -20100312- [31.9 GB] many revisions between 2005-01-10T and 2005-05-14 have empty text

Dmitry Chichkov dchichkov at gmail.com
Fri May 14 03:33:32 UTC 2010


Correction: subject should read "Archive -20100130- [31.9 GB] many revisions
between 2005-01-10T and 2005-05-14 have empty text".


On Thu, May 13, 2010 at 8:31 PM, Dmitry Chichkov <dchichkov at gmail.com> wrote:

> Here is how you can check this:
> 1) Extract the first ~700 MB from the archive (takes 1-10 seconds)
> # 7z e enwiki-20100130-pages-meta-history.xml.7z
> # Ctrl-C
>
> 2) Open the .xml file and search for 9450068
> You'll see that the revision text is missing.
>
> 3) Check that this revision is in fact not an empty one:
> http://en.wikipedia.org/w/index.php?oldid=9450068
>
> 4) Scroll down in the .xml file and see more empty and regular revisions.
>
> -- Dmitry
>
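Steps 2 and 4 above can also be done mechanically. The following is only a
rough sketch, not a tested tool: it assumes the partially extracted file is
named enwiki-20100130-pages-meta-history.xml, that each <revision> carries
<id>, <timestamp> and <text> children as in the usual MediaWiki export
layout, and that a file cut short by Ctrl-C simply ends in a parse error.
The names local() and empty_revisions() are made up for this example.

```python
import xml.etree.ElementTree as ET


def local(tag):
    # ElementTree reports tags as "{namespace}name"; keep only "name".
    return tag.rsplit("}", 1)[-1]


def empty_revisions(source):
    """Yield (revision_id, timestamp) for revisions whose <text> is empty.

    `source` is a filename or a binary file object. A truncated file is
    expected: the ParseError raised at the cut-off point is treated as
    end of input, so only fully closed <revision> elements are reported.
    """
    try:
        for _event, elem in ET.iterparse(source, events=("end",)):
            if local(elem.tag) != "revision":
                continue
            rev_id = timestamp = None
            text = ""
            # Only direct children, so the contributor's <id> is ignored.
            for child in elem:
                name = local(child.tag)
                if name == "id":
                    rev_id = child.text
                elif name == "timestamp":
                    timestamp = child.text
                elif name == "text":
                    text = child.text or ""
            if not text.strip():
                yield rev_id, timestamp
            elem.clear()  # drop the revision body to limit memory use
    except ET.ParseError:
        return


if __name__ == "__main__":
    # Assumed file name; adjust to whatever 7z actually wrote.
    path = "enwiki-20100130-pages-meta-history.xml"
    for rev_id, timestamp in empty_revisions(path):
        # ISO 8601 timestamps in the same format compare correctly as strings.
        if timestamp and "2005-01-10" <= timestamp[:10] <= "2005-05-14":
            print(rev_id, timestamp)
```

Any id printed this way still needs the check from step 3, since a revision
can legitimately be empty on the live site as well.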