[Xmldatadumps-admin-l] [Wikitech-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

Tomasz Finc tfinc at wikimedia.org
Thu Mar 11 03:54:21 UTC 2010


Yup, that's the one. If you have a fast upload pipe then I'm more than 
happy to set up space for it. Otherwise it should be arriving in our 
snail mail after a couple of days.

-tomasz

Kevin Webb wrote:
> Many thanks to everyone involved.
> 
> Also, in case it's of use to anyone I have a copy of the
> enwiki-20080103-pages-meta-history.xml dump in 7z form. Is that the
> backup that's being referred to or is it in fact 20081003?
> 
> kpw
> 
> On Wed, Mar 10, 2010 at 10:20 PM, Tomasz Finc <tfinc at wikimedia.org> wrote:
>> Thankfully due to an awesome volunteer we'll be able to get that 2008
>> snapshot in our archive. I'll mail out when it shows up in our snail mail.
>>
>> --tomasz
>>
>> Erik Zachte wrote:
>>> I'm thrilled. Big thanks to Tim and Tomasz for pulling this off.
>>> For the record the 2008-10-03 dump existed for a short while only.
>>> It evaporated before wikistats and many others could parse it,
>>> so now we can finally catch up from 3.5 (!) years backlog.
>>>
>>> Erik Zachte
>>>
>>>> -----Original Message-----
>>>> From: wikitech-l-bounces at lists.wikimedia.org
>>>> [mailto:wikitech-l-bounces at lists.wikimedia.org] On Behalf Of Tomasz Finc
>>>> Sent: Thursday, March 11, 2010 4:11
>>>> To: Wikimedia developers; xmldatadumps-admin-l at lists.wikimedia.org;
>>>> xmldatadumps at lists.wikimedia.org
>>>> Subject: [Wikitech-l] 2010-03-11 01:10:08: enwiki Checksumming
>>>> pages-meta-history.xml.bz2 :D
>>>>
>>>> New full history en wiki snapshot is hot off the presses!
>>>>
>>>> It's currently being checksummed, which will take a while for 280GB+ of
>>>> compressed data, but for those brave souls willing to test, please grab
>>>> it from
>>>>
>>>> http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-pages-meta-history.xml.bz2
>>>>
>>>> and give us feedback about its quality. This run took just over a month
>>>> and gained a huge speedup after Tim's work on re-compressing ES. If we
>>>> see no hiccups with this data snapshot, I'll start mirroring it to other
>>>> locations (Internet Archive, Amazon Public Data Sets, etc.).
>>>>
>>>> For those not familiar, the last successful run that we've seen of this
>>>> data goes all the way back to 2008-10-03. That's over 1.5 years of
>>>> people waiting to get access to these data bits.
>>>>
>>>> I'm excited to say that we seem to have it :)
>>>>
>>>> --tomasz
>>>>
>>>> _______________________________________________
>>>> Wikitech-l mailing list
>>>> Wikitech-l at lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>
>>>
>>> _______________________________________________
>>> Xmldatadumps-admin-l mailing list
>>> Xmldatadumps-admin-l at lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-admin-l
>>



