[Xmldatadumps-admin-l] [Wikitech-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

Kevin Webb kpwebb at gmail.com
Thu Mar 11 03:55:47 UTC 2010


It's in EC2 so I could get it to you in about 20 mins. Just hit me
with an email off-list with the desired destination...

kpw

On Wed, Mar 10, 2010 at 10:54 PM, Tomasz Finc <tfinc at wikimedia.org> wrote:
> Yup, that's the one. If you have a fast upload pipe then I'm more than happy
> to set up space for it. Otherwise it should be arriving in our snail mail
> after a couple of days.
>
> -tomasz
>
> Kevin Webb wrote:
>>
>> Many thanks to everyone involved.
>>
>> Also, in case it's of use to anyone I have a copy of the
>> enwiki-20080103-pages-meta-history.xml dump in 7z form. Is that the
>> backup that's being referred to or is it in fact 20081003?
>>
>> kpw
>>
>> On Wed, Mar 10, 2010 at 10:20 PM, Tomasz Finc <tfinc at wikimedia.org> wrote:
>>>
>>> Thankfully due to an awesome volunteer we'll be able to get that 2008
>>> snapshot in our archive. I'll mail out when it shows up in our snail
>>> mail.
>>>
>>> --tomasz
>>>
>>> Erik Zachte wrote:
>>>>
>>>> I'm thrilled. Big thanks to Tim and Tomasz for pulling this off.
>>>> For the record the 2008-10-03 dump existed for a short while only.
>>>> It evaporated before wikistats and many others could parse it,
>>>> so now we can finally catch up from 3.5 (!) years backlog.
>>>>
>>>> Erik Zachte
>>>>
>>>>> -----Original Message-----
>>>>> From: wikitech-l-bounces at lists.wikimedia.org [mailto:wikitech-l-bounces at lists.wikimedia.org] On Behalf Of Tomasz Finc
>>>>> Sent: Thursday, March 11, 2010 4:11
>>>>> To: Wikimedia developers; xmldatadumps-admin-l at lists.wikimedia.org;
>>>>> xmldatadumps at lists.wikimedia.org
>>>>> Subject: [Wikitech-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D
>>>>>
>>>>> New full history en wiki snapshot is hot off the presses!
>>>>>
>>>>> It's currently being checksummed which will take a while for 280GB+ of
>>>>> compressed data but for those brave souls willing to test please grab
>>>>> it from
>>>>>
>>>>> http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-pages-meta-history.xml.bz2
>>>>>
>>>>> and give us feedback about its quality. This run took just over a month
>>>>> and gained a huge speed up after Tim's work on re-compressing ES. If we
>>>>> see no hiccups with this data snapshot, I'll start mirroring it to
>>>>> other locations (Internet Archive, Amazon Public Data Sets, etc).
>>>>>
>>>>> For those not familiar, the last successful run that we've seen of this
>>>>> data goes all the way back to 2008-10-03. That's over 1.5 years of
>>>>> people waiting to get access to these data bits.
>>>>>
>>>>> I'm excited to say that we seem to have it :)
>>>>>
>>>>> --tomasz
>>>>>
>>>>> _______________________________________________
>>>>> Wikitech-l mailing list
>>>>> Wikitech-l at lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>>
>>>>
>>>> _______________________________________________
>>>> Xmldatadumps-admin-l mailing list
>>>> Xmldatadumps-admin-l at lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-admin-l
>>>
>
>


