[Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

Felipe Ortega glimmer_phoenix at yahoo.es
Thu Mar 11 15:48:35 UTC 2010



--- On Thu, 11/3/10, Tomasz Finc <tfinc at wikimedia.org> wrote:

> From: Tomasz Finc <tfinc at wikimedia.org>
> Subject: [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D
> To: "Wikimedia developers" <wikitech-l at lists.wikimedia.org>, xmldatadumps-admin-l at lists.wikimedia.org, xmldatadumps at lists.wikimedia.org
> Date: Thursday, 11 March 2010, 04:10
> New full history enwiki snapshot is hot off the presses!
> 
> It's currently being checksummed, which will take a while for 280GB+ of
> compressed data, but for those brave souls willing to test, please grab it
> from
> 
> http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-pages-meta-history.xml.bz2
> 
> and give us feedback about its quality. This run took just over a month
> and gained a huge speedup after Tim's work on re-compressing ES. If we
> see no hiccups with this data snapshot, I'll start mirroring it to other
> locations (Internet Archive, Amazon Public Data Sets, etc).

Really good news :-)

> 
> For those not familiar, the last successful run that we've seen of this
> data goes all the way back to 2008-10-03. That's over 1.5 years of
> people waiting to get access to these data bits.
> 

In fact, something went wrong with that one, as well. The last valid full dump (afaik) was 2008-03-03, containing data up to early January 2008.

> I'm excited to say that we seem to have it :)
> 

Let's cross our fingers. Congrats on the great job, guys!!

Felipe

> --tomasz
> 
> _______________________________________________
> Xmldatadumps-admin-l mailing list
> Xmldatadumps-admin-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-admin-l
> 


      


