Yes, I was wrong… I confused January and April because of the 4 in 24… dyslexia is a bitch ;^)

 

Best Regards, Nat

 

Senior Technical Staff Member, T.J. Watson Research, IBM

+1 860 812 5089

https://research.ibm.com/people/nathaniel-mills

 

 

From: Xabriel Collazo Mojica <xcollazo@wikimedia.org>
Date: Monday, April 1, 2024 at 11:32 AM
To: Nathaniel Mills <wnm3@us.ibm.com>
Cc: xmldatadumps-l@lists.wikimedia.org <xmldatadumps-l@lists.wikimedia.org>
Subject: [EXTERNAL] Re: [Xmldatadumps-l] April dump has extra 01 in directory and filename

Hi Nathaniel,

 

Seems like the pages-articles-multistream for enwiki is still running for the April (2024-04-01) run, so you wouldn't be able to download?

 

Or did you mean there is a naming issue with the 2024-01-01 run?

 

 

On Mon, Apr 1, 2024 at 11:21AM Nathaniel Mills <wnm3@us.ibm.com> wrote:

In order to wget the bz2 file we had to use a different URL pattern for April 2024:
https://dumps.wikimedia.org/enwiki/20240101/enwiki-20240101-pages-articles-multistream.xml.bz2

We used to use a pattern without the extra 01 suffix…
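For reference, the URL above can be rebuilt from a run date. This is only a minimal sketch: it assumes the eight digits in the path are the run date written as YYYYMMDD (so the trailing 01 would be the day of the month), and the `dump_url` helper and `BASE` constant are hypothetical names, not part of any Wikimedia tooling.

```python
from datetime import date

# Assumption: base host taken from the URL quoted in this message.
BASE = "https://dumps.wikimedia.org"


def dump_url(wiki: str, run: date) -> str:
    """Build the multistream articles dump URL for a wiki and run date.

    Assumes the directory and filename both carry the run date as YYYYMMDD,
    which is inferred from the single URL quoted above.
    """
    stamp = run.strftime("%Y%m%d")
    return f"{BASE}/{wiki}/{stamp}/{wiki}-{stamp}-pages-articles-multistream.xml.bz2"


if __name__ == "__main__":
    # Reproduces the URL quoted above (the 2024-01-01 run).
    print(dump_url("enwiki", date(2024, 1, 1)))
```

Passing `date(2024, 4, 1)` instead would give a path containing 20240401, which makes it easier to tell the January and April runs apart at a glance.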

 

Best Regards, Nat

 

Senior Technical Staff Member, T.J. Watson Research, IBM

+1 860 812 5089

https://research.ibm.com/people/nathaniel-mills

 

_______________________________________________
Xmldatadumps-l mailing list -- xmldatadumps-l@lists.wikimedia.org
To unsubscribe send an email to xmldatadumps-l-leave@lists.wikimedia.org


 

--

Xabriel J. Collazo Mojica (he/him)

Sr Software Engineer

Wikimedia Foundation