Hi Jeremie,

Thanks for this info.

In the meantime, what about splitting the dataset into chunks of 3.5 billion triples (or any size below 4 billion) and writing a script to do the conversion chunk by chunk? Would that be possible? A rough sketch of what I have in mind follows.
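
To make the idea concrete, here is a very rough Python sketch of such a splitting script. It assumes the dump is available as gzipped N-Triples (one triple per line), so a chunk is simply a fixed number of lines; the file names and the chunk size are placeholders to adapt, and the conversion of each chunk is only hinted at in a comment.

#!/usr/bin/env python3
"""Sketch only: split a gzipped N-Triples dump into chunks of at most
MAX_TRIPLES triples each, so that every chunk stays below the 4-billion
limit of the current HDT implementation. File names and the chunk size
are placeholders, not the real Wikidata dump names."""

import gzip

DUMP = "wikidata-all.nt.gz"      # hypothetical input dump (gzipped N-Triples)
MAX_TRIPLES = 3_500_000_000      # 3.5 billion triples per chunk, below the 4-billion cap


def open_chunk(index):
    """Open a new gzipped N-Triples output file for chunk number `index`."""
    return gzip.open(f"wikidata-chunk-{index:03d}.nt.gz", "wt", encoding="utf-8")


def split_dump():
    chunk_index = 0
    triples_in_chunk = 0
    out = open_chunk(chunk_index)
    with gzip.open(DUMP, "rt", encoding="utf-8") as dump:
        for line in dump:                 # N-Triples is line-based: one triple per line
            if triples_in_chunk >= MAX_TRIPLES:
                out.close()
                chunk_index += 1
                triples_in_chunk = 0
                out = open_chunk(chunk_index)
            out.write(line)
            triples_in_chunk += 1
    out.close()
    # Each chunk could then be converted separately with the rdf2hdt tool
    # from hdt-cpp (exact command-line options to be checked against its help).


if __name__ == "__main__":
    split_dump()

Of course, how convenient it would be to query several HDT files instead of a single one is another question.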

 

Best,

Ghislain

 

Sent from Mail for Windows 10

 

From: Jérémie Roquet
Sent: Tuesday, 7 November 2017 15:25
To: Discussion list for the Wikidata project
Subject: Re: [Wikidata] Wikidata HDT dump

 

Hi everyone,

 

I'm afraid the current implementation of HDT is not ready to handle more than 4 billion triples, as it is limited to 32-bit indexes (which can address at most 2^32, i.e. roughly 4.29 billion, entries). I've opened an issue upstream: https://github.com/rdfhdt/hdt-cpp/issues/135

Until this is addressed, don't waste your time trying to convert the entire Wikidata dataset to HDT: it can't work.

 

--

Jérémie

 
