Hello,
a new dump of Wikidata in HDT (with index) is available at
<http://www.rdfhdt.org/datasets/>. You will see how Wikidata has become
huge compared to other datasets: it now contains roughly twice the
4-billion-triple limit discussed above.
In this regard, what is the most user-friendly way to use this format
in 2018?
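For example, would the Python bindings be an option? A minimal sketch,
assuming the pyHDT package ("hdt" on PyPI) and a local copy of the dump
named wikidata.hdt (the file name is made up):

    from hdt import HDTDocument

    # Load the dump; pyHDT should pick up (or build) the side-car
    # index file used to speed up lookups.
    doc = HDTDocument("wikidata.hdt")

    # Empty strings are wildcards: this pattern matches every triple.
    triples, cardinality = doc.search_triples("", "", "")
    print(cardinality)       # number of matching triples
    for s, p, o in triples:
        print(s, p, o)
        break                # just show the first one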
BR,
Ettore
On Tue, 7 Nov 2017 at 15:33, Ghislain ATEMEZING <
ghislain.atemezing(a)gmail.com> wrote:
Hi Jeremie,
Thanks for this info.
In the meantime, what about splitting the dataset into chunks of 3.5
billion triples (or any size below 4 billion) and writing a script to
convert each chunk? Would that be possible?
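Something along these lines could drive it (a rough sketch, assuming a
one-triple-per-line N-Triples dump and the rdf2hdt tool from hdt-cpp on
the PATH; the chunk file names are made up):

    import subprocess

    CHUNK_SIZE = 3_500_000_000  # triples per chunk, below the 4B limit

    def split_and_convert(nt_path):
        """Split an N-Triples dump into chunks and convert each
        chunk to HDT with rdf2hdt."""
        part, count, out = 0, 0, None
        with open(nt_path, "rb") as src:
            for line in src:
                if out is None:
                    out = open("chunk-%03d.nt" % part, "wb")
                out.write(line)  # one triple per line in N-Triples
                count += 1
                if count == CHUNK_SIZE:
                    out.close()
                    subprocess.run(["rdf2hdt", "chunk-%03d.nt" % part,
                                    "chunk-%03d.hdt" % part], check=True)
                    part, count, out = part + 1, 0, None
        if out is not None:  # convert the last, partial chunk
            out.close()
            subprocess.run(["rdf2hdt", "chunk-%03d.nt" % part,
                            "chunk-%03d.hdt" % part], check=True)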
Best,
Ghislain
Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986>
for Windows 10
*From:* Jérémie Roquet <jroquet(a)arkanosis.net>
*Sent:* Tuesday, 7 November 2017 15:25
*To:* Discussion list for the Wikidata project
<wikidata(a)lists.wikimedia.org>
*Subject:* Re: [Wikidata] Wikidata HDT dump
Hi everyone,
I'm afraid the current implementation of HDT is not ready to handle
more than 4 billion triples, as it is limited to 32-bit indexes. I've
opened an issue upstream:
https://github.com/rdfhdt/hdt-cpp/issues/135
Until this is addressed, don't waste your time trying to convert the
entire Wikidata dataset to HDT: it can't work.
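For context, the ceiling follows directly from the index width; a
one-line check in Python:

    # A 32-bit unsigned index can address at most 2**32 entries:
    print(2**32)  # 4294967296, i.e. about 4.29 billion triples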
--
Jérémie
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata