If HDT remains unfeasible, would it be an idea to place the Blazegraph journal file
online?
Yes, people would need to use Blazegraph to access and query the file, but it could be
offered as an extra alongside the Turtle dump.
On 27 Oct 2017, at 17:08, Laura Morales
<lauretas(a)mail.com> wrote:
Hello everyone,
I'd like to ask if Wikidata could please offer an HDT [1] dump along with the already
available Turtle dump [2]. HDT is a binary format to store RDF data, which is pretty
useful because it can be queried from command line, it can be used as a Jena/Fuseki
source, and it also uses orders-of-magnitude less space to store the same data. The
problem is that it's very impractical to generate an HDT file, because the current
implementation requires a lot of RAM to convert a file. For Wikidata it will
probably require a machine with 100-200GB of RAM. This is unfeasible for me because I
don't have such a machine, but if you guys have one to share, I can help set up the
rdf2hdt software required to convert the Wikidata Turtle dump to HDT.
Thank you.
[1] http://www.rdfhdt.org/
[2] https://dumps.wikimedia.org/wikidatawiki/entities/
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata