Hi Bene,
On 29.05.2015 22:06, Bene* wrote:
> Hi Markus,
>
>> Maybe it could be more efficient to do some API requests to find the
>> right entities rather than filtering many alternative labels as part
>> of the query. It's not a pattern that we should encourage for
>> production ;-)
>
> I don't think it would be more efficient to do any api request before
> querying blazegraph because the api is way slower than the triple search
> in BlazeGraph. For exposing SPARQL for newcomers it would perhaps be
> nicer to add the actual Q/P ids instead of huge lists of UNIONs but for
> the internal query I guess passing everything to BlazeGraph and let it
> do the right things with it is imo better and more efficient.
Two reasons:
* Performance: It depends. The API might be slower in simple cases, but
every label that is part of the SPARQL query adds a join. At some
point, the query will just need too much memory to run at all, and fast
turns into impossible ;-). I think some of the queries the system
creates are already too big for BlazeGraph. If the API is really so
slow, one could use BlazeGraph like the API (issuing many small queries
to fetch IDs). But in the end, resolving the entities in a query needs
only very few API requests, and they are needed only once, when building
the query.
* Utility: This is the more important reason. Resolving the Qids and
Pids as part of the SPARQL generation process will make the tool more
useful. You can print something like "I assume that by 'Madonna' you
meant the American singer, songwriter, and actress (Q1744)" and let the
user change this. As it is now, asking for the husbands of Madonna
returns a rather surprising mix of results for different entities. This
might still be entertaining for Madonna, but it is a real problem for
questions like "Where is Paris?", which do not produce a meaningful
result (you get a list of places and coordinates for different things,
and you cannot find out which coordinate belongs to which place).
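To make the "resolve first, then query" idea concrete, here is a rough
Python sketch. It is only an illustration under assumptions, not how the
tool works: the function names are made up, the `wd:`/`wdt:` prefixes
are assumed to be defined by the SPARQL endpoint, P26 is used as the
"spouse" property, and the response dictionary in the usage note is a
hand-written stand-in shaped like a `wbsearchentities` result. No
network request is made here.

```python
from urllib.parse import urlencode

# Wikidata's MediaWiki API endpoint; wbsearchentities searches entities by label.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def search_url(label, entity_type="item", language="en", limit=5):
    """Build the request URL for one label lookup (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": entity_type,  # "item" for Qids, "property" for Pids
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def pick_candidate(response):
    """Return (id, description) of the first hit, or None if there is none."""
    hits = response.get("search", [])
    if not hits:
        return None
    top = hits[0]
    return top["id"], top.get("description", "")


def explain(label, candidate):
    """Tell the user which entity was assumed, so they can change it."""
    entity_id, description = candidate
    return f"I assume that by '{label}' you meant {description} ({entity_id})"


def build_query(subject_id, property_id):
    """Build a query over resolved IDs: one triple pattern, no label joins."""
    return f"SELECT ?value WHERE {{ wd:{subject_id} wdt:{property_id} ?value }}"
```

With a stand-in response such as `{"search": [{"id": "Q1744",
"description": "American singer, songwriter, and actress"}]}`,
`pick_candidate` yields the Qid, `explain` produces the message for the
user, and `build_query("Q1744", "P26")` gives a query with a single
triple pattern instead of a UNION over labels.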
A label-based query paradigm can also work, but then it should produce
queries that return the entities that were found for each label, so the
user can at least see from the result which "Madonna" each husband
belongs to.
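A label-based query of that kind could look roughly like the sketch
below. Again this is only an assumption-laden illustration: the function
name is made up, the `rdfs:`, `skos:` and `wdt:` prefixes are assumed to
be defined by the endpoint, and matching on `rdfs:label|skos:altLabel`
is one possible way to cover alternative labels. The point is just that
`?entity` is part of the SELECT clause.

```python
def label_query(label, property_id, language="en"):
    """Build a label-based query that also returns the matched entity."""
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?entity ?value WHERE {\n"
        f'  ?entity rdfs:label|skos:altLabel "{escaped}"@{language} .\n'
        f"  ?entity wdt:{property_id} ?value .\n"
        "}"
    )
```

Each result row then pairs a value with the entity it belongs to, so the
user can tell the different "Madonna" or "Paris" matches apart.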
Cheers,
Markus