Hi Tomer,

Unfortunately, your queries do operate over a rather large portion of the data (P625 alone has ~10 million items), and I could not find an obvious way to optimize them.
Have you considered trying other services, such as https://qlever.cs.uni-freiburg.de/wikidata or https://wikidata.demo.openlinksw.com/sparql, to compare how they perform?
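
For example, something along these lines (a minimal sketch; the QLever API path below is an assumption on my part, and the PREFIX line is there because those endpoints may not predefine the wdt: shortcut) would send the same query to both services and compare how long they take:

    import time
    import requests

    QUERY = """
    PREFIX wdt: <http://www.wikidata.org/prop/direct/>
    SELECT ?item ?item2 WHERE {
      ?item wdt:P625 ?location .
      ?item <http://www.w3.org/2002/07/owl#sameAs> ?item2 .
    } LIMIT 10
    """

    ENDPOINTS = {
        # The QLever API path is assumed; the Virtuoso URL is the one mentioned above.
        "qlever": "https://qlever.cs.uni-freiburg.de/api/wikidata",
        "virtuoso": "https://wikidata.demo.openlinksw.com/sparql",
    }

    for name, url in ENDPOINTS.items():
        start = time.time()
        # Standard SPARQL protocol: POST the query and ask for JSON results.
        r = requests.post(url, data={"query": QUERY},
                          headers={"Accept": "application/sparql-results+json"},
                          timeout=300)
        r.raise_for_status()
        rows = r.json()["results"]["bindings"]
        print(f"{name}: {len(rows)} rows in {time.time() - start:.1f}s")
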
It is very unlikely that we will allow longer timeouts in the near future, so if you plan to work on a large subset, using the dumps (RDF or JSON) might be a better option for you at the moment; WDQS is not well suited to extracting large subsets of Wikidata.
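
As a rough illustration of the dump route (a minimal sketch, assuming the standard layout of the JSON dump, e.g. latest-all.json.gz, with one entity object per line inside one big array; the local path is just a placeholder), you could stream the dump and keep only the entities that have a P625 claim:

    import gzip
    import json

    DUMP_PATH = "latest-all.json.gz"  # placeholder path to a downloaded dump

    def entities_with_coordinates(path):
        """Yield entities from the JSON dump that carry a P625 (coordinate) claim."""
        with gzip.open(path, "rt", encoding="utf-8") as f:
            for line in f:
                line = line.strip().rstrip(",")
                if not line or line in ("[", "]"):
                    continue  # skip the array brackets wrapping the entity list
                entity = json.loads(line)
                if "P625" in entity.get("claims", {}):
                    yield entity

    for entity in entities_with_coordinates(DUMP_PATH):
        label = entity.get("labels", {}).get("en", {}).get("value", "")
        print(entity["id"], label)
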
Another option might be to ask for advice at https://www.wikidata.org/wiki/Wikidata:Request_a_query; there might be a different, more performant way to do what you want.

Hope this helps a bit,

David.


On Fri, Dec 9, 2022 at 1:22 PM <ts.tomersagi@gmail.com> wrote:
Hi,
I am getting frequent timeouts trying to use the SPARQL endpoint GUI at https://query.wikidata.org/.
I'll admit I have some complex queries, but I really feel this is something the system should be able to handle, or at least it should let me request a longer timeout.
For example, this query:

            SELECT ?item ?item2
            WHERE {
              ?item wdt:P625 ?location .
              ?item <http://www.w3.org/2002/07/owl#sameAs> ?item2 .
            }
            LIMIT 10

or this query:

SELECT DISTINCT ?item ?itemname ?location
WHERE {
  ?item wdt:P625 ?location ;
        wdt:P31 ?type ;
        rdfs:label ?itemname .
  ?type wdt:P279 ?supertype .

  FILTER(
    LANG(?itemname) = "en" &&
    ?supertype NOT IN (wd:Q5, wd:Q4991371, wd:Q7283, wd:Q36180, wd:Q7094076, wd:Q905511, wd:Q1063801,
                       wd:Q1062856, wd:Q35127, wd:Q68, wd:Q42848, wd:Q2858615, wd:Q241317, wd:Q1662611, wd:Q7397, wd:Q151885,
                       wd:Q1301371, wd:Q1068715, wd:Q7366, wd:Q18602249, wd:Q16521, wd:Q746549, wd:Q13485782, wd:Q36963)
  )
}
LIMIT 200000

When I use the Python SPARQLWrapper library things improve somewhat, but some of my queries still time out.
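
For reference, here is a minimal sketch of the kind of call I mean (the user-agent string is just a placeholder):

    from SPARQLWrapper import SPARQLWrapper, JSON

    QUERY = """
    SELECT ?item ?item2 WHERE {
      ?item wdt:P625 ?location .
      ?item <http://www.w3.org/2002/07/owl#sameAs> ?item2 .
    } LIMIT 10
    """

    sparql = SPARQLWrapper(
        "https://query.wikidata.org/sparql",
        agent="example-script/0.1 (mailto:user@example.org)",  # placeholder user agent
    )
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    sparql.setTimeout(120)  # client-side socket timeout; the server still enforces its own limit
    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["item"]["value"], row["item2"]["value"])
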
I tried the first query above on an old Wikidata dump we have from 2021, loaded into Jena TDB, and it managed to complete (0 results, but I had to run it to figure that out...).
It seems strange to get such poor performance.
Cheers
Tomer
_______________________________________________
Discovery mailing list -- discovery@lists.wikimedia.org
To unsubscribe send an email to discovery-leave@lists.wikimedia.org