On Fri, Jul 24, 2020 at 2:19 PM Ryan Kemper <rkemper(a)wikimedia.org> wrote:
Hi all,
We experienced WDQS service disruptions on 2020/07/23. As a result, there
was a full outage (inability to respond to any queries) lasting several
minutes, and a longer period of intermittently degraded service
(inability to respond to a subset of queries) lasting 1-2 hours.
The full incident report is available here:
https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-ou…
Ultimately, we traced the proximate cause to a series of non-performant
queries, which caused a deadlock in Blazegraph, the backend for WDQS. We
have placed a temporary block on the IP address that issued those queries
and are taking steps to better define service availability expectations,
as well as processes to streamline detection of these events going forward.
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata