[Labs-l] Cron job concurrency: consider adding `-once` to your cron tasks

Bryan Davis bd808 at wikimedia.org
Wed Aug 2 17:15:00 UTC 2017


On Wed, Aug 2, 2017 at 11:05 AM, Maximilian Doerr
<maximilian.doerr at gmail.com> wrote:
> Which tools are the offending tools?

I'm not sure I would classify any of the tools with lots of parallel
jobs running as "offending". That word has some aggressive
connotations, at least in English. I'm also not sure that naming and
shaming anyone is useful. If you really want to know where I've been
intervening, you can grep for !log messages by me in the
#wikimedia-cloud freenode IRC channel logs for the last 24 hours or
so.

We do have a tool at http://tools.wmflabs.org/grid-jobs/ that shows
data updated once an hour and allows sorting and drilling down into
per-tool information. This tool and a graphite view of running jobs
over time (<https://graphite-labs.wikimedia.org/render?title=Tools&yMin=0&width=800&height=400&target=cactiStyle(alias(sumSeries(tools.tools-services-01.sge.hosts.tools-*-12*.job_count),%27precise%27))&target=cactiStyle(alias(sumSeries(tools.tools-services-01.sge.hosts.tools-*-14*.job_count),%27trusty%27))&target=cactiStyle(alias(tools.tools-k8s-master-01.KubernetesCollector.namespaces.active,%27k8s%27))&from=-90days>)
are what led to deeper investigation.

Bryan
-- 
Bryan Davis              Wikimedia Foundation    <bd808 at wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services          Boise, ID USA
irc: bd808                                        v:415.839.6885 x6855


