[Labs-l] Tools: Mail server outage, submitting jobs from crontabs and cron jobs every minute

Tim Landscheidt tim at tim-landscheidt.de
Thu May 22 12:23:26 UTC 2014


Hi,

Tonight the mail server's disk filled up, and thus mail to
and from Tools was disabled for some time.

The reason for the clogged disk was a tool's cron job that
submitted a job every minute.  On each successful submission,
and thus every minute, the message "Your job 4711 ('foobar')
has been submitted" was sent to the tool's maintainer's mail
address, whose mail provider Yahoo! auto-blocked the
messages.  For some reason, they do that with a "421" status
(a temporary failure in SMTP terms) and the slightly
contradictory "All messages from 208.80.155.162 will be
permanently deferred; Retrying will NOT succeed.", causing
exim to try to send the messages again and again and again,
probably reaffirming the block.  As a result, apparently
*all* messages from Tools to Yahoo! addresses are currently
blocked (or "deferred"); looking at the mail queue, this
seems to affect only Cyberpower678 at this time.

So what's the take-away message?

a) You can silence jsub with the option -quiet so that it
   only emits a message if the job was *not* successfully
   submitted.  In crontabs, this suppresses mails when
   everything went all right.  Unless you actively process
   those "Your job has been submitted" messages, by machine
   or by reading them, it is probably a good idea to always
   use "jsub -quiet ...".

b) Cron jobs should not be scheduled every minute (cf. also
   https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Scheduling_jobs_at_regular_intervals_with_cron).
   In most such cases, a continuous job that does something
   and then sleeps for a minute is a much better solution.
   As a bonus, you do not have to think about setting up
   locking in case a run takes longer than a minute to
   execute.
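
To illustrate b), here is a minimal sketch of such a
"do something, then sleep" loop in Python.  The names
run_every and do_work are made up for this example, and
do_work is only a placeholder for whatever your tool
actually does.  Because the next run only starts after the
previous one has returned, runs cannot overlap and no
locking is needed.

```python
import time


def run_every(task, interval_seconds=60, max_runs=None, sleep=time.sleep):
    """Call task(), sleep, and repeat; return the number of runs.

    max_runs and sleep are hypothetical conveniences for testing;
    a real continuous job would leave them at their defaults and
    loop until the job is stopped.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()  # a slow run simply delays the next one
        runs += 1
        sleep(interval_seconds)
    return runs


def do_work():
    # Placeholder (assumption): replace with the tool's real work.
    print("working")


# To run as a continuous job, call:
#     run_every(do_work)
```

Note that the interval is measured from the end of one run
to the start of the next, so a run that takes 20 seconds
leads to a cycle of about 80 seconds, not 60.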

Tim



