[Labs-l] Problem with running PHP CLI in cron/jsub

Maciej Jaros egil at wp.pl
Tue May 13 21:17:04 UTC 2014


Tim Landscheidt (2014-05-12 01:58):
> ...
>> How do I make sure I get a notification of job failure?
> You can tell jsub or more precisely the underlying qsub with
> the option "-m e" to send a notification on several
> occasions (cf. "man qsub"):
>
> | [...]
>
> |        -m b|e|a|s|n,...
> |               Available for qsub, qsh, qrsh, qlogin and
> |               qalter only.
>
> |               Defines or redefines under which
> |               circumstances mail is to be sent to the job owner
> |               or to the users defined with the -M option
> |               described below. The option arguments have
> |               the following meaning:
>
> |               `b'     Mail is sent at the beginning of the job.
> |               `e'     Mail is sent at the end of the job.
> |               `a'     Mail is sent when the job is aborted or
> |                       rescheduled.
> |               `s'     Mail is sent when the job is suspended.
> |               `n'     No mail is sent.
>
> |               Currently no mail is sent when a job is
> |               suspended.
>
> | [...]
>
> For example, "jsub -mem 10m -m a php" leads to:
>
> | Job 821685 (php5) Aborted
> |  Exit Status      = 139
> |  Signal           = SEGV
> |  User             = tools.wikilint
> |  Queue            = task@tools-exec-07.eqiad.wmflabs
> |  Host             = tools-exec-07.eqiad.wmflabs
> |  Start Time       = 05/11/2014 23:49:56
> |  End Time         = 05/11/2014 23:49:56
> |  CPU              = 00:00:00
> |  Max vmem         = NA
> | failed assumedly after job because:
> | job 821685.1 died through signal SEGV (11)
>
> NB: "Aborted" means aborted in the grid sense.  "jsub -mem
> 10m -m a false" will not generate a mail for example.  So
> you might want to use "-m aes" and filter notifications
> about successful jobs in your mail client.
>

Thanks. I've ended up mailing errors manually for failures where the
job is not aborted.

Below is my script in case anyone else wishes to use it.

#!/bin/bash

# Functions
# ----------------------------

function sendFailMail {
   # $1: message body; echo -e expands the \n escapes it contains
   echo -e "Subject: [job-fail] Something is wrong...\n\n$1" |
     /usr/sbin/exim -odf -i tools.NAME-OF-YOUR-TOOL@tools.wmflabs.org
}

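function checkResult {
   # Call this directly after the command to check: $? is overwritten
   # by every command that runs in between.
   local result=$?
   if [ "$result" -ne 0 ]; then
     local message="[ERROR] exit code from command was non-zero: $result"
     echo "$message"
     sendFailMail "$message"
   fi
}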

function checkLog {
   local logFile=$1
   # [Ee]rror also matches PHP's lowercase "Fatal error:" and "Parse error:"
   local matchString="(Warning|[Ee]rror|Notice):"
   if grep -Eq "$matchString" "$logFile"
   then
     local message="Log contains errors or something strange..."
     message="$message\n\n$(grep -E "$matchString" "$logFile")"
     echo -e "$message"
     sendFailMail "$message"
   fi
}

# Script
# ----------------------------

nowDate=$(date +"%Y-%m-%d %H:%M")
logFile="../dna_refresh-$(date +%Y-%m).php-out"

# logFile is relative, so stop if the directory is not reachable
cd /data/project/dna/public_html || exit 1
echo -e "\n\n-----------------------\n$nowDate\n- - - - - - - - - - - -\n" >> "$logFile"
# &>> (bash 4+) appends both stdout and stderr
php ./index.php &>> "$logFile"
#checkResult
checkLog "$logFile"
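
A note on the commented-out checkResult call: it reads $?, so it only
sees the exit status of php if it is the very next command after it.
The following is a minimal sketch of one way to make that explicit; it
is not part of the script above. runLogged and reportStatus are
hypothetical helper names, and reportStatus only prints the message at
the point where the real script would call sendFailMail.

```shell
#!/bin/bash

# Hypothetical helper: run a command, append its stdout and stderr to a
# log file, and return the command's own exit status.
runLogged() {
  local logFile=$1
  shift
  "$@" >> "$logFile" 2>&1
}

# Hypothetical helper: takes the exit status as an argument instead of
# reading $?, so it does not matter what ran in between.
reportStatus() {
  local status=$1
  if [ "$status" -ne 0 ]; then
    # The real script would call sendFailMail here.
    echo "[ERROR] exit code from command was non-zero: $status"
    return 1
  fi
  return 0
}

# Illustrative usage:
#   runLogged "$logFile" php ./index.php
#   phpStatus=$?
#   checkLog "$logFile"
#   reportStatus "$phpStatus"
```

Passing the status as an argument means other commands (such as
checkLog) can run in between without the status being lost.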

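Because the log file is per month and only appended to, checkLog greps
the whole file, so a single warning will probably trigger a mail on
every later run in that month. If that is not wanted, one possible
approach is to remember the line count before the run and only look at
the lines added afterwards. This is a sketch under that assumption:
countLines and newLogErrors are hypothetical names, and the pattern is
modeled on matchString, with [Ee]rror used on the assumption that PHP's
lowercase "Fatal error:" lines should match too.

```shell
#!/bin/bash

# Hypothetical helper: number of lines currently in the log (0 if the
# file does not exist yet).
countLines() {
  local logFile=$1
  if [ -f "$logFile" ]; then
    wc -l < "$logFile" | tr -d ' '
  else
    echo 0
  fi
}

# Hypothetical helper: print matching lines that were appended after the
# first $2 lines of the log. Returns grep's status (0 if any line matched).
newLogErrors() {
  local logFile=$1
  local linesBefore=$2
  tail -n "+$((linesBefore + 1))" "$logFile" |
    grep -E "(Warning|[Ee]rror|Notice):"
}

# Illustrative usage:
#   linesBefore=$(countLines "$logFile")
#   php ./index.php >> "$logFile" 2>&1
#   if errors=$(newLogErrors "$logFile" "$linesBefore"); then
#     sendFailMail "Log contains errors or something strange...\n\n$errors"
#   fi
```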




