[Labs-l] NFS overloaded

Yuvi Panda yuvipanda at gmail.com
Tue Jan 26 21:42:55 UTC 2016


Hello!

This has been resolved now. A more detailed incident report will be
published soon.

In the meantime:

- We rebooted one of the exec hosts (tools-exec-1217) because it was
stuck with excess load, and this lost all non-continuous jobs running
there. Continuous jobs that were running there are rescheduled
automatically.
- Some queues were in error state, and I've cleared the error state
(so everything should be ok now)
- Some jobs were stuck in error state; I've cleared them (so they have
been rescheduled and are running now)

The grid is healthy as of now, so let us know if anything seems amiss.

Thanks


