I was wondering if anybody would have the time to create some sort of application status
dashboard, similar to the ones found on Google (http://www.google.com/appsstatus#hl=en)
or Amazon (http://status.aws.amazon.com/).
Essentially, something that acts like a simplified external facing blog, where people
could update the different pieces as problems are detected. Eventually, it would be nice
to be able to tie it into our various monitoring tools, but to begin with, it would
be very convenient for our partners to be able to see the overall status of our different
layers. For instance, if someone detects that m.wikipedia.org is not working, the first
step would be to update said dashboard to inform the less technically savvy people who are
not necessarily on IRC of the problem and that someone is looking at it / in the process
of fixing it.
Another important feature would be to keep history on problems that have happened in the
past, much like Google does. I know this is already done to some extent with the server
admin log, but having an easy-to-read interface would, in my opinion, prove beneficial.
Anyway, any suggestions on additional features or requirements are welcome.
--Fred.
Requirements:
*When the site goes down, the dashboard must still be up, no matter
which layer failed.
*The dashboard must be able to handle the Slashdot effect that the site
going down would have on it.
The dashboard would fix bug 20079, forked from 16043.
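
As a rough illustration of the request above (per-layer status updates plus a readable history of past problems), the core data model could be sketched as follows. This is only a minimal sketch under assumptions: the names `Incident` and `StatusBoard`, the status strings, and the in-memory list are all hypothetical and not part of any existing tool. A real deployment would need storage and hosting independent of the main site in order to meet the requirements listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Incident:
    """One status update for one layer (all field names are hypothetical)."""
    layer: str    # e.g. "m.wikipedia.org"
    status: str   # assumed values: "ok", "investigating", "down"
    message: str  # short human-readable note for non-technical readers
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class StatusBoard:
    """Append-only log of updates; the current status is derived from it."""

    def __init__(self):
        self.history = []  # every update is kept, so past problems stay visible

    def update(self, layer, status, message):
        """Record a new status for a layer and return the stored entry."""
        incident = Incident(layer=layer, status=status, message=message)
        self.history.append(incident)
        return incident

    def current(self):
        """Return the most recent status of each layer as a dict."""
        latest = {}
        for incident in self.history:  # later entries overwrite earlier ones
            latest[incident.layer] = incident.status
        return latest

    def history_for(self, layer):
        """Return all recorded updates for one layer, oldest first."""
        return [i for i in self.history if i.layer == layer]
```

For example, someone who notices that m.wikipedia.org is not working would call `update("m.wikipedia.org", "investigating", ...)`, and a later call with `"ok"` would close the incident while both entries remain in the history.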