[Engineering] Gerrit was down today
Chad Horohoe
chorohoe at wikimedia.org
Thu Oct 6 22:33:40 UTC 2016
Hi!
Sorry for the extended downtime! From what we can tell, it appears as though
the machine that Gerrit is running on (lead) is having some hardware issues
that are making the CPU misbehave. We've worked around it for now, so things
should be up (and Zuul is processing CI events just fine).
However, since it appears it's a hardware problem, we're planning to migrate
off of lead to a new machine (cobalt). The public IP addresses will not be
changing.
The plan right now is to do this migration tomorrow with a scheduled downtime
at 17:00 UTC (10:00 PDT).
We'll be keeping a close eye on things in the meantime, so if things
deteriorate again we can start the migration sooner.
(And yeah, wikitech incident report to follow; I'm a little burnt out right
now though.)
Thanks again for bearing with us!
-Chad
On Thu, Oct 6, 2016 at 2:32 PM Greg Grossmeier <greg at wikimedia.org> wrote:
> (It wasn't just you)
>
> Gerrit was down today starting around 17:49 UTC. It is now back up and
> services are coming back online.
>
> A full investigation into the cause of the outage is still ongoing.[0]
>
> Apologies for the downtime.
>
> WMF Release Engineering
>
> [0] https://etherpad.wikimedia.org/p/gerrit-outage-20161006
> But this is missing a lot of the information/discussion that is
> happening in #wikimedia-operations on Freenode. A link to the
> incident report will be pasted into that etherpad when it is
> created.
>
> --
> | Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E |
> | Release Team Manager A18D 1138 8E47 FAC8 1C7D |
>
> _______________________________________________
> Engineering mailing list
> Engineering at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/engineering
>