[Labs-l] Partial outage in progress -- update

Pine W wiki.pine at gmail.com
Tue Feb 17 20:13:40 UTC 2015


I just want to reiterate that keeping tools available with close to 100
percent uptime is valuable. If something needs to change in Tool Labs to make
tools more reliable, such as simplifying or reducing dependencies or
increasing redundancy, I hope that this will be explored after the
immediate problems are fixed.

Thanks (:

Pine
On Feb 17, 2015 12:01 PM, "Andrew Bogott" <abogott at wikimedia.org> wrote:

> On 2/17/15 10:49 AM, Andrew Bogott wrote:
>
>> No data was lost.  I'm currently migrating VMs from virt1005 onto a new,
>> more trustworthy server. All affected instances will start back up one by
>> one over the next hour or so.
>>
> There turn out to be some MASSIVE instances on that box, so the copy is
> taking longer than I expected.  Still chugging along though.
>
> -A
>
>
>
>
>> -Andrew
>>
>>
>> On 2/17/15 9:31 AM, Andrew Bogott wrote:
>>
>>> One of the labs virtualization hosts, virt1005, is suffering a disk
>>> failure.  I'm restarting it right now -- that may allow us to gradually
>>> recover.  If we're less lucky, then rebuilding from the outage will be a
>>> prolonged process.
>>>
>>> A list of affected instances can be found here:
>>>
>>> https://phabricator.wikimedia.org/P305
>>>
>>> Note that this box was hosting the Tools web proxy, so the web interface
>>> for most tools is currently down.  That should be easy to rebuild if
>>> necessary, and will be a high priority.
>>>
>>> Updates as events warrant!
>>>
>>> -Andrew
>>>
>>
>>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>