Update
The restart was not as smooth as expected. About a minute into shutting
the ports down to prepare for it, the storage cluster started having
trouble, which caused some instability for the virtual machines (slow
disk writes, network connectivity issues, ...).
Everything should be back up now, and we will investigate the incident
to prevent it from happening again.
Let us know if you are still seeing issues by opening a ticket or
pinging us on IRC:
https://wikitech.wikimedia.org/wiki/Help:Toolforge#Communication_and_support
Thanks for your patience!
On Mon, 2023-05-15 at 13:55 +0200, David Caro wrote:
Hi!
We are restarting a switch[1] today at 13:00 UTC.
We are moving all the affected VMs to different hypervisors, and we
expect no downtime, though your servers might be briefly unresponsive
(a couple of seconds) at the moment the migration moves each VM.
We will reply to this email once it's done.
Thanks!
[1] https://phabricator.wikimedia.org/T316544
---
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in
the sum of all knowledge. That's our commitment."
_______________________________________________
Cloud-announce mailing list -- cloud-announce(a)lists.wikimedia.org
List information:
https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.…