This started a few minutes ago. I am getting alternating errors when reloading a page:
500 (on a static HTML page, not CGI, mind you!), "This webpage is not
available: The connection to toolserver.org was interrupted.", and 404.
It appears that at least S1 and S4 are having trouble reaching the
database from the HTTP side for projects like ACC (
http://toolserver.org/~acc/) and UTRS (http://toolserver.org/~unblock). I
had two users from ACC verify that they see the same issue. Through SSH,
however, I can clearly access both databases (using "mysql
p_(database name)"). Is anyone else having this issue, or am I chasing ghosts (or
worse, are our two projects chasing ghosts)?
DeltaQuad
English Wikipedia Administrator and Checkuser
Hello all,
just a little story of what happened today: As you know, I planned to dump the
user-databases of rosemary today to import them on thyme later. Around 12
o'clock CET I looked at the replag of thyme during a break and everything was
fine. After my dinner I checked my mail and saw an email from the OSM guys
complaining that their title-dir was gone. As background information: thyme
carries the nfs-server of the user-store, title and munin. These are normally
on hemlock, but because hemlock's SAN-card is broken we had to move them to
another server.
A short time later I spoke with Nosy on IRC about thyme. She told me that thyme
was inaccessible by SSH. A few days ago we had discovered that thyme's
serial-console was not working (we have put that on the datacenter-to-do-list). But
without SSH and serial-console you can neither access a server nor even
reboot it. Nosy had started to move the nfs-server from thyme to rosemary and we
completed that together.
Because of the missing user-store, the script that checks your quota at login
failed and login to the linux-servers was hardly possible. I deleted the script on
these boxes and added a quick&dirty-fix to puppet. This fix failed later, making
login at the linux-boxes impossible for some time (even for roots).
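As an illustration only (this is not the script that ran on the boxes), a login-time check of this kind can be written so that a missing store is skipped rather than fatal. The sketch below is in Python; the function name `quota_warning`, the threshold parameter and the use of free disk space as a stand-in for a real per-user quota query are all assumptions.

```python
import os
import shutil


def quota_warning(store_path, min_free_bytes):
    """Return a warning string, or None if there is nothing to report.

    Hypothetical helper: it never raises, so a missing or unreadable
    store cannot block the login that calls it.
    """
    try:
        if not os.path.isdir(store_path):
            # Store is not mounted: skip the check instead of failing.
            return None
        # Assumption: free space on the filesystem stands in for the
        # real per-user quota lookup, which the mail does not describe.
        usage = shutil.disk_usage(store_path)
    except OSError:
        return None
    if usage.free < min_free_bytes:
        return "warning: only %d bytes free on %s" % (usage.free, store_path)
    return None
```

The sketch only covers a store that is absent or unreadable. It does not protect against a hung NFS mount, where the directory test itself can block; that case would need a timeout around the call.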
Switching the user-store from thyme to rosemary caused some problems on
the userland-servers (because user-store was busy), but I think we fixed this.
Maybe we will have to reboot some boxes in the next few days; I will send a mail if
needed.
Thyme also carried my wikidata-replication-program, which failed too (so the
replag of wikidata increased everywhere). I have moved it to another server now.
A strange thing is that the mysql-process on thyme is still running; even
replication is working so the replag will not increase there.
The next step is to reach Mark or someone from the datacenter to reboot thyme
and then find out what the problem was. Munin shows nothing abnormal.
Just to let you know. Good night.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hi,
Today I am working on a project where I need to convert an Access database
dump to something else (most likely MySQL). Most things I found on the web
are not really suitable for scripting this process (it will need to run every
month), but mdbtools looks promising. However, when I tried to compile
it on the toolserver, all I got out of it were core dumps :-(. (I tried both on
willow and nightshade.) The Ubuntu package does work on my home server.
Has anybody ever looked into mdbtools and got it working? Would it perhaps
be possible to have it installed globally on the toolserver, as it would probably be
useful to some of the GLAM projects as well?
If people want an mdb file to test with, the file I need to convert is
at /mnt/user-store/rce-nl-data.
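For reference, the kind of monthly conversion meant here could look roughly like the Python sketch below, assuming a working mdbtools install. It relies on the `mdb-tables -1`, `mdb-schema <file> mysql` and `mdb-export -I mysql <file> <table>` commands; the `-I` option needs a reasonably recent mdbtools, which is an assumption about the installed version. The function names are made up for illustration.

```python
import subprocess
import sys


def parse_table_list(output):
    """Split the one-name-per-line output of `mdb-tables -1` into a list."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def export_commands(mdb_path, tables):
    """Build the argv lists: one schema dump, then one INSERT dump per table."""
    commands = [["mdb-schema", mdb_path, "mysql"]]
    for table in tables:
        commands.append(["mdb-export", "-I", "mysql", mdb_path, table])
    return commands


def dump_to_sql(mdb_path, out):
    """Write a MySQL-flavoured SQL dump of mdb_path to the file object `out`."""
    listing = subprocess.run(
        ["mdb-tables", "-1", mdb_path],
        check=True, capture_output=True, text=True,
    ).stdout
    for argv in export_commands(mdb_path, parse_table_list(listing)):
        # check=True aborts the run if any mdbtools command fails.
        result = subprocess.run(argv, check=True, capture_output=True, text=True)
        out.write(result.stdout)
        out.write("\n")


if __name__ == "__main__" and len(sys.argv) == 2:
    dump_to_sql(sys.argv[1], sys.stdout)
```

Run with the path to the .mdb file as the only argument, the script prints SQL on stdout, which can then be piped into the mysql client from a monthly cron job.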
Regards,
Andre
Hello all,
I will switch sql-s1-rr from thyme to rosemary and back after
19:00 UTC this evening.
You should notice no difference, but everything at sql-s1 will be a little bit
slower while sql-s1-rr is on rosemary (and all queries running on thyme will
be killed).
The background of the switches is that I finally extracted the mysql-files from
the wmf-binary dump, so I can now set up thyme again with an unbroken version
of s1. If everything works as planned, I will handle rosemary during the next
week too, so the whole s1-cluster will be unbroken again.
Just to let you know.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
It seems that s3 replication is halted.
Did I miss an announcement, or is this an unplanned issue?
In any case, I would be interested in an ETA for replication catching up.
Thank you.
Danny B.
Hello,
the service processors of wolfsbane and thyme need to be updated.
They will be shut down before the update.
This means enwiki-rr, a webserver, the user-store and osm-tiles will not be available for 15-30 minutes.
I will do this work
from Sun 11th November 20:00 UTC until Sun 21:00 UTC
Kind regards
Marlen/nosy