-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
There is currently a large amount of replication lag on the s2/s5 server. This
might be related to a failed disk, which will be replaced next week. In the
meantime, we will continue to monitor it.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkx3s/QACgkQIXd7fCuc5vJIJACfTKk8rhp3G7R3+Idn0ZT7pGf6
o9gAn0A8gqVS9OGnNy+xm46dpst7z0KO
=RKXX
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
I will be restarting all MySQL servers later tonight / tomorrow morning (UTC).
For user databases (sql), there will be < 30 minutes downtime.
For -rr replicated databases, there will be no downtime except on s2/s5, where
there will be < 30 minutes downtime.
For -user replicated databases, there will be < 30 minutes downtime.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkx23+MACgkQIXd7fCuc5vIM8ACbBIfvnwizJb2NcfQ/fneyqWTe
eHsAnRJPIKUimIY6wtf9w/VEj0rPKwBM
=eulh
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
We are about to make two changes to the Solaris login servers:
* MySQL will be linked with libreadline instead of libeditline. The new MySQL
will be more compatible with MySQL on Linux, but will not be able to read
existing .mysql_history files previously created on Solaris. You may wish to
remove your existing .mysql_history file.
* The default Ruby (/usr/bin/ruby) will change from 1.9 to 1.8. Ruby 1.9 is
still available at /usr/bin/ruby1.9.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkx2qYAACgkQIXd7fCuc5vJ6mACfSpusHWz/snPeNYvqzBfx7GwX
9iMAnjlcaOUKPxbVdIePhhmY/gP9jKJd
=3Bsi
-----END PGP SIGNATURE-----
Hello all,
I will reboot nightshade tonight at 22:00 UTC because of a kernel update.
Progress will be visible at
https://jira.toolserver.org/browse/MNT-753
and users who are logged in will be notified beforehand.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
There was an outage of the Toolserver from around 2200h, 17th August until
around 0020h. The initial cause of the outage was a network issue at Wikimedia
(who provide us with transit). The network outage caused a cascading failure
of the two HA clusters (turnera/damiana and wolfsbane/ortelius), which
prolonged the outage by about 20 minutes.
The problem is now resolved and all services should be accessible again.
Although the cluster failure did not significantly extend the outage (since
the cluster is quite good at recovering itself), it is nonetheless not
something that should have happened, and we will be reviewing the cluster
setup to prevent it from happening in the future.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxrGiUACgkQIXd7fCuc5vLk5QCdGZPp2+5/kDUW3jq+NZ6HkVb0
hDsAn1GBCTY9n2v/idrjNXa08Zrnmol3
=D8KG
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
It is now possible to specify the server type (RR or user) when using the 'sql'
script:
$ sql -u enwiki_p # Connect to user database
$ sql -r enwiki_p # Connect to RR database
If you currently use 'sql' for scripting, you should update your scripts to
specify one of these options. Currently, the default (if no option is
specified) is to connect to the obsolete sql-sX alias. This might change in
future.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxpU08ACgkQIXd7fCuc5vKB1ACfYHHRSDMB9FWBbWUQvQNkNf1d
204An0Jcf8F1elf58gkNzOtDpbqDOJgR
=+Q9g
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
I've just added a new feature to the automated query killer which can be used
to limit the execution time of queries.
To use it, include the string LIMIT:<n> in your query. For example:
SELECT /* LIMIT:60 */ * FROM ...
would limit the query execution time to 60 seconds.
This could be used in web tools to prevent excessively long page load times
(where the user will have given up by the time the query finishes), or as a
sanity check on queries which usually finish quickly, but sometimes take too
long.
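
For tools that build queries in code, the comment can be added by a small
helper. The following is only a sketch: the function name with_time_limit is
hypothetical, and placing the comment directly after the first keyword is an
assumption that mirrors the example above (the query killer only needs the
string to appear somewhere in the query).

```python
import re


def with_time_limit(query: str, seconds: int) -> str:
    """Return `query` with a /* LIMIT:<n> */ comment after its first keyword.

    Hypothetical helper, not part of any Toolserver library.
    """
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    # Split the query into its leading keyword (e.g. SELECT) and the rest.
    match = re.match(r"\s*(\w+)(.*)", query, flags=re.DOTALL)
    if match is None:
        raise ValueError("query does not start with a keyword")
    keyword, rest = match.groups()
    # Re-assemble the query with the limit marker as an SQL comment.
    return f"{keyword} /* LIMIT:{seconds} */{rest}"


print(with_time_limit("SELECT * FROM page", 60))
```

If you use a client that strips comments before sending the query, check that
the marker still reaches the server.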
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxj8k4ACgkQIXd7fCuc5vJQWwCfaVYnSWoxSmEd8HesFcfUikeH
dLsAn3sgbrnXQwpeGoyys48fad+gbQKO
=HE6k
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
This mail describes two separate changes to how database access works. Please
read all of it and make the necessary changes to your tools.
Summary
=======
This is a brief summary of the changes you should make:
* If you currently use the fast server (sql-sX-fast or XXwiki-p.fastdb), change
back to the normal server, then follow the rest of these instructions.
* For tools which only connect to 'sql', no changes are necessary.
* For tools which use database servers but *do not* use user databases:
* If you currently connect to sql-sX.toolserver.org, instead connect to
sql-sX-rr.toolserver.org
* If you currently connect to XXwiki-p.db.toolserver.org, instead connect to
XXwiki-p.rrdb.toolserver.org
* For tools which use database servers and *do* use user databases:
* If you currently connect to sql-sX.toolserver.org, instead connect to
sql-sX-user.toolserver.org
* If you currently connect to XXwiki-p.db.toolserver.org, instead connect to
XXwiki-p.userdb.toolserver.org
* If you have any queries which could run for longer than 10 minutes when
working correctly, add the string SLOW_OK somewhere in the query, e.g.:
SELECT /* SLOW_OK */ * FROM table...
I will update the documentation on the wiki shortly to reflect these changes.
The rest of this mail describes the changes in detail.
RR servers
==========
A while ago, we introduced the idea of 'fast' database servers, which only
allowed queries running for less than 60 seconds. The idea was that since
there were no long queries to create load on the server, it would be less
likely to have replication lag than the normal servers.
Since the introduction of fast servers, we have seen very little take-up of
them, even for tools which could usefully use them. Additionally, we have not
had much issue with replication lag recently.
We will therefore be retiring the fast servers, and replacing them with RR
servers. To connect to an RR server, the following hostnames should be used:
* sql-sX-rr.toolserver.org
* XXwiki-p.rrdb.toolserver.org
Unlike the normal aliases, the RR alias will randomly connect you to one of
the two servers in each cluster. (For example, when connecting to sql-s1-rr,
you will randomly connect to either thyme or rosemary.)
There is no disadvantage for users to connect to the RR alias (since there is
no limit on query execution time), and this will allow us to better distribute
load among the database servers, which will reduce replication lag for
everyone. It also makes it easier for us to add additional database servers to
a cluster in the future.
The only tools which cannot use the RR servers are tools which access user
databases, since these databases are still only present on a single server.
These tools should instead connect to the new "user" aliases:
* sql-sX-user.toolserver.org
* XXwiki-p.userdb.toolserver.org
The user aliases will always point to the server which currently contains the
user databases.
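
For tools that keep hostnames in configuration, the renaming described above
can be sketched as a small function. This is only an illustration: the name
migrate_hostname is hypothetical, and it assumes the fast aliases have the
full forms sql-sX-fast.toolserver.org and XXwiki-p.fastdb.toolserver.org.

```python
import re


def migrate_hostname(old: str, uses_user_databases: bool) -> str:
    """Map an old database hostname to the new RR or user alias.

    Hypothetical helper; hostnames it does not recognise (such as plain
    'sql') are returned unchanged.
    """
    suffix = "user" if uses_user_databases else "rr"

    # sql-sX.toolserver.org or sql-sX-fast.toolserver.org
    m = re.fullmatch(r"(sql-s\d+)(?:-fast)?\.toolserver\.org", old)
    if m:
        return f"{m.group(1)}-{suffix}.toolserver.org"

    # XXwiki-p.db.toolserver.org or XXwiki-p.fastdb.toolserver.org
    m = re.fullmatch(r"(.+-p)\.(?:db|fastdb)\.toolserver\.org", old)
    if m:
        return f"{m.group(1)}.{suffix}db.toolserver.org"

    return old


print(migrate_hostname("sql-s1.toolserver.org", uses_user_databases=False))
print(migrate_hostname("enwiki-p.db.toolserver.org", uses_user_databases=True))
```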
Long query killer
=================
To help prevent replication lag, we will be introducing an automatic query
killer on all servers. This will work as follows:
* If the replication lag is under 10 minutes, no queries will be killed.
* If the replication lag is 10 minutes or more, but less than 30 minutes,
queries will be killed if the following two conditions are both true:
1. The query does not contain the text SLOW_OK
2. The query has been running for X seconds or more, where X is the current
replication lag.
* If the replication lag is 30 minutes or more, queries will be killed if the
following condition is true:
1. The query has been running for X seconds or more, where X is the current
replication lag.
This is intended to only kill queries which are causing replication lag (in
particular, queries which cause InnoDB lock wait timeouts). We will monitor
the performance of the query killer and might adjust the parameters in the
future.
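
To check how these rules would apply to one of your own queries, the decision
can be sketched as follows. This models only the rules as described in this
mail and is not the actual query killer; the function name would_be_killed and
its parameters are made up for illustration.

```python
def would_be_killed(lag_seconds: int, runtime_seconds: int, query: str) -> bool:
    """Return True if a query would be killed under the rules described above."""
    # Replication lag under 10 minutes: nothing is killed.
    if lag_seconds < 10 * 60:
        return False
    # In both remaining cases the query must have been running for at
    # least as many seconds as the current replication lag.
    if runtime_seconds < lag_seconds:
        return False
    # Lag of 10 to 30 minutes: queries marked SLOW_OK are spared.
    if lag_seconds < 30 * 60:
        return "SLOW_OK" not in query
    # Lag of 30 minutes or more: SLOW_OK no longer protects the query.
    return True


print(would_be_killed(900, 1000, "SELECT /* SLOW_OK */ * FROM page"))
print(would_be_killed(2000, 2000, "SELECT /* SLOW_OK */ * FROM page"))
```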
If you find that your queries are being killed and you don't think they should
have been, please open a request in JIRA.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxdHDMACgkQIXd7fCuc5vJi0wCdH8e8MkbHdVukfXxR9JGDTHz7
e6sAoLgvCpkNW39zJ1uhkGWnRTK61XF3
=n+x9
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
Since no one reported any problems with Perl 5.12, this is now the default Perl
version (/usr/bin/perl) on the Solaris systems.
Additionally, Python 2.7 is now available as /usr/bin/python2.7. This is not
yet the default version, but will become so if there are no issues.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxaWAgACgkQIXd7fCuc5vLTSwCgt9Y3sfm6KDUP3TPC/+qYt4SD
x/AAn1pBJz8TMwBetE2eYk2cVdOFafSh
=WOIe
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
yarrow (sql-s[25]-fast) is currently offline due to a problem with its disk
array. All load for these clusters is being served by daphne and there should
be no impact to users.
- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)
iEYEARECAAYFAkxYV7kACgkQIXd7fCuc5vJu7QCfQvlpFbiHMys0CyDVSs/yCrHa
L9AAn3Rjnl1lKK72zkm1BI0Xf94st2Jm
=QTJz
-----END PGP SIGNATURE-----