Hi!
We are happy to announce the new domain 'toolforge.org' is now ready to be
adopted by our Toolforge community.
There is a lot of information about this change on a dedicated wikitech
page:
https://wikitech.wikimedia.org/wiki/News/Toolforge.org
The most important change you will see happening is a new domain/scheme for
Toolforge-hosted webservices:
* from https://tools.wmflabs.org/<toolname>/
* to https://<toolname>.toolforge.org/
A live example of this change can be found in our internal openstack-browser
webservice tool:
* legacy URL: https://tools.wmflabs.org/openstack-browser/
* new URL: https://openstack-browser.toolforge.org
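The mapping between the two schemes is mechanical. As a quick illustration (the `to_canonical` helper below is hypothetical, not part of any Toolforge tooling), a legacy URL can be rewritten like this:

```python
def to_canonical(legacy_url):
    """Rewrite a legacy tools.wmflabs.org URL into the new per-tool
    toolforge.org scheme. Illustrative only; assumes the legacy
    https://tools.wmflabs.org/<toolname>/ layout described above."""
    prefix = "https://tools.wmflabs.org/"
    if not legacy_url.startswith(prefix):
        raise ValueError("not a legacy Toolforge URL")
    toolname, _, path = legacy_url[len(prefix):].partition("/")
    return f"https://{toolname}.toolforge.org/{path}"

# e.g. to_canonical("https://tools.wmflabs.org/openstack-browser/")
# → "https://openstack-browser.toolforge.org/"
```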
This domain change is something we have been working on for months prior
to this announcement. Part of our work has been to ensure a smooth
transition from the old domain (and URL scheme) to the new canonical one.
However, we acknowledge the ride might be bumpy for some folks, due to
technical challenges or cases we didn't consider when planning this
migration. Please reach out immediately if you find any limitation or
failure related to this change. The wikitech page also contains a section
covering common problems.
You can check now whether your webservice needs any specific change by
creating a temporary redirect to the new canonical URL:
$ webservice --canonical --backend=kubernetes start [..]
$ webservice --canonical --backend=gridengine start [..]
The --canonical switch creates a temporary redirect that you can turn
on/off. Please use this to check how your webservice behaves with the new
domain/URL scheme. If you start the webservice without --canonical, the
temporary redirect will be removed.
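To see what the redirect actually returns, you can fetch the legacy URL without following redirects and inspect the status code and Location header. This is a sketch using Python's standard http.client; the `redirect_target` helper is illustrative, not part of the webservice tooling:

```python
import http.client
from urllib.parse import urlsplit

def redirect_target(url):
    """Fetch `url` WITHOUT following redirects and return the pair
    (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    conn.request("GET", parts.path or "/")
    response = conn.getresponse()
    return response.status, response.getheader("Location")

# With --canonical enabled you would expect the legacy URL to answer
# with a 3xx status pointing at the new domain, e.g.:
#   redirect_target("https://tools.wmflabs.org/openstack-browser/")
```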
We aim to introduce permanent redirects for the legacy URLs on 2020-06-15. We
expect to keep serving legacy URLs forever, by means of redirections to the new
URLs. More information on the redirections can also be found in the wikitech page.
The toolforge.org domain is finally here! <3
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
Hi,
We just enabled email rate limiting on our MTA server [0] in Toolforge.
Please report any problem or issue you find related to this change.
The current limit is 100 messages per hour per sender address. We may tune the
value as we observe the behavior of the system and the users.
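The policy can be modeled as a sliding-window counter per sender address. The sketch below is only an illustration of the stated limit (100 per rolling hour per sender); the actual MTA configuration may implement it differently:

```python
import time
from collections import defaultdict, deque

class SenderRateLimiter:
    """Sliding-window model of the policy: at most `limit` messages per
    `window` seconds for each sender address. Illustrative only; the
    real MTA may implement the limit differently."""

    def __init__(self, limit=100, window=3600, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = defaultdict(deque)  # sender -> send timestamps

    def allow(self, sender):
        """Record one message from `sender`; False means 'over limit'."""
        now = self.clock()
        stamps = self._sent[sender]
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()  # drop sends older than the window
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True
```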
Regards,
[0] https://en.wikipedia.org/wiki/Message_transfer_agent
--
Arturo Borrero Gonzalez
SRE / Wikimedia Cloud Services
Wikimedia Foundation
Tomorrow (June 11th) at 1600 UTC, we will be failing over the primary
NFS server to do maintenance and upgrades on it. The secondary partner
in the cluster is already upgraded and ready, and recent changes
*should* make it a fairly straightforward failover with a brief period
of high load. If it doesn't proceed smoothly, there will be a slightly
longer period of high load and NFS lockup while the failover completes
(10-20 minutes or so). After maintenance the server will be failed back,
which will also, hopefully, be quick and painless.
--
Brooke Storm
SRE
Wikimedia Cloud Services
bstorm(a)wikimedia.org
IRC: bstorm_
Hello!
Next week we'll be rebuilding and upgrading the hardware that provides
DNS service to cloud-vps and toolforge. These rebuilds will start at
14:00 UTC and the whole process may take 2-3 hours. It's likely that DNS
lookups will be somewhat slower as clients fail over between the
in-progress and the working server. In theory there should be few other
user-facing effects from these upgrades.
In practice, though, this isn't something that we've done for quite a
while, and touching DNS is always risky since it underlies pretty much
everything. Here are some things to be ready for:
- As a precaution we'll be disabling Horizon during the window to
prevent new VMs or DNS changes landing in an inconsistent state.
- Some badly-behaved DNS clients won't fail over properly and will
report errors when their primary DNS server is down.
- Puppet will almost certainly experience transient failures, since
Puppet is known to be one of those badly-behaved clients.
- If things go very badly there may be periods of total DNS outage which
will result in many WMCS-hosted services failing. There's no particular
reason that this /should/ happen, but this is the worst-case scenario.
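One simple way to see whether your client is failing over slowly during the window is to time a lookup through the system resolver. The `timed_lookup` helper below is an illustrative sketch, not an official diagnostic:

```python
import socket
import time

def timed_lookup(host, resolver=socket.getaddrinfo):
    """Time one name lookup through the system resolver and return
    (seconds elapsed, resolver result). During the window, unusually
    long times suggest your client is waiting on the in-progress
    server before retrying the working one."""
    start = time.monotonic()
    result = resolver(host, None)
    return time.monotonic() - start, result

# Example (needs network):
#   elapsed, _ = timed_lookup("wikitech.wikimedia.org")
#   print(f"lookup took {elapsed:.2f}s")
```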
For additional context, the phabricator task for this work is
https://phabricator.wikimedia.org/T253780
- Andrew + the WMCS team
As the last release of Python 2 is finally out, the July release of
Pywikibot is going to be the **last release that supports Python 2**.
Support for Python 3.4 and for MediaWiki versions older than 1.19 is
also going to be dropped. After this release, Pywikibot will not receive
any further patches or bug fixes related to those Python and MediaWiki
versions. Functions and other code specific to Python 3.4, Python 2.x,
or MediaWiki older than 1.19 will be removed.
For your convenience, this release is marked with a "python2"
git tag and it is also the last 3.0.x release. In case you really need it,
the Pywikibot team created a /shared/pywikibot/core_python2 repository
in Toolforge and a python2-pywikibot package in the software
repositories of some operating systems.
The Pywikibot team strongly recommends that you migrate your scripts from
Python 2 to Python 3. The migration steps were described in the previous
message, which can be found here:
https://lists.wikimedia.org/pipermail/pywikibot/2020-January/009976.html
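As a reminder of what the migration typically involves, the snippet below collects a few Python 3 idioms that replace common Python 2 constructs; these are generic porting examples, not Pywikibot-specific APIs:

```python
# Python 3 idioms that commonly replace Python 2 constructs in bot scripts.

# print is a function, not a statement:
print("saving page")

# `/` is true division; use `//` for the old integer behaviour:
assert 7 / 2 == 3.5
assert 7 // 2 == 3

# text is unicode by default; bytes must be decoded explicitly:
title = b"Wikip\xc3\xa9dia".decode("utf-8")
assert title == "Wikipédia"

# dict.items() returns a view; wrap it in list() if you mutate or index:
params = {"action": "edit"}
assert list(params.items()) == [("action", "edit")]
```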
A detailed plan for the Python 2 deprecation, with dates, is described here:
https://www.mediawiki.org/wiki/Manual:Pywikibot/Compatibility
If you encounter any problems with the migration, you can always ask us
here: https://phabricator.wikimedia.org/T242120
Best regards,
Pywikibot team
At 2020-06-04T11:12 UTC a change was merged to the
operations/puppet.git repository which resulted in data loss for Cloud
VPS projects using a local Puppetmaster
(role::puppetmaster::standalone). Specifically, any commits local to
the Puppetmaster instance that were overlaid on the upstream
labs/private.git repository were removed. These patches would have
contained passwords, ssh keys, TLS certificates, and similar
authentication information for Puppet-managed configuration.
The majority of Cloud VPS projects are not affected by this
configuration data loss. Several highly used and visible projects,
including Toolforge (tools) and Beta Cluster (deployment-prep), have
some impact. We have disabled Puppet across all Cloud VPS instances
that were reachable by our central command and control service (cumin)
and are currently evaluating impact and recovering data from
/var/logs/puppet.log change logs where available.
More information will be collected at
<https://phabricator.wikimedia.org/T254491> and an incident report
will also be prepared once the initial response is complete.
Bryan
--
Bryan Davis
Technical Engagement, Wikimedia Foundation
Principal Software Engineer
Boise, ID USA
[[m:User:BDavis_(WMF)]] irc: bd808