Hello,
This is a brief and very unofficial note to say thanks for what
appears to be good reliability and performance of WMF's technical
infrastructure during the current global public health environment and
the swarms of contributing and reading activity on ENWP and elsewhere.
My guess is that there are many dashboards that show how site
performance and reliability are doing. At a convenient time, perhaps
for the upcoming issue of The Signpost, I'd be interested in seeing a
report regarding how the infrastructure is handling the situation.
Ever onward,
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
Hi all!
In response to COVID-19, we are putting in place stricter guidelines around
deployments with an emphasis on site reliability.
To support this change, we have the following guidelines for software
development:
- While we are not going to go into full emergency or holiday mode (i.e., no
  releases), we do think it is necessary to de-risk the deployment train by
  adding some additional scrutiny into the process. Our ask is that you take
  extra precautions as outlined in our deployment guidelines below. Most
  importantly, if you know you have limited availability to support a
  deployment, don’t put your code on the train. When in doubt, ask.
- Please review the COVID-19 deployment guidelines at
  https://wikitech.wikimedia.org/wiki/Deployments/Covid-19
- SWAT (emergency hot-fix) deploys will continue as is.
- We are limiting the frequency of onsite data center work to help
  minimize the exposure of our team members who travel in and out of our data
  center facilities. This will result in the general delay of hardware
  installations and repairs, though we will continue being immediately
  available for emergencies associated with uptime and critical
  redundancies. We are still finalizing what this means and will provide
  additional guidance when we have it.
Please err on the side of caution with the changes you merge.
Considerations (from the wikitech page)
- Can you roll back this change without lasting impact?
  - A recovery plan is required as this will help identify our capacity
    for recovering from the failure
  - THIS IS A KEY QUESTION, if you can’t answer it, you shouldn’t deploy
- Is specialized knowledge required to support this change in production?
  - Are there multiple people with this knowledge?
- Is there a way to increase confidence about the correctness of this
  change?
  - Reviews (Design, Code, etc)
  - Testing coverage (unit tests, integration tests)
  - Manual testing (e.g. Beta, vagrant, docker)
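For anyone who wants to run through the considerations above mechanically, here is a minimal Python sketch. It is only an illustration: the `ChangeReview` class, its field names, and the `deploy_concerns` function are hypothetical and are not part of any Wikimedia tooling, and reading "multiple people" as "at least two" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChangeReview:
    """Answers to the checklist questions for one change (hypothetical structure)."""
    can_roll_back: bool            # can it be rolled back without lasting impact?
    has_recovery_plan: bool        # is there a recovery plan?
    needs_special_knowledge: bool  # does production support need specialized knowledge?
    people_with_knowledge: int     # how many people have that knowledge?
    reviewed: bool                 # has it had design or code review?
    tested: bool                   # has it had automated or manual testing?


def deploy_concerns(review):
    """Return reasons to hold the change; an empty list means none were found."""
    concerns = []
    if not review.can_roll_back:
        concerns.append("no clean rollback")
    if not review.has_recovery_plan:
        concerns.append("no recovery plan")
    # Assumption: "multiple people" is read as at least two.
    if review.needs_special_knowledge and review.people_with_knowledge < 2:
        concerns.append("specialized knowledge held by fewer than two people")
    if not (review.reviewed or review.tested):
        concerns.append("no review or testing to build confidence")
    return concerns


# Example: a reviewed change with a rollback path that only one person can support.
example = ChangeReview(True, True, True, 1, True, False)
print(deploy_concerns(example))
```

An empty result does not mean a change is safe; it only means none of the listed questions raised a flag. When in doubt, ask.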
We’re hosting office hours on Mondays at 17:00 UTC in #wikimedia-office
where you can ask questions regarding what is a good choice vs not.
Thank you all in advance for your understanding and empathy over the next
few weeks.
<3
-- Your Local (Internet) Neighborhood Release Engineers
As mentioned earlier on the xmldatadumps-l, the dumps are running very slow
this month, since the vslow db hosts they use are also serving live traffic
during a tables migration. Even manual runs of partial jobs would not help
the situation any, so there will be NO SECOND DUMP RUN THIS MONTH. The
March 1 Wikidata run is still in process but it should complete in the next
several days.
With any luck everything will be back to normal in April and we'll be able
to conduct two runs as usual from then on.
Ariel
Hello,
On Tuesday the 17th at 09:00 AM UTC we will be restarting db1132 as part
of T239791.
We will take the opportunity to upgrade MySQL there.
m2 holds the following databases:
- OTRS
- Recommendations API
- Reviewdb (gerrit)
- Debmonitor
We expect read-only time to be around 30-60 seconds. Gerrit might also be
unavailable.
Coordination will happen on #wikimedia-operations
Thanks!
Manuel.
The 1.35.0-wmf.23 version of MediaWiki is still blocked[0].
Neither this version nor the subsequent 1.35.0-wmf.24 can proceed until
this issue is resolved:
* T247562: Warning: Memcached::setMulti(): failed to set key
global:segment:...
- https://phabricator.wikimedia.org/T247562
Thanks to the many folks who have contributed debugging on T247562 so
far. If anyone else has any insight, further input would certainly
be appreciated.
-- Your erratic train enabler
[0]. <https://phabricator.wikimedia.org/T233871>
The 1.35.0-wmf.23 version of MediaWiki is still blocked[0].
The new version can't proceed to all wikis [1] until this issue is resolved:
* T247562: Warning: Memcached::setMulti(): failed to set key
global:segment:...
- https://phabricator.wikimedia.org/T247562
This was generating a large spike of warnings of the following form:
ErrorException from line 340 of
/srv/mediawiki/php-1.35.0-wmf.23/includes/libs/objectcache/MemcachedPeclBagOStuff.php:
PHP Warning: Memcached::setMulti(): failed to set key
global:segment:enwiki%3Apcache%3Aidhash%3A23309859-0!canonical:ce9eb2174b45be4c0a2966f5bfbf5f045e3b6388
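The key in these warnings is percent-encoded, which makes it hard to read at a glance. As a rough aid, here is a small Python sketch that pulls the key out of a warning line and decodes it. The helper name and the regular expression are illustrative assumptions, not part of MediaWiki, and the pattern is inferred only from the single sample above.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern, inferred from the sample warning above.
KEY_RE = re.compile(r"failed to set key (\S+)")


def decode_failed_key(message):
    """Return the percent-decoded key named in a warning line, or None."""
    match = KEY_RE.search(message)
    if match is None:
        return None
    return unquote(match.group(1))


sample = (
    "PHP Warning: Memcached::setMulti(): failed to set key "
    "global:segment:enwiki%3Apcache%3Aidhash%3A23309859-0!canonical:"
    "ce9eb2174b45be4c0a2966f5bfbf5f045e3b6388"
)
print(decode_failed_key(sample))
```

For the sample, the decoded key begins `global:segment:enwiki:pcache:idhash:23309859-0!canonical:`.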
Thanks for any help on this issue.
-- Your discombobulated train trundler
[0]. <https://phabricator.wikimedia.org/T233871>
[1]. <https://tools.wmflabs.org/versions/>
Hi!
I am Vivek Malhan from Ludhiana, Punjab, India.
Currently I am pursuing a B.Tech in Computer Science and Engineering. I
am new to open source and I want to contribute to your organization.
I am comfortable working with JavaScript, PHP, Python, Java,
C and C++. This year I want to participate in GSoC 2020 under
Wikimedia. I have already installed and run your software on my
device. As I am new to GSoC, could you please guide me on how I can
contribute to your organization?
Thank you.
Regards
--
Vivek Malhan
https://github.com/VivekMalhan666
https://bethenewwyou.wordpress.com/
The 1.35.0-wmf.23 version of MediaWiki is blocked[0].
The new version can't proceed to group2 [1] until this issue is resolved:
* T247562: Warning: Memcached::setMulti(): failed to set key
global:segment:...
- https://phabricator.wikimedia.org/T247562
Assuming a fix, the train will likely resume Monday, March 16th.
Thank you for your help resolving this issue!
-- Your harried train functionary
[0]. <https://phabricator.wikimedia.org/T233871>
[1]. <https://tools.wmflabs.org/versions/>