[Labs-l] [Labs-announce] Partial labs downtime Wednesday, 2015-08-12, 15:00 UTC: Reboot of labvirt1001

Maximilian Doerr maximilian.doerr at gmail.com
Mon Aug 10 21:41:55 UTC 2015


How will this affect Cyberbot's continuous scripts?

Cyberpower678
English Wikipedia Account Creation Team
Mailing List Moderator

> On Aug 10, 2015, at 17:33, Merlijn van Deen <valhallasw at arctus.nl> wrote:
> 
> For Tool Labs, the plan is as follows:
>   - tomorrow, we will disable the queue so no new tasks will be distributed to the affected hosts
>   - an hour later, we will send an e-mail listing the tasks that are still running
> 
> Unfortunately, there is currently no host that can run jobs that take longer than a few days, because other virt* hosts will also be rebooted this week.
> 
> For reference, the current long-running jobs on these hosts are listed below, grouped by user name. Please take a look and consider whether the jobs are still doing something useful -- and if not, please kill them (qdel <job id>).
> 
> Merlijn
> 
> 
> 
> Columns:
> 
> job id       name       start date/time
> 
> aka
> ---------------
> 1317747 start Sat Aug  1 19:17:12 2015
> 
> tools.checkwiki
> ---------------
> 145845 eswiki-munch Thu Jun 25 05:00:13 2015
> 818559 arwiki-munch Sat Jul 18 05:00:16 2015
> 
> tools.dexbot
> ---------------
> 1236997 del Thu Jul 30 13:36:09 2015
> 1341699 kian_new2 Sun Aug  2 11:03:18 2015
> 
> tools.gpy
> ---------------
> 527733 gpy Thu Jul  9 01:14:28 2015
> 
> tools.luke081515bot
> ---------------
> 1346744 queue Sun Aug  2 14:24:31 2015
> 
> tools.mjbmrbot
> ---------------
> 209254 lgdcp2_1 Sat Jun 27 15:35:04 2015
> 273994 lgdcp2_2 Tue Jun 30 02:00:07 2015
> 345013 lgdcp2_3 Thu Jul  2 15:00:05 2015
> 807548 lsdcp2_3 Fri Jul 17 21:00:12 2015
> 1092477 lgdcp1_4 Sun Jul 26 14:00:07 2015
> 1093960 lsdcp1_4 Sun Jul 26 15:00:10 2015
> 
> tools.shuaib-bot
> ---------------
> 1622344 translator Mon Aug 10 02:10:09 2015
> 
> tools.wikidata-exports
> ---------------
> 694469 create_dumps Tue Jul 14 08:40:22 2015
> 735030 create_dumps Wed Jul 15 14:31:25 2015
> 768842 create_dumps Thu Jul 16 16:12:52 2015
> 
> 
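
To check whether any of your own jobs appear in the listing above, and how long they have been running, a small script along these lines may help. It is only a sketch: the names parse_jobs, LISTING and REFERENCE are illustrative and not part of any Tool Labs tooling. It assumes the quote markers have been stripped from the pasted text and that day and month names are parsed in an English locale.

```python
from datetime import datetime

# A few entries copied from the listing above, with the "> " quote markers removed.
LISTING = """\
tools.checkwiki
---------------
145845 eswiki-munch Thu Jun 25 05:00:13 2015
818559 arwiki-munch Sat Jul 18 05:00:16 2015

tools.gpy
---------------
527733 gpy Thu Jul  9 01:14:28 2015
"""

# Assumed reference time: when the listing was sent (2015-08-10, 17:33).
REFERENCE = datetime(2015, 8, 10, 17, 33)


def parse_jobs(text):
    """Return {owner: [(job_id, name, start), ...]} parsed from the listing format."""
    jobs, owner = {}, None
    for line in text.splitlines():
        parts = line.split()
        if not parts or set(line.strip()) == {"-"}:
            continue  # blank line or dashed separator
        if len(parts) == 1:
            owner = parts[0]  # a lone word is treated as a group heading
            jobs[owner] = []
        elif owner is not None and parts[0].isdigit() and len(parts) >= 7:
            # Fields: job id, job name, then weekday, month, day, time, year.
            start = datetime.strptime(" ".join(parts[2:7]), "%a %b %d %H:%M:%S %Y")
            jobs[owner].append((int(parts[0]), parts[1], start))
    # Drop headings that turned out to have no jobs under them.
    return {o: j for o, j in jobs.items() if j}


jobs = parse_jobs(LISTING)
for owner, entries in jobs.items():
    for job_id, name, start in entries:
        age_days = (REFERENCE - start).days
        print(f"{owner}: job {job_id} ({name}) has been running about {age_days} days")
```

The script only reports; whether a job is still doing something useful is a decision for the tool's maintainer before running qdel on it.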
> 
>> On 10 August 2015 at 21:20, Andrew Bogott <abogott at wikimedia.org> wrote:
>> On Wednesday I'll be rebooting labvirt1001.  This will cause downtime for about 10% of labs instances, and this downtime may last as long as 60 minutes (although the average downtime will be much shorter).
>> 
>> We will do our best to juggle and reschedule ToolLabs jobs, but persistent jobs that cannot gracefully restart may be interrupted and require your personal attention.
>> 
>> Here is the list of instances that will be affected by this reboot:
>> 
>> | citoidtest                    | ACTIVE  | -          | Running     | public=10.68.16.182                 |
>> | conf                          | ACTIVE  | -          | Running     | public=10.68.18.87, 208.80.155.233  |
>> | deployment-bastion            | ACTIVE  | -          | Running     | public=10.68.16.58, 208.80.155.191  |
>> | deployment-cache-text02       | ACTIVE  | -          | Running     | public=10.68.16.16                  |
>> | deployment-elastic08          | ACTIVE  | -          | Running     | public=10.68.17.188                 |
>> | deployment-memc03             | ACTIVE  | -          | Running     | public=10.68.16.15                  |
>> | deployment-parsoid05          | ACTIVE  | -          | Running     | public=10.68.16.120                 |
>> | deployment-pdf01              | ACTIVE  | -          | Running     | public=10.68.16.73                  |
>> | deployment-restbase01         | ACTIVE  | -          | Running     | public=10.68.17.227                 |
>> | deployment-salt               | ACTIVE  | -          | Running     | public=10.68.16.99                  |
>> | deployment-urldownloader      | ACTIVE  | -          | Running     | public=10.68.16.135                 |
>> | diffengine                    | ACTIVE  | -          | Running     | public=10.68.17.127                 |
>> | educationdashboard-i18n       | SHUTOFF | -          | Shutdown    | public=10.68.16.235                 |
>> | ee-flow-extra                 | ACTIVE  | -          | Running     | public=10.68.16.102                 |
>> | etcd01                        | ACTIVE  | -          | Running     | public=10.68.16.130                 |
>> | etcd03                        | ACTIVE  | -          | Running     | public=10.68.16.132                 |
>> | firstinstance                 | SHUTOFF | -          | NOSTATE     | public=10.68.16.212                 |
>> | graphite-trusty               | ACTIVE  | -          | Running     | public=10.68.17.181                 |
>> | huggle-d2                     | ACTIVE  | -          | Running     | public=10.68.17.194                 |
>> | icinga                        | ACTIVE  | -          | Running     | public=10.68.16.195                 |
>> | integration-raita             | ACTIVE  | -          | Running     | public=10.68.16.53                  |
>> | integration-slave-trusty-1013 | ACTIVE  | -          | Running     | public=10.68.18.28                  |
>> | integration-slave-trusty-1015 | ACTIVE  | -          | Running     | public=10.68.18.30                  |
>> | k8s-worker-02                 | ACTIVE  | -          | Running     | public=10.68.18.91                  |
>> | kartotherian1                 | ACTIVE  | -          | Running     | public=10.68.16.117                 |
>> | language-replag-slave         | SHUTOFF | -          | Shutdown    | public=10.68.16.248                 |
>> | maps-tiles2                   | ACTIVE  | -          | Running     | public=10.68.17.110                 |
>> | mobile-browser-tests          | ACTIVE  | -          | Running     | public=10.68.16.149                 |
>> | mwreview-proxy-test           | ACTIVE  | -          | Running     | public=10.68.16.83                  |
>> | osmit-cruncher1               | ACTIVE  | -          | Running     | public=10.68.17.92                  |
>> | puppet-jmm-debdeploy-precise  | ACTIVE  | -          | Running     | public=10.68.18.106                 |
>> | puppet-mailman                | ACTIVE  | -          | Running     | public=10.68.17.177                 |
>> | sentry-builder                | ACTIVE  | -          | Running     | public=10.68.18.82                  |
>> | staging-eventlogging          | ACTIVE  | -          | Running     | public=10.68.16.199                 |
>> | staging-ms-be03               | ACTIVE  | -          | Running     | public=10.68.17.249                 |
>> | staging-rdb01                 | ACTIVE  | -          | Running     | public=10.68.17.193                 |
>> | staging-tin                   | ACTIVE  | -          | Running     | public=10.68.16.110                 |
>> | stashbot-logstash             | ACTIVE  | -          | Running     | public=10.68.18.101                 |
>> | tools-bastion-02              | ACTIVE  | -          | Running     | public=10.68.16.44, 208.80.155.132  |
>> | tools-exec-1201               | ACTIVE  | -          | Running     | public=10.68.17.49, 208.80.155.203  |
>> | tools-exec-1202               | ACTIVE  | -          | Running     | public=10.68.16.57, 208.80.155.211  |
>> | tools-exec-1204               | ACTIVE  | -          | Running     | public=10.68.17.88, 208.80.155.213  |
>> | tools-exec-1206               | ACTIVE  | -          | Running     | public=10.68.17.105, 208.80.155.215 |
>> | tools-exec-1209               | ACTIVE  | -          | Running     | public=10.68.17.129, 208.80.155.218 |
>> | tools-exec-1213               | ACTIVE  | -          | Running     | public=10.68.17.252, 208.80.155.222 |
>> | tools-exec-1217               | ACTIVE  | -          | Running     | public=10.68.18.20, 208.80.155.226  |
>> | tools-exec-1218               | ACTIVE  | -          | Running     | public=10.68.18.19, 208.80.155.227  |
>> | tools-exec-1408               | ACTIVE  | -          | Running     | public=10.68.18.14, 208.80.155.152  |
>> | tools-exec-cyberbot           | ACTIVE  | -          | Running     | public=10.68.16.39                  |
>> | tools-webgrid-generic-1404    | ACTIVE  | -          | Running     | public=10.68.18.53                  |
>> | tools-webgrid-lighttpd-1409   | ACTIVE  | -          | Running     | public=10.68.18.43                  |
>> | tools-webgrid-lighttpd-1410   | ACTIVE  | -          | Running     | public=10.68.18.44                  |
>> | toolsbeta-exec-101            | ACTIVE  | -          | Running     | public=10.68.16.7                   |
>> | toolsbeta-exec-201            | ACTIVE  | -          | Running     | public=10.68.16.250                 |
>> | wikidata-mobile               | ACTIVE  | -          | Running     | public=10.68.18.41                  |
>> | wikispy                       | ACTIVE  | -          | Running     | public=10.68.17.119                 |
>> | wlmjurytool2014               | ACTIVE  | -          | Running     | public=10.68.17.134                 |
>> | wmt-exec                      | ACTIVE  | -          | Running     | public=10.68.17.236                 |
>> 
>> 
>> _______________________________________________
>> Labs-announce mailing list
>> Labs-announce at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-announce
>> 
>> _______________________________________________
>> Labs-l mailing list
>> Labs-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-l