[01:35:03] (PS2) Huji: Change AbuseFilter block duration for fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/358156 (https://phabricator.wikimedia.org/T167562)
[02:19:58] !log l10nupdate@tin scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 04s)
[02:20:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:26:06] !log l10nupdate@tin ResourceLoader cache refresh completed at Mon Jun 19 02:26:06 UTC 2017 (duration 6m 8s)
[02:26:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:15:46] (PS5) Reedy: [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:17:23] (CR) jerkins-bot: [V: -1] [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:27:30] (PS6) Reedy: [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:29:35] (CR) jerkins-bot: [V: -1] [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:37:41] (PS7) Reedy: [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:39:27] (CR) jerkins-bot: [V: -1] [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:42:19] (PS8) Reedy: [WIP] Add composer test for coding standards, and try to pass
[mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:44:01] (CR) jerkins-bot: [V: -1] [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:52:45] (PS9) Reedy: [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:53:49] jouncebot: next
[03:53:49] In 9 hour(s) and 6 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T1300)
[03:54:50] (CR) jerkins-bot: [V: -1] [WIP] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:57:18] (PS10) Reedy: Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:58:56] (CR) Reedy: [C: 2] rm SMW comment [mediawiki-config] - https://gerrit.wikimedia.org/r/359887 (owner: Reedy)
[03:59:55] (CR) jerkins-bot: [V: -1] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[03:59:56] (Merged) jenkins-bot: rm SMW comment [mediawiki-config] - https://gerrit.wikimedia.org/r/359887 (owner: Reedy)
[04:00:20] (CR) jenkins-bot: rm SMW comment [mediawiki-config] - https://gerrit.wikimedia.org/r/359887 (owner: Reedy)
[04:00:22] (CR) Reedy: [C: 2] Minor code style [mediawiki-config] - https://gerrit.wikimedia.org/r/359888 (owner: Reedy)
[04:00:24] (PS2) Reedy: Minor code style
[mediawiki-config] - https://gerrit.wikimedia.org/r/359888
[04:05:25] !log reedy@tin Synchronized wmf-config/CommonSettings.php: Fix comments minor code style (duration: 00m 42s)
[04:05:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:05:47] (CR) jenkins-bot: Minor code style [mediawiki-config] - https://gerrit.wikimedia.org/r/359888 (owner: Reedy)
[04:09:29] (PS11) Reedy: Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:11:38] (CR) Reedy: Add composer test for coding standards, and try to pass (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:12:14] (CR) jerkins-bot: [V: -1] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:15:25] (PS2) Reedy: Update composer dependancies [mediawiki-config] - https://gerrit.wikimedia.org/r/359891
[04:15:25] (PS12) Reedy: Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:15:56] (CR) Reedy: [C: 2] Update composer dependancies [mediawiki-config] - https://gerrit.wikimedia.org/r/359891 (owner: Reedy)
[04:17:22] (Merged) jenkins-bot: Update composer dependancies [mediawiki-config] - https://gerrit.wikimedia.org/r/359891 (owner: Reedy)
[04:18:00] (CR) jenkins-bot: Update composer dependancies [mediawiki-config] - https://gerrit.wikimedia.org/r/359891 (owner: Reedy)
[04:18:02] (CR) jerkins-bot: [V: -1] Add composer test for coding standards, and try to pass [mediawiki-config] - https://gerrit.wikimedia.org/r/271936
(https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:19:28] !log reedy@tin Synchronized multiversion/vendor/: Update! (duration: 01m 05s)
[04:19:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:20:32] !log reedy@tin Synchronized composer.json: update (duration: 00m 41s)
[04:20:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:21:21] !log reedy@tin Synchronized composer.lock: update (duration: 00m 41s)
[04:21:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:23:07] (CR) Reedy: "Ok, so why is jerkins still bitching?" [mediawiki-config] - https://gerrit.wikimedia.org/r/271936 (https://phabricator.wikimedia.org/T162835) (owner: Jforrester)
[04:58:37] Operations, Wikimedia-General-or-Unknown, Tor, WorkType-NewFunctionality: Run our own Tor client for Tor block - https://phabricator.wikimedia.org/T32716#3358852 (Reedy)
[05:00:08] Operations, monitoring, Tor: Icinga check for Tor - https://phabricator.wikimedia.org/T148614#3358857 (Reedy)
[05:29:42] (CR) Urbanecm: [C: 1] "Would take care about it in the next EU SWAT, LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: Multichill)
[05:51:33] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3358878 (Marostegui) >>! In T168109#3358211, @alanajjar wrote: > @Marostegui can we doing it now? Hi, Sorry, I wasn't around on Sunday :-) Ping me today...
[05:58:12] (PS1) Marostegui: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/359901 (https://phabricator.wikimedia.org/T166205)
[06:07:05] Operations, Icinga, monitoring, Tor: Icinga check for Tor - https://phabricator.wikimedia.org/T148614#3358897 (Peachey88)
[06:15:17] (CR) Marostegui: [C: 2] db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/359901 (https://phabricator.wikimedia.org/T166205) (owner: Marostegui)
[06:16:49] (Merged) jenkins-bot: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/359901 (https://phabricator.wikimedia.org/T166205) (owner: Marostegui)
[06:16:57] (CR) jenkins-bot: db-eqiad.php: Depool db1021 [mediawiki-config] - https://gerrit.wikimedia.org/r/359901 (https://phabricator.wikimedia.org/T166205) (owner: Marostegui)
[06:18:46] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Depool db1021 - T166205 (duration: 00m 41s)
[06:18:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:18:55] T166205: Convert unique keys into primary keys for some wiki tables on s2 - https://phabricator.wikimedia.org/T166205
[06:23:08] !log Deploy alter table on s2 - db1021 - T166205
[06:23:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:05] !log force learning cycle to db1046 controller T166141
[06:37:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:16] T166141: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141
[06:38:06] !log Deploy alter table s2 - labsdb1001 - T166205
[06:38:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:38:14] T166205: Convert unique keys into primary keys for some wiki tables on s2 - https://phabricator.wikimedia.org/T166205
[06:46:17] (PS1) Jcrespo: mariadb: Depool pc2005 and pc2006 for maintenance [mediawiki-config] -
https://gerrit.wikimedia.org/r/359904 (https://phabricator.wikimedia.org/T167784)
[06:50:28] (CR) Jcrespo: [C: 2] mariadb: Depool pc2005 and pc2006 for maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359904 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[06:58:23] !log installing gnutls security updates
[06:58:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:02:23] (Merged) jenkins-bot: mariadb: Depool pc2005 and pc2006 for maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359904 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[07:03:50] !log jynus@tin Synchronized wmf-config/db-codfw.php: Depool pc2005 & pc2006 (duration: 00m 41s)
[07:03:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:05:17] !log upgrade, reboot and clear data on pc2005
[07:05:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:09:11] !log upgrade, reboot and clear data on pc2006
[07:09:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:10:52] !log Deploy alter table s5 - codfw master - db2023 (and will replicate) so this will generate lag on codfw slaves - T166207
[07:11:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:11:01] T166207: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207
[07:13:50] (CR) jenkins-bot: mariadb: Depool pc2005 and pc2006 for maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359904 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[07:13:57] !log Reboot ms-be1010
[07:14:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:40:48] (PS1) Jcrespo: mariadb: Repool pc2004,5,6 after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359905 (https://phabricator.wikimedia.org/T167784)
[07:42:05] !log
restarting app server canaries to pick up gnutls update
[07:42:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:56] (CR) Jcrespo: [C: 2] mariadb: Repool pc2004,5,6 after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359905 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[07:58:03] (Merged) jenkins-bot: mariadb: Repool pc2004,5,6 after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359905 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[07:59:48] !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool pc2004,5,6 after maintenance (duration: 00m 41s)
[07:59:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:01:11] (CR) jenkins-bot: mariadb: Repool pc2004,5,6 after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/359905 (https://phabricator.wikimedia.org/T167784) (owner: Jcrespo)
[08:09:42] Operations, Discovery, Traffic, Wikidata, and 2 others: runUpdate.sh script in wikidata stand-alone has abruptly started incurring numerous 429 errors. - https://phabricator.wikimedia.org/T168019#3359140 (ema) Open>Resolved a:ema No `Wikidata Query Service Updater` requests have been...
[08:14:11] (PS2) Alexandros Kosiaris: Remove Rubocop exception for non-existent file [puppet] - https://gerrit.wikimedia.org/r/359477 (owner: Faidon Liambotis)
[08:14:16] (CR) Alexandros Kosiaris: [V: 2 C: 2] Remove Rubocop exception for non-existent file [puppet] - https://gerrit.wikimedia.org/r/359477 (owner: Faidon Liambotis)
[08:16:41] !log Drop table titlekey on s6 - T164949
[08:16:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:16:51] T164949: Drop titlekey table from all wmf databases - https://phabricator.wikimedia.org/T164949
[08:24:40] (PS1) ArielGlenn: script to batch 7z recompress revision content history files manually [dumps] - https://gerrit.wikimedia.org/r/359907 (https://phabricator.wikimedia.org/T168223)
[08:27:19] Operations, Discovery, Elasticsearch, Discovery-Search (Current work), Patch-For-Review: Elasticsearch errors about BulkShardRequest - https://phabricator.wikimedia.org/T167091#3359179 (Gehel) @dcausse, @EBernhardson: this error is now filtered in the logs. Do we want to address the root caus...
[08:32:40] Operations, ops-eqiad, DBA: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3359207 (elukey) >>! In T159266#3351373, @Ottomata wrote: > @elukey might have other opinions, but I'm inclined to try our best to expedite the ordering of new hard...
[08:34:57] Operations, ops-eqiad, DBA: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3359235 (Marostegui) stalled>Resolved Let's close this then for now as nothing will be done at this point (and I agree with what you guys think - not worth)
[08:35:22] !log Drop table title key from s2 - T164949
[08:35:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:35:33] T164949: Drop titlekey table from all wmf databases - https://phabricator.wikimedia.org/T164949
[08:43:44] Operations, ArchCom-RfC, Traffic, Services (designing): Make API usage limits easier to understand, implement, and more adaptive to varying request costs / concurrency limiting - https://phabricator.wikimedia.org/T167906#3359251 (Marostegui) p:Triage>Normal
[08:44:48] Operations, DNS, Traffic: en.wiki domain owned by us, but isn't hosted by us?? - https://phabricator.wikimedia.org/T167060#3359252 (Marostegui) p:Triage>Low
[08:45:42] Operations, DBA, cloud-services-team, Patch-For-Review, Wikimedia-log-errors: Terbium cronjobs attempting to connect to labstestweb2001 - https://phabricator.wikimedia.org/T167961#3359253 (jcrespo)
[08:46:55] Operations, DBA, cloud-services-team, Patch-For-Review, Wikimedia-log-errors: Terbium cronjobs attempting to connect to labstestweb2001 - https://phabricator.wikimedia.org/T167961#3351232 (jcrespo) I have added 2 temporary accounts to connect from terbium and wasat as the admin users.
[09:07:29] !log restart ircecho on einsteinium, was not notifying due to a thrown exception
[09:07:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:08:32] Operations, ops-eqiad, User-fgiunchedi: Debug HP raid cache disabled errors on ms-be1019/20/21 - https://phabricator.wikimedia.org/T163777#3359322 (fgiunchedi) @Cmjohnson today sounds good, ping me here or on IRC
[09:10:57] (PS1) Alexandros Kosiaris: Renumber neon.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359908 (https://phabricator.wikimedia.org/T162040)
[09:10:59] (PS1) Alexandros Kosiaris: Renumber etcd100{2,3,4,5,6}.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359909 (https://phabricator.wikimedia.org/T162040)
[09:11:01] (PS1) Alexandros Kosiaris: Renumber chlorine.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359910 (https://phabricator.wikimedia.org/T162040)
[09:11:04] RECOVERY - DPKG on achernar is OK: All packages OK
[09:11:33] !log upgrading achernar's BIOS from 1.2.4 to 2.4.2 hoping it will address recurring CPU throttling issue (T162850)
[09:11:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:11:41] T162850: CPU throttling on DELL PowerEdge R320 - https://phabricator.wikimedia.org/T162850
[09:12:15] (CR) Hashar: "I did review your patch and haven't noticed the "stop" either :-]" [puppet] - https://gerrit.wikimedia.org/r/358959 (https://phabricator.wikimedia.org/T146464) (owner: Hashar)
[09:13:47] !log rebooting achernar to address CPU throttling and apply the BIOS update
[09:13:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:14:50] Operations, User-fgiunchedi: Decommission ms-be1001 - ms-be1012 - https://phabricator.wikimedia.org/T166489#3359347 (fgiunchedi) @Dzahn almost, I'm running the last swift ring rebalance today. ETA is two/three days, I'll update/reassign this task once the machines are good to decom!
[09:15:44] PROBLEM - Host achernar is DOWN: PING CRITICAL - Packet loss = 100%
[09:16:34] PROBLEM - eventlogging-service-eventbus endpoints health on kafka2003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:17:14] PROBLEM - puppet last run on maps-test2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[09:17:30] maybe they didn't like that achenar went away
[09:17:34] RECOVERY - eventlogging-service-eventbus endpoints health on kafka2003 is OK: All endpoints are healthy
[09:18:32] !log swift eqiad-prod: remove ms-be1001 - ms-be1012 - T166489
[09:18:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:18:41] T166489: Decommission ms-be1001 - ms-be1012 - https://phabricator.wikimedia.org/T166489
[09:21:30] elukey: this happened another time IIRC, not sure if a task was filed
[09:21:43] and/or the hosts/services are the same
[09:22:35] yep I remember
[09:22:45] but I think it was for restbase hosts
[09:23:01] this one is eventlogging/eventbus
[09:23:08] (PS2) Alexandros Kosiaris: Renumber neon.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359908 (https://phabricator.wikimedia.org/T162040)
[09:23:10] (PS2) Alexandros Kosiaris: Renumber etcd100{2,3,4,5}.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359909 (https://phabricator.wikimedia.org/T162040)
[09:23:12] (PS2) Alexandros Kosiaris: Renumber chlorine.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359910 (https://phabricator.wikimedia.org/T162040)
[09:23:35] Operations, ops-eqiad, Analytics-Kanban, DBA, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3359355 (Marostegui) @elukey looks like the BBU is now almost completely dead. After Jaime's relearn attempt, almost 3 hours ago the battery status hasn't changed: ``...
[09:25:41] akosiaris: was that you? ^^^
[09:25:53] ?
[09:26:02] icinga-wm quit
[09:26:16] nope
[09:26:26] Operations, Patch-For-Review: CPU throttling on DELL PowerEdge R320 - https://phabricator.wikimedia.org/T162850#3177068 (faidon) This happened again today with achernar. This is clearly a hardware/firmware issue; however, all of the servers listed in the task description (including achernar) run an outda...
[09:26:32] Active: active (running) since Mon 2017-06-19 09:03:39 UTC; 22min ago
[09:26:40] yeah, I was looking now
[09:27:26] it looks like the tcp connection is established
[09:28:26] looks like it's working fine
[09:28:29] the process I mean
[09:28:44] strace say's it's on the usual select loop
[09:28:47] weird
[09:29:56] nothing on tcpdump though
[09:30:02] just reconnected
[09:30:13] with the IRC server IP
[09:30:32] it changed IRC server, might be unrelated
[09:30:33] !log Deploy alter table on s2 - dbstore1001 - T166205
[09:30:37] yeah
[09:30:41] and that server died for real :D
[09:30:41] that's what I am thinking too
[09:30:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:30:43] just bad timing
[09:30:43] T166205: Convert unique keys into primary keys for some wiki tables on s2 - https://phabricator.wikimedia.org/T166205
[09:30:55] sorry for the bothering ;)
[09:31:13] no worries
[09:31:44] we lost the notification for mr1-esams.oob
[09:31:46] down
[09:32:14] RECOVERY - Host mr1-esams.oob is UP: PING OK - Packet loss = 0%, RTA = 81.64 ms
[09:32:24] PROBLEM - IPv4 ping to eqiad on ripe-atlas-eqiad is CRITICAL: CRITICAL - failed 79 probes of 433 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[09:33:34] !log temporarily stop dbstore1002:s3 and db1015 to fix srwiki
[09:33:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:34:28] (PS1) ArielGlenn: no more printing of pid, return code when running a command [dumps] - https://gerrit.wikimedia.org/r/359912
[09:35:30] (CR) ArielGlenn: [C: 2] no more printing
of pid, return code when running a command [dumps] - https://gerrit.wikimedia.org/r/359912 (owner: ArielGlenn)
[09:35:49] (PS1) Alexandros Kosiaris: neon: Add IPv6 mapped address [puppet] - https://gerrit.wikimedia.org/r/359913
[09:36:56] (CR) Alexandros Kosiaris: [V: 2 C: 2] neon: Add IPv6 mapped address [puppet] - https://gerrit.wikimedia.org/r/359913 (owner: Alexandros Kosiaris)
[09:37:24] RECOVERY - IPv4 ping to eqiad on ripe-atlas-eqiad is OK: OK - failed 0 probes of 433 (alerts on 19) - https://atlas.ripe.net/measurements/1790945/#!map
[09:39:58] Operations, ops-eqiad, Analytics, Analytics-Cluster: rack/setup/install new kafka nodes - https://phabricator.wikimedia.org/T167992#3359422 (elukey) My preference would be either `jumbo` or `aggregate` (the latter sounds better probably)
[09:40:54] RECOVERY - NTP peers on achernar is OK: NTP OK: Offset 0.005369 secs
[09:45:14] RECOVERY - puppet last run on maps-test2001 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[09:46:41] (PS4) DCausse: [WIP] Add ltr-query 0.1.1 snapshot [software/elasticsearch/plugins] - https://gerrit.wikimedia.org/r/359359 (owner: EBernhardson)
[09:52:54] Operations, Commons, Multimedia, Traffic, and 2 others: Disable serving unpatrolled new files to Wikipedia Zero users - https://phabricator.wikimedia.org/T167400#3331800 (fgiunchedi) >>! In T167400#3355774, @Bawolff wrote: >>>! In T167400#3354974, @BBlack wrote: >> However, this carries the cave...
[09:55:32] (PS3) Alexandros Kosiaris: Renumber neon.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359908 (https://phabricator.wikimedia.org/T162040)
[09:55:34] (PS3) Alexandros Kosiaris: Renumber etcd100{2,3,4,5}.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359909 (https://phabricator.wikimedia.org/T162040)
[09:55:36] (PS3) Alexandros Kosiaris: Renumber chlorine.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359910 (https://phabricator.wikimedia.org/T162040)
[09:55:55] !log restarting elasticsearch on relforge1* to pickup new snapshot of the ltr plugin
[09:55:58] (CR) Thiemo Mättig (WMDE): [C: 1] "This is blocking our review column for about a week now. :-(" [mediawiki-config] - https://gerrit.wikimedia.org/r/359135 (https://phabricator.wikimedia.org/T167126) (owner: Lucas Werkmeister (WMDE))
[09:56:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:56:47] !log migrate neon.eqiad.wmnet to ganeti01.svc.eqiad.wmnet's row_A nodegroup
[09:56:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:57:00] Operations, netops: codfw row D switch upgrade - https://phabricator.wikimedia.org/T167274#3359531 (Marostegui) Is this happening tomorrow then?
[10:04:51] Operations, Commons, Multimedia, Traffic, and 2 others: Disable serving unpatrolled new files to Wikipedia Zero users - https://phabricator.wikimedia.org/T167400#3359557 (Bawolff) > re: the file header I'm assuming it'd be set for both the original and carried over to thumbnails, and purged from...
[10:07:54] PROBLEM - MegaRAID on db1046 is CRITICAL: CRITICAL: 1 LD(s) must have write cache policy WriteBack, currently using: WriteThrough
[10:13:54] marostegui: --^ shall we apply an override in hiera for this?
[10:15:42] !log roll-upgrade swift to 2.10 on to ms-fe1* - T162609
[10:15:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:15:51] T162609: Swift version and distro upgrade - https://phabricator.wikimedia.org/T162609
[10:17:11] (CR) Alexandros Kosiaris: [C: 2] Renumber neon.eqiad.wmnet [dns] - https://gerrit.wikimedia.org/r/359908 (https://phabricator.wikimedia.org/T162040) (owner: Alexandros Kosiaris)
[10:27:54] RECOVERY - MegaRAID on db1046 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy
[10:32:16] elukey: there you go ^^^ :-P
[10:33:43] :D
[11:00:52] Operations, ops-eqiad, hardware-requests, User-Joe: Decommission mw1170-mw1179 - https://phabricator.wikimedia.org/T168271#3359900 (Joe)
[11:01:15] <_joe_> !log depooling mw1170-mw1179 for decommissioning, T168271
[11:01:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:01:25] T168271: Decommission mw1170-mw1179 - https://phabricator.wikimedia.org/T168271
[11:04:25] (PS1) Giuseppe Lavagetto: role::mediawiki::appservers: move mw1170-1179 to role::spare::system [puppet] - https://gerrit.wikimedia.org/r/359920 (https://phabricator.wikimedia.org/T168271)
[11:18:19] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360091 (alanajjar) >>! In T168109#3358878, @Marostegui wrote: >>>! In T168109#3358211, @alanajjar wrote: >> @Marostegui can we doing it now? > > Hi, > >...
[11:19:55] !log rebooting cp3007 for kernel update
[11:20:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:22:54] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360112 (Marostegui) >>! In T168109#3360091, @alanajjar wrote: >>>! In T168109#3358878, @Marostegui wrote: >>>>!
In T168109#3358211, @alanajjar wrote: >>> @M...
[11:26:21] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360127 (alanajjar) @Marostegui commons ([[ https://meta.wikimedia.org/w/index.php?title=Special:CentralAuth&target=Smuconlaw | see here]])
[11:28:19] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360161 (Marostegui) >>! In T168109#3360127, @alanajjar wrote: > @Marostegui > > commons ([[ https://meta.wikimedia.org/w/index.php?title=Special:CentralAut...
[11:31:30] !log restarting replication on dbstore1002:s3 and db1015
[11:31:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:31:42] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360164 (alanajjar) @Marostegui All thanks for you. And [[https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/Sgconlaw |we start]]!
[11:33:06] PROBLEM - PyBal backends health check on lvs2006 is CRITICAL: PYBAL CRITICAL - prometheus_80 - Could not depool server prometheus2003.codfw.wmnet because of too many down!
[11:33:54] !log Rename user Smuconlaw → Sgconlaw - T168109
[11:34:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:34:03] T168109: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109
[11:34:06] RECOVERY - PyBal backends health check on lvs2006 is OK: PYBAL OK - All pools are healthy
[11:35:58] PROBLEM - mediawiki-installation DSH group on mw1173 is CRITICAL: Host mw1173 is not in mediawiki-installation dsh group
[11:36:36] PROBLEM - mediawiki-installation DSH group on mw1174 is CRITICAL: Host mw1174 is not in mediawiki-installation dsh group
[11:37:16] PROBLEM - mediawiki-installation DSH group on mw1175 is CRITICAL: Host mw1175 is not in mediawiki-installation dsh group
[11:37:46] PROBLEM - mediawiki-installation DSH group on mw1176 is CRITICAL: Host mw1176 is not in mediawiki-installation dsh group
[11:38:16] PROBLEM - mediawiki-installation DSH group on mw1177 is CRITICAL: Host mw1177 is not in mediawiki-installation dsh group
[11:38:56] PROBLEM - mediawiki-installation DSH group on mw1178 is CRITICAL: Host mw1178 is not in mediawiki-installation dsh group
[11:39:16] PROBLEM - mediawiki-installation DSH group on mw1170 is CRITICAL: Host mw1170 is not in mediawiki-installation dsh group
[11:39:26] PROBLEM - mediawiki-installation DSH group on mw1179 is CRITICAL: Host mw1179 is not in mediawiki-installation dsh group
[11:39:46] PROBLEM - mediawiki-installation DSH group on mw1171 is CRITICAL: Host mw1171 is not in mediawiki-installation dsh group
[11:40:26] PROBLEM - mediawiki-installation DSH group on mw1172 is CRITICAL: Host mw1172 is not in mediawiki-installation dsh group
[11:40:59] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360208 (Marostegui) commons finished without any major delays :-) Next big one will be enwiki (11k)
[11:46:30] Operations, DBA,
Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360224 (alanajjar) >>! In T168109#3360208, @Marostegui wrote: > commons finished without any major delays :-) > Next big one will be enwiki (11k) Thanks @M...
[11:47:10] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360226 (Marostegui) Let's wait until everything is finished, no? As per: https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/Sgconlaw there is stil...
[11:49:04] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360241 (alanajjar) >>! In T168109#3360226, @Marostegui wrote: > Let's wait until everything is finished, no? > As per: https://meta.wikimedia.org/wiki/Speci...
[11:50:43] Operations, DBA, Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360244 (Marostegui) Yes, let's wait until it is fully finished - shouldn't take long anyways as most of them are pretty small indeed Thanks :-)
[11:53:26] PROBLEM - Disk space on mx2001 is CRITICAL: DISK CRITICAL - /var/spool/exim4/scan is not accessible: Permission denied
[11:54:26] RECOVERY - Disk space on mx2001 is OK: DISK OK
[12:01:22] Operations, MediaWiki-JobRunner, Beta-Cluster-reproducible, Patch-For-Review, Release-Engineering-Team (Kanban): jobrunner / jobchron systemd services are in error state after a stop - https://phabricator.wikimedia.org/T168044#3360277 (hashar)
[12:04:42] !log run 'echo "autoLearnMode=1" > /tmp/disable_learn && megacli -AdpBbuCmd -SetBbuProperties -f /tmp/disable_learn -a0' on all the analytics workers to disable BBU Auto learn - T167809
[12:04:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:04:52]
T167809: New analytic hosts with BBU learning cycle enabled - https://phabricator.wikimedia.org/T167809 [12:05:36] (03Abandoned) 10Hashar: jobrunner: add exit codes to services units [puppet] - 10https://gerrit.wikimedia.org/r/357362 (https://phabricator.wikimedia.org/T168044) (owner: 10Hashar) [12:07:18] (03CR) 10Alexandros Kosiaris: [C: 032] Renumber etcd100{2,3,4,5}.eqiad.wmnet [dns] - 10https://gerrit.wikimedia.org/r/359909 (https://phabricator.wikimedia.org/T162040) (owner: 10Alexandros Kosiaris) [12:16:58] 10Operations, 10Traffic, 10Wikimedia-Blog, 10HTTPS: Change automatic shortlink in blog theme - https://phabricator.wikimedia.org/T165511#3360320 (10Volker_E) Patch in https://github.com/wikimedia/wikimediablog-wordpresscom/pull/8 [12:17:20] 10Operations, 10Analytics-Kanban, 10User-Elukey: New analytic hosts with BBU learning cycle enabled - https://phabricator.wikimedia.org/T167809#3360322 (10elukey) Updated https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration with instructions on how to set the BBU auto-learn fun... [12:17:37] 10Operations, 10Analytics-Kanban, 10User-Elukey: New analytic hosts with BBU learning cycle enabled - https://phabricator.wikimedia.org/T167809#3360324 (10elukey) [12:17:56] 10Operations, 10MediaWiki-JobRunner, 10Beta-Cluster-reproducible, 10Patch-For-Review, 10Release-Engineering-Team (Kanban): jobrunner / jobchron systemd services are in error state after a stop - https://phabricator.wikimedia.org/T168044#3360325 (10hashar) a:03hashar [12:18:36] 10Operations, 10DBA, 10Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360329 (10alanajjar) >>! In T168109#3360244, @Marostegui wrote: > Yes, let's wait until it is fully finished - shouldn't take long anyways as most of them are... 
[12:33:40] (03CR) 10Elukey: [C: 031] role::mediawiki::appservers: move mw1170-1179 to role::spare::system [puppet] - 10https://gerrit.wikimedia.org/r/359920 (https://phabricator.wikimedia.org/T168271) (owner: 10Giuseppe Lavagetto) [12:35:47] 10Operations, 10DBA, 10Wikimedia-Site-requests: Global rename of Smuconlaw → Sgconlaw: supervision needed - https://phabricator.wikimedia.org/T168109#3360352 (10Marostegui) 05Open>03Resolved a:03Marostegui Thanks @alanajjar!! [12:36:05] (03PS2) 10ArielGlenn: script to batch 7z recompress revision content history files manually [dumps] - 10https://gerrit.wikimedia.org/r/359907 (https://phabricator.wikimedia.org/T168223) [12:36:11] !log akosiaris@puppetmaster1001 conftool action : set/pooled=inactive; selector: name=chlorine.eqiad.wmnet [12:36:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:40:53] (03Abandoned) 10Giuseppe Lavagetto: cache: add monitoring of services at the SSL termination level [puppet] - 10https://gerrit.wikimedia.org/r/357805 (https://phabricator.wikimedia.org/T167048) (owner: 10Giuseppe Lavagetto) [12:41:52] 10Operations, 10Analytics-Kanban, 10User-Elukey: New analytic hosts with BBU learning cycle enabled - https://phabricator.wikimedia.org/T167809#3360362 (10elukey) ``` elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'megacli -AdpBbuCmd -GetBbuProperties -aALL -nolog | grep "... 
[12:42:19] (03CR) 10Giuseppe Lavagetto: [C: 032] role::mediawiki::appservers: move mw1170-1179 to role::spare::system [puppet] - 10https://gerrit.wikimedia.org/r/359920 (https://phabricator.wikimedia.org/T168271) (owner: 10Giuseppe Lavagetto) [12:45:59] 10Operations, 10MediaWiki-JobRunner, 10Beta-Cluster-reproducible, 10Patch-For-Review, 10Release-Engineering-Team (Kanban): jobrunner / jobchron systemd services are in error state after a stop - https://phabricator.wikimedia.org/T168044#3360375 (10hashar) p:05Normal>03Low [12:46:27] (03PS2) 10DCausse: [cleanup] remove old interwiki search config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 [12:47:17] 10Operations, 10ops-eqiad, 10hardware-requests, 10Patch-For-Review, 10User-Joe: Decommission mw1170-mw1179 - https://phabricator.wikimedia.org/T168271#3360394 (10Joe) [12:48:02] (03PS1) 10Gilles: Deloy Thumbor to all Wikipedias [puppet] - 10https://gerrit.wikimedia.org/r/359931 (https://phabricator.wikimedia.org/T167794) [12:48:26] (03CR) 10Gehel: [V: 032 C: 032] maps - add dummy redis password for tilerator / tileratorui [labs/private] - 10https://gerrit.wikimedia.org/r/358950 (https://phabricator.wikimedia.org/T167871) (owner: 10Gehel) [12:49:17] (03PS2) 10Gilles: Deploy Thumbor to all Wikipedias [puppet] - 10https://gerrit.wikimedia.org/r/359931 (https://phabricator.wikimedia.org/T167794) [12:49:50] 10Operations, 10ops-eqiad, 10hardware-requests, 10Patch-For-Review, 10User-Joe: Decommission mw1170-mw1179 - https://phabricator.wikimedia.org/T168271#3360405 (10Joe) [12:50:32] 10Operations, 10ops-eqiad, 10hardware-requests, 10Patch-For-Review, 10User-Joe: Decommission mw1170-mw1179 - https://phabricator.wikimedia.org/T168271#3359900 (10Joe) @Cmjohnson please proceed to decom/derack these servers and rack new ones in their place. [12:51:20] can I ack the dsh alerts so they do not appear on unhandled? 
[12:51:25] (03PS2) 10Lucas Werkmeister (WMDE): Configure WikibaseQualityConstraints extension [mediawiki-config] - 10https://gerrit.wikimedia.org/r/358553 [12:53:13] (03CR) 10Alexandros Kosiaris: [C: 032] Renumber chlorine.eqiad.wmnet [dns] - 10https://gerrit.wikimedia.org/r/359910 (https://phabricator.wikimedia.org/T162040) (owner: 10Alexandros Kosiaris) [12:54:11] (03PS5) 10DCausse: [WIP] Add ltr-query 0.1.1 snapshot [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/359359 (owner: 10EBernhardson) [12:54:23] !log restarting elasticsearch on relforge1* to pickup new snapshot of the ltr plugin [12:54:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:00:05] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Dear anthropoid, the time has come. Please deploy European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T1300). [13:00:05] DatGuy, James_F, Urbanecm, and dcausse: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be available during the process. [13:00:13] I'm here [13:00:15] o/ [13:00:20] present [13:01:21] (03CR) 10Hashar: "I am not sure how much the plwiki community is aware about this change. That doesn't quite reflect on T162849." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/359514 (https://phabricator.wikimedia.org/T162849) (owner: 10Jforrester) [13:01:25] (03PS4) 10Hashar: Adding the domain for the Bayerische Staatsgemäldesammlungen [mediawiki-config] - 10https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: 10Multichill) [13:01:27] (03PS2) 10Hashar: Add sandbox link for dtywiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359387 (https://phabricator.wikimedia.org/T168038) (owner: 10DatGuy) [13:01:36] I am doing https://gerrit.wikimedia.org/r/#/c/355881/3 [13:01:40] which is straightforward [13:01:56] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: 10Multichill) [13:02:25] 10Operations, 10ops-eqiad, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3360432 (10elukey) [13:03:11] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: 10Multichill) [13:03:24] DatGuy: then I am doing the sandbox link [13:03:29] (03CR) 10Hashar: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359387 (https://phabricator.wikimedia.org/T168038) (owner: 10DatGuy) [13:03:32] cheers [13:03:35] (03CR) 10Hashar: [C: 032] Add sandbox link for dtywiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359387 (https://phabricator.wikimedia.org/T168038) (owner: 10DatGuy) [13:03:38] grbblbl [13:05:21] (03Merged) 10jenkins-bot: Adding the domain for the Bayerische Staatsgemäldesammlungen [mediawiki-config] - 10https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: 10Multichill) [13:05:23] (03Merged) 10jenkins-bot: Add sandbox link for dtywiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359387 (https://phabricator.wikimedia.org/T168038) (owner: 10DatGuy) 
[13:05:55] DatGuy: do you have the extension to test on mwdebug1001 ? [13:05:55] (03CR) 10jenkins-bot: Adding the domain for the Bayerische Staatsgemäldesammlungen [mediawiki-config] - 10https://gerrit.wikimedia.org/r/355881 (https://phabricator.wikimedia.org/T166437) (owner: 10Multichill) [13:06:00] yep [13:06:07] I have pulled it there :] [13:07:18] looks good [13:07:20] great [13:08:31] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Add sandbox link for dtywiki - T168038 (duration: 00m 42s) [13:08:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:42] T168038: Sandbox for Doteli(dty) Wikipedia - https://phabricator.wikimedia.org/T168038 [13:08:54] DatGuy: if all goes fine. I guess you can mark https://phabricator.wikimedia.org/T168038 as resolved [13:09:22] (03PS3) 10Hashar: [cleanup] remove old interwiki search config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 (owner: 10DCausse) [13:09:52] yeah it seems fine w/out mw1001 [13:10:03] it is on all wikis now :) [13:10:13] dcausse: wanna deploy your change yourself? :] [13:10:20] hashar: sure [13:10:26] (03CR) 10Hashar: [C: 031] [cleanup] remove old interwiki search config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 (owner: 10DCausse) [13:10:26] Hey, sorry, here now. [13:10:30] ah [13:10:34] James_F: good morning! [13:10:45] Was distracted. :-) [13:11:28] James_F: so somehow https://phabricator.wikimedia.org/T162849 states that the change has some visual glitch on monobook (some margin is too large) [13:11:59] and I am not sure how the plwiki community got contacted about it. Then it is just some button changes so most probably it is not going to be too much of an issue? :] [13:13:46] hashar: Yeah, let's go ahead [13:13:47] . 
[13:14:33] (03CR) 10Hashar: [C: 032] "clarified with james" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359514 (https://phabricator.wikimedia.org/T162849) (owner: 10Jforrester) [13:14:52] James_F: wanna test it on mwdebug1001 first ?:] [13:15:11] Sure. [13:16:05] 10Operations, 10Icinga, 10monitoring, 10Patch-For-Review: Icinga randomly forgets downtimes, causing alert and page spam - https://phabricator.wikimedia.org/T164206#3360470 (10jcrespo) I have not seen downtimes getting lots since the switch, but I confirm I have seen a probably related issue: services losi... [13:17:06] is it possible to add something to swat or do it when everyone is done? [13:17:07] (03CR) 10Gehel: [V: 032 C: 032] [WIP] Add ltr-query 0.1.1 snapshot [software/elasticsearch/plugins] - 10https://gerrit.wikimedia.org/r/359359 (owner: 10EBernhardson) [13:17:13] just a config change [13:17:16] (03PS2) 10Hashar: Enable OOjs UI buttons on EditPage for plwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359514 (https://phabricator.wikimedia.org/T162849) (owner: 10Jforrester) [13:17:17] bah [13:17:28] (03CR) 10Hashar: [C: 032] Enable OOjs UI buttons on EditPage for plwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359514 (https://phabricator.wikimedia.org/T162849) (owner: 10Jforrester) [13:17:31] had to +2 again [13:17:41] I blame Jenkins. ;-) [13:17:47] aude: yes please add :) [13:17:59] ok [13:18:16] https://gerrit.wikimedia.org/r/#/c/359135/ [13:18:19] i'll add to the wiki [13:18:31] (03Merged) 10jenkins-bot: Enable OOjs UI buttons on EditPage for plwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359514 (https://phabricator.wikimedia.org/T162849) (owner: 10Jforrester) [13:19:01] James_F: ok it is on mwdebug1001 [13:19:15] and surely that looks oojs enabled to me :) [13:20:09] hashar: Yeah, LGTM. 
[13:20:10] dcausse: waiting for confirmation and I guess you can do your change [13:21:08] !log hashar@tin Synchronized wmf-config/InitialiseSettings.php: Enable OOjs UI buttons on EditPage for plwiki - T162849 (duration: 00m 42s) [13:21:10] James_F: done ! [13:21:15] dcausse: your turn [13:21:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:21:18] T162849: Support WMF communities in run-up to switching EditPage over to OOUI - https://phabricator.wikimedia.org/T162849 [13:21:23] hashar: ok, deploying [13:21:26] Thank you! [13:22:05] (03PS4) 10DCausse: [cleanup] remove old interwiki search config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 [13:22:20] (03PS1) 10Nschaaf: Enable Reader Survey using QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359936 (https://phabricator.wikimedia.org/T131949) [13:23:48] (03CR) 10DCausse: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 (owner: 10DCausse) [13:24:07] (03PS2) 10Nschaaf: Enable Reader Survey using QuickSurveys [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359936 (https://phabricator.wikimedia.org/T131949) [13:25:01] (03Merged) 10jenkins-bot: [cleanup] remove old interwiki search config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/357642 (owner: 10DCausse) [13:28:34] !log dcausse@tin Synchronized wmf-config/CirrusSearch-labs.php: [cleanup] remove old interwiki search config (duration: 00m 41s) [13:28:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:29] (03PS2) 10Hashar: Add “Constraints” section on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359135 (https://phabricator.wikimedia.org/T167126) (owner: 10Lucas Werkmeister (WMDE)) [13:29:43] !log dcausse@tin Synchronized wmf-config/InitialiseSettings.php: [cleanup] remove old interwiki search config (duration: 00m 41s) [13:29:52] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:53] aude: I'm done [13:29:56] aude: I have rebased your patch [13:29:58] ok [13:30:06] if you want i can deploy it [13:30:06] not sure in which order to sync the files though [13:30:10] yes please do :) [13:30:26] ok [13:31:00] (03CR) 10Aude: [C: 032] Add “Constraints” section on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359135 (https://phabricator.wikimedia.org/T167126) (owner: 10Lucas Werkmeister (WMDE)) [13:31:23] (03PS1) 10Mforns: Change EL purging script to avoid limit/offset [puppet] - 10https://gerrit.wikimedia.org/r/359938 (https://phabricator.wikimedia.org/T168071) [13:31:33] i think the production one first then the other file (where we remove the setting) [13:31:39] aude: actually that might be problematic [13:32:02] (03Merged) 10jenkins-bot: Add “Constraints” section on test.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359135 (https://phabricator.wikimedia.org/T167126) (owner: 10Lucas Werkmeister (WMDE)) [13:32:05] the Wikibase.php sets the parrent key ['statementSections'] = [ 'item' => ] [13:32:29] so potentially that could override the ['property'] key set via Wikibase-production.php [13:33:20] looking [13:33:41] we can move the whole thing [13:34:47] aude: ah that should be fine sorry [13:34:59] the require Wikibase-production.php happens at the end [13:35:23] looking [13:36:46] yeah [13:36:55] i'll try on mwdebug [13:37:00] +1 [13:39:10] looks ok on wikidata [13:39:15] checking test.wikidata [13:41:25] it's good :) [13:42:36] !log aude@tin Synchronized wmf-config/Wikibase-production.php: Add constraints section to property pages on test.wikidata (duration: 00m 41s) [13:42:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:00] !log aude@tin Synchronized wmf-config/Wikibase.php: Remove old constraints section config (duration: 00m 41s) [13:44:09] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:35] \O/ [13:44:46] looks good [13:44:54] !log European SWAT completed [13:45:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:45:27] thanks [13:46:58] (03PS2) 10Giuseppe Lavagetto: role:jobqueue_redis: daily restart of slaves [puppet] - 10https://gerrit.wikimedia.org/r/357193 [13:48:04] !log fixing salt minion setup on wtp1047 [13:48:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:48:39] !log deploying latest elasticsearch plugin (ltr plugin) [13:48:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:50:16] (03Abandoned) 10Giuseppe Lavagetto: etcd: invert replication [puppet] - 10https://gerrit.wikimedia.org/r/353232 (owner: 10Giuseppe Lavagetto) [13:58:23] !log remove decommissioned nodes from redis / trebuchet for elasticsearch/plugins [13:58:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:01:26] !log restarting elasticsearch / relforge for ltr plugin deployment [14:01:30] (03PS1) 10Marostegui: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359940 [14:01:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:01:35] (03PS6) 10Giuseppe Lavagetto: role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 [14:01:37] (03PS9) 10Giuseppe Lavagetto: role::lvs::balancer: refactor to role/profile (step 2) [puppet] - 10https://gerrit.wikimedia.org/r/357863 [14:01:39] (03PS2) 10Giuseppe Lavagetto: role::lvs::balancer: also manage interface tagging [puppet] - 10https://gerrit.wikimedia.org/r/358027 [14:01:41] (03CR) 10jerkins-bot: [V: 04-1] Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359940 (owner: 10Marostegui) [14:02:06] (03Abandoned) 10Marostegui: Revert "db-eqiad.php: Depool db1070" [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/359940 (owner: 10Marostegui) [14:03:43] (03PS1) 10Marostegui: db-eqiad.php: Repool db1070 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359941 [14:03:45] (03CR) 10jerkins-bot: [V: 04-1] role::lvs::balancer: also manage interface tagging [puppet] - 10https://gerrit.wikimedia.org/r/358027 (owner: 10Giuseppe Lavagetto) [14:03:57] 10Operations, 10ops-eqiad, 10Patch-For-Review: rack/setup/install ganeti1005-ganeti1008 - https://phabricator.wikimedia.org/T166076#3360654 (10akosiaris) a:05RobH>03Cmjohnson [14:05:23] (03CR) 10Marostegui: [C: 032] db-eqiad.php: Repool db1070 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359941 (owner: 10Marostegui) [14:05:46] (03PS3) 10Giuseppe Lavagetto: role::lvs::balancer: also manage interface tagging [puppet] - 10https://gerrit.wikimedia.org/r/358027 [14:05:59] !log starting cluster restart on elasticsearch / cirrus / eqiad for ltr plugin deployment [14:06:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:06:27] (03Merged) 10jenkins-bot: db-eqiad.php: Repool db1070 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359941 (owner: 10Marostegui) [14:08:20] !log marostegui@tin Synchronized wmf-config/db-eqiad.php: Repool db1070 - T153743 (duration: 00m 41s) [14:08:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:08:30] T153743: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743 [14:12:19] 10Operations, 10MediaWiki-extensions-PageAssessments, 10Wikimedia-General-or-Unknown, 10Patch-For-Review: foreachwikiindblist regular cronspam - https://phabricator.wikimedia.org/T159438#3360673 (10jcrespo) If it is happening, it is only on labtestweb. 
[14:12:50] 10Operations, 10MediaWiki-extensions-PageAssessments, 10Wikimedia-General-or-Unknown, 10Patch-For-Review: foreachwikiindblist regular cronspam - https://phabricator.wikimedia.org/T159438#3360678 (10jcrespo) [14:12:55] 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Wikimedia-log-errors: Terbium cronjobs attempting to connect to labstestweb2001 - https://phabricator.wikimedia.org/T167961#3360676 (10jcrespo) [14:15:20] (03CR) 10Elukey: [C: 031] role:jobqueue_redis: daily restart of slaves [puppet] - 10https://gerrit.wikimedia.org/r/357193 (owner: 10Giuseppe Lavagetto) [14:20:23] (03PS2) 10Pmiazga: Remove unused wgPopupsAPIUseRESTBase config variable [mediawiki-config] - 10https://gerrit.wikimedia.org/r/358415 (https://phabricator.wikimedia.org/T165018) [14:24:10] !log roll-upgrade swift to 2.10 on ms-be10[22-30] - T162609 [14:24:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:24:21] T162609: Swift version and distro upgrade - https://phabricator.wikimedia.org/T162609 [14:25:50] 10Operations, 10Traffic, 10Wikimedia-General-or-Unknown: Impending load test - https://phabricator.wikimedia.org/T167920#3360717 (10Haiku-narrative) Actually, you missed it. In truth, after looking at the APIs that you had referenced, a bunch of discussion, we backed down from the majority of our lod testin... [14:28:06] 10Operations, 10Traffic, 10Wikimedia-General-or-Unknown: Impending load test - https://phabricator.wikimedia.org/T167920#3360719 (10BBlack) We've also been tweaking and tuning our ratelimits in general to try to find a happy medium. Both of the API endpoints should now be limiting at the same rate of 1000 r... [14:28:48] (03CR) 10Giuseppe Lavagetto: [C: 031] "https://puppet-compiler.wmflabs.org/6799/ a full run shows only expected, not-affecting changes." 
(031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/357824 (owner: 10Giuseppe Lavagetto) [14:30:39] 10Operations, 10Traffic, 10Wikimedia-General-or-Unknown: Impending load test - https://phabricator.wikimedia.org/T167920#3360721 (10Marostegui) 05Open>03Resolved >>! In T167920#3360717, @Haiku-narrative wrote: > Actually, you missed it. In truth, after looking at the APIs that you had referenced, a bunc... [14:32:58] (03CR) 10Elukey: [C: 031] "Added some non-blocking comments just to annoy you :)" (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/359120 (https://phabricator.wikimedia.org/T161598) (owner: 10Muehlenhoff) [14:35:39] 10Operations, 10Wikimedia-General-or-Unknown: Json queries fail "Too Many Requests" - https://phabricator.wikimedia.org/T168033#3360737 (10BBlack) >>! In T168033#3357737, @Yurivict wrote: > But I don't see how is it reasonable to fail requests when some metric is exceeded, vs. delaying responses. If we implem... [14:36:04] (03CR) 10DCausse: [C: 031] [WIP] Update cirrus server counts to match reality [mediawiki-config] - 10https://gerrit.wikimedia.org/r/359376 (owner: 10EBernhardson) [14:36:11] (03CR) 10Giuseppe Lavagetto: [C: 032] role:jobqueue_redis: daily restart of slaves [puppet] - 10https://gerrit.wikimedia.org/r/357193 (owner: 10Giuseppe Lavagetto) [14:36:29] niceeee [14:36:42] finally some consistency to the job queues [14:40:03] (03PS7) 10Andrew Bogott: designate: Clean up puppet config for deleted instances. 
[puppet] - 10https://gerrit.wikimedia.org/r/359374 (https://phabricator.wikimedia.org/T147878) [14:40:04] (03PS1) 10Andrew Bogott: Horizon proxy panel: Fix an error message [puppet] - 10https://gerrit.wikimedia.org/r/359945 [14:40:07] (03PS1) 10Andrew Bogott: Proxyleaks: Don't delete the actual proxy API server record [puppet] - 10https://gerrit.wikimedia.org/r/359946 [14:40:28] (03PS1) 10Alexandros Kosiaris: hieraize OTRS private data [labs/private] - 10https://gerrit.wikimedia.org/r/359947 [14:42:02] (03CR) 10Andrew Bogott: [C: 032] Horizon proxy panel: Fix an error message [puppet] - 10https://gerrit.wikimedia.org/r/359945 (owner: 10Andrew Bogott) [14:42:32] (03PS1) 10Alexandros Kosiaris: otrs: Refactor in the profile/role pattern [puppet] - 10https://gerrit.wikimedia.org/r/359949 [14:46:20] <_joe_> andrewbogott: can I merge your change as well? [14:46:25] yes please [14:49:02] ok, apparently I don't understand how 'dig' works. It looks to me like the same server is telling me that a domain is both defined and not defined... 
[14:49:07] https://www.irccloud.com/pastebin/0QRMtHoF/ [14:53:41] !log pausing cluster restart of elasticsearch eqiad [14:53:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:54:42] (03CR) 10Elukey: Change EL purging script to avoid limit/offset (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/359938 (https://phabricator.wikimedia.org/T168071) (owner: 10Mforns) [14:55:08] (03CR) 10Andrew Bogott: [C: 031] mariadb: wikitech servers to use core grants [puppet] - 10https://gerrit.wikimedia.org/r/359152 (https://phabricator.wikimedia.org/T167961) (owner: 10Marostegui) [14:56:04] (03PS3) 10Marostegui: mariadb: wikitech servers to use core grants [puppet] - 10https://gerrit.wikimedia.org/r/359152 (https://phabricator.wikimedia.org/T167961) [14:57:08] (03PS1) 10Giuseppe Lavagetto: role::redis::jobqueue: fixup crontabs [puppet] - 10https://gerrit.wikimedia.org/r/359950 [14:58:18] (03CR) 10Alexandros Kosiaris: [V: 032 C: 032] hieraize OTRS private data [labs/private] - 10https://gerrit.wikimedia.org/r/359947 (owner: 10Alexandros Kosiaris) [14:58:21] (03CR) 10Marostegui: [C: 032] mariadb: wikitech servers to use core grants [puppet] - 10https://gerrit.wikimedia.org/r/359152 (https://phabricator.wikimedia.org/T167961) (owner: 10Marostegui) [14:58:38] (03PS2) 10Giuseppe Lavagetto: role::redis::jobqueue: fixup crontabs [puppet] - 10https://gerrit.wikimedia.org/r/359950 [14:58:46] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] role::redis::jobqueue: fixup crontabs [puppet] - 10https://gerrit.wikimedia.org/r/359950 (owner: 10Giuseppe Lavagetto) [14:59:35] !log cold reset ms-be1013 drac [14:59:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:59:52] 10Operations, 10Patch-For-Review: Tracking and Reducing cron-spam from root@ - https://phabricator.wikimedia.org/T132324#3360792 (10Marostegui) [14:59:56] 10Operations, 10DBA, 10cloud-services-team, 10Patch-For-Review, 10Wikimedia-log-errors: Terbium cronjobs 
attempting to connect to labstestweb2001 - https://phabricator.wikimedia.org/T167961#3360789 (10Marostegui) 05Open>03Resolved a:03jcrespo After Jaime added the grants manually, I have talked to... [15:03:27] !log restbase restbase2001 is out of rotation, performing experiments with the new cassandra driver v3.2.2 which seems to be causing problems only in production [15:03:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:04:07] (03PS1) 10Hashar: Rake: memoize git_changed_in_head() [puppet] - 10https://gerrit.wikimedia.org/r/359951 (https://phabricator.wikimedia.org/T166888) [15:10:38] (03PS1) 10Filippo Giunchedi: install_server: ms-be10[13-21] to stretch [puppet] - 10https://gerrit.wikimedia.org/r/359952 (https://phabricator.wikimedia.org/T162609) [15:12:12] (03PS2) 10Alexandros Kosiaris: otrs: Refactor in the profile/role pattern [puppet] - 10https://gerrit.wikimedia.org/r/359949 [15:12:42] !log akosiaris@puppetmaster1001 conftool action : set/pooled=yes; selector: name=chlorine.eqiad.wmnet [15:12:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:12:56] (03PS7) 10Giuseppe Lavagetto: role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 [15:13:15] 10Operations, 10Goal, 10Kubernetes, 10Patch-For-Review: Eliminate SPOFs in the existing eqiad kubernetes infrastructure - https://phabricator.wikimedia.org/T162040#3360843 (10akosiaris) [15:14:12] (03PS2) 10Filippo Giunchedi: install_server: ms-be10[13-21] to stretch [puppet] - 10https://gerrit.wikimedia.org/r/359952 (https://phabricator.wikimedia.org/T162609) [15:14:33] 10Operations, 10Goal, 10Kubernetes: Prepare to service applications from kubernetes - https://phabricator.wikimedia.org/T162039#3360846 (10akosiaris) [15:14:35] 10Operations, 10Goal, 10Kubernetes, 10Patch-For-Review: Eliminate SPOFs in the existing eqiad kubernetes infrastructure - https://phabricator.wikimedia.org/T162040#3150598 
(10akosiaris) 05Open>03Resolved 2/3 etcd hosts are now in a different row giving up read functionality in the worst case scenario,... [15:14:46] 10Operations, 10Goal, 10Kubernetes: Prepare to service applications from kubernetes - https://phabricator.wikimedia.org/T162039#3150583 (10akosiaris) [15:16:31] !log Deploy alter table labsdb1009 - T166207 [15:16:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:16:40] T166207: Convert unique keys into primary keys for some wiki tables on s5 - https://phabricator.wikimedia.org/T166207 [15:17:29] (03CR) 10Filippo Giunchedi: [C: 032] install_server: ms-be10[13-21] to stretch [puppet] - 10https://gerrit.wikimedia.org/r/359952 (https://phabricator.wikimedia.org/T162609) (owner: 10Filippo Giunchedi) [15:17:31] (03CR) 10Alexandros Kosiaris: [C: 032] "https://puppet-compiler.wmflabs.org/6804/ says noop, merging" [puppet] - 10https://gerrit.wikimedia.org/r/359949 (owner: 10Alexandros Kosiaris) [15:17:44] (03PS3) 10Alexandros Kosiaris: otrs: Refactor in the profile/role pattern [puppet] - 10https://gerrit.wikimedia.org/r/359949 [15:19:11] (03CR) 10Giuseppe Lavagetto: [C: 032] role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 (owner: 10Giuseppe Lavagetto) [15:19:21] (03PS8) 10Giuseppe Lavagetto: role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 [15:20:21] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 (owner: 10Giuseppe Lavagetto) [15:20:31] <_joe_> wat? 
[15:20:44] (03PS9) 10Giuseppe Lavagetto: role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 [15:20:49] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] role::lvs::balancer: convert to role/profile (step 1) [puppet] - 10https://gerrit.wikimedia.org/r/357824 (owner: 10Giuseppe Lavagetto) [15:25:18] !log installed python-setuptools-scm on copper [15:25:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:26:03] 10Operations, 10Labs, 10Labs-Infrastructure: Puppet CA: virt1000.wikimedia.org' will expire on 2017-08-15 - https://phabricator.wikimedia.org/T168110#3360856 (10Andrew) I see this but can't figure out where it's coming from. We haven't had a box named virt1000 for ages... is the CA cert somehow still named... [15:26:51] (03CR) 10Andrew Bogott: [C: 032] toollabs: install `lame` on exec hosts [puppet] - 10https://gerrit.wikimedia.org/r/359669 (https://phabricator.wikimedia.org/T168128) (owner: 10Merlijn van Deen) [15:26:52] 10Operations, 10Wikimedia-General-or-Unknown: Json queries fail "Too Many Requests" - https://phabricator.wikimedia.org/T168033#3360858 (10Yurivict) What you are referring to is [[https://en.wikipedia.org/wiki/HTTP_pipelining|HTTP pipelining]]. The server doesn't have to read pipelined headers once it determin... [15:26:55] (03PS4) 10Andrew Bogott: toollabs: install `lame` on exec hosts [puppet] - 10https://gerrit.wikimedia.org/r/359669 (https://phabricator.wikimedia.org/T168128) (owner: 10Merlijn van Deen) [15:31:45] 10Operations, 10Continuous-Integration-Infrastructure, 10Patch-For-Review: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3360863 (10hashar) >>! In T166888#3333018, @faidon wrote: > Thanks for all the detailed responses from all of you, it's really appreciated. It'... 
[15:32:35] Operations, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3360869 (hashar)
[15:33:01] Operations, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3310890 (hashar) a:hashar
[15:33:03] (PS1) Alexandros Kosiaris: kubernetes: Allow IPv6 access as well to master [puppet] - https://gerrit.wikimedia.org/r/359953
[15:33:49] (PS2) Alexandros Kosiaris: kubernetes: Allow IPv6 access as well to master [puppet] - https://gerrit.wikimedia.org/r/359953
[15:34:28] (CR) Alexandros Kosiaris: [V: 2 C: 2] kubernetes: Allow IPv6 access as well to master [puppet] - https://gerrit.wikimedia.org/r/359953 (owner: Alexandros Kosiaris)
[15:41:44] (PS1) Alexandros Kosiaris: kubestage100X: Assign IPv6 addresses [dns] - https://gerrit.wikimedia.org/r/359955
[15:42:43] (CR) Alexandros Kosiaris: [C: 2] kubestage100X: Assign IPv6 addresses [dns] - https://gerrit.wikimedia.org/r/359955 (owner: Alexandros Kosiaris)
[15:47:32] (PS10) Giuseppe Lavagetto: role::lvs::balancer: refactor to role/profile (step 2) [puppet] - https://gerrit.wikimedia.org/r/357863
[15:47:43] !log uploaded linux_4.9.25-1~bpo8+3 to apt.wikimedia.org
[15:47:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:48:18] !log uploaded linux-meta_1.13 to apt.wikimedia.org (with this update the linux-meta package now also defaults to 4.9 (previously 4.4))
[15:48:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:53:09] (PS13) Mforns: role::mariadb::analytics::custom_repl_slave: add eventlogging_cleaner.py [puppet] - https://gerrit.wikimedia.org/r/356383 (https://phabricator.wikimedia.org/T108850) (owner: Elukey)
[16:01:59] (CR) Giuseppe Lavagetto: [C: 1] "https://puppet-compiler.wmflabs.org/6805/" [puppet] - https://gerrit.wikimedia.org/r/357863 (owner: Giuseppe Lavagetto)
[16:22:53] (CR) Daniel Kinzler: [C: 1] "Agree with intent. No idea if the paths are correct." [puppet] - https://gerrit.wikimedia.org/r/358783 (https://phabricator.wikimedia.org/T164783) (owner: Smalyshev)
[17:00:05] gehel: Respected human, time to deploy Weekly Wikidata query service deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T1700). Please do the needful.
[17:04:37] (PS2) Andrew Bogott: Proxyleaks: Don't delete the actual proxy API server record [puppet] - https://gerrit.wikimedia.org/r/359946
[17:04:49] (PS8) Andrew Bogott: designate: Clean up puppet config for deleted instances. [puppet] - https://gerrit.wikimedia.org/r/359374 (https://phabricator.wikimedia.org/T147878)
[17:05:43] no deployment on wdqs this week....
[17:06:37] (CR) Andrew Bogott: [C: 2] Proxyleaks: Don't delete the actual proxy API server record [puppet] - https://gerrit.wikimedia.org/r/359946 (owner: Andrew Bogott)
[17:07:00] (CR) Andrew Bogott: [C: 2] designate: Clean up puppet config for deleted instances. [puppet] - https://gerrit.wikimedia.org/r/359374 (https://phabricator.wikimedia.org/T147878) (owner: Andrew Bogott)
[17:08:51] Operations, Electron-PDFs, Services, Patch-For-Review: pdfrender fails to serve requests since Mar 8 00:30:32 UTC on scb1003 - https://phabricator.wikimedia.org/T159922#3361154 (mobrovac) >>! In T159922#3357374, @Volans wrote: > There is an ETA for a permanent fix? It seems to me that we've alrea...
[17:18:16] (PS1) Andrew Bogott: Labs puppetmaster: Allow api calls from the designate host [puppet] - https://gerrit.wikimedia.org/r/359959 (https://phabricator.wikimedia.org/T147878)
[17:20:06] (PS1) Ottomata: Initial commit of certpy [software/certpy] - https://gerrit.wikimedia.org/r/359960 (https://phabricator.wikimedia.org/T166167)
[17:20:08] (CR) Andrew Bogott: [C: 2] Labs puppetmaster: Allow api calls from the designate host [puppet] - https://gerrit.wikimedia.org/r/359959 (https://phabricator.wikimedia.org/T147878) (owner: Andrew Bogott)
[17:21:13] !log installing exim4 security updates
[17:21:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:44:00] jouncebot: next
[17:44:01] In 0 hour(s) and 15 minute(s): Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T1800)
[17:49:13] Operations, netops: codfw row D switch upgrade - https://phabricator.wikimedia.org/T167274#3361242 (ayounsi) Correct, nobody voiced any concern for that date.
[18:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Respected human, time to deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T1800). Please do the needful.
[18:00:18] * Reedy steals the window
[18:00:34] (PS2) Reedy: Add atj to InterwikiSortOrders [mediawiki-config] - https://gerrit.wikimedia.org/r/359810 (https://phabricator.wikimedia.org/T167714)
[18:00:37] (CR) Reedy: [C: 2] Add atj to InterwikiSortOrders [mediawiki-config] - https://gerrit.wikimedia.org/r/359810 (https://phabricator.wikimedia.org/T167714) (owner: Reedy)
[18:01:38] (Merged) jenkins-bot: Add atj to InterwikiSortOrders [mediawiki-config] - https://gerrit.wikimedia.org/r/359810 (https://phabricator.wikimedia.org/T167714) (owner: Reedy)
[18:02:40] !log reedy@tin Synchronized wmf-config/InterwikiSortOrders.php: Add atjwiki (duration: 00m 41s)
[18:02:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:07:07] (PS1) Reedy: Add atjwiki to securepollglobal.dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/359964
[18:08:03] (CR) Reedy: [C: 2] Add atjwiki to securepollglobal.dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/359964 (owner: Reedy)
[18:09:20] (Merged) jenkins-bot: Add atjwiki to securepollglobal.dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/359964 (owner: Reedy)
[18:10:20] !log reedy@tin Synchronized dblists/securepollglobal.dblist: (no justification provided) (duration: 00m 41s)
[18:10:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:10:38] (PS3) Reedy: Remove $wgEnableValidationStatisticsUpdates from FlaggedRevs config [mediawiki-config] - https://gerrit.wikimedia.org/r/354600 (owner: Nemo bis)
[18:10:42] (CR) Reedy: [C: 2] Remove $wgEnableValidationStatisticsUpdates from FlaggedRevs config [mediawiki-config] - https://gerrit.wikimedia.org/r/354600 (owner: Nemo bis)
[18:11:42] (Merged) jenkins-bot: Remove $wgEnableValidationStatisticsUpdates from FlaggedRevs config [mediawiki-config] - https://gerrit.wikimedia.org/r/354600 (owner: Nemo bis)
[18:12:51] (PS3) Reedy: Create logo for the Kabiye Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/344562 (https://phabricator.wikimedia.org/T160868) (owner: Odder)
[18:12:57] (CR) Reedy: [C: 2] Create logo for the Kabiye Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/344562 (https://phabricator.wikimedia.org/T160868) (owner: Odder)
[18:13:36] (PS11) Reedy: Update instances of Wikimedia Foundation logo #1 [mediawiki-config] - https://gerrit.wikimedia.org/r/307475 (https://phabricator.wikimedia.org/T144254) (owner: Urbanecm)
[18:14:21] (Merged) jenkins-bot: Create logo for the Kabiye Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/344562 (https://phabricator.wikimedia.org/T160868) (owner: Odder)
[18:14:57] (PS12) Reedy: Update instances of Wikimedia Foundation logo #1 [mediawiki-config] - https://gerrit.wikimedia.org/r/307475 (https://phabricator.wikimedia.org/T144254) (owner: Urbanecm)
[18:15:43] (CR) Reedy: [C: 2] Update instances of Wikimedia Foundation logo #1 [mediawiki-config] - https://gerrit.wikimedia.org/r/307475 (https://phabricator.wikimedia.org/T144254) (owner: Urbanecm)
[18:16:45] (Merged) jenkins-bot: Update instances of Wikimedia Foundation logo #1 [mediawiki-config] - https://gerrit.wikimedia.org/r/307475 (https://phabricator.wikimedia.org/T144254) (owner: Urbanecm)
[18:18:37] !log reedy@tin Synchronized static/images: (no justification provided) (duration: 00m 41s)
[18:18:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:19:35] !log reedy@tin Synchronized wmf-config/flaggedrevs.php: Remove old setting that does nothing (duration: 00m 41s)
[18:19:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:20:21] (PS1) GWicke: Restart pdfrender service once per day [puppet] - https://gerrit.wikimedia.org/r/359967 (https://phabricator.wikimedia.org/T159922)
[18:20:25] !log reedy@tin Synchronized static/favicon/wmf.ico: (no justification provided) (duration: 00m 41s)
[18:20:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:21:27] (PS3) Reedy: Change AbuseFilter block duration for fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/358156 (https://phabricator.wikimedia.org/T167562) (owner: Huji)
[18:21:29] (CR) Reedy: [C: 2] Change AbuseFilter block duration for fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/358156 (https://phabricator.wikimedia.org/T167562) (owner: Huji)
[18:21:35] !log reedy@tin Synchronized wmf-config/InitialiseSettings.php: logos (duration: 00m 41s)
[18:21:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:22:47] (Merged) jenkins-bot: Change AbuseFilter block duration for fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/358156 (https://phabricator.wikimedia.org/T167562) (owner: Huji)
[18:24:45] (CR) Reedy: [C: -1] "Seems dependent patch not merged yet" [mediawiki-config] - https://gerrit.wikimedia.org/r/330328 (https://phabricator.wikimedia.org/T154112) (owner: Gergő Tisza)
[18:29:33] !log reedy@tin Synchronized wmf-config/abusefilter.php: (no justification provided) (duration: 00m 41s)
[18:29:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:30:55] (PS3) Reedy: Use directly wgGalleryOptions without wmg [mediawiki-config] - https://gerrit.wikimedia.org/r/331819 (owner: Dereckson)
[18:30:58] (CR) Reedy: [C: 2] Use directly wgGalleryOptions without wmg [mediawiki-config] - https://gerrit.wikimedia.org/r/331819 (owner: Dereckson)
[18:32:15] (Merged) jenkins-bot: Use directly wgGalleryOptions without wmg [mediawiki-config] - https://gerrit.wikimedia.org/r/331819 (owner: Dereckson)
[18:34:11] !log reedy@tin Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 42s)
[18:34:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:35:01] !log reedy@tin Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 41s)
[18:35:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:04:29] (CR) Mobrovac: [C: -1] "We probably need something smarter as the crux of the problem is the restart itself, so cron might leave the service down once per day on " [puppet] - https://gerrit.wikimedia.org/r/359967 (https://phabricator.wikimedia.org/T159922) (owner: GWicke)
[20:00:04] gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: Dear anthropoid, the time has come. Please deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T2000).
[20:52:56] (PS1) Andrew Bogott: wmf_sink: Catch an exception during puppet config cleanup. [puppet] - https://gerrit.wikimedia.org/r/360039
[20:54:10] (CR) Andrew Bogott: [C: 2] wmf_sink: Catch an exception during puppet config cleanup. [puppet] - https://gerrit.wikimedia.org/r/360039 (owner: Andrew Bogott)
[21:00:04] dapatrick, bawolff, and Reedy: Respected human, time to deploy Weekly Security deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T2100). Please do the needful.
[21:29:14] (CR) Volans: [C: -1] "I've done a pass on the Python part, without looking at the details of the executed openssl commands and I have some major blockers:" (34 comments) [software/certpy] - https://gerrit.wikimedia.org/r/359960 (https://phabricator.wikimedia.org/T166167) (owner: Ottomata)
[22:08:41] (PS2) Thcipriani: Scap: add beta canary_dashboard_url config value [puppet] - https://gerrit.wikimedia.org/r/353179 (https://phabricator.wikimedia.org/T164981)
[22:10:51] (PS2) Thcipriani: Scap3: deploy logstash/plugins with scap3 [puppet] - https://gerrit.wikimedia.org/r/354472 (https://phabricator.wikimedia.org/T165748)
[22:14:36] !log Added non-voting operations-puppet-tests-docker job for operations/puppet repo, should (hopefully) be fast, and will timeout after 1 minute if it's not. More info https://gerrit.wikimedia.org/r/#/c/360091/ + T166888
[22:14:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:14:45] T166888: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888
[22:25:05] Operations, Security-Team: Use user-specific passwords for accessing EventLogging database - https://phabricator.wikimedia.org/T120532#3361706 (Platonides)
[22:25:48] Operations, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3361707 (thcipriani) >>! In T166888#3361686, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL (#w...
[22:30:30] !log find /srv/carbon/whisper/archived_metrics -mtime +730 -type f -delete on labmon1001
[22:30:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:35:21] Operations, Commons, Multimedia, Traffic, and 2 others: Disable serving unpatrolled new files to Wikipedia Zero users - https://phabricator.wikimedia.org/T167400#3331800 (Platonides) >>! In T167400#3356482, @Poyekhali wrote: > Unless Commons have a lot of new file patrollers who really mark new f...
[22:45:23] !log removed some big dirs from /home/ori on install1002
[22:45:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:00:04] addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, and thcipriani: Dear anthropoid, the time has come. Please deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170619T2300).
[23:00:04] odder: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process.
[23:02:54] I'm making myself available, jouncebot, thank you.
[23:10:58] Dear antropoids, it's getting late, do you really want to have me here when you deploy a few logos? :)
[23:31:29] odder: I guess no one's around because today's a holiday
[23:31:37] I can sync it out though
[23:32:05] (CR) Legoktm: [C: 2] Upload logos for the Dinka Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/358883 (owner: Odder)
[23:32:15] Another holiday?!
[23:32:23] It wasn't in the calendar, legoktm...
[23:32:54] yeah, it's the WMF's june holiday
[23:33:10] (I only learned today too :P)
[23:33:10] (Merged) jenkins-bot: Upload logos for the Dinka Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/358883 (owner: Odder)
[23:35:01] hmm
[23:35:02] [14:23:59] <-- logmsgbot (~logmsgbot@einsteinium.wikimedia.org) has quit (Ping timeout: 240 seconds)
[23:35:06] odder: it's deployed
[23:35:16] !log legoktm@tin: Synchronized static/images/project-logos/: Upload logos for the Dinka Wikipedia (duration: 00m 42s)
[23:35:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:36:10] Operations: logmsgbot needs restarting - https://phabricator.wikimedia.org/T168348#3361781 (Legoktm)
[23:36:28] andrewbogott: do you know how to do ^ ?
[23:36:53] no, but I can give it a try
[23:36:54] legoktm: Works, much thankings
[23:37:28] They'll do the config when they do it, I suppose, I think they're still working some details to actually create that wiki
[23:37:35] At least they've got a logo now ;)
[23:37:47] legoktm: do you happen to know what the actual bot is called? (vs. its irc nick)
[23:37:55] ircecho or something
[23:38:06] andrewbogott: https://wikitech.wikimedia.org/wiki/Logmsgbot
[23:38:10] !log are we logging?
[23:38:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:38:40] so… this is something different from ^ then?
[23:38:57] logmsgbot is the bot that does the auto !log
[23:38:58] [11:29:29] !log reedy@tin Synchronized wmf-config/abusefilter.php: (no justification provided) (duration: 00m 41s)
[23:39:06] stashbot is what records it
[23:39:06] See https://wikitech.wikimedia.org/wiki/Tool:Stashbot for help.
[23:39:44] on einsteinium, # systemctl restart tcpircbot-logmsgbot should probably do it
[23:40:18] it's crashing
[23:40:33] :/
[23:40:37] what does it say?
[23:41:35] Operations: logmsgbot needs restarting - https://phabricator.wikimedia.org/T168348#3361781 (Andrew) ``` Jun 19 23:41:04 einsteinium systemd[1]: Started TCP socket to IRC bot: tcpircbot-logmsgbot. Jun 19 23:41:04 einsteinium python[34607]: Traceback (most recent call last): Jun 19 23:41:04 einsteinium python[...
[23:41:39] ^
[23:42:01] * odder waves goodbye
[23:42:05] thanks again, legoktm
[23:42:12] odder: np :)
[23:42:16] No idea why that's happening today
[23:43:50] actually there have been a several patches recently
[23:44:17] patches to?
[23:44:37] Operations: logmsgbot needs restarting - https://phabricator.wikimedia.org/T168348#3361797 (Andrew) commit 751ca10988c83ac5d01df93c2e3e2b338c4ceaa5 Author: Paladox Date: Wed Jun 14 19:37:17 2017 +0000 Ircecho: Fix bot not to use carriage returns This is no longer...
[23:45:17] ah
[23:45:22] probably that last one if I had to guess
[23:45:37] It looks like it was crashing for a different reason before that one
[23:45:51] -.-
[23:45:56] I guess it's just really broken then
[23:46:10] yeah… hopefully thems what broke it will see the bug
[23:46:24] In theory I'm on my way out the door… we can live without this overnight, right?
[23:47:01] yeah
[23:47:32] sorry I don't have a quick fix :(
[23:54:04] no worries
[23:54:06] not your fault :p