[00:00:04] twentyafterfour: Time to snap out of that daydream and deploy Phabricator update. Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T0000). [00:01:19] !log no phabricator deployment this evening. [00:01:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:04:58] (03CR) 10jenkins-bot: Enable ORES on arwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/431035 (https://phabricator.wikimedia.org/T192498) (owner: 10Catrope) [00:06:08] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable ORES on arwiki (T192498) (duration: 01m 20s) [00:06:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:06:13] T192498: Deploy ORES advanced editquality models to arwiki - https://phabricator.wikimedia.org/T192498 [00:10:19] Ugh there are notices, "Undefined index max" [00:10:24] Trying to figure out which wiki is throwing them [00:10:50] Aha it's huwiki [00:12:18] (03PS1) 10Catrope: Follow-up c19a7d1dd: fix typo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432330 [00:12:20] (03PS1) 10Catrope: Follow-up 1ae686e8: add missing max key [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432331 [00:12:36] (03CR) 10Catrope: [C: 032] Follow-up c19a7d1dd: fix typo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432330 (owner: 10Catrope) [00:12:39] (03CR) 10Catrope: [C: 032] Follow-up 1ae686e8: add missing max key [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432331 (owner: 10Catrope) [00:14:10] (03Merged) 10jenkins-bot: Follow-up c19a7d1dd: fix typo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432330 (owner: 10Catrope) [00:14:13] (03Merged) 10jenkins-bot: Follow-up 1ae686e8: add missing max key [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432331 (owner: 10Catrope) [00:16:00] !log catrope@tin Synchronized wmf-config/InitialiseSettings.php: Fix notices about missing index "max" (duration: 01m 20s) [00:16:03] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:17:23] hoo, Reedy: I see that the Wikibase patch has merged now, but I have to leave. If you want to deploy that patch yourselves, feel free [00:17:45] SMalyshev: I didn't SWAT your patch because you didn't respond to my pings in this channel [00:17:52] RoanKattouw: Do you think it's ok to force merge on the wmf branch? [00:18:07] Probably [00:18:11] If it's a small simple patch [00:18:37] just removes one old wgHooks registration [00:18:40] I'll be bold [00:18:45] OK go for it then [00:18:51] I'm out, see you all tomorrow [00:20:20] (03CR) 10jenkins-bot: Follow-up 1ae686e8: add missing max key [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432331 (owner: 10Catrope) [00:26:53] !log hoo@tin Synchronized php-1.32.0-wmf.2/extensions/Wikibase/lib/WikibaseLib.php: Remove wgHooks entry for GalleryGetModes (T194316) (duration: 01m 20s) [00:26:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:26:57] T194316: Wikibooks Special:UnusedFiles throws Internal Error - https://phabricator.wikimedia.org/T194316 [00:28:55] !log hoo@tin Synchronized php-1.32.0-wmf.3/extensions/Wikibase/lib/WikibaseLib.php: Remove wgHooks entry for GalleryGetModes (T194316) (duration: 01m 16s) [00:28:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:29:20] 10Operations, 10Design-Research: Edit optoutresearch@ mailing list recipients - https://phabricator.wikimedia.org/T100860#4195961 (10Dzahn) @aripstra Thank you :) Let's first see if you still actually need it. Of course removing it would be easy. If you still want it that's also no problem but then we would wa... 
[00:29:24] Looks fine: https://en.wikibooks.org/wiki/Special:UnusedFiles [00:33:56] !log mw2135, mw2136, mw2137 - reinstall with stretch, depooled, downtimed [00:33:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:44:02] RECOVERY - Ubuntu mirror in sync with upstream on sodium is OK: /srv/mirrors/ubuntu is over 0 hours old. [00:52:29] (03PS1) 10Brian Wolff: Use the english message in badpass logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432334 [00:53:37] (03CR) 10jerkins-bot: [V: 04-1] Use the english message in badpass logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432334 (owner: 10Brian Wolff) [01:06:02] (03PS2) 10Brian Wolff: Use the english message in badpass logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432334 [01:16:41] !log twentyafterfour@tin Synchronized php-1.32.0-wmf.3/includes/libs/rdbms: sync https://gerrit.wikimedia.org/r/#/c/432297/ refs T194308 T191049 (duration: 01m 24s) [01:16:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:16:47] T191049: 1.32.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T191049 [01:16:48] T194308: Transaction should be in the callback stage (not 'cursory') - https://phabricator.wikimedia.org/T194308 [01:26:02] 10Operations, 10MediaWiki-Parser, 10MediaWiki-Platform-Team, 10Parsing-Team, and 2 others: Servers using tidy-html5 are rendering pages differently, especially with - https://phabricator.wikimedia.org/T193414#4196048 (10zhuyifei1999) >>! In T193414#4195487, @ssastry wrote: > See https://commons.wikim... 
[02:39:48] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2143.codfw.wmnet [02:39:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:41:27] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2142.codfw.wmnet [02:41:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:42:44] PROBLEM - Memory correctable errors -EDAC- on mw2213 is CRITICAL: 3 ge 3 https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=mw2213&var-datasource=codfw%2520prometheus%252Fops [02:44:54] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2141.codfw.wmnet [02:44:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:48:45] !log mw2136,mw2137,mw2138 - reinstall with stretch [02:48:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:51:53] !log mw2137,mw2138,mw2139 - reinstall with stretch (last message was wrong, already running) [02:51:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:02:59] 10Operations, 10ops-eqiad: kafka1023 correctable memory errors - https://phabricator.wikimedia.org/T194249#4196094 (10Dzahn) p:05Triage>03Normal [03:03:09] 10Operations, 10ops-codfw: wtp2020 correctable memory errors - https://phabricator.wikimedia.org/T194176#4196095 (10Dzahn) p:05Triage>03Normal [03:03:19] 10Operations, 10ops-codfw: wtp2013 memory correctable errors - https://phabricator.wikimedia.org/T194174#4196096 (10Dzahn) p:05Triage>03Normal [03:03:31] 10Operations, 10ops-codfw: mw2213 correctable memory errors - https://phabricator.wikimedia.org/T194172#4196097 (10Dzahn) p:05Triage>03Normal [03:03:42] 10Operations, 10ops-codfw: rdb2002 correctable memory errors - https://phabricator.wikimedia.org/T194171#4196098 (10Dzahn) p:05Triage>03Normal [03:04:53] !log l10nupdate@tin scap sync-l10n completed (1.32.0-wmf.2) (duration: 15m 35s) [03:04:57] Logged the message 
at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:28:42] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 746.09 seconds [04:14:51] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 162.15 seconds [04:52:33] <_joe_> zhuyifei1999_: maybe when I recover from the flu, probably next week [04:52:44] umm okay [04:52:56] get well soon [07:01:04] (03PS1) 10Jcrespo: wikireplicas: Disable binary log, increase transaction log and buffer [puppet] - 10https://gerrit.wikimedia.org/r/432344 (https://phabricator.wikimedia.org/T194343) [07:02:41] (03PS2) 10Jcrespo: wikireplicas: Disable binary log, increase transaction log and buffer [puppet] - 10https://gerrit.wikimedia.org/r/432344 (https://phabricator.wikimedia.org/T194343) [07:07:32] (03PS3) 10Jcrespo: wikireplicas: Disable binary log, increase transaction log and buffer [puppet] - 10https://gerrit.wikimedia.org/r/432344 (https://phabricator.wikimedia.org/T194343) [07:07:58] (03CR) 10Vgutierrez: [C: 032] install_server: Reimage lvs1003 as stretch spare system [puppet] - 10https://gerrit.wikimedia.org/r/432116 (https://phabricator.wikimedia.org/T184293) (owner: 10Vgutierrez) [07:08:04] (03PS4) 10Vgutierrez: install_server: Reimage lvs1003 as stretch spare system [puppet] - 10https://gerrit.wikimedia.org/r/432116 (https://phabricator.wikimedia.org/T184293) [07:08:10] (03CR) 10Jcrespo: [C: 032] wikireplicas: Disable binary log, increase transaction log and buffer [puppet] - 10https://gerrit.wikimedia.org/r/432344 (https://phabricator.wikimedia.org/T194343) (owner: 10Jcrespo) [07:09:26] (03PS5) 10Vgutierrez: install_server: Reimage lvs1003 as stretch spare system [puppet] - 10https://gerrit.wikimedia.org/r/432116 (https://phabricator.wikimedia.org/T184293) [07:12:17] !log depool labsdb1010 from wikireplicas for maintenance [07:12:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:21:03] 10Operations, 
10ops-eqiad, 10Traffic, 10Patch-For-Review: rack/setup/install lvs101[3-6] - https://phabricator.wikimedia.org/T184293#3878953 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by vgutierrez on neodymium.eqiad.wmnet for hosts: ``` lvs1003.wikimedia.org ``` The log can be found in `/... [07:54:30] 10Operations, 10ops-eqiad, 10Traffic, 10Patch-For-Review: rack/setup/install lvs101[3-6] - https://phabricator.wikimedia.org/T184293#4196209 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['lvs1003.wikimedia.org'] ``` and were **ALL** successful. [08:10:33] !log shutdown and restart labsdb1010 for upgrade [08:10:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:12:25] (03PS1) 10Mobrovac: Proton: Add role and profile [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) [08:12:58] (03CR) 10jerkins-bot: [V: 04-1] Proton: Add role and profile [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) (owner: 10Mobrovac) [08:13:50] (03CR) 10Volans: "> Patch Set 7:" [puppet] - 10https://gerrit.wikimedia.org/r/419131 (https://phabricator.wikimedia.org/T188112) (owner: 10Volans) [08:14:17] (03CR) 10Mobrovac: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) (owner: 10Mobrovac) [08:14:47] (03CR) 10jerkins-bot: [V: 04-1] Proton: Add role and profile [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) (owner: 10Mobrovac) [08:14:56] PROBLEM - haproxy failover on dbproxy1011 is CRITICAL: CRITICAL check_failover servers up 1 down 1 [08:15:38] ^that is expected, see log [08:17:18] (03CR) 10Filippo Giunchedi: prometheus: varnish_thumbnails aggregation rule (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/431528 (https://phabricator.wikimedia.org/T184942) (owner: 10Ema) [08:19:25] (03PS8) 10Volans: Cumin masters in WMCS: upgrade to python3 
[puppet] - 10https://gerrit.wikimedia.org/r/419131 (https://phabricator.wikimedia.org/T188112) [08:19:27] (03PS9) 10Volans: Cumin masters in prod: upgrade to python3 [puppet] - 10https://gerrit.wikimedia.org/r/412894 (https://phabricator.wikimedia.org/T187773) [08:21:07] RECOVERY - haproxy failover on dbproxy1011 is OK: OK check_failover servers up 2 down 0 [08:23:36] ^restart finished correctly [08:24:57] (03PS2) 10Mobrovac: Proton: Add role and profile [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) [08:26:51] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4196251 (10fgiunchedi) For swift / `ms` servers the requirements are as follows: * `ms-fe*` to be depooled and moved one at a time. * `ms-be*` to be moved one... [08:29:12] (03CR) 10Mobrovac: [C: 04-1] "Superseded by Id227c1e597f999097093c1f17e97e02eee7b17df" [puppet] - 10https://gerrit.wikimedia.org/r/409996 (https://phabricator.wikimedia.org/T178166) (owner: 10Niedzielski) [08:31:16] (03PS1) 10Ladsgroup: mediawiki: Stop cronjob to delete autopatrol actions [puppet] - 10https://gerrit.wikimedia.org/r/432350 (https://phabricator.wikimedia.org/T189596) [08:32:54] (03CR) 10Ladsgroup: "Please deploy :)" [puppet] - 10https://gerrit.wikimedia.org/r/432350 (https://phabricator.wikimedia.org/T189596) (owner: 10Ladsgroup) [08:34:54] (03PS1) 10Urbanecm: Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm, clean obsolete rules [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432354 (https://phabricator.wikimedia.org/T194346) [08:42:01] (03CR) 10Jcrespo: [C: 032] mediawiki: Stop cronjob to delete autopatrol actions [puppet] - 10https://gerrit.wikimedia.org/r/432350 (https://phabricator.wikimedia.org/T189596) (owner: 10Ladsgroup) [08:43:59] (03CR) 10Jcrespo: [C: 032] "Notice: 
/Stage[main]/Mediawiki::Maintenance::Wikidata/Cron[wikidata-deleteAutoPatrolLogs]/ensure: removed" [puppet] - 10https://gerrit.wikimedia.org/r/432350 (https://phabricator.wikimedia.org/T189596) (owner: 10Ladsgroup) [08:52:12] (03CR) 10Alexandros Kosiaris: [C: 031] "LGTM, stalling merge per mobrovac's request" [puppet] - 10https://gerrit.wikimedia.org/r/432347 (https://phabricator.wikimedia.org/T186748) (owner: 10Mobrovac) [09:07:00] !log stop db1072 for maintenance [09:07:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:17:47] (03PS4) 10ArielGlenn: add ability to keep some dumps files and remove others, during cleanup [puppet] - 10https://gerrit.wikimedia.org/r/432135 (https://phabricator.wikimedia.org/T194124) [09:26:18] (03CR) 10MarcoAurelio: "Scheduled for SWAT." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [09:27:06] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1966 bytes in 0.109 second response time [09:27:23] (03PS5) 10ArielGlenn: add ability to keep some dumps files and remove others, during cleanup [puppet] - 10https://gerrit.wikimedia.org/r/432135 (https://phabricator.wikimedia.org/T194124) [09:27:49] (03CR) 10Gilles: [C: 031] thumbor: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431594 (https://phabricator.wikimedia.org/T147326) (owner: 10Filippo Giunchedi) [09:32:16] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1949 bytes in 0.083 second response time [10:48:15] (03PS1) 10Volans: debmonitor: add hosts to the spare role [puppet] - 10https://gerrit.wikimedia.org/r/432356 (https://phabricator.wikimedia.org/T191299) [10:51:53] (03CR) 10Volans: [C: 032] debmonitor: add hosts to the spare role [puppet] - 10https://gerrit.wikimedia.org/r/432356 
(https://phabricator.wikimedia.org/T191299) (owner: 10Volans) [10:52:52] (03PS1) 10Jcrespo: mariadb: Add db1123 to mediawiki configuration, depooled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432357 (https://phabricator.wikimedia.org/T192979) [11:01:01] (03PS1) 10Jcrespo: mariadb: Prepare db1072 for stretch reimage and move it to m3 [puppet] - 10https://gerrit.wikimedia.org/r/432358 (https://phabricator.wikimedia.org/T186320) [11:02:26] (03PS1) 10Jcrespo: mariadb: Reenable notifications for db1123 [puppet] - 10https://gerrit.wikimedia.org/r/432359 (https://phabricator.wikimedia.org/T192979) [11:03:48] 10Operations, 10Traffic: Identify bots using AES128-SHA maintainers running on toolforge - https://phabricator.wikimedia.org/T194380#4196798 (10Vgutierrez) p:05Triage>03Normal [11:07:49] !log updated puppet compiler facts [11:07:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:15:35] (03PS1) 10Volans: Add dummy ECDSA private key for debmonitor.discovery.wmnet [labs/private] - 10https://gerrit.wikimedia.org/r/432361 (https://phabricator.wikimedia.org/T191299) [11:16:13] (03CR) 10Volans: [V: 032 C: 032] Add dummy ECDSA private key for debmonitor.discovery.wmnet [labs/private] - 10https://gerrit.wikimedia.org/r/432361 (https://phabricator.wikimedia.org/T191299) (owner: 10Volans) [11:18:02] (03PS4) 10Volans: debmonitor: add server side puppettization [puppet] - 10https://gerrit.wikimedia.org/r/430881 (https://phabricator.wikimedia.org/T191299) [11:22:23] (03PS1) 10Volans: Debmonitor: add dummy Django secret key [labs/private] - 10https://gerrit.wikimedia.org/r/432362 (https://phabricator.wikimedia.org/T191299) [11:22:52] (03CR) 10Volans: [V: 032 C: 032] Debmonitor: add dummy Django secret key [labs/private] - 10https://gerrit.wikimedia.org/r/432362 (https://phabricator.wikimedia.org/T191299) (owner: 10Volans) [11:33:47] (03PS1) 10Volans: Debmonitor: add dummy keyholder deployment keys [labs/private] - 
10https://gerrit.wikimedia.org/r/432363 (https://phabricator.wikimedia.org/T191299) [11:34:20] (03CR) 10Volans: [V: 032 C: 032] Debmonitor: add dummy keyholder deployment keys [labs/private] - 10https://gerrit.wikimedia.org/r/432363 (https://phabricator.wikimedia.org/T191299) (owner: 10Volans) [11:37:36] (03CR) 10Volans: "Compiler results available here:" [puppet] - 10https://gerrit.wikimedia.org/r/430881 (https://phabricator.wikimedia.org/T191299) (owner: 10Volans) [11:46:25] !log reimage analytics1030/31 to Debian Stretch [11:46:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:57:42] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related software - https://phabricator.wikimedia.org/T158837#4196943 (10Imarlier) [11:58:09] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related software - https://phabricator.wikimedia.org/T158837#3124337 (10Imarlier) a:05Krinkle>03Imarlier [11:59:58] jouncebot: [12:00:00] jouncebot: no [12:00:01] jouncebot: now [12:00:01] No deployments scheduled for the next 0 hour(s) and 59 minute(s) [12:00:03] jouncebot: next [12:00:03] In 0 hour(s) and 59 minute(s): European Mid-day SWAT(Max 6 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1300) [12:00:32] (03CR) 10Reedy: [C: 032] Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm, clean obsolete rules [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432354 (https://phabricator.wikimedia.org/T194346) (owner: 10Urbanecm) [12:02:12] (03Merged) 10jenkins-bot: Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm, clean obsolete rules [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432354 (https://phabricator.wikimedia.org/T194346) (owner: 10Urbanecm) [12:03:08] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related 
software - https://phabricator.wikimedia.org/T158837#4196950 (10Krinkle) [12:04:32] (03CR) 10Jcrespo: [C: 032] mariadb: Add db1123 to mediawiki configuration, depooled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432357 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [12:04:35] !log reedy@tin Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 48s) [12:04:38] !log that was for "Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm" [12:04:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:04:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:05:45] (03Merged) 10jenkins-bot: mariadb: Add db1123 to mediawiki configuration, depooled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432357 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [12:07:57] !log jynus@tin Synchronized wmf-config/db-codfw.php: Add db1123 (duration: 01m 19s) [12:08:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:21:15] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Add db1123 (duration: 01m 19s) [12:21:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:22:11] 10Operations, 10Datasets-General-or-Unknown, 10Dumps-Generation, 10hardware-requests: Give misc dump crons their own host - https://phabricator.wikimedia.org/T181936#4196982 (10ArielGlenn) In theory the new host is arriving today, and if all goes well it should be available for getting its puppet role by e... 
[12:22:33] (03PS1) 10ArielGlenn: turn off misc dump crons on snapshot1007 [puppet] - 10https://gerrit.wikimedia.org/r/432365 (https://phabricator.wikimedia.org/T181936) [12:23:43] (03CR) 10Alexandros Kosiaris: [C: 031] Add login and LDAP support [software/debmonitor] - 10https://gerrit.wikimedia.org/r/425417 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [12:24:35] (03PS1) 10ArielGlenn: role for new misc dumps cron host [puppet] - 10https://gerrit.wikimedia.org/r/432366 (https://phabricator.wikimedia.org/T181936) [12:24:52] (03PS2) 10Arturo Borrero Gonzalez: [WIP] openstack: neutron: nova.conf: enable options [puppet] - 10https://gerrit.wikimedia.org/r/432130 (https://phabricator.wikimedia.org/T193657) [12:25:31] (03PS1) 10ArielGlenn: add snapshot1008 role and hiera settings [puppet] - 10https://gerrit.wikimedia.org/r/432367 (https://phabricator.wikimedia.org/T181936) [12:26:28] (03PS1) 10ArielGlenn: do 8 jobs in parallel for wikidata weeklies [puppet] - 10https://gerrit.wikimedia.org/r/432368 (https://phabricator.wikimedia.org/T181936) [12:26:43] (03PS3) 10Arturo Borrero Gonzalez: [WIP] openstack: neutron: nova.conf: enable options [puppet] - 10https://gerrit.wikimedia.org/r/432130 (https://phabricator.wikimedia.org/T193657) [12:34:16] 10Operations, 10vm-requests: EQIAD & CODFW: 1 VM in each data center for xhprof/xhgui/other profiling tools - https://phabricator.wikimedia.org/T194390#4197015 (10Imarlier) [12:34:57] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related software - https://phabricator.wikimedia.org/T158837#4197028 (10Imarlier) [12:37:20] something wrong with puppet-compiler.wmflabs.org? 
host change details is empty, example: https://puppet-compiler.wmflabs.org/compiler02/11180/ [12:39:31] weird indeed [12:42:28] arturo: https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/11180/console [12:42:43] ERROR: Error preparing output for labtestcontrol2003.wikimedia.org: 'ascii' codec can't encode characters in position 5945-5946: ordinal not in range(128) [12:43:23] oh, so a bug in the puppet compiler software? [12:43:35] how did you make it run? Via pcc? [12:43:42] any spurious typos/chars ? [12:43:51] via the jenkins web for [12:43:55] form* [12:44:23] will try a rebuild [12:44:47] trying now with pcc [12:45:29] looks like an instance of T190842 [12:45:29] T190842: Compiler fails to generate html with non-ascii characters - https://phabricator.wikimedia.org/T190842 [12:45:42] yeah I was about to paste it :) [12:45:49] same again: https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/11181/console [12:46:14] same error for me too [12:46:15] doh, and T173518 ! I'll merge [12:46:15] T173518: Errors dealing with non-ascii characters in output - https://phabricator.wikimedia.org/T173518 [12:47:12] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related software - https://phabricator.wikimedia.org/T158837#4197057 (10Krinkle) [12:47:48] so there is something in the html that escapes ascii? 
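Editor's note on the error quoted above: the message "'ascii' codec can't encode characters ... ordinal not in range(128)" is what Python reports when text containing non-ASCII characters is encoded with the ASCII codec. The sketch below only reproduces that class of error. It is not the puppet compiler's actual code; the function names and the sample string are hypothetical, and choosing UTF-8 as the output encoding is an assumption about what a fix might look like.

```python
# Hypothetical helpers; not taken from the puppet compiler source.

def render_ascii_only(html: str) -> bytes:
    """Encode output as ASCII; raises UnicodeEncodeError on non-ASCII text."""
    return html.encode("ascii")


def render_utf8(html: str) -> bytes:
    """Encode output as UTF-8, which can represent any Unicode character."""
    return html.encode("utf-8")


# Invented sample output containing two non-ASCII characters.
sample = "<p>host diff: caf\u00e9 \u2192 ok</p>"

caught = None
try:
    render_ascii_only(sample)
except UnicodeEncodeError as err:
    # Keep a reference: the name bound by "as" is cleared after the block.
    caught = err

if caught is not None:
    print("ASCII encoding failed:", caught.reason)

# The UTF-8 path round-trips the same text without error.
encoded = render_utf8(sample)
print(encoded.decode("utf-8") == sample)
```

If the compiler writes its HTML through an implicit ASCII encoding step, one non-ASCII character anywhere in a catalog diff would be enough to abort the output for that host, which would fit the empty "host change details" page.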
[12:48:33] 10Operations, 10Performance-Team, 10monitoring, 10Patch-For-Review: Consolidate performance website and related software - https://phabricator.wikimedia.org/T158837#3349355 (10Krinkle) [12:49:50] looks like it yeah [12:52:17] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#3065109 (10Krinkle) [12:53:44] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#4197075 (10Krinkle) [12:53:56] (03CR) 10Alexandros Kosiaris: [C: 031] Add basic test coverage [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394621 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [12:54:49] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#3065109 (10Krinkle) Also ticking off the "backup files in bacula" checkbox, because we now use the regular carbon storage, which at some point between between 201... [12:56:29] (03PS1) 10Jcrespo: mariadb: Pool db1123 with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432372 (https://phabricator.wikimedia.org/T192979) [13:00:05] addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Your horoscope predicts another unfortunate European Mid-day SWAT(Max 6 patches) deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1300). [13:00:05] tgr, Urbanecm, and Hauskatze: A patch you scheduled for European Mid-day SWAT(Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. 
[13:00:18] I can SWAT today [13:00:18] o/ [13:00:46] o/ [13:00:49] no testing needed/possible for the TemplateStyles patch [13:00:53] tgr: want to deploy your patch yourself? [13:01:07] I can do that if you prefer [13:01:56] (03PS6) 10ArielGlenn: add ability to keep some dumps files and remove others, during cleanup [puppet] - 10https://gerrit.wikimedia.org/r/432135 (https://phabricator.wikimedia.org/T194124) [13:02:10] tgr: I'm fine with both, if there are nothing to test, I can do the deploy, since I will be doing everything else :) [13:02:26] please do it then [13:02:33] tgr: sure [13:02:34] (03CR) 10ArielGlenn: [C: 032] add ability to keep some dumps files and remove others, during cleanup [puppet] - 10https://gerrit.wikimedia.org/r/432135 (https://phabricator.wikimedia.org/T194124) (owner: 10ArielGlenn) [13:02:35] I can only test it once deployed [13:02:53] tgr: ok, I'll let you know in a few minutes when it's deployed [13:02:58] (03PS2) 10Zfilipin: Enable TemplateStyles for nowiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432073 (https://phabricator.wikimedia.org/T193786) (owner: 10Gergő Tisza) [13:03:30] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432073 (https://phabricator.wikimedia.org/T193786) (owner: 10Gergő Tisza) [13:04:47] (03Merged) 10jenkins-bot: Enable TemplateStyles for nowiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432073 (https://phabricator.wikimedia.org/T193786) (owner: 10Gergő Tisza) [13:06:49] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:432073|Enable TemplateStyles for nowiki (T193786)]] (duration: 01m 20s) [13:06:52] tgr: your patch is deployed, please test it and thanks for deploying with #releng ;) [13:06:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:06:54] T193786: [task] Configure TemplateStyles for nowiki - https://phabricator.wikimedia.org/T193786 [13:07:02] Reedy: you have deployed 
https://gerrit.wikimedia.org/r/#/c/432354/ ? [13:07:12] zeljkof: Yup [13:07:15] (it's scheduled for eu swat) [13:07:32] thanks! :D one less thing for swat [13:07:54] (03PS5) 10Zfilipin: cawiki: remove gendered namespace aliases, already on MW core [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [13:08:19] Hauskatze: around for swat? [13:08:40] da zeljkof [13:09:00] Hauskatze: please stand by, your patch will be at mwdebug in a few minutes [13:09:11] zeljkof: working fine, thanks! [13:09:13] (03CR) 10Alexandros Kosiaris: [C: 031] First working version (031 comment) [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394620 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:09:31] (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [13:09:32] Savršen. [13:09:41] :D [13:09:51] (savršeno) ;) [13:10:07] (but really rarely used) [13:10:18] I think it means 'perfect' ? 
[13:10:31] it does, but we do not use that phrase :) [13:10:48] (03Merged) 10jenkins-bot: cawiki: remove gendered namespace aliases, already on MW core [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [13:10:59] izvrsno would be a better choice, meaning excellent [13:11:01] (03Abandoned) 10Elukey: Update Gemfile after puppet-lint-wmf_styleguide-check update [puppet] - 10https://gerrit.wikimedia.org/r/428582 (owner: 10Elukey) [13:12:24] Hauskatze: the patch is at mwdebug, please test and let me know if I can deploy [13:12:41] zeljkof: I'll test now [13:13:34] (03PS1) 10Arturo Borrero Gonzalez: [WIP] openstack: neutron: api-paste.ini: enable options [puppet] - 10https://gerrit.wikimedia.org/r/432374 (https://phabricator.wikimedia.org/T193657) [13:13:44] (03PS1) 10ArielGlenn: don't configure any settings for a labs copy of dumps [puppet] - 10https://gerrit.wikimedia.org/r/432375 [13:14:10] zeljkof: noting weird that I can see [13:14:22] namespaceDupes will tell us in due time ;) [13:14:29] Hauskatze: ok, deploying [13:14:49] (03CR) 10Rush: [C: 04-1] [WIP] openstack: neutron: nova.conf: enable options (0313 comments) [puppet] - 10https://gerrit.wikimedia.org/r/432130 (https://phabricator.wikimedia.org/T193657) (owner: 10Arturo Borrero Gonzalez) [13:14:50] almost forgot about namespaceDupes [13:14:58] 10Operations, 10TemplateStyles, 10Traffic, 10Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#4197119 (10Tgr) [13:15:18] (03PS2) 10Arturo Borrero Gonzalez: [WIP] openstack: neutron: api-paste.ini: enable options [puppet] - 10https://gerrit.wikimedia.org/r/432374 (https://phabricator.wikimedia.org/T193657) [13:15:53] !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:429989|cawiki: remove gendered namespace aliases, already on MW core (T113616)]] (duration: 01m 20s) [13:15:57] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:15:58] T113616: Set $namespaceGenderAliases for catalan - https://phabricator.wikimedia.org/T113616 [13:16:34] Hauskatze: ok, deployed, but I can not connect to terbium, debugging [13:16:48] zeljkof: maybe you want to try to wasat [13:16:52] to -> on [13:16:59] if anybody knows what is wrong... [13:17:02] $ ssh terbium.equiad.wmnet [13:17:10] channel 0: open failed: administratively prohibited: open failed [13:17:14] (03CR) 10Volans: [V: 032 C: 032] "@akosiaris Thanks a lot for the review!!!" [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394620 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:17:19] stdio forwarding failed [13:17:25] ssh_exchange_identification: Connection closed by remote host [13:17:57] zeljkof: works for me [13:18:19] (03CR) 10Volans: [V: 032 C: 032] "Thanks a lot Alex and Moritz for the reviews!" [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394990 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:18:21] trying from another machine [13:19:04] the other machine connected [13:19:07] * zeljkof is confused [13:19:14] something is wrong on this machine [13:19:19] anyway, will run the script [13:19:26] !log reimage analytics1029 to Debian Stretch [13:19:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:19:36] (03CR) 10ArielGlenn: [C: 032] don't configure any settings for a labs copy of dumps [puppet] - 10https://gerrit.wikimedia.org/r/432375 (owner: 10ArielGlenn) [13:20:22] (03CR) 10Volans: [C: 032] Add basic test coverage [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394621 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:21:10] (03Merged) 10jenkins-bot: Add basic test coverage [software/debmonitor] - 10https://gerrit.wikimedia.org/r/394621 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:21:51] (03CR) 10Volans: [C: 032] Add login and LDAP support 
[software/debmonitor] - 10https://gerrit.wikimedia.org/r/425417 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:22:39] (03Merged) 10jenkins-bot: Add login and LDAP support [software/debmonitor] - 10https://gerrit.wikimedia.org/r/425417 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:23:04] thanks zeljkof - it seems that the links fixed were due to a rename in the Media namespace so no errors on my side :) [13:23:37] fwiw tgr the attachAccount.php script requires a list of accounts... I guess in a file? [13:23:53] needed to fix https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:CentralAuth/Rxy [13:23:57] (03CR) 10Zfilipin: [C: 032] "Output of namespaceDupes script is at T113616#4197144" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [13:24:03] another botched global rename [13:24:04] :| [13:24:14] Hauskatze: this is script output https://phabricator.wikimedia.org/T113616#4197144 [13:24:27] (03CR) 10Volans: [C: 032] Add server side validation of client certificates [software/debmonitor] - 10https://gerrit.wikimedia.org/r/428302 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:25:12] (03Merged) 10jenkins-bot: Add server side validation of client certificates [software/debmonitor] - 10https://gerrit.wikimedia.org/r/428302 (https://phabricator.wikimedia.org/T167504) (owner: 10Volans) [13:25:35] looks like that's all for eu swat [13:25:49] !log EU SWAT finished [13:25:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:28:15] Hauskatze: https://gerrit.wikimedia.org/r/432376 [13:28:49] ottomata: o/ [13:29:12] qq - would it be ok for me to restart kafka on jumbo for openjdk upgrades? [13:30:15] elukey: sure should be [13:30:20] mirror maker is running there now [13:30:25] so if you want it to pick up the jvm upgrdes [13:30:28] you'll have to restart that too i guess? 
[13:30:31] tgr: thanks, so I'll create a file on ~/home/maurelio and specify that as --userlist [13:31:02] ottomata: since you added it yesterday it should have already picked up the new version (checking) [13:31:17] yep [13:31:27] all right, starting! [13:31:41] oh ok great [13:31:55] !log rolling restart of kafka on kafka-jumbo1* for openjdk-8 security upgrades [13:31:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:33:43] (03Abandoned) 10ArielGlenn: use files from an optional 'prefetch dir' for prefetch [dumps] - 10https://gerrit.wikimedia.org/r/423242 (owner: 10ArielGlenn) [13:33:57] (03Abandoned) 10ArielGlenn: allow configuration of extra dir to search for prefetch files [dumps] - 10https://gerrit.wikimedia.org/r/423241 (owner: 10ArielGlenn) [13:35:47] (03CR) 10Alexandros Kosiaris: [C: 04-1] mcrouter: add support for listening on the ssl port (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/431736 (https://phabricator.wikimedia.org/T192370) (owner: 10Giuseppe Lavagetto) [13:37:15] (03PS1) 10Volans: Make model validation stronger [software/debmonitor] - 10https://gerrit.wikimedia.org/r/432377 [13:41:34] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2135.codfw.wmnet [13:41:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:42:06] (03CR) 10Alexandros Kosiaris: [C: 04-1] profile::mediawiki::mcrouter_wancache: add ssl, proxy support (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/431737 (https://phabricator.wikimedia.org/T192370) (owner: 10Giuseppe Lavagetto) [13:42:49] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2136.codfw.wmnet [13:42:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:43:00] (03CR) 10Alexandros Kosiaris: [C: 031] "Nice!" 
[puppet] - 10https://gerrit.wikimedia.org/r/431738 (https://phabricator.wikimedia.org/T192370) (owner: 10Giuseppe Lavagetto) [13:43:35] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2137.codfw.wmnet [13:43:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:46:51] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2138.codfw.wmnet [13:46:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:48:15] (03Abandoned) 10Krinkle: mtail: Use a temporary variable for $cache_control [puppet] - 10https://gerrit.wikimedia.org/r/432117 (https://phabricator.wikimedia.org/T184942) (owner: 10Krinkle) [13:53:21] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1003 is CRITICAL: 11.9 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1003 [13:54:24] (03PS1) 10ArielGlenn: keep dump prefetch files longer on dumps generation nfs servers [puppet] - 10https://gerrit.wikimedia.org/r/432378 (https://phabricator.wikimedia.org/T194124) [13:54:53] (03CR) 10jerkins-bot: [V: 04-1] keep dump prefetch files longer on dumps generation nfs servers [puppet] - 10https://gerrit.wikimedia.org/r/432378 (https://phabricator.wikimedia.org/T194124) (owner: 10ArielGlenn) [13:55:43] (03PS3) 10Filippo Giunchedi: thumbor: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431594 (https://phabricator.wikimedia.org/T147326) [13:56:05] under replicated partitions for 1003 are due to the reboots and already gone [13:56:10] err restarts [13:56:12] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1004 is CRITICAL: 10.28 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1004 [13:56:13] anybody else have 
trouble connecting to terbium? https://phabricator.wikimedia.org/P7114 [13:56:31] equiad.wmnet ? [13:56:34] (03CR) 10Filippo Giunchedi: [C: 032] thumbor: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431594 (https://phabricator.wikimedia.org/T147326) (owner: 10Filippo Giunchedi) [13:56:53] !log mw2139,mw2140,mw2144,mw2202 - reinstall with --no-verify - last few special cases that didnt have puppet certs or failed before - after that all appservers done [13:56:54] to make it more confusing, works fine from my laptop, does not work from my desktop, both have the same configuration, as far as I can see [13:56:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:57:10] zeljkof: in the paste I can see equiad.wmnet [13:57:13] not eqiad [13:57:28] ouch, could it be that simple!? :D [13:57:43] maybe I had a typo in my bash history... [13:57:45] checking [13:58:38] elukey: that was it! :D [13:58:43] did not even notice it [13:58:56] it was in my bash history for whatever reason... [13:59:08] :) [13:59:38] and my eyes are hurting trying to figure out what is different between the two machines... 
[14:01:10] (03PS2) 10ArielGlenn: keep dump prefetch files longer on dumps generation nfs servers [puppet] - 10https://gerrit.wikimedia.org/r/432378 (https://phabricator.wikimedia.org/T194124) [14:05:19] (03PS1) 10Ottomata: Set default eventbus kafka log level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) [14:07:00] !log restart hive/oozie Hadoop daemons on analytics1003 for openjdk-8 upgrades [14:07:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:09:29] (03PS2) 10Jcrespo: mariadb: Reenable notifications for db1123 [puppet] - 10https://gerrit.wikimedia.org/r/432359 (https://phabricator.wikimedia.org/T192979) [14:13:26] (03CR) 10Jcrespo: [C: 032] mariadb: Reenable notifications for db1123 [puppet] - 10https://gerrit.wikimedia.org/r/432359 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [14:16:22] (03PS2) 10Ottomata: Set default eventbus kafka log level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) [14:17:05] (03CR) 10jerkins-bot: [V: 04-1] Set default eventbus kafka log level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) (owner: 10Ottomata) [14:18:04] RECOVERY - Kafka Broker Under Replicated Partitions on kafka-jumbo1004 is OK: (C)10 ge (W)5 ge 4.433 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1004 [14:21:30] (03PS3) 10Ottomata: Set default eventbus kafka log level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) [14:22:39] (03PS4) 10Ottomata: Set default eventbus kafka log level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) [14:22:43] (03CR) 10Ottomata: [V: 032 C: 032] Set default eventbus kafka log 
level to WARNING [puppet] - 10https://gerrit.wikimedia.org/r/432380 (https://phabricator.wikimedia.org/T193230) (owner: 10Ottomata) [14:27:05] !log rolling restart of eventbus service to apply new log level settings [14:27:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:43] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1004 is CRITICAL: 16.03 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1004 [14:28:02] all good --^ [14:28:43] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1002 is CRITICAL: 11.03 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1002 [14:28:56] !log otto@tin Started restart [eventlogging/eventbus@aa9eb2c]: apply log level changes [14:28:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:30:14] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1005 is CRITICAL: 10.67 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1005 [14:31:20] (03PS2) 10Jcrespo: mariadb: Prepare db1072 for stretch reimage and move it to m3 [puppet] - 10https://gerrit.wikimedia.org/r/432358 (https://phabricator.wikimedia.org/T186320) [14:32:03] !log restarting db1123 [14:32:05] (03PS3) 10Filippo Giunchedi: swift: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431593 (https://phabricator.wikimedia.org/T147326) [14:32:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:32:55] (03CR) 10Jcrespo: [C: 032] mariadb: Prepare db1072 for stretch reimage and move it to m3 [puppet] - 10https://gerrit.wikimedia.org/r/432358 
(https://phabricator.wikimedia.org/T186320) (owner: 10Jcrespo) [14:33:53] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1005 is CRITICAL: 10.67 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1005 [14:34:12] 10Operations, 10Cassandra, 10Services (blocked), 10User-Eevans, 10User-fgiunchedi: Replace 5 Samsung SSD 850 devices w/ 4 1.6T Intel or HP SSDs - https://phabricator.wikimedia.org/T189822#4197344 (10Eevans) 05Open>03Resolved With the replacements done, the matrix of machine/controller and SSD combina... [14:34:23] RECOVERY - Kafka Broker Under Replicated Partitions on kafka-jumbo1003 is OK: (C)10 ge (W)5 ge 0.68 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1003 [14:34:53] (03CR) 10Filippo Giunchedi: [C: 032] swift: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431593 (https://phabricator.wikimedia.org/T147326) (owner: 10Filippo Giunchedi) [14:34:58] (03PS4) 10Filippo Giunchedi: swift: add prometheus-memcached-exporter [puppet] - 10https://gerrit.wikimedia.org/r/431593 (https://phabricator.wikimedia.org/T147326) [14:37:43] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1005 is CRITICAL: 10.67 ge 10 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1005 [14:38:10] !log rolling restart of Hadoop Yarn nodemanagers on analytics worker nodes for openjdk-8 security upgrades [14:38:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:41:23] PROBLEM - Kafka Broker Under Replicated Partitions on kafka-jumbo1005 is CRITICAL: 10.67 ge 10 
https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1005 [14:41:34] sorry the alarm is a bit spammy [14:41:45] 10Operations, 10ops-codfw, 10DBA: Degraded RAID on db2067 - https://phabricator.wikimedia.org/T194103#4197363 (10Papaul) a:05Papaul>03Marostegui @Marostegui replaced the disk with another disk. [14:42:24] elukey: you can go here [14:42:25] https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=Kafka+Broker+Under+Replicated+Partitions [14:42:29] and silence/dt them [14:42:29] :) [14:42:55] ah nice shortcut thanks! [14:43:02] 10Operations, 10ops-codfw, 10DBA: Degraded RAID on db2067 - https://phabricator.wikimedia.org/T194103#4197368 (10jcrespo) a:05Marostegui>03jcrespo Manuel is not around today, I am taking the task. [14:44:37] (03CR) 10Rush: "> Patch Set 2:" [puppet] - 10https://gerrit.wikimedia.org/r/430539 (https://phabricator.wikimedia.org/T190893) (owner: 10Zhuyifei1999) [14:46:26] (03CR) 10Jcrespo: [C: 032] mariadb: Pool db1123 with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432372 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [14:47:41] (03Merged) 10jenkins-bot: mariadb: Pool db1123 with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432372 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [14:53:16] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Pool db1123 with low weight (duration: 01m 20s) [14:53:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:53:53] RECOVERY - Kafka Broker Under Replicated Partitions on kafka-jumbo1005 is OK: (C)10 ge (W)5 ge 4.767 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1005 [14:54:09] (03CR) 10EBernhardson: [C: 031] elasticsearch: alert when cirrus 
writes are frozen for too long [puppet] - 10https://gerrit.wikimedia.org/r/431754 (https://phabricator.wikimedia.org/T193605) (owner: 10Gehel) [14:54:12] (03CR) 10Anomie: WIP: wiki replicas - prepare for refactored actor storage (0310 comments) [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) (owner: 10Bstorm) [14:55:36] (03CR) 10EBernhardson: [C: 031] Add wikis with more that 1000 categories to categories dump [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432043 (https://phabricator.wikimedia.org/T194139) (owner: 10Smalyshev) [14:56:03] RECOVERY - Kafka Broker Under Replicated Partitions on kafka-jumbo1004 is OK: (C)10 ge (W)5 ge 3.5 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1004 [14:56:34] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#4197404 (10fgiunchedi) >>! In T159354#4197075, @Krinkle wrote: > Also ticking off the "backup files in bacula" checkbox, because we now use the regular carbon sto... [15:00:43] !log rolling restart of Hadoop HDFS datanodes on analytics workers to pick up the new openjdk-8 security upgrades [15:00:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:02:28] PROBLEM - Check systemd state on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:03:57] PROBLEM - Check the NTP synchronisation status of timesyncd on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:05:27] PROBLEM - Nginx local proxy to apache on mw2202 is CRITICAL: connect to address 10.192.32.90 and port 443: Connection refused [15:05:28] PROBLEM - Check whether ferm is active by checking the default input chain on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. 
[15:05:57] PROBLEM - kubelet operational latencies on kubernetes1001 is CRITICAL: instance=kubernetes1001.eqiad.wmnet operation_type={create_container,image_status,podsandbox_status,remove_container,start_container} https://grafana.wikimedia.org/dashboard/db/kubernetes-kubelets?orgId=1 [15:06:58] PROBLEM - DPKG on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:07:07] RECOVERY - kubelet operational latencies on kubernetes1001 is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/kubernetes-kubelets?orgId=1 [15:08:28] PROBLEM - configured eth on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:08:54] is anybody working on mw2202 ? [15:09:11] ah yes it seems a reimage [15:10:07] PROBLEM - Disk space on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:10:07] PROBLEM - dhclient process on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:11:37] PROBLEM - mediawiki-installation DSH group on mw2202 is CRITICAL: Host mw2202 is not in mediawiki-installation dsh group [15:11:37] PROBLEM - HHVM processes on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:12:17] PROBLEM - High CPU load on API appserver on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:12:58] PROBLEM - HHVM rendering on mw2202 is CRITICAL: connect to address 10.192.32.90 and port 80: Connection refused [15:13:07] PROBLEM - nutcracker port on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:14:38] PROBLEM - nutcracker process on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:16:08] PROBLEM - puppet last run on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. 
[15:17:27] PROBLEM - Apache HTTP on mw2202 is CRITICAL: connect to address 10.192.32.90 and port 80: Connection refused [15:19:07] 10Operations, 10Ops-Access-Reviews: Add Michael Holloway (Reading Infrastructure) to maps admin groups - https://phabricator.wikimedia.org/T194404#4197487 (10Mholloway) [15:19:17] PROBLEM - Check size of conntrack table on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:19:17] PROBLEM - MD RAID on mw2202 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake. [15:23:50] 10Operations, 10Ops-Access-Reviews, 10Reading-Infrastructure-Team-Backlog: Add Michael Holloway (Reading Infrastructure) to maps admin groups - https://phabricator.wikimedia.org/T194404#4197499 (10Mholloway) [15:24:50] downtimed mw2202 for 2h [15:24:58] Cc: mutante [15:32:39] (03Abandoned) 10Mholloway: Stop rewriting m.wikipedia.org and zero.wikipedia.org [puppet] - 10https://gerrit.wikimedia.org/r/404158 (https://phabricator.wikimedia.org/T69015) (owner: 10Mholloway) [15:37:10] (03CR) 10Jforrester: [C: 031] Use the english message in badpass logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432334 (owner: 10Brian Wolff) [15:40:19] (03PS1) 10Mholloway: Add mholloway-shell to maps-admins, kartotherian-admin, tilerator-admin [puppet] - 10https://gerrit.wikimedia.org/r/432393 (https://phabricator.wikimedia.org/T194404) [15:41:17] (03CR) 10Jcrespo: [C: 032] mariadb: Move db1072 to m3, db1123 to s4 [software] - 10https://gerrit.wikimedia.org/r/432103 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [15:42:17] (03Merged) 10jenkins-bot: mariadb: Move db1072 to m3, db1123 to s4 [software] - 10https://gerrit.wikimedia.org/r/432103 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [15:42:42] (03PS1) 10Volans: Client self-update capability [software/debmonitor] - 10https://gerrit.wikimedia.org/r/432394 (https://phabricator.wikimedia.org/T167504) [15:42:46] (03PS1) 10Volans: CLI: use lsb_release for OS 
detection [software/debmonitor] - 10https://gerrit.wikimedia.org/r/432395 [15:53:46] RECOVERY - Kafka Broker Under Replicated Partitions on kafka-jumbo1002 is OK: (C)10 ge (W)5 ge 4 https://grafana.wikimedia.org/dashboard/db/kafka?panelId=29&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1002 [15:54:17] !log drain and reimage analytics1028 to Debian Jessie (Hadoop Journal node) [15:54:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:00:04] godog, moritzm, and _joe_: (Dis)respected human, time to deploy Puppet SWAT(Max 6 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1600). Please do the needful. [16:00:05] No GERRIT patches in the queue for this window AFAICS. [16:04:13] no patches? https://media1.giphy.com/media/FlHsBSjHXgBMY/giphy.mp4 [16:08:20] 10Operations, 10Discovery-Search (Current work), 10Patch-For-Review: search.wikimedia.org is source of lots of 500s - https://phabricator.wikimedia.org/T179266#4197682 (10debt) 05Open>03Resolved Thanks, Erik! [16:10:03] godog: Wait, is it too late to add one? [16:10:30] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#4197690 (10Krinkle) @fgiunchedi Ah, I see now. b44b1213518cf5 added it, but acc4a831bc623 left the declaration unused. 
[16:10:38] I totally forgot in everything else happening that I wanted https://gerrit.wikimedia.org/r/#/c/430260/ deployed [16:11:33] bawolff: no, please add it to the deployments page, I'll take a look [16:11:40] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#4197691 (10Krinkle) [16:13:43] godog: added [16:17:12] bawolff: kk, I'll merge [16:17:20] (03PS2) 10Filippo Giunchedi: Allow mediawiki-l and mediawiki-announce to be indexed [puppet] - 10https://gerrit.wikimedia.org/r/430260 (https://phabricator.wikimedia.org/T193572) (owner: 10Brian Wolff) [16:17:20] Thank you [16:18:06] (03CR) 10Filippo Giunchedi: [C: 032] Allow mediawiki-l and mediawiki-announce to be indexed [puppet] - 10https://gerrit.wikimedia.org/r/430260 (https://phabricator.wikimedia.org/T193572) (owner: 10Brian Wolff) [16:20:23] bawolff: np, it is live [16:20:56] Woo - https://lists.wikimedia.org/robots.txt [16:21:03] * bawolff welcomes our new robot overloads [16:21:51] seriously, I wonder how long it'll take for the internet to notice [16:31:38] yeah. 
I guess a couple days at the very least [16:36:10] RECOVERY - Apache HTTP on mw2202 is OK: HTTP OK: HTTP/1.1 200 OK - 10975 bytes in 0.073 second response time [16:39:15] (03PS1) 10Amire80: Add wmgBabelCategoryNames to officewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432403 [16:41:10] 10Operations, 10netops: Asymetric routing for cross-DC pfw syslog-tls - https://phabricator.wikimedia.org/T192104#4197810 (10ayounsi) p:05Triage>03Normal [16:41:56] 10Operations, 10netops: Asymetric routing for cross-DC pfw syslog-tls - https://phabricator.wikimedia.org/T192104#4197815 (10ayounsi) [16:56:52] RECOVERY - Check whether ferm is active by checking the default input chain on mw2202 is OK: OK ferm input default policy is set [16:56:53] RECOVERY - MD RAID on mw2202 is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0 [16:57:11] (03PS4) 10Krinkle: prometheus: Add varnishrls aggregation rules [puppet] - 10https://gerrit.wikimedia.org/r/432090 (https://phabricator.wikimedia.org/T184942) [16:57:12] RECOVERY - Disk space on mw2202 is OK: DISK OK [16:57:12] RECOVERY - dhclient process on mw2202 is OK: PROCS OK: 0 processes with command name dhclient [16:57:22] RECOVERY - HHVM processes on mw2202 is OK: PROCS OK: 6 processes with command name hhvm [16:57:23] RECOVERY - DPKG on mw2202 is OK: All packages OK [16:57:33] RECOVERY - configured eth on mw2202 is OK: OK - interfaces up [16:57:38] (03PS5) 10Krinkle: prometheus: Add varnishrls aggregation rules [puppet] - 10https://gerrit.wikimedia.org/r/432090 (https://phabricator.wikimedia.org/T190978) [16:57:42] RECOVERY - Check size of conntrack table on mw2202 is OK: OK: nf_conntrack is 0 % full [17:00:04] cscott, arlolra, subbu, halfak, and Amir1: Your horoscope predicts another unfortunate Services – Graphoid / Parsoid / Citoid / ORES deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1700). 
[17:01:24] (03PS1) 10Ayounsi: Smokeping: replace tellurium with frbast1001 [puppet] - 10https://gerrit.wikimedia.org/r/432404 [17:01:26] RECOVERY - Check systemd state on mw2202 is OK: OK - running: The system is fully operational [17:01:56] RECOVERY - nutcracker port on mw2202 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 11212 [17:02:06] (03CR) 10Ayounsi: [C: 032] Smokeping: replace tellurium with frbast1001 [puppet] - 10https://gerrit.wikimedia.org/r/432404 (owner: 10Ayounsi) [17:02:16] RECOVERY - Nginx local proxy to apache on mw2202 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.237 second response time [17:04:06] RECOVERY - Check the NTP synchronisation status of timesyncd on mw2202 is OK: OK: synced at Thu 2018-05-10 17:03:55 UTC. [17:04:26] RECOVERY - HHVM rendering on mw2202 is OK: HTTP OK: HTTP/1.1 200 OK - 74320 bytes in 0.295 second response time [17:04:37] RECOVERY - puppet last run on mw2202 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:04:56] RECOVERY - High CPU load on API appserver on mw2202 is OK: OK - load average: 0.30, 0.37, 0.18 [17:07:38] 10Operations, 10ops-codfw, 10DBA: Degraded RAID on db2067 - https://phabricator.wikimedia.org/T194103#4198044 (10jcrespo) @Papaul can you check the disk used? It says it has 300GB, either it is wrongly detected or a mistake (this host needs 600GB disks), and it failed becaue of that: ``` physicaldrive 1I:1:... 
[17:07:43] 10Operations, 10ops-codfw, 10DBA: Degraded RAID on db2067 - https://phabricator.wikimedia.org/T194103#4198048 (10jcrespo) a:05jcrespo>03Papaul [17:09:26] PROBLEM - Check correctness of the icinga configuration on einsteinium is CRITICAL: Icinga configuration contains errors [17:12:16] RECOVERY - nutcracker process on mw2202 is OK: PROCS OK: 1 process with UID = 113 (nutcracker), command name nutcracker [17:14:51] 10Operations, 10ops-eqiad, 10DBA, 10decommission: Decommission db1055 - https://phabricator.wikimedia.org/T194118#4198106 (10jcrespo) a:05jcrespo>03RobH @robh this can now proceed as usual, db1055 is a spare with notifications disabled. Please note that in case of reusing its parts, megaraid,1,megaraid... [17:15:59] 10Operations, 10ops-eqiad, 10DBA, 10decommission: Decommission db1055 - https://phabricator.wikimedia.org/T194118#4198112 (10jcrespo) [17:16:15] 10Operations, 10Performance-Team: Certain graphite data directories should be backed up - https://phabricator.wikimedia.org/T194418#4198120 (10Imarlier) [17:16:18] 10Operations, 10ops-eqiad, 10DBA, 10decommission: Decommission db1055 - https://phabricator.wikimedia.org/T194118#4189202 (10jcrespo) p:05Normal>03Low [17:16:47] 10Operations, 10Performance-Team, 10Patch-For-Review: Move coal from graphite#001 nodes to webperf#001 - https://phabricator.wikimedia.org/T159354#4198131 (10Imarlier) @Krinkle @fgiunchedi I split the backup question out to it's own task: https://phabricator.wikimedia.org/T194418 [17:17:03] so icinga's config is not good for [17:17:04] Error: Could not find any host matching 'mw2202.mgmt' (config file '/etc/icinga/puppet_services.cfg', starting on line 495126) [17:17:12] mutante: --^ [17:29:44] RECOVERY - Check correctness of the icinga configuration on einsteinium is OK: Icinga configuration is correct [17:30:21] (03PS1) 10Cdentinger: fundraising bastion replacement [puppet] - 10https://gerrit.wikimedia.org/r/432406 [17:30:22] ah nice [17:41:17] (03CR) 10Ayounsi: [C: 
031] "not a icinga expert, and frack uses icinga differently, but that looks good" [puppet] - 10https://gerrit.wikimedia.org/r/432406 (owner: 10Cdentinger) [17:44:08] 10Operations, 10Cloud-Services, 10netops: Allocate public v4 IPs for Neutron setup in eqiad - https://phabricator.wikimedia.org/T193496#4198174 (10chasemp) [17:44:55] 10Operations, 10Cloud-Services, 10netops: Allocate public v4 IPs for Neutron setup in eqiad - https://phabricator.wikimedia.org/T193496#4171118 (10chasemp) >>! In T193496#4192431, @ayounsi wrote: > @chasemp Can you provide an ETA for returning the /25? It's educated guess work but towards the end of calenda... [17:46:48] !log ppchelko@tin Started deploy [restbase/deploy@fb306e3]: Logging improvements [17:46:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:49] (03PS1) 10Jcrespo: mariadb: Fully pool db1123, remove db1072 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432409 (https://phabricator.wikimedia.org/T186320) [17:49:27] (03PS2) 10Jcrespo: mariadb: Fully pool db1123, remove db1072 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432409 (https://phabricator.wikimedia.org/T186320) [17:54:22] (03CR) 10Jcrespo: [C: 032] mariadb: Fully pool db1123, remove db1072 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432409 (https://phabricator.wikimedia.org/T186320) (owner: 10Jcrespo) [17:55:34] (03Merged) 10jenkins-bot: mariadb: Fully pool db1123, remove db1072 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432409 (https://phabricator.wikimedia.org/T186320) (owner: 10Jcrespo) [17:55:59] (03CR) 10Jgreen: [C: 032] fundraising bastion replacement [puppet] - 10https://gerrit.wikimedia.org/r/432406 (owner: 10Cdentinger) [17:56:09] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4198186 (10chasemp) Seems like the total list for #cloud-services-team to really worry about from the physical 
diagrams is: ```C2: labstore1004 C3: labsdb100... [17:56:42] 10Operations, 10ops-eqiad, 10Cloud-VPS: Update and move labnet1001/1002 - https://phabricator.wikimedia.org/T193579#4198188 (10chasemp) @Cmjohnson did you get a chance to move labnet1002? [17:57:42] !log jynus@tin Synchronized wmf-config/db-codfw.php: Remove db1072 (duration: 01m 20s) [17:57:43] PROBLEM - mobileapps endpoints health on scb1003 is CRITICAL: /{domain}/v1/page/most-read/{year}/{month}/{day} (retrieve the most read articles for January 1, 2016) timed out before a response was received [17:57:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:58:46] ^ fits the pattern of mobileapps alerting in response to restbase deployments [17:59:21] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Fully pool db1123, Remove db1072 (duration: 01m 15s) [17:59:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:00:04] RECOVERY - mobileapps endpoints health on scb1003 is OK: All endpoints are healthy [18:00:04] addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Your horoscope predicts another unfortunate Morning SWAT (Max 6 patches) deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1800). [18:00:04] No GERRIT patches in the queue for this window AFAICS. 
[18:02:21] !log ppchelko@tin Finished deploy [restbase/deploy@fb306e3]: Logging improvements (duration: 15m 33s) [18:02:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:09:55] !log sbisson@tin Started deploy [kartotherian/deploy@3aa87ff]: Kartotherian: load style from yaml [18:09:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:15:57] 10Operations, 10Product-Analytics, 10SRE-Access-Requests: Requesting access to stat1006 for Go Fish Digital - https://phabricator.wikimedia.org/T194287#4198219 (10mpopov) [18:16:09] !log sbisson@tin Finished deploy [kartotherian/deploy@3aa87ff]: Kartotherian: load style from yaml (duration: 06m 16s) [18:16:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:21:47] !log sbisson@tin Started deploy [tilerator/deploy@a5ec109]: Tilerator: load source and style from yaml [18:21:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:30:30] !log sbisson@tin Finished deploy [tilerator/deploy@a5ec109]: Tilerator: load source and style from yaml (duration: 08m 45s) [18:30:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:33:45] (03PS3) 10Bstorm: WIP: wiki replicas - prepare for refactored actor storage [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) [18:34:48] (03CR) 10Bstorm: "I think I captured all the comments here. 
Besides any further typos or quirks, I might want to go back and look at the comment about the " [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) (owner: 10Bstorm) [18:52:00] (03PS1) 10MaxSem: Revert "Temporarily disable GlobalPreferences in beta" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432416 [18:52:29] (03PS2) 10MaxSem: Revert "Temporarily disable GlobalPreferences in beta" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432416 [18:52:58] * MaxSem is gonna SWAT the above [18:53:21] (03CR) 10MaxSem: [C: 032] Revert "Temporarily disable GlobalPreferences in beta" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432416 (owner: 10MaxSem) [18:54:32] (03Merged) 10jenkins-bot: Revert "Temporarily disable GlobalPreferences in beta" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432416 (owner: 10MaxSem) [18:56:43] !log maxsem@tin Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/432416/ labs only (duration: 01m 20s) [18:56:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:00:05] twentyafterfour: #bothumor I � Unicode. All rise for MediaWiki train deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T1900). 
[19:04:33] (03CR) 10Bstorm: "Ok, on the indexes, because this already adds a bunch of indexes for the WHERE clause of the actor table, the revactor_rev field is alread" [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) (owner: 10Bstorm) [19:08:29] (03PS4) 10Bstorm: WIP: wiki replicas - prepare for refactored actor storage [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) [19:09:20] (03CR) 10jenkins-bot: mariadb: Fully pool db1123, remove db1072 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432409 (https://phabricator.wikimedia.org/T186320) (owner: 10Jcrespo) [19:12:09] 10Operations, 10Performance-Team, 10Graphite: Certain graphite data directories should be backed up - https://phabricator.wikimedia.org/T194418#4198359 (10Aklapper) [19:13:40] (03CR) 10Anomie: [C: 031] "Should be good with the changes made now. Haven't tested, of course." [puppet] - 10https://gerrit.wikimedia.org/r/431823 (https://phabricator.wikimedia.org/T188299) (owner: 10Bstorm) [19:14:51] !log twentyafterfour@tin Synchronized php-1.32.0-wmf.3/includes/libs/rdbms/: deploy https://gerrit.wikimedia.org/r/#/c/432415/ refs T194308 (duration: 01m 23s) [19:14:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:14:56] T194308: Transaction should be in the callback stage (not 'cursory') - https://phabricator.wikimedia.org/T194308 [19:19:47] 10Operations, 10Product-Analytics, 10SRE-Access-Requests: Requesting access to stat1006 for Go Fish Digital - https://phabricator.wikimedia.org/T194287#4194550 (10RobH) Ok, this needs quite a bit more info. Shell access is handled by individual accounts, so we'll need to know the individual users they want... [19:20:28] 10Operations, 10Product-Analytics, 10SRE-Access-Requests: Requesting access to stat1006 for Go Fish Digital - https://phabricator.wikimedia.org/T194287#4198376 (10RobH) a:03mpopov Assigning back to @mpopov for feedback. 
Once given, just set to unassigned to be picked back up by SRE team. [19:35:00] (03CR) 10jenkins-bot: Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm, clean obsolete rules [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432354 (https://phabricator.wikimedia.org/T194346) (owner: 10Urbanecm) [19:35:36] (03CR) 10jenkins-bot: mariadb: Add db1123 to mediawiki configuration, depooled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432357 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [19:42:35] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#3991415 (10bd808) >>! In T187962#4198186, @chasemp wrote: > Seems like the total list for #cloud-services-team to really worry about from the physical diagrams... [19:44:29] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4198413 (10Andrew) In theory we can do it on the 24th but the 31st is much better for me. 
[19:53:33] !log all deployment blockers resolved, proceeding to deploy mediawiki 1.32.0-wmf.3 to all wikis [19:53:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:54:13] (03PS1) 1020after4: all wikis to 1.32.0-wmf.3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432424 [19:54:15] (03CR) 1020after4: [C: 032] all wikis to 1.32.0-wmf.3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432424 (owner: 1020after4) [19:55:41] (03Merged) 10jenkins-bot: all wikis to 1.32.0-wmf.3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432424 (owner: 1020after4) [19:57:50] !log twentyafterfour@tin rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.3 [19:57:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:58:34] (03CR) 10jenkins-bot: Enable TemplateStyles for nowiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432073 (https://phabricator.wikimedia.org/T193786) (owner: 10Gergő Tisza) [19:58:40] (03CR) 10jenkins-bot: cawiki: remove gendered namespace aliases, already on MW core [mediawiki-config] - 10https://gerrit.wikimedia.org/r/429989 (https://phabricator.wikimedia.org/T113616) (owner: 10MarcoAurelio) [19:58:46] (03CR) 10jenkins-bot: mariadb: Pool db1123 with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432372 (https://phabricator.wikimedia.org/T192979) (owner: 10Jcrespo) [19:58:52] (03CR) 10jenkins-bot: Revert "Temporarily disable GlobalPreferences in beta" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432416 (owner: 10MaxSem) [19:58:57] (03CR) 10jenkins-bot: Follow-up c19a7d1dd: fix typo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432330 (owner: 10Catrope) [19:59:03] (03CR) 10jenkins-bot: all wikis to 1.32.0-wmf.3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432424 (owner: 1020after4) [20:02:12] !log New branch 1.32.0-wmf.3 appears to be stable on all wikis. This completes the train for the week. 
Tune in again next week, same bat time, same bat channel. refs T191049 [20:02:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:02:16] T191049: 1.32.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T191049 [20:13:39] !log sbisson@tin Started deploy [kartotherian/deploy@06572bb]: Kartotherian: add fonts for Syriac and Inuktituk [20:13:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:17:19] !log sbisson@tin Finished deploy [kartotherian/deploy@06572bb]: Kartotherian: add fonts for Syriac and Inuktituk (duration: 03m 40s) [20:17:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:18:46] (03PS1) 10Halfak: Adds git-lfs package to ores base.pp [puppet] - 10https://gerrit.wikimedia.org/r/432432 [20:18:54] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4198506 (10ayounsi) [20:19:24] I could use a quick review of https://gerrit.wikimedia.org/r/#/c/432432/ if anyone has time [20:19:48] It's just adding a package to our base for ORES because we'll need it for deployments in labs. [20:24:27] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2202.codfw.wmnet [20:24:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:25:00] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4198533 (10ayounsi) 31st isn't ideal for me, let's aim for May 29th if that also works for @Cmjohnson. Note that if we can't agree on a definitive date, we wi... 
[20:26:04] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2144.codfw.wmnet [20:26:06] https://phab.wmflabs.org/config/group/swift/ [20:26:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:26:07] oh [20:26:08] woops [20:27:47] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2140.co3dfw.wmnet [20:27:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:27:53] !log dzahn@neodymium conftool action : set/pooled=yes; selector: name=mw2140.codfw.wmnet [20:27:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:30:53] 10Operations, 10Cloud-Services, 10netops: Allocate public v4 IPs for Neutron setup in eqiad - https://phabricator.wikimedia.org/T193496#4198545 (10ayounsi) a:05ayounsi>03faidon 185.15.56.0/24 is yours pending approval from @faidon [20:41:35] (03PS1) 10Rush: openstack: allow codfw control servers to connect to labtest mysql [puppet] - 10https://gerrit.wikimedia.org/r/432525 [20:42:21] (03CR) 10Rush: [C: 032] openstack: allow codfw control servers to connect to labtest mysql [puppet] - 10https://gerrit.wikimedia.org/r/432525 (owner: 10Rush) [20:46:09] (03PS1) 10Rush: openstack: resolve in puppet for labtest keystone ferm [puppet] - 10https://gerrit.wikimedia.org/r/432527 [20:46:32] PROBLEM - Check whether ferm is active by checking the default input chain on labtestcontrol2001 is CRITICAL: ERROR ferm input drop default policy not set, ferm might not have been started correctly [20:46:59] (03PS2) 10Rush: openstack: resolve in puppet for labtest keystone ferm [puppet] - 10https://gerrit.wikimedia.org/r/432527 [20:47:23] ^fixing [20:47:38] (03CR) 10Rush: [C: 032] openstack: resolve in puppet for labtest keystone ferm [puppet] - 10https://gerrit.wikimedia.org/r/432527 (owner: 10Rush) [20:50:22] RECOVERY - Check whether ferm is active by checking the default input chain on labtestcontrol2001 is OK: OK ferm input default policy 
is set [20:50:46] 10Operations, 10HHVM, 10Patch-For-Review, 10User-Elukey: Upgrade mw* servers to Debian Stretch (using HHVM) - https://phabricator.wikimedia.org/T174431#4198574 (10Dzahn) [20:51:30] 10Operations, 10HHVM, 10Patch-For-Review, 10User-Elukey: Upgrade mw* servers to Debian Stretch (using HHVM) - https://phabricator.wikimedia.org/T174431#3561778 (10Dzahn) All appservers are running stretch now. (one of them is broken, creating subtask) This now just needs to stay open for deployment and m... [20:56:22] 10Operations, 10ops-codfw, 10DC-Ops: mw2139 failed to boot - hardware check - https://phabricator.wikimedia.org/T194426#4198591 (10Dzahn) [20:56:24] 10Operations, 10ops-codfw, 10DC-Ops: mw2139 failed to boot - hardware check - https://phabricator.wikimedia.org/T194426#4198603 (10Dzahn) p:05Triage>03Normal [20:57:23] PROBLEM - keystone public endoint port 5000 on labtestcontrol2001 is CRITICAL: connect to address 208.80.153.47 and port 5000: Connection refused [20:57:52] (03PS1) 1020after4: Add account for phabricator_files to swift::params::accounts [puppet] - 10https://gerrit.wikimedia.org/r/432528 [20:58:22] PROBLEM - keystone admin endpoint port 35357 on labtestcontrol2001 is CRITICAL: connect to address 208.80.153.47 and port 35357: Connection refused [20:58:45] (03Abandoned) 10Dzahn: Revert "disable icinga notifications on mw22* hosts" [puppet] - 10https://gerrit.wikimedia.org/r/431807 (owner: 10Dzahn) [20:59:03] (03PS1) 10Dzahn: Revert "disable icinga notifications on mw21[3-4]* hosts" [puppet] - 10https://gerrit.wikimedia.org/r/432529 [20:59:39] (03PS1) 10Ottomata: Puppetize Turnilo (Pivot replacement) [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) [21:00:24] (03CR) 10Dzahn: [C: 032] Revert "disable icinga notifications on mw21[3-4]* hosts" [puppet] - 10https://gerrit.wikimedia.org/r/432529 (owner: 10Dzahn) [21:00:30] (03PS2) 10Dzahn: Revert "disable icinga notifications on mw21[3-4]* hosts" [puppet] - 
10https://gerrit.wikimedia.org/r/432529 [21:02:55] (03PS1) 1020after4: add phabricator_files dummy key to swift::params::account_keys: [labs/private] - 10https://gerrit.wikimedia.org/r/432531 [21:04:17] (03CR) 10Herron: ELK: change elasticsearch index prefix to logstash-syslog for syslog type (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/431860 (https://phabricator.wikimedia.org/T193766) (owner: 10Herron) [21:04:49] (03CR) 1020after4: "this is cherry-picked on deployment-puppetmaster02" [puppet] - 10https://gerrit.wikimedia.org/r/432528 (owner: 1020after4) [21:06:19] (03CR) 1020after4: [C: 032] "self-merging because I need to test this on beta." [labs/private] - 10https://gerrit.wikimedia.org/r/432531 (owner: 1020after4) [21:09:18] (03PS1) 10Merlijn van Deen: Do not connect to SQL server for a dry run [puppet] - 10https://gerrit.wikimedia.org/r/432532 [21:09:22] (03CR) 1020after4: [V: 032 C: 032] add phabricator_files dummy key to swift::params::account_keys: [labs/private] - 10https://gerrit.wikimedia.org/r/432531 (owner: 1020after4) [21:11:52] (03PS2) 10Ottomata: Puppetize Turnilo (Pivot replacement) [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) [21:12:12] RECOVERY - keystone admin endpoint port 35357 on labtestcontrol2001 is OK: HTTP OK: HTTP/1.1 300 Multiple Choices - 783 bytes in 0.081 second response time [21:12:25] (03CR) 10jerkins-bot: [V: 04-1] Puppetize Turnilo (Pivot replacement) [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) (owner: 10Ottomata) [21:12:33] RECOVERY - keystone public endoint port 5000 on labtestcontrol2001 is OK: HTTP OK: HTTP/1.1 300 Multiple Choices - 781 bytes in 0.076 second response time [21:12:56] (03PS2) 10Merlijn van Deen: Do not connect to SQL server for a dry run [puppet] - 10https://gerrit.wikimedia.org/r/432532 [21:12:58] (03CR) 10Zhuyifei1999: Do not connect to SQL server for a dry run (031 comment) [puppet] - 
10https://gerrit.wikimedia.org/r/432532 (owner: 10Merlijn van Deen) [21:14:06] (03CR) 10Zhuyifei1999: [C: 031] Do not connect to SQL server for a dry run [puppet] - 10https://gerrit.wikimedia.org/r/432532 (owner: 10Merlijn van Deen) [21:14:08] (03PS3) 10Ottomata: Puppetize Turnilo (Pivot replacement) [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) [21:14:22] PROBLEM - puppet last run on labtestcontrol2003 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/prometheus/rabbitmq-exporter.yaml] [21:14:52] (03CR) 10jerkins-bot: [V: 04-1] Puppetize Turnilo (Pivot replacement) [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) (owner: 10Ottomata) [21:16:28] (03CR) 10Ottomata: "https://puppet-compiler.wmflabs.org/compiler02/11187/thorium.eqiad.wmnet/" [puppet] - 10https://gerrit.wikimedia.org/r/432530 (https://phabricator.wikimedia.org/T194427) (owner: 10Ottomata) [21:25:12] RECOVERY - mediawiki-installation DSH group on mw2202 is OK: OK [21:31:45] (03PS1) 1020after4: WIP: Add phabricator config for the new swift backend [puppet] - 10https://gerrit.wikimedia.org/r/432533 [21:43:02] (03CR) 10Paladox: WIP: Add phabricator config for the new swift backend (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/432533 (owner: 1020after4) [22:21:59] 10Operations, 10ops-eqiad, 10netops, 10Patch-For-Review: Rack/cable/configure asw2-c-eqiad switch stack - https://phabricator.wikimedia.org/T187962#4198886 (10Andrew) I'm flying on the 29th. If Chase wants to manage these things without me that's fine with me though :) [23:00:05] addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: #bothumor I � Unicode. All rise for Evening SWAT (Max 6 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180510T2300). 
[23:00:05] Smalyshev: A patch you scheduled for Evening SWAT (Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [23:00:12] here [23:02:49] me too, let's SWAT [23:02:55] ok [23:03:37] I see it's basically my patch, which is pretty simple and doesn't need testing - the dump script will pick it up later [23:04:20] (03PS3) 10Thcipriani: Add wikis with more that 1000 categories to categories dump [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432043 (https://phabricator.wikimedia.org/T194139) (owner: 10Smalyshev) [23:04:39] (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432043 (https://phabricator.wikimedia.org/T194139) (owner: 10Smalyshev) [23:06:05] (03Merged) 10jenkins-bot: Add wikis with more that 1000 categories to categories dump [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432043 (https://phabricator.wikimedia.org/T194139) (owner: 10Smalyshev) [23:07:09] SMalyshev: was there anything you needed to test before this goes out? I just pulled over to mwdebug1002 if so. [23:07:27] thcipriani: no, it's for dump scripts so nothing to check on mwdebug [23:07:36] okie doke, going live [23:07:38] you can just deploy it [23:07:43] thanks! 
[23:09:42] (03CR) 10jenkins-bot: Add wikis with more that 1000 categories to categories dump [mediawiki-config] - 10https://gerrit.wikimedia.org/r/432043 (https://phabricator.wikimedia.org/T194139) (owner: 10Smalyshev) [23:09:55] !log thcipriani@tin Synchronized dblists/categories-rdf.dblist: SWAT: [[gerrit:432043|Add wikis with more that 1000 categories to categories dump]] T194139 (duration: 01m 02s) [23:09:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:09:59] T194139: The argument //deepcategory// in CirrusSearch only reports the members of the root category for nowiki - https://phabricator.wikimedia.org/T194139 [23:10:11] SMalyshev: all's live [23:38:41] 10Operations, 10ops-codfw, 10DC-Ops: mw2139 failed to boot - hardware check - https://phabricator.wikimedia.org/T194426#4199023 (10Dzahn) racadm getsel doesn't have anything: ``` /admin1-> racadm getsel Record: 1 Date/Time: 01/15/2015 23:01:17 Source: system Severity: Ok Description: Log cle... [23:59:47] 10Operations, 10Ops-Access-Reviews, 10Reading-Infrastructure-Team-Backlog, 10Patch-For-Review: Add Michael Holloway (Reading Infrastructure) to maps admin groups - https://phabricator.wikimedia.org/T194404#4197487 (10Dzahn) Hi @MHolloway i've just been checking the requirements for this and this is what th...