[00:00:17] <greg-g>	 :) :)
[00:04:13] <wikibugs>	 10Operations, 10Patch-For-Review, 10Services (doing), 10User-mobrovac: nodejs 6.11 - https://phabricator.wikimedia.org/T170548#3473292 (10debt) @ksmith and @Gehel - I believe we updated maps to use node 6 in these tickets: T150354 and T158984.   @MaxSem - is there more to do, that you know of?
[00:15:04] <wikibugs>	 (03PS1) 10Reedy: phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509)
[00:17:52] <wikibugs>	 (03CR) 10Reedy: "Woo, less exclusions" (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[00:23:02] <wikibugs>	 (03PS2) 10Reedy: phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509)
[00:25:40] <wikibugs>	 (03CR) 10jenkins-bot: [V: 04-1] phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[00:27:23] <wikibugs>	 (03PS3) 10Reedy: phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509)
[01:07:16] <wikibugs>	 (03CR) 10Krinkle: [C: 031] phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[01:08:06] <wikibugs>	 (03CR) 10Krinkle: [C: 031] "@Reedy: Wanna land?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[01:08:29] <Reedy>	 Krinkle: you mean jfdi? :P
[01:08:44] <Krinkle>	 Well, I did review it, and some tests are passing.
[01:08:52] <Krinkle>	 We can give it a few minutes in beta first, and on XMD
[01:08:54] <Krinkle>	 XWD*
[01:09:06] <legoktm>	 I thought Krinkle was making a pilot joke
[01:09:15] <Krinkle>	 ...
[01:09:30] <Reedy>	 lol
[01:09:44] <wikibugs>	 (03CR) 10Reedy: [C: 032] phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[01:11:12] <wikibugs>	 (03Merged) 10jenkins-bot: phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[01:11:22] <wikibugs>	 (03CR) 10jenkins-bot: phpcs on multiversion [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367831 (https://phabricator.wikimedia.org/T171509) (owner: 10Reedy)
[01:12:21] <logmsgbot>	 !log reedy@tin Synchronized tests/multiversion/: phpcs (duration: 00m 46s)
[01:12:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:13:17] <logmsgbot>	 !log reedy@tin Synchronized phpcs.xml: phpcs (duration: 00m 46s)
[01:13:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:14:28] <icinga-wm>	 PROBLEM - puppet last run on puppetmaster1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:15:00] <logmsgbot>	 !log reedy@tin Synchronized multiversion/: phpcs (duration: 01m 06s)
[01:15:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:15:20] <Reedy>	 next stop mediawiki-codesniffer 0.10.1
[01:24:37] <icinga-wm>	 RECOVERY - puppet last run on puppetmaster1001 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[01:29:34] <wikibugs>	 (03CR) 10Krinkle: [C: 031] Do the echo when running update.php [puppet] - 10https://gerrit.wikimedia.org/r/354932 (owner: 10Reedy)
[01:30:48] <wikibugs>	 (03CR) 10Krinkle: [C: 04-1] "foreachwikiindblist (mwscriptwikiset) doesn't fail early when one wiki returns non-zero, right? So we could do all.dblist, as long as we f" [puppet] - 10https://gerrit.wikimedia.org/r/363639 (owner: 10Reedy)
[02:04:53] <wikibugs>	 (03CR) 10Krinkle: [C: 031] Allow mwdeploy user to restart jobchron [puppet] - 10https://gerrit.wikimedia.org/r/367815 (https://phabricator.wikimedia.org/T129148) (owner: 10Thcipriani)
[02:07:07] <icinga-wm>	 RECOVERY - Check Varnish expiry mailbox lag on cp1074 is OK: OK: expiry mailbox lag is 10765
[02:40:40] <wikibugs>	 10Operations, 10Wikimedia-Language-setup, 10Wikimedia-Site-requests, 10Hindi-Sites: Create Wikiversity Hindi - https://phabricator.wikimedia.org/T168765#3473485 (10Jayprakash12345)
[03:01:06] <logmsgbot>	 !log l10nupdate@tin LocalisationUpdate failed: git pull of extensions failed
[03:01:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:27:57] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 770.50 seconds
[04:10:57] <icinga-wm>	 PROBLEM - Check systemd state on cp1008 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[04:11:17] <icinga-wm>	 PROBLEM - Varnish HTTP text-backend - port 3128 on cp1008 is CRITICAL: connect to address 208.80.154.42 and port 3128: Connection refused
[04:22:07] <icinga-wm>	 RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 288.98 seconds
[04:27:47] <icinga-wm>	 PROBLEM - puppet last run on cp1008 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[varnish]
[04:49:57] <wikibugs>	 10Operations, 10ops-eqiad, 10Cloud-Services: rack/setup/install labstore100[67].wikimedia.org - https://phabricator.wikimedia.org/T167984#3473675 (10madhuvishy) @Cmjohnson These two need to be in the public vlan.
[05:05:02] <wikibugs>	 10Operations, 10ops-eqiad, 10Data-Services, 10Patch-For-Review, 10cloud-services-team (Kanban): Degraded RAID on labsdb1001 - https://phabricator.wikimedia.org/T171538#3473678 (10Marostegui) 05Open>03Resolved ``` [root@labsdb1001 05:04 /root] # megacli -pdrbld -showprog -physdrv\[16:9\] -aALL  Device...
[05:31:45] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 032] prometheus::node_exporter: fix compatibility with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367694 (owner: 10Giuseppe Lavagetto)
[05:31:52] <wikibugs>	 (03PS3) 10Giuseppe Lavagetto: prometheus::node_exporter: fix compatibility with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367694
[05:32:09] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] prometheus::node_exporter: fix compatibility with the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367694 (owner: 10Giuseppe Lavagetto)
[05:36:02] <wikibugs>	 (03PS10) 10Giuseppe Lavagetto: role::configcluster: move to future environment [puppet] - 10https://gerrit.wikimedia.org/r/365572
[05:37:34] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 032] role::configcluster: move to future environment [puppet] - 10https://gerrit.wikimedia.org/r/365572 (owner: 10Giuseppe Lavagetto)
[05:38:57] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:39:11] <marostegui>	 ^ backups
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s4 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: m2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave IO: m2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s6 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:28] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:38] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:38] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s4 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:38] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s5 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:40:38] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s4 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s2 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s3 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave IO: m2 on dbstore1001 is OK: OK slave_io_state not a slave
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s6 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[05:41:18] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: m2 on dbstore1001 is OK: OK slave_sql_state not a slave
[05:41:19] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s2 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[05:41:37] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s3 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[05:41:37] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s4 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[05:41:37] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s1 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[05:41:37] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s5 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[05:53:17] <_joe_>	 !log moving all conf* servers to the future puppet parser 
[05:53:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:56:42] <wikibugs>	 10Operations, 10Puppet, 10User-Joe: Switch all hosts to the future parser - https://phabricator.wikimedia.org/T171704#3473695 (10Joe)
[05:59:48] <icinga-wm>	 PROBLEM - mailman I/O stats on fermium is CRITICAL: CRITICAL - I/O stats: Transfers/Sec=1105.80 Read Requests/Sec=758.80 Write Requests/Sec=1.50 KBytes Read/Sec=49154.40 KBytes_Written/Sec=39.60
[06:06:10] <wikibugs>	 (03PS1) 10Krinkle: Revert "Bump cache epoch for Wikidata" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367853 (https://phabricator.wikimedia.org/T167784)
[06:07:58] <icinga-wm>	 RECOVERY - mailman I/O stats on fermium is OK: OK - I/O stats: Transfers/Sec=97.30 Read Requests/Sec=0.00 Write Requests/Sec=1.30 KBytes Read/Sec=0.00 KBytes_Written/Sec=29.60
[06:18:50] <wikibugs>	 10Operations, 10Patch-For-Review, 10Services (doing), 10User-mobrovac: nodejs 6.11 - https://phabricator.wikimedia.org/T170548#3473724 (10MoritzMuehlenhoff) @debt , @ksmith : We currently use nodejs 6.9 in the production cluster and are migrating to 6.11. While 6.x is an LTS release, there's a sizable numb...
[06:21:35] <wikibugs>	 (03PS8) 10Muehlenhoff: Clean up stray binary packages after Debian updates [puppet] - 10https://gerrit.wikimedia.org/r/367645
[06:28:13] <icinga-wm>	 PROBLEM - ores on scb1003 is CRITICAL: connect to address 10.64.32.153 and port 8081: Connection refused
[06:44:13] <icinga-wm>	 RECOVERY - ores on scb1003 is OK: HTTP OK: HTTP/1.0 200 OK - 3666 bytes in 0.011 second response time
[06:57:11] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 032] Clean up stray binary packages after Debian updates [puppet] - 10https://gerrit.wikimedia.org/r/367645 (owner: 10Muehlenhoff)
[06:57:23] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:53] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: m2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s4 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s6 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:54] <icinga-wm>	 PROBLEM - MariaDB Slave IO: m2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:00:04] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s5 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:00:04] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s4 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:00:04] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:00:04] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:00:23] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:02:03] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s5 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[07:02:03] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s4 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[07:02:03] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s3 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[07:02:03] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s1 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: Yes
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: m2 on dbstore1001 is OK: OK slave_sql_state not a slave
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave IO: m2 on dbstore1001 is OK: OK slave_io_state not a slave
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s4 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s6 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s2 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s3 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[07:02:53] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: s2 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[07:09:04] <wikibugs>	 10Operations, 10ops-codfw: failing RAID disk on frdb2001 - https://phabricator.wikimedia.org/T171584#3473760 (10MoritzMuehlenhoff) p:05Triage>03Normal
[07:11:04] <icinga-wm>	 PROBLEM - BGP status on cr1-eqord is CRITICAL: BGP CRITICAL - AS2914/IPv6: Active, AS2914/IPv4: Active
[07:13:53] <icinga-wm>	 RECOVERY - Varnish HTTP text-backend - port 3128 on cp1008 is OK: HTTP OK: HTTP/1.1 200 OK - 176 bytes in 0.001 second response time
[07:14:33] <icinga-wm>	 RECOVERY - Check systemd state on cp1008 is OK: OK - running: The system is fully operational
[07:14:53] <ema>	 !log cp1008: use sdb only in varnish.service, waiting for Chris to replace sda T171028
[07:15:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:15:06] <stashbot>	 T171028: Degraded RAID on cp1008 - https://phabricator.wikimedia.org/T171028
[07:15:10] <wikibugs>	 10Operations, 10Interactive-Sprint, 10Maps (Kartotherian): Upgrade kartotherian and tilerator to nodejs 6.11 - https://phabricator.wikimedia.org/T171707#3473775 (10Gehel)
[07:18:02] <wikibugs>	 (03PS24) 10Ema: varnish: Avoid std.fileread() and use new errorpage template [puppet] - 10https://gerrit.wikimedia.org/r/350966 (https://phabricator.wikimedia.org/T113114) (owner: 10Krinkle)
[07:28:02] <wikibugs>	 (03CR) 10Jcrespo: [C: 04-1] "Right now this is a no, based on 2 ongoing issues: https://phabricator.wikimedia.org/T167784 and the unbreak now https://phabricator.wikim" [puppet] - 10https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: 10Ladsgroup)
[07:28:56] <wikibugs>	 (03PS25) 10Ema: varnish: Avoid std.fileread() and use new errorpage template [puppet] - 10https://gerrit.wikimedia.org/r/350966 (https://phabricator.wikimedia.org/T113114) (owner: 10Krinkle)
[07:30:50] <wikibugs>	 (03CR) 10Ema: [C: 032] varnish: Avoid std.fileread() and use new errorpage template [puppet] - 10https://gerrit.wikimedia.org/r/350966 (https://phabricator.wikimedia.org/T113114) (owner: 10Krinkle)
[07:42:24] <icinga-wm>	 RECOVERY - BGP status on cr1-eqord is OK: BGP OK - up: 52, down: 0, shutdown: 4
[07:43:03] <wikibugs>	 (03PS1) 10Muehlenhoff: Reimage mw2119 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367855
[07:43:53] <icinga-wm>	 PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0
[07:46:13] <wikibugs>	 (03PS4) 10Giuseppe Lavagetto: wmflib: fix all Hiera backends' Rubocop infractions [puppet] - 10https://gerrit.wikimedia.org/r/359447 (owner: 10Faidon Liambotis)
[07:48:49] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 032] Reimage mw2119 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367855 (owner: 10Muehlenhoff)
[07:48:53] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqord is OK: OK: host 208.80.154.198, interfaces up: 39, down: 0, dormant: 0, excluded: 0, unused: 0
[07:48:56] <wikibugs>	 (03PS2) 10Muehlenhoff: Reimage mw2119 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367855
[07:49:38] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 032] wmflib: fix all Hiera backends' Rubocop infractions [puppet] - 10https://gerrit.wikimedia.org/r/359447 (owner: 10Faidon Liambotis)
[07:50:33] <icinga-wm>	 PROBLEM - BGP status on cr1-eqord is CRITICAL: BGP CRITICAL - AS2914/IPv4: Active, AS2914/IPv6: Active
[07:52:26] <wikibugs>	 (03PS3) 10Muehlenhoff: Reimage mw2119 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367855
[07:53:08] <icinga-wm>	 ACKNOWLEDGEMENT - BGP status on cr1-eqord is CRITICAL: BGP CRITICAL - AS2914/IPv6: Active, AS2914/IPv4: Active Ema Peering with NTT flapping (AS2914)
[07:53:14] <wikibugs>	 (03CR) 10Muehlenhoff: [V: 032 C: 032] Reimage mw2119 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367855 (owner: 10Muehlenhoff)
[07:53:44] <jynus>	 !log start defragmenting on pc1* hosts T167784
[07:53:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:53:54] <stashbot>	 T167784: WMF ParserCache disk space exhaustion - https://phabricator.wikimedia.org/T167784
[07:56:30] <moritzm>	 !log restarting cassandra-metrics-collector on restbase* to pick up openjdk security update
[07:56:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:59:46] <moritzm>	 !log restarting cassandra-metrics-collector on maps* to pick up openjdk security update
[07:59:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:00:40] <wikibugs>	 (03Abandoned) 10Giuseppe Lavagetto: prometheus::node::exporter: ugly workaround for future parser [puppet] - 10https://gerrit.wikimedia.org/r/367659 (owner: 10Giuseppe Lavagetto)
[08:02:25] <moritzm>	 !log installing Java security updates on jessie-based stat systems
[08:02:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:02:51] <wikibugs>	 (03PS2) 10Filippo Giunchedi: thumbor: fix connections-per-backend in nginx [puppet] - 10https://gerrit.wikimedia.org/r/367687 (https://phabricator.wikimedia.org/T171468)
[08:03:23] <wikibugs>	 10Operations, 10Traffic, 10Patch-For-Review, 10User-Elukey: logster should not resolve statsd's IP every time it sends a metric - https://phabricator.wikimedia.org/T171318#3473823 (10elukey) After a chat with Moritz and Ema we decided to pick the current jessie version and apply the patch on top of it. In...
[08:04:27] <wikibugs>	 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1066 - https://phabricator.wikimedia.org/T169448#3473825 (10jcrespo) This is taking a long time to be rebuilt :-/ - It is still doing it.
[08:04:54] <wikibugs>	 (03CR) 10Ema: [C: 031] thumbor: fix connections-per-backend in nginx [puppet] - 10https://gerrit.wikimedia.org/r/367687 (https://phabricator.wikimedia.org/T171468) (owner: 10Filippo Giunchedi)
[08:06:33] <icinga-wm>	 PROBLEM - puppet last run on stat1005 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[libgsl0-dev]
[08:07:24] <wikibugs>	 (03PS3) 10Filippo Giunchedi: thumbor: fix connections-per-backend in nginx [puppet] - 10https://gerrit.wikimedia.org/r/367687 (https://phabricator.wikimedia.org/T171468)
[08:08:00] <wikibugs>	 (03PS4) 10Filippo Giunchedi: thumbor: fix connections-per-backend in nginx [puppet] - 10https://gerrit.wikimedia.org/r/367687 (https://phabricator.wikimedia.org/T171468)
[08:08:33] <icinga-wm>	 RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[08:08:52] <wikibugs>	 10Operations, 10Traffic, 10Patch-For-Review, 10User-Elukey: logster should not resolve statsd's IP every time it sends a metric - https://phabricator.wikimedia.org/T171318#3460781 (10MoritzMuehlenhoff) Ideally we upgrade to 1.x in Debian, the version currently in the archive is from 2014 and hasn't been to...
[08:09:08] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 032] thumbor: fix connections-per-backend in nginx [puppet] - 10https://gerrit.wikimedia.org/r/367687 (https://phabricator.wikimedia.org/T171468) (owner: 10Filippo Giunchedi)
[08:10:12] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Depool db1066 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367859 (https://phabricator.wikimedia.org/T169448)
[08:12:42] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Depool db1066 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367859 (https://phabricator.wikimedia.org/T169448) (owner: 10Jcrespo)
[08:13:22] <icinga-wm>	 ACKNOWLEDGEMENT - Host ms-be2024 is DOWN: PING CRITICAL - Packet loss = 100% Filippo Giunchedi T171275
[08:14:33] <icinga-wm>	 RECOVERY - MegaRAID on db1066 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy
[08:14:44] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Depool db1066 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367859 (https://phabricator.wikimedia.org/T169448) (owner: 10Jcrespo)
[08:16:29] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Depool db1066 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367859 (https://phabricator.wikimedia.org/T169448) (owner: 10Jcrespo)
[08:18:37] <wikibugs>	 10Operations, 10ops-codfw: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3473840 (10fgiunchedi) a:05fgiunchedi>03Papaul @papaul I don't seem to be able to bring back the power via ilo, connected via ssh and power is off. Turning power on doesn't seem to do anything.  ``` </>hpiLO->...
[08:21:05] <wikibugs>	 10Operations, 10ops-codfw, 10User-fgiunchedi: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3473851 (10fgiunchedi)
[08:21:22] <wikibugs>	 10Operations, 10monitoring: On stretch, python metric collector for disk is on DEBUG logging mode - https://phabricator.wikimedia.org/T171638#3473856 (10fgiunchedi)
[08:21:24] <wikibugs>	 10Operations, 10monitoring, 10User-fgiunchedi: Diamond log level set to DEBUG spams syslog - https://phabricator.wikimedia.org/T171580#3473854 (10fgiunchedi)
[08:26:01] <wikibugs>	 (03CR) 10Filippo Giunchedi: "LGTM in general, see inline. Too bad the disk collector isn't able to blacklist filesystems :(" (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush)
[08:26:47] <moritzm>	 !log reimaging mw2119 to jessie (T145742)
[08:26:51] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1066 for maintenance (duration: 00m 46s)
[08:26:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:26:59] <stashbot>	 T145742: Migrate video scalers to jessie - https://phabricator.wikimedia.org/T145742
[08:27:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:27:56] <wikibugs>	 10Operations, 10Pybal, 10Traffic, 10monitoring: pybal: add prometheus metrics - https://phabricator.wikimedia.org/T171710#3473875 (10ema)
[08:28:03] <wikibugs>	 10Operations, 10Pybal, 10Traffic, 10monitoring: pybal: add prometheus metrics - https://phabricator.wikimedia.org/T171710#3473888 (10ema) p:05Triage>03Normal
[08:28:18] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10MinervaNeue, 10Release-Engineering-Team: The mobile-frontend-placeholder message is not updated in din.wikipedia.org - https://phabricator.wikimedia.org/T171711#3473889 (10Amire80)
[08:29:10] <wikibugs>	 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: Degraded RAID on db1066 - https://phabricator.wikimedia.org/T169448#3473901 (10jcrespo) I depool it and now it finishes :-(
[08:29:37] <wikibugs>	 (03PS6) 10Filippo Giunchedi: librenms: enable graphite extension [puppet] - 10https://gerrit.wikimedia.org/r/366836 (https://phabricator.wikimedia.org/T171167)
[08:31:14] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 032] librenms: enable graphite extension [puppet] - 10https://gerrit.wikimedia.org/r/366836 (https://phabricator.wikimedia.org/T171167) (owner: 10Filippo Giunchedi)
[08:32:56] <wikibugs>	 (03CR) 10Volans: [C: 04-1] "Nice check to add! I have a couple of general doubts and there are a couple of things to fix." (0312 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[08:36:27] <elukey>	 "couple of doubts" --> 12 comments
[08:36:36] * elukey loves volans code reviews
[08:36:51] <volans>	 elukey: read the whole comment though ;)
[08:37:29] <elukey>	 volans: I was kiddiiiinggggg
[08:38:03] <volans>	 :D
[08:40:02] <ema>	 s/a couple/a couple dozen/
[08:40:22] <volans>	 rotfl
[08:40:52] <volans>	 "couple" is an arbitrary definition ;)
[08:42:09] <jynus>	 !log upgrading and restarting db1066
[08:42:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:42:27] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10MinervaNeue, 10Release-Engineering-Team: The mobile-frontend-placeholder message is not updated in din.wikipedia.org - https://phabricator.wikimedia.org/T171711#3473906 (10KartikMistry) It seems LocalisationUpdate is failing. See: https://tools.wmf...
[08:45:17] <wikibugs>	 (03PS5) 10Ema: pybal: bind instrumentation TCP port to private addresses [puppet] - 10https://gerrit.wikimedia.org/r/348074 (https://phabricator.wikimedia.org/T103882)
[08:45:22] <wikibugs>	 10Operations, 10ops-eqiad, 10hardware-requests, 10Patch-For-Review: Decommission mw1196 - https://phabricator.wikimedia.org/T170441#3431403 (10MoritzMuehlenhoff) Have the "non-interruptible steps" really been completed? mw1196 still has a salt key and shows up  https://servermon.wikimedia.org/hosts/
[08:45:55] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:46:44] <elukey>	 !log upload logster 0.0.10-2~jessie1 to jessie-wikimedia
[08:46:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:47:26] <elukey>	 !log rollout logster 0.0.10-2~jessie1 to the cache hosts
[08:47:36] <elukey>	 ema: --^
[08:47:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:48:23] <wikibugs>	 (03PS1) 10Filippo Giunchedi: librenms: fix log file ownership after rotation [puppet] - 10https://gerrit.wikimedia.org/r/367863
[08:49:18] <ema>	 elukey: \o/
[08:50:24] <wikibugs>	 (03PS2) 10Filippo Giunchedi: librenms: fix log file ownership after rotation [puppet] - 10https://gerrit.wikimedia.org/r/367863
[08:51:43] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 032] librenms: fix log file ownership after rotation [puppet] - 10https://gerrit.wikimedia.org/r/367863 (owner: 10Filippo Giunchedi)
[08:56:14] <icinga-wm>	 RECOVERY - BGP status on cr1-eqord is OK: BGP OK - up: 54, down: 0, shutdown: 2
[08:58:11] <wikibugs>	 (03PS1) 10Jcrespo: Revert "mariadb: Depool db1066 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367865
[09:07:19] <wikibugs>	 10Operations, 10Traffic, 10Patch-For-Review, 10User-Elukey: logster should not resolve statsd's IP every time it sends a metric - https://phabricator.wikimedia.org/T171318#3473940 (10elukey) New package uploaded to jessie-wikimedia and rolled out to role cache::misc/upload/text/canary.
[09:07:42] <hashar>	 good morning
[09:08:11] <hashar>	 for info: the puppet masters for CI (and probably for tools as well)  yield Data retrieved from Integration/host/integration-puppetmaster01 is String, not Hash or nil at /etc/puppet/manifests/realm.pp:51
[09:08:19] <hashar>	 which is:  $app_routes = hiera('discovery::app_routes')
[09:08:34] <hashar>	 filed as T171712   I am trying to investigate :-}
[09:08:34] <stashbot>	 T171712: integration puppetmaster yield String, not Hash or nil at /etc/puppet/manifests/realm.pp:51 - https://phabricator.wikimedia.org/T171712
[09:09:08] <volans>	 hashar: _joe_ is already on it
[09:09:50] <hashar>	 good joe :)
[09:10:49] <wikibugs>	 (03PS1) 10Muehlenhoff: Install mw2152 and mw2246 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367868
[09:21:23] <wikibugs>	 10Operations, 10Traffic, 10Patch-For-Review, 10User-Elukey: logster should not resolve statsd's IP every time it sends a metric - https://phabricator.wikimedia.org/T171318#3473953 (10elukey) 05Open>03Resolved a:03elukey Impact to maerlant and acamar's traffic:  {F8854082}  {F8854085}  In eqiad hydron...
[09:21:24] <wikibugs>	 10Operations, 10netops: "MySQL server has gone away" from librenms logs - https://phabricator.wikimedia.org/T171714#3473956 (10fgiunchedi)
[09:21:59] <icinga-wm>	 PROBLEM - mediawiki-installation DSH group on mw2119 is CRITICAL: Host mw2119 is not in mediawiki-installation dsh group
[09:22:49] <icinga-wm>	 PROBLEM - nutcracker port on mw2119 is CRITICAL: connect to address 127.0.0.1 and port 11212: Connection refused
[09:22:49] <icinga-wm>	 PROBLEM - Check systemd state on mw2119 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[09:22:56] <wikibugs>	 (03CR) 10Ema: pybal: bind instrumentation TCP port to private addresses (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/348074 (https://phabricator.wikimedia.org/T103882) (owner: 10Ema)
[09:23:40] <icinga-wm>	 PROBLEM - nutcracker process on mw2119 is CRITICAL: PROCS CRITICAL: 0 processes with UID = 111 (nutcracker), command name nutcracker
[09:24:20] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870
[09:25:10] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 032] hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870 (owner: 10Giuseppe Lavagetto)
[09:26:09] <hashar>	 _joe_: wanna test it on a labs puppet master?
[09:26:16] <hashar>	 and I had https://phabricator.wikimedia.org/T171712 for that 
[09:26:19] <icinga-wm>	 PROBLEM - Host mw2119 is DOWN: PING CRITICAL - Packet loss = 100%
[09:26:40] <_joe_>	 hashar: please do
[09:26:44] <hashar>	 doing
[09:26:50] <icinga-wm>	 PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: Puppet last ran 21 hours ago
[09:27:08] <wikibugs>	 (03PS2) 10Hashar: hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870 (https://phabricator.wikimedia.org/T171712) (owner: 10Giuseppe Lavagetto)
[09:27:11] <_joe_>	 hashar: you will need to run puppet on the puppetmaster to fix it I think
[09:27:16] <wikibugs>	 (03CR) 10Hashar: "testing it on the CI puppet master" [puppet] - 10https://gerrit.wikimedia.org/r/367870 (https://phabricator.wikimedia.org/T171712) (owner: 10Giuseppe Lavagetto)
[09:27:29] <icinga-wm>	 RECOVERY - Host mw2119 is UP: PING OK - Packet loss = 0%, RTA = 36.08 ms
[09:27:44] <hashar>	 I am still unsure how you manage to fix those weird issues
[09:27:49] <icinga-wm>	 RECOVERY - nutcracker process on mw2119 is OK: PROCS OK: 1 process with UID = 111 (nutcracker), command name nutcracker
[09:27:52] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 032] Install mw2152 and mw2246 with jessie [puppet] - 10https://gerrit.wikimedia.org/r/367868 (owner: 10Muehlenhoff)
[09:27:53] <hashar>	 cherry picked on integration-puppetmaster
[09:27:58] <hashar>	 ran puppet which applied your patch
[09:27:59] <icinga-wm>	 RECOVERY - puppet last run on mw1213 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:27:59] <icinga-wm>	 RECOVERY - nutcracker port on mw2119 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 11212
[09:27:59] <icinga-wm>	 RECOVERY - Check systemd state on mw2119 is OK: OK - running: The system is fully operational
[09:28:07] <hashar>	 Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class classes for integration-puppetmaster01.integration.eqiad.wmflabs on node integration-puppetmaster01.integration.eqiad.wmflabs
[09:28:08] <hashar>	 :(
[09:28:16] <_joe_>	 ok 
[09:28:22] <_joe_>	 it seems we fixed that issue at least
[09:28:43] <hashar>	 it is on integration-puppetmaster01.integration.eqiad.wmflabs if you want to live hack it
[09:28:44] <volans>	 lol
[09:29:10] <hashar>	 maybe I can restart apache2 / passenger?
[09:29:12] <_joe_>	 yes
[09:29:14] <_joe_>	 do that
[09:29:38] <hashar>	 restarted apache2, running puppet
[09:29:43] <hashar>	 Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Reading data from Integration/host/integration-puppetmaster01 failed: TypeError: Data retrieved from Integration/host/integration-puppetmaster01 is String, not Hash or nil at /etc/puppet/manifests/realm.pp:51 on node integration-puppetmaster01.integration.eqiad.wmflabs
[09:29:45] <hashar>	 bah
[09:29:47] <hashar>	 it is back
[09:30:09] <icinga-wm>	 PROBLEM - puppet last run on mw2119 is CRITICAL: CRITICAL: Puppet has 6 failures. Last run 6 minutes ago with 6 failures. Failed resources (up to 3 shown): File_line[login.defs-SYS_GID_MAX],File[/etc/firejail/mediawiki-converters.profile],Package[fonts-noto-cjk],Service[nutcracker]
[09:30:33] <hashar>	 still in realm.pp:51 for the discovery::apps_route which is probably the first hiera() call
[09:30:39] <_joe_>	 yes
[09:31:05] <hashar>	 the "could not find class classes"  was probably a misleading error
[09:31:09] <_joe_>	 I'm gonna play with it
[09:31:15] <hashar>	 ok
[09:32:03] <hashar>	 and if you want a guinea pig puppet agent, you can use integration-r-lang-01.integration.eqiad.wmflabs (jessie)
[09:32:13] <_joe_>	 ok, thanks
[09:32:14] <_joe_>	 sigh
[09:32:28] <hashar>	 any clue why it would suddenly start failing?
[09:32:36] <_joe_>	 yes
[09:32:43] <_joe_>	 a change by paravoid that I merged
[09:32:58] <hashar>	 cool!
[09:33:09] <icinga-wm>	 RECOVERY - puppet last run on mw2119 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[09:33:20] <hashar>	 I mean, at least the root cause is known :}
[09:33:30] <_joe_>	 ohhh jeez
[09:33:40] <_joe_>	 I forgot something in my fix
[09:36:16] <wikibugs>	 (03PS1) 10Ema: ipresolve: update documentation [puppet] - 10https://gerrit.wikimedia.org/r/367871
[09:36:34] <wikibugs>	 (03PS3) 10Giuseppe Lavagetto: hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870
[09:37:00] <icinga-wm>	 RECOVERY - MariaDB Slave Lag: s1 on dbstore1001 is OK: OK slave_sql_lag Replication lag: 89998.71 seconds
[09:38:42] <hashar>	 _joe_: and while at it you can attach it to   Bug: T171712  :}
[09:38:43] <stashbot>	 T171712: integration puppetmaster yield String, not Hash or nil at /etc/puppet/manifests/realm.pp:51 - https://phabricator.wikimedia.org/T171712
[09:39:41] <_joe_>	 hashar: fixed I'd say
[09:39:52] <wikibugs>	 (03CR) 10Volans: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/367870 (owner: 10Giuseppe Lavagetto)
[09:39:54] <_joe_>	 can you try on other servers?
[09:40:07] <hashar>	 sure
[09:40:50] <hashar>	 catalog being cached
[09:41:10] <_joe_>	 so it works
[09:41:11] <_joe_>	 ok
[09:41:15] <_joe_>	 let me merge this
[09:41:16] <hashar>	 at least puppet is no longer complaining
[09:41:21] <hashar>	 _joe_: can you add  Bug: T171712   to it ?
[09:41:27] <hashar>	 might help for later reference
[09:41:40] <_joe_>	 sure
[09:41:44] <hashar>	 and yeah jessie/trusty hosts are passing just fine  \O/
[09:42:28] <wikibugs>	 (03PS4) 10Giuseppe Lavagetto: hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870 (https://phabricator.wikimedia.org/T171712)
[09:42:50] <wikibugs>	 (03CR) 10Hashar: [C: 031] "That fixed it on the CI puppet master" [puppet] - 10https://gerrit.wikimedia.org/r/367870 (https://phabricator.wikimedia.org/T171712) (owner: 10Giuseppe Lavagetto)
[09:43:40] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 032] hiera: fix mwcache library [puppet] - 10https://gerrit.wikimedia.org/r/367870 (https://phabricator.wikimedia.org/T171712) (owner: 10Giuseppe Lavagetto)
[09:47:22] <_joe_>	 fun thing - hiera changes are enabled on a puppetmaster only when the agent runs on it
[09:47:25] <_joe_>	 or it is restarted
[09:47:33] <_joe_>	 can you fucking believe it?
[09:47:34] <_joe_>	 :P
[09:48:13] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] Revert "mariadb: Depool db1066 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367865 (owner: 10Jcrespo)
[09:48:52] <hashar>	 _joe_: sounds like the script used to deploy puppet changes would have to take care of that whenever /hieradata is touched? :(
[09:49:31] <paladox>	 hmm since today puppet now fails on a puppet master running stretch
[09:49:32] <paladox>	 E: Package 'ruby-mysql' has no installation candidate
[09:49:33] <hashar>	 or maybe a HUP is sufficient
[09:50:35] <wikibugs>	 (03Merged) 10jenkins-bot: Revert "mariadb: Depool db1066 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367865 (owner: 10Jcrespo)
[09:50:45] <wikibugs>	 (03CR) 10jenkins-bot: Revert "mariadb: Depool db1066 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367865 (owner: 10Jcrespo)
[09:52:11] <moritzm>	 paladox: it's ruby-mysql2 in stretch
[09:52:18] <paladox>	 ah thanks
[09:52:37] <paladox>	 https://github.com/wikimedia/puppet/blob/7908798a594f809fc2286333c9e3f8387362a6af/modules/puppetmaster/manifests/init.pp#L87 probably needs updating :)
[09:52:45] <moritzm>	 not sure if it's a clean drop-in replacement, though; once that has been tested, we can update puppet
[09:52:56] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1066 after maintenance (duration: 00m 46s)
[09:53:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:53:17] <volans>	 moritzm: version numbers are very different though
[09:53:18] <_joe_>	 puppet master running stretch is not supported.
[09:53:20] <_joe_>	 next
[09:53:37] <_joe_>	 do not try to fix it
[09:54:34] <moritzm>	 volans: src:ruby-mysql was removed from Debian with the comment "replaced by ruby-mysql2", so that seems fine: https://packages.qa.debian.org/r/ruby-mysql/news/20161222T191352Z.html
[09:54:51] <_joe_>	 moritzm: again, stretch has puppet 4
[09:54:55] <_joe_>	 it's completely unsupported
[09:55:12] <_joe_>	 and if he's using a puppet3 master on stretch
[09:55:23] <_joe_>	 that's unsupported by us, and it makes no sense to put effort into it
[09:57:05] <moritzm>	 !log reimaging mw2152 to jessie (T145742)
[09:57:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:57:17] <stashbot>	 T145742: Migrate video scalers to jessie - https://phabricator.wikimedia.org/T145742
[09:57:36] <moritzm>	 _joe_: sure, I'm not planning to work on that anyway
[09:58:31] <logmsgbot>	 !log jmm@puppetmaster1001 conftool action : set/pooled=yes; selector: mw2119.codfw.wmnet
[09:58:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:58:42] <logmsgbot>	 !log jmm@puppetmaster1001 conftool action : set/pooled=no; selector: mw2119.codfw.wmnet
[09:58:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:58:55] <logmsgbot>	 !log jmm@puppetmaster1001 conftool action : set/pooled=no; selector: mw2119.codfw.wmnet
[09:59:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:05:04] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 031] Tests: simplify and improve parametrized tests (032 comments) [software/cumin] - 10https://gerrit.wikimedia.org/r/366733 (https://phabricator.wikimedia.org/T154588) (owner: 10Volans)
[10:07:52] <hashar>	 _joe_: tools labs puppet  is recovering as well. Kudos!
[10:09:22] <_joe_>	 well I merged the bad patch
[10:09:32] <_joe_>	 so I guess not really 'kudos'
[10:16:35] <wikibugs>	 (03PS1) 10Filippo Giunchedi: librenms: explicit graphite port [puppet] - 10https://gerrit.wikimedia.org/r/367875 (https://phabricator.wikimedia.org/T171167)
[10:17:56] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 032] librenms: explicit graphite port [puppet] - 10https://gerrit.wikimedia.org/r/367875 (https://phabricator.wikimedia.org/T171167) (owner: 10Filippo Giunchedi)
[10:19:22] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: role::mediawiki::canary_appserver: move to the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367876
[10:19:52] <hashar>	 if anyone has bandwidth,  I could use a puppet merge for yet another CI role   (  https://gerrit.wikimedia.org/r/#/c/367411/  )
[10:20:07] <hashar>	 (still a role cause I haven't switched my mind yet to the role/profile/module pattern :-( )
[10:21:49] <volans>	 hashar: so it's a -1 by definition :-P
[10:22:00] <icinga-wm>	 RECOVERY - mediawiki-installation DSH group on mw2119 is OK: OK
[10:22:03] <hashar>	 yeah I guess
[10:22:16] <hashar>	 then I would have to refactor the whole CI mess which is slightly more complicated ;}
[10:22:25] <volans>	 I'll leave it to the human-puppetmasters ;)
[10:22:54] <hashar>	 I will probably refactor the  zuul stuff first to train myself. But that would be after my relocation/vacations
[10:23:49] <wikibugs>	 10Operations, 10monitoring, 10User-fgiunchedi: prometheus-puppet-agent-stats cronspam on missing puppet stats - https://phabricator.wikimedia.org/T170932#3474166 (10faidon)
[10:25:27] <wikibugs>	 10Operations, 10monitoring, 10netops: "MySQL server has gone away" from librenms logs - https://phabricator.wikimedia.org/T171714#3474171 (10faidon)
[10:30:08] <wikibugs>	 (03PS3) 10Faidon Liambotis: Kill module puppet_statsd [puppet] - 10https://gerrit.wikimedia.org/r/359448
[10:31:04] <wikibugs>	 (03CR) 10Faidon Liambotis: [C: 032] Kill module puppet_statsd [puppet] - 10https://gerrit.wikimedia.org/r/359448 (owner: 10Faidon Liambotis)
[10:33:50] <icinga-wm>	 PROBLEM - puppet last run on alsafi is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:11] <paravoid>	 grumble grumble
[10:34:20] <icinga-wm>	 PROBLEM - puppet last run on dbmonitor2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:20] <icinga-wm>	 PROBLEM - puppet last run on dbmonitor1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:20] <icinga-wm>	 PROBLEM - puppet last run on elastic1026 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:21] <icinga-wm>	 PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on mw2122 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on dubnium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:30] <icinga-wm>	 PROBLEM - puppet last run on mc2028 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:31] <icinga-wm>	 PROBLEM - puppet last run on planet2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:40] <icinga-wm>	 PROBLEM - puppet last run on mw1256 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:40] <icinga-wm>	 PROBLEM - puppet last run on db1092 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:40] <icinga-wm>	 PROBLEM - puppet last run on mw1278 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:34:40] <icinga-wm>	 PROBLEM - puppet last run on analytics1040 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:35:18] <wikibugs>	 (03PS1) 10Faidon Liambotis: Revert "Kill module puppet_statsd" [puppet] - 10https://gerrit.wikimedia.org/r/367877
[10:35:27] <wikibugs>	 (03CR) 10Faidon Liambotis: [V: 032 C: 032] Revert "Kill module puppet_statsd" [puppet] - 10https://gerrit.wikimedia.org/r/367877 (owner: 10Faidon Liambotis)
[10:35:28] <wikibugs>	 10Operations, 10monitoring, 10netops, 10Patch-For-Review, 10User-fgiunchedi: Evaluate LibreNMS' Graphite backend - https://phabricator.wikimedia.org/T171167#3474186 (10fgiunchedi) 05Open>03Resolved This is the resolved, note that the port in https://gerrit.wikimedia.org/r/367875 is required since `$c...
[10:59:30] <wikibugs>	 (03CR) 10Daniel Kinzler: "From the comments, it doesn't look like T167784 was caused by Wikidata.  T171370 is my bad, patch is up, see I9100c1745. However, I don't " [puppet] - 10https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: 10Ladsgroup)
[11:02:31] <hoo>	 !log Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
[11:02:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:02:43] <stashbot>	 T132839: [RfC] Property suggester suggests human properties for non-human items - https://phabricator.wikimedia.org/T132839
[11:02:58] <sjoerddebruin>	 hoo: <3
[11:03:59] <wikibugs>	 (03CR) 10Daniel Kinzler: ">  Also, we would be raising the dispatch rate of wikidata changes by ~ 25%. That's sign-off from the DBAs." [puppet] - 10https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: 10Ladsgroup)
[11:04:56] <wikibugs>	 (03CR) 10Jcrespo: [C: 04-1] "We have two degradations of service directly related to Wikidata jobs/crons/maintenance- we do not want to add more variables because we wi" [puppet] - 10https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: 10Ladsgroup)
[11:08:36] <wikibugs>	 10Operations, 10ops-eqiad: Degraded RAID on db1068 - https://phabricator.wikimedia.org/T171723#3474264 (10ops-monitoring-bot)
[11:10:20] <wikibugs>	 10Operations, 10ops-eqiad, 10DBA: Degraded RAID on db1068 - https://phabricator.wikimedia.org/T171723#3474269 (10Volans) p:05Triage>03High This is s4 master.
[11:22:36] <wikibugs>	 (03CR) 10Daniel Kinzler: "This patch was made to address one of these service degradations. We can hold it back, but I don't see how we can fix the problem without " [puppet] - 10https://gerrit.wikimedia.org/r/366887 (https://phabricator.wikimedia.org/T171263) (owner: 10Ladsgroup)
[11:49:40] <icinga-wm>	 RECOVERY - mediawiki-installation DSH group on mw2152 is OK: OK
[11:53:10] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:53:24] <zeljkof>	 jouncebot: next
[11:53:24] <jouncebot>	 In 1 hour(s) and 6 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T1300)
[11:53:31] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:53:59] <zeljkof>	 hashar: nothing for swat so far ^
[11:58:11] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:58:40] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:12:40] <moritzm>	 !log installing xorg-server updates from jessie 8.9 point release
[12:12:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:24:00] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.67% of data above the critical threshold [1000.0]
[12:49:10] <icinga-wm>	 PROBLEM - Apache HTTP on mw1289 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 0.002 second response time
[12:49:21] <icinga-wm>	 PROBLEM - HHVM rendering on mw1289 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 3.850 second response time
[12:49:21] <icinga-wm>	 PROBLEM - Nginx local proxy to apache on mw1289 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 1308 bytes in 1.858 second response time
[12:50:10] <icinga-wm>	 RECOVERY - Apache HTTP on mw1289 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 612 bytes in 0.025 second response time
[12:50:20] <icinga-wm>	 RECOVERY - HHVM rendering on mw1289 is OK: HTTP OK: HTTP/1.1 200 OK - 75226 bytes in 0.273 second response time
[12:50:20] <icinga-wm>	 RECOVERY - Nginx local proxy to apache on mw1289 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 613 bytes in 0.083 second response time
[12:51:11] <zeljkof>	 jouncebot: next
[12:51:11] <jouncebot>	 In 0 hour(s) and 8 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T1300)
[13:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear anthropoid, the time has come. Please deploy European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T1300).
[13:00:27] <zeljkof>	 o/
[13:00:32] <zeljkof>	 but nothing to deploy...
[13:02:37] * TabbyCat checks just in case he's got anything
[13:03:04] <TabbyCat>	 zeljkof: https://gerrit.wikimedia.org/r/#/c/367676/ <-- wanna do?
[13:03:11] <TabbyCat>	 if yes, I can list at wikitech
[13:04:18] <zeljkof>	 TabbyCat: sure
[13:04:44] <wikibugs>	 (03PS2) 10Giuseppe Lavagetto: role::mediawiki::canary_appserver: move to the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367876
[13:05:07] <TabbyCat>	 zeljkof: okay I'll list it right now
[13:05:40] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:06:33] <TabbyCat>	 done
[13:06:40] <zeljkof>	 TabbyCat: on it
[13:06:47] <zeljkof>	 for the record: I can SWAT today!
[13:07:17] <zeljkof>	 hashar: one thing for SWAT, I'll take care of it
[13:08:11] <wikibugs>	 (03PS3) 10Zfilipin: HD logos for eswikivoyage and added some missing paths to the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367676 (https://phabricator.wikimedia.org/T170604) (owner: 10MarcoAurelio)
[13:08:29] <wikibugs>	 (03PS3) 10Ema: pybal::monitoring: add check_pybal_ipvs_diff [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893)
[13:08:55] <zeljkof>	 TabbyCat: having some intertubes slowdown at the moment, so it might be a bit slower than usual...
[13:09:21] <TabbyCat>	 zeljkof: no problem, take your time or we can delay it if the server is too overloaded
[13:12:10] <wikibugs>	 (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367676 (https://phabricator.wikimedia.org/T170604) (owner: 10MarcoAurelio)
[13:12:45] <wikibugs>	 (03CR) 10Ema: pybal::monitoring: add check_pybal_ipvs_diff (0312 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[13:12:53] <zeljkof>	 TabbyCat: it will get done in a few minutes, tubes not as bad as I thought
[13:13:00] <zeljkof>	 the problem is on my side, servers are fine
[13:13:11] <TabbyCat>	 okay no probs
[13:13:38] <wikibugs>	 (03Merged) 10jenkins-bot: HD logos for eswikivoyage and added some missing paths to the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367676 (https://phabricator.wikimedia.org/T170604) (owner: 10MarcoAurelio)
[13:13:40] <icinga-wm>	 PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:13:51] <wikibugs>	 (03CR) 10jenkins-bot: HD logos for eswikivoyage and added some missing paths to the config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367676 (https://phabricator.wikimedia.org/T170604) (owner: 10MarcoAurelio)
[13:14:11] <wikibugs>	 10Operations, 10Diamond, 10Traffic, 10monitoring, 10Prometheus-metrics-monitoring: Enable diamond PowerDNSRecursor collector on dnsrecursors - https://phabricator.wikimedia.org/T169600#3474646 (10ema)
[13:15:40] <icinga-wm>	 RECOVERY - puppet last run on stat1002 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[13:18:25] <zeljkof>	 TabbyCat: the commit is at mwdebug1002, can you test there?
[13:18:35] <TabbyCat>	 zeljkof: sure thing, I'm on it
[13:19:38] <TabbyCat>	 zeljkof: oops https://es.wikisource.org/wiki/Portada <_<
[13:19:45] <TabbyCat>	 I guess I need to reduce that one
[13:20:28] <zeljkof>	 TabbyCat: should I revert? or will you create a follow up commit?
[13:20:31] <TabbyCat>	 but for everything else, all looks good to me
[13:20:34] <TabbyCat>	 I'll follow-up
[13:20:47] <zeljkof>	 TabbyCat: ok, so I can deploy?
[13:21:19] <TabbyCat>	 zeljkof: yes, and I'll create a followup removing the path for eswikisource until I can find the right sizes
[13:21:35] <zeljkof>	 TabbyCat: ok, deploying...
[13:22:46] <logmsgbot>	 !log zfilipin@tin Synchronized static/images/project-logos/: SWAT: [[gerrit:367676|HD logos for eswikivoyage and added some missing paths to the config (T170604)]] (duration: 00m 54s)
[13:22:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:22:57] <stashbot>	 T170604: High density logos for spanish sister projects - https://phabricator.wikimedia.org/T170604
[13:24:10] <wikibugs>	 (03Draft2) 10MarcoAurelio: Revert eswikisource paths due to oversized logos. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367889
[13:24:15] <wikibugs>	 (03Draft1) 10MarcoAurelio: Revert eswikisource paths due to oversized logos. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367889
[13:24:31] <TabbyCat>	 zeljkof: ^^
[13:24:49] <logmsgbot>	 !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:367676|HD logos for eswikivoyage and added some missing paths to the config (T170604)]] (duration: 00m 46s)
[13:24:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:25:11] <zeljkof>	 TabbyCat: deployed, please check
[13:25:26] <zeljkof>	 TabbyCat: also, could you please add the follow up commit to Deployments page?
[13:25:37] <TabbyCat>	 yes, I was doing that :)
[13:27:06] <TabbyCat>	 done
[13:27:35] <wikibugs>	 (03CR) 10Zfilipin: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367889 (owner: 10MarcoAurelio)
[13:27:43] <zeljkof>	 TabbyCat: thanks
[13:29:00] <wikibugs>	 (03Merged) 10jenkins-bot: Revert eswikisource paths due to oversized logos. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367889 (owner: 10MarcoAurelio)
[13:29:13] <wikibugs>	 (03CR) 10jenkins-bot: Revert eswikisource paths due to oversized logos. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367889 (owner: 10MarcoAurelio)
[13:29:20] <zeljkof>	 TabbyCat: you have added it to yesterday :) please update
[13:29:50] <TabbyCat>	 sorry, I am tense due to this
[13:29:52] <TabbyCat>	 fixing
[13:30:13] <wikibugs>	 (03CR) 10Jcrespo: "I don't know if this can create a regression, but this will likely mitigate T171638 (maybe), even if it is a separate issue." [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush)
[13:30:37] <zeljkof>	 TabbyCat: 367889 is at mwdebug1002, please check
[13:30:54] <TabbyCat>	 on it
[13:31:10] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:31:30] <TabbyCat>	 looks good again on mwdebug1002
[13:31:30] <icinga-wm>	 PROBLEM - salt-minion processes on stat1002 is CRITICAL: Return code of 255 is out of bounds
[13:31:37] <wikibugs>	 10Operations, 10monitoring, 10User-fgiunchedi: Diamond log level set to DEBUG spams syslog - https://phabricator.wikimedia.org/T171580#3474672 (10jcrespo) Migrating my comment from T171638, as it may be useful for debugging:    > I believe it is on stretch, because I have not seen it on other hosts, but it c...
[13:31:40] <icinga-wm>	 PROBLEM - configured eth on stat1002 is CRITICAL: Return code of 255 is out of bounds
[13:31:41] <icinga-wm>	 PROBLEM - puppet last run on stat1002 is CRITICAL: Return code of 255 is out of bounds
[13:31:45] <wikibugs>	 (03PS3) 10Giuseppe Lavagetto: role::mediawiki::canary_appserver: move to the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367876
[13:31:50] <icinga-wm>	 PROBLEM - MariaDB Slave Lag: s2 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:32:25] <zeljkof>	 TabbyCat: ok, deploying
[13:32:34] <jynus>	 I thought 1001 was getting less loaded
[13:32:36] <elukey>	 I am working on stat1002 sorry, silenced
[13:33:10] <icinga-wm>	 PROBLEM - MariaDB Slave SQL: x1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:33:11] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s6 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:33:20] <icinga-wm>	 PROBLEM - MariaDB Slave IO: m3 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:33:23] <icinga-wm>	 PROBLEM - MariaDB Slave IO: x1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:33:23] <icinga-wm>	 PROBLEM - MariaDB Slave IO: s1 on dbstore1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:33:30] <icinga-wm>	 RECOVERY - salt-minion processes on stat1002 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[13:33:40] <icinga-wm>	 RECOVERY - configured eth on stat1002 is OK: OK - interfaces up
[13:33:51] <icinga-wm>	 RECOVERY - puppet last run on stat1002 is OK: OK: Puppet is currently enabled, last run 18 minutes ago with 0 failures
[13:34:01] <icinga-wm>	 RECOVERY - MariaDB Slave SQL: x1 on dbstore1001 is OK: OK slave_sql_state Slave_SQL_Running: No, (no error: intentional)
[13:34:01] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s6 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[13:34:10] <icinga-wm>	 RECOVERY - MariaDB Slave IO: m3 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[13:34:10] <icinga-wm>	 RECOVERY - MariaDB Slave IO: x1 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[13:34:11] <icinga-wm>	 RECOVERY - MariaDB Slave IO: s1 on dbstore1001 is OK: OK slave_io_state Slave_IO_Running: Yes
[13:34:16] <logmsgbot>	 !log zfilipin@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:367889|Revert eswikisource paths due to oversized logos (T170604)]] (duration: 00m 46s)
[13:34:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:34:27] <stashbot>	 T170604: High density logos for spanish sister projects - https://phabricator.wikimedia.org/T170604
[13:34:39] <zeljkof>	 TabbyCat: deployed, please check
[13:35:05] <TabbyCat>	 lgtm
[13:35:20] <icinga-wm>	 PROBLEM - Check Varnish expiry mailbox lag on cp1074 is CRITICAL: CRITICAL: expiry mailbox lag is 2040233
[13:35:21] <zeljkof>	 TabbyCat: anything else, or are we done with scap for now? :)
[13:35:33] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: apache::conf: convert to use validate_numeric [puppet] - 10https://gerrit.wikimedia.org/r/367891
[13:35:50] <TabbyCat>	 zeljkof: I give way now, no more from me to swat
[13:35:55] <TabbyCat>	 thank you for your help
[13:36:01] <zeljkof>	 !log EU SWAT finished
[13:36:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:36:15] <zeljkof>	 TabbyCat: thanks for deploying with #releng :)
[13:37:22] <TabbyCat>	 as if it could be done with another "company" xD -- my pleasure :)
[13:37:47] <zeljkof>	 TabbyCat: ;)
[13:41:20] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.67% of data above the critical threshold [1000.0]
[13:44:21] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.67% of data above the critical threshold [1000.0]
[13:44:30] <gehel>	 !log restarting cassandra on maps clusters
[13:44:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:50:59] <wikibugs>	 (03Draft2) 10MarcoAurelio: Make ptwikimedia a fishbowl wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367892 (https://phabricator.wikimedia.org/T171501)
[13:51:07] <wikibugs>	 (03Draft1) 10MarcoAurelio: Make ptwikimedia a fishbowl wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367892 (https://phabricator.wikimedia.org/T171501)
[13:51:16] <wikibugs>	 (03PS2) 10Giuseppe Lavagetto: apache::conf: convert to use validate_numeric [puppet] - 10https://gerrit.wikimedia.org/r/367891 (https://phabricator.wikimedia.org/T171704)
[13:51:18] <wikibugs>	 (03PS4) 10Giuseppe Lavagetto: role::mediawiki::canary_appserver: move to the future parser [puppet] - 10https://gerrit.wikimedia.org/r/367876 (https://phabricator.wikimedia.org/T171704)
[13:51:54] <wikibugs>	 (03PS3) 10MarcoAurelio: Make ptwikimedia a fishbowl wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367892 (https://phabricator.wikimedia.org/T171501)
[13:53:20] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Make ptwikimedia a fishbowl wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367892 (https://phabricator.wikimedia.org/T171501) (owner: 10MarcoAurelio)
[13:58:30] <wikibugs>	 (03PS4) 10MarcoAurelio: Make ptwikimedia a fishbowl wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367892 (https://phabricator.wikimedia.org/T171501)
[14:02:17] <TabbyCat>	 Dereckson: does it make sense to make ptwikimedia fishbowl, given that they use the abusefilter to make it ''de facto'' already?
[14:02:54] <wikibugs>	 (03CR) 10Volans: "Much nicer! I still have some doubts about getting the data from prometheus because of potential stale data that doesn't come from the so" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[14:04:20] <wikibugs>	 (03PS3) 10Jcrespo: mariadb: Pool db2072 with low load as s1 main traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/365285 (https://phabricator.wikimedia.org/T170662)
[14:05:20] <wikibugs>	 (03CR) 10Volans: [C: 04-1] "Reply inline" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[14:05:26] <volans>	 damn gerrit
[14:05:29] <volans>	 it was not a -1
[14:06:02] <volans>	 seems a bug on the IRC side though, the patch doesn't have the -1 :D
[14:08:28] <wikibugs>	 (03PS8) 10Rush: diamond: set diskspace filesystems explicitly [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583)
[14:08:47] <wikibugs>	 (03PS9) 10Rush: diamond: set diskspace filesystems explicitly [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583)
[14:09:21] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Pool db2072 with low load as s1 main traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/365285 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo)
[14:11:52] * elukey will not make jokes about volans' last -1
[14:11:58] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Pool db2072 with low load as s1 main traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/365285 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo)
[14:12:13] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Pool db2072 with low load as s1 main traffic [mediawiki-config] - 10https://gerrit.wikimedia.org/r/365285 (https://phabricator.wikimedia.org/T170662) (owner: 10Jcrespo)
[14:12:13] <volans>	 elukey: you're so kind
[14:12:17] <volans>	 :-P
[14:12:20] <elukey>	 :D
[14:13:01] <wikibugs>	 (03Abandoned) 10Rush: DON'T MERGE: labsdb: in case labsdb1001 falls over [puppet] - 10https://gerrit.wikimedia.org/r/367625 (https://phabricator.wikimedia.org/T171538) (owner: 10Rush)
[14:13:57] <wikibugs>	 (03CR) 10Jcrespo: "Don't keep it too far ;-)" [puppet] - 10https://gerrit.wikimedia.org/r/367625 (https://phabricator.wikimedia.org/T171538) (owner: 10Rush)
[14:16:24] <wikibugs>	 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: Degraded RAID on db1066 - https://phabricator.wikimedia.org/T169448#3474822 (10jcrespo) 05Open>03Resolved a:03Cmjohnson
[14:17:09] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 031] diamond: set diskspace filesystems explicitly [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush)
[14:21:13] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-codfw.php: Pool db2072 (duration: 00m 45s)
[14:21:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:25:19] <wikibugs>	 10Operations, 10Beta-Cluster-Infrastructure, 10media-storage, 10Release-Engineering-Team (Kanban): nscd does not cache localhost causing high CPU usage when localhost is often resolved - https://phabricator.wikimedia.org/T171745#3474846 (10hashar)
[14:25:51] <wikibugs>	 (03PS2) 10Hashar: swift: save nscd CPU by using IP address [puppet] - 10https://gerrit.wikimedia.org/r/358799 (https://phabricator.wikimedia.org/T160990)
[14:27:18] <wikibugs>	 (03CR) 10Hashar: "This patch is cherry picked on the beta cluster and definitely reduces the load / CPU usage of nscd on the labs instances." [puppet] - 10https://gerrit.wikimedia.org/r/358799 (https://phabricator.wikimedia.org/T160990) (owner: 10Hashar)
[14:28:09] <wikibugs>	 10Operations, 10Beta-Cluster-Infrastructure, 10media-storage, 10Patch-For-Review, 10Release-Engineering-Team (Kanban): nscd does not cache localhost causing high CPU usage when localhost is often resolved - https://phabricator.wikimedia.org/T171745#3474869 (10hashar) For #beta-cluster-infrastructure the...
[14:33:40] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 031] CLI: simplify imports and introspection [software/cumin] - 10https://gerrit.wikimedia.org/r/366734 (owner: 10Volans)
[14:34:51] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Depool db2068 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367896
[14:35:56] <wikibugs>	 (03PS2) 10Jcrespo: mariadb: Depool db2068 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367896
[14:40:50] <moritzm>	 !log installing spice security updates
[14:40:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:45:01] <wikibugs>	 (03CR) 10Filippo Giunchedi: "LGTM, one comment re: exceptions" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[14:50:17] <wikibugs>	 (03CR) 10Volans: pybal::monitoring: add check_pybal_ipvs_diff (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/367662 (https://phabricator.wikimedia.org/T134893) (owner: 10Ema)
[14:51:19] <wikibugs>	 10Operations, 10ops-codfw, 10User-fgiunchedi: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3474929 (10Papaul) @fgiunchedi  Yes the server is back at the stage it was yesterday before updating the firmware and removing the power. I am going to remove the power again and let you try to...
[14:54:14] <godog>	 papaul: ^ thanks! I'm about to jump into a meeting, will be able to take a look in an hour or so
[14:55:00] <icinga-wm>	 PROBLEM - Host ms-be2024.mgmt is DOWN: PING CRITICAL - Packet loss = 100%
[15:00:08] <papaul>	 godog: ok
[15:00:12] <icinga-wm>	 PROBLEM - HHVM jobrunner on mw1169 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:01:10] <icinga-wm>	 RECOVERY - HHVM jobrunner on mw1169 is OK: HTTP OK: HTTP/1.1 200 OK - 202 bytes in 0.002 second response time
[15:07:47] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 031] Logging: add a custom trace() logging level [software/cumin] - 10https://gerrit.wikimedia.org/r/366735 (owner: 10Volans)
[15:08:02] <wikibugs>	 10Operations, 10Performance-Team, 10Thumbor, 10Patch-For-Review, 10User-fgiunchedi: Deploy thumbor in codfw - https://phabricator.wikimedia.org/T167801#3474987 (10Papaul)
[15:08:06] <wikibugs>	 10Operations, 10ops-codfw, 10Performance-Team, 10Thumbor, 10User-fgiunchedi: Rename mw2148 / mw2149 / mw2259 / mw2260 to thumbor200[1234] - https://phabricator.wikimedia.org/T168881#3474985 (10Papaul) 05Open>03Resolved Complete
[15:10:50] <icinga-wm>	 RECOVERY - Host ms-be2024.mgmt is UP: PING OK - Packet loss = 0%, RTA = 36.55 ms
[15:11:15] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Depool db2068 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367896 (owner: 10Jcrespo)
[15:12:32] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Depool db2068 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367896 (owner: 10Jcrespo)
[15:12:44] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Depool db2068 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367896 (owner: 10Jcrespo)
[15:15:21] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-codfw.php: Depool db2068 (duration: 00m 46s)
[15:15:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:17:37] <jynus>	 !log restarting and upgrading db2068
[15:17:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:20:00] <wikibugs>	 (03PS3) 10Dzahn: contint: webperformance Jenkins slave [puppet] - 10https://gerrit.wikimedia.org/r/367411 (https://phabricator.wikimedia.org/T166756) (owner: 10Hashar)
[15:20:55] <moritzm>	 !log upgrade nodejs on scb2001 (currently depooled for testing)
[15:21:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:21:53] <icinga-wm>	 PROBLEM - Check Varnish expiry mailbox lag on cp1049 is CRITICAL: CRITICAL: expiry mailbox lag is 2026383
[15:21:53] <wikibugs>	 (03CR) 10Dzahn: [C: 032] contint: webperformance Jenkins slave [puppet] - 10https://gerrit.wikimedia.org/r/367411 (https://phabricator.wikimedia.org/T166756) (owner: 10Hashar)
[15:22:56] <wikibugs>	 (03PS2) 10Dzahn: visualdiff: Remove manually built `uprightdiff` [puppet] - 10https://gerrit.wikimedia.org/r/367131 (owner: 10Legoktm)
[15:23:46] <hashar>	 mutante: danke :)
[15:27:21] <mutante>	 de rien
[15:27:40] <wikibugs>	 (03CR) 10Dzahn: [C: 032] visualdiff: Remove manually built `uprightdiff` [puppet] - 10https://gerrit.wikimedia.org/r/367131 (owner: 10Legoktm)
[15:29:21] <moritzm>	 !log upgrade nodejs on remaining scb hosts (along with service restarts)
[15:29:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:29:53] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10MinervaNeue, 10Reading-Web-Backlog, 10Release-Engineering-Team: The mobile-frontend-placeholder message is not updated in din.wikipedia.org - https://phabricator.wikimedia.org/T171711#3475060 (10bmansurov)
[15:30:17] <wikibugs>	 (03PS9) 10Andrew Bogott: Puppetmaster:  Fix apache config ssldir [puppet] - 10https://gerrit.wikimedia.org/r/365053
[15:31:34] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 032] Puppetmaster:  Fix apache config ssldir [puppet] - 10https://gerrit.wikimedia.org/r/365053 (owner: 10Andrew Bogott)
[15:32:15] <wikibugs>	 (03PS1) 10Jcrespo: Revert "mariadb: Depool db2068 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367900
[15:32:43] <andrewbogott>	 !log patching puppetmaster1001, possible puppet hiccups coming up
[15:32:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:36:43] <wikibugs>	 (03CR) 10Dzahn: "cleaned up manually on ruthenium to reflect this" [puppet] - 10https://gerrit.wikimedia.org/r/367131 (owner: 10Legoktm)
[15:38:12] <wikibugs>	 (03CR) 10Dzahn: [C: 031] "Can you confirm the desired result is that on jessie both PHP5 and PHP7 are installed in parallel?" [puppet] - 10https://gerrit.wikimedia.org/r/361680 (https://phabricator.wikimedia.org/T166611) (owner: 10Paladox)
[15:38:35] <wikibugs>	 (03CR) 10Dzahn: [C: 031] "@Hashar" [puppet] - 10https://gerrit.wikimedia.org/r/361680 (https://phabricator.wikimedia.org/T166611) (owner: 10Paladox)
[15:38:54] <wikibugs>	 (03PS2) 10Jcrespo: Revert "mariadb: Depool db2068 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367900
[15:38:56] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Increase db2072 weight after pooling it with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367901
[15:38:58] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Depool db2069 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367902
[15:39:00] <wikibugs>	 (03PS1) 10Ema: Add support for One-packet scheduling (OPS) [debs/pybal] - 10https://gerrit.wikimedia.org/r/367903
[15:39:12] <moritzm>	 !log rolling upgrade/service restarts of nodejs in eqiad
[15:39:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:39:45] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] Revert "mariadb: Depool db2068 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367900 (owner: 10Jcrespo)
[15:40:03] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Increase db2072 weight after pooling it with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367901 (owner: 10Jcrespo)
[15:40:21] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Depool db2069 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367902 (owner: 10Jcrespo)
[15:40:40] <wikibugs>	 (03PS2) 10Dzahn: Don't need to update submodules recursively [puppet] - 10https://gerrit.wikimedia.org/r/367639 (owner: 10Reedy)
[15:41:16] <wikibugs>	 (03Merged) 10jenkins-bot: Revert "mariadb: Depool db2068 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367900 (owner: 10Jcrespo)
[15:41:26] <wikibugs>	 (03CR) 10jenkins-bot: Revert "mariadb: Depool db2068 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367900 (owner: 10Jcrespo)
[15:42:33] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Increase db2072 weight after pooling it with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367901 (owner: 10Jcrespo)
[15:42:52] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Depool db2069 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367902 (owner: 10Jcrespo)
[15:43:55] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Increase db2072 weight after pooling it with low weight [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367901 (owner: 10Jcrespo)
[15:43:57] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Depool db2069 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367902 (owner: 10Jcrespo)
[15:45:58] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool db2068, depool db2069, pool db2072 with more weight (duration: 00m 46s)
[15:46:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:28] <jynus>	 db2068 IPMI Temperature check started timing out after reboot :-/
[15:48:03] <jynus>	 !log upgrade and reboot db2069
[15:48:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:48:38] <wikibugs>	 (03CR) 10Dzahn: [C: 032] Don't need to update submodules recursively [puppet] - 10https://gerrit.wikimedia.org/r/367639 (owner: 10Reedy)
[15:51:23] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Change db2069 mysql socket location to the default [puppet] - 10https://gerrit.wikimedia.org/r/367906 (https://phabricator.wikimedia.org/T148507)
[15:53:22] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Add support for One-packet scheduling (OPS) [debs/pybal] - 10https://gerrit.wikimedia.org/r/367903 (owner: 10Ema)
[15:54:22] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Change db2069 mysql socket location to the default [puppet] - 10https://gerrit.wikimedia.org/r/367906 (https://phabricator.wikimedia.org/T148507) (owner: 10Jcrespo)
[15:57:48] <icinga-wm>	 PROBLEM - Check Varnish expiry mailbox lag on cp1099 is CRITICAL: CRITICAL: expiry mailbox lag is 2082684
[15:57:57] <wikibugs>	 (03PS10) 10Rush: diamond: set diskspace filesystems explicitly [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583)
[15:59:00] <wikibugs>	 10Operations, 10ops-eqiad, 10hardware-requests, 10Patch-For-Review: Decommission mw1196 - https://phabricator.wikimedia.org/T170441#3475150 (10RobH)   >>! In T170441#3473908, @MoritzMuehlenhoff wrote: > Have the "non-interruptible steps" really been completed? mw1196 still has a salt key and shows up  htt...
[15:59:47] <wikibugs>	 10Operations, 10ops-codfw, 10User-fgiunchedi: ms-be2024 not powering on - https://phabricator.wikimedia.org/T171275#3475157 (10Papaul) @fgiunchedi  Now the server can't power on even when using the power button on the server. I contacted HP and after troubleshooting they decided to send a replacement board...
[16:01:16] <wikibugs>	 (03CR) 10Rush: [C: 032] diamond: set diskspace filesystems explicitly [puppet] - 10https://gerrit.wikimedia.org/r/367710 (https://phabricator.wikimedia.org/T171583) (owner: 10Rush)
[16:03:03] <wikibugs>	 (03CR) 10Dzahn: "doesn't this mean merging this would break scap deployment on all labs instances that aren't using that puppetmaster with the cherry-picke" [puppet] - 10https://gerrit.wikimedia.org/r/365891 (https://phabricator.wikimedia.org/T166013) (owner: 1020after4)
[16:04:21] <wikibugs>	 (03CR) 10Dzahn: "I don't understand how this is related to /home since it's /var/lib/scap before and /var/lib/something_else after. Where is $deploy_user g" [puppet] - 10https://gerrit.wikimedia.org/r/365891 (https://phabricator.wikimedia.org/T166013) (owner: 1020after4)
[16:08:08] <wikibugs>	 (03PS1) 10Jcrespo: Revert "mariadb: Depool db2069 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367909
[16:12:19] <wikibugs>	 (03PS2) 10Jcrespo: Revert "mariadb: Depool db2069 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367909
[16:12:21] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Depool db2070 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367911
[16:14:25] <moritzm>	 !log upgraded nodejs on restbase*
[16:14:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:19:06] <wikibugs>	 (03PS1) 10Jcrespo: mariadb: Move db2070 socket location to the default after reboot [puppet] - 10https://gerrit.wikimedia.org/r/367912 (https://phabricator.wikimedia.org/T148507)
[16:19:08] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] Revert "mariadb: Depool db2069 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367909 (owner: 10Jcrespo)
[16:19:53] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Depool db2070 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367911 (owner: 10Jcrespo)
[16:20:07] <wikibugs>	 (03CR) 10Daniel Kinzler: [C: 031] "yes, please." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367393 (https://phabricator.wikimedia.org/T165197) (owner: 10Ladsgroup)
[16:20:22] <wikibugs>	 (03Merged) 10jenkins-bot: Revert "mariadb: Depool db2069 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367909 (owner: 10Jcrespo)
[16:20:32] <wikibugs>	 (03CR) 10jenkins-bot: Revert "mariadb: Depool db2069 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367909 (owner: 10Jcrespo)
[16:21:01] <wikibugs>	 (03Merged) 10jenkins-bot: mariadb: Depool db2070 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367911 (owner: 10Jcrespo)
[16:22:33] <icinga-wm>	 PROBLEM - puppet last run on mw1295 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:42] <icinga-wm>	 PROBLEM - puppet last run on kafka1022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:42] <icinga-wm>	 PROBLEM - puppet last run on prometheus2004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:42] <icinga-wm>	 PROBLEM - puppet last run on db1020 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:43] <icinga-wm>	 PROBLEM - puppet last run on cp3043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:46] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool db2069, depool db2070 (duration: 00m 45s)
[16:22:52] <icinga-wm>	 PROBLEM - puppet last run on aluminium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:52] <icinga-wm>	 PROBLEM - puppet last run on ms-be1032 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:52] <icinga-wm>	 PROBLEM - puppet last run on ores1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:22:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:23:02] <icinga-wm>	 PROBLEM - puppet last run on notebook1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:02] <icinga-wm>	 PROBLEM - puppet last run on wtp1041 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:03] <icinga-wm>	 PROBLEM - puppet last run on labtestservices2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:12] <icinga-wm>	 PROBLEM - puppet last run on restbase1010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:12] <icinga-wm>	 PROBLEM - puppet last run on db1098 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:12] <icinga-wm>	 PROBLEM - puppet last run on db1100 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:12] <icinga-wm>	 PROBLEM - puppet last run on krypton is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:13] <icinga-wm>	 PROBLEM - puppet last run on wtp1019 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:13] <icinga-wm>	 PROBLEM - puppet last run on baham is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:22] <icinga-wm>	 PROBLEM - puppet last run on install1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:22] <icinga-wm>	 PROBLEM - puppet last run on mw1212 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:22] <icinga-wm>	 PROBLEM - puppet last run on wtp1026 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:32] <icinga-wm>	 PROBLEM - puppet last run on analytics1051 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:32] <icinga-wm>	 PROBLEM - puppet last run on db1029 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:32] <icinga-wm>	 PROBLEM - puppet last run on lvs3004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:23:32] <wikibugs>	 (03CR) 10jenkins-bot: mariadb: Depool db2070 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367911 (owner: 10Jcrespo)
[16:23:33] <icinga-wm>	 PROBLEM - puppet last run on wtp1017 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:25:13] <icinga-wm>	 RECOVERY - puppet last run on db1098 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[16:25:14] <wikibugs>	 10Operations, 10ops-codfw, 10hardware-requests: Decommission subra/suhail - https://phabricator.wikimedia.org/T169506#3475211 (10Papaul)
[16:26:13] <icinga-wm>	 RECOVERY - puppet last run on wtp1019 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[16:27:05] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Remove wbq_evaluation logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367913
[16:27:28] <jynus>	 !log upgrading and rebooting db2070
[16:27:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:28:22] <wikibugs>	 10Operations, 10ops-codfw, 10hardware-requests: reclaim/decom tmh200[12] - https://phabricator.wikimedia.org/T168472#3475218 (10Papaul)
[16:29:23] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] mariadb: Move db2070 socket location to the default after reboot [puppet] - 10https://gerrit.wikimedia.org/r/367912 (https://phabricator.wikimedia.org/T148507) (owner: 10Jcrespo)
[16:30:57] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10MinervaNeue, 10Release-Engineering-Team, 10Reading-Web-Backlog (Tracking): The mobile-frontend-placeholder message is not updated in din.wikipedia.org - https://phabricator.wikimedia.org/T171711#3475232 (10Jdlrobson)
[16:31:54] <icinga-wm>	 RECOVERY - Check Varnish expiry mailbox lag on cp1049 is OK: OK: expiry mailbox lag is 587
[16:33:22] <icinga-wm>	 RECOVERY - puppet last run on mw1212 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[16:34:41] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Log 'WikibaseQualityConstraints' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367914 (https://phabricator.wikimedia.org/T171281)
[16:35:38] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: The mobile-frontend-placeholder message is not updated in din.wikipedia.org - https://phabricator.wikimedia.org/T171711#3475238 (10Amire80) Removing reading tags. It's most likely an ops issue with LU.
[16:36:11] <wikibugs>	 (03PS1) 10Jcrespo: Revert "mariadb: Depool db2070 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367915
[16:36:18] <wikibugs>	 (03CR) 10Jcrespo: [C: 04-1] "Not yet" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367915 (owner: 10Jcrespo)
[16:36:27] <wikibugs>	 (03CR) 10Lucas Werkmeister (WMDE): "If you don’t like the log channel name, it can still be changed – the code that uses it hasn’t been deployed yet." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367914 (https://phabricator.wikimedia.org/T171281) (owner: 10Lucas Werkmeister (WMDE))
[16:36:41] <godog>	 papaul: thanks for dealing with ms-be2024, did they give you an ETA?
[16:37:42] <icinga-wm>	 RECOVERY - Check Varnish expiry mailbox lag on cp1099 is OK: OK: expiry mailbox lag is 147718
[16:38:00] <wikibugs>	 (03CR) 10Gehel: [C: 032] Decrease elasticsearch search thread pool to 32 for cirrus servers (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson)
[16:38:08] <wikibugs>	 (03CR) 10Gehel: [C: 04-1] Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - 10https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: 10EBernhardson)
[16:40:02] <papaul>	 godog: tomorrow between 9am-1pm
[16:40:04] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: l10nupdate failing with "git pull of extensions failed" since July 19th - https://phabricator.wikimedia.org/T171711#3475257 (10greg)
[16:40:12] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: l10nupdate failing with "git pull of extensions failed" since July 19th - https://phabricator.wikimedia.org/T171711#3473889 (10greg) p:05Triage>03High
[16:40:41] <wikibugs>	 (03CR) 10Dzahn: "regarding the previous comments about this not being merged earlier etc - i have to point out now that YES, it did BREAK and caused a new " [puppet] - 10https://gerrit.wikimedia.org/r/255958 (owner: 10Reedy)
[16:41:45] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: l10nupdate failing with "git pull of extensions failed" since July 19th - https://phabricator.wikimedia.org/T171711#3473889 (10Dzahn) likely caused by https://gerrit.wikimedia.org/r/#/c/255958/  follow-up to fix it was uploa...
[16:43:09] <greg-g>	 mutante: thanks for that ^ so likely will run successfully tonight?
[16:43:51] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: l10nupdate failing with "git pull of extensions failed" since July 19th - https://phabricator.wikimedia.org/T171711#3475273 (10greg) >>! In T171711#3475263, @Dzahn wrote: > likely caused by https://gerrit.wikimedia.org/r/#/c...
[16:44:00] <greg-g>	 Reedy: ^ just fyi
[16:44:15] <greg-g>	 Reedy: well, more than fyi, more like "hey, think you fixed it?"
[16:44:16] <Reedy>	 greg-g: I made a fix a day or two ago
[16:44:43] <greg-g>	 yeah, merged this morning, so hopefully fixed tonight
[16:44:48] <wikibugs>	 (03CR) 10Jcrespo: [C: 032] Revert "mariadb: Depool db2070 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367915 (owner: 10Jcrespo)
[16:45:22] <wikibugs>	 10Operations, 10MediaWiki-extensions-LocalisationUpdate, 10Release-Engineering-Team: l10nupdate failing with "git pull of extensions failed" since July 19th - https://phabricator.wikimedia.org/T171711#3475277 (10greg) a:03Reedy
[16:46:11] <wikibugs>	 (03Merged) 10jenkins-bot: Revert "mariadb: Depool db2070 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367915 (owner: 10Jcrespo)
[16:46:23] <wikibugs>	 (03CR) 10jenkins-bot: Revert "mariadb: Depool db2070 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367915 (owner: 10Jcrespo)
[16:47:48] <wikibugs>	 (03PS2) 10Ema: Add support for One-packet scheduling (OPS) [debs/pybal] - 10https://gerrit.wikimedia.org/r/367903
[16:49:02] <wikibugs>	 10Operations, 10monitoring, 10netops, 10Patch-For-Review, 10User-fgiunchedi: Evaluate LibreNMS' Graphite backend - https://phabricator.wikimedia.org/T171167#3475301 (10fgiunchedi) Turns out this is more data than I expected (just slowly increasing by now)  ``` $ du -hcs /var/lib/carbon/whisper/librenms/...
[16:49:13] <icinga-wm>	 RECOVERY - puppet last run on wtp1041 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[16:49:22] <icinga-wm>	 RECOVERY - puppet last run on labtestservices2002 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[16:49:22] <icinga-wm>	 RECOVERY - puppet last run on krypton is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[16:49:42] <icinga-wm>	 RECOVERY - puppet last run on wtp1017 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[16:50:02] <icinga-wm>	 RECOVERY - puppet last run on aluminium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[16:50:03] <icinga-wm>	 RECOVERY - puppet last run on cp3043 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[16:50:22] <icinga-wm>	 RECOVERY - puppet last run on restbase1010 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[16:50:23] <icinga-wm>	 RECOVERY - puppet last run on db1100 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[16:50:42] <icinga-wm>	 RECOVERY - puppet last run on db1029 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[16:50:52] <icinga-wm>	 RECOVERY - puppet last run on lvs3004 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[16:51:02] <icinga-wm>	 RECOVERY - puppet last run on prometheus2004 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[16:51:02] <icinga-wm>	 RECOVERY - puppet last run on ores1004 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[16:51:32] <icinga-wm>	 RECOVERY - puppet last run on install1002 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[16:51:32] <icinga-wm>	 RECOVERY - puppet last run on baham is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[16:51:33] <icinga-wm>	 RECOVERY - puppet last run on wtp1026 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[16:51:33] <wikibugs>	 10Operations, 10Commons, 10MediaWiki-extensions-Scribunto, 10Wikimedia-log-errors: Some Commons pages transcluding Template:Countries_of_Europe HTTP 500/503 when accessed from non-English languages specified in the template - https://phabricator.wikimedia.org/T171392#3475310 (10Anomie) a:03Anomie
[16:51:37] <wikibugs>	 (03PS5) 10Elukey: puppetdb: Bump Java Heap max size to 6GB [puppet] - 10https://gerrit.wikimedia.org/r/366229 (https://phabricator.wikimedia.org/T170740) (owner: 10Alexandros Kosiaris)
[16:51:52] <icinga-wm>	 RECOVERY - puppet last run on db1020 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[16:52:02] <wikibugs>	 (CR) Reedy: "The thing that broke it... Was adding something that wasn't there before :(" [puppet] - https://gerrit.wikimedia.org/r/255958 (owner: Reedy)
[16:52:12] <icinga-wm>	 RECOVERY - puppet last run on ms-be1032 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[16:52:12] <icinga-wm>	 RECOVERY - puppet last run on notebook1001 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[16:52:42] <icinga-wm>	 RECOVERY - puppet last run on analytics1051 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[16:52:52] <icinga-wm>	 RECOVERY - puppet last run on kafka1022 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[16:54:52] <icinga-wm>	 RECOVERY - puppet last run on mw1295 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[16:55:10] <logmsgbot>	 !log jynus@tin Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 00m 45s)
[16:55:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:58:46] <wikibugs>	 (CR) BBlack: [C: 1] Add support for One-packet scheduling (OPS) [debs/pybal] - https://gerrit.wikimedia.org/r/367903 (owner: Ema)
[16:59:53] <wikibugs>	 (PS3) Gehel: Decrease elasticsearch search thread pool to 32 for cirrus servers [puppet] - https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: EBernhardson)
[17:02:50] <wikibugs>	 (CR) Ema: [C: 2] Add support for One-packet scheduling (OPS) [debs/pybal] - https://gerrit.wikimedia.org/r/367903 (owner: Ema)
[17:04:39] <wikibugs>	 (CR) Faidon Liambotis: [C: 1] "LGTM. Maybe add IPv6 while we're at it?" [dns] - https://gerrit.wikimedia.org/r/367809 (https://phabricator.wikimedia.org/T169643) (owner: Ayounsi)
[17:06:10] <wikibugs>	 Operations, Puppet, Patch-For-Review: PuppetDB misbehaving on 2017-07-15 - https://phabricator.wikimedia.org/T170740#3475402 (Volans) So we had a small hiccup today in which puppetdb responded 28 times 503s between 16:20:13 and 16:20:39 UTC, of those 17 where POSTs to update the hosts facts and we ha...
[17:06:56] <icinga-wm>	 RECOVERY - carbon-cache too many creates on graphite1001 is OK: OK: Less than 1.00% above the threshold [500.0]
[17:10:05] <wikibugs>	 Operations, ops-codfw, hardware-requests: Decommission subra/suhail - https://phabricator.wikimedia.org/T169506#3475415 (Papaul)
[17:10:30] <wikibugs>	 (CR) Gehel: [C: -1] "puppet compiler result: https://puppet-compiler.wmflabs.org/compiler02/7171/" [puppet] - https://gerrit.wikimedia.org/r/367709 (https://phabricator.wikimedia.org/T169498) (owner: EBernhardson)
[17:11:11] <wikibugs>	 (PS1) Chad: Group1 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/367918
[17:11:22] <wikibugs>	 (CR) Chad: [C: -2] "later" [mediawiki-config] - https://gerrit.wikimedia.org/r/367918 (owner: Chad)
[17:11:41] <moritzm>	 !log installing openjdk-8 security updates on cobalt and removing unused openjdk-7 packages
[17:11:43] <icinga-wm>	 PROBLEM - Check systemd state on relforge1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[17:11:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:12:32] <moritzm>	 !log restarting gerrit to pick up Java security update
[17:12:39] <gehel>	 ^ checking relforge...
[17:12:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:14:53] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on relforge1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. Gehel experimentation in progress by dcausse
[17:14:54] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on relforge1002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. Gehel experimentation in progress by dcausse
[17:15:02] <icinga-wm>	 PROBLEM - puppet last run on stat1006 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_statistics_mediawiki]
[17:15:58] <wikibugs>	 (PS1) Papaul: DNS: Remove mgmt DNS entries for subra and suhail [dns] - https://gerrit.wikimedia.org/r/367919
[17:16:59] <wikibugs>	 Operations, Traffic: Investigate better DNS cache/lookup solutions - https://phabricator.wikimedia.org/T104442#3475432 (BBlack) So to recap a small part of IRC discussion today in the wake of issues with rebooting hydrogen, I think our short-term improvement plan looks like this:  1) Implement OPS (one-p...
[17:17:51] <wikibugs>	 Operations, ops-codfw, hardware-requests, Patch-For-Review: Decommission subra/suhail - https://phabricator.wikimedia.org/T169506#3475436 (Papaul)
[17:18:27] <wikibugs>	 Operations, ops-codfw: failing RAID disk on frdb2001 - https://phabricator.wikimedia.org/T171584#3475443 (RobH) We've ordered two disks for this, one for immediate use and one for standby spares.  They should arrive by early next week.
[17:22:18] <wikibugs>	 Operations, Puppet, Mobile, Need-volunteer, Reading-Web-Backlog (Tracking): URLs with title query string parameter and additional query string parameters do not redirect to mobile site - https://phabricator.wikimedia.org/T154227#3475450 (Jdlrobson)
[17:22:52] <wikibugs>	 Operations, Analytics, Analytics-Cluster, Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2734568 (leila) (There is a bit of IRC, email, meeting discussions as background missing here. but basically, Aaron, Andrew, and I chatted a couple of weeks ago...
[17:25:48] <wikibugs>	 (PS1) Ema: Add support for One-packet scheduling (OPS) [debs/pybal] (1.13) - https://gerrit.wikimedia.org/r/367923
[17:26:53] <wikibugs>	 (PS2) RobH: DNS: Remove mgmt DNS entries for subra and suhail [dns] - https://gerrit.wikimedia.org/r/367919 (owner: Papaul)
[17:27:12] <wikibugs>	 (CR) RobH: [C: 2] DNS: Remove mgmt DNS entries for subra and suhail [dns] - https://gerrit.wikimedia.org/r/367919 (owner: Papaul)
[17:28:11] <wikibugs>	 (PS1) Ema: 1.13.10: Add support for One-packet scheduling (OPS) [debs/pybal] - https://gerrit.wikimedia.org/r/367924
[17:28:25] <wikibugs>	 Operations, ops-codfw, hardware-requests, Patch-For-Review: Decommission subra/suhail - https://phabricator.wikimedia.org/T169506#3475464 (RobH) a:Papaul>RobH merging papaul's dns change and removing switch port config
[17:29:45] <wikibugs>	 (PS2) Ema: 1.13.10: Add support for One-packet scheduling (OPS) [debs/pybal] - https://gerrit.wikimedia.org/r/367924 (https://phabricator.wikimedia.org/T104442)
[17:30:54] <wikibugs>	 (CR) Ema: [C: 2] Add support for One-packet scheduling (OPS) [debs/pybal] (1.13) - https://gerrit.wikimedia.org/r/367923 (owner: Ema)
[17:31:11] <wikibugs>	 (PS1) Ema: 1.13.10: Add support for One-packet scheduling (OPS) [debs/pybal] (1.13) - https://gerrit.wikimedia.org/r/367925 (https://phabricator.wikimedia.org/T104442)
[17:31:35] <wikibugs>	 Operations, ops-codfw, DBA, Patch-For-Review: pdu phase inbalances: ps1-a3-codfw, ps1-c6-codfw, & ps1-d6-codfw - https://phabricator.wikimedia.org/T163339#3475474 (Papaul) p:Normal>Low
[17:31:59] <wikibugs>	 (PS1) BBlack: VCL: mobile_redirect: unconditional https [puppet] - https://gerrit.wikimedia.org/r/367926
[17:32:20] <wikibugs>	 (PS1) BBlack: recdns: do not use self in local resolv.conf [puppet] - https://gerrit.wikimedia.org/r/367927 (https://phabricator.wikimedia.org/T104442)
[17:35:42] <icinga-wm>	 PROBLEM - puppet last run on conf1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:44] <icinga-wm>	 PROBLEM - puppet last run on cp3034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:44] <icinga-wm>	 PROBLEM - puppet last run on relforge1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:44] <icinga-wm>	 PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:52] <icinga-wm>	 PROBLEM - puppet last run on etherpad1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:52] <icinga-wm>	 PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:35:57] <wikibugs>	 Operations, ops-codfw, hardware-requests: Decommission subra/suhail - https://phabricator.wikimedia.org/T169506#3475500 (RobH) Open>Resolved
[17:36:02] <icinga-wm>	 PROBLEM - puppet last run on mw1256 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:02] <icinga-wm>	 PROBLEM - puppet last run on mw1278 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:02] <icinga-wm>	 PROBLEM - puppet last run on labvirt1011 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:02] <icinga-wm>	 PROBLEM - puppet last run on mc1036 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:02] <icinga-wm>	 PROBLEM - puppet last run on mw1300 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:03] <icinga-wm>	 PROBLEM - puppet last run on ms-fe1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:03] <icinga-wm>	 PROBLEM - puppet last run on ms1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:03] <icinga-wm>	 PROBLEM - puppet last run on db1075 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:12] <icinga-wm>	 PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:12] <icinga-wm>	 PROBLEM - puppet last run on dbproxy1011 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:12] <icinga-wm>	 PROBLEM - puppet last run on stat1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:12] <icinga-wm>	 PROBLEM - puppet last run on db1036 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:13] <icinga-wm>	 PROBLEM - puppet last run on db1026 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:13] <icinga-wm>	 PROBLEM - puppet last run on wtp1015 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:13] <icinga-wm>	 PROBLEM - puppet last run on labvirt1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:22] <icinga-wm>	 PROBLEM - puppet last run on dbmonitor1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:22] <icinga-wm>	 PROBLEM - puppet last run on alsafi is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:22] <icinga-wm>	 PROBLEM - puppet last run on achernar is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:22] <icinga-wm>	 PROBLEM - puppet last run on mw1190 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:23] <icinga-wm>	 PROBLEM - puppet last run on mw1210 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:23] <icinga-wm>	 PROBLEM - puppet last run on analytics1050 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:23] <icinga-wm>	 PROBLEM - puppet last run on mwdebug1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:23] <icinga-wm>	 PROBLEM - puppet last run on graphite1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:23] <icinga-wm>	 PROBLEM - puppet last run on maps1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:32] <icinga-wm>	 PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on labcontrol1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on db1030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on praseodymium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on ms-be1037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on rdb1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on dbproxy1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on iridium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on wtp1010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:34] <icinga-wm>	 PROBLEM - puppet last run on prometheus2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:42] <icinga-wm>	 PROBLEM - puppet last run on db1082 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:42] <icinga-wm>	 PROBLEM - puppet last run on db1092 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:42] <icinga-wm>	 PROBLEM - puppet last run on labvirt1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:42] <icinga-wm>	 PROBLEM - puppet last run on lvs1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:42] <icinga-wm>	 PROBLEM - puppet last run on mw1303 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:43] <icinga-wm>	 PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:43] <icinga-wm>	 PROBLEM - puppet last run on etcd1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:43] <icinga-wm>	 PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:36:43] <icinga-wm>	 PROBLEM - puppet last run on labweb1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:02] <icinga-wm>	 PROBLEM - puppet last run on elastic1046 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:02] <icinga-wm>	 PROBLEM - puppet last run on mw1240 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:02] <icinga-wm>	 PROBLEM - puppet last run on mw1250 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:02] <icinga-wm>	 PROBLEM - puppet last run on rdb1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:02] <icinga-wm>	 PROBLEM - puppet last run on analytics1056 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:03] <icinga-wm>	 PROBLEM - puppet last run on mw1168 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:03] <icinga-wm>	 PROBLEM - puppet last run on puppetmaster1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:04] <icinga-wm>	 PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:12] <icinga-wm>	 PROBLEM - puppet last run on pc1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:12] <icinga-wm>	 PROBLEM - puppet last run on elastic1042 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:12] <icinga-wm>	 PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:22] <icinga-wm>	 PROBLEM - puppet last run on cp3037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:22] <icinga-wm>	 PROBLEM - puppet last run on uranium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:37:23] <icinga-wm>	 PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:38:03] <icinga-wm>	 PROBLEM - puppet last run on mw1246 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:38:32] <icinga-wm>	 PROBLEM - puppet last run on francium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:33] <icinga-wm>	 PROBLEM - puppet last run on analytics1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:33] <icinga-wm>	 PROBLEM - puppet last run on notebook1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:42] <icinga-wm>	 PROBLEM - puppet last run on db1070 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:42] <icinga-wm>	 PROBLEM - puppet last run on restbase1011 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:42] <icinga-wm>	 PROBLEM - puppet last run on sodium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:43] <icinga-wm>	 PROBLEM - puppet last run on kafka1020 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:43] <icinga-wm>	 PROBLEM - puppet last run on etcd1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on ms-be1031 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on mw1265 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on mw1267 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on mw1268 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on ores1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:52] <icinga-wm>	 PROBLEM - puppet last run on es1013 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:55] <icinga-wm>	 PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:56] <icinga-wm>	 PROBLEM - puppet last run on wasat is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:56] <icinga-wm>	 PROBLEM - puppet last run on cp1073 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:56] <icinga-wm>	 PROBLEM - puppet last run on labweb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:56] <icinga-wm>	 PROBLEM - puppet last run on ores1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:39:56] <icinga-wm>	 PROBLEM - puppet last run on ores1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:06] <icinga-wm>	 PROBLEM - puppet last run on ms-fe3002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on es1014 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on logstash1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on wtp1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on db1065 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on analytics1065 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:13] <icinga-wm>	 PROBLEM - puppet last run on analytics1062 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:14] <icinga-wm>	 PROBLEM - puppet last run on mw1185 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:14] <icinga-wm>	 PROBLEM - puppet last run on db1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:15] <icinga-wm>	 PROBLEM - puppet last run on wtp1029 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:15] <icinga-wm>	 PROBLEM - puppet last run on maps1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:16] <icinga-wm>	 PROBLEM - puppet last run on californium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:22] <icinga-wm>	 PROBLEM - puppet last run on dbstore1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:22] <icinga-wm>	 PROBLEM - puppet last run on labsdb1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:22] <icinga-wm>	 PROBLEM - puppet last run on cp3035 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:22] <icinga-wm>	 PROBLEM - puppet last run on mw1223 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:22] <icinga-wm>	 PROBLEM - puppet last run on wtp1040 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:23] <icinga-wm>	 PROBLEM - puppet last run on elastic1051 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:23] <icinga-wm>	 PROBLEM - puppet last run on labtestweb2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:23] <icinga-wm>	 PROBLEM - puppet last run on cp1062 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:34] <icinga-wm>	 PROBLEM - puppet last run on phab2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:42] <icinga-wm>	 PROBLEM - puppet last run on labsdb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:42] <icinga-wm>	 PROBLEM - puppet last run on db1066 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:43] <icinga-wm>	 PROBLEM - puppet last run on cp3030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on kubestagetcd1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on hassium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on aqs1008 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on aqs1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on ores1009 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:52] <icinga-wm>	 PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:53] <icinga-wm>	 PROBLEM - puppet last run on prometheus1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:53] <icinga-wm>	 PROBLEM - puppet last run on mw1271 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:54] <icinga-wm>	 PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:40:58] <icinga-wm>	 PROBLEM - puppet last run on oresrdb1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:02] <icinga-wm>	 PROBLEM - puppet last run on ganeti1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:02] <icinga-wm>	 PROBLEM - puppet last run on mw1254 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:02] <icinga-wm>	 PROBLEM - puppet last run on rhenium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:02] <icinga-wm>	 PROBLEM - puppet last run on mw1299 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:02] <icinga-wm>	 PROBLEM - puppet last run on rdb1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:03] <icinga-wm>	 PROBLEM - puppet last run on cp1051 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:03] <icinga-wm>	 PROBLEM - puppet last run on mc1019 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:03] <icinga-wm>	 PROBLEM - puppet last run on mw1283 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:03] <icinga-wm>	 PROBLEM - puppet last run on mw1255 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:04] <icinga-wm>	 PROBLEM - puppet last run on mw1197 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:04] <icinga-wm>	 PROBLEM - puppet last run on stat1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:12] <icinga-wm>	 PROBLEM - puppet last run on rdb1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:12] <icinga-wm>	 PROBLEM - puppet last run on cp3044 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:12] <icinga-wm>	 PROBLEM - puppet last run on radon is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:12] <icinga-wm>	 PROBLEM - puppet last run on oresrdb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:12] <icinga-wm>	 PROBLEM - puppet last run on cp1050 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:13] <icinga-wm>	 PROBLEM - puppet last run on dbproxy1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:13] <icinga-wm>	 PROBLEM - puppet last run on mw1282 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:13] <icinga-wm>	 PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:32] <icinga-wm>	 PROBLEM - puppet last run on kafka1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:33] <icinga-wm>	 PROBLEM - puppet last run on eventlog1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:42] <icinga-wm>	 PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:42] <icinga-wm>	 PROBLEM - puppet last run on ms-be1025 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:42] <icinga-wm>	 PROBLEM - puppet last run on cp1099 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:42] <icinga-wm>	 PROBLEM - puppet last run on db1011 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:42] <icinga-wm>	 PROBLEM - puppet last run on ms-be1022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:44] <icinga-wm>	 PROBLEM - puppet last run on oxygen is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:44] <icinga-wm>	 PROBLEM - puppet last run on sca1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:44] <icinga-wm>	 PROBLEM - puppet last run on labtestservices2003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:52] <icinga-wm>	 PROBLEM - puppet last run on cobalt is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:52] <icinga-wm>	 PROBLEM - puppet last run on kubestage1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:52] <icinga-wm>	 PROBLEM - puppet last run on ms-be1030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:52] <icinga-wm>	 PROBLEM - puppet last run on cp3006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:52] <icinga-wm>	 PROBLEM - puppet last run on mw1269 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:41:53] <icinga-wm>	 PROBLEM - puppet last run on thorium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:03] <icinga-wm>	 PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:03] <icinga-wm>	 PROBLEM - puppet last run on elastic1038 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:03] <icinga-wm>	 PROBLEM - puppet last run on analytics1057 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:12] <icinga-wm>	 PROBLEM - puppet last run on es1019 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:12] <icinga-wm>	 PROBLEM - puppet last run on mw1296 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:22] <icinga-wm>	 PROBLEM - puppet last run on db1091 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:23] <icinga-wm>	 PROBLEM - puppet last run on analytics1030 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:42] <icinga-wm>	 PROBLEM - puppet last run on poolcounter1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:42] <icinga-wm>	 PROBLEM - puppet last run on db1071 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:43] <icinga-wm>	 PROBLEM - puppet last run on kafka1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:52] <icinga-wm>	 PROBLEM - puppet last run on mc1031 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:52] <icinga-wm>	 PROBLEM - puppet last run on db1097 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:42:52] <icinga-wm>	 PROBLEM - puppet last run on mw1263 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:02] <icinga-wm>	 PROBLEM - puppet last run on ms-be1034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:04] <icinga-wm>	 PROBLEM - puppet last run on mw1251 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:04] <icinga-wm>	 PROBLEM - puppet last run on ms-be1017 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:04] <icinga-wm>	 PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:04] <icinga-wm>	 PROBLEM - puppet last run on wdqs1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:12] <icinga-wm>	 PROBLEM - puppet last run on cerium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:12] <icinga-wm>	 PROBLEM - puppet last run on conf1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:12] <icinga-wm>	 PROBLEM - puppet last run on ganeti1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:12] <icinga-wm>	 PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:12] <icinga-wm>	 PROBLEM - puppet last run on snapshot1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:13] <icinga-wm>	 PROBLEM - puppet last run on labvirt1017 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:13] <icinga-wm>	 PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:13] <icinga-wm>	 PROBLEM - puppet last run on wtp1048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:22] <icinga-wm>	 PROBLEM - puppet last run on bast3002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:22] <icinga-wm>	 PROBLEM - puppet last run on ganeti1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:33] <icinga-wm>	 PROBLEM - puppet last run on wtp1037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:42] <icinga-wm>	 PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:43] <icinga-wm>	 PROBLEM - puppet last run on mw1239 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:43] <icinga-wm>	 PROBLEM - puppet last run on db1085 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:43] <icinga-wm>	 PROBLEM - puppet last run on mw1281 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:43] <icinga-wm>	 PROBLEM - puppet last run on labvirt1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:52] <icinga-wm>	 PROBLEM - puppet last run on db1043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:54] <icinga-wm>	 PROBLEM - puppet last run on mwdebug1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:54] <icinga-wm>	 PROBLEM - puppet last run on db1099 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:43:54] <icinga-wm>	 PROBLEM - puppet last run on mw1209 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:02] <icinga-wm>	 PROBLEM - puppet last run on elastic1027 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on mc1035 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on scb1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on mw1297 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:12] <icinga-wm>	 PROBLEM - puppet last run on ms-be3003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:13] <icinga-wm>	 PROBLEM - puppet last run on labcontrol1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:13] <icinga-wm>	 PROBLEM - puppet last run on thumbor1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:13] <icinga-wm>	 PROBLEM - puppet last run on analytics1061 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:14] <icinga-wm>	 PROBLEM - puppet last run on graphite1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:23] <icinga-wm>	 PROBLEM - puppet last run on ores1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:23] <icinga-wm>	 PROBLEM - puppet last run on mw1229 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:23] <icinga-wm>	 PROBLEM - puppet last run on wtp1046 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:23] <icinga-wm>	 PROBLEM - puppet last run on db1096 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:23] <icinga-wm>	 PROBLEM - puppet last run on mw1162 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:33] <icinga-wm>	 PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:33] <icinga-wm>	 PROBLEM - puppet last run on mw1231 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:33] <icinga-wm>	 PROBLEM - puppet last run on wtp1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:33] <icinga-wm>	 PROBLEM - puppet last run on db1054 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:42] <icinga-wm>	 PROBLEM - puppet last run on rdb1008 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:42] <icinga-wm>	 PROBLEM - puppet last run on rhodium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:42] <icinga-wm>	 PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:42] <icinga-wm>	 PROBLEM - puppet last run on scb1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:43] <icinga-wm>	 PROBLEM - puppet last run on mw1202 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:52] <icinga-wm>	 PROBLEM - puppet last run on cp3047 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:52] <icinga-wm>	 PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:52] <icinga-wm>	 PROBLEM - puppet last run on dysprosium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:52] <icinga-wm>	 PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:53] <icinga-wm>	 PROBLEM - puppet last run on analytics1052 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:53] <icinga-wm>	 PROBLEM - puppet last run on elastic1029 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:53] <icinga-wm>	 PROBLEM - puppet last run on ocg1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:44:53] <icinga-wm>	 PROBLEM - puppet last run on labtestservices2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:02] <icinga-wm>	 PROBLEM - puppet last run on cp1046 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:03] <icinga-wm>	 PROBLEM - puppet last run on db1087 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:03] <icinga-wm>	 PROBLEM - puppet last run on darmstadtium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:03] <icinga-wm>	 PROBLEM - puppet last run on db1072 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1067 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:12] <icinga-wm>	 PROBLEM - puppet last run on db1035 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:13] <icinga-wm>	 PROBLEM - puppet last run on mw1294 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:13] <icinga-wm>	 PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:13] <icinga-wm>	 PROBLEM - puppet last run on logstash1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:13] <icinga-wm>	 PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:13] <icinga-wm>	 PROBLEM - puppet last run on aqs1005 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:22] <icinga-wm>	 PROBLEM - puppet last run on thumbor1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:22] <icinga-wm>	 PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:32] <icinga-wm>	 PROBLEM - puppet last run on mc1023 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:32] <icinga-wm>	 PROBLEM - puppet last run on ms-be1029 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:32] <icinga-wm>	 PROBLEM - puppet last run on elastic1018 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:36] <icinga-wm>	 PROBLEM - puppet last run on labcontrol1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:42] <icinga-wm>	 PROBLEM - puppet last run on mw1232 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:42] <icinga-wm>	 PROBLEM - puppet last run on db1048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:42] <icinga-wm>	 PROBLEM - puppet last run on poolcounter1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:52] <icinga-wm>	 PROBLEM - puppet last run on mw1275 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:52] <icinga-wm>	 PROBLEM - puppet last run on db1105 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:52] <icinga-wm>	 PROBLEM - puppet last run on db1061 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:52] <icinga-wm>	 PROBLEM - puppet last run on ms-be1033 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:45:52] <icinga-wm>	 PROBLEM - puppet last run on bromine is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:02] <icinga-wm>	 PROBLEM - puppet last run on ms-fe1008 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:03] <icinga-wm>	 PROBLEM - puppet last run on db1044 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:03] <icinga-wm>	 PROBLEM - puppet last run on labsdb1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:03] <icinga-wm>	 PROBLEM - puppet last run on wtp1044 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:03] <icinga-wm>	 PROBLEM - puppet last run on labvirt1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:12] <icinga-wm>	 PROBLEM - puppet last run on mw1276 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:12] <icinga-wm>	 PROBLEM - puppet last run on mw1289 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1058 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:13] <icinga-wm>	 PROBLEM - puppet last run on acamar is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:13] <bblack>	 !log nitrogen: disabled puppet agent, manually hacked puppetdb.service unit file, restarted puppetdb.service...
[17:46:22] <icinga-wm>	 PROBLEM - puppet last run on mw1285 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:22] <icinga-wm>	 PROBLEM - puppet last run on db1047 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:46:22] <icinga-wm>	 PROBLEM - puppet last run on mw1260 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:22] <icinga-wm>	 PROBLEM - puppet last run on elastic1052 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:32] <icinga-wm>	 PROBLEM - puppet last run on labstore1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:33] <icinga-wm>	 PROBLEM - puppet last run on ganeti1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:42] <icinga-wm>	 PROBLEM - puppet last run on mw1216 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:42] <icinga-wm>	 PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:42] <icinga-wm>	 PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:43] <icinga-wm>	 PROBLEM - puppet last run on cp3033 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:43] <icinga-wm>	 PROBLEM - puppet last run on cp1048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:55] <icinga-wm>	 PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:55] <icinga-wm>	 PROBLEM - puppet last run on mw1305 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:55] <icinga-wm>	 PROBLEM - puppet last run on fermium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:55] <icinga-wm>	 PROBLEM - puppet last run on mw1245 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:55] <icinga-wm>	 PROBLEM - puppet last run on analytics1064 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:02] <icinga-wm>	 PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:02] <icinga-wm>	 PROBLEM - puppet last run on mw1165 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:02] <icinga-wm>	 PROBLEM - puppet last run on elastic1028 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:02] <icinga-wm>	 PROBLEM - puppet last run on etcd1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:03] <icinga-wm>	 PROBLEM - puppet last run on mw1248 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:03] <icinga-wm>	 PROBLEM - puppet last run on mw1247 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on elastic1043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on cp3040 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on cp3041 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on elastic1039 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1066 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on analytics1063 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:12] <icinga-wm>	 PROBLEM - puppet last run on labtestcontrol2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:13] <icinga-wm>	 PROBLEM - puppet last run on wtp1020 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:22] <icinga-wm>	 PROBLEM - puppet last run on wtp1022 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:24] <icinga-wm>	 PROBLEM - puppet last run on ms-be1018 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:32] <icinga-wm>	 PROBLEM - puppet last run on cp3045 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:32] <icinga-wm>	 PROBLEM - puppet last run on ms-be1035 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:32] <icinga-wm>	 PROBLEM - puppet last run on elastic1050 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:42] <icinga-wm>	 PROBLEM - puppet last run on aqs1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:42] <icinga-wm>	 PROBLEM - puppet last run on mw1225 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:42] <icinga-wm>	 PROBLEM - puppet last run on dbstore1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:52] <icinga-wm>	 PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:52] <icinga-wm>	 PROBLEM - puppet last run on kafka1014 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:52] <icinga-wm>	 PROBLEM - puppet last run on db1084 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:48:02] <icinga-wm>	 PROBLEM - puppet last run on lvs1006 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:48:12] <icinga-wm>	 PROBLEM - puppet last run on kubernetes1002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:48:17] <bblack>	 I got acamar (one of the failed above) to run
[17:48:22] <icinga-wm>	 RECOVERY - puppet last run on acamar is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[17:48:23] <bblack>	 I wonder if things will now settle on their own, or not?
[17:48:34] <bblack>	 (or how delayed these reports are, too)
[17:48:35] <wikibugs>	 (CR) Ayounsi: [C: 2] "Thanks, I'll add IPv6 to the interfaces when we will bring v6 to the services." [dns] - https://gerrit.wikimedia.org/r/367809 (https://phabricator.wikimedia.org/T169643) (owner: Ayounsi)
[17:48:39] <wikibugs>	 (PS2) Ayounsi: Add pfw3-codfw loopback and uplinks IPs to DNS [dns] - https://gerrit.wikimedia.org/r/367809 (https://phabricator.wikimedia.org/T169643)
[17:53:15] <wikibugs>	 Operations, Puppet, Patch-For-Review: PuppetDB misbehaving on 2017-07-15 - https://phabricator.wikimedia.org/T170740#3475569 (BBlack) So, things fell over again with a ton of puppetfail spam.  As a stopgap, I've done the following:  1. Disabled the agent on nitrogen 2. Edited the puppetdb.service sys...
[17:54:46] <wikibugs>	 Operations, ops-codfw: failing RAID disk on frdb2001 - https://phabricator.wikimedia.org/T171584#3475576 (RobH) p:Normal>High a:Papaul I've assigned this to @papaul and moved it into the high priority column on the #ops-codfw workboard.  This is blocked until the disks ordered on T171620 arri...
[17:55:12] <wikibugs>	 Operations, Cloud-VPS: rack/setup/install labtestmetal2001.codfw.wmnet - https://phabricator.wikimedia.org/T168891#3475586 (RobH)
[17:55:34] <wikibugs>	 Operations, Cloud-VPS, Patch-For-Review: rack/setup/install labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T167157#3475589 (RobH)
[17:55:51] <Sagan>	 schana: there you go
[17:56:00] <wikibugs>	 (PS1) Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494)
[17:56:28] <schana>	 thanks Sagan 
[17:56:34] <Sagan>	 you're welcome :)
[17:57:27] <wikibugs>	 (CR) jerkins-bot: [V: -1] statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: Bearloga)
[17:58:52] <icinga-wm>	 RECOVERY - puppet last run on prometheus2003 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[17:58:56] <wikibugs>	 (PS1) Ayounsi: Assign internal IPs to pfw3-codfw<->pfw3-eqiad ipsec link [dns] - https://gerrit.wikimedia.org/r/367933 (https://phabricator.wikimedia.org/T169643)
[17:59:00] <wikibugs>	 (PS2) Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494)
[18:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Respected human, time to deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T1800). Please do the needful.
[18:00:05] <jouncebot>	 RoanKattouw: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be available during the process.
[18:00:18] <wikibugs>	 (CR) jerkins-bot: [V: -1] statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: Bearloga)
[18:00:24] <bearloga>	 blergh
[18:01:52] <icinga-wm>	 RECOVERY - puppet last run on wtp1010 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[18:01:52] <icinga-wm>	 RECOVERY - puppet last run on dbproxy1003 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:01:53] <icinga-wm>	 RECOVERY - puppet last run on helium is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[18:02:03] <icinga-wm>	 PROBLEM - Upload HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0]
[18:02:12] <icinga-wm>	 RECOVERY - puppet last run on mw1256 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:02:12] <icinga-wm>	 RECOVERY - puppet last run on mw1278 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[18:02:13] <icinga-wm>	 RECOVERY - puppet last run on mc1036 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:02:13] <icinga-wm>	 RECOVERY - puppet last run on analytics1056 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[18:02:22] <icinga-wm>	 RECOVERY - puppet last run on puppetmaster1002 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:02:22] <icinga-wm>	 RECOVERY - puppet last run on dbproxy1011 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:02:32] <icinga-wm>	 RECOVERY - puppet last run on db1036 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[18:02:32] <icinga-wm>	 RECOVERY - puppet last run on dbmonitor1001 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:02:33] <icinga-wm>	 RECOVERY - puppet last run on alsafi is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[18:02:42] <icinga-wm>	 RECOVERY - puppet last run on mw1210 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:02:42] <icinga-wm>	 RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:02:52] <icinga-wm>	 RECOVERY - puppet last run on conf1005 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:02:53] <icinga-wm>	 RECOVERY - puppet last run on etcd1001 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:03:02] <icinga-wm>	 RECOVERY - puppet last run on mw1167 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:03:02] <icinga-wm>	 RECOVERY - puppet last run on labweb1002 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:03:03] <icinga-wm>	 RECOVERY - puppet last run on etherpad1001 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:03:03] <icinga-wm>	 RECOVERY - puppet last run on mw1161 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:03:12] <icinga-wm>	 RECOVERY - puppet last run on labvirt1011 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:03:12] <icinga-wm>	 RECOVERY - puppet last run on rdb1005 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:03:15] <Niharika>	 I can SWAT.
[18:03:22] <icinga-wm>	 RECOVERY - puppet last run on ms-fe1006 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:03:22] <framawiki>	 o/
[18:03:23] <icinga-wm>	 RECOVERY - puppet last run on elastic1042 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:03:32] <icinga-wm>	 RECOVERY - puppet last run on wtp1015 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:03:32] <icinga-wm>	 RECOVERY - puppet last run on labvirt1007 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:03:33] <icinga-wm>	 RECOVERY - puppet last run on mw1190 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:03:42] <icinga-wm>	 RECOVERY - puppet last run on analytics1050 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:03:42] <icinga-wm>	 RECOVERY - puppet last run on mwdebug1001 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:03:42] <icinga-wm>	 RECOVERY - puppet last run on cp3037 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[18:03:42] <icinga-wm>	 RECOVERY - puppet last run on graphite1003 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[18:03:42] <icinga-wm>	 RECOVERY - puppet last run on achernar is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:03:43] <icinga-wm>	 RECOVERY - puppet last run on analytics1039 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:03:43] <icinga-wm>	 RECOVERY - puppet last run on iridium is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:03:52] <icinga-wm>	 RECOVERY - puppet last run on db1092 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:03:53] <icinga-wm>	 RECOVERY - puppet last run on labvirt1002 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:03:53] <icinga-wm>	 RECOVERY - puppet last run on mw1303 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[18:03:53] <icinga-wm>	 RECOVERY - puppet last run on relforge1001 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:04:02] <icinga-wm>	 RECOVERY - puppet last run on cp1060 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:04:02] <icinga-wm>	 RECOVERY - puppet last run on cp3034 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[18:04:13] <icinga-wm>	 RECOVERY - puppet last run on mw1240 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[18:04:13] <icinga-wm>	 RECOVERY - puppet last run on mw1246 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:04:13] <icinga-wm>	 RECOVERY - puppet last run on mw1168 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[18:04:19] <RoanKattouw>	 I'm here
[18:04:22] <icinga-wm>	 RECOVERY - puppet last run on db1075 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:04:22] <icinga-wm>	 RECOVERY - puppet last run on lvs1005 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[18:04:22] <icinga-wm>	 RECOVERY - puppet last run on cp1066 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:04:22] <icinga-wm>	 RECOVERY - puppet last run on pc1006 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[18:04:32] <icinga-wm>	 RECOVERY - puppet last run on dbproxy1001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:04:32] <icinga-wm>	 RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:04:32] <icinga-wm>	 RECOVERY - puppet last run on uranium is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:04:42] <icinga-wm>	 RECOVERY - puppet last run on mw1218 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:04:43] <icinga-wm>	 RECOVERY - puppet last run on maps1003 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:04:43] <icinga-wm>	 RECOVERY - puppet last run on francium is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:04:43] <icinga-wm>	 RECOVERY - puppet last run on labcontrol1002 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[18:04:43] <icinga-wm>	 RECOVERY - puppet last run on db1030 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[18:04:52] <icinga-wm>	 RECOVERY - puppet last run on rdb1003 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[18:04:52] <icinga-wm>	 RECOVERY - puppet last run on ms-be1037 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:04:52] <icinga-wm>	 RECOVERY - puppet last run on db1082 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:05:02] <icinga-wm>	 RECOVERY - puppet last run on aqs1007 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:05:12] <icinga-wm>	 RECOVERY - puppet last run on ores1001 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:05:12] <icinga-wm>	 RECOVERY - puppet last run on ores1006 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:05:13] <icinga-wm>	 RECOVERY - puppet last run on mw1241 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[18:05:13] <icinga-wm>	 RECOVERY - puppet last run on elastic1046 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:05:13] <icinga-wm>	 RECOVERY - puppet last run on mw1250 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:05:13] <icinga-wm>	 RECOVERY - puppet last run on mw1300 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[18:05:22] <icinga-wm>	 RECOVERY - puppet last run on ms1001 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:05:22] <icinga-wm>	 RECOVERY - puppet last run on es1014 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[18:05:32] <icinga-wm>	 RECOVERY - puppet last run on ms-be3004 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:05:32] <icinga-wm>	 RECOVERY - puppet last run on db1026 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:05:32] <icinga-wm>	 RECOVERY - puppet last run on maps1001 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:05:42] <icinga-wm>	 RECOVERY - puppet last run on relforge1002 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[18:05:42] <icinga-wm>	 RECOVERY - puppet last run on db1034 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:05:43] <icinga-wm>	 RECOVERY - puppet last run on kubestage1001 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:05:43] <icinga-wm>	 RECOVERY - puppet last run on praseodymium is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:05:52] <icinga-wm>	 RECOVERY - puppet last run on analytics1002 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:06:02] <icinga-wm>	 RECOVERY - puppet last run on db1070 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:06:02] <icinga-wm>	 RECOVERY - puppet last run on lvs1001 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[18:06:02] <icinga-wm>	 RECOVERY - puppet last run on mw1271 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:06:03] <icinga-wm>	 RECOVERY - puppet last run on cp1073 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:06:12] <icinga-wm>	 RECOVERY - puppet last run on ganeti1007 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[18:06:12] <icinga-wm>	 RECOVERY - puppet last run on labweb1001 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:06:12] <icinga-wm>	 RECOVERY - puppet last run on db1053 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[18:06:12] <icinga-wm>	 RECOVERY - puppet last run on mw1302 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:06:12] <icinga-wm>	 RECOVERY - puppet last run on cp1051 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:06:22] <icinga-wm>	 RECOVERY - puppet last run on alcyone is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[18:06:22] <icinga-wm>	 RECOVERY - puppet last run on contint2001 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[18:06:22] <icinga-wm>	 RECOVERY - puppet last run on db1065 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:06:32] <icinga-wm>	 RECOVERY - puppet last run on wtp1040 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[18:06:33] <icinga-wm>	 RECOVERY - puppet last run on cp1062 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:06:43] <icinga-wm>	 RECOVERY - puppet last run on kafka1013 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:06:51] <wikibugs>	 (PS2) Niharika29: Create 'rollbacker' user group in frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/365538 (https://phabricator.wikimedia.org/T170780) (owner: Framawiki)
[18:06:52] <icinga-wm>	 RECOVERY - puppet last run on ms-be1025 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on restbase1011 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on kafka1020 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on etcd1004 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on mw1191 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:07:02] <icinga-wm>	 RECOVERY - puppet last run on mw1268 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:07:03] <icinga-wm>	 RECOVERY - puppet last run on ores1007 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:07:03] <icinga-wm>	 RECOVERY - puppet last run on es1013 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:07:03] <icinga-wm>	 RECOVERY - puppet last run on cp3030 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:07:04] <icinga-wm>	 RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:07:04] <icinga-wm>	 RECOVERY - puppet last run on ms-be1031 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[18:07:08] <wikibugs>	 (CR) Niharika29: [C: 2] Create 'rollbacker' user group in frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/365538 (https://phabricator.wikimedia.org/T170780) (owner: Framawiki)
[18:07:12] <icinga-wm>	 RECOVERY - puppet last run on db1076 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:07:13] <icinga-wm>	 RECOVERY - puppet last run on mc1019 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:07:13] <icinga-wm>	 RECOVERY - puppet last run on maps1004 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:07:22] <icinga-wm>	 RECOVERY - puppet last run on wtp1018 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:07:22] <icinga-wm>	 RECOVERY - puppet last run on wtp1002 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[18:07:22] <icinga-wm>	 RECOVERY - puppet last run on dbproxy1006 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:07:27] <Niharika>	 Shush, icinga-wm.
[18:07:32] <icinga-wm>	 RECOVERY - puppet last run on bast1001 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:07:33] <icinga-wm>	 RECOVERY - puppet last run on mw1185 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[18:07:33] <icinga-wm>	 RECOVERY - puppet last run on wtp1029 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[18:07:33] <icinga-wm>	 RECOVERY - puppet last run on labsdb1006 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:07:33] <icinga-wm>	 RECOVERY - puppet last run on mw1223 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:07:42] <icinga-wm>	 RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[18:07:43] <icinga-wm>	 RECOVERY - puppet last run on labtestweb2001 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:07:52] <icinga-wm>	 RECOVERY - puppet last run on kafka1002 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:07:52] <icinga-wm>	 RECOVERY - puppet last run on xenon is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[18:07:52] <icinga-wm>	 RECOVERY - puppet last run on mw1227 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[18:07:52] <icinga-wm>	 RECOVERY - puppet last run on notebook1002 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[18:07:53] <icinga-wm>	 RECOVERY - puppet last run on phab2001 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:08:02] <icinga-wm>	 RECOVERY - puppet last run on labsdb1001 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:08:02] <icinga-wm>	 RECOVERY - puppet last run on sodium is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:08:03] <icinga-wm>	 RECOVERY - puppet last run on hassium is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[18:08:03] <icinga-wm>	 RECOVERY - puppet last run on aqs1008 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:08:12] <icinga-wm>	 RECOVERY - puppet last run on mw1267 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:08:12] <icinga-wm>	 RECOVERY - puppet last run on mw1265 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:08:12] <icinga-wm>	 RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:08:13] <icinga-wm>	 RECOVERY - puppet last run on mw1254 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:08:13] <icinga-wm>	 RECOVERY - puppet last run on rhenium is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[18:08:22] <icinga-wm>	 RECOVERY - puppet last run on rdb1004 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:08:23] <icinga-wm>	 RECOVERY - puppet last run on mw1244 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:08:23] <icinga-wm>	 RECOVERY - puppet last run on mw1197 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[18:08:23] <icinga-wm>	 RECOVERY - puppet last run on chlorine is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[18:08:23] <icinga-wm>	 RECOVERY - puppet last run on stat1004 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:08:23] <icinga-wm>	 RECOVERY - puppet last run on logstash1003 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:08:32] <icinga-wm>	 RECOVERY - puppet last run on analytics1065 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:08:32] <icinga-wm>	 RECOVERY - puppet last run on californium is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:08:32] <icinga-wm>	 RECOVERY - puppet last run on db1001 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[18:08:32] <icinga-wm>	 RECOVERY - puppet last run on dbstore1002 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[18:08:33] <icinga-wm>	 RECOVERY - puppet last run on cp3044 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:08:33] <icinga-wm>	 RECOVERY - puppet last run on mw1187 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:08:33] <icinga-wm>	 RECOVERY - puppet last run on ms-fe3002 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[18:08:34] <icinga-wm>	 RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[18:08:42] <icinga-wm>	 RECOVERY - puppet last run on elastic1051 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:08:42] <icinga-wm>	 RECOVERY - puppet last run on labsdb1005 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:08:43] <icinga-wm>	 RECOVERY - puppet last run on wtp1025 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[18:08:43] <icinga-wm>	 RECOVERY - puppet last run on wtp1038 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:08:43] <icinga-wm>	 RECOVERY - puppet last run on cp3035 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[18:08:43] <icinga-wm>	 RECOVERY - puppet last run on ms-be1028 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:08:52] <icinga-wm>	 RECOVERY - puppet last run on elastic1021 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:08:52] <icinga-wm>	 RECOVERY - puppet last run on ms-be1016 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[18:09:02] <icinga-wm>	 RECOVERY - puppet last run on oxygen is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:09:02] <icinga-wm>	 RECOVERY - puppet last run on sca1004 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:09:02] <icinga-wm>	 RECOVERY - puppet last run on db1066 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:09:03] <icinga-wm>	 RECOVERY - puppet last run on ores1009 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:09:03] <icinga-wm>	 RECOVERY - puppet last run on prometheus1003 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:09:12] <icinga-wm>	 RECOVERY - puppet last run on ms-be1030 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:09:12] <icinga-wm>	 RECOVERY - puppet last run on oresrdb1002 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:09:12] <icinga-wm>	 RECOVERY - puppet last run on cp1074 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:09:13] <wikibugs>	 (PS3) Bearloga: statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494)
[18:09:13] <icinga-wm>	 RECOVERY - puppet last run on wasat is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:09:22] <icinga-wm>	 RECOVERY - puppet last run on cp3006 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[18:09:22] <icinga-wm>	 RECOVERY - puppet last run on mw1200 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[18:09:22] <icinga-wm>	 RECOVERY - puppet last run on mw1283 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:09:22] <icinga-wm>	 RECOVERY - puppet last run on mw1255 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[18:09:22] <icinga-wm>	 RECOVERY - puppet last run on es1019 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:09:23] <icinga-wm>	 RECOVERY - puppet last run on rdb1007 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:09:23] <icinga-wm>	 RECOVERY - puppet last run on radon is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:09:23] <icinga-wm>	 RECOVERY - puppet last run on oresrdb1001 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:09:28] <wikibugs>	 (Merged) jenkins-bot: Create 'rollbacker' user group in frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/365538 (https://phabricator.wikimedia.org/T170780) (owner: Framawiki)
[18:09:32] <icinga-wm>	 RECOVERY - puppet last run on mw1282 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:09:32] <icinga-wm>	 RECOVERY - puppet last run on mw1259 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[18:09:32] <icinga-wm>	 RECOVERY - puppet last run on analytics1062 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:09:33] <icinga-wm>	 RECOVERY - puppet last run on db1091 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:09:37] <wikibugs>	 (CR) jenkins-bot: Create 'rollbacker' user group in frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/365538 (https://phabricator.wikimedia.org/T170780) (owner: Framawiki)
[18:09:42] <icinga-wm>	 RECOVERY - puppet last run on wtp1006 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:09:52] <icinga-wm>	 RECOVERY - puppet last run on eventlog1001 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:09:52] <icinga-wm>	 RECOVERY - puppet last run on wtp1037 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:09:52] <icinga-wm>	 RECOVERY - puppet last run on copper is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:09:52] <icinga-wm>	 RECOVERY - puppet last run on cp1099 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:09:53] <icinga-wm>	 RECOVERY - puppet last run on db1011 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:10:02] <icinga-wm>	 RECOVERY - puppet last run on poolcounter1001 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:10:02] <icinga-wm>	 RECOVERY - puppet last run on ms-be1022 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[18:10:02] <icinga-wm>	 RECOVERY - puppet last run on kafka1001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:10:03] <icinga-wm>	 RECOVERY - puppet last run on cobalt is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:10:03] <icinga-wm>	 RECOVERY - puppet last run on kubestage1002 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:10:03] <icinga-wm>	 RECOVERY - puppet last run on labtestservices2003 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:10:12] <icinga-wm>	 RECOVERY - puppet last run on mc1031 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:10:12] <icinga-wm>	 RECOVERY - puppet last run on mw1269 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:10:12] <icinga-wm>	 RECOVERY - puppet last run on mw1263 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:10:12] <icinga-wm>	 RECOVERY - puppet last run on db1097 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:10:12] <icinga-wm>	 RECOVERY - puppet last run on thorium is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:10:22] <icinga-wm>	 RECOVERY - puppet last run on mw1299 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on mw1251 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on mw1252 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on mw1206 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on elastic1038 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on analytics1057 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on wdqs1002 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on conf1001 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[18:10:24] <icinga-wm>	 RECOVERY - puppet last run on ganeti1003 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:10:25] <icinga-wm>	 RECOVERY - puppet last run on cp1050 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:10:25] <icinga-wm>	 RECOVERY - puppet last run on ms-be1017 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:10:26] <icinga-wm>	 RECOVERY - puppet last run on snapshot1006 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:10:32] <icinga-wm>	 RECOVERY - puppet last run on labvirt1017 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[18:10:32] <icinga-wm>	 RECOVERY - puppet last run on analytics1046 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:10:32] <icinga-wm>	 RECOVERY - puppet last run on wtp1048 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:10:42] <icinga-wm>	 RECOVERY - puppet last run on ganeti1001 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:10:42] <icinga-wm>	 RECOVERY - puppet last run on analytics1030 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:11:02] <icinga-wm>	 RECOVERY - puppet last run on mw1239 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[18:11:02] <icinga-wm>	 RECOVERY - puppet last run on db1071 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[18:11:02] <icinga-wm>	 RECOVERY - puppet last run on db1085 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[18:11:03] <icinga-wm>	 RECOVERY - puppet last run on db1043 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:11:12] <icinga-wm>	 RECOVERY - Upload HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[18:11:13] <icinga-wm>	 RECOVERY - puppet last run on elastic1027 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on db1072 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on mc1035 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on analytics1048 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on ms-be1034 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on scb1001 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:11:22] <icinga-wm>	 RECOVERY - puppet last run on cerium is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:11:23] <icinga-wm>	 RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[18:11:32] <icinga-wm>	 RECOVERY - puppet last run on labcontrol1003 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:11:32] <icinga-wm>	 RECOVERY - puppet last run on analytics1061 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[18:11:32] <icinga-wm>	 RECOVERY - puppet last run on ruthenium is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[18:11:32] <icinga-wm>	 RECOVERY - puppet last run on ores1005 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:11:33] <icinga-wm>	 RECOVERY - puppet last run on mw1229 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[18:11:43] <icinga-wm>	 RECOVERY - puppet last run on bast3002 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:11:52] <icinga-wm>	 RECOVERY - puppet last run on db1054 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[18:11:53] <icinga-wm>	 RECOVERY - puppet last run on rhodium is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[18:12:02] <icinga-wm>	 RECOVERY - puppet last run on mw1281 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:12:04] <icinga-wm>	 RECOVERY - puppet last run on labvirt1005 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:12:04] <icinga-wm>	 RECOVERY - puppet last run on dysprosium is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:12:12] <icinga-wm>	 RECOVERY - puppet last run on db1099 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:12:13] <icinga-wm>	 RECOVERY - puppet last run on ocg1003 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:12:13] <icinga-wm>	 RECOVERY - puppet last run on elastic1029 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:12:13] <icinga-wm>	 RECOVERY - puppet last run on mw1209 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[18:12:22] <icinga-wm>	 RECOVERY - puppet last run on analytics1067 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[18:12:22] <icinga-wm>	 RECOVERY - puppet last run on mw1184 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:12:22] <icinga-wm>	 RECOVERY - puppet last run on analytics1034 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:12:23] <icinga-wm>	 RECOVERY - puppet last run on mw1296 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:12:23] <icinga-wm>	 RECOVERY - puppet last run on mw1297 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[18:12:23] <icinga-wm>	 RECOVERY - puppet last run on db1035 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[18:12:32] <icinga-wm>	 RECOVERY - puppet last run on logstash1004 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[18:12:32] <icinga-wm>	 RECOVERY - puppet last run on thumbor1004 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:12:32] <icinga-wm>	 RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[18:12:32] <icinga-wm>	 RECOVERY - puppet last run on graphite1001 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:12:40] <wikibugs>	 (CR) jerkins-bot: [V: -1] statistics::discovery: Reconfigure for Golden data retrieval [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: Bearloga)
[18:12:42] <icinga-wm>	 RECOVERY - puppet last run on stat1006 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[18:12:42] <icinga-wm>	 RECOVERY - puppet last run on wtp1046 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:12:42] <icinga-wm>	 RECOVERY - puppet last run on db1096 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:12:42] <icinga-wm>	 RECOVERY - puppet last run on mw1162 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[18:12:52] <icinga-wm>	 RECOVERY - puppet last run on mw1231 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:12:53] <icinga-wm>	 RECOVERY - puppet last run on stat1002 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:12:53] <icinga-wm>	 RECOVERY - puppet last run on rdb1008 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:13:02] <icinga-wm>	 RECOVERY - puppet last run on db1039 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:13:02] <icinga-wm>	 RECOVERY - puppet last run on mw1202 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[18:13:02] <icinga-wm>	 RECOVERY - puppet last run on mw1207 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:13:03] <icinga-wm>	 RECOVERY - puppet last run on kubestagetcd1003 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[18:13:12] <icinga-wm>	 RECOVERY - puppet last run on mwdebug1002 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:13:12] <icinga-wm>	 RECOVERY - puppet last run on cp3047 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[18:13:13] <icinga-wm>	 RECOVERY - puppet last run on cp1046 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:13:22] <icinga-wm>	 RECOVERY - puppet last run on darmstadtium is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:13:32] <icinga-wm>	 RECOVERY - puppet last run on aqs1005 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:13:33] <icinga-wm>	 RECOVERY - puppet last run on ms-be1015 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:13:42] <icinga-wm>	 RECOVERY - puppet last run on ms-be3003 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[18:13:42] <icinga-wm>	 RECOVERY - puppet last run on mc1023 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[18:13:43] <icinga-wm>	 RECOVERY - puppet last run on elastic1018 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[18:13:52] <icinga-wm>	 RECOVERY - puppet last run on ms-be1029 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[18:13:52] <icinga-wm>	 RECOVERY - puppet last run on labcontrol1001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[18:13:52] <icinga-wm>	 RECOVERY - puppet last run on db1041 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[18:13:52] <icinga-wm>	 RECOVERY - puppet last run on mw1232 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[18:13:52] <icinga-wm>	 RECOVERY - puppet last run on wtp1003 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[18:13:53] <icinga-wm>	 RECOVERY - puppet last run on db1048 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:13:53] <icinga-wm>	 RECOVERY - puppet last run on cp1048 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[18:14:02] <icinga-wm>	 RECOVERY - puppet last run on scb1003 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[18:14:03] <icinga-wm>	 RECOVERY - puppet last run on db1105 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[18:14:12] <icinga-wm>	 RECOVERY - puppet last run on bromine is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[18:14:12] <icinga-wm>	 RECOVERY - puppet last run on mw1211 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:14:12] <icinga-wm>	 RECOVERY - puppet last run on analytics1052 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:14:13] <icinga-wm>	 RECOVERY - puppet last run on labsdb1004 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:14:13] <icinga-wm>	 RECOVERY - puppet last run on wtp1044 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:14:22] <icinga-wm>	 RECOVERY - puppet last run on db1087 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:14:23] <icinga-wm>	 RECOVERY - puppet last run on mw1289 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[18:14:23] <icinga-wm>	 RECOVERY - puppet last run on analytics1058 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[18:14:32] <icinga-wm>	 RECOVERY - puppet last run on analytics1003 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[18:14:32] <Niharika>	 Gah, I'm sorry, someone will have to take over the SWAT. This wifi network doesn't let me ssh in anywhere. :( I should have checked before I volunteered. 
[18:14:32] <icinga-wm>	 RECOVERY - puppet last run on thumbor1003 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[18:14:33] <icinga-wm>	 RECOVERY - puppet last run on mw1260 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[18:14:33] <icinga-wm>	 RECOVERY - puppet last run on elastic1052 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[18:14:42] <icinga-wm>	 RECOVERY - puppet last run on labstore1003 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[18:14:43] <Niharika>	 Or wait.
[18:14:52] <icinga-wm>	 RECOVERY - puppet last run on ganeti1004 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:14:52] <icinga-wm>	 RECOVERY - puppet last run on aqs1004 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[18:14:52] <icinga-wm>	 RECOVERY - puppet last run on mw1216 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[18:14:56] <Niharika>	 Oh, I'm in. Never mind.
[18:15:02] <icinga-wm>	 RECOVERY - puppet last run on cp1054 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[18:15:02] <icinga-wm>	 RECOVERY - puppet last run on poolcounter1002 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:15:02] <icinga-wm>	 RECOVERY - puppet last run on mw1275 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:15:03] <icinga-wm>	 RECOVERY - puppet last run on fermium is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[18:15:12] <icinga-wm>	 RECOVERY - puppet last run on db1061 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:15:12] <icinga-wm>	 RECOVERY - puppet last run on cp3033 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:15:12] <icinga-wm>	 RECOVERY - puppet last run on elastic1028 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:15:12] <icinga-wm>	 RECOVERY - puppet last run on mw1215 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:15:12] <icinga-wm>	 RECOVERY - puppet last run on ms-fe1008 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[18:15:13] <icinga-wm>	 RECOVERY - puppet last run on ms-be1033 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:15:13] <icinga-wm>	 RECOVERY - puppet last run on db1044 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:15:13] <icinga-wm>	 RECOVERY - puppet last run on etcd1006 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[18:15:13] <icinga-wm>	 RECOVERY - puppet last run on labtestservices2001 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:15:22] <icinga-wm>	 RECOVERY - puppet last run on labvirt1001 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on mw1276 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on elastic1043 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on elastic1039 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on analytics1063 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on kubernetes1002 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:15:23] <icinga-wm>	 RECOVERY - puppet last run on wtp1020 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on labtestcontrol2001 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on db1047 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on mw1285 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on cp3040 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on cp3041 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[18:15:32] <icinga-wm>	 RECOVERY - puppet last run on wtp1022 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:15:52] <icinga-wm>	 RECOVERY - puppet last run on elastic1050 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[18:15:52] <icinga-wm>	 RECOVERY - puppet last run on ms-be1035 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[18:15:52] <icinga-wm>	 RECOVERY - puppet last run on mw1225 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[18:15:52] <icinga-wm>	 RECOVERY - puppet last run on analytics1036 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[18:16:02] <icinga-wm>	 RECOVERY - puppet last run on dbstore1001 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[18:16:02] <icinga-wm>	 RECOVERY - puppet last run on db1037 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[18:16:02] <icinga-wm>	 RECOVERY - puppet last run on mw1305 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[18:16:03] <icinga-wm>	 RECOVERY - puppet last run on db1084 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[18:16:03] <icinga-wm>	 RECOVERY - puppet last run on lvs1004 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:16:12] <icinga-wm>	 RECOVERY - puppet last run on mw1245 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[18:16:12] <icinga-wm>	 RECOVERY - puppet last run on analytics1064 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[18:16:12] <icinga-wm>	 RECOVERY - puppet last run on mw1165 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[18:16:12] <icinga-wm>	 RECOVERY - puppet last run on lvs1006 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[18:16:22] <icinga-wm>	 RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[18:16:23] <icinga-wm>	 RECOVERY - puppet last run on mw1247 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[18:16:24] <icinga-wm>	 RECOVERY - puppet last run on analytics1066 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:16:32] <icinga-wm>	 RECOVERY - puppet last run on mw1294 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[18:16:36] <Niharika>	 framawiki: Can you check your changes on mwdebug1002? They're there. 
[18:16:42] <icinga-wm>	 RECOVERY - puppet last run on ms-be1018 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[18:16:49] <framawiki>	 Niharika: i'm on it
[18:16:52] <icinga-wm>	 RECOVERY - puppet last run on cp3045 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:17:03] <icinga-wm>	 RECOVERY - puppet last run on kafka1014 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[18:17:44] <framawiki>	 Niharika: good for me on mwdebug1002
[18:17:54] <Niharika>	 framawiki: Ack. 
[18:18:38] <Niharika>	 RoanKattouw: Hey there. https://gerrit.wikimedia.org/r/#/c/367833/ is on mwdebug1002. 
[18:19:01] <RoanKattouw>	 Cool, checking
[18:20:33] <RoanKattouw>	 Niharika: Looks like it's working, but it also needs https://gerrit.wikimedia.org/r/#/c/367850 otherwise it'll be worse than what's there currently
[18:20:53] <logmsgbot>	 !log niharika29@tin Synchronized wmf-config/InitialiseSettings.php: Create 'rollbacker' user group in frwiki https://gerrit.wikimedia.org/r/#/c/365538/ (duration: 00m 47s)
[18:21:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:21:12] <Niharika>	 framawiki: All synced. 
[18:21:17] <Niharika>	 RoanKattouw: I'm on it. 
[18:22:22] <framawiki>	 Niharika: :) thx !
[18:23:00] <Niharika>	 framawiki: What's the deal with https://gerrit.wikimedia.org/r/#/c/341267/ ?
[18:23:15] <Niharika>	 I read the discussion briefly.
[18:23:45] <framawiki>	 Nikerabbit: looks like it's not for today.
[18:24:06] <framawiki>	 I'll remove this from wikitech deployments page
[18:24:34] <Niharika>	 If I had a penny for every time someone pinged Niklas when they meant to ping me...
[18:24:42] <Niharika>	 framawiki: Okay, cool. 
[18:25:06] <Sagan>	 lol
[18:28:14] <wikibugs>	 Operations, Puppet, Traffic, Mobile, and 2 others: URLs with title query string parameter and additional query string parameters do not redirect to mobile site - https://phabricator.wikimedia.org/T154227#3475647 (Dzahn)
[18:31:37] <Niharika>	 RoanKattouw: https://gerrit.wikimedia.org/r/#/c/367837/ is on mwdebug1002 as well. Do you also want https://gerrit.wikimedia.org/r/#/c/367850/ to be able to test this one?
[18:32:06] <RoanKattouw>	 No, I can test that separately
[18:33:08] <RoanKattouw>	 Niharika: It works
[18:34:43] <Niharika>	 RoanKattouw: Syncing...
[18:35:07] <wikibugs>	 (PS1) Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556)
[18:35:15] <logmsgbot>	 !log niharika29@tin Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: RCFilters: Improve loading animation https://gerrit.wikimedia.org/r/#/c/367833/, RCFilters UI: Unbreak limit and days widgets in non-experimental mode https://gerrit.wikimedia.org/r/#/c/367837/ (duration: 00m 45s)
[18:35:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:35:29] <jdlrobson>	 room for one more swat, Niharika? https://gerrit.wikimedia.org/r/367938 Update several Wikipedia projects to existing wordmarks
[18:35:34] <Niharika>	 Both of those done.
[18:35:41] <Niharika>	 Yup, jdlrobson. Add it to the calendar. 
[18:35:57] <jdlrobson>	 thank you and done :)
[18:36:59] <wikibugs>	 (CR) Niharika29: [C: 2] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:38:01] <jdlrobson>	 oh shoot wait
[18:38:13] <wikibugs>	 (PS2) Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556)
[18:38:15] <jdlrobson>	 i forgot to check in the dblists... ^
[18:38:47] <jdlrobson>	 ^ Niharika 
[18:39:41] <wikibugs>	 (CR) Niharika29: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:39:59] <wikibugs>	 (CR) jerkins-bot: [V: -1] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:40:37] <Niharika>	 RoanKattouw: And https://gerrit.wikimedia.org/r/#/c/367850/ is on mwdebug1002 too. 
[18:40:43] <Niharika>	 jdlrobson: ^^
[18:41:01] <jdlrobson>	 Niharika: what about the -1 above?
[18:41:12] <wikibugs>	 (CR) jerkins-bot: [V: -1] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:41:24] <Niharika>	 jdlrobson: Yeah, the -1. 
[18:41:34] <jdlrobson>	 something's broke..
[18:41:49] <Niharika>	 NocDblistTest::testNocDblists 18:39:58 Failed asserting that two arrays are equal.
[18:43:58] <jdlrobson>	 Niharika: that's weird.. is that for ps1 or ps2?
[18:44:06] <wikibugs>	 (CR) Jdlrobson: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:44:36] <Niharika>	 The first one got a +2 from Jenkins. 
[18:46:15] <jdlrobson>	 can replicate locally.. so exploring what's going on here
[18:46:20] <wikibugs>	 (CR) Niharika29: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:46:34] <wikibugs>	 (CR) jerkins-bot: [V: -1] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:47:46] <RoanKattouw>	 Niharika: Looks good
[18:47:56] <Niharika>	 Alright then. 
[18:48:27] <jdlrobson>	 anyone know what docroot/noc/conf is for?
[18:48:46] <jdlrobson>	 RainbowSprinkles: ?
[18:49:26] <greg-g>	 jdlrobson: https://noc.wikimedia.org/
[18:49:30] <jdlrobson>	 it looks like the process for adding dblists changed.. https://gerrit.wikimedia.org/r/#/c/367938/2 and im not sure what i need to do
[18:50:26] <logmsgbot>	 !log niharika29@tin Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: RCFilters: Followup I78e23f85c3: Don't disable RCFilters system when fetching results https://gerrit.wikimedia.org/r/#/c/367850/ (duration: 00m 46s)
[18:50:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:50:36] <Niharika>	 RoanKattouw: Done^
[18:51:45] <Niharika>	 jdlrobson: Um, should we take it off this SWAT?
[18:53:51] <RainbowSprinkles>	 I don't like that patch
[18:53:54] <RainbowSprinkles>	 It looks weird
[18:53:57] <RainbowSprinkles>	 Plz remove
[18:54:11] <jdlrobson>	 RainbowSprinkles: what's weird about it? what am i doing wrong?
[18:54:38] <wikibugs>	 (PS3) Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556)
[18:55:00] <RainbowSprinkles>	 jdlrobson: It just feels weird. I haven't even looked at jenkins failing yet.
[18:55:14] <RainbowSprinkles>	 Swat ends in 5 and I have train window, let's just boot it for now
[18:55:26] <jdlrobson>	 sure, but can you articulate more?  I will be swatting this at 4pm today if not now
[18:55:45] <jdlrobson>	 so i need to understand what's weird about it. I'm trying to avoid having 20 identical lines
[18:55:49] <jdlrobson>	 (in config)
[18:56:01] <wikibugs>	 (CR) jerkins-bot: [V: -1] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:56:18] <jdlrobson>	 dblist seems appropriate in this case, or if not that a programmatic approach e.g. if lang=='hu' use the fr value
[18:56:50] <jdlrobson>	 I'm not familiar with NocDblistTest::testNocDblists so I'm not sure what it's testing..
[18:58:09] <wikibugs>	 (PS4) Jdlrobson: Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556)
[18:58:09] <Reedy>	 I think it's making sure all dblists are shown on https://noc.wikimedia.org/conf/
[18:58:18] <wikibugs>	 (CR) Jdlrobson: "dblist seems appropriate in this case, or if not that a programmatic approach e.g. if lang=='hu' use the fr value" [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[18:59:16] <RainbowSprinkles>	 jdlrobson: Well, they're not alphabetized for oneeeee
[18:59:24] <RainbowSprinkles>	 And we don't really do any wikipedia-* style ones
[18:59:28] <RainbowSprinkles>	 But that could be me nitpicking
[18:59:30] <RainbowSprinkles>	 Anyway
[18:59:41] * RainbowSprinkles orders lunch, chugs rest of coffee, puts on train conductor hat
[19:00:01] <wikibugs>	 (CR) jerkins-bot: [V: -1] Update several Wikipedia projects to existing wordmarks [mediawiki-config] - https://gerrit.wikimedia.org/r/367938 (https://phabricator.wikimedia.org/T171556) (owner: Jdlrobson)
[19:00:04] <jouncebot>	 RainbowSprinkles: Dear anthropoid, the time has come. Please deploy MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T1900).
[19:00:57] <RainbowSprinkles>	 Choo choo
[19:01:05] <gehel>	 !log depooling wdqs1001 for data reload - T166244
[19:01:07] <logmsgbot>	 !log gehel@puppetmaster1001 conftool action : set/pooled=no; selector: name=wdqs1001.wmnet
[19:01:13] <logmsgbot>	 !log gehel@puppetmaster1001 conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
[19:01:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:01:15] <stashbot>	 T166244: Reload WDQS data after T131960 is merged - https://phabricator.wikimedia.org/T166244
[19:01:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:01:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:06:06] <bblack>	 !log cp1074: run-no-puppet varnish-backend-restart (mailbox lag in icinga)
[19:06:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:12:36] <wikibugs>	 (PS2) Chad: Group1 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/367918
[19:13:18] <mutante>	 jdlrobson: the "NocDBlistTest" checks if there are links in both the ./dblists/ and ./docroot/noc/conf/ directories
[19:13:26] <mutante>	 https://github.com/wikimedia/operations-mediawiki-config/blob/master/tests/noc-conf/NOCDblistTest.php
[19:13:44] <mutante>	 eh, i meant "if there are links from both directories to a third place", i guess
[19:15:32] <icinga-wm>	 RECOVERY - Check Varnish expiry mailbox lag on cp1074 is OK: OK: expiry mailbox lag is 0
[19:26:48] <wikibugs>	 (CR) Chad: [C: 2] Group1 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/367918 (owner: Chad)
[19:28:08] <wikibugs>	 (Merged) jenkins-bot: Group1 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/367918 (owner: Chad)
[19:28:19] <wikibugs>	 (CR) jenkins-bot: Group1 to wmf.11 [mediawiki-config] - https://gerrit.wikimedia.org/r/367918 (owner: Chad)
[19:30:50] <mutante>	 !log mx1001 - temp disable puppet to test adjusted sudo privileges for an icinga check
[19:30:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:35:31] <logmsgbot>	 !log demon@tin Started scap: group1 to wmf.11
[19:35:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:39:14] <wikibugs>	 (PS1) Eevans: Enable prometheus jmx exporter in dev environment [puppet] - https://gerrit.wikimedia.org/r/367952 (https://phabricator.wikimedia.org/T171772)
[19:40:39] <wikibugs>	 (Draft2) محمد شعیب: Add urdu logo to mobile site [mediawiki-config] - https://gerrit.wikimedia.org/r/367946 (https://phabricator.wikimedia.org/T171769)
[19:47:35] <logmsgbot>	 !log demon@tin scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
[19:47:35] <logmsgbot>	 !log demon@tin scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details) (duration: 12m 03s)
[19:47:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:47:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:48:15] <logmsgbot>	 !log demon@tin Started scap: group1 to wmf.11
[19:48:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:50:56] <wikibugs>	 (CR) Eevans: [C: 1] "Enables the Prometheus agent in the dev environment only ([see puppet compiler output](http://puppet-compiler.wmflabs.org/7172))" [puppet] - https://gerrit.wikimedia.org/r/367952 (https://phabricator.wikimedia.org/T171772) (owner: Eevans)
[19:57:10] <wikibugs>	 (CR) Dzahn: [C: 2] "thanks for adding compiler link. merging per "dev only"" [puppet] - https://gerrit.wikimedia.org/r/367952 (https://phabricator.wikimedia.org/T171772) (owner: Eevans)
[19:57:39] <wikibugs>	 (PS2) Dzahn: restbase: Enable prometheus jmx exporter in dev environment [puppet] - https://gerrit.wikimedia.org/r/367952 (https://phabricator.wikimedia.org/T171772) (owner: Eevans)
[20:00:04] <jouncebot>	 gwicke, cscott, arlolra, subbu, bearND, halfak, and Amir1: Respected human, time to deploy Services – Parsoid / OCG / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T2000). Please do the needful.
[20:00:18] <wikibugs>	 (CR) Mobrovac: [C: 1] restbase: Enable prometheus jmx exporter in dev environment [puppet] - https://gerrit.wikimedia.org/r/367952 (https://phabricator.wikimedia.org/T171772) (owner: Eevans)
[20:01:37] <logmsgbot>	 !log demon@tin Finished scap: group1 to wmf.11 (duration: 13m 22s)
[20:01:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:04:43] <icinga-wm>	 RECOVERY - MariaDB Slave Lag: s3 on dbstore1001 is OK: OK slave_sql_lag Replication lag: 89976.84 seconds
[20:05:16] <subbu>	 no parsoid deploy today
[20:06:59] <urandom>	 mutante: thanks!
[20:08:04] <mutante>	 urandom: yw. and now it's actually submitted
[20:08:19] <logmsgbot>	 !log mobrovac@tin Started deploy [cxserver/deploy@f43ef96]: Switch node_modules to node v6.11
[20:08:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:09:06] <logmsgbot>	 !log mobrovac@tin Started deploy [citoid/deploy@43c2776]: Switch node_modules to Node v6.11
[20:09:09] <logmsgbot>	 !log demon@tin Started scap: no-op, ideal timing scenario
[20:09:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:09:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:10:55] <logmsgbot>	 !log mobrovac@tin Finished deploy [cxserver/deploy@f43ef96]: Switch node_modules to node v6.11 (duration: 02m 36s)
[20:11:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:11:27] <logmsgbot>	 !log mobrovac@tin Started deploy [graphoid/deploy@1707b3c]: Switch node_modules to node v6.11
[20:11:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:12:03] <logmsgbot>	 !log mobrovac@tin Finished deploy [citoid/deploy@43c2776]: Switch node_modules to Node v6.11 (duration: 02m 56s)
[20:12:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:12:35] <wikibugs>	 (CR) Bearloga: "Seems to me the build failure is (1) unrelated to the patch, and (2) about multiple Phab tickets in the commit message which is also weird" [puppet] - https://gerrit.wikimedia.org/r/367930 (https://phabricator.wikimedia.org/T170494) (owner: Bearloga)
[20:12:45] <logmsgbot>	 !log demon@tin Finished scap: no-op, ideal timing scenario (duration: 03m 35s)
[20:12:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:18:10] <logmsgbot>	 !log mobrovac@tin Started deploy [mobileapps/deploy@bb81d91]: Switch node_modules to Node v6.11
[20:18:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:19:18] <logmsgbot>	 !log mobrovac@tin Finished deploy [graphoid/deploy@1707b3c]: Switch node_modules to node v6.11 (duration: 07m 50s)
[20:19:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:20:06] <logmsgbot>	 !log mobrovac@tin Started deploy [trending-edits/deploy@22967f3]: Switch node_modules to node v6.11
[20:20:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:21:42] <icinga-wm>	 PROBLEM - HHVM jobrunner on mw1301 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[20:22:18] <logmsgbot>	 !log mobrovac@tin Finished deploy [mobileapps/deploy@bb81d91]: Switch node_modules to Node v6.11 (duration: 04m 08s)
[20:22:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:22:39] <logmsgbot>	 !log mobrovac@tin Started deploy [changeprop/deploy@444223d]: Switch node_modules to Node v6.11
[20:22:42] <icinga-wm>	 RECOVERY - HHVM jobrunner on mw1301 is OK: HTTP OK: HTTP/1.1 200 OK - 202 bytes in 0.001 second response time
[20:22:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:24:16] <logmsgbot>	 !log mobrovac@tin Finished deploy [changeprop/deploy@444223d]: Switch node_modules to Node v6.11 (duration: 01m 35s)
[20:24:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:25:12] <wikibugs>	 Operations, ORES, Scoring-platform-team: rack/setup/install ores2001-2009 - https://phabricator.wikimedia.org/T165170#3476095 (RobH)
[20:25:31] <logmsgbot>	 !log mobrovac@tin Started deploy [eventstreams/deploy@a2a0f19]: Switch node_modules to Node v6.11
[20:25:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:27:07] <logmsgbot>	 !log mobrovac@tin Finished deploy [trending-edits/deploy@22967f3]: Switch node_modules to node v6.11 (duration: 07m 01s)
[20:27:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:27:40] <logmsgbot>	 !log mobrovac@tin Started deploy [recommendation-api/deploy@e7adea0]: Switch node_modules to node v6.11
[20:27:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:28:28] <logmsgbot>	 !log mobrovac@tin Finished deploy [eventstreams/deploy@a2a0f19]: Switch node_modules to Node v6.11 (duration: 02m 57s)
[20:28:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:29:23] <logmsgbot>	 !log mobrovac@tin Started deploy [mathoid/deploy@44ea6d8]: Switch node_modules to Node v6.11
[20:29:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:30:06] <logmsgbot>	 !log mobrovac@tin Finished deploy [recommendation-api/deploy@e7adea0]: Switch node_modules to node v6.11 (duration: 02m 26s)
[20:30:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:31:56] <logmsgbot>	 !log mobrovac@tin Started deploy [electron-render/deploy@8dd5f13]: Switch node_modules to node v6.11
[20:32:00] <wikibugs>	 (03CR) 10Hoo man: [C: 031] "Fine to deploy at any time." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/367913 (owner: 10Lucas Werkmeister (WMDE))
[20:32:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:32:39] <logmsgbot>	 !log mobrovac@tin Finished deploy [mathoid/deploy@44ea6d8]: Switch node_modules to Node v6.11 (duration: 03m 16s)
[20:32:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:35:13] <icinga-wm>	 PROBLEM - pdfrender on scb2006 is CRITICAL: connect to address 10.192.32.20 and port 5252: Connection refused
[20:36:12] <icinga-wm>	 RECOVERY - pdfrender on scb2006 is OK: HTTP OK: HTTP/1.1 200 OK - 275 bytes in 0.074 second response time
[20:38:22] <icinga-wm>	 PROBLEM - pdfrender on scb1001 is CRITICAL: connect to address 10.64.0.16 and port 5252: Connection refused
[20:41:12] <icinga-wm>	 PROBLEM - carbon-cache too many creates on graphite1001 is CRITICAL: CRITICAL: 1.67% of data above the critical threshold [1000.0]
[20:48:24] <wikibugs>	 10Operations, 10Cloud-VPS, 10cloud-services-team (Kanban): Switch to new labs puppetmasters - https://phabricator.wikimedia.org/T171786#3476152 (10Andrew)
[20:51:01] <logmsgbot>	 !log mforns@tin Started deploy [analytics/refinery@58176d0]: deploying refinery to use 0.0.49 jars
[20:51:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:54:14] <logmsgbot>	 !log mforns@tin Finished deploy [analytics/refinery@58176d0]: deploying refinery to use 0.0.49 jars (duration: 03m 12s)
[20:54:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:00:52] <logmsgbot>	 !log mobrovac@tin Started deploy [electron-render/deploy@8dd5f13]: (no justification provided)
[21:01:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:01:27] <wikibugs>	 (03CR) 10Thcipriani: "inline comment" (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970 (owner: 10Chad)
[21:01:39] <wikibugs>	 (03PS1) 10Andrew Bogott: puppet-merge: fix a syntax error when there's only one worker [puppet] - 10https://gerrit.wikimedia.org/r/368088
[21:02:06] <wikibugs>	 (03PS1) 10Andrew Bogott: add puppetmaster roles to labpuppetmaster1001 and 2 [puppet] - 10https://gerrit.wikimedia.org/r/368090
[21:03:31] <logmsgbot>	 !log mobrovac@tin Finished deploy [electron-render/deploy@8dd5f13]: (no justification provided) (duration: 02m 38s)
[21:03:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:04:12] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 032] add puppetmaster roles to labpuppetmaster1001 and 2 [puppet] - 10https://gerrit.wikimedia.org/r/368090 (owner: 10Andrew Bogott)
[21:07:29] <wikibugs>	 (03PS10) 10Dzahn: icinga/role:mail::mx: add monitoring of exim queue size [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110)
[21:08:23] <logmsgbot>	 !log mobrovac@tin Started deploy [electron-render/deploy@8dd5f13]: (no justification provided)
[21:08:36] <icinga-wm>	 PROBLEM - puppet last run on labpuppetmaster1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:08:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:10:00] <logmsgbot>	 !log mobrovac@tin Finished deploy [electron-render/deploy@8dd5f13]: (no justification provided) (duration: 01m 37s)
[21:10:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:13:43] <icinga-wm>	 PROBLEM - cassandra-a CQL 10.64.16.97:9042 on restbase-dev1005 is CRITICAL: connect to address 10.64.16.97 and port 9042: Connection refused
[21:13:49] <urandom>	 that's me ^^^
[21:13:52] <urandom>	 dev only
[21:14:03] <icinga-wm>	 PROBLEM - cassandra-a service on restbase-dev1005 is CRITICAL: CRITICAL - Expecting active but unit cassandra-a is failed
[21:14:17] <wikibugs>	 10Operations, 10Analytics, 10Analytics-Cluster, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3476259 (10Halfak) Looks to me like this task is ready to be resolved.  Also, I have no idea why it is assigned to me as I've only consulted on it.   @dr0ptp4kt w...
[21:14:32] <icinga-wm>	 PROBLEM - cassandra-a SSL 10.64.16.97:7001 on restbase-dev1005 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused
[21:15:03] <icinga-wm>	 RECOVERY - cassandra-a service on restbase-dev1005 is OK: OK - cassandra-a is active
[21:15:32] <icinga-wm>	 RECOVERY - cassandra-a SSL 10.64.16.97:7001 on restbase-dev1005 is OK: SSL OK - Certificate restbase-dev1005-a valid until 2018-07-20 15:08:07 +0000 (expires in 358 days)
[21:15:42] <icinga-wm>	 RECOVERY - cassandra-a CQL 10.64.16.97:9042 on restbase-dev1005 is OK: TCP OK - 0.000 second response time on 10.64.16.97 port 9042
[21:21:32] <icinga-wm>	 RECOVERY - pdfrender on scb1001 is OK: HTTP OK: HTTP/1.1 200 OK - 275 bytes in 0.004 second response time
[21:21:56] <wikibugs>	 (03PS1) 10Andrew Bogott: labtestpuppetmaster:  add lots of hiera defs [puppet] - 10https://gerrit.wikimedia.org/r/368097
[21:23:02] <wikibugs>	 (03CR) 10Dzahn: [C: 04-1] icinga/role:mail::mx: add monitoring of exim queue size (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[21:23:05] <wikibugs>	 (03PS4) 10Chad: WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970
[21:23:26] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 032] labtestpuppetmaster:  add lots of hiera defs [puppet] - 10https://gerrit.wikimedia.org/r/368097 (owner: 10Andrew Bogott)
[21:25:13] <wikibugs>	 (03PS5) 10Chad: WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970
[21:25:32] <wikibugs>	 (03PS1) 10Andrew Bogott: labpuppetmaster:  more hiera config [puppet] - 10https://gerrit.wikimedia.org/r/368099
[21:25:39] <wikibugs>	 (03PS6) 10Chad: WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970
[21:25:41] <wikibugs>	 (03CR) 10Chad: [C: 032] WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970 (owner: 10Chad)
[21:25:43] <wikibugs>	 (03PS1) 10Chad: Updating interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368100
[21:25:45] <wikibugs>	 (03CR) 10Chad: [C: 032] Updating interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368100 (owner: 10Chad)
[21:26:56] <wikibugs>	 (03PS11) 10Dzahn: icinga/role:mail::mx: add monitoring of exim queue size [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110)
[21:29:02] <icinga-wm>	 PROBLEM - pdfrender on scb1002 is CRITICAL: connect to address 10.64.16.21 and port 5252: Connection refused
[21:29:17] <mobrovac>	 known ^
[21:29:53] <icinga-wm>	 ACKNOWLEDGEMENT - puppet last run on labpuppetmaster1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues andrew bogott It's going to take me a while to get this right.
[21:30:02] <icinga-wm>	 RECOVERY - pdfrender on scb1002 is OK: HTTP OK: HTTP/1.1 200 OK - 275 bytes in 0.002 second response time
[21:34:46] <wikibugs>	 10Operations, 10Services (done), 10User-mobrovac: nodejs 6.11 - https://phabricator.wikimedia.org/T170548#3476271 (10mobrovac) FYI, all of the SCB services have been migrated.
[21:37:32] <icinga-wm>	 RECOVERY - carbon-cache too many creates on graphite1001 is OK: OK: Less than 1.00% above the threshold [500.0]
[21:39:24] <wikibugs>	 (03PS1) 10MaxSem: WIP: [labs] Puppetize XTools [puppet] - 10https://gerrit.wikimedia.org/r/368101 (https://phabricator.wikimedia.org/T170514)
[21:39:35] <andrewbogott>	 !log restarting rabbitmq on labcontrol1001
[21:39:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:51:27] <wikibugs>	 (03PS1) 10Andrew Bogott: Nodepool:  raise rate to 8 seconds [puppet] - 10https://gerrit.wikimedia.org/r/368102 (https://phabricator.wikimedia.org/T170492)
[21:52:36] <wikibugs>	 (03CR) 10Andrew Bogott: [V: 032 C: 032] Nodepool:  raise rate to 8 seconds [puppet] - 10https://gerrit.wikimedia.org/r/368102 (https://phabricator.wikimedia.org/T170492) (owner: 10Andrew Bogott)
[21:54:46] <chasemp>	 andrewbogott: why the restart?
[21:56:18] <wikibugs>	 (03CR) 10Dzahn: [V: 032 C: 032] "nodepool issue and already verified before rebase" [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[21:56:31] <wikibugs>	 (03PS12) 10Dzahn: icinga/role:mail::mx: add monitoring of exim queue size [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110)
[21:56:32] <andrewbogott>	 chasemp: lots of things were timing out… I'm not sure exactly what's going on.
[21:56:35] <chasemp>	 kk
[21:56:39] <chasemp>	 andrewbogott: how did you notice?
[21:56:45] <andrewbogott>	 chasemp: I turned down the nodepool rate a bit
[21:57:01] <andrewbogott>	 I was waiting for Jenkins and it was slow so then looked at the contintcloud server list
[21:57:06] <chasemp>	 gotcha
[21:58:00] <wikibugs>	 (03CR) 1020after4: "@dzahn: it depends. On labs, it gets the value from ldap for mwdeploy because that is where the user is defined. The current value on labs" [puppet] - 10https://gerrit.wikimedia.org/r/365891 (https://phabricator.wikimedia.org/T166013) (owner: 1020after4)
[21:59:00] <wikibugs>	 (03CR) 1020after4: "On production we shouldn't have this problem because puppet can change the user's home directory, due to not being defined in ldap." [puppet] - 10https://gerrit.wikimedia.org/r/365891 (https://phabricator.wikimedia.org/T166013) (owner: 1020after4)
[22:01:40] <wikibugs>	 (03CR) 10Dzahn: [V: 032 C: 032] icinga/role:mail::mx: add monitoring of exim queue size [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[22:04:56] <wikibugs>	 (03CR) 10Dzahn: "[mx1001:~] $ sudo -u nagios /usr/local/lib/nagios/plugins/check_exim_queue -w 1000 -c 3000" [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[22:06:10] <wikibugs>	 (03CR) 10Dzahn: "# This file is managed by Puppet!" [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[22:12:58] <wikibugs>	 (03PS2) 10Andrew Bogott: labpuppetmaster:  more hiera config [puppet] - 10https://gerrit.wikimedia.org/r/368099
[22:17:03] <MaxSem>	 Warning: Certificate 'Puppet CA: virt1000.wikimedia.org' will expire on 2017-08-15T20:55:45UTC
[22:17:40] <wikibugs>	 10Operations, 10monitoring, 10Patch-For-Review: Check for an oversized exim4 queue indicating mail delivery failures - https://phabricator.wikimedia.org/T133110#3476378 (10Dzahn) ``` [mx1001:~] $ sudo -u nagios /usr/local/lib/nagios/plugins/check_exim_queue -w 1000 -c 3000 OK: Less than 1000 mails in exim qu...
[22:17:43] <MaxSem>	 eh, wrong channel prolly
[22:18:36] <andrewbogott>	 MaxSem: I'm working on it — all the labs VMs will be saying that for a week or two.
[22:19:16] <MaxSem>	 so it's because imma running shiny new stretch? :p
[22:20:33] <wikibugs>	 (03CR) 10Dzahn: "https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=exim+queue" [puppet] - 10https://gerrit.wikimedia.org/r/361023 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[22:21:06] <mutante>	 MaxSem: https://phabricator.wikimedia.org/T168110
[22:21:13] <mutante>	 no
[22:21:20] <mutante>	 not because of stretch 
[22:23:19] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] WIP: [labs] Puppetize XTools [puppet] - 10https://gerrit.wikimedia.org/r/368101 (https://phabricator.wikimedia.org/T170514) (owner: 10MaxSem)
[22:26:33] <wikibugs>	 (03Merged) 10jenkins-bot: WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970 (owner: 10Chad)
[22:26:35] <wikibugs>	 (03Merged) 10jenkins-bot: Updating interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368100 (owner: 10Chad)
[22:26:44] <wikibugs>	 (03CR) 10jenkins-bot: WIP: Simple wrapper around updating the interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970 (owner: 10Chad)
[22:29:20] <wikibugs>	 (03CR) 10jenkins-bot: Updating interwiki cache [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368100 (owner: 10Chad)
[22:45:22] <Krinkle>	 RainbowSprinkles: hmpf, for merging WIP
[22:45:45] <Krinkle>	 There was an idea to make Jenkins reject those in gate at some point. Oh well :)
[22:59:26] <James_F>	 Ha.
[23:00:04] <jouncebot>	 addshore, hashar, anomie, RainbowSprinkles, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear anthropoid, the time has come. Please deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20170726T2300).
[23:00:04] <jouncebot>	 RoanKattouw: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process.
[23:00:11] <James_F>	 Roan's here, ish.
[23:00:45] <James_F>	 I can surpervise.
[23:00:57] <James_F>	 Err. I can also supervise, more usefully.
[23:02:00] <thcipriani>	 I can SWAT
[23:02:23] <thcipriani>	 James_F: sounds like you're volunteering to check RoanKattouw 's patch for SWAT?
[23:03:36] <wikibugs>	 (03PS2) 10Thcipriani: Enable Echo per-user blacklist on meta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:06:32] <wikibugs>	 (03CR) 10Thcipriani: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:08:36] <wikibugs>	 (03CR) 10Thcipriani: Enable Echo per-user blacklist on meta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:08:45] <wikibugs>	 (03CR) 10Thcipriani: [C: 032] "SWAT try again" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:09:32] <James_F>	 thcipriani: Yes, sorry.
[23:10:04] <thcipriani>	 okie doke :)
[23:10:15] <wikibugs>	 (03Merged) 10jenkins-bot: Enable Echo per-user blacklist on meta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:10:26] <wikibugs>	 (03CR) 10jenkins-bot: Enable Echo per-user blacklist on meta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363049 (https://phabricator.wikimedia.org/T150419) (owner: 10Catrope)
[23:11:08] <thcipriani>	 James_F: patch is live on mwdebug1002, check please
[23:12:46] <James_F>	 thcipriani: Hmm. Checking, seems odd.
[23:13:51] <RoanKattouw>	 Sorry guys, I forgot I had a patch listed
[23:18:08] <thcipriani>	 RoanKattouw: np, live on mwdebug1002 now, the UI is there, seems to work, but haven't tested any further. James_F what are you seeing that's odd?
[23:18:55] <wikibugs>	 (03PS1) 10Dzahn: lists/icinga: remove I/O monitoring on lists server [puppet] - 10https://gerrit.wikimedia.org/r/368110 (https://phabricator.wikimedia.org/T133110)
[23:19:32] <James_F>	 thcipriani: We worked it out. My second test account wasn't going through mwdebug1002 and that's necessary.
[23:19:50] <RoanKattouw>	 thcipriani: Working for me
[23:19:59] <thcipriani>	 ah, cool, going live
[23:19:59] <James_F>	 thcipriani: Go for it.
[23:21:09] <thcipriani>	 oh
[23:21:13] <James_F>	 Oh?
[23:21:25] <wikibugs>	 (03CR) 10Dzahn: [C: 04-1] "yes, thanks for the extra details! the exim queue size monitoring has been added today now. so i am going to remove this one instead as su" [puppet] - 10https://gerrit.wikimedia.org/r/358504 (owner: 10Dzahn)
[23:21:27] <thcipriani>	 RainbowSprinkles: you are updating the interwiki cache and that has scap locked?
[23:21:37] <wikibugs>	 (03Abandoned) 10Dzahn: lists/icinga: remove mailman I/O stat CRITs [puppet] - 10https://gerrit.wikimedia.org/r/358504 (owner: 10Dzahn)
[23:21:41] <thcipriani>	 > 23:21:00 sync-file failed: <LockFailedError> Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "Updating interwiki cache"
[23:22:03] <James_F>	 Tut.
[23:22:22] <thcipriani>	 hrm, except I don't see that running anywhere...
[23:23:00] <mutante>	 thcipriani: did you see that earlier "Simple wrapper around updating the interwiki cache" was merged?
[23:23:09] <mutante>	 sounds so related
[23:23:31] <thcipriani>	 that would make sense
[23:23:37] <mutante>	 https://gerrit.wikimedia.org/r/363970
[23:26:09] <thcipriani>	 hrm, I don't know why that wouldn't clean up its lock file...
[23:27:45] <thcipriani>	 and RainbowSprinkles doesn't seem to be on this box and scap isn't running.
[23:28:02] <mutante>	 do you need somebody to delete the lock file as root or something?
[23:28:05] <thcipriani>	 mutante: could you use your superpowers to manually remove the lock for now.
[23:28:07] <thcipriani>	 yeah
[23:28:20] <mutante>	 which machine is it though
[23:28:24] <thcipriani>	 tin
[23:28:26] <mutante>	 ok
[23:28:32] <thcipriani>	 thank you
[23:29:04] <mutante>	 !log tin rm /var/lock/scap.operations_mediawiki-config.lock
[23:29:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:30:14] <logmsgbot>	 !log thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:363049|Enable Echo per-user blacklist on meta]] T150419 (duration: 00m 49s)
[23:30:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:30:26] <stashbot>	 T150419: Allow users to restrict who can send them notifications - https://phabricator.wikimedia.org/T150419
[23:30:33] <thcipriani>	 ^ RoanKattouw James_F live now, thanks for your patience
[23:30:39] <thcipriani>	 mutante: thanks for the assist
[23:30:40] <James_F>	 Thanks!
[23:30:49] <mutante>	 no problem, yw
[23:31:37] <wikibugs>	 (03CR) 10Dzahn: "16:26 < thcipriani> hrm, I don't know why that wouldn't clean up its lock file..." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/363970 (owner: 10Chad)
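Note on the stale lock above: the LockFailedError at 23:21 reported an owner ("demon") and a reason ("Updating interwiki cache") for a lock that no running process held any more. The sketch below is one way such a lock file could work and clean up after itself. It is not scap's implementation: the function name `file_lock`, the JSON layout, and the reuse of the exception name are assumptions made for illustration only.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class LockFailedError(Exception):
    """Raised when the lock file already exists (name borrowed for illustration)."""


@contextmanager
def file_lock(path, owner, reason):
    """Hold an exclusive lock file that records who took it and why (hypothetical helper)."""
    try:
        # O_EXCL makes creation atomic: the call fails if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        # Read the current holder's details so the error can name them.
        with open(path) as f:
            info = json.load(f)
        raise LockFailedError(
            'Failed to acquire lock "%s"; owner is "%s"; reason is "%s"'
            % (path, info["owner"], info["reason"])
        )
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"owner": owner, "reason": reason}, f)
        yield
    finally:
        # Runs on normal exit and on exceptions. It does not run if the
        # process is killed outright, which is one way a lock can go stale.
        os.remove(path)


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    with file_lock(lock_path, "alice", "Updating interwiki cache"):
        try:
            with file_lock(lock_path, "bob", "sync-file"):
                pass
        except LockFailedError as err:
            print(err)
```

A wrapper that creates the lock without an equivalent of the `finally` step, or that is killed before reaching it, leaves the file behind, and someone then has to remove it by hand, as happened on tin at 23:29.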
[23:42:58] <wikibugs>	 (03PS1) 10Mattflaschen: Make emails for minor edits always available; keep defaults [mediawiki-config] - 10https://gerrit.wikimedia.org/r/368113 (https://phabricator.wikimedia.org/T29884)
[23:44:21] <wikibugs>	 (03PS2) 10Dzahn: lists/icinga: remove I/O monitoring on lists server [puppet] - 10https://gerrit.wikimedia.org/r/368110 (https://phabricator.wikimedia.org/T133110)
[23:47:15] <wikibugs>	 (03CR) 10Dzahn: [C: 032] lists/icinga: remove I/O monitoring on lists server [puppet] - 10https://gerrit.wikimedia.org/r/368110 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)
[23:56:43] <wikibugs>	 (03CR) 10Dzahn: "removed 'bc' package manually, removed from icinga config by puppet" [puppet] - 10https://gerrit.wikimedia.org/r/368110 (https://phabricator.wikimedia.org/T133110) (owner: 10Dzahn)