[00:10:37] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[01:33:04] (CR) Krinkle: "LGTM, but either before or closely after this, recommend we also drop the parameter from core (keep, but rename to "unused")." [mediawiki-config] - https://gerrit.wikimedia.org/r/508476 (https://phabricator.wikimedia.org/T222539) (owner: Catrope)
[01:51:34] !log Disabled 2FA for my staff account
[01:51:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:59:47] PROBLEM - Mediawiki Cirrussearch update rate - codfw on icinga1001 is CRITICAL: CRITICAL: 10.00% of data under the critical threshold [50.0] https://wikitech.wikimedia.org/wiki/Search%23No_updates https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1
[03:00:15] PROBLEM - Mediawiki Cirrussearch update rate - eqiad on icinga1001 is CRITICAL: CRITICAL: 10.00% of data under the critical threshold [50.0] https://wikitech.wikimedia.org/wiki/Search%23No_updates https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1
[03:17:00] (PS1) Jeena Huneidi: Add mediawiki development chart. [deployment-charts] - https://gerrit.wikimedia.org/r/522584
[03:19:13] RECOVERY - Mediawiki Cirrussearch update rate - eqiad on icinga1001 is OK: OK: Less than 1.00% under the threshold [80.0] https://wikitech.wikimedia.org/wiki/Search%23No_updates https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1
[03:20:13] RECOVERY - Mediawiki Cirrussearch update rate - codfw on icinga1001 is OK: OK: Less than 1.00% under the threshold [80.0] https://wikitech.wikimedia.org/wiki/Search%23No_updates https://grafana.wikimedia.org/d/JLK3I_siz/elasticsearch-indexing?panelId=44&fullscreen&orgId=1
[03:55:49] PROBLEM - puppet last run on dns2001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[04:23:05] RECOVERY - puppet last run on dns2001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[06:29:01] PROBLEM - puppet last run on ms-be2045 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/profile.d/field.sh]
[06:29:19] PROBLEM - puppet last run on elastic1033 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[06:29:25] PROBLEM - puppet last run on ms-be2040 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/puppet-enabled]
[06:32:39] PROBLEM - IPv6 ping to eqsin on ripe-atlas-eqsin IPv6 is CRITICAL: CRITICAL - failed 76 probes of 438 (alerts on 35) - https://atlas.ripe.net/measurements/11645088/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[06:34:01] PROBLEM - BGP status on cr1-eqsin is CRITICAL: BGP CRITICAL - AS6939/IPv6: Connect, AS6939/IPv4: Connect https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[06:34:15] PROBLEM - puppet last run on dbmonitor2001 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 8 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/apache2/sites-available/00-dummy.conf]
[06:38:05] RECOVERY - IPv6 ping to eqsin on ripe-atlas-eqsin IPv6 is OK: OK - failed 26 probes of 438 (alerts on 35) - https://atlas.ripe.net/measurements/11645088/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[06:41:23] RECOVERY - BGP status on cr1-eqsin is OK: BGP OK - up: 265, down: 1, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[06:56:01] RECOVERY - puppet last run on dbmonitor2001 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[06:56:17] RECOVERY - puppet last run on ms-be2045 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:56:33] RECOVERY - puppet last run on elastic1033 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[06:56:39] RECOVERY - puppet last run on ms-be2040 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:08:13] PROBLEM - IPv6 ping to eqsin on ripe-atlas-eqsin IPv6 is CRITICAL: CRITICAL - failed 153 probes of 438 (alerts on 35) - https://atlas.ripe.net/measurements/11645088/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[07:12:01] PROBLEM - BGP status on cr1-eqsin is CRITICAL: BGP CRITICAL - AS6939/IPv4: Connect, AS6939/IPv6: Active https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:16:27] RECOVERY - BGP status on cr1-eqsin is OK: BGP OK - up: 265, down: 1, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:19:07] RECOVERY - IPv6 ping to eqsin on ripe-atlas-eqsin IPv6 is OK: OK - failed 20 probes of 438 (alerts on 35) - https://atlas.ripe.net/measurements/11645088/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[07:27:35] PROBLEM - IPv6 ping to eqsin on ripe-atlas-eqsin IPv6 is CRITICAL: CRITICAL - failed 139 probes of 438 (alerts on 35) - https://atlas.ripe.net/measurements/11645088/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[09:01:19] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[09:12:45] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.92 ms
[09:20:13] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[09:48:49] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 236.14 ms
[10:02:01] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[10:07:45] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 86%, RTA = 235.72 ms
[10:20:55] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[10:38:09] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.63 ms
[10:51:21] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[11:20:03] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.79 ms
[11:40:43] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[11:52:07] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 86%, RTA = 235.68 ms
[12:07:01] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[12:18:29] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.85 ms
[12:25:57] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[12:26:08] (PS1) Urbanecm: Edit urbanecm's .profile [puppet] - https://gerrit.wikimedia.org/r/522595
[12:28:05] (PS2) Urbanecm: Edit urbanecm's .profile [puppet] - https://gerrit.wikimedia.org/r/522595
[12:37:25] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 86%, RTA = 235.68 ms
[12:52:19] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[13:15:17] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.71 ms
[13:34:23] PROBLEM - puppet last run on icinga1001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[13:37:43] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[13:49:09] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.80 ms
[13:50:47] RECOVERY - puppet last run on icinga1001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[14:13:49] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[14:53:59] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.73 ms
[15:01:27] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[15:07:11] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 237.61 ms
[15:29:31] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[15:46:39] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.57 ms
[15:54:07] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[15:59:51] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 86%, RTA = 235.78 ms
[16:07:21] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[16:13:03] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.69 ms
[16:20:31] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[16:38:09] PROBLEM - puppet last run on dns2002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[16:49:11] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.87 ms
[17:05:25] RECOVERY - puppet last run on dns2002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:23:05] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[17:32:39] :o
[17:34:33] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.84 ms
[17:53:27] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[18:28:35] (CR) Catrope: [C: +1] Edit urbanecm's .profile [puppet] - https://gerrit.wikimedia.org/r/522595 (owner: Urbanecm)
[18:33:00] (CR) Jforrester: [C: +1] "LGTM. C+1'ed for 11 days. Should we just land this?" [deployment-charts] - https://gerrit.wikimedia.org/r/517573 (https://phabricator.wikimedia.org/T215319) (owner: Thcipriani)
[18:33:35] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.81 ms
[19:05:41] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[19:11:25] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 235.73 ms
[20:09:55] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[20:21:21] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 93%, RTA = 229.43 ms
[22:13:39] PROBLEM - Host mr1-eqsin.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
[22:19:23] RECOVERY - Host mr1-eqsin.oob IPv6 is UP: PING WARNING - Packet loss = 86%, RTA = 229.48 ms