[00:00:00] woo
[00:00:02] no_justification ^^
[00:00:04] it worked
[00:00:04] RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:01:36] oh https://gerrit.wikimedia.org/r/#/q/is:wip that was a lot of drafts :)
[00:02:43] RECOVERY - puppet last run on db2095 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:03:03] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:03:13] RECOVERY - puppet last run on labsdb1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:03:33] RECOVERY - puppet last run on kafka2002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:03:53] RECOVERY - puppet last run on releases1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:04:23] RECOVERY - puppet last run on contint1001 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[00:04:54] RECOVERY - puppet last run on releases2001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[00:05:13] RECOVERY - puppet last run on db1124 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[00:05:14] RECOVERY - puppet last run on labsdb1011 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:05:23] RECOVERY - puppet last run on labsdb1009 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:08:13] RECOVERY - puppet last run on kafka1001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[00:08:43] RECOVERY - puppet last run on bromine is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[00:08:44] RECOVERY - puppet last run on kafka2001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:10:33] RECOVERY - puppet last run on thorium is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:10:33] RECOVERY - puppet last run on analytics1003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:10:33] RECOVERY - puppet last run on kafka1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:11:43] RECOVERY - puppet last run on stat1004 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[00:13:43] RECOVERY - puppet last run on webperf2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:14:14] RECOVERY - puppet last run on db1125 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:15:35] (03PS3) 10Paladox: Add icinga2 [puppet] - 10https://gerrit.wikimedia.org/r/351540
[00:15:54] RECOVERY - puppet last run on stat1006 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[00:16:03] (that was a private comment, we should probably get wikibugs to ignore WIP changes)
[00:16:17] comment = commit (i.e. a draft that was converted to WIP)
[00:16:56] doesn't wikibugs use a public feed?
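(The filtering idea raised above — having wikibugs skip work-in-progress changes it sees on Gerrit's public event feed — could look roughly like this. A minimal sketch, assuming the bot consumes Gerrit 2.15's stream-events JSON, which exposes `wip` and `private` booleans on a change; the function and its name are illustrative, not wikibugs' actual code.)

```python
import json

def should_announce(event_line: str) -> bool:
    """Return True if a stream-events line should be relayed to IRC."""
    event = json.loads(event_line)
    if event.get("type") != "patchset-created":
        return False
    change = event.get("change", {})
    # Gerrit 2.15 marks work-in-progress and private changes explicitly;
    # skipping them avoids spamming channels with unfinished work.
    if change.get("wip") or change.get("private"):
        return False
    return True
```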
[00:17:04] RECOVERY - puppet last run on webperf1001 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures
[00:17:12] yep, but i mean ignore the WIP changes
[00:17:29] since there could be many updates to WIP changes, which could spam channels
[00:18:44] There's not suddenly going to be many more commits just because there's a new feature for WIP
[00:18:48] 10Operations: Deactivate Chad's Racktables account - https://phabricator.wikimedia.org/T196787#4268451 (10demon) p:05Triage>03Normal
[00:21:09] Reedy: nope, but it could be quite annoying if you work on something and it spams the channel.
[00:21:25] That happens now?
[00:21:37] if you have open changes
[00:21:42] but the workaround was to use a draft
[00:22:13] RECOVERY - puppet last run on db1102 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:25:08] Few people actually used drafts for more than the initial staging of creating a change via the web
[00:25:29] i used it as i got told off in here for creating spammy changes.
[00:25:43] RECOVERY - puppet last run on kafka1003 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[00:39:07] (03PS4) 10Alex Monk: Allow PuppetDB use on standalone puppetmasters [puppet] - 10https://gerrit.wikimedia.org/r/435631 (https://phabricator.wikimedia.org/T194962)
[00:39:23] PROBLEM - SSH cp3042.mgmt on cp3042.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:39:23] RECOVERY - SSH cp3042.mgmt on cp3042.mgmt is OK: SSH OK - OpenSSH_5.8 (protocol 2.0)
[02:05:16] (03CR) 10Alex Monk: "Obsoleted by Ia65009dc" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/387570 (https://phabricator.wikimedia.org/T179371) (owner: 10Filippo Giunchedi)
[02:06:24] (03CR) 10Alex Monk: "Obsoleted by I411fcef3" [puppet] - 10https://gerrit.wikimedia.org/r/387579 (https://phabricator.wikimedia.org/T179371) (owner: 10Filippo Giunchedi)
[02:07:27] (03CR) 10Alex Monk: "Obsoleted by I411fcef3 ?" [puppet] - 10https://gerrit.wikimedia.org/r/386869 (https://phabricator.wikimedia.org/T179371) (owner: 10Filippo Giunchedi)
[02:12:23] 10Operations, 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10Prometheus-metrics-monitoring, 10User-fgiunchedi: Move deployment-prep redis instances to stretch - https://phabricator.wikimedia.org/T179371#4268622 (10Krenair) It looks like @joe has made and merged patches that essentially obsolete tho...
[03:23:57] (03PS5) 10Sau226: Implementing Patroller User Rights for azwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/437777 (https://phabricator.wikimedia.org/T196488)
[03:24:33] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 785.17 seconds
[04:22:32] PROBLEM - Device not healthy -SMART- on db1065 is CRITICAL: cluster=mysql device=megaraid,1 instance=db1065:9100 job=node site=eqiad https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db1065&var-datasource=eqiad%2520prometheus%252Fops
[04:41:53] RECOVERY - Device not healthy -SMART- on rdb1004 is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=rdb1004&var-datasource=eqiad%2520prometheus%252Fops
[05:26:52] PROBLEM - Device not healthy -SMART- on db1063 is CRITICAL: cluster=mysql device=megaraid,3 instance=db1063:9100 job=node site=eqiad https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db1063&var-datasource=eqiad%2520prometheus%252Fops
[06:04:23] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 272.42 seconds
[06:17:53] RECOVERY - MariaDB Slave Lag: s1 on db2070 is OK: OK slave_sql_lag Replication lag: 0.44 seconds
[06:22:52] RECOVERY - MariaDB Slave Lag: s1 on db2055 is OK: OK slave_sql_lag Replication lag: 0.24 seconds
[06:28:33] PROBLEM - puppet last run on ganeti1006 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/puppet-enabled]
[06:29:32] PROBLEM - puppet last run on analytics1071 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/rsyslog.d/10-puppet-agent.conf]
[06:30:33] PROBLEM - puppet last run on mw1300 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/lib/nagios/plugins/check_conntrack]
[06:56:03] RECOVERY - puppet last run on mw1300 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:59:03] RECOVERY - puppet last run on ganeti1006 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[07:00:03] RECOVERY - puppet last run on analytics1071 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[07:43:42] RECOVERY - MegaRAID on labstore1003 is OK: OK: optimal, 5 logical, 34 physical
[07:50:46] (03PS1) 10Urbanecm: Regenerate logo for bnwikivoyage [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439394 (https://phabricator.wikimedia.org/T196803)
[07:53:09] (03PS5) 10Urbanecm: id_privatewikimedia: Initial configuration [mediawiki-config] - 10https://gerrit.wikimedia.org/r/438279
[08:02:33] 10Operations, 10ops-eqiad, 10DBA: Bad disk on db1063 - https://phabricator.wikimedia.org/T196804#4268877 (10Marostegui)
[08:02:44] 10Operations, 10ops-eqiad, 10DBA: Bad disk on db1063 - https://phabricator.wikimedia.org/T196804#4268889 (10Marostegui) p:05Triage>03Normal
[08:04:42] 10Operations, 10ops-eqiad, 10DBA: Bad disk on db1065 - https://phabricator.wikimedia.org/T196806#4268905 (10Marostegui)
[08:04:54] 10Operations, 10ops-eqiad, 10DBA: Bad disk on db1065 - https://phabricator.wikimedia.org/T196806#4268917 (10Marostegui) p:05Triage>03Normal
[08:05:26] ACKNOWLEDGEMENT - Device not healthy -SMART- on db1065 is CRITICAL: cluster=mysql device=megaraid,1 instance=db1065:9100 job=node site=eqiad Marostegui T196806 - The acknowledgement expires at: 2018-06-14 08:05:10. https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db1065&var-datasource=eqiad%2520prometheus%252Fops
[08:05:57] ACKNOWLEDGEMENT - Device not healthy -SMART- on db1063 is CRITICAL: cluster=mysql device=megaraid,3 instance=db1063:9100 job=node site=eqiad Marostegui T196804 - The acknowledgement expires at: 2018-06-14 08:05:46. https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=db1063&var-datasource=eqiad%2520prometheus%252Fops
[08:31:43] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1972 bytes in 0.105 second response time
[08:35:41] PROBLEM - IPv4 ping to ulsfo on ripe-atlas-ulsfo is CRITICAL: Traceback (most recent call last)
[08:35:41] PROBLEM - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is CRITICAL: Traceback (most recent call last)
[08:36:42] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1943 bytes in 0.079 second response time
[08:41:01] RECOVERY - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is OK: OK - failed 8 probes of 302 (alerts on 19) - https://atlas.ripe.net/measurements/1791309/#!map
[08:41:01] RECOVERY - IPv4 ping to ulsfo on ripe-atlas-ulsfo is OK: OK - failed 1 probes of 322 (alerts on 19) - https://atlas.ripe.net/measurements/1791307/#!map
[09:01:08] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1961 bytes in 0.077 second response time
[09:15:37] PROBLEM - MariaDB Slave Lag: s7 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 360.65 seconds
[09:16:17] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1971 bytes in 0.086 second response time
[09:18:58] PROBLEM - MariaDB Slave Lag: s7 on db2061 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 468.72 seconds
[09:28:28] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1965 bytes in 0.086 second response time
[09:33:28] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1950 bytes in 0.089 second response time
[09:40:37] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1966 bytes in 0.080 second response time
[10:05:58] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1949 bytes in 0.071 second response time
[10:18:27] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1968 bytes in 0.069 second response time
[10:21:57] PROBLEM - Check systemd state on restbase-dev1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[10:28:49] (03CR) 10Volans: "Thanks a lot for this initial version, this is great! I've added some general question/proposal inline." (033 comments) [software/debmonitor] - 10https://gerrit.wikimedia.org/r/438018 (https://phabricator.wikimedia.org/T191298) (owner: 10Muehlenhoff)
[10:52:08] (03CR) 10Chad: "I'd like to run under dual mode for awhile, to be safe." [puppet] - 10https://gerrit.wikimedia.org/r/408298 (https://phabricator.wikimedia.org/T174034) (owner: 10Paladox)
[10:54:07] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1969 bytes in 0.111 second response time
[11:26:57] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1965 bytes in 0.076 second response time
[11:36:58] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1970 bytes in 0.096 second response time
[12:10:12] (03PS3) 10Urbanecm: id_privatewikimedia: register in DNS [dns] - 10https://gerrit.wikimedia.org/r/438275 (https://phabricator.wikimedia.org/T196747)
[12:10:16] (03PS3) 10Urbanecm: id_privatewikimedia: add Apache configuration [puppet] - 10https://gerrit.wikimedia.org/r/438276 (https://phabricator.wikimedia.org/T196747)
[12:12:20] (03PS1) 10Aklapper: Create a FeaturedFeed for the News on mediawikiwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439436 (https://phabricator.wikimedia.org/T165773)
[12:13:16] (03CR) 10Aklapper: "I have no idea what I'm doing here. See https://phabricator.wikimedia.org/T165773#4267238 etc." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439436 (https://phabricator.wikimedia.org/T165773) (owner: 10Aklapper)
[12:19:12] (03CR) 10Reedy: [C: 04-1] Create a FeaturedFeed for the News on mediawikiwiki (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439436 (https://phabricator.wikimedia.org/T165773) (owner: 10Aklapper)
[12:44:24] (03PS2) 10Aklapper: Create a FeaturedFeed for the News on mediawikiwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439436 (https://phabricator.wikimedia.org/T165773)
[12:48:38] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[12:48:44] (03PS1) 10Reedy: Move if onto newline in FeaturedFeedsWMF.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439438
[12:49:15] (03PS2) 10Reedy: Move if onto newline in FeaturedFeedsWMF.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439438
[12:50:32] (03CR) 10Reedy: Move if onto newline in FeaturedFeedsWMF.php (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/439438 (owner: 10Reedy)
[12:51:57] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[13:08:49] 10Operations, 10Gerrit, 10Patch-For-Review, 10Release-Engineering-Team (Someday): Gerrit shows HTTP 500 error when pasting extended unicode characters - https://phabricator.wikimedia.org/T145885#4269142 (10Paladox) 05stalled>03Resolved We are now on 2.15 and just tested, and emojis work now!
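(The "MariaDB Slave Lag" alerts above report a slave_sql_lag figure against warning/critical thresholds. A minimal sketch of a probe with that shape, assuming thresholds around the ~300s visible in the log; WMF's production check derives lag from pt-heartbeat, while this sketch reads Seconds_Behind_Master for simplicity, and the connection details are illustrative.)

```python
import sys
import pymysql

WARNING, CRITICAL = 180.0, 300.0  # thresholds in seconds (illustrative)

def check_slave_lag(host: str) -> int:
    conn = pymysql.connect(host=host, read_default_file="~/.my.cnf",
                           cursorclass=pymysql.cursors.DictCursor)
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW SLAVE STATUS")
            status = cur.fetchone()
    finally:
        conn.close()
    # Seconds_Behind_Master is None when the SQL thread is not running.
    lag = status["Seconds_Behind_Master"] if status else None
    if lag is None:
        print("CRITICAL slave_sql_lag Replication not running")
        return 2
    if lag >= CRITICAL:
        print(f"CRITICAL slave_sql_lag Replication lag: {lag:.2f} seconds")
        return 2
    if lag >= WARNING:
        print(f"WARNING slave_sql_lag Replication lag: {lag:.2f} seconds")
        return 1
    print(f"OK slave_sql_lag Replication lag: {lag:.2f} seconds")
    return 0

if __name__ == "__main__":
    sys.exit(check_slave_lag(sys.argv[1]))
```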
[13:09:39] (03PS4) 10Daimona Eaytoy: Enable $wgAbuseFilterProfile on every wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/423660 (https://phabricator.wikimedia.org/T191039)
[13:10:07] (03PS2) 10Daimona Eaytoy: Enable $wgAbuseFilterRuntimeProfile on every wiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/423945 (https://phabricator.wikimedia.org/T191039)
[14:39:28] (03PS1) 10Paladox: Gerrit: Make PolyGerrit the default ui [puppet] - 10https://gerrit.wikimedia.org/r/439444
[14:43:38] PROBLEM - MariaDB Slave Lag: s3 on db2074 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 301.19 seconds
[14:43:47] PROBLEM - MariaDB Slave Lag: s3 on db2094 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 304.87 seconds
[14:43:58] PROBLEM - MariaDB Slave Lag: s3 on db2057 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 310.72 seconds
[14:44:07] PROBLEM - MariaDB Slave Lag: s3 on db2043 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 312.71 seconds
[14:44:08] PROBLEM - MariaDB Slave Lag: s3 on db2036 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 312.77 seconds
[14:44:08] PROBLEM - MariaDB Slave Lag: s3 on db2050 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 313.36 seconds
[14:47:18] (03PS2) 10Paladox: Gerrit: Make PolyGerrit the default ui [puppet] - 10https://gerrit.wikimedia.org/r/439444 (https://phabricator.wikimedia.org/T196812)
[14:49:00] (03CR) 10Paladox: "This change is ready for review." [puppet] - 10https://gerrit.wikimedia.org/r/439444 (https://phabricator.wikimedia.org/T196812) (owner: 10Paladox)
[14:49:17] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[14:49:27] that is the WIP button, i didn't write that :)
[14:49:36] it changes from WIP to ready for review now.
[14:52:28] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
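(The "Check systemd state" alerts that keep flapping on kubernetes2003 map directly onto systemd's own health summary. A minimal sketch of such a check, assuming it simply wraps `systemctl is-system-running`; the real check lives in WMF's puppet tree and may differ.)

```python
import subprocess
import sys

def main() -> int:
    # `systemctl is-system-running` prints a one-word state such as
    # "running", "degraded" or "maintenance"; exit code is ignored here.
    state = subprocess.run(["systemctl", "is-system-running"],
                           capture_output=True, text=True).stdout.strip()
    if state == "running":
        print("OK - running: The system is fully operational")
        return 0
    if state == "degraded":
        print("CRITICAL - degraded: The system is operational but one or more units failed.")
        return 2
    print(f"UNKNOWN - systemd reports state: {state}")
    return 3

if __name__ == "__main__":
    sys.exit(main())
```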
[15:12:07] PROBLEM - MariaDB Slave Lag: s3 on db1124 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 373.82 seconds
[15:19:47] RECOVERY - MariaDB Slave Lag: s3 on db1124 is OK: OK slave_sql_lag Replication lag: 32.90 seconds
[16:04:17] RECOVERY - MariaDB Slave Lag: s3 on db2074 is OK: OK slave_sql_lag Replication lag: 12.22 seconds
[16:04:38] RECOVERY - MariaDB Slave Lag: s3 on db2057 is OK: OK slave_sql_lag Replication lag: 0.26 seconds
[16:04:38] RECOVERY - MariaDB Slave Lag: s3 on db2043 is OK: OK slave_sql_lag Replication lag: 0.04 seconds
[16:04:47] RECOVERY - MariaDB Slave Lag: s3 on db2036 is OK: OK slave_sql_lag Replication lag: 0.00 seconds
[16:04:48] RECOVERY - MariaDB Slave Lag: s3 on db2050 is OK: OK slave_sql_lag Replication lag: 0.09 seconds
[16:05:28] RECOVERY - MariaDB Slave Lag: s3 on db2094 is OK: OK slave_sql_lag Replication lag: 0.00 seconds
[16:14:08] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1971 bytes in 0.064 second response time
[16:24:27] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1974 bytes in 0.090 second response time
[16:36:47] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1972 bytes in 0.092 second response time
[17:02:08] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1947 bytes in 0.079 second response time
[17:09:07] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: OpenConfirm, AS13030/IPv4: OpenConfirm
[17:20:08] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[17:23:28] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv4: OpenConfirm
[17:25:38] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[17:32:27] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: Active
[17:33:28] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[17:36:48] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: OpenConfirm, AS13030/IPv4: OpenConfirm
[17:46:38] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[17:49:17] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[17:49:58] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: OpenConfirm, AS13030/IPv4: Active
[17:52:17] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[17:52:37] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
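(The "BGP status" alerts flapping on cr2-knams report sessions stuck in non-Established FSM states such as Active and OpenConfirm. A sketch of the underlying logic only: any peer not in Established is critical. How the states are fetched — SNMP, NETCONF — is elided; `evaluate` and the peer-map shape are hypothetical.)

```python
# RFC 4271 BGP finite-state-machine states, as numbered in the standard MIB.
BGP_STATES = {1: "Idle", 2: "Connect", 3: "Active",
              4: "OpenSent", 5: "OpenConfirm", 6: "Established"}

def evaluate(peers):
    """peers maps 'ASxxxxx/IPvY' labels to a BGP FSM state code.
    Returns (nagios_exit_code, status_line)."""
    down = {name: BGP_STATES[state]
            for name, state in peers.items() if state != 6}
    if down:
        detail = ", ".join(f"{name}: {st}" for name, st in sorted(down.items()))
        return 2, f"BGP CRITICAL - {detail}"
    return 0, f"BGP OK - up: {len(peers)}, down: 0, shutdown: 0"

# Example: a peer lingering in OpenConfirm trips the check, as in the log.
print(evaluate({"AS13030/IPv4": 5, "AS13030/IPv6": 6})[1])
```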
[18:07:47] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv4: OpenConfirm, AS13030/IPv6: OpenConfirm
[18:14:28] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[18:21:08] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv4: OpenConfirm, AS13030/IPv6: Active
[18:23:27] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[18:26:47] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv4: OpenConfirm, AS13030/IPv6: OpenConfirm
[18:34:37] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[18:56:48] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: OpenConfirm, AS13030/IPv4: OpenConfirm
[18:57:57] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[19:03:27] PROBLEM - BGP status on cr2-knams is CRITICAL: BGP CRITICAL - AS13030/IPv6: OpenConfirm, AS13030/IPv4: OpenConfirm
[19:13:18] RECOVERY - BGP status on cr2-knams is OK: BGP OK - up: 11, down: 0, shutdown: 0
[20:12:38] PROBLEM - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 1974 bytes in 0.086 second response time
[20:19:07] RECOVERY - Check systemd state on kubernetes2003 is OK: OK - running: The system is fully operational
[20:22:27] PROBLEM - Check systemd state on kubernetes2003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[20:22:48] RECOVERY - wikidata.org dispatch lag is higher than 300s on www.wikidata.org is OK: HTTP OK: HTTP/1.1 200 OK - 1966 bytes in 0.074 second response time
[20:32:18] PROBLEM - MariaDB Slave Lag: s6 on dbstore2001 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 337.49 seconds
[20:32:27] PROBLEM - MariaDB Slave Lag: s6 on db2046 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 340.94 seconds
[20:32:28] PROBLEM - MariaDB Slave Lag: s6 on db2039 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 341.68 seconds
[20:32:37] PROBLEM - MariaDB Slave Lag: s6 on db2067 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 345.83 seconds
[20:32:57] PROBLEM - MariaDB Slave Lag: s6 on db2087 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 354.63 seconds
[20:33:07] PROBLEM - MariaDB Slave Lag: s6 on db2060 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 359.20 seconds
[20:33:07] PROBLEM - MariaDB Slave Lag: s6 on db2089 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 359.90 seconds
[20:33:08] PROBLEM - MariaDB Slave Lag: s6 on db2053 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 360.17 seconds
[20:33:08] PROBLEM - MariaDB Slave Lag: s6 on db2076 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 360.11 seconds
[20:45:18] PROBLEM - Memory correctable errors -EDAC- on scb1002 is CRITICAL: 4.001 ge 4 https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=scb1002&var-datasource=eqiad%2520prometheus%252Fops
[21:38:34] (03CR) 10Alex Monk: "... Well this is a dependency regardless." [puppet] - 10https://gerrit.wikimedia.org/r/372764 (owner: 10Alex Monk)
[21:43:39] (03Abandoned) 10Alex Monk: Fix mwrepl to require expanddblist dependency, from scap::scripts [puppet] - 10https://gerrit.wikimedia.org/r/372764 (owner: 10Alex Monk)
[21:45:28] (03CR) 10Alex Monk: "Removed this commit from the deployment-prep puppetmasters." [puppet] - 10https://gerrit.wikimedia.org/r/372764 (owner: 10Alex Monk)
[21:54:42] (03PS6) 10Alex Monk: Move some production apache config files to templates [puppet] - 10https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256)
[22:00:59] (03CR) 10Alex Monk: "This is the next step in killing a lot of beta-prod divergence and it's trivial. Can someone please review it? Remember Puppet SWAT is not" [puppet] - 10https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256) (owner: 10Alex Monk)
[22:26:13] (03Abandoned) 10Alex Monk: Try to separate trebuchet stuff from role::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/284851 (owner: 10Alex Monk)
[22:28:52] (03Abandoned) 10Alex Monk: Get rid of mw-deployment-vars.sh [puppet] - 10https://gerrit.wikimedia.org/r/316928 (owner: 10Alex Monk)
[22:29:14] (03Abandoned) 10Alex Monk: [WIP] Move from ircecho to tcpircbot [puppet] - 10https://gerrit.wikimedia.org/r/240945 (owner: 10Alex Monk)
[22:29:38] (03Abandoned) 10Alex Monk: tcpircbot: Allow per-infile channel lists [puppet] - 10https://gerrit.wikimedia.org/r/240939 (owner: 10Alex Monk)
[22:30:34] (03CR) 10Alex Monk: [C: 04-1] ""we don't bother rewriting legacy URLs" - breaks existing URLs" [puppet] - 10https://gerrit.wikimedia.org/r/422571 (owner: 10Chad)
[22:31:03] (03CR) 10Alex Monk: [C: 04-1] "Yeah except that one breaks legacy URLs." [puppet] - 10https://gerrit.wikimedia.org/r/322425 (owner: 10Alex Monk)
[22:34:28] (03CR) 10Alex Monk: "I gave up with trying to make LVS in beta due to labs networking security restrictions - the Neutron upgrade should make it doable (hopefu" [puppet] - 10https://gerrit.wikimedia.org/r/316512 (owner: 10Alex Monk)
[22:34:53] (03PS2) 10Alex Monk: deployment-prep: Make LVS config compatible with new requirements [puppet] - 10https://gerrit.wikimedia.org/r/316512 (https://phabricator.wikimedia.org/T196662)
[22:50:36] (03Abandoned) 10Paladox: Replace TEMPLATE_CONTEXT_PROCESSORS with TEMPLATES [software/servermon] - 10https://gerrit.wikimedia.org/r/362600 (owner: 10Paladox)
[22:50:45] (03PS3) 10Alex Monk: keystone: Create top-level domain for each new project [puppet] - 10https://gerrit.wikimedia.org/r/375089 (https://phabricator.wikimedia.org/T162977)
[22:55:26] (03PS2) 10Alex Monk: shinkengen for all projects [puppet] - 10https://gerrit.wikimedia.org/r/374897 (https://phabricator.wikimedia.org/T166845)
[22:56:09] (03CR) 10jerkins-bot: [V: 04-1] shinkengen for all projects [puppet] - 10https://gerrit.wikimedia.org/r/374897 (https://phabricator.wikimedia.org/T166845) (owner: 10Alex Monk)
[23:00:26] (03PS3) 10Alex Monk: shinkengen for all projects [puppet] - 10https://gerrit.wikimedia.org/r/374897 (https://phabricator.wikimedia.org/T166845)
[23:09:47] PROBLEM - PyBal backends health check on lvs1016 is CRITICAL: PYBAL CRITICAL - CRITICAL - api-https_443: Servers mw1341.eqiad.wmnet are marked down but pooled
[23:10:48] RECOVERY - PyBal backends health check on lvs1016 is OK: PYBAL OK - All pools are healthy
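(The final alert, "Servers mw1341.eqiad.wmnet are marked down but pooled", comes from PyBal's depool-threshold behaviour: a backend that fails its health checks is still kept in the pool if depooling it would leave too little capacity. The sketch below mirrors that behaviour only in spirit; the function name, data shapes and 50% threshold are invented for illustration, not PyBal's actual internals.)

```python
def pool_state(servers, depool_threshold=0.5):
    """servers: list of dicts like {"host": str, "up": bool}.
    Returns (pooled_hosts, down_but_pooled_hosts)."""
    total = len(servers)
    pooled = [s["host"] for s in servers if s["up"]]
    down_but_pooled = []
    for s in servers:
        if s["up"]:
            continue
        # Depooling this failed server would drop the pool below the
        # threshold: keep it pooled anyway and let monitoring complain,
        # which is exactly what the alert above reports.
        if len(pooled) < total * depool_threshold:
            pooled.append(s["host"])
            down_but_pooled.append(s["host"])
    return pooled, down_but_pooled

# Example: with three backends and one failure the server is depooled
# normally; with two failures, one of them stays "down but pooled".
print(pool_state([{"host": "mw1341", "up": False},
                  {"host": "mw1342", "up": False},
                  {"host": "mw1343", "up": True}]))
```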