[01:07:36] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226912 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[01:07:39] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226912 (ops-monitoring-bot)
[01:38:51] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226913 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[01:38:55] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226913 (ops-monitoring-bot)
[01:55:10] <wikibugs>	 Operations, observability: ops-monitoring-bot creating dupes - https://phabricator.wikimedia.org/T226908 (Peachey88) Is this actually a problem with the bot, or with how acknowledgements are working in icinga?
[01:55:30] <wikibugs>	 Operations, Icinga, observability: ops-monitoring-bot creating dupes - https://phabricator.wikimedia.org/T226908 (Peachey88)
[02:10:14] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226915 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[02:10:17] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226915 (ops-monitoring-bot)
[03:12:57] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226916 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[03:13:04] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226916 (ops-monitoring-bot)
[03:42:21] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[03:42:23] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226915 (Peachey88)
[03:42:25] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226916 (Peachey88)
[03:42:27] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226913 (Peachey88)
[03:42:29] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226912 (Peachey88)
[04:29:00] <wikibugs>	 (CR) KartikMistry: [C: +1] "> > This can wait until July 11." [mediawiki-config] - https://gerrit.wikimedia.org/r/518260 (https://phabricator.wikimedia.org/T225398) (owner: Petar.petkovic)
[05:18:36] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226917 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[05:18:39] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226917 (ops-monitoring-bot)
[05:36:08] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226917 (Peachey88)
[05:36:10] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[06:10:53] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226919 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[06:10:57] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226919 (ops-monitoring-bot)
[06:17:19] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226919 (Peachey88)
[06:17:21] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[06:28:37] <icinga-wm>	 PROBLEM - puppet last run on elastic1045 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[06:28:45] <wikibugs>	 (CR) Urbanecm: [C: +1] "> Patch Set 1:" [mediawiki-config] - https://gerrit.wikimedia.org/r/518260 (https://phabricator.wikimedia.org/T225398) (owner: Petar.petkovic)
[06:33:17] <icinga-wm>	 PROBLEM - puppet last run on analytics1048 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 7 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/R/update-library.R]
[06:55:05] <icinga-wm>	 RECOVERY - puppet last run on analytics1048 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[07:01:20] <icinga-wm>	 RECOVERY - puppet last run on elastic1045 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures
[07:05:23] <Urbanecm>	 !log Remove 2FA from User:SQL (T226918)
[07:05:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:05:29] <stashbot>	 T226918: Disable two factor authentication for Wikimedia account "SQL" - https://phabricator.wikimedia.org/T226918
[07:06:13] <SQL>	 <3
[07:25:01] <icinga-wm>	 PROBLEM - puppet last run on multatuli is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[07:52:19] <icinga-wm>	 RECOVERY - puppet last run on multatuli is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[08:08:22] <wikibugs>	 Operations, Wikimedia-Mailing-lists, Space (Jan-Mar-2020): Integrate mailing lists in Wikimedia Space - https://phabricator.wikimedia.org/T226727 (Tgr) Note that deleting a thread and purging it from storage are different things. The latter will be needed here for compliance with the data retention p...
[08:37:18] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226921 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[08:37:21] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226921 (ops-monitoring-bot)
[08:48:31] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226921 (Peachey88)
[08:48:34] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[09:08:41] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226923 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[09:08:44] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226923 (ops-monitoring-bot)
[09:11:36] <thedj>	 like most places, shit gets f'up when people start polarizing into bad/
[09:11:53] <thedj>	 ugh. wwe
[09:19:00] <wikibugs>	 (CR) Gergő Tisza: Create new http://www.mediawiki.org/xml/sitelist-1.1/ to reference sitelist-1.1.xsd (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/508130 (https://phabricator.wikimedia.org/T222516) (owner: Luca Mauri)
[09:40:04] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226924 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[09:40:06] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226924 (ops-monitoring-bot)
[10:45:50] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226923 (Reedy)
[10:45:52] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Reedy)
[10:46:08] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226924 (Reedy)
[10:46:10] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Reedy)
[10:47:11] <wikibugs>	 Operations, Icinga, observability: ops-monitoring-bot creating dupes - https://phabricator.wikimedia.org/T226908 (Reedy) >>! In T226908#5294154, @Peachey88 wrote: > Is this actually a problem with the bot, or with how acknowledgements are working in icinga?  Pass. The user facing "issue" is the dupli...
[11:56:03] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226933 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[11:56:06] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226933 (ops-monitoring-bot)
[11:57:21] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226933 (Peachey88)
[11:57:23] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[12:37:54] <icinga-wm>	 ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T226936 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[12:37:59] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226936 (ops-monitoring-bot)
[12:49:10] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[12:49:12] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226936 (Peachey88)
[12:49:57] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[12:49:59] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226936 (Peachey88)
[12:50:06] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T226936 (Peachey88)
[12:50:09] <wikibugs>	 Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T224794 (Peachey88)
[13:06:24] <wikibugs>	 Operations, Icinga, observability: ops-monitoring-bot creating dupes - https://phabricator.wikimedia.org/T226908 (Volans) Sorry for the spam. My guess is that the check is flapping between critical and unknown. The script ignores the unknowns but it doesn't know if there is already a task opened (lon...
[14:33:53] <wikibugs>	 Operations, media-storage: Not possible to server-side upload certain images - https://phabricator.wikimedia.org/T226937 (Urbanecm)
[14:34:22] <wikibugs>	 Operations, media-storage: Not possible to server-side upload certain images - https://phabricator.wikimedia.org/T226937 (Urbanecm) Tagging @Reedy, who had problems with this too in T226845.
[14:40:00] <icinga-wm>	 PROBLEM - puppet last run on bast2002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[15:07:20] <icinga-wm>	 RECOVERY - puppet last run on bast2002 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[17:21:53] <wikibugs>	 Operations, LDAP-Access-Requests: Grant WMDE engineers access to logstash and creating grafana boards  / Add WMDE engineers to 'nda' LDAP group - https://phabricator.wikimedia.org/T225004 (Addshore) >>! In T225004#5291475, @MoritzMuehlenhoff wrote: > we have two ways to approach this: If you specifically...
[17:38:56] <icinga-wm>	 RECOVERY - Router interfaces on cr2-eqord is OK: OK: host 208.80.154.198, interfaces up: 55, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[17:39:34] <icinga-wm>	 RECOVERY - Router interfaces on cr3-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 68, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[17:43:46] <icinga-wm>	 PROBLEM - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is CRITICAL: CRITICAL - failed 36 probes of 433 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[17:45:34] <wikibugs>	 (CR) ArielGlenn: dumpwikidatajson: Fix error code detection (1 comment) [puppet] - https://gerrit.wikimedia.org/r/519494 (https://phabricator.wikimedia.org/T226601) (owner: Hoo man)
[17:49:11] <wikibugs>	 Operations, Traffic, Wikidata, Wikidata-Query-Service, and 2 others: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Addshore) I guess this will eventually be in wdqs 0.3.3 ?
[17:49:14] <icinga-wm>	 RECOVERY - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is OK: OK - failed 19 probes of 433 (alerts on 35) - https://atlas.ripe.net/measurements/1791309/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts
[18:21:50] <wikibugs>	 Operations, Wikidata, wikidata-tech-focus: Move dispatching of wikidata to a dedicated node - https://phabricator.wikimedia.org/T193733 (Addshore) >>! In T193733#5276659, @Ladsgroup wrote: > So I like to just drop the whole thing but first we need to address {T220696} which enables us to make all the...
[18:55:12] <icinga-wm>	 PROBLEM - puppet last run on relforge1001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[19:22:28] <icinga-wm>	 RECOVERY - puppet last run on relforge1001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[20:59:17] <wikibugs>	 Operations, media-storage: Not possible to server-side upload certain images - https://phabricator.wikimedia.org/T226937 (Urbanecm) ` [urbanecm@mwmaint1002 T223052-upload2]$ ls Hurtigruten.05.11.1920x1080.NRK2.webm [urbanecm@mwmaint1002 T223052-upload2]$ mwscript importImages.php --wiki=commonswiki --use...
[21:08:27] <wikibugs>	 Operations, media-storage: Not possible to server-side upload certain images: "An unknown error occurred in storage backend "local-swift-eqiad"" - https://phabricator.wikimedia.org/T226937 (Aklapper)
[21:14:30] <icinga-wm>	 PROBLEM - Check the Netbox report-s- puppetdb for fail status. on netmon1002 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[21:16:27] <Oshwah>	 jbond42|away: Are you around? I need to PM you or a en-wiki sysadmin...
[21:23:17] <wikibugs>	 (PS1) Aaron Schulz: Update my obsolete YubiKey-stored SSH keys [puppet] - https://gerrit.wikimedia.org/r/519941
[22:02:49] <wikibugs>	 Operations, Traffic, Wikidata, Wikidata-Query-Service, and 2 others: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) Eventually, yes.
[22:24:42] <icinga-wm>	 PROBLEM - puppet last run on dns2001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[22:25:32] <icinga-wm>	 PROBLEM - Check the Netbox report-s- puppetdb for fail status. on netmon1002 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[22:51:58] <icinga-wm>	 RECOVERY - puppet last run on dns2001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[23:24:35] <wikibugs>	 (PS1) Urbanecm: Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/519945
[23:24:37] <wikibugs>	 (CR) Urbanecm: [C: +2] Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/519945 (owner: Urbanecm)
[23:25:38] <wikibugs>	 (Merged) jenkins-bot: Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/519945 (owner: Urbanecm)
[23:26:20] <wikibugs>	 (CR) jenkins-bot: Update interwiki cache [mediawiki-config] - https://gerrit.wikimedia.org/r/519945 (owner: Urbanecm)
[23:27:08] <logmsgbot>	 !log urbanecm@deploy1001 Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 45s)
[23:27:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:49:36] <Reedy>	 Urbanecm: Um
[23:49:43] <Reedy>	 Why are you merging and deploying no-ops?