[00:01:48] (PS1) Dzahn: Revert "testreduce: use regular package{} instead of require_package" [puppet] - https://gerrit.wikimedia.org/r/482390
[00:02:42] (CR) Dzahn: [C: +2] "Duplicate declaration: Package[nodejs] is already declared in file /etc/puppet/modules/visualdiff/manifests/init.pp:16; cannot redeclare a" [puppet] - https://gerrit.wikimedia.org/r/482390 (owner: Dzahn)
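The quoted failure is Puppet's standard duplicate-resource error: two modules on the same node each declare Package[nodejs] as a plain resource, and catalog compilation aborts at the second declaration. A minimal sketch of the failure mode, with class bodies invented for illustration (the real manifests are in the linked change):

    # Illustrative classes only; the production manifests differ.
    class visualdiff {
      package { 'nodejs': ensure => present }
    }

    class testreduce {
      # A second plain declaration aborts compilation on any node that
      # includes both classes:
      #   Duplicate declaration: Package[nodejs] is already declared ...
      # package { 'nodejs': ensure => present }

      # stdlib's ensure_packages() declares the package only when it is
      # not already defined, so several callers can share it -- but only
      # reliably when every caller uses the same mechanism, which is why
      # the revert above restores the previous require_package style.
      ensure_packages(['nodejs'])
    }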
[00:24:55] Operations, uprightdiff, Parsoid-Tests: stretch version of uprightdiff package - https://phabricator.wikimedia.org/T212987 (Dzahn)
[00:25:49] Operations, uprightdiff, Parsoid-Tests: stretch version of uprightdiff package - https://phabricator.wikimedia.org/T212987 (Dzahn)
[00:25:53] Operations, Parsoid, Patch-For-Review: rack/setup/install scandium.eqiad.wmnet (parsoid test box) - https://phabricator.wikimedia.org/T201366 (Dzahn)
[00:26:49] Operations, Parsoid, Patch-For-Review: rack/setup/install scandium.eqiad.wmnet (parsoid test box) - https://phabricator.wikimedia.org/T201366 (Dzahn) some issues solved (no more broken packages, icinga happy), but blocked on T212987 and still has a dependency issue with apt::pin
[00:27:06] Operations, Parsoid, Patch-For-Review: rack/setup/install scandium.eqiad.wmnet (parsoid test box) - https://phabricator.wikimedia.org/T201366 (Dzahn) a: Dzahn
[00:31:22] Operations, uprightdiff, Parsoid-Tests: stretch version of uprightdiff package - https://phabricator.wikimedia.org/T212987 (ssastry) @Legoktm In case you can help with this packaging of uprightdiff.
[00:57:14] (CR) Bstorm: wmcs::nfs::misc - Refactor into profile/role (1 comment) [puppet] - https://gerrit.wikimedia.org/r/482051 (https://phabricator.wikimedia.org/T209527) (owner: GTirloni)
[01:12:38] (CR) Bstorm: "> Aren't these packages really universal and not tied to any particular distribution? Does the distribution really mean anything to us at " [software/tools-manifest] - https://gerrit.wikimedia.org/r/479181 (https://phabricator.wikimedia.org/T107878) (owner: GTirloni)
[01:15:36] (CR) Paladox: "This is what it will look like: https://phabricator.wikimedia.org/F27793705" [puppet] - https://gerrit.wikimedia.org/r/482379 (owner: Paladox)
[01:15:55] (CR) Bstorm: [C: +1] "So yeah, presuming we can eliminate some confusion by using "unstable", I like this change :)" [software/tools-manifest] - https://gerrit.wikimedia.org/r/479181 (https://phabricator.wikimedia.org/T107878) (owner: GTirloni)
[02:03:20] (CR) Krinkle: "The green conflicts with the logo, making the text hard to read and the logo no longer clearly recognisable. The text should probably dark" [puppet] - https://gerrit.wikimedia.org/r/482379 (owner: Paladox)
[02:04:40] (CR) Krinkle: "(or use an all-white version of the logo and make the green darker still)." [puppet] - https://gerrit.wikimedia.org/r/482379 (owner: Paladox)
[02:57:57] Operations, Phabricator, Release-Engineering-Team: Convert Phabricator mail config to use cluster.mailers - https://phabricator.wikimedia.org/T212989 (Paladox)
[02:58:26] Operations, Phabricator, Release-Engineering-Team: Convert Phabricator mail config to use cluster.mailers - https://phabricator.wikimedia.org/T212989 (Paladox)
[03:33:07] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 903.38 seconds
[03:42:57] (PS1) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400
[03:49:17] (PS2) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400
[03:49:35] (PS3) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989)
[03:50:23] (CR) jerkins-bot: [V: -1] phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989) (owner: Paladox)
[03:50:29] (PS4) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989)
[03:51:33] (PS5) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989)
[03:52:07] (PS6) Paladox: phabricator: Migrate mail config to cluster.mailers [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989)
[03:52:12] (CR) Paladox: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989) (owner: Paladox)
[03:55:41] (CR) Paladox: "@20after4 would you be able to review this please? I believe we have to do the same for incoming emails too (though I'm not sure how to do t" [puppet] - https://gerrit.wikimedia.org/r/482400 (https://phabricator.wikimedia.org/T212989) (owner: Paladox)
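For background, Phabricator's cluster.mailers option replaces the older metamta.* settings with a single ordered list of mailer definitions, each carrying a key, a type, and type-specific options. A hedged sketch of how a manifest might render that structure into Phabricator's local.json; the file path and the to_json_pretty() function (from puppetlabs-stdlib) are assumptions, not the actual production phabricator module:

    # Sketch only: the real module wires this through its own parameters.
    $mailers = [
      {
        'key'     => 'wikimedia-smtp',   # hypothetical mailer name
        'type'    => 'smtp',
        'options' => {
          'host' => 'localhost',
          'port' => 25,
        },
      },
    ]

    file { '/srv/phab/phabricator/conf/local/local.json':
      ensure  => file,
      content => to_json_pretty({ 'cluster.mailers' => $mailers }),
    }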
[04:02:19] PROBLEM - Device not healthy -SMART- on helium is CRITICAL: cluster=misc device=megaraid,6 instance=helium:9100 job=node site=eqiad https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=helium&var-datasource=eqiad%2520prometheus%252Fops
[04:21:55] Operations, uprightdiff, Parsoid-Tests: stretch version of uprightdiff package - https://phabricator.wikimedia.org/T212987 (Legoktm) I uploaded uprightdiff to stretch-backports, it'll take a few days for it to get reviewed by the backports FTP masters, if that's not an issue (otherwise I can get a ve...
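While the backport waits in the review queue, the apt::pin dependency issue mentioned for scandium above is the usual prerequisite for this kind of install: a host only prefers a stretch-backports build if a pin raises its priority. A minimal sketch using the puppetlabs-apt interface (Wikimedia's puppet tree has its own apt::pin define, so treat the parameter names as assumptions):

    # Illustrative pin; production uses Wikimedia's own apt module.
    apt::pin { 'uprightdiff':
      packages => 'uprightdiff',
      release  => 'stretch-backports',
      priority => 1001,  # raised so the backports build is preferred
    }

    package { 'uprightdiff':
      ensure  => present,
      require => Apt::Pin['uprightdiff'],
    }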
[04:22:01] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 140.89 seconds
[04:49:23] PROBLEM - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded)
[04:49:30] Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T212990 (ops-monitoring-bot)
[06:28:22] PROBLEM - Check systemd state on netmon1002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[06:28:56] PROBLEM - puppet last run on authdns2001 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/nrpe_check_systemd_unit_state]
[06:29:12] PROBLEM - netbox HTTPS on netmon1002 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 547 bytes in 0.008 second response time
[06:31:30] PROBLEM - puppet last run on labstore1003 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/profile.d/mysql-ps1.sh]
[06:57:32] RECOVERY - puppet last run on labstore1003 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[07:00:12] RECOVERY - puppet last run on authdns2001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[07:07:28] RECOVERY - Check systemd state on netmon1002 is OK: OK - running: The system is fully operational
[07:11:06] PROBLEM - Check systemd state on netmon1002 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[07:31:22] RECOVERY - netbox HTTPS on netmon1002 is OK: HTTP OK: HTTP/1.1 302 Found - 348 bytes in 0.559 second response time
[07:31:37] a puppet run fixed it, it is the recurrent log rotation segfault --^
[07:31:48] RECOVERY - Check systemd state on netmon1002 is OK: OK - running: The system is fully operational
[07:37:06] (PS1) Elukey: Decommission two Hadoop worker nodes from the Analytics cluster [puppet] - https://gerrit.wikimedia.org/r/482401 (https://phabricator.wikimedia.org/T209929)
[07:38:09] (CR) Elukey: [C: +2] Decommission two Hadoop worker nodes from the Analytics cluster [puppet] - https://gerrit.wikimedia.org/r/482401 (https://phabricator.wikimedia.org/T209929) (owner: Elukey)
[07:40:09] Operations, ops-eqiad: Degraded RAID on helium - https://phabricator.wikimedia.org/T212990 (elukey) p: Triage→High a: Cmjohnson
[07:41:34] ACKNOWLEDGEMENT - Device not healthy -SMART- on helium is CRITICAL: cluster=misc device=megaraid,6 instance=helium:9100 job=node site=eqiad Elukey T212990 https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=helium&var-datasource=eqiad%2520prometheus%252Fops
[07:41:34] ACKNOWLEDGEMENT - MegaRAID on helium is CRITICAL: CRITICAL: 1 failed LD(s) (Partially Degraded) Elukey T212990
[11:44:13] Operations, monitoring: Degraded RAID alert not acking notifications - https://phabricator.wikimedia.org/T212969 (Volans) a: Volans
[11:48:38] (PS1) Volans: icinga: raid_handler fix path to command file [puppet] - https://gerrit.wikimedia.org/r/482403 (https://phabricator.wikimedia.org/T212969)
[11:49:08] Operations, monitoring, Patch-For-Review: Degraded RAID alert not acking notifications - https://phabricator.wikimedia.org/T212969 (Volans) The raid handler had the old command file path, which was valid in jessie.
[11:50:09] (CR) Volans: [C: +2] icinga: raid_handler fix path to command file [puppet] - https://gerrit.wikimedia.org/r/482403 (https://phabricator.wikimedia.org/T212969) (owner: Volans)
[11:58:06] Operations, monitoring, Patch-For-Review: Degraded RAID alert not acking notifications - https://phabricator.wikimedia.org/T212969 (Volans) Open→Resolved Patch deployed, resolving for now. Please re-open if that doesn't fix it.
[12:10:21] (CR) Volans: "> Patch Set 1:" (2 comments) [dns] - https://gerrit.wikimedia.org/r/481833 (owner: Volans)
[13:56:02] Hello. This URL doesn't load for me. It stops sending more data after a while, until it times out: https://dpaste.de/eXcB (I put it on dpaste because it's long)
[13:58:11] https://dpaste.de/kreZ/raw
[14:00:05] Well, it works now, maybe a temporary hiccup
[15:02:42] Operations, Phabricator, Release-Engineering-Team, Patch-For-Review: Convert Phabricator mail config to use cluster.mailers - https://phabricator.wikimedia.org/T212989 (Paladox) they now include "void-recipient@" in the email https://github.com/phacility/phabricator/blob/73e3057c52f46ec6d...
[16:17:43] Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban): Update label and switch to rename labvirt1013 to cloudvirt1013 - https://phabricator.wikimedia.org/T212522 (Andrew) a: Andrew→Cmjohnson
[18:26:30] (CR) Paladox: "> (or use an all-white version of the logo and make the green darker" [puppet] - https://gerrit.wikimedia.org/r/482379 (owner: Paladox)
[19:01:16] Operations, Core Platform Team (PHP7 (TEC4)), Core Platform Team Kanban (Doing), HHVM, and 3 others: Migrate to PHP 7 in WMF production - https://phabricator.wikimedia.org/T176370 (Reedy) Luasandbox seems to be segfaulting on vagrant... dunno if it’s more widely replicable yet, or applicable to p...
[19:48:24] PROBLEM - Disk space on analytics-tool1002 is CRITICAL: DISK CRITICAL - free space: / 721 MB (3% inode=89%)
[20:21:21] ouch
[20:21:27] checking an-tool1002
[20:22:30] RECOVERY - Disk space on analytics-tool1002 is OK: DISK OK
[20:23:31] !log manual clean-up of big logs under /var/log/..
[20:23:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:23:37] sigh
[20:23:43] * elukey amends the sal
[23:51:02] (PS1) BryanDavis: toolforge: Add missing php packages [puppet] - https://gerrit.wikimedia.org/r/482481
[23:52:37] (CR) BryanDavis: toolforge: Add missing php packages (1 comment) [puppet] - https://gerrit.wikimedia.org/r/482481 (owner: BryanDavis)
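The closing patch is the routine counterpart to the morning's duplicate-declaration revert: adding packages to a shared Toolforge manifest. A short sketch of the pattern, with the class name and package list invented for illustration (the real set is in the linked review):

    # Hypothetical class and packages; see the Gerrit change for the real set.
    class toolforge::php_packages {
      ensure_packages([
        'php-curl',
        'php-mbstring',
        'php-xml',
      ])
    }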