[01:36:01] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232834 (Matthewrbowker)
[02:24:10] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232834 (Krenair) While I was an agent I wrote scripts for the admins to turn half of that process into a simple form. What happened with that?  I also don't understand why you're proposing setting u...
[02:26:22] <icinga-wm>	 PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:03] <icinga-wm>	 PROBLEM - salt-minion processes on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:14] <icinga-wm>	 PROBLEM - DPKG on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:34] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:41] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232878 (Matthewrbowker) >>! In T133476#2232876, @Krenair wrote: > While I was an agent I wrote scripts for the admins to turn half of that process into a simple form. What happened with that?  Erm.....
[02:27:44] <icinga-wm>	 PROBLEM - SSH on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:27:52] <icinga-wm>	 PROBLEM - configured eth on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:52] <icinga-wm>	 PROBLEM - nutcracker process on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:27:53] <icinga-wm>	 PROBLEM - HHVM processes on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:28:33] <icinga-wm>	 PROBLEM - dhclient process on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:28:52] <icinga-wm>	 PROBLEM - nutcracker port on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:31:53] <icinga-wm>	 PROBLEM - Disk space on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:33:03] <icinga-wm>	 RECOVERY - nutcracker port on mw1142 is OK: TCP OK - 0.000 second response time on port 11212
[02:33:22] <icinga-wm>	 RECOVERY - salt-minion processes on mw1142 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[02:33:23] <icinga-wm>	 RECOVERY - DPKG on mw1142 is OK: All packages OK
[02:33:43] <icinga-wm>	 RECOVERY - Disk space on mw1142 is OK: DISK OK
[02:33:44] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1142 is OK: OK: nf_conntrack is 0 % full
[02:33:54] <icinga-wm>	 RECOVERY - SSH on mw1142 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[02:34:02] <icinga-wm>	 RECOVERY - nutcracker process on mw1142 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[02:34:02] <icinga-wm>	 RECOVERY - configured eth on mw1142 is OK: OK - interfaces up
[02:34:03] <icinga-wm>	 RECOVERY - HHVM processes on mw1142 is OK: PROCS OK: 6 processes with command name hhvm
[02:34:27] <grrrit-wm>	 (PS2) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[02:34:29] <grrrit-wm>	 (PS1) Yuvipanda: uwsgi: Allow specifying multiple values for keys [puppet] - https://gerrit.wikimedia.org/r/285053
[02:34:33] <icinga-wm>	 RECOVERY - RAID on mw1142 is OK: OK: no RAID installed
[02:34:43] <icinga-wm>	 RECOVERY - dhclient process on mw1142 is OK: PROCS OK: 0 processes with command name dhclient
[02:35:40] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[02:39:14] <grrrit-wm>	 (PS3) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[02:40:14] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[02:41:34] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232879 (Krenair) >>! In T133476#2232878, @Matthewrbowker wrote: >>>! In T133476#2232876, @Krenair wrote: >> While I was an agent I wrote scripts for the admins to turn half of that process into a si...
[02:44:02] <grrrit-wm>	 (PS4) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[02:45:01] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[02:45:58] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232880 (Krenair) So I guess the tricky part would be finding a sane way for OTRS admins to control the LDAP groups. We'd presumably want it integrated with either OTRS (sounds from "This module has...
[02:46:54] <icinga-wm>	 PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:47:03] <icinga-wm>	 PROBLEM - dhclient process on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[02:48:53] <icinga-wm>	 RECOVERY - RAID on mw1142 is OK: OK: no RAID installed
[02:49:02] <icinga-wm>	 RECOVERY - dhclient process on mw1142 is OK: PROCS OK: 0 processes with command name dhclient
[02:52:45] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232883 (Krenair) And I suppose an argument in favour of a new separate LDAP system would be the existing OTRS/OTRS wiki users conflicting with the existing LDAP users - although maybe we could put O...
[03:00:10] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232886 (Matthewrbowker) >>! In T133476#2232879, @Krenair wrote: >>>! In T133476#2232878, @Matthewrbowker wrote: >>>>! In T133476#2232876, @Krenair wrote: >>> While I was an agent I wrote scripts for...
[03:07:26] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232888 (Krenair) >>! In T133476#2232886, @Matthewrbowker wrote: >>>! In T133476#2232879, @Krenair wrote: >>>>! In T133476#2232878, @Matthewrbowker wrote: >>>>>! In T133476#2232876, @Krenair wrote: >...
[03:09:03] <grrrit-wm>	 (PS5) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[03:09:06] <grrrit-wm>	 (PS2) Yuvipanda: uwsgi: Allow specifying multiple values for keys [puppet] - https://gerrit.wikimedia.org/r/285053
[03:10:30] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[03:12:06] <grrrit-wm>	 (PS6) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[03:12:07] <grrrit-wm>	 (PS3) Yuvipanda: uwsgi: Allow specifying multiple values for keys [puppet] - https://gerrit.wikimedia.org/r/285053
[03:12:51] <wikibugs>	 Operations, Labs, Tool-Labs, Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2232903 (scfc) @Magnus: https://petscan.wmflabs.org/ seems to work fine.  Did you mean something else?
[03:13:20] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[03:16:12] <icinga-wm>	 RECOVERY - Apache HTTP on mw1142 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.066 second response time
[03:17:15] <icinga-wm>	 RECOVERY - HHVM rendering on mw1142 is OK: HTTP OK: HTTP/1.1 200 OK - 64810 bytes in 0.120 second response time
[03:18:03] <icinga-wm>	 RECOVERY - puppet last run on mw1142 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[03:22:43] <grrrit-wm>	 (PS4) Yuvipanda: uwsgi: Allow specifying multiple values for keys [puppet] - https://gerrit.wikimedia.org/r/285053
[03:23:40] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232904 (Matthewrbowker) >>! In T133476#2232888, @Krenair wrote: >>>! In T133476#2232886, @Matthewrbowker wrote: >>>>! In T133476#2232879, @Krenair wrote: >>>>>! In T133476#2232878, @Matthewrbowker w...
[03:26:13] <grrrit-wm>	 (CR) Yuvipanda: [C: 2] uwsgi: Allow specifying multiple values for keys [puppet] - https://gerrit.wikimedia.org/r/285053 (owner: Yuvipanda)
[03:31:01] <grrrit-wm>	 (CR) Ori.livneh: "Nice :)" [puppet] - https://gerrit.wikimedia.org/r/285053 (owner: Yuvipanda)
[03:34:07] <YuviPanda>	 ori: :D
[03:35:14] <icinga-wm>	 PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:35:23] <icinga-wm>	 PROBLEM - puppet last run on analytics1037 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:35:52] <icinga-wm>	 PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:36:32] <icinga-wm>	 PROBLEM - puppet last run on elastic1028 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:36:52] <icinga-wm>	 PROBLEM - puppet last run on mw1192 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:37:36] <grrrit-wm>	 (CR) Tim Landscheidt: "The template uses the variable @ldapconfig that is set to the class parameter $ldap::role::config::labs::ldapconfig's value in various pla" [puppet] - https://gerrit.wikimedia.org/r/279682 (owner: Dzahn)
[03:43:03] <wikibugs>	 Operations, DBA, Labs, Tracking: (Tracking) Database replication services - https://phabricator.wikimedia.org/T50930#2232909 (scfc)
[03:45:42] <icinga-wm>	 PROBLEM - Disk space on ms-be2007 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdg1 is not accessible: Input/output error
[03:45:44] <icinga-wm>	 PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet).
[03:46:03] <icinga-wm>	 PROBLEM - RAID on ms-be2007 is CRITICAL: CRITICAL: 1 failed LD(s) (Offline)
[03:46:54] <icinga-wm>	 PROBLEM - puppet last run on ms-be2007 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:49:11] <grrrit-wm>	 (PS7) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[03:49:13] <grrrit-wm>	 (PS1) Yuvipanda: uwsgi: Always die on term! [puppet] - https://gerrit.wikimedia.org/r/285054
[03:50:32] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[03:50:51] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] uwsgi: Always die on term! [puppet] - https://gerrit.wikimedia.org/r/285054 (owner: Yuvipanda)
[03:51:06] <grrrit-wm>	 (CR) Tim Landscheidt: [C: 1] toollabs: flake8 [puppet] - https://gerrit.wikimedia.org/r/283664 (owner: Ladsgroup)
[03:52:24] <YuviPanda>	 Let's pretend I went on a tirade now against aligning arrows
[03:54:11] <grrrit-wm>	 (PS8) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[03:54:13] <grrrit-wm>	 (PS2) Yuvipanda: uwsgi: Always die on term! [puppet] - https://gerrit.wikimedia.org/r/285054
[03:54:56] <grrrit-wm>	 (PS2) Yuvipanda: toollabs: flake8 [puppet] - https://gerrit.wikimedia.org/r/283664 (owner: Ladsgroup)
[03:55:08] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] "Thank you for the patch!" [puppet] - https://gerrit.wikimedia.org/r/283664 (owner: Ladsgroup)
[03:55:26] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[03:57:02] <grrrit-wm>	 (PS9) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[04:01:13] <icinga-wm>	 RECOVERY - puppet last run on elastic1028 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:01:34] <icinga-wm>	 RECOVERY - puppet last run on mw1192 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[04:02:03] <icinga-wm>	 RECOVERY - puppet last run on mw1096 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:02:13] <icinga-wm>	 RECOVERY - Disk space on ms-be2007 is OK: DISK OK
[04:02:22] <icinga-wm>	 RECOVERY - puppet last run on analytics1037 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:02:42] <icinga-wm>	 RECOVERY - puppet last run on analytics1039 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:06:15] <grrrit-wm>	 (PS10) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[04:07:21] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[04:08:33] <icinga-wm>	 RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge.
[04:09:18] <grrrit-wm>	 (PS11) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[04:09:45] <grrrit-wm>	 (PS3) Yuvipanda: uwsgi: Always die on term! [puppet] - https://gerrit.wikimedia.org/r/285054
[04:09:52] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] uwsgi: Always die on term! [puppet] - https://gerrit.wikimedia.org/r/285054 (owner: Yuvipanda)
[04:13:11] <grrrit-wm>	 (CR) Tim Landscheidt: "@Dzahn: Did you mean to abandon this change?" [puppet] - https://gerrit.wikimedia.org/r/271735 (owner: Dzahn)
[04:22:04] <icinga-wm>	 PROBLEM - puppet last run on mw2200 is CRITICAL: CRITICAL: puppet fail
[04:50:42] <icinga-wm>	 RECOVERY - puppet last run on mw2200 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[05:15:06] <wikibugs>	 Operations, OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2232933 (Krd) Strong oppose. No need, more problems created than resolved, if any resolved at all. Additionally, no prior discussion has taken place at the appropriate venue, which would have been the...
[05:32:50] <grrrit-wm>	 (PS12) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[05:33:57] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[05:42:58] <grrrit-wm>	 (PS13) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[05:44:00] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412) (owner: Yuvipanda)
[05:45:38] <grrrit-wm>	 (PS14) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[06:31:31] <icinga-wm>	 PROBLEM - puppet last run on cp2001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:31] <icinga-wm>	 PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: Puppet has 2 failures
[06:34:41] <icinga-wm>	 PROBLEM - puppet last run on mw2207 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:11] <icinga-wm>	 PROBLEM - puppet last run on mw2045 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:11] <icinga-wm>	 PROBLEM - puppet last run on cp3036 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:31] <icinga-wm>	 PROBLEM - puppet last run on mw2050 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:36:40] <grrrit-wm>	 (PS15) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[06:36:42] <grrrit-wm>	 (PS1) Yuvipanda: quarry: Make Quarry HTTPS only [puppet] - https://gerrit.wikimedia.org/r/285057 (https://phabricator.wikimedia.org/T107627)
[06:37:34] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] quarry: Make Quarry HTTPS only [puppet] - https://gerrit.wikimedia.org/r/285057 (https://phabricator.wikimedia.org/T107627) (owner: Yuvipanda)
[06:50:58] <grrrit-wm>	 (PS16) Yuvipanda: [WIP] MySQL backend for storing roles / hiera data for labs [puppet] - https://gerrit.wikimedia.org/r/285014 (https://phabricator.wikimedia.org/T133412)
[06:51:00] <grrrit-wm>	 (PS1) Yuvipanda: quarry: Enforce https only at nginx level [puppet] - https://gerrit.wikimedia.org/r/285058 (https://phabricator.wikimedia.org/T107627)
[06:51:02] <grrrit-wm>	 (PS1) Yuvipanda: extdist: Enforce HTTPS [puppet] - https://gerrit.wikimedia.org/r/285059 (https://phabricator.wikimedia.org/T133484)
[06:52:12] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] quarry: Enforce https only at nginx level [puppet] - https://gerrit.wikimedia.org/r/285058 (https://phabricator.wikimedia.org/T107627) (owner: Yuvipanda)
[06:52:25] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] extdist: Enforce HTTPS [puppet] - https://gerrit.wikimedia.org/r/285059 (https://phabricator.wikimedia.org/T133484) (owner: Yuvipanda)
[06:56:11] <icinga-wm>	 PROBLEM - puppet last run on analytics1044 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:33] <wikibugs>	 Operations, Labs, Labs-Infrastructure, Quarry, and 3 others: Quarry should be HTTPS-only - https://phabricator.wikimedia.org/T107627#2233035 (yuvipanda) It is now!
[06:56:39] <wikibugs>	 Operations, Labs, Labs-Infrastructure, Quarry, and 3 others: Quarry should be HTTPS-only - https://phabricator.wikimedia.org/T107627#2233036 (yuvipanda) Open>Resolved a:yuvipanda
[06:57:11] <icinga-wm>	 RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:12] <icinga-wm>	 RECOVERY - puppet last run on mw2207 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[06:57:41] <icinga-wm>	 RECOVERY - puppet last run on mw2045 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:50] <icinga-wm>	 RECOVERY - puppet last run on cp3036 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:58:10] <icinga-wm>	 RECOVERY - puppet last run on mw2050 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:12] <icinga-wm>	 RECOVERY - puppet last run on cp2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:00:21] <icinga-wm>	 PROBLEM - puppet last run on mw2208 is CRITICAL: CRITICAL: Puppet has 1 failures
[07:18:35] <icinga-wm>	 PROBLEM - puppet last run on mw1140 is CRITICAL: CRITICAL: Puppet has 55 failures
[07:21:17] <icinga-wm>	 RECOVERY - puppet last run on analytics1044 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[07:25:26] <icinga-wm>	 RECOVERY - puppet last run on mw2208 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[07:45:33] <icinga-wm>	 PROBLEM - Apache HTTP on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:47:14] <icinga-wm>	 PROBLEM - HHVM rendering on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:47:43] <icinga-wm>	 PROBLEM - nutcracker process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:47:44] <icinga-wm>	 PROBLEM - salt-minion processes on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:47:44] <icinga-wm>	 PROBLEM - dhclient process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:47:44] <icinga-wm>	 PROBLEM - DPKG on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:47:54] <icinga-wm>	 PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:48:04] <icinga-wm>	 PROBLEM - Disk space on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:48:23] <icinga-wm>	 PROBLEM - HHVM processes on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:48:30] <grrrit-wm>	 (CR) Dereckson: "We now also need to add flow_computed.dblist introduced in c99dce08" [mediawiki-config] - https://gerrit.wikimedia.org/r/281977 (owner: Dereckson)
[07:48:35] <icinga-wm>	 PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:48:54] <icinga-wm>	 PROBLEM - configured eth on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:49:04] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:49:23] <icinga-wm>	 PROBLEM - nutcracker port on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:50:03] <icinga-wm>	 RECOVERY - Disk space on mw1140 is OK: DISK OK
[07:53:43] <icinga-wm>	 RECOVERY - nutcracker process on mw1140 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[07:53:44] <icinga-wm>	 RECOVERY - dhclient process on mw1140 is OK: PROCS OK: 0 processes with command name dhclient
[07:53:44] <icinga-wm>	 RECOVERY - salt-minion processes on mw1140 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[07:55:54] <icinga-wm>	 PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 73 failures
[07:59:21] <grrrit-wm>	 (PS1) Dereckson: noc: jobqueue-eqiad.php.txt → jobqueue.php [mediawiki-config] - https://gerrit.wikimedia.org/r/285061
[07:59:54] <icinga-wm>	 PROBLEM - nutcracker process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:59:55] <icinga-wm>	 PROBLEM - dhclient process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[07:59:55] <icinga-wm>	 PROBLEM - salt-minion processes on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:00:23] <icinga-wm>	 PROBLEM - Disk space on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:01:04] <grrrit-wm>	 (PS1) Dereckson: noc: PoolCounterSettings-eqiad.php → PoolCounterSettings.php [mediawiki-config] - https://gerrit.wikimedia.org/r/285062 (https://phabricator.wikimedia.org/T133324)
[08:02:15] <grrrit-wm>	 (PS2) Dereckson: Flow dblist on noc.wikimedia.org [mediawiki-config] - https://gerrit.wikimedia.org/r/281977
[08:02:38] <grrrit-wm>	 (CR) Dereckson: "PS2: +flow_computed.dblist" [mediawiki-config] - https://gerrit.wikimedia.org/r/281977 (owner: Dereckson)
[08:04:34] <grrrit-wm>	 (CR) Dereckson: [C: 1] Add *.asc-test.nl to wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/284712 (https://phabricator.wikimedia.org/T133286) (owner: Urbanecm)
[08:12:39] <icinga-wm>	 RECOVERY - RAID on mw1140 is OK: OK: no RAID installed
[08:12:47] <icinga-wm>	 RECOVERY - HHVM processes on mw1140 is OK: PROCS OK: 6 processes with command name hhvm
[08:18:48] <icinga-wm>	 PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:18:58] <icinga-wm>	 PROBLEM - HHVM processes on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[08:37:48] <icinga-wm>	 RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[08:37:48] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1140 is OK: OK: nf_conntrack is 0 % full
[08:37:49] <icinga-wm>	 RECOVERY - nutcracker port on mw1140 is OK: TCP OK - 0.000 second response time on port 11212
[08:37:58] <icinga-wm>	 RECOVERY - DPKG on mw1140 is OK: All packages OK
[08:38:08] <icinga-wm>	 RECOVERY - nutcracker process on mw1140 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[08:38:19] <icinga-wm>	 RECOVERY - Disk space on mw1140 is OK: DISK OK
[08:38:38] <icinga-wm>	 RECOVERY - configured eth on mw1140 is OK: OK - interfaces up
[08:38:47] <icinga-wm>	 RECOVERY - salt-minion processes on mw1140 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[08:39:08] <icinga-wm>	 RECOVERY - RAID on mw1140 is OK: OK: no RAID installed
[08:39:17] <icinga-wm>	 RECOVERY - HHVM processes on mw1140 is OK: PROCS OK: 6 processes with command name hhvm
[08:39:28] <icinga-wm>	 RECOVERY - Apache HTTP on mw1140 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.060 second response time
[08:39:28] <icinga-wm>	 RECOVERY - dhclient process on mw1140 is OK: PROCS OK: 0 processes with command name dhclient
[08:40:38] <icinga-wm>	 RECOVERY - puppet last run on mw1140 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[08:40:38] <icinga-wm>	 RECOVERY - HHVM rendering on mw1140 is OK: HTTP OK: HTTP/1.1 200 OK - 66871 bytes in 0.095 second response time
[09:22:57] <icinga-wm>	 PROBLEM - HHVM rendering on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:24:08] <icinga-wm>	 PROBLEM - Apache HTTP on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:24:37] <icinga-wm>	 PROBLEM - RAID on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:24:37] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:24:38] <icinga-wm>	 PROBLEM - nutcracker process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:24:48] <icinga-wm>	 PROBLEM - nutcracker port on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:24:58] <icinga-wm>	 PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:25:27] <icinga-wm>	 PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:25:28] <icinga-wm>	 PROBLEM - dhclient process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:30:29] <icinga-wm>	 PROBLEM - configured eth on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:30:39] <icinga-wm>	 PROBLEM - HHVM processes on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:32:21] <icinga-wm>	 PROBLEM - puppet last run on mw2120 is CRITICAL: CRITICAL: Puppet has 1 failures
[09:33:12] <icinga-wm>	 PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:34:33] <icinga-wm>	 PROBLEM - salt-minion processes on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[09:49:32] <icinga-wm>	 RECOVERY - nutcracker process on mw1138 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[09:49:33] <icinga-wm>	 RECOVERY - dhclient process on mw1138 is OK: PROCS OK: 0 processes with command name dhclient
[09:49:51] <icinga-wm>	 RECOVERY - DPKG on mw1138 is OK: All packages OK
[09:49:52] <icinga-wm>	 RECOVERY - nutcracker port on mw1138 is OK: TCP OK - 0.000 second response time on port 11212
[09:50:12] <icinga-wm>	 RECOVERY - Disk space on mw1138 is OK: DISK OK
[09:50:32] <icinga-wm>	 RECOVERY - salt-minion processes on mw1138 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[09:50:43] <icinga-wm>	 RECOVERY - HHVM processes on mw1138 is OK: PROCS OK: 6 processes with command name hhvm
[09:51:03] <icinga-wm>	 RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[09:53:21] <icinga-wm>	 RECOVERY - configured eth on mw1138 is OK: OK - interfaces up
[09:53:31] <icinga-wm>	 RECOVERY - RAID on mw1138 is OK: OK: no RAID installed
[09:53:31] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1138 is OK: OK: nf_conntrack is 9 % full
[09:53:31] <icinga-wm>	 RECOVERY - Apache HTTP on mw1138 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.048 second response time
[09:54:03] <icinga-wm>	 RECOVERY - HHVM rendering on mw1138 is OK: HTTP OK: HTTP/1.1 200 OK - 66343 bytes in 0.150 second response time
[09:54:12] <icinga-wm>	 RECOVERY - puppet last run on mw1138 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[09:57:53] <icinga-wm>	 RECOVERY - puppet last run on mw2120 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[10:00:36] <grrrit-wm>	 (PS1) Yuvipanda: tools: Remove unused import [puppet] - https://gerrit.wikimedia.org/r/285065
[10:00:38] <grrrit-wm>	 (PS1) Yuvipanda: toollabs: Check env $PORT before $2 [puppet] - https://gerrit.wikimedia.org/r/285066 (https://phabricator.wikimedia.org/T98442)
[10:01:13] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] tools: Remove unused import [puppet] - https://gerrit.wikimedia.org/r/285065 (owner: Yuvipanda)
[10:10:17] <grrrit-wm>	 (CR) Yuvipanda: [C: 2] toollabs: Check env $PORT before $2 [puppet] - https://gerrit.wikimedia.org/r/285066 (https://phabricator.wikimedia.org/T98442) (owner: Yuvipanda)
[10:53:29] <wikibugs>	 Operations, Labs, Tool-Labs, Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2233119 (Magnus) Wouldn't it be better for http to always redirect to https?
[12:22:23] <wikibugs>	 Operations, Labs, Tool-Labs, Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2073537 (tom29739) I think in Tools the tool never 'sees' http because everything goes through the proxy. So all tools should be compatible if http is...
[13:23:57] <icinga-wm>	 PROBLEM - puppet last run on ms-fe3002 is CRITICAL: CRITICAL: puppet fail
[13:36:57] <wikibugs>	 Operations, Labs, Tool-Labs, Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2233299 (yuvipanda) >>! In T128409#2217073, @valhallasw wrote: >> * naively grep all the projects' PHP and JavaScript code looking for hardcoded http:...
[13:38:09] <grrrit-wm>	 (PS1) Yuvipanda: Revert "dynamicproxy: custom log schema (http/https) for tools" [puppet] - https://gerrit.wikimedia.org/r/285070 (https://phabricator.wikimedia.org/T128409)
[13:38:17] <grrrit-wm>	 (PS2) Yuvipanda: Revert "dynamicproxy: custom log schema (http/https) for tools" [puppet] - https://gerrit.wikimedia.org/r/285070 (https://phabricator.wikimedia.org/T128409)
[13:38:44] <grrrit-wm>	 (CR) Yuvipanda: [C: 2 V: 2] Revert "dynamicproxy: custom log schema (http/https) for tools" [puppet] - https://gerrit.wikimedia.org/r/285070 (https://phabricator.wikimedia.org/T128409) (owner: Yuvipanda)
[13:43:53] <wikibugs>	 Operations, Labs, Tool-Labs, Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2073537 (BBlack) >>! In T128409#2233237, @tom29739 wrote: > In Tools the tool never 'sees' http because everything goes through the proxy:  >>>! In T1...
[13:50:57] <icinga-wm>	 RECOVERY - puppet last run on ms-fe3002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:29:22] <icinga-wm>	 PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:30:23] <icinga-wm>	 PROBLEM - HHVM rendering on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:30:52] <icinga-wm>	 PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:02] <icinga-wm>	 PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:31:03] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:12] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:22] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:23] <icinga-wm>	 PROBLEM - configured eth on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:31:43] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:32:12] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:32:13] <icinga-wm>	 PROBLEM - puppet last run on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:35:04] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[14:35:04] <icinga-wm>	 RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[14:35:04] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[14:35:23] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[14:36:13] <icinga-wm>	 PROBLEM - nutcracker port on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:36:13] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:36:33] <icinga-wm>	 PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:41:12] <icinga-wm>	 PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:41:13] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:41:22] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:41:24] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:55:23] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[14:55:23] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[14:55:23] <icinga-wm>	 RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[14:55:33] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[14:55:34] <icinga-wm>	 RECOVERY - configured eth on mw1135 is OK: OK - interfaces up
[14:55:44] <icinga-wm>	 RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.039 second response time
[14:55:53] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[14:56:13] <icinga-wm>	 RECOVERY - nutcracker port on mw1135 is OK: TCP OK - 0.000 second response time on port 11212
[14:56:23] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1135 is OK: OK: nf_conntrack is 10 % full
[14:56:23] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[14:56:24] <icinga-wm>	 RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 58 minutes ago with 0 failures
[14:56:42] <icinga-wm>	 RECOVERY - DPKG on mw1135 is OK: All packages OK
[14:56:42] <icinga-wm>	 RECOVERY - HHVM rendering on mw1135 is OK: HTTP OK: HTTP/1.1 200 OK - 66355 bytes in 0.099 second response time
[14:57:02] <icinga-wm>	 RECOVERY - RAID on mw1135 is OK: OK: no RAID installed
[15:02:03] <icinga-wm>	 PROBLEM - configured eth on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:02:41] <icinga-wm>	 PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:03:09] <icinga-wm>	 RECOVERY - configured eth on mw1135 is OK: OK - interfaces up
[15:03:40] <icinga-wm>	 PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:04:41] <icinga-wm>	 PROBLEM - HHVM rendering on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:05:19] <icinga-wm>	 PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:05:39] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:05:45] <grrrit-wm>	 (03PS12) 10BBlack: letsencrypt module guts + acme-setup script [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) 
[15:06:09] <icinga-wm>	 PROBLEM - puppet last run on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:06:23] <grrrit-wm>	 (03CR) 10BBlack: letsencrypt module guts + acme-setup script (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) (owner: 10BBlack)
[15:08:22] <wikibugs>	 06Operations, 10OTRS: Proposal: Centralize OTRS login methodology - https://phabricator.wikimedia.org/T133476#2233464 (10Krenair) >>! In T133476#2232933, @Krd wrote: > No need, more problems created than resolved, if any resolved at all.  I don't think you could say no problems resolved until details were confi...
[15:09:12] <wikibugs>	 06Operations, 06Discovery, 10Wikidata, 10Wikidata-Query-Service, 07Varnish: Wikidata Query Service REST endpoint returns truncated results - https://phabricator.wikimedia.org/T133490#2233471 (10Mushroom)
[15:11:18] <wikibugs>	 06Operations, 06Discovery, 10Traffic, 10Wikidata, 10Wikidata-Query-Service: Wikidata Query Service REST endpoint returns truncated results - https://phabricator.wikimedia.org/T133490#2233472 (10BBlack)
[15:13:50] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:13:50] <icinga-wm>	 PROBLEM - nutcracker port on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:00] <icinga-wm>	 PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:16] <wikibugs>	 07Puppet, 10Beta-Cluster-Infrastructure, 06Labs, 13Patch-For-Review: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2233473 (10Krenair) deployment-cache-text04:/etc/puppet/puppet.conf.d/1...
[15:14:21] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:29] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:30] <icinga-wm>	 PROBLEM - configured eth on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:14:36] <grrrit-wm>	 (03PS13) 10BBlack: letsencrypt module guts + acme-setup script [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) 
[15:15:10] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:15:50] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:16:02] <wikibugs>	 07Puppet, 10Beta-Cluster-Infrastructure, 06Labs, 13Patch-For-Review: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2233475 (10Krenair)
[15:16:05] <wikibugs>	 07Puppet, 10Beta-Cluster-Infrastructure, 07Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#2233474 (10Krenair)
[15:21:10] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[15:24:20] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[15:24:20] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[15:27:21] <icinga-wm>	 RECOVERY - RAID on mw1135 is OK: OK: no RAID installed
[15:27:39] <icinga-wm>	 RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[15:27:39] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1135 is OK: OK: nf_conntrack is 2 % full
[15:27:48] <grrrit-wm>	 (03PS14) 10BBlack: letsencrypt module guts + acme-setup script [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) 
[15:27:50] <grrrit-wm>	 (03PS9) 10BBlack: create letsencrypt module, install acme-tiny [puppet] - 10https://gerrit.wikimedia.org/r/283761 (https://phabricator.wikimedia.org/T132812) (owner: 10Dzahn)
[15:27:50] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[15:27:51] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[15:27:51] <icinga-wm>	 RECOVERY - nutcracker port on mw1135 is OK: TCP OK - 0.000 second response time on port 11212
[15:28:00] <icinga-wm>	 RECOVERY - DPKG on mw1135 is OK: All packages OK
[15:28:09] <icinga-wm>	 RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 1 hour ago with 0 failures
[15:28:31] <icinga-wm>	 RECOVERY - configured eth on mw1135 is OK: OK - interfaces up
[15:28:40] <icinga-wm>	 RECOVERY - HHVM rendering on mw1135 is OK: HTTP OK: HTTP/1.1 200 OK - 64354 bytes in 0.081 second response time
[15:29:21] <icinga-wm>	 RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 626 bytes in 0.051 second response time
[15:35:05] <grrrit-wm>	 (03CR) 10BBlack: [C: 031] "acme-setup PS14 variant has had some testing against the acme staging server with jessie nginx config as docced. The bundling and puppeti" [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) (owner: 10BBlack)
[15:36:29] <icinga-wm>	 PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 68 failures
[15:37:29] <icinga-wm>	 PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 648 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5157531 keys - replication_delay is 648
[15:39:29] <icinga-wm>	 RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5139529 keys - replication_delay is 0
[15:40:01] <grrrit-wm>	 (03PS15) 10BBlack: letsencrypt module guts + acme-setup script [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) 
[15:40:27] <grrrit-wm>	 (03CR) 10BBlack: [C: 031] "PS15: pep8 fixups" [puppet] - 10https://gerrit.wikimedia.org/r/283988 (https://phabricator.wikimedia.org/T132812) (owner: 10BBlack)
[15:45:00] <icinga-wm>	 PROBLEM - puppet last run on cp3035 is CRITICAL: CRITICAL: puppet fail
[15:53:57] <grrrit-wm>	 (03PS1) 10BBlack: update-ocsp: time validity check bugfix [puppet] - 10https://gerrit.wikimedia.org/r/285072 
[16:05:11] <wikibugs>	 06Operations, 10Traffic, 10Wikimedia-Stream: stream.wikimedia.org - redirect http(s) to docs - https://phabricator.wikimedia.org/T70528#2233508 (10Krenair) doesn't look like apache is involved in this - see modules/rcstream/templates/rcstream.nginx.erb in the puppet repo
[16:05:37] <wikibugs>	 06Operations, 10Traffic, 10Wikimedia-Stream: stream.wikimedia.org - redirect http(s) to docs - https://phabricator.wikimedia.org/T70528#2233510 (10Krenair) (of course now I see it was me who added that project in the first place, oops...)
[16:06:50] <grrrit-wm>	 (03PS1) 10Reedy: Disable Special:GlobalAllocation as it OOMs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/285073 (https://phabricator.wikimedia.org/T55443) 
[16:11:31] <icinga-wm>	 RECOVERY - puppet last run on cp3035 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:18:26] <wikibugs>	 06Operations, 10Traffic, 10Wikimedia-Stream: stream.wikimedia.org - redirect http(s) to docs - https://phabricator.wikimedia.org/T70528#723098 (10BBlack) See also https://gerrit.wikimedia.org/r/#/c/284760/ pending patch, from trying to fix the /=>404 issue for HTTPS reasons in T132521 (the basic HTTPS issues...
[16:28:31] <icinga-wm>	 PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:28:50] <icinga-wm>	 PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:28:50] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:30:02] <icinga-wm>	 PROBLEM - HHVM rendering on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:30:49] <icinga-wm>	 PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:31:36] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:31:36] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:31:36] <icinga-wm>	 PROBLEM - nutcracker port on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:32:15] <icinga-wm>	 PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:33:15] <icinga-wm>	 PROBLEM - configured eth on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:33:34] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:33:55] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[16:34:24] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[16:34:24] <icinga-wm>	 RECOVERY - nutcracker port on mw1135 is OK: TCP OK - 0.000 second response time on port 11212
[16:35:34] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[16:35:34] <icinga-wm>	 RECOVERY - DPKG on mw1135 is OK: All packages OK
[16:37:52] <grrrit-wm>	 (03CR) 10Awight: [C: 031] "Thank you and apologies for the ongoing fiasco!" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/285073 (https://phabricator.wikimedia.org/T55443) (owner: 10Reedy)
[16:39:55] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:40:24] <icinga-wm>	 PROBLEM - nutcracker port on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:40:25] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:40:25] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:40:44] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:40:50] <grrrit-wm>	 (03PS2) 10Tim Landscheidt: Remove unused import in labs [puppet] - 10https://gerrit.wikimedia.org/r/279896 (owner: 10Ladsgroup)
[16:41:34] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:41:35] <icinga-wm>	 PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[16:42:45] <grrrit-wm>	 (03CR) 10Glaisher: "I'll note that this page does work *sometimes* and is useful when it works. Unless this causes huge issues, I'm not sure if we should be d" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/285073 (https://phabricator.wikimedia.org/T55443) (owner: 10Reedy)
[16:56:24] <icinga-wm>	 RECOVERY - nutcracker port on mw1135 is OK: TCP OK - 0.000 second response time on port 11212
[16:56:24] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[16:56:24] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[16:56:54] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[17:02:43] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:04:03] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:05:32] <icinga-wm>	 PROBLEM - nutcracker port on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:05:44] <icinga-wm>	 PROBLEM - nutcracker process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:13:53] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[17:14:44] <icinga-wm>	 PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 603 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5143313 keys - replication_delay is 603
[17:20:02] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:22:52] <icinga-wm>	 RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5138977 keys - replication_delay is 0
[17:30:13] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[17:36:32] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:43:13] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[17:47:43] <grrrit-wm>	 (03PS1) 10Alex Monk: Get rid of redirects for non-resolving/parked domains [puppet] - 10https://gerrit.wikimedia.org/r/285084 (https://phabricator.wikimedia.org/T105981) 
[17:49:23] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:56:25] <grrrit-wm>	 (03PS1) 10Alex Monk: Set up yue.wikipedia.org DNS record [dns] - 10https://gerrit.wikimedia.org/r/285085 (https://phabricator.wikimedia.org/T105999) 
[17:56:48] <grrrit-wm>	 (03PS1) 10Alex Monk: Redirect yue.wikipedia.org to zh-yue.wikipedia.org for now [puppet] - 10https://gerrit.wikimedia.org/r/285086 (https://phabricator.wikimedia.org/T105999) 
[17:57:01] <wikibugs>	 06Operations, 10DNS, 10Traffic, 10Wikimedia-Apache-configuration, 13Patch-For-Review: Redirect yue.wikipedia.org to zh-yue.wikipedia.org - https://phabricator.wikimedia.org/T105999#2233666 (10Krenair) a:03Krenair
[18:12:02] <icinga-wm>	 PROBLEM - NTP on mw1135 is CRITICAL: NTP CRITICAL: No response from NTP server
[18:20:02] <icinga-wm>	 RECOVERY - NTP on mw1135 is OK: NTP OK: Offset -0.05896604061 secs
[18:25:23] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[18:27:42] <icinga-wm>	 PROBLEM - puppet last run on mw1116 is CRITICAL: CRITICAL: Puppet has 61 failures
[18:31:33] <icinga-wm>	 PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[18:43:02] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[18:46:52] <icinga-wm>	 PROBLEM - puppet last run on db2009 is CRITICAL: CRITICAL: puppet fail
[18:48:22] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[18:50:02] <grrrit-wm>	 (03CR) 10Reedy: "Having a way to cause OOMs isn't good for the cluster." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/285073 (https://phabricator.wikimedia.org/T55443) (owner: 10Reedy)
[18:53:22] <icinga-wm>	 PROBLEM - dhclient process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[18:54:33] <icinga-wm>	 PROBLEM - salt-minion processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[19:13:42] <icinga-wm>	 RECOVERY - puppet last run on db2009 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[20:04:44] <ori>	 !log Deployed change Ib7e248ccf to statsv (commit id 5323cece2b3; task T132770)
[20:04:46] <stashbot>	 T132770: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770
[20:04:50] <morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[20:05:52] <icinga-wm>	 PROBLEM - NTP on mw1135 is CRITICAL: NTP CRITICAL: No response from NTP server
[20:05:53] <ori>	 bd808: I love Stashbot. The Phabricator "Mentioned in SAL" thing is great.
[20:06:34] <bd808>	 :) Thanks. I think that g.reg asked for that bit
[20:08:02] <bd808>	 nope, it was go.dog in T108720
[20:08:03] <stashbot>	 T108720: pick up ticket mentions from !log lines - https://phabricator.wikimedia.org/T108720
[20:39:48] <_joe_>	 indeed it's very useful
[20:39:51] <_joe_>	 thanks
[20:42:14] <wikibugs>	 06Operations, 10Wikimedia-General-or-Unknown: Page on aswikisource not accessible via page title, only via "curid" - https://phabricator.wikimedia.org/T133505#2233789 (10Ciencia_Al_Poder) https://as.wikisource.org/w/api.php?action=query&prop=info&pageids=30 displays:  ```lang=json {     "batchcomplete": "",...
[20:45:12] <icinga-wm>	 PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479
[20:45:48] <Reedy>	 !log ran namespaceDupes.php on aswikisource
[20:45:52] <morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[20:46:16] <wikibugs>	 06Operations, 10Wikimedia-General-or-Unknown: Page on aswikisource not accessible via page title, only via "curid" - https://phabricator.wikimedia.org/T133505#2233772 (10Reedy) ``` reedy@tin:~$ mwscript namespaceDupes.php aswikisource id=869 ns=0 dbk=Author:গণেশ_গগৈ *** dest title exists and --add-prefix not s...
[20:47:12] <icinga-wm>	 RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5142938 keys - replication_delay is 0
[21:07:32] <icinga-wm>	 PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 636 600 - REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5144289 keys - replication_delay is 636
[21:21:33] <icinga-wm>	 RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS on 10.192.48.44:6479 has 1 databases (db0) with 5143434 keys - replication_delay is 0
[21:49:05] <icinga-wm>	 RECOVERY - NTP on mw1135 is OK: NTP OK: Offset -0.09681725502 secs
[21:51:13] <grrrit-wm>	 (03CR) 10Legoktm: "Thanks, do you want to make the eqiad and codfw files visible too?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/285062 (https://phabricator.wikimedia.org/T133324) (owner: 10Dereckson)
[22:04:44] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[22:10:45] <icinga-wm>	 PROBLEM - HHVM processes on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[22:15:15] <icinga-wm>	 PROBLEM - citoid endpoints health on scb2002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[22:17:16] <icinga-wm>	 RECOVERY - citoid endpoints health on scb2002 is OK: All endpoints are healthy
[22:22:05] <icinga-wm>	 RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0)
[22:22:34] <icinga-wm>	 RECOVERY - configured eth on mw1135 is OK: OK - interfaces up
[22:22:34] <icinga-wm>	 RECOVERY - HHVM processes on mw1135 is OK: PROCS OK: 6 processes with command name hhvm
[22:22:45] <icinga-wm>	 RECOVERY - dhclient process on mw1135 is OK: PROCS OK: 0 processes with command name dhclient
[22:23:05] <icinga-wm>	 RECOVERY - nutcracker port on mw1135 is OK: TCP OK - 0.000 second response time on port 11212
[22:23:05] <icinga-wm>	 RECOVERY - DPKG on mw1135 is OK: All packages OK
[22:23:14] <icinga-wm>	 RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 628 bytes in 5.413 second response time
[22:23:24] <icinga-wm>	 RECOVERY - nutcracker process on mw1135 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[22:23:24] <icinga-wm>	 RECOVERY - salt-minion processes on mw1135 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[22:23:34] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1135 is OK: OK: nf_conntrack is 1 % full
[22:23:44] <icinga-wm>	 RECOVERY - Disk space on mw1135 is OK: DISK OK
[22:23:55] <icinga-wm>	 RECOVERY - RAID on mw1135 is OK: OK: no RAID installed
[22:24:15] <icinga-wm>	 RECOVERY - HHVM rendering on mw1135 is OK: HTTP OK: HTTP/1.1 200 OK - 64364 bytes in 0.309 second response time
[22:25:36] <icinga-wm>	 RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[22:28:24] <wikibugs>	 06Operations, 06WMF-Legal, 07Privacy: Consider moving policy.wikimedia.org away from WordPress.com - https://phabricator.wikimedia.org/T132104#2233892 (10Platonides) @Slaporte, you will probably know better the target authors. Who will be changing this site and how often?  Currently, there seems to be 9 page...
[22:45:22] <Platonides>	 ping mutante 
[23:25:24] <icinga-wm>	 PROBLEM - HHVM rendering on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:25:35] <icinga-wm>	 PROBLEM - Apache HTTP on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:25:54] <icinga-wm>	 PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:26:04] <icinga-wm>	 PROBLEM - dhclient process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:15] <icinga-wm>	 PROBLEM - configured eth on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:24] <icinga-wm>	 PROBLEM - salt-minion processes on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:25] <icinga-wm>	 PROBLEM - HHVM processes on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:35] <icinga-wm>	 PROBLEM - nutcracker port on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:54] <icinga-wm>	 PROBLEM - nutcracker process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:26:56] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:27:34] <icinga-wm>	 PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:27:35] <icinga-wm>	 PROBLEM - Disk space on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:27:44] <icinga-wm>	 PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:33:32] <icinga-wm>	 RECOVERY - HHVM processes on mw1116 is OK: PROCS OK: 6 processes with command name hhvm
[23:33:51] <icinga-wm>	 RECOVERY - Disk space on mw1116 is OK: DISK OK
[23:34:00] <icinga-wm>	 RECOVERY - RAID on mw1116 is OK: OK: no RAID installed
[23:39:10] <icinga-wm>	 PROBLEM - HHVM processes on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:39:41] <icinga-wm>	 PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:39:50] <icinga-wm>	 PROBLEM - Disk space on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:50:30] <icinga-wm>	 RECOVERY - nutcracker port on mw1116 is OK: TCP OK - 0.000 second response time on port 11212
[23:50:41] <icinga-wm>	 RECOVERY - dhclient process on mw1116 is OK: PROCS OK: 0 processes with command name dhclient
[23:50:41] <icinga-wm>	 RECOVERY - nutcracker process on mw1116 is OK: PROCS OK: 1 process with UID = 108 (nutcracker), command name nutcracker
[23:50:41] <icinga-wm>	 RECOVERY - Check size of conntrack table on mw1116 is OK: OK: nf_conntrack is 0 % full
[23:50:52] <icinga-wm>	 RECOVERY - HHVM processes on mw1116 is OK: PROCS OK: 6 processes with command name hhvm
[23:51:01] <icinga-wm>	 RECOVERY - salt-minion processes on mw1116 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[23:51:01] <icinga-wm>	 RECOVERY - configured eth on mw1116 is OK: OK - interfaces up
[23:51:30] <icinga-wm>	 RECOVERY - RAID on mw1116 is OK: OK: no RAID installed
[23:51:31] <icinga-wm>	 RECOVERY - Disk space on mw1116 is OK: DISK OK
[23:56:31] <icinga-wm>	 PROBLEM - nutcracker port on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:56:41] <icinga-wm>	 PROBLEM - Check size of conntrack table on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:56:50] <icinga-wm>	 PROBLEM - dhclient process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:56:50] <icinga-wm>	 PROBLEM - nutcracker process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:57:01] <icinga-wm>	 PROBLEM - HHVM processes on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:57:10] <icinga-wm>	 PROBLEM - configured eth on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:57:31] <icinga-wm>	 PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:59:10] <icinga-wm>	 PROBLEM - salt-minion processes on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.