[00:00:29] (CR) Dzahn: [C: 2] racktables - apache, load mod_headers [puppet] - https://gerrit.wikimedia.org/r/160168 (owner: Dzahn)
[00:05:08] (CR) Dzahn: "fixed after I5a3032eb907a29ce5" [puppet] - https://gerrit.wikimedia.org/r/160164 (owner: Dzahn)
[00:12:35] (PS1) Ori.livneh: labs: update /data/project/apache/common-local -> /srv/mediawiki [puppet] - https://gerrit.wikimedia.org/r/160170
[00:13:00] ^ mutante, it's after five on a friday, but are you up for reviewing a small beta-only change? :)
[00:20:09] mutante is afk currently
[00:20:26] gwicke: ah, thanks
[00:23:57] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: Epic puppet fail
[00:25:27] (PS2) Ori.livneh: labs: update /data/project/apache/common-local -> /srv/mediawiki [puppet] - https://gerrit.wikimedia.org/r/160170
[00:25:41] (CR) Ori.livneh: [C: 2 V: 2] "Applied in Labs; did the right thing." [puppet] - https://gerrit.wikimedia.org/r/160170 (owner: Ori.livneh)
[00:42:40] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[01:13:27] (CR) Dzahn: [C: 2] "just an image" [puppet] - https://gerrit.wikimedia.org/r/160157 (owner: Dzahn)
[01:41:12] !log ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 08s)
[01:41:19] Logged the message, Master
[01:45:31] !log ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
[01:45:38] Logged the message, Master
[01:45:42] !log ori Synchronized php-1.24wmf21/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
[01:45:48] Logged the message, Master
[01:46:05] ebernhardson: did wmf20 twice because i forgot to update the submodule the first time around
[02:05:58] PROBLEM - Disk space on virt0 is CRITICAL: DISK CRITICAL - free space: /a 3875 MB (3% inode=99%):
[02:18:28] PROBLEM - puppet last run on mw1112 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:19:57] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:20:57] PROBLEM - puppet last run on mw1015 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:22:28] PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:31:09] PROBLEM - puppet last run on mw1014 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:32:58] PROBLEM - puppet last run on mw1171 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:34:48] RECOVERY - puppet last run on mw1112 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[02:37:08] RECOVERY - puppet last run on mw1047 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[02:38:27] !log LocalisationUpdate completed (1.24wmf20) at 2014-09-13 02:38:26+00:00
[02:38:33] Logged the message, Master
[02:39:12] RECOVERY - puppet last run on mw1015 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[02:41:00] RECOVERY - puppet last run on mw1161 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[02:49:07] RECOVERY - puppet last run on mw1014 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[02:51:18] RECOVERY - puppet last run on mw1171 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[03:00:58] RECOVERY - Disk space on virt0 is OK: DISK OK
[03:11:40] !log LocalisationUpdate completed (1.24wmf21) at 2014-09-13 03:11:40+00:00
[03:11:47] Logged the message, Master
[03:20:49] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[03:47:58] I'm seeing random database errors via runJobs.php in exception.log:
[03:48:05] 2014-09-12 20:45:29 mw1053 metawiki: [ef65e9e7] /rpc/RunJobs.php?wiki=metawiki&type=cirrusSearchLinksUpdate&maxtime=30&maxmem=300M Exception from line 1216 of /srv/mediawiki/php-1.24wmf20/includes/db/Database.php: A database error has occurred. Did you forget to run maintenance/update.php after upgrading? See: https://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script
[03:51:25] !log global rename for Trevor Parscal --> Trevor Parscal (WMF) looks stuck on metawiki and mswiki, in queued state for both but showJobs.php says the jobs are active and claimed
[03:51:31] Logged the message, Master
[04:17:04] interesting
[04:22:04] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 13 04:22:04 UTC 2014 (duration 22m 3s)
[04:22:09] Logged the message, Master
[04:41:49] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[04:42:27] !log global rename for Trevor Parscal (WMF) unstuck itself, yay
[04:42:33] Logged the message, Master
[04:42:45] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 1 failures
[04:51:08] hah
[04:54:33] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 4 below the confidence bounds
[04:57:03] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: Epic puppet fail
[04:59:04] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[05:07:34] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[05:09:35] PROBLEM - puppet last run on cp4019 is CRITICAL: Timeout while attempting connection
[05:09:54] PROBLEM - puppet last run on cp4001 is CRITICAL: Timeout while attempting connection
[05:11:24] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: Puppet has 2 failures
[05:14:27] PROBLEM - puppet last run on lvs4003 is CRITICAL: CRITICAL: Puppet has 1 failures
[05:15:33] RECOVERY - puppet last run on cp4012 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[05:21:29] (CR) Legoktm: "Is this actually dependent upon the androidsdk patch?" [puppet] - https://gerrit.wikimedia.org/r/153784 (owner: Yuvipanda)
[05:26:53] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[05:27:43] RECOVERY - puppet last run on cp4014 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[05:27:55] RECOVERY - puppet last run on cp4001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[05:28:45] RECOVERY - puppet last run on cp4019 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[05:29:43] RECOVERY - puppet last run on lvs4003 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:19:44] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:19:44] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:19:45] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [500.0]
[06:23:34] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:26:06] PROBLEM - SSH on pdf3 is CRITICAL: Server answer:
[06:27:05] RECOVERY - SSH on pdf3 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0)
[06:28:34] PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: Epic puppet fail
[06:28:34] PROBLEM - puppet last run on mw1126 is CRITICAL: CRITICAL: Epic puppet fail
[06:28:46] PROBLEM - puppet last run on mw1002 is CRITICAL: CRITICAL: Epic puppet fail
[06:28:46] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Epic puppet fail
[06:29:05] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Epic puppet fail
[06:29:27] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:37] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Puppet has 4 failures
[06:29:44] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:44] PROBLEM - puppet last run on analytics1030 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:44] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:55] PROBLEM - puppet last run on mw1092 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:04] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:04] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:04] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:30:04] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:14] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:14] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:34] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:44] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:45] PROBLEM - puppet last run on cp4003 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:45] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:54] PROBLEM - puppet last run on amssq55 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:55] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:55] PROBLEM - puppet last run on amssq47 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:04] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[06:33:54] PROBLEM - puppet last run on hooft is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:57] PROBLEM - ElasticSearch health check for shards on logstash1003 is CRITICAL: CRITICAL - elasticsearch http://10.64.32.136:9200/_cluster/health error while fetching: Request timed out.
[06:36:04] PROBLEM - ElasticSearch health check for shards on logstash1002 is CRITICAL: CRITICAL - elasticsearch http://10.64.32.137:9200/_cluster/health error while fetching: Request timed out.
[06:36:14] PROBLEM - ElasticSearch health check on logstash1001 is CRITICAL: CRITICAL - Could not connect to server 10.64.32.138
[06:36:20] PROBLEM - ElasticSearch health check on logstash1002 is CRITICAL: CRITICAL - Could not connect to server 10.64.32.137
[06:37:54] RECOVERY - puppet last run on cp4010 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[06:37:55] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[06:38:54] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[06:41:44] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:42:05] RECOVERY - ElasticSearch health check on logstash1001 is OK: OK - elasticsearch (production-logstash-eqiad) is running. status: green: timed_out: false: number_of_nodes: 3: number_of_data_nodes: 3: active_primary_shards: 36: active_shards: 103: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0
[06:42:05] RECOVERY - ElasticSearch health check on logstash1002 is OK: OK - elasticsearch (production-logstash-eqiad) is running. status: green: timed_out: false: number_of_nodes: 3: number_of_data_nodes: 3: active_primary_shards: 36: active_shards: 103: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0
[06:44:57] RECOVERY - Disk space on ms1004 is OK: DISK OK
[06:45:14] RECOVERY - puppet last run on db1018 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:45:34] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[06:45:35] RECOVERY - puppet last run on mw1052 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[06:45:45] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[06:45:46] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:45:46] RECOVERY - puppet last run on analytics1030 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[06:45:46] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:45:54] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[06:45:54] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures
[06:46:05] RECOVERY - puppet last run on mw1092 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[06:46:06] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[06:46:19] RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures
[06:46:21] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[06:46:24] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:46:37] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[06:46:37] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[06:46:44] RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[06:46:56] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:46:56] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:47:05] RECOVERY - puppet last run on amssq47 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures
[06:47:44] RECOVERY - puppet last run on mw1126 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:47:58] RECOVERY - puppet last run on mw1002 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[06:47:59] PROBLEM - Disk space on ms1004 is CRITICAL: DISK CRITICAL - free space: / 2 MB (0% inode=94%): /var/lib/ureadahead/debugfs 2 MB (0% inode=94%):
[06:48:05] RECOVERY - puppet last run on amssq55 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:48:15] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:51:05] RECOVERY - puppet last run on hooft is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[06:56:35] PROBLEM - puppet last run on ssl1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:59:05] PROBLEM - ElasticSearch health check for shards on logstash1003 is CRITICAL: CRITICAL - elasticsearch http://10.64.32.136:9200/_cluster/health error while fetching: Request timed out.
[07:00:45] PROBLEM - ElasticSearch health check on logstash1003 is CRITICAL: CRITICAL - Could not connect to server 10.64.32.136
[07:01:05] PROBLEM - ElasticSearch health check for shards on logstash1003 is CRITICAL: CRITICAL - elasticsearch http://10.64.32.136:9200/_cluster/health error while fetching: Request timed out.
[07:11:45] RECOVERY - ElasticSearch health check on logstash1003 is OK: OK - elasticsearch (production-logstash-eqiad) is running. status: green: timed_out: false: number_of_nodes: 3: number_of_data_nodes: 3: active_primary_shards: 36: active_shards: 103: relocating_shards: 0: initializing_shards: 0: unassigned_shards: 0
[07:13:54] RECOVERY - puppet last run on ssl1001 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[08:18:59] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: Puppet has 1 failures
[08:33:11] RECOVERY - puppet last run on amssq38 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[09:38:10] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Epic puppet fail
[09:56:32] RECOVERY - puppet last run on amssq37 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[10:15:43] PROBLEM - Disk space on elastic1009 is CRITICAL: DISK CRITICAL - free space: /var/lib/elasticsearch 20122 MB (3% inode=99%):
[11:44:23] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[12:57:55] (CR) Calak: [C: 1] Remove 'renameuser' right from bureaucrats on CentralAuth wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/160158 (owner: Legoktm)
[13:06:33] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[14:21:04] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: Epic puppet fail
[14:40:35] RECOVERY - puppet last run on amssq57 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[15:22:01] (CR) JanZerebecki: "The nda group is not restricted to people not employed by the WMF. If someone should get access to things like logstash and servermon and " [puppet] - https://gerrit.wikimedia.org/r/159419 (owner: Dzahn)
[16:15:14] (CR) Filippo Giunchedi: "LGTM, minor comment" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/153783 (owner: Ori.livneh)
[16:16:07] (CR) Filippo Giunchedi: "LGTM, modulo a pending comment in modules/salt/manifests/minion.pp" [puppet] - https://gerrit.wikimedia.org/r/153727 (owner: Ori.livneh)
[17:18:27] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Puppet has 1 failures
[17:31:53] (CR) Hoo man: [C: -1] "Had a quick look" (2 comments) [puppet] - https://gerrit.wikimedia.org/r/155753 (owner: 01tonythomas)
[17:35:47] RECOVERY - puppet last run on cp3018 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[17:45:44] (PS41) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [puppet] - https://gerrit.wikimedia.org/r/155753
[17:59:29] (PS42) 01tonythomas: Added the bouncehandler router to catch in all bounce emails [puppet] - https://gerrit.wikimedia.org/r/155753
[18:22:35] MatmaRex: gifs are in scope, therfore no need to nominete new one.
[18:22:59] huh?
[18:39:34] Steinsplitter: ooooh, you referred to the commons deletion request i started. that was a very confusing message to see in this channel.
[18:44:16] yes, you are not in -common ;)
[18:45:47] (PS1) Ori.livneh: mediawiki: add some in-line documentation [puppet] - https://gerrit.wikimedia.org/r/160225
[18:45:55] Steinsplitter: *nominate
[18:51:37] MatmaRex: and pls dont edit in the MW namespace there :) thanks.
[18:52:42] Steinsplitter: are you implying i did something wrong in commons' MediaWiki namespace?
[18:53:22] no. but there was enough drama. so pls don't do it again.
[18:53:27] (why are we discussing this on this channel? please pm me if you want to continue.)
[18:53:52] Steinsplitter: are you implying i caused any of the drama? jesus
[18:54:29] if somebody has problems with my edits, i'd love them to tell me
[19:54:24] (CR) Ori.livneh: [C: 2] mediawiki: add some in-line documentation [puppet] - https://gerrit.wikimedia.org/r/160225 (owner: Ori.livneh)
[20:18:04] (PS1) Ori.livneh: hhvm: add comment about translation cache [puppet] - https://gerrit.wikimedia.org/r/160231
[20:21:42] (PS1) Ori.livneh: misc::maintenance: clean-up [puppet] - https://gerrit.wikimedia.org/r/160232
[20:35:08] PROBLEM - check_fundraising_jobs on db1025 is CRITICAL: CRITICAL missing_thank_yous=1537 [critical =500]: recurring_gc_contribs_missed=0: recurring_gc_failures_missed=0: recurring_gc_jobs_required=962: recurring_gc_schedule_sanity=0
[20:39:08] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Epic puppet fail
[20:40:08] PROBLEM - check_fundraising_jobs on db1025 is CRITICAL: CRITICAL missing_thank_yous=2197 [critical =500]: recurring_gc_contribs_missed=0: recurring_gc_failures_missed=0: recurring_gc_jobs_required=962: recurring_gc_schedule_sanity=0
[20:45:08] PROBLEM - check_fundraising_jobs on db1025 is CRITICAL: CRITICAL missing_thank_yous=506 [critical =500]: recurring_gc_contribs_missed=0: recurring_gc_failures_missed=0: recurring_gc_jobs_required=962: recurring_gc_schedule_sanity=0
[20:50:08] RECOVERY - check_fundraising_jobs on db1025 is OK: OK missing_thank_yous=0: recurring_gc_contribs_missed=0: recurring_gc_failures_missed=0: recurring_gc_jobs_required=962: recurring_gc_schedule_sanity=0
[20:53:39] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0]
[20:58:18] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[21:07:07] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[22:57:23] PROBLEM - RAID on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:00:08] RECOVERY - RAID on fenari is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0
[23:03:22] PROBLEM - RAID on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:04:22] RECOVERY - RAID on fenari is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0
[23:07:36] PROBLEM - DPKG on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:08:23] RECOVERY - DPKG on fenari is OK: All packages OK
[23:20:42] PROBLEM - RAID on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:22:24] PROBLEM - puppet last run on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:23:34] PROBLEM - HTTP on fenari is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:29:22] RECOVERY - puppet last run on fenari is OK: OK: Puppet is currently enabled, last run 4250 seconds ago with 0 failures
[23:33:44] PROBLEM - check configured eth on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:33:44] PROBLEM - DPKG on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
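The recurring "HTTP 5xx req/min on tungsten" alerts in this log report what fraction of recent Graphite datapoints sit above a fixed error rate: CRITICAL fires when more than about 1% of points exceed 500 errors/min, and the recovery text quotes the 250/min threshold. The sketch below, in Python, is only an illustration of that percentage-over-threshold logic; the thresholds and message wording are taken from the alert text above, while the function name, defaults, and structure are assumptions, not the actual Graphite check plugin.

    # Illustrative percentage-over-threshold check, inferred from the alert text above.
    # Not the real check plugin; names and defaults here are hypothetical.

    def evaluate_5xx(datapoints, warn=250.0, crit=500.0, percent=1.0):
        """Return (status, message) for a series of per-minute 5xx request rates."""
        valid = [v for v in datapoints if v is not None]  # Graphite may return nulls
        if not valid:
            return "UNKNOWN", "no valid datapoints"
        over_crit = 100.0 * sum(v > crit for v in valid) / len(valid)
        over_warn = 100.0 * sum(v > warn for v in valid) / len(valid)
        if over_crit > percent:
            return "CRITICAL", "%.2f%% of data above the critical threshold [%.1f]" % (over_crit, crit)
        if over_warn > percent:
            return "WARNING", "%.2f%% of data above the warning threshold [%.1f]" % (over_warn, warn)
        return "OK", "Less than %.2f%% above the threshold [%.1f]" % (percent, warn)

    # 1 of 15 datapoints above 500/min -> 6.67%, matching the CRITICAL messages in this log.
    print(evaluate_5xx([120, 130, 90, 80, 640, 110, 95, 100, 105, 99, 130, 140, 101, 97, 102]))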
[23:34:32] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 13.33% of data above the critical threshold [500.0]
[23:35:03] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Epic puppet fail
[23:36:55] RECOVERY - DPKG on fenari is OK: All packages OK
[23:36:55] RECOVERY - check configured eth on fenari is OK: NRPE: Unable to read output
[23:37:53] PROBLEM - SSH on fenari is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:38:12] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: Puppet has 1 failures
[23:38:52] RECOVERY - SSH on fenari is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[23:39:32] RECOVERY - HTTP on fenari is OK: HTTP OK: HTTP/1.1 200 OK - 4775 bytes in 5.523 second response time
[23:41:33] PROBLEM - puppet last run on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:42:33] RECOVERY - puppet last run on fenari is OK: OK: Puppet is currently enabled, last run 5040 seconds ago with 0 failures
[23:42:33] PROBLEM - HTTP on fenari is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:44:03] PROBLEM - DPKG on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:45:42] PROBLEM - puppet last run on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:46:02] PROBLEM - Disk space on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:46:02] PROBLEM - check configured eth on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:47:03] PROBLEM - check if dhclient is running on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:48:02] PROBLEM - nutcracker process on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:48:02] PROBLEM - nutcracker port on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:48:55] RECOVERY - nutcracker port on fenari is OK: TCP OK - 0.000 second response time on port 11212
[23:49:54] RECOVERY - nutcracker process on fenari is OK: PROCS OK: 1 process with UID = 116 (nutcracker), command name nutcracker
[23:49:54] RECOVERY - Disk space on fenari is OK: DISK OK
[23:49:54] RECOVERY - check if dhclient is running on fenari is OK: PROCS OK: 0 processes with command name dhclient
[23:49:54] RECOVERY - check configured eth on fenari is OK: NRPE: Unable to read output
[23:50:43] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[23:51:07] RECOVERY - DPKG on fenari is OK: All packages OK
[23:51:53] RECOVERY - RAID on fenari is OK: OK: Active: 2, Working: 2, Failed: 0, Spare: 0
[23:53:43] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: Puppet has 1 failures
[23:54:23] RECOVERY - puppet last run on cp4020 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[23:55:12] PROBLEM - check configured eth on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:55:12] PROBLEM - RAID on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:55:12] PROBLEM - nutcracker process on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[23:55:22] RECOVERY - puppet last run on cp4012 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[23:58:04] RECOVERY - nutcracker process on fenari is OK: PROCS OK: 1 process with UID = 116 (nutcracker), command name nutcracker
[23:58:04] RECOVERY - check configured eth on fenari is OK: NRPE: Unable to read output
[23:59:47] is fenari having issues?
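Most of the fenari alerts at the end of this log are not individual services failing; they are the NRPE agent on the host not answering within its 10-second limit, which is why otherwise unrelated checks (RAID, DPKG, disk space, dhclient, nutcracker) flap together and why the closing question is about the host rather than any one service. The sketch below is a hypothetical Python illustration of that timeout pattern only: the real check_nrpe plugin speaks the NRPE protocol on TCP port 5666, and the default hostname here is just a placeholder.

    # Hypothetical illustration of why "CHECK_NRPE: Socket timeout after 10 seconds."
    # shows up for many different checks at once: every check on the host goes through
    # the same NRPE agent, so an unresponsive host times out all of them together.
    # This is NOT the real check_nrpe plugin; it only shows the connect-with-timeout pattern.
    import socket
    import sys

    NRPE_PORT = 5666        # default NRPE port
    TIMEOUT_SECONDS = 10.0  # matches the "after 10 seconds" in the alerts above

    def probe_nrpe(host):
        """Try to reach the NRPE agent; map failures to Nagios-style exit states."""
        try:
            with socket.create_connection((host, NRPE_PORT), timeout=TIMEOUT_SECONDS):
                return 0, "OK: NRPE agent on %s reachable" % host
        except socket.timeout:
            return 2, "CHECK_NRPE: Socket timeout after %d seconds." % TIMEOUT_SECONDS
        except OSError as exc:
            return 2, "CHECK_NRPE: connection to %s failed: %s" % (host, exc)

    if __name__ == "__main__":
        status, message = probe_nrpe(sys.argv[1] if len(sys.argv) > 1 else "localhost")
        print(message)
        sys.exit(status)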