[00:00:17] Right... I mean, my concern was that since Web requests are automatically wrapped in a transaction
[00:00:25] if you don't wait / care for slaves you will likely get your master into trouble, but that's not a common case on our setup
[00:02:01] fantastic...! thanks so much :)
[00:02:45] BTW... related question... do you know where all this is configured, and what bit of the code does the automatic scoping of a transaction into a web request? (or maybe I'm misunderstanding?)
[00:03:22] Sorry to be so bothersome, it'd just be nice to see in full detail how it all works 8p
[00:09:12] AndyRussG: https://www.mediawiki.org/wiki/Manual:$wgDBservers see flags there
[00:09:24] related code should be in the Database class
[00:10:23] hoo: gotcha... thanks a million, really appreciate it! :D
[00:14:09] :)
[00:40:43] PROBLEM - Disk space on ocg1001 is CRITICAL: DISK CRITICAL - free space: / 337 MB (3% inode=72%):
[00:50:52] PROBLEM - Disk space on ocg1001 is CRITICAL: DISK CRITICAL - free space: / 350 MB (3% inode=72%):
[00:51:33] PROBLEM - MySQL Replication Heartbeat on db1016 is CRITICAL: CRIT replication delay 305 seconds
[00:52:02] PROBLEM - MySQL Slave Delay on db1016 is CRITICAL: CRIT replication delay 341 seconds
[00:53:15] RECOVERY - MySQL Slave Delay on db1016 is OK: OK replication delay 0 seconds
[00:53:50] RECOVERY - MySQL Replication Heartbeat on db1016 is OK: OK replication delay -0 seconds
[00:58:54] (CR) Legoktm: "Needs I2f53b23631aeeff91023ae8b44e2a4753c1f0ba3 to be deployed first." [mediawiki-config] - https://gerrit.wikimedia.org/r/167408 (owner: Legoktm)
[02:14:23] !log LocalisationUpdate completed (1.25wmf3) at 2014-10-19 02:14:23+00:00
[02:14:34] Logged the message, Master
[02:25:44] !log LocalisationUpdate completed (1.25wmf4) at 2014-10-19 02:25:44+00:00
[02:25:52] Logged the message, Master
[03:36:03] !log LocalisationUpdate ResourceLoader cache refresh completed at Sun Oct 19 03:36:03 UTC 2014 (duration 36m 2s)
[03:36:09] Logged the message, Master
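A note on the [00:00:17]–[00:14:09] exchange above: the per-request transaction behaviour AndyRussG asks about is controlled per server by the 'flags' field of $wgDBservers (the manual page hoo links); DBO_DEFAULT turns on DBO_TRX for web requests but not for maintenance scripts, and the handling lives in MediaWiki's Database class. A minimal sketch of such a master/slave configuration follows, with hypothetical host names, credentials and load weights:

    <?php
    // LocalSettings.php sketch (hypothetical hosts/credentials): one master plus
    // one replica, using the keys documented at Manual:$wgDBservers.
    $wgDBservers = array(
        array(
            'host'     => 'db-master.example.org', // master: all writes go here
            'dbname'   => 'wikidb',
            'user'     => 'wikiuser',
            'password' => 'secret',
            'type'     => 'mysql',
            'flags'    => DBO_DEFAULT, // includes DBO_TRX on web requests, so each
                                       // request runs inside one implicit transaction
            'load'     => 0,           // weight 0: no read traffic on the master
        ),
        array(
            'host'     => 'db-slave1.example.org', // replica: serves reads only
            'dbname'   => 'wikidb',
            'user'     => 'wikiuser',
            'password' => 'secret',
            'type'     => 'mysql',
            'flags'    => DBO_DEFAULT,
            'load'     => 1,           // relative read-load weight
            'max lag'  => 30,          // skip this replica if it lags more than 30 s
        ),
    );

Under a setup like this, a web request's writes are committed together when the request finishes; the "wait for slaves" concern from [00:00:25] is what long-running maintenance scripts handle explicitly (e.g. via wfWaitForSlaves() in MediaWiki of that era), since they do not get the implicit DBO_TRX wrapper.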
[03:42:33] (CR) Legoktm: "Actually I4eb6322183556b44bc748c24932892cb311880c0." [mediawiki-config] - https://gerrit.wikimedia.org/r/167408 (owner: Legoktm)
[05:58:19] PROBLEM - Disk space on ocg1003 is CRITICAL: DISK CRITICAL - free space: / 349 MB (3% inode=72%):
[06:26:00] RECOVERY - Disk space on ocg1003 is OK: DISK OK
[06:26:30] RECOVERY - Disk space on ocg1001 is OK: DISK OK
[06:29:10] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:29:19] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:29] PROBLEM - puppet last run on mw1065 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:39] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:40] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:50] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:50] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:19] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:29] PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:37:19] PROBLEM - puppet last run on db1035 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:45:41] RECOVERY - puppet last run on mw1065 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:45:42] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[06:45:51] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:46:11] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[06:46:20] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[06:46:30] RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[06:46:32] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[06:47:10] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures
[06:48:50] RECOVERY - puppet last run on db1039 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[06:55:51] RECOVERY - puppet last run on db1035 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[07:38:01] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 370326 msg: ocg_render_job_queue 662 msg (=500 critical)
[07:38:11] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 370475 msg: ocg_render_job_queue 769 msg (=500 critical)
[07:38:35] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 370795 msg: ocg_render_job_queue 981 msg (=500 critical)
[07:43:12] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 371094 msg: ocg_render_job_queue 36 msg
[07:43:21] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 371101 msg: ocg_render_job_queue 16 msg
[07:43:50] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 371129 msg: ocg_render_job_queue 0 msg
[08:35:17] PROBLEM - check_mysql on lutetium is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 709
[08:40:24] PROBLEM - check_mysql on lutetium is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1008
[08:45:20] RECOVERY - check_mysql on lutetium is OK: Uptime: 227561 Threads: 2 Questions: 1760156 Slow queries: 617 Opens: 2713 Flush tables: 2 Open tables: 64 Queries per second avg: 7.734 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[09:05:49] PROBLEM - Disk space on ms-be2007 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdl1 is not accessible: Input/output error
[09:06:29] PROBLEM - RAID on ms-be2007 is CRITICAL: CRITICAL: 1 failed LD(s) (Offline)
[09:13:00] PROBLEM - puppet last run on ms-be2007 is CRITICAL: CRITICAL: Puppet has 1 failures
[10:02:02] RECOVERY - Disk space on ms-be2007 is OK: DISK OK
[10:08:20] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 347244 msg: ocg_render_job_queue 1115 msg (=500 critical)
[10:08:20] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 347245 msg: ocg_render_job_queue 1115 msg (=500 critical)
[10:08:45] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 347258 msg: ocg_render_job_queue 1091 msg (=500 critical)
[10:16:46] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 347958 msg: ocg_render_job_queue 72 msg
[10:16:46] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 347959 msg: ocg_render_job_queue 68 msg
[10:16:56] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 347980 msg: ocg_render_job_queue 53 msg
[11:14:46] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 354360 msg: ocg_render_job_queue 570 msg (=500 critical)
[11:14:59] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 354445 msg: ocg_render_job_queue 613 msg (=500 critical)
[11:15:05] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 354446 msg: ocg_render_job_queue 604 msg (=500 critical)
[11:20:05] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 354905 msg: ocg_render_job_queue 63 msg
[11:20:07] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 354912 msg: ocg_render_job_queue 51 msg
[11:20:15] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 354917 msg: ocg_render_job_queue 48 msg
[11:34:02] (PS8) Ricordisamoa: minor changes to InitialiseSettings.php [mediawiki-config] - https://gerrit.wikimedia.org/r/129464
[11:37:43] (CR) Ricordisamoa: "Rebased again. Guys, we'll get to 6 months soon!" [mediawiki-config] - https://gerrit.wikimedia.org/r/129464 (owner: Ricordisamoa)
[12:16:01] (PS1) Nemo bis: Enable uploads on Hungarian Wikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/167439 (https://bugzilla.wikimedia.org/72231)
[13:03:08] PROBLEM - puppet last run on cp4011 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:04:27] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:07:07] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:09:08] PROBLEM - puppet last run on cp4006 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:19:27] RECOVERY - puppet last run on cp4011 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[13:20:57] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[13:25:47] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[13:26:46] RECOVERY - puppet last run on cp4006 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[13:32:27] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:33:26] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:34:56] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:49:57] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[13:51:58] RECOVERY - puppet last run on cp4014 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[13:53:36] RECOVERY - puppet last run on cp4018 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[13:55:11] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: Puppet has 1 failures
[14:00:27] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Puppet has 1 failures
[14:00:47] PROBLEM - puppet last run on cp4002 is CRITICAL: CRITICAL: Puppet has 1 failures
[14:10:48] RECOVERY - puppet last run on cp4005 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[14:14:49] RECOVERY - puppet last run on cp4020 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[14:16:18] RECOVERY - puppet last run on cp4002 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[14:23:48] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 374151 msg: ocg_render_job_queue 647 msg (=500 critical)
[14:23:59] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 374166 msg: ocg_render_job_queue 652 msg (=500 critical)
[14:24:02] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 374167 msg: ocg_render_job_queue 650 msg (=500 critical)
[14:28:58] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 375190 msg: ocg_render_job_queue 991 msg (=500 critical)
[14:29:00] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 375243 msg: ocg_render_job_queue 1030 msg (=500 critical)
[14:29:08] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 375253 msg: ocg_render_job_queue 1037 msg (=500 critical)
[14:41:15] (Abandoned) Yuvipanda: icinga: Add a parameter to icinga::web to parameterize SSL [puppet] - https://gerrit.wikimedia.org/r/164103 (owner: Yuvipanda)
[14:48:52] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 377416 msg: ocg_render_job_queue 57 msg
[14:48:53] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 377422 msg: ocg_render_job_queue 52 msg
[14:49:02] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 377425 msg: ocg_render_job_queue 49 msg
[15:32:11] (PS3) Glaisher: Create Oriya Wikisource (orwikisource) [mediawiki-config] - https://gerrit.wikimedia.org/r/166186 (https://bugzilla.wikimedia.org/71875)
[15:32:48] (CR) Glaisher: "Rebased" [mediawiki-config] - https://gerrit.wikimedia.org/r/166186 (https://bugzilla.wikimedia.org/71875) (owner: Glaisher)
[16:48:20] (PS1) Glaisher: Rename project and project talk namespaces on mrwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/167447 (https://bugzilla.wikimedia.org/71774)
[17:19:35] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 351913 msg: ocg_render_job_queue 506 msg (=500 critical)
[17:26:45] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 352563 msg: ocg_render_job_queue 51 msg
[17:31:55] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 354015 msg: ocg_render_job_queue 645 msg (=500 critical)
[17:32:06] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 354034 msg: ocg_render_job_queue 619 msg (=500 critical)
[17:32:15] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 354039 msg: ocg_render_job_queue 612 msg (=500 critical)
[17:38:27] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 354504 msg: ocg_render_job_queue 98 msg
[17:39:06] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 354546 msg: ocg_render_job_queue 68 msg
[17:39:27] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 354566 msg: ocg_render_job_queue 45 msg
[18:22:56] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 359181 msg: ocg_render_job_queue 511 msg (=500 critical)
[18:23:06] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 359189 msg: ocg_render_job_queue 502 msg (=500 critical)
[18:27:25] PROBLEM - puppet last run on amssq47 is CRITICAL: CRITICAL: puppet fail
[18:27:46] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 359896 msg: ocg_render_job_queue 532 msg (=500 critical)
[18:28:11] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 359914 msg: ocg_render_job_queue 509 msg (=500 critical)
[18:28:15] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 359931 msg: ocg_render_job_queue 504 msg (=500 critical)
[18:41:16] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 360927 msg: ocg_render_job_queue 65 msg
[18:41:35] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 360950 msg: ocg_render_job_queue 58 msg
[18:41:36] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 360951 msg: ocg_render_job_queue 55 msg
[18:47:58] RECOVERY - puppet last run on amssq47 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:48:47] (PS1) MaxSem: Perform mobile redirect only for GET requests [puppet] - https://gerrit.wikimedia.org/r/167453 (https://bugzilla.wikimedia.org/72186)
[20:09:00] PROBLEM - check configured eth on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:00] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:09:01] PROBLEM - nutcracker port on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:01] PROBLEM - HHVM rendering on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:09:01] PROBLEM - Apache HTTP on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:09:21] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:31] PROBLEM - puppet last run on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:32] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:32] PROBLEM - check if dhclient is running on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:41] PROBLEM - check if salt-minion is running on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:09:50] PROBLEM - nutcracker process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:10:00] PROBLEM - RAID on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[20:10:52] RECOVERY - nutcracker process on mw1114 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name nutcracker
[20:11:11] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2 (protocol 2.0)
[20:11:11] RECOVERY - nutcracker port on mw1114 is OK: TCP OK - 0.000 second response time on port 11212
[20:11:12] RECOVERY - check configured eth on mw1114 is OK: NRPE: Unable to read output
[20:11:14] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 440 bytes in 2.836 second response time
[20:11:14] RECOVERY - Disk space on mw1114 is OK: DISK OK
[20:11:14] RECOVERY - HHVM rendering on mw1114 is OK: HTTP OK: HTTP/1.1 200 OK - 66984 bytes in 7.968 second response time
[20:11:42] RECOVERY - check if dhclient is running on mw1114 is OK: PROCS OK: 0 processes with command name dhclient
[20:11:42] RECOVERY - DPKG on mw1114 is OK: All packages OK
[20:11:42] RECOVERY - check if salt-minion is running on mw1114 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion
[20:11:52] RECOVERY - RAID on mw1114 is OK: OK: no RAID installed
[20:25:16] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 370898 msg: ocg_render_job_queue 551 msg (=500 critical)
[20:25:16] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 370901 msg: ocg_render_job_queue 548 msg (=500 critical)
[20:25:26] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 370911 msg: ocg_render_job_queue 537 msg (=500 critical)
[20:26:51] RECOVERY - puppet last run on mw1114 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[20:27:40] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 371066 msg: ocg_render_job_queue 500 msg (=500 critical)
[20:29:40] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 371186 msg: ocg_render_job_queue 503 msg (=500 critical)
[20:29:50] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 371188 msg: ocg_render_job_queue 500 msg (=500 critical)
[20:30:50] PROBLEM - puppet last run on ssl3002 is CRITICAL: CRITICAL: puppet fail
[20:31:02] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 371262 msg: ocg_render_job_queue 528 msg (=500 critical)
[20:43:12] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 372027 msg: ocg_render_job_queue 89 msg
[20:43:21] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 372034 msg: ocg_render_job_queue 85 msg
[20:43:31] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 372053 msg: ocg_render_job_queue 90 msg
[20:50:51] RECOVERY - puppet last run on ssl3002 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[21:03:23] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 374370 msg: ocg_render_job_queue 699 msg (=500 critical)
[21:03:23] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 374374 msg: ocg_render_job_queue 693 msg (=500 critical)
[21:03:33] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 374386 msg: ocg_render_job_queue 684 msg (=500 critical)
[21:11:03] PROBLEM - puppet last run on amssq42 is CRITICAL: CRITICAL: puppet fail
[21:11:44] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 374933 msg: ocg_render_job_queue 69 msg
[21:11:53] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 374944 msg: ocg_render_job_queue 58 msg
[21:12:03] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 374962 msg: ocg_render_job_queue 50 msg
[21:29:54] RECOVERY - puppet last run on amssq42 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[21:43:05] PROBLEM - BGP status on cr1-codfw is CRITICAL: CRITICAL: No response from remote host 208.80.153.192
[21:44:12] RECOVERY - BGP status on cr1-codfw is OK: OK: host 208.80.153.192, sessions up: 6, down: 0, shutdown: 0
[21:56:34] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 329050 msg: ocg_render_job_queue 703 msg (=500 critical)
[21:56:34] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 329108 msg: ocg_render_job_queue 754 msg (=500 critical)
[21:56:43] PROBLEM - BGP status on cr1-codfw is CRITICAL: CRITICAL: No response from remote host 208.80.153.192
[21:57:23] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 329257 msg: ocg_render_job_queue 727 msg (=500 critical)
[21:57:34] RECOVERY - BGP status on cr1-codfw is OK: OK: host 208.80.153.192, sessions up: 6, down: 0, shutdown: 0
[22:05:34] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 329789 msg: ocg_render_job_queue 96 msg
[22:05:43] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 329797 msg: ocg_render_job_queue 84 msg
[22:05:44] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 329803 msg: ocg_render_job_queue 73 msg
[22:13:19] PROBLEM - Router interfaces on cr1-codfw is CRITICAL: CRITICAL: No response from remote host 208.80.153.192 for 1.3.6.1.2.1.2.2.1.8 with snmp version 2
[22:15:24] RECOVERY - Router interfaces on cr1-codfw is OK: OK: host 208.80.153.192, interfaces up: 108, down: 0, dormant: 0, excluded: 1, unused: 0
[23:19:22] PROBLEM - Router interfaces on cr1-codfw is CRITICAL: CRITICAL: No response from remote host 208.80.153.192 for 1.3.6.1.2.1.2.2.1.2 with snmp version 2
[23:20:25] RECOVERY - Router interfaces on cr1-codfw is OK: OK: host 208.80.153.192, interfaces up: 108, down: 0, dormant: 0, excluded: 1, unused: 0
[23:57:14] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 339230 msg: ocg_render_job_queue 786 msg (=500 critical)
[23:57:25] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 339240 msg: ocg_render_job_queue 785 msg (=500 critical)
[23:57:39] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 339247 msg: ocg_render_job_queue 785 msg (=500 critical)