[00:00:22] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [00:02:22] PROBLEM Free ram is now: CRITICAL on integration-apache1 i-000002eb output: CHECK_NRPE: Socket timeout after 10 seconds. [00:16:27] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. [00:21:15] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [00:30:25] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [00:45:05] PROBLEM Puppet freshness is now: CRITICAL on wikistats-01 i-00000042 output: Puppet has not run in last 20 hours [01:00:35] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [01:30:35] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [01:42:07] PROBLEM Free ram is now: UNKNOWN on integration-apache1 i-000002eb output: NRPE: Unable to read output [01:52:00] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. 
[01:56:40] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [02:00:44] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [02:08:09] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 5.33, 4.69, 4.94 [02:11:03] PROBLEM Total Processes is now: WARNING on incubator-bot2 i-00000252 output: PROCS WARNING: 151 processes [02:16:03] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 5.56, 5.26, 5.11 [02:28:53] PROBLEM Current Users is now: WARNING on bastion-restricted1 i-0000019b output: USERS WARNING - 6 users currently logged in [02:31:13] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [02:36:04] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 2.95, 4.13, 4.83 [02:40:23] PROBLEM Free ram is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. 
[02:41:05] RECOVERY Total Processes is now: OK on incubator-bot2 i-00000252 output: PROCS OK: 146 processes [02:43:50] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 2.25, 4.82, 3.05 [02:44:10] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: Warning: 15% free memory [02:48:50] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 1.15, 2.46, 2.50 [03:01:20] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [03:08:32] 06/30/2012 - 03:08:31 - User laner may have been modified in LDAP or locally, updating key in project(s): deployment-prep [03:08:42] RECOVERY Puppet freshness is now: OK on psm-precise i-000002f2 output: puppet ran at Sat Jun 30 03:08:33 UTC 2012 [03:31:24] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [03:42:03] PROBLEM Total Processes is now: CRITICAL on psm-precise i-000002f2 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:46:57] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 14% free memory [03:47:43] PROBLEM Total Processes is now: WARNING on psm-precise i-000002f2 output: PROCS WARNING: 164 processes [03:48:42] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 15% free memory [03:49:45] PROBLEM Disk Space is now: CRITICAL on grail i-000002c6 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:51:02] PROBLEM dpkg-check is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:51:02] PROBLEM dpkg-check is now: CRITICAL on deployment-apache31 i-000002d4 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:51:52] PROBLEM Total Processes is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:52:02] PROBLEM Disk Space is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[03:52:03] PROBLEM Current Users is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:52:03] PROBLEM Current Load is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds. [03:52:28] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 1.56, 9.55, 6.48 [03:53:17] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 2.36, 4.48, 3.37 [03:53:17] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 15% free memory [03:53:37] RECOVERY Disk Space is now: OK on grail i-000002c6 output: DISK OK [03:55:17] RECOVERY dpkg-check is now: OK on deployment-apache31 i-000002d4 output: All packages OK [03:55:17] RECOVERY dpkg-check is now: OK on migration1 i-00000261 output: All packages OK [03:56:21] RECOVERY Total Processes is now: OK on incubator-bot2 i-00000252 output: PROCS OK: 147 processes [03:56:26] RECOVERY Disk Space is now: OK on migration1 i-00000261 output: DISK OK [03:56:26] RECOVERY Current Users is now: OK on migration1 i-00000261 output: USERS OK - 0 users currently logged in [03:56:26] RECOVERY Current Load is now: OK on migration1 i-00000261 output: OK - load average: 0.09, 1.63, 1.61 [03:57:17] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.63, 3.88, 4.85 [04:01:27] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [04:06:39] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: CHECK_NRPE: Socket timeout after 10 seconds. [04:08:21] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory [04:08:41] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 3% free memory [04:11:54] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: CHECK_NRPE: Socket timeout after 10 seconds. 
[04:13:37] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 97% free memory [04:16:18] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 10% free memory [04:16:37] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 11% free memory [04:18:17] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory [04:20:22] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 9.08, 8.28, 7.91 [04:31:54] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 3% free memory [04:32:54] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [04:53:28] PROBLEM Free ram is now: WARNING on incubator-bot1 i-00000251 output: Warning: 19% free memory [05:02:58] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [05:10:54] PROBLEM Free ram is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [05:10:59] PROBLEM Current Load is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [05:10:59] PROBLEM Free ram is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [05:15:35] PROBLEM Free ram is now: UNKNOWN on wikistats-history-01 i-000002e2 output: NRPE: Unable to read output [05:15:35] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 8.08, 8.13, 8.18 [05:15:35] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: Warning: 10% free memory [05:20:45] PROBLEM Free ram is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. 
[05:33:46] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [06:03:48] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [06:29:40] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. [06:33:17] PROBLEM Disk Space is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds. [06:34:47] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [06:34:57] PROBLEM Total Processes is now: CRITICAL on ganglia-test2 i-00000250 output: PROCS CRITICAL: 201 processes [06:35:32] PROBLEM Free ram is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:35:42] PROBLEM Total Processes is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:35:52] PROBLEM Disk Space is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:35:52] PROBLEM Current Users is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:35:52] PROBLEM Total Processes is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:35:57] PROBLEM Current Load is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:37:25] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 22.04, 19.07, 10.23 [06:37:40] RECOVERY Disk Space is now: OK on mwreview i-000002ae output: DISK OK [06:37:40] PROBLEM Free ram is now: CRITICAL on integration-apache1 i-000002eb output: CHECK_NRPE: Socket timeout after 10 seconds. [06:37:57] PROBLEM Total Processes is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[06:38:07] PROBLEM dpkg-check is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:07] PROBLEM Free ram is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:07] PROBLEM Total Processes is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:13] PROBLEM dpkg-check is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:13] PROBLEM Disk Space is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Current Load is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Current Load is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Free ram is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Disk Space is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Current Users is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:18] PROBLEM Free ram is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:19] PROBLEM Current Users is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:38:19] PROBLEM Current Load is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[06:44:50] PROBLEM SSH is now: CRITICAL on bots-sql2 i-000000af output: (Service Check Timed Out) [06:44:50] PROBLEM dpkg-check is now: CRITICAL on bots-sql2 i-000000af output: (Service Check Timed Out) [06:44:50] PROBLEM Total Processes is now: CRITICAL on bots-sql2 i-000000af output: (Service Check Timed Out) [06:48:18] RECOVERY Total Processes is now: OK on incubator-bot2 i-00000252 output: PROCS OK: 147 processes [06:48:41] RECOVERY Free ram is now: OK on worker1 i-00000208 output: OK: 92% free memory [06:48:41] RECOVERY Current Load is now: OK on worker1 i-00000208 output: OK - load average: 4.32, 4.60, 3.83 [06:48:41] RECOVERY Total Processes is now: OK on worker1 i-00000208 output: PROCS OK: 83 processes [06:48:53] RECOVERY Disk Space is now: OK on worker1 i-00000208 output: DISK OK [06:48:53] RECOVERY Current Users is now: OK on worker1 i-00000208 output: USERS OK - 0 users currently logged in [06:49:41] PROBLEM Free ram is now: CRITICAL on psm-precise i-000002f2 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:49:41] PROBLEM Total Processes is now: CRITICAL on psm-precise i-000002f2 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:50:18] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[06:50:57] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [06:50:57] RECOVERY Total Processes is now: OK on rds i-00000207 output: PROCS OK: 75 processes [06:51:14] RECOVERY dpkg-check is now: OK on incubator-bot2 i-00000252 output: All packages OK [06:51:22] RECOVERY Disk Space is now: OK on rds i-00000207 output: DISK OK [06:51:22] RECOVERY Current Load is now: OK on incubator-bot2 i-00000252 output: OK - load average: 1.93, 2.25, 2.63 [06:51:22] RECOVERY Free ram is now: OK on incubator-bot2 i-00000252 output: OK: 33% free memory [06:51:22] RECOVERY Disk Space is now: OK on incubator-bot2 i-00000252 output: DISK OK [06:51:22] RECOVERY Current Users is now: OK on incubator-bot2 i-00000252 output: USERS OK - 0 users currently logged in [06:51:23] RECOVERY Free ram is now: OK on rds i-00000207 output: OK: 94% free memory [06:51:23] RECOVERY Current Users is now: OK on rds i-00000207 output: USERS OK - 0 users currently logged in [06:51:24] RECOVERY Current Load is now: OK on rds i-00000207 output: OK - load average: 0.31, 2.89, 3.52 [06:52:39] RECOVERY dpkg-check is now: OK on mwreview i-000002ae output: All packages OK [06:52:40] RECOVERY Current Load is now: OK on mwreview i-000002ae output: OK - load average: 0.19, 2.23, 3.15 [06:52:40] RECOVERY Total Processes is now: OK on mwreview i-000002ae output: PROCS OK: 117 processes [06:52:44] RECOVERY Free ram is now: OK on mwreview i-000002ae output: OK: 67% free memory [06:52:45] PROBLEM Free ram is now: CRITICAL on configtest-main i-000002dd output: CHECK_NRPE: Socket timeout after 10 seconds. [06:53:54] PROBLEM Current Users is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:53:54] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[06:53:54] PROBLEM Total Processes is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:54:02] PROBLEM Free ram is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:54:03] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:54:03] PROBLEM Current Users is now: CRITICAL on maps-tilemill1 i-00000294 output: CHECK_NRPE: Socket timeout after 10 seconds. [06:55:15] PROBLEM Current Load is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [06:56:42] PROBLEM Free ram is now: WARNING on incubator-bot1 i-00000251 output: Warning: 18% free memory [06:57:55] PROBLEM Current Users is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [06:57:55] PROBLEM Disk Space is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [06:58:23] PROBLEM Disk Space is now: CRITICAL on integration-apache1 i-000002eb output: CHECK_NRPE: Socket timeout after 10 seconds. [06:58:24] PROBLEM Current Users is now: CRITICAL on integration-apache1 i-000002eb output: CHECK_NRPE: Socket timeout after 10 seconds. [06:58:24] PROBLEM dpkg-check is now: CRITICAL on integration-apache1 i-000002eb output: CHECK_NRPE: Socket timeout after 10 seconds. 
[06:59:08] PROBLEM Total Processes is now: WARNING on ganglia-test2 i-00000250 output: PROCS WARNING: 189 processes [06:59:13] PROBLEM Free ram is now: UNKNOWN on configtest-main i-000002dd output: NRPE: Unable to read output [06:59:13] PROBLEM Free ram is now: UNKNOWN on integration-apache1 i-000002eb output: NRPE: Unable to read output [06:59:13] RECOVERY Current Users is now: OK on mobile-testing i-00000271 output: USERS OK - 0 users currently logged in [06:59:13] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.67, 2.75, 3.02 [06:59:13] RECOVERY Total Processes is now: OK on mobile-testing i-00000271 output: PROCS OK: 135 processes [06:59:26] RECOVERY Free ram is now: OK on mobile-testing i-00000271 output: OK: 94% free memory [06:59:26] RECOVERY Current Users is now: OK on maps-tilemill1 i-00000294 output: USERS OK - 0 users currently logged in [06:59:26] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK [07:00:09] PROBLEM Current Load is now: WARNING on integration-apache1 i-000002eb output: WARNING - load average: 7.37, 9.08, 9.43 [07:00:15] PROBLEM Current Load is now: WARNING on deployment-apache30 i-000002d3 output: WARNING - load average: 7.37, 9.10, 5.96 [07:01:18] PROBLEM Current Load is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:01:19] PROBLEM Free ram is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:01:19] PROBLEM Disk Space is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:01:19] PROBLEM Current Users is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:01:19] PROBLEM Total Processes is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:01:49] PROBLEM Disk Space is now: CRITICAL on ipv6test1 i-00000282 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:01:58] RECOVERY Total Processes is now: OK on bots-sql2 i-000000af output: PROCS OK: 99 processes [07:02:05] RECOVERY dpkg-check is now: OK on bots-sql2 i-000000af output: All packages OK [07:02:05] RECOVERY SSH is now: OK on bots-sql2 i-000000af output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [07:03:14] PROBLEM Free ram is now: UNKNOWN on psm-precise i-000002f2 output: NRPE: Unable to read output [07:03:14] PROBLEM Total Processes is now: WARNING on psm-precise i-000002f2 output: PROCS WARNING: 174 processes [07:03:27] RECOVERY Disk Space is now: OK on integration-apache1 i-000002eb output: DISK OK [07:03:28] RECOVERY Current Users is now: OK on integration-apache1 i-000002eb output: USERS OK - 1 users currently logged in [07:03:28] RECOVERY dpkg-check is now: OK on integration-apache1 i-000002eb output: All packages OK [07:03:32] RECOVERY Current Users is now: OK on bots-sql2 i-000000af output: USERS OK - 0 users currently logged in [07:03:33] RECOVERY Disk Space is now: OK on bots-sql2 i-000000af output: DISK OK [07:03:33] PROBLEM Current Load is now: WARNING on deployment-jobrunner05 i-0000028c output: WARNING - load average: 8.75, 7.85, 5.45 [07:04:25] PROBLEM Total Processes is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:05:05] PROBLEM Disk Space is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:05:05] PROBLEM Current Users is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:05:05] PROBLEM Free ram is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:05:05] PROBLEM dpkg-check is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:05:14] PROBLEM dpkg-check is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:05:57] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [07:06:08] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:17] PROBLEM Current Load is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:17] PROBLEM Total Processes is now: CRITICAL on mobile-wlm i-000002bc output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:22] PROBLEM Disk Space is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:22] PROBLEM Free ram is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:22] PROBLEM dpkg-check is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:22] PROBLEM Total Processes is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:29] PROBLEM Current Load is now: CRITICAL on dumps-2 i-000002d8 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:29] PROBLEM Current Users is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:29] PROBLEM Current Load is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:36] PROBLEM Current Users is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:06:36] PROBLEM Total Processes is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:06:51] PROBLEM Disk Space is now: WARNING on ipv6test1 i-00000282 output: DISK WARNING - free space: / 68 MB (5% inode=57%): [07:06:51] PROBLEM Current Load is now: WARNING on pediapress-ocg1 i-00000233 output: WARNING - load average: 6.45, 6.65, 5.26 [07:07:15] PROBLEM Free ram is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:07:52] PROBLEM Current Load is now: CRITICAL on build-precise1 i-00000273 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:07:53] PROBLEM Total Processes is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:07:53] PROBLEM Current Load is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:07:53] PROBLEM Disk Space is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:08:56] RECOVERY dpkg-check is now: OK on nova-precise1 i-00000236 output: All packages OK [07:08:56] PROBLEM Current Load is now: CRITICAL on deployment-jobrunner05 i-0000028c output: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:37] PROBLEM Total Processes is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM Disk Space is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM Current Users is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM Free ram is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM dpkg-check is now: CRITICAL on deployment-apache30 i-000002d3 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM Current Load is now: CRITICAL on mobile-wlm i-000002bc output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:10:38] PROBLEM dpkg-check is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:38] PROBLEM Current Users is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:39] PROBLEM Current Load is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:39] PROBLEM Disk Space is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:40] PROBLEM Free ram is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:48] RECOVERY Current Load is now: OK on integration-apache1 i-000002eb output: OK - load average: 0.35, 1.25, 4.75 [07:10:59] PROBLEM Current Load is now: WARNING on deployment-apache30 i-000002d3 output: WARNING - load average: 22.29, 15.62, 9.97 [07:10:59] RECOVERY Current Load is now: OK on dumps-2 i-000002d8 output: OK - load average: 2.92, 5.29, 4.56 [07:10:59] RECOVERY Disk Space is now: OK on ve-nodejs i-00000245 output: DISK OK [07:10:59] PROBLEM Current Load is now: WARNING on ve-nodejs i-00000245 output: WARNING - load average: 6.58, 7.23, 5.45 [07:10:59] RECOVERY Current Users is now: OK on ve-nodejs i-00000245 output: USERS OK - 0 users currently logged in [07:11:00] RECOVERY Total Processes is now: OK on ve-nodejs i-00000245 output: PROCS OK: 121 processes [07:11:07] RECOVERY Free ram is now: OK on ve-nodejs i-00000245 output: OK: 67% free memory [07:11:07] RECOVERY dpkg-check is now: OK on ve-nodejs i-00000245 output: All packages OK [07:11:51] PROBLEM Total Processes is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:11:59] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:12:18] PROBLEM Free ram is now: UNKNOWN on wikistats-history-01 i-000002e2 output: NRPE: Unable to read output [07:12:18] PROBLEM Current Load is now: WARNING on wikistats-history-01 i-000002e2 output: WARNING - load average: 6.93, 10.55, 8.24 [07:12:18] RECOVERY Disk Space is now: OK on wikistats-history-01 i-000002e2 output: DISK OK [07:12:18] RECOVERY Total Processes is now: OK on wikistats-history-01 i-000002e2 output: PROCS OK: 94 processes [07:12:42] PROBLEM Current Load is now: CRITICAL on dumps-incr i-000002bb output: CHECK_NRPE: Socket timeout after 10 seconds. [07:13:32] PROBLEM Current Load is now: WARNING on deployment-jobrunner05 i-0000028c output: WARNING - load average: 2.27, 5.29, 5.70 [07:13:32] RECOVERY Total Processes is now: OK on dumps-2 i-000002d8 output: PROCS OK: 95 processes [07:13:40] RECOVERY Disk Space is now: OK on dumps-2 i-000002d8 output: DISK OK [07:13:40] RECOVERY Current Users is now: OK on dumps-2 i-000002d8 output: USERS OK - 0 users currently logged in [07:13:40] RECOVERY Free ram is now: OK on dumps-2 i-000002d8 output: OK: 90% free memory [07:13:40] RECOVERY dpkg-check is now: OK on dumps-2 i-000002d8 output: All packages OK [07:14:22] PROBLEM Total Processes is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:15:59] RECOVERY Current Users is now: OK on e3 i-00000291 output: USERS OK - 0 users currently logged in [07:15:59] RECOVERY Total Processes is now: OK on e3 i-00000291 output: PROCS OK: 96 processes [07:16:04] PROBLEM Current Load is now: CRITICAL on ve-nodejs i-00000245 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:17:20] PROBLEM Current Load is now: WARNING on build-precise1 i-00000273 output: WARNING - load average: 4.64, 5.56, 5.72 [07:17:20] PROBLEM Current Load is now: WARNING on dumps-incr i-000002bb output: WARNING - load average: 0.43, 3.90, 6.02 [07:17:24] PROBLEM Free ram is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:17:24] PROBLEM Current Load is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:19:16] RECOVERY Total Processes is now: OK on zeromq1 i-000002b7 output: PROCS OK: 88 processes [07:19:21] RECOVERY Total Processes is now: OK on deployment-apache30 i-000002d3 output: PROCS OK: 118 processes [07:20:36] RECOVERY Disk Space is now: OK on deployment-apache30 i-000002d3 output: DISK OK [07:20:36] RECOVERY Current Users is now: OK on deployment-apache30 i-000002d3 output: USERS OK - 0 users currently logged in [07:20:36] RECOVERY Free ram is now: OK on deployment-apache30 i-000002d3 output: OK: 91% free memory [07:20:36] PROBLEM Current Load is now: WARNING on mobile-wlm i-000002bc output: WARNING - load average: 0.42, 4.80, 6.65 [07:20:36] RECOVERY Current Users is now: OK on zeromq1 i-000002b7 output: USERS OK - 0 users currently logged in [07:20:36] RECOVERY Current Load is now: OK on zeromq1 i-000002b7 output: OK - load average: 0.12, 2.47, 3.27 [07:20:36] RECOVERY Disk Space is now: OK on zeromq1 i-000002b7 output: DISK OK [07:20:37] RECOVERY dpkg-check is now: OK on zeromq1 i-000002b7 output: All packages OK [07:20:37] RECOVERY dpkg-check is now: OK on deployment-apache30 i-000002d3 output: All packages OK [07:20:38] RECOVERY Free ram is now: OK on zeromq1 i-000002b7 output: OK: 82% free memory [07:20:56] RECOVERY Current Load is now: OK on nova-precise1 i-00000236 output: OK - load average: 0.12, 2.69, 4.54 [07:20:56] RECOVERY Free ram is now: OK on nova-precise1 i-00000236 output: OK: 82% free memory 
[07:20:56] RECOVERY Disk Space is now: OK on nova-precise1 i-00000236 output: DISK OK [07:20:56] RECOVERY Current Users is now: OK on nova-precise1 i-00000236 output: USERS OK - 0 users currently logged in [07:20:56] RECOVERY Total Processes is now: OK on mobile-wlm i-000002bc output: PROCS OK: 100 processes [07:21:01] RECOVERY Total Processes is now: OK on nova-precise1 i-00000236 output: PROCS OK: 120 processes [07:21:36] RECOVERY Total Processes is now: OK on pediapress-ocg2 i-00000234 output: PROCS OK: 90 processes [07:21:41] RECOVERY dpkg-check is now: OK on pediapress-ocg2 i-00000234 output: All packages OK [07:22:16] RECOVERY Current Load is now: OK on pediapress-ocg1 i-00000233 output: OK - load average: 1.16, 2.19, 4.09 [07:22:16] RECOVERY Current Load is now: OK on build-precise1 i-00000273 output: OK - load average: 0.33, 2.86, 4.57 [07:22:16] RECOVERY Current Load is now: OK on dumps-incr i-000002bb output: OK - load average: 0.18, 1.50, 4.39 [07:25:36] RECOVERY Current Load is now: OK on mobile-wlm i-000002bc output: OK - load average: 0.63, 2.03, 4.91 [07:25:56] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 0.47, 0.91, 3.54 [07:30:56] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 0.91, 0.73, 2.71 [07:37:25] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [08:08:01] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [08:38:06] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [08:45:38] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [09:06:03] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. 
[09:08:33] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [09:10:43] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [09:27:13] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: Warning: 11% free memory [09:27:13] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 7.01, 7.22, 7.29 [09:35:33] PROBLEM dpkg-check is now: CRITICAL on bots-3 i-000000e5 output: CHECK_NRPE: Socket timeout after 10 seconds. [09:38:37] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [09:41:47] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: CHECK_NRPE: Socket timeout after 10 seconds. [09:45:20] PROBLEM Current Load is now: WARNING on bots-3 i-000000e5 output: WARNING - load average: 7.27, 7.22, 5.62 [09:45:20] RECOVERY dpkg-check is now: OK on bots-3 i-000000e5 output: All packages OK [09:46:35] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 8% free memory [09:48:18] PROBLEM Free ram is now: UNKNOWN on wikistats-history-01 i-000002e2 output: NRPE: Unable to read output [09:51:38] RECOVERY Free ram is now: OK on bots-3 i-000000e5 output: OK: 47% free memory [09:55:18] RECOVERY Current Load is now: OK on bots-3 i-000000e5 output: OK - load average: 2.64, 3.55, 4.47 [10:08:48] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [10:25:41] 06/30/2012 - 10:25:40 - User laner may have been modified in LDAP or locally, updating key in project(s): deployment-prep [10:29:50] PROBLEM host: geoip-on-labs is DOWN address: i-000002f4 check_ping: Invalid hostname/address - i-000002f4 [10:38:56] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [10:45:16] PROBLEM Puppet freshness is now: CRITICAL on wikistats-01 i-00000042 output: Puppet has not run in last 20 hours 
[11:05:22] PROBLEM host: deployment-jobrunner05 is DOWN address: i-0000028c CRITICAL - Host Unreachable (i-0000028c) [11:08:47] PROBLEM Total Processes is now: CRITICAL on psm-precise i-000002f2 output: PROCS CRITICAL: 207 processes [11:11:17] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [11:13:57] RECOVERY host: deployment-jobrunner05 is UP address: i-0000028c PING OK - Packet loss = 0%, RTA = 1.26 ms [11:18:47] PROBLEM Total Processes is now: WARNING on psm-precise i-000002f2 output: PROCS WARNING: 199 processes [11:27:30] 06/30/2012 - 11:27:30 - User laner may have been modified in LDAP or locally, updating key in project(s): deployment-prep [11:41:32] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [11:48:32] PROBLEM Free ram is now: CRITICAL on wikistats-history-01 i-000002e2 output: CHECK_NRPE: Socket timeout after 10 seconds. [11:53:22] PROBLEM Free ram is now: UNKNOWN on wikistats-history-01 i-000002e2 output: NRPE: Unable to read output [12:11:32] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [12:41:35] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [13:11:35] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [13:42:36] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [13:53:57] PROBLEM Total Processes is now: CRITICAL on psm-precise i-000002f2 output: PROCS CRITICAL: 201 processes [13:58:56] PROBLEM Total Processes is now: WARNING on psm-precise i-000002f2 output: PROCS WARNING: 200 processes [14:12:36] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [14:23:56] PROBLEM Total Processes is now: CRITICAL on psm-precise i-000002f2 output: PROCS CRITICAL: 201 processes [14:42:36] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable 
(i-000002f0) [15:12:38] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [15:34:25] PROBLEM Current Users is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:35:05] PROBLEM Disk Space is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:35:45] PROBLEM Free ram is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:36:26] PROBLEM SSH is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused [15:36:55] PROBLEM Total Processes is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:37:35] PROBLEM dpkg-check is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:38:45] PROBLEM Current Load is now: CRITICAL on pdbhandler-dev i-000002f6 output: Connection refused by host [15:42:44] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [15:46:14] RECOVERY SSH is now: OK on pdbhandler-dev i-000002f6 output: SSH OK - OpenSSH_5.8p1 Debian-7ubuntu1 (protocol 2.0) [16:07:47] PROBLEM Free ram is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [16:12:41] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: Warning: 10% free memory [16:14:11] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [16:14:52] i have a new project and am trying to get its environment as close to wikimedia's as i can. i created an instance with oneiric, but realized soon after that using lucid (the default) would be better. is it OK if i just delete that oneiric instance and replace it with a lucid instance? [16:15:52] the instance doesn't yet have a floating IP address or DNS configured [16:16:01] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:16:42] Emw: I think instance management is entirely up to you [16:17:05] GChriss: i guess that'd make sense [16:18:26] were you able to get set up with that project you needed? i found requesting one on labs-l the quickest venue [16:19:17] pending [16:19:28] * GChriss bugs channel [16:20:54] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [16:35:44] PROBLEM Free ram is now: UNKNOWN on pdbhandler-dev i-000002f7 output: NRPE: Unable to read output [16:44:11] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [17:14:11] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [17:29:11] RECOVERY Free ram is now: OK on incubator-bot1 i-00000251 output: OK: 21% free memory [17:41:07] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. [17:44:12] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [18:02:26] PROBLEM Free ram is now: WARNING on incubator-bot1 i-00000251 output: Warning: 19% free memory [18:14:19] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [18:20:31] 06/30/2012 - 18:20:30 - User emw may have been modified in LDAP or locally, updating key in project(s): bastion,pdbhandler [18:20:37] 06/30/2012 - 18:20:37 - Updating keys for emw at /export/keys/emw [18:42:02] PROBLEM Free ram is now: WARNING on ganglia-test2 i-00000250 output: Warning: 19% free memory [18:44:22] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [18:47:02] RECOVERY Free ram is now: OK on ganglia-test2 i-00000250 output: OK: 20% free memory [19:00:04] PROBLEM Free ram is now: WARNING on ganglia-test2 i-00000250 output: Warning: 19% free memory [19:07:04] PROBLEM Total Processes is now: WARNING on bots-3 i-000000e5 output: PROCS WARNING: 151 processes [19:13:06] 
RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 3.91, 4.00, 4.68 [19:14:24] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [19:20:54] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [19:44:24] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [19:56:08] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 5.12, 6.69, 5.87 [20:14:24] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [20:16:04] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 3.46, 3.99, 4.77 [20:44:31] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [20:46:11] PROBLEM Puppet freshness is now: CRITICAL on wikistats-01 i-00000042 output: Puppet has not run in last 20 hours [20:50:59] PROBLEM Free ram is now: CRITICAL on etherpad-lite i-000002de output: CHECK_NRPE: Socket timeout after 10 seconds. [20:55:49] PROBLEM Free ram is now: UNKNOWN on etherpad-lite i-000002de output: NRPE: Unable to read output [21:12:10] RECOVERY Total Processes is now: OK on bots-3 i-000000e5 output: PROCS OK: 118 processes [21:14:40] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [21:44:46] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [22:14:50] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [22:19:10] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 7.79, 7.24, 5.65 [22:41:18] PROBLEM Total Processes is now: CRITICAL on ganglia-test2 i-00000250 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[22:45:57] PROBLEM Total Processes is now: WARNING on ganglia-test2 i-00000250 output: PROCS WARNING: 186 processes [22:47:07] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [23:09:17] PROBLEM Current Load is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [23:14:07] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 6.08, 6.50, 6.85 [23:17:07] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0) [23:19:57] RECOVERY Free ram is now: OK on ganglia-test2 i-00000250 output: NRPE: Unable to read output [23:24:37] PROBLEM dpkg-check is now: CRITICAL on aggregator2 i-000002c0 output: CHECK_NRPE: Error - Could not complete SSL handshake. [23:29:58] PROBLEM Free ram is now: CRITICAL on psm-precise i-000002f2 output: CHECK_NRPE: Socket timeout after 10 seconds. [23:34:48] PROBLEM Free ram is now: UNKNOWN on psm-precise i-000002f2 output: NRPE: Unable to read output [23:47:09] PROBLEM host: nginx-dev2 is DOWN address: i-000002f0 CRITICAL - Host Unreachable (i-000002f0)