[00:26:48] PROBLEM Free ram is now: CRITICAL on ganglia-test2 i-00000250 output: CHECK_NRPE: Socket timeout after 10 seconds.
[00:31:39] PROBLEM Free ram is now: WARNING on ganglia-test2 i-00000250 output: Warning: 14% free memory
[00:42:33] !log deployment-prep hashar: Created a dumb `aawiki` database
[00:42:36] Logged the message, Master
[00:46:05] !log deployment-prep hashar: jobrunner01 seems to start catching up with jobs
[00:46:06] Logged the message, Master
[00:57:53] hashar: the labs RC is going to the same irc.wikimedia.org channels as the production cluster
[00:58:03] coool
[00:58:22] that one must be an old bug, isn't it?
[00:58:31] not sure, I only noticed it today
[00:59:00] I didn't think we had set up RC feeds yet (although that was in january/feb)
[00:59:31] oh
[00:59:44] well, could you possibly open a bug in Bugzilla please? :)
[00:59:56] will take a look at it next week hopefully
[01:00:14] sure
[01:00:35] I'll do some more testing, make sure it's not just on one wiki
[01:01:02] if that happens on one, I am pretty sure that is the case for all wikis
[01:01:38] ok
[01:05:46] !log deployment-prep might have disabled IRC notification by setting wgRC2UDPAddress in InitialiseSettingsDeploy.php
[01:05:48] Logged the message, Master
[01:05:56] Thehelpfulone: I might have disabled that, not sure
[01:35:52] PROBLEM Puppet freshness is now: CRITICAL on localpuppet1 i-0000020b output: Puppet has not run in last 20 hours
[02:02:18] !log deployment-prep on -nfs-memc, running 'chown -R 48 /mnt/export/upload6' so files get owned by user apache on apaches and job runner boxes
[02:02:21] Logged the message, Master
[02:39:19] 05/20/2012 - 02:39:19 - Updating keys for laner at /export/home/deployment-prep/laner
[02:43:19] 05/20/2012 - 02:43:19 - Updating keys for laner at /export/home/deployment-prep/laner
[02:49:19] 05/20/2012 - 02:49:19 - Updating keys for laner at /export/home/deployment-prep/laner
[03:09:10] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[03:09:11] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[03:09:11] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[03:09:11] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[03:11:18] PROBLEM Disk Space is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:37] PROBLEM Current Users is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:37] PROBLEM Total Processes is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:42] PROBLEM Current Load is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:42] PROBLEM Free ram is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:16:37] PROBLEM Current Load is now: WARNING on deployment-nfs-memc i-000000d7 output: WARNING - load average: 11.86, 10.25, 6.94
[03:16:38] RECOVERY Disk Space is now: OK on upload-wizard i-0000021c output: DISK OK
[03:16:38] RECOVERY Current Users is now: OK on upload-wizard i-0000021c output: USERS OK - 0 users currently logged in
[03:16:38] RECOVERY Total Processes is now: OK on upload-wizard i-0000021c output: PROCS OK: 93 processes
[03:16:43] RECOVERY Current Load is now: OK on upload-wizard i-0000021c output: OK - load average: 4.66, 4.47, 3.15
[03:16:43] RECOVERY Free ram is now: OK on upload-wizard i-0000021c output: OK: 91% free memory
[03:17:55] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 4.84, 8.61, 6.00
[03:19:21] PROBLEM Free ram is now: CRITICAL on ganglia-test2 i-00000250 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:23:54] PROBLEM Free ram is now: WARNING on ganglia-test2 i-00000250 output: Warning: 14% free memory
[03:23:54] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.016 second response time
[03:23:54] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.016 second response time
[03:23:54] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.014 second response time
[03:23:54] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.010 second response time
[03:30:20] 05/20/2012 - 03:30:20 - Updating keys for laner at /export/home/deployment-prep/laner
[03:31:24] RECOVERY Current Load is now: OK on deployment-nfs-memc i-000000d7 output: OK - load average: 0.74, 2.10, 4.71
[03:32:32] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 1.30, 1.70, 3.74
[03:42:26] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 1.55, 1.55, 2.74
[03:52:32] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 14% free memory
[03:56:22] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 15% free memory
[04:00:48] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 17% free memory
[04:04:41] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[04:04:42] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[04:04:42] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[04:04:42] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[04:07:50] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory
[04:17:44] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory
[04:18:24] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 15% free memory
[04:20:24] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 4% free memory
[04:27:05] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 97% free memory
[04:27:24] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 4% free memory
[04:32:49] PROBLEM Free ram is now: CRITICAL on ganglia-test2 i-00000250 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM Current Users is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM Current Load is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM dpkg-check is now: CRITICAL on reportcard2 i-000001ea output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM Free ram is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM Disk Space is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:33:55] PROBLEM Total Processes is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:35:03] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 94% free memory
[04:35:22] PROBLEM Disk Space is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:35:22] PROBLEM Current Load is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:35:22] PROBLEM Current Users is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:35:22] PROBLEM dpkg-check is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:35:22] PROBLEM Total Processes is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:36:01] PROBLEM Free ram is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:36:30] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 1.35, 12.89, 11.15
[04:38:20] RECOVERY Current Load is now: OK on rds i-00000207 output: OK - load average: 4.00, 5.87, 4.27
[04:38:20] RECOVERY Current Users is now: OK on rds i-00000207 output: USERS OK - 0 users currently logged in
[04:38:20] RECOVERY Free ram is now: OK on rds i-00000207 output: OK: 94% free memory
[04:38:20] RECOVERY Disk Space is now: OK on rds i-00000207 output: DISK OK
[04:38:20] RECOVERY Total Processes is now: OK on rds i-00000207 output: PROCS OK: 78 processes
[04:38:25] RECOVERY dpkg-check is now: OK on reportcard2 i-000001ea output: All packages OK
[04:39:43] RECOVERY Current Load is now: OK on migration1 i-00000261 output: OK - load average: 1.67, 4.22, 3.26
[04:39:43] RECOVERY Disk Space is now: OK on migration1 i-00000261 output: DISK OK
[04:39:43] RECOVERY Current Users is now: OK on migration1 i-00000261 output: USERS OK - 0 users currently logged in
[04:39:43] RECOVERY Total Processes is now: OK on migration1 i-00000261 output: PROCS OK: 89 processes
[04:39:48] RECOVERY dpkg-check is now: OK on migration1 i-00000261 output: All packages OK
[04:40:24] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 4% free memory
[04:40:51] RECOVERY Free ram is now: OK on migration1 i-00000261 output: OK: 83% free memory
[04:45:58] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 95% free memory
[04:51:38] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.19, 1.03, 4.52
[05:16:50] PROBLEM Current Load is now: WARNING on deployment-nfs-memc i-000000d7 output: WARNING - load average: 2.91, 6.34, 5.58
[05:21:53] RECOVERY Current Load is now: OK on deployment-nfs-memc i-000000d7 output: OK - load average: 0.52, 2.80, 4.26
[05:33:03] PROBLEM Puppet freshness is now: CRITICAL on nova-ldap1 i-000000df output: Puppet has not run in last 20 hours
[06:33:45] PROBLEM Free ram is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:33:45] PROBLEM Current Load is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:33:45] PROBLEM Disk Space is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:33:45] PROBLEM Current Users is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:33:45] PROBLEM Total Processes is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:33:50] PROBLEM dpkg-check is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:52] RECOVERY Free ram is now: OK on migration1 i-00000261 output: OK: 83% free memory
[06:38:52] RECOVERY Disk Space is now: OK on migration1 i-00000261 output: DISK OK
[06:38:57] RECOVERY dpkg-check is now: OK on migration1 i-00000261 output: All packages OK
[06:38:57] RECOVERY Total Processes is now: OK on migration1 i-00000261 output: PROCS OK: 94 processes
[06:39:02] RECOVERY Current Users is now: OK on migration1 i-00000261 output: USERS OK - 0 users currently logged in
[06:39:02] RECOVERY Current Load is now: OK on migration1 i-00000261 output: OK - load average: 3.82, 4.29, 2.50
[06:44:53] PROBLEM Current Load is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:44:53] PROBLEM Total Processes is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:44:58] PROBLEM Current Users is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:44:58] PROBLEM Disk Space is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM Free ram is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM Current Load is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM Disk Space is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM Current Users is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM Free ram is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:09] PROBLEM dpkg-check is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:46:51] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.018 second response time
[06:58:02] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.014 second response time
[06:58:02] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.014 second response time
[06:58:11] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.011 second response time
[06:58:21] PROBLEM Current Load is now: WARNING on reportcard2 i-000001ea output: WARNING - load average: 16.09, 13.84, 7.46
[07:09:25] PROBLEM Free ram is now: CRITICAL on bots-2 i-0000009c output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:21:15] RECOVERY Free ram is now: OK on rds i-00000207 output: OK: 91% free memory
[07:21:15] RECOVERY Disk Space is now: OK on rds i-00000207 output: DISK OK
[07:21:15] RECOVERY Current Users is now: OK on rds i-00000207 output: USERS OK - 0 users currently logged in
[07:21:20] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[07:21:25] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[07:21:25] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[07:21:25] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[07:21:50] PROBLEM Current Load is now: WARNING on incubator-bot2 i-00000252 output: WARNING - load average: 7.32, 7.79, 8.22
[07:30:09] PROBLEM Free ram is now: WARNING on bots-2 i-0000009c output: Warning: 13% free memory
[07:31:06] PROBLEM Current Load is now: WARNING on rds i-00000207 output: WARNING - load average: 12.76, 9.83, 8.10
[07:31:11] RECOVERY Disk Space is now: OK on incubator-bot2 i-00000252 output: DISK OK
[07:32:56] RECOVERY Current Users is now: OK on incubator-bot2 i-00000252 output: USERS OK - 0 users currently logged in
[07:32:56] RECOVERY Free ram is now: OK on incubator-bot2 i-00000252 output: OK: 51% free memory
[07:32:56] RECOVERY dpkg-check is now: OK on incubator-bot2 i-00000252 output: All packages OK
[07:43:11] PROBLEM Current Load is now: WARNING on swift-be2 i-000001c8 output: WARNING - load average: 3.10, 4.64, 6.58
[07:43:16] PROBLEM Current Load is now: WARNING on wep i-000000c2 output: WARNING - load average: 3.47, 4.84, 6.48
[07:50:27] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:50:32] PROBLEM Current Load is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:52:41] PROBLEM Current Load is now: WARNING on bots-2 i-0000009c output: WARNING - load average: 3.58, 4.56, 6.07
[07:52:42] PROBLEM Current Load is now: WARNING on ganglia-test2 i-00000250 output: WARNING - load average: 7.10, 7.91, 7.61
[07:52:42] PROBLEM Current Load is now: WARNING on labs-nfs1 i-0000005d output: WARNING - load average: 2.70, 5.75, 8.81
[07:52:52] PROBLEM Current Load is now: WARNING on deployment-web3 i-00000219 output: WARNING - load average: 4.03, 5.35, 5.97
[07:52:52] PROBLEM Current Load is now: WARNING on deployment-apache22 i-0000026f output: WARNING - load average: 4.33, 5.60, 6.50
[07:52:52] PROBLEM Current Load is now: WARNING on incubator-bot1 i-00000251 output: WARNING - load average: 6.47, 8.57, 9.14
[07:52:52] PROBLEM Current Load is now: WARNING on deployment-transcoding i-00000105 output: WARNING - load average: 2.84, 5.25, 5.84
[07:52:52] PROBLEM Current Load is now: WARNING on aggregator-test2 i-0000024e output: WARNING - load average: 7.64, 7.38, 6.87
[07:52:52] PROBLEM Current Load is now: CRITICAL on bots-cb i-0000009e output: CRITICAL - load average: 6.51, 15.64, 34.34
[07:52:52] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 9.93, 11.90, 11.60
[07:53:02] PROBLEM Current Load is now: WARNING on swift-be4 i-000001ca output: WARNING - load average: 10.45, 8.15, 8.36
[07:53:02] PROBLEM Disk Space is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:02] PROBLEM Total Processes is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:07] PROBLEM Current Load is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:07] PROBLEM Free ram is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:08] PROBLEM dpkg-check is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:08] PROBLEM Current Users is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:12] PROBLEM Current Load is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:53:17] PROBLEM Current Load is now: CRITICAL on reportcard2 i-000001ea output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:56:58] PROBLEM Current Load is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:57:13] RECOVERY Current Load is now: OK on swift-be2 i-000001c8 output: OK - load average: 2.17, 2.24, 3.95
[07:57:13] RECOVERY Current Load is now: OK on wep i-000000c2 output: OK - load average: 1.27, 1.65, 3.68
[07:57:13] PROBLEM Current Load is now: WARNING on upload-wizard i-0000021c output: WARNING - load average: 7.76, 7.91, 9.30
[07:57:13] PROBLEM Current Load is now: WARNING on deployment-nfs-memc i-000000d7 output: WARNING - load average: 9.75, 10.17, 10.61
[07:57:13] PROBLEM Current Load is now: WARNING on swift-be3 i-000001c9 output: WARNING - load average: 2.44, 3.63, 5.33
[07:58:57] RECOVERY Current Load is now: OK on deployment-transcoding i-00000105 output: OK - load average: 1.46, 2.54, 4.36
[07:59:07] PROBLEM Current Users is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:59:07] PROBLEM Disk Space is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:59:07] PROBLEM dpkg-check is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:59:17] PROBLEM Current Load is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:19] PROBLEM Free ram is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:20] PROBLEM Disk Space is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:20] PROBLEM Current Users is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:20] PROBLEM Free ram is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:20] PROBLEM Total Processes is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:03:25] PROBLEM dpkg-check is now: CRITICAL on precise-test i-00000231 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:07:09] RECOVERY Current Load is now: OK on swift-be3 i-000001c9 output: OK - load average: 4.11, 2.38, 3.92
[08:07:09] PROBLEM Current Load is now: WARNING on mobile-testing i-00000271 output: WARNING - load average: 7.25, 9.40, 12.65
[08:07:11] RECOVERY Current Users is now: OK on precise-test i-00000231 output: USERS OK - 0 users currently logged in
[08:07:11] RECOVERY Disk Space is now: OK on precise-test i-00000231 output: DISK OK
[08:07:19] PROBLEM Current Load is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:07:19] RECOVERY Current Load is now: OK on labs-nfs1 i-0000005d output: OK - load average: 1.86, 1.98, 4.22
[08:07:19] RECOVERY Current Load is now: OK on deployment-apache22 i-0000026f output: OK - load average: 0.85, 1.32, 3.33
[08:07:19] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 3.32, 4.90, 15.37
[08:07:24] RECOVERY Current Load is now: OK on bots-2 i-0000009c output: OK - load average: 4.60, 3.84, 4.37
[08:07:39] PROBLEM Current Load is now: WARNING on reportcard2 i-000001ea output: WARNING - load average: 17.07, 16.23, 13.63
[08:09:32] PROBLEM SSH is now: CRITICAL on bots-sql2 i-000000af output: CRITICAL - Socket timeout after 10 seconds
[08:11:41] PROBLEM Disk Space is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:15:30] RECOVERY Current Load is now: OK on precise-test i-00000231 output: OK - load average: 3.48, 3.81, 4.77
[08:15:30] RECOVERY Free ram is now: OK on precise-test i-00000231 output: OK: 86% free memory
[08:15:30] RECOVERY Total Processes is now: OK on precise-test i-00000231 output: PROCS OK: 91 processes
[08:15:35] RECOVERY dpkg-check is now: OK on precise-test i-00000231 output: All packages OK
[08:21:53] PROBLEM Current Load is now: WARNING on deployment-imagescaler01 i-0000025a output: WARNING - load average: 6.35, 7.43, 6.69
[08:21:53] PROBLEM Current Load is now: WARNING on bots-apache1 i-000000b0 output: WARNING - load average: 6.82, 5.59, 5.54
[08:22:18] PROBLEM Current Users is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:22:18] PROBLEM Total Processes is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:22:23] PROBLEM Free ram is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:22:23] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:23:15] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.009 second response time
[08:23:15] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.009 second response time
[08:23:15] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.013 second response time
[08:23:15] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.010 second response time
[08:23:15] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 6.05, 8.99, 10.73
[08:25:06] PROBLEM Current Load is now: WARNING on bots-2 i-0000009c output: WARNING - load average: 7.76, 9.57, 8.04
[08:25:06] RECOVERY Current Load is now: OK on deployment-imagescaler01 i-0000025a output: OK - load average: 0.55, 2.93, 4.91
[08:25:06] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 7.22, 11.74, 13.95
[08:25:06] RECOVERY SSH is now: OK on bots-sql2 i-000000af output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[08:25:06] RECOVERY dpkg-check is now: OK on bots-sql2 i-000000af output: All packages OK
[08:26:07] PROBLEM Current Load is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:27:07] PROBLEM Current Load is now: WARNING on ganglia-test2 i-00000250 output: WARNING - load average: 7.80, 10.31, 14.97
[08:27:07] PROBLEM Current Load is now: WARNING on labs-nfs1 i-0000005d output: WARNING - load average: 6.54, 7.97, 6.64
[08:27:08] PROBLEM Total Processes is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:31:21] RECOVERY Current Load is now: OK on deployment-web3 i-00000219 output: OK - load average: 0.20, 1.86, 4.10
[08:31:21] RECOVERY Current Load is now: OK on incubator-bot1 i-00000251 output: OK - load average: 1.07, 3.16, 4.79
[08:31:41] RECOVERY Total Processes is now: OK on rds i-00000207 output: PROCS OK: 91 processes
[08:34:35] RECOVERY Current Load is now: OK on swift-be4 i-000001ca output: OK - load average: 0.97, 2.23, 4.62
[08:34:37] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[08:34:37] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[08:34:47] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[08:34:47] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[08:34:47] PROBLEM Disk Space is now: CRITICAL on precise-test i-00000231 output: Connection refused or timed out
[08:34:47] PROBLEM Current Users is now: CRITICAL on precise-test i-00000231 output: Connection refused or timed out
[08:35:24] PROBLEM Disk Space is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:35:24] PROBLEM Current Users is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:35:24] PROBLEM Free ram is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:35:24] PROBLEM Total Processes is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:38:49] PROBLEM Current Load is now: WARNING on worker1 i-00000208 output: WARNING - load average: 0.73, 4.37, 5.44
[08:38:49] RECOVERY Total Processes is now: OK on mobile-testing i-00000271 output: PROCS OK: 150 processes
[08:39:42] RECOVERY Current Load is now: OK on aggregator-test2 i-0000024e output: OK - load average: 2.67, 3.13, 4.61
[08:40:17] RECOVERY Disk Space is now: OK on worker1 i-00000208 output: DISK OK
[08:40:17] RECOVERY Current Users is now: OK on worker1 i-00000208 output: USERS OK - 0 users currently logged in
[08:40:17] RECOVERY Free ram is now: OK on worker1 i-00000208 output: OK: 94% free memory
[08:40:17] RECOVERY Total Processes is now: OK on worker1 i-00000208 output: PROCS OK: 86 processes
[08:41:44] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:41:44] PROBLEM dpkg-check is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[08:43:03] RECOVERY Current Users is now: OK on upload-wizard i-0000021c output: USERS OK - 0 users currently logged in
[08:43:03] RECOVERY Free ram is now: OK on upload-wizard i-0000021c output: OK: 85% free memory
[08:43:03] RECOVERY Total Processes is now: OK on upload-wizard i-0000021c output: PROCS OK: 105 processes
[08:43:08] RECOVERY Current Load is now: OK on bots-apache1 i-000000b0 output: OK - load average: 1.82, 3.87, 4.96
[08:44:50] RECOVERY Current Load is now: OK on worker1 i-00000208 output: OK - load average: 0.07, 1.57, 3.90
[08:44:50] PROBLEM Current Load is now: WARNING on upload-wizard i-0000021c output: WARNING - load average: 8.73, 10.42, 9.86
[08:45:57] RECOVERY Disk Space is now: OK on upload-wizard i-0000021c output: DISK OK
[08:45:57] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.65, 1.93, 4.01
[08:46:15] PROBLEM Current Load is now: WARNING on incubator-bot1 i-00000251 output: WARNING - load average: 4.63, 5.71, 5.21
[08:46:55] RECOVERY Free ram is now: OK on incubator-bot1 i-00000251 output: OK: 41% free memory
[08:46:55] RECOVERY dpkg-check is now: OK on incubator-bot1 i-00000251 output: All packages OK
[09:20:10] RECOVERY Current Load is now: OK on deployment-nfs-memc i-000000d7 output: OK - load average: 3.95, 3.00, 4.71
[09:25:17] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 0.83, 2.10, 3.45
[09:29:37] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 0.66, 1.51, 2.88
[10:17:40] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.017 second response time
[10:17:40] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.014 second response time
[10:17:40] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.025 second response time
[10:17:40] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.019 second response time
[11:36:50] PROBLEM Puppet freshness is now: CRITICAL on localpuppet1 i-0000020b output: Puppet has not run in last 20 hours
[12:03:35] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[12:03:35] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[12:03:35] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[12:03:35] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[12:08:05] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.015 second response time
[12:08:05] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.009 second response time
[12:08:05] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.021 second response time
[12:08:05] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.011 second response time
[13:40:58] RECOVERY Free ram is now: OK on bots-2 i-0000009c output: OK: 22% free memory
[15:33:29] PROBLEM Puppet freshness is now: CRITICAL on nova-ldap1 i-000000df output: Puppet has not run in last 20 hours
[17:25:22] ^demon: Erik said I should bug you to get ownership of a gerrit project?
[17:25:52] <^demon> I knew signing into IRC on a sunday was a mistake ;-)
[17:26:36] Ha!
[17:26:44] Well, I didn't see you on all week
[17:27:03] Maybe because of inattentiveness, but even so
[17:28:18] <^demon> I was out Thurs/Fri for my birthday :)
[17:29:50] <^demon> Anyway, wikimedia/orgcharts is owned by the 'wmf' group. I tried to add you to it, but I got explosions and stacktraces.
[17:29:55] <^demon> Will have to bug Ryan tomorrow.
[17:54:57] chad has 2 birthdays?
[18:04:08] PROBLEM HTTP is now: CRITICAL on deployment-web4 i-00000214 output: CRITICAL - Socket timeout after 10 seconds
[18:04:08] PROBLEM HTTP is now: CRITICAL on deployment-web i-00000217 output: CRITICAL - Socket timeout after 10 seconds
[18:04:08] PROBLEM HTTP is now: CRITICAL on deployment-web5 i-00000213 output: CRITICAL - Socket timeout after 10 seconds
[18:04:08] PROBLEM HTTP is now: CRITICAL on deployment-web3 i-00000219 output: CRITICAL - Socket timeout after 10 seconds
[18:07:38] <^demon|away> No, I just took 2 days off. Actual birthday was thursday :)
[18:08:51] PROBLEM HTTP is now: WARNING on deployment-web4 i-00000214 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.012 second response time
[18:08:51] PROBLEM HTTP is now: WARNING on deployment-web i-00000217 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.011 second response time
[18:08:51] PROBLEM HTTP is now: WARNING on deployment-web5 i-00000213 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.012 second response time
[18:08:51] PROBLEM HTTP is now: WARNING on deployment-web3 i-00000219 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.015 second response time
[18:26:43] Can someone point me to the hookconfig.py.erb file ^demon|away mentioned on wikitech?
[18:27:20] <^demon|away> templates/gerrit/hookconfig.py.erb
[18:29:57] <^demon|away> Ah, I forgot to mention the second part...also needs customizing in manifests/site.pp to map the log name to the channel to join (for ircecho config)
[18:57:49] Damianz: the deployment-prep IRC feed is going into the same one as the production channels, is there a way to put it in separate channels?
[19:16:27] should be
[19:16:36] would need to edit the echo bot
[19:18:45] what would be the best way to do it? put it in #meta.wikimedia.beta for example?
[19:22:26] hmm
[19:31:35] No idea where the bot is
[20:12:42] it looks like it's the same feed as rc-pmtpa Damianz
[20:13:12] Then it probably needs a bot setting up and mediawiki changing
[21:37:38] PROBLEM Puppet freshness is now: CRITICAL on localpuppet1 i-0000020b output: Puppet has not run in last 20 hours
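
Most of this log is monitoring-bot traffic in a fixed shape: `[time] PROBLEM|RECOVERY <service> is now: <state> on <host> <instance> output: <text>`. The following is a minimal sketch, in Python, of how those lines could be separated from the human conversation and tallied per host. The regular expression is inferred only from the lines in this log and is an assumption, not a documented format; `UNKNOWN` is included because it is a standard Nagios state, although it never appears here. The names `split_messages`, `parse_alert`, and `count_problems_by_host` are hypothetical helpers, not part of any tool mentioned in the channel.

```python
import re
from collections import Counter

# Alert shape inferred from this log (an assumption, not a documented format).
ALERT_RE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\] "
    r"(?P<kind>PROBLEM|RECOVERY) "
    r"(?P<service>.+?) is now: "
    r"(?P<state>OK|WARNING|CRITICAL|UNKNOWN) on "
    r"(?P<host>\S+) (?P<instance>\S+) "
    r"output: (?P<output>.*)"
)

# A message starts at every "[HH:MM:SS] " timestamp.
MESSAGE_START_RE = re.compile(r"(?=\[\d{2}:\d{2}:\d{2}\] )")


def split_messages(text):
    """Split raw log text into one string per timestamped message."""
    flat = " ".join(text.split())  # undo wrapping across physical lines
    return [part.strip() for part in MESSAGE_START_RE.split(flat) if part.strip()]


def parse_alert(message):
    """Return the alert fields as a dict, or None for chat and other lines."""
    match = ALERT_RE.match(message.strip())
    return match.groupdict() if match else None


def count_problems_by_host(messages):
    """Count PROBLEM notifications per host name."""
    counts = Counter()
    for message in messages:
        alert = parse_alert(message)
        if alert and alert["kind"] == "PROBLEM":
            counts[alert["host"]] += 1
    return counts


sample = (
    "[03:16:37] PROBLEM Current Load is now: WARNING on deployment-nfs-memc "
    "i-000000d7 output: WARNING - load average: 11.86, 10.25, 6.94 "
    "[03:16:38] RECOVERY Disk Space is now: OK on upload-wizard i-0000021c "
    "output: DISK OK [00:58:03] coool"
)
print(count_problems_by_host(split_messages(sample)))
```

Splitting on the timestamp rather than on newlines means the sketch does not depend on how the log happens to be wrapped. Run over the whole day, a tally like this would make it easier to see which hosts account for most of the repeated notifications.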