[02:56:49] RECOVERY Puppet freshness is now: OK on deployment-nfs-memc i-000000d7 output: puppet ran at Sun Jun 17 02:56:42 UTC 2012
[03:05:29] RECOVERY Puppet freshness is now: OK on test2 i-0000013c output: puppet ran at Sun Jun 17 03:05:25 UTC 2012
[03:05:49] RECOVERY Puppet freshness is now: OK on gerrit i-000000ff output: puppet ran at Sun Jun 17 03:05:43 UTC 2012
[03:07:59] RECOVERY Puppet freshness is now: OK on swift-be2 i-000001c8 output: puppet ran at Sun Jun 17 03:07:57 UTC 2012
[03:42:19] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 14% free memory
[03:44:49] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 13% free memory
[03:54:49] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 16% free memory
[03:57:19] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory
[03:59:49] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 4% free memory
[04:02:09] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 15% free memory
[04:02:19] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory
[04:04:49] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 94% free memory
[04:14:49] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 3% free memory
[04:19:49] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 96% free memory
[04:22:09] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory
[04:32:09] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 94% free memory
[04:37:09] PROBLEM Free ram is now: CRITICAL on test3 i-00000093 output: Critical: 3% free memory
[04:42:09] RECOVERY Free ram is now: OK on test3 i-00000093 output: OK: 96% free memory
[05:34:09] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: Critical: 5% free memory
[06:29:55] PROBLEM Free ram is now: WARNING on incubator-bot2 i-00000252 output: Warning: 19% free memory
[06:35:49] PROBLEM Disk Space is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:24] PROBLEM Free ram is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:24] PROBLEM dpkg-check is now: CRITICAL on mwreview i-000002ae output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:10] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 7.97, 6.63, 4.00
[06:45:14] PROBLEM Free ram is now: CRITICAL on incubator-bot2 i-00000252 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:46:29] RECOVERY Free ram is now: OK on mwreview i-000002ae output: OK: 69% free memory
[06:46:29] RECOVERY dpkg-check is now: OK on mwreview i-000002ae output: All packages OK
[06:46:55] RECOVERY Disk Space is now: OK on deployment-transcoding i-00000105 output: DISK OK
[06:50:36] PROBLEM Free ram is now: WARNING on incubator-bot2 i-00000252 output: Warning: 17% free memory
[06:50:45] RECOVERY Disk Space is now: OK on worker1 i-00000208 output: DISK OK
[06:51:34] PROBLEM Total Processes is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:51:41] PROBLEM Disk Space is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:54:54] PROBLEM Current Load is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:54:54] PROBLEM Free ram is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:55:14] PROBLEM Current Load is now: CRITICAL on bots-cb i-0000009e output: CRITICAL - load average: 5.96, 25.32, 30.25
[06:55:14] PROBLEM Current Load is now: WARNING on maps-tilemill1 i-00000294 output: WARNING - load average: 7.34, 7.97, 5.82
[06:56:41] PROBLEM dpkg-check is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:56:41] PROBLEM Current Users is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:58:07] RECOVERY Disk Space is now: OK on incubator-bot1 i-00000251 output: DISK OK
[06:58:20] RECOVERY Total Processes is now: OK on incubator-bot1 i-00000251 output: PROCS OK: 137 processes
[06:58:25] PROBLEM Current Load is now: WARNING on incubator-bot1 i-00000251 output: WARNING - load average: 6.95, 8.75, 6.98
[06:59:11] PROBLEM Current Users is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:11] PROBLEM dpkg-check is now: CRITICAL on pybal-precise i-00000289 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:11] PROBLEM Disk Space is now: CRITICAL on migration1 i-00000261 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:01:33] PROBLEM Free ram is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:01:34] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:01:34] PROBLEM Total Processes is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:03:55] RECOVERY Current Users is now: OK on incubator-bot1 i-00000251 output: USERS OK - 0 users currently logged in
[07:04:01] PROBLEM SSH is now: CRITICAL on bots-sql2 i-000000af output: CRITICAL - Socket timeout after 10 seconds
[07:04:23] RECOVERY dpkg-check is now: OK on pybal-precise i-00000289 output: All packages OK
[07:04:24] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 74 MB (5% inode=52%):
[07:04:24] PROBLEM Current Load is now: WARNING on precise-test i-00000231 output: WARNING - load average: 5.96, 6.84, 5.63
[07:04:24] RECOVERY Free ram is now: OK on mobile-testing i-00000271 output: OK: 83% free memory
[07:04:35] RECOVERY Total Processes is now: OK on mobile-testing i-00000271 output: PROCS OK: 150 processes
[07:04:40] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK
[07:04:40] PROBLEM Current Load is now: WARNING on redis1 i-000002b6 output: WARNING - load average: 3.03, 5.79, 5.03
[07:04:40] PROBLEM Current Load is now: WARNING on mobile-testing i-00000271 output: WARNING - load average: 4.84, 9.81, 8.90
[07:05:13] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 5.29, 6.16, 5.80
[07:06:13] PROBLEM Current Load is now: WARNING on wikistats-01 i-00000042 output: WARNING - load average: 2.21, 10.98, 7.45
[07:08:24] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 0.61, 6.29, 16.66
[07:08:24] RECOVERY Current Load is now: OK on maps-tilemill1 i-00000294 output: OK - load average: 0.19, 1.77, 3.66
[07:08:24] RECOVERY dpkg-check is now: OK on incubator-bot1 i-00000251 output: All packages OK
[07:08:24] RECOVERY SSH is now: OK on bots-sql2 i-000000af output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[07:09:03] RECOVERY Current Users is now: OK on migration1 i-00000261 output: USERS OK - 0 users currently logged in
[07:09:03] RECOVERY Disk Space is now: OK on migration1 i-00000261 output: DISK OK
[07:09:13] RECOVERY Current Load is now: OK on incubator-bot1 i-00000251 output: OK - load average: 0.52, 2.82, 4.90
[07:09:13] RECOVERY Free ram is now: OK on migration1 i-00000261 output: OK: 90% free memory
[07:09:13] RECOVERY Current Load is now: OK on precise-test i-00000231 output: OK - load average: 0.10, 2.73, 4.21
[07:09:33] RECOVERY Current Load is now: OK on redis1 i-000002b6 output: OK - load average: 0.04, 2.24, 3.71
[07:09:53] PROBLEM Current Load is now: WARNING on jenkins2 i-00000102 output: WARNING - load average: 0.39, 5.02, 6.50
[07:10:13] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 2.95, 4.13, 5.00
[07:14:53] RECOVERY Current Load is now: OK on jenkins2 i-00000102 output: OK - load average: 0.01, 1.93, 4.75
[07:16:13] RECOVERY Current Load is now: OK on wikistats-01 i-00000042 output: OK - load average: 0.42, 1.65, 3.96
[07:19:03] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 0.38, 0.84, 3.47
[07:19:33] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.54, 1.12, 3.91
[07:24:03] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 0.29, 0.50, 2.58
[07:28:23] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.21, 0.28, 4.72
[07:52:23] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 16% free memory
[08:17:23] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 5% free memory
[08:32:23] RECOVERY Free ram is now: OK on bots-3 i-000000e5 output: OK: 61% free memory
[09:14:13] RECOVERY Free ram is now: OK on incubator-bot2 i-00000252 output: OK: 30% free memory
[10:45:52] !log bots petrb: installing new io cache for wmbot
[10:45:53] Logged the message, Master
[10:51:33] :)
[10:51:48] remind me who is admin of wm-bot, is it Thehelpfulone?
[10:52:11] Thehelpfulone: I installed a bouncer cache so now things are complicated, I will add it to the docs
[10:54:14] !ping
[10:54:15] pong
[11:23:24] petan: ok
[12:10:07] Thehelpfulone: do you know what an IRC bouncer is?
[12:10:57] it means that wm-bot is connected to a bouncer which is connected to IRC, so that when you shut it down it's still online
[12:11:06] it allows me to patch the bot without having to disconnect it
[12:11:09] petan: I'm on a bouncer at the moment :P
[12:11:12] ok
[12:29:11] Thehelpfulone: how's the thing with -help
[12:29:50] whenever I see someone there asking for help I remember there is no bot and I quit helping before I start :D
[14:13:23] PROBLEM Free ram is now: UNKNOWN on incubator-bot1 i-00000251 output: NRPE: Call to fork() failed
[14:16:23] PROBLEM Current Users is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:16:23] PROBLEM dpkg-check is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:17:13] PROBLEM Current Load is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:17:13] PROBLEM Disk Space is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:17:13] PROBLEM Total Processes is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:18:24] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[14:21:23] PROBLEM SSH is now: CRITICAL on incubator-bot1 i-00000251 output: Server answer:
[14:22:13] RECOVERY Current Load is now: OK on incubator-bot1 i-00000251 output: OK - load average: 2.31, 1.27, 0.75
[14:22:13] RECOVERY Disk Space is now: OK on incubator-bot1 i-00000251 output: DISK OK
[14:22:13] RECOVERY Total Processes is now: OK on incubator-bot1 i-00000251 output: PROCS OK: 124 processes
[14:23:23] RECOVERY Free ram is now: OK on incubator-bot1 i-00000251 output: OK: 39% free memory
[14:26:23] RECOVERY SSH is now: OK on incubator-bot1 i-00000251 output: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[14:26:23] RECOVERY Current Users is now: OK on incubator-bot1 i-00000251 output: USERS OK - 0 users currently logged in
[14:26:23] RECOVERY dpkg-check is now: OK on incubator-bot1 i-00000251 output: All packages OK
[17:09:16] !log bots petrb: patching bouncer
[17:09:17] Logged the message, Master
[17:16:37] !ping
[17:16:38] pong
[17:39:35] !ping
[17:39:35] pong
[19:41:36] New patchset: Hashar; "cronspam sprint - srv222 - use -ignore_readdir_race with find to avoid errors for missing files" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/11753
[19:41:55] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/11753
[19:42:06] New review: Hashar; "Cherry picked from production." [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/11753