[00:01:13] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [00:01:14] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:01:53] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.513184/1.95, alarm hl:np_load_avg=2.206055/2.0, alarm hl:mem_free=482.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.513184/2.3, alarm hl:np_load_long=2.121582/2.5, alarm hl:cpu=100.000000/98, alarm hl:mem_free=482.000000M/150M, alarm hl:available=1/0 [00:02:02] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:03:42] Load avg. on willow is WARNING: WARNING - load average: 20.24, 18.42, 17.34 [00:07:03] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [00:07:23] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [00:11:43] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 400466 MB (7% inode=40%): [00:12:44] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.094726/1.00, alarm hl:np_load_long=0.653320/1.50, alarm hl:mem_free=15963.000000M/600M, alarm hl:available=1/0 [00:14:23] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [00:14:43] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [00:15:22] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:18:43] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [00:23:23] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[00:23:44] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [00:32:04] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:37:55] [[Special:Log/block]] reblock * Ltcmdrap * (changed block settings for [[User:Brandon Sky]] with an expiry time of infinite (account creation disabled): Crosswiki issues, we aren't going to get a constructive contributor) [00:45:14] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [00:45:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:51:44] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.01, 20.99, 20.03 [00:52:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 52101.000000 [00:52:22] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 52092 [00:52:44] Load avg. on willow is WARNING: WARNING - load average: 20.76, 20.70, 19.98 [00:53:44] Load avg. on willow is CRITICAL: CRITICAL - load average: 20.55, 20.66, 20.02 [01:01:15] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[01:01:15] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:01:53] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.251465/1.95, alarm hl:np_load_avg=2.792969/2.0, alarm hl:mem_free=118.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.251465/2.3, alarm hl:np_load_long=2.611328/2.5, alarm hl:cpu=99.900000/98, alarm hl:mem_free=118.000000M/150M, alarm hl:available=1/0 [01:07:22] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [01:12:43] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 399498 MB (7% inode=40%): [01:14:43] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.114258/1.10, alarm hl:np_load_long=0.760742/1.55, alarm hl:mem_free=18083.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.114258/1.00, alarm hl:np_load_long=0.760742/1.50, alarm hl:mem_free=18083.000000M/600M, alarm hl:available=1/0 [01:16:42] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [01:18:53] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [01:19:43] Load avg. on willow is CRITICAL: CRITICAL - load average: 16.05, 18.82, 20.19 [01:21:43] Load avg. 
on willow is WARNING: WARNING - load average: 16.14, 18.17, 19.79 [01:24:43] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [01:45:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:46:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [01:52:23] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 49600 [01:53:12] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 49560.000000 [02:02:02] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.825684/1.95, alarm hl:np_load_avg=2.455566/2.0, alarm hl:mem_free=836.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.825684/2.3, alarm hl:np_load_long=2.357422/2.5, alarm hl:cpu=100.000000/98, alarm hl:mem_free=836.000000M/150M, alarm hl:available=1/0 [02:02:14] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [02:02:23] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:07:23] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [02:12:43] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.077149/1.00, alarm hl:np_load_long=0.830078/1.50, alarm hl:mem_free=17454.000000M/600M, alarm hl:available=1/0 [02:13:43] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 399271 MB (7% inode=40%): [02:15:32] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[02:16:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:16:43] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [02:18:52] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [02:20:52] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.080078/1.00, alarm hl:np_load_long=0.904297/1.50, alarm hl:mem_free=17736.000000M/600M, alarm hl:available=1/0 [02:22:43] Load avg. on willow is WARNING: WARNING - load average: 16.71, 17.95, 18.43 [02:24:44] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [02:43:42] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.286133/1.10, alarm hl:np_load_long=0.911133/1.55, alarm hl:mem_free=17714.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.286133/1.00, alarm hl:np_load_long=0.911133/1.50, alarm hl:mem_free=17714.000000M/600M, alarm hl:available=1/0 [02:44:44] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [02:46:14] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [02:51:52] (commented) [UTRS-98] Account creation is impossible <https://jira.toolserver.org/browse/UTRS-98> (DeltaQuad) [02:52:22] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39513 [02:53:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39283.000000 [03:02:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.367676/1.95, alarm hl:np_load_avg=2.333008/2.0, alarm hl:mem_free=759.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes 
load threshold: alarm hl:np_load_short=2.367676/2.3, alarm hl:np_load_long=2.330566/2.5, alarm hl:cpu=99.900000/98, alarm hl:mem_free=759.000000M/150M, alarm hl:available=1/0 [03:02:23] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:02:23] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [03:07:23] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [03:11:42] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.778320/1.10, alarm hl:np_load_long=0.990235/1.55, alarm hl:mem_free=17272.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.778320/1.00, alarm hl:np_load_long=0.990235/1.50, alarm hl:mem_free=17272.000000M/600M, alarm hl:available=1/0 [03:13:43] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397632 MB (7% inode=40%): [03:16:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:17:43] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [03:18:53] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [03:23:43] Load avg. on willow is WARNING: WARNING - load average: 17.05, 18.98, 19.11 [03:24:53] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [03:37:23] /sql on cassia is OK: DISK OK - free space: /sql 130767 MB (11% inode=99%): [03:40:23] /sql on cassia is WARNING: DISK WARNING - free space: /sql 129083 MB (10% inode=99%): [03:43:23] /sql on cassia is OK: DISK OK - free space: /sql 130423 MB (11% inode=99%): [03:45:25] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[03:46:24] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:47:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [03:52:23] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 26451 [03:53:22] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 26438.000000 [03:54:43] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.007812/1.00, alarm hl:np_load_long=0.824219/1.50, alarm hl:mem_free=17346.000000M/600M, alarm hl:available=1/0 [03:55:44] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:02:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.232910/1.95, alarm hl:np_load_avg=2.349121/2.0, alarm hl:mem_free=652.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.232910/2.3, alarm hl:np_load_long=2.342773/2.5, alarm hl:cpu=99.000000/98, alarm hl:mem_free=652.000000M/150M, alarm hl:available=1/0 [04:02:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:02:25] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[04:02:44] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.338867/1.10, alarm hl:np_load_long=0.930664/1.55, alarm hl:mem_free=17052.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.338867/1.00, alarm hl:np_load_long=0.930664/1.50, alarm hl:mem_free=17052.000000M/600M, alarm hl:available=1/0 [04:05:25] /sql on cassia is WARNING: DISK WARNING - free space: /sql 128051 MB (10% inode=99%): [04:07:24] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [04:13:43] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397621 MB (7% inode=40%): [04:17:43] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:18:53] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [04:22:43] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.511719/1.10, alarm hl:np_load_long=1.299805/1.55, alarm hl:mem_free=15949.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.511719/1.00, alarm hl:np_load_long=1.299805/1.50, alarm hl:mem_free=15949.000000M/600M, alarm hl:available=1/0 [04:23:43] Load avg. on willow is WARNING: WARNING - load average: 16.71, 18.08, 18.73 [04:24:52] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [04:27:43] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:27:43] Environment IPMI on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:27:43] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:27:43] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:27:43] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:27:52] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:28:02] Load avg. on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:28:12] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [04:28:12] Environment IPMI on hyacinth is OK: ok: temperature ok fan ok voltage ok chassis ok [04:28:32] SMTP on z-dat-s4-a is OK: SMTP OK - 0.003 sec. response time [04:28:33] Load avg. on hyacinth is OK: OK - load average: 1.14, 1.09, 2.04 [04:28:33] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [04:28:33] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [04:28:43] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [04:35:34] /sql on cassia is OK: DISK OK - free space: /sql 130484 MB (11% inode=99%): [04:37:44] [[Special:Log/newusers]] create * Ilhamulub * (New user account) [04:38:32] /sql on cassia is WARNING: DISK WARNING - free space: /sql 129705 MB (10% inode=99%): [04:42:13] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:45:14] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.336914/1.10, alarm hl:np_load_long=1.503906/1.55, alarm hl:mem_free=14623.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.336914/1.00, alarm hl:np_load_long=1.503906/1.50, alarm hl:mem_free=14623.000000M/600M, alarm hl:available=1/0 [04:46:32] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:47:20] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [04:52:32] MySQL slave on thyme is CRITICAL: 
SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 18040 [04:53:13] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:53:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 17968.000000 [05:02:13] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.102051/1.95, alarm hl:np_load_avg=2.205566/2.0, alarm hl:mem_free=775.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.102051/2.3, alarm hl:np_load_long=2.257812/2.5, alarm hl:cpu=99.200000/98, alarm hl:mem_free=775.000000M/150M, alarm hl:available=1/0 [05:02:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:02:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [05:07:33] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [05:14:13] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397574 MB (7% inode=40%): [05:18:31] /sql on cassia is OK: DISK OK - free space: /sql 131491 MB (11% inode=99%): [05:19:12] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [05:21:12] Load avg. on willow is CRITICAL: CRITICAL - load average: 21.07, 21.32, 20.03 [05:21:31] /sql on cassia is WARNING: DISK WARNING - free space: /sql 129800 MB (10% inode=99%): [05:22:13] Load avg. on willow is WARNING: WARNING - load average: 18.57, 20.64, 19.88 [05:25:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [05:27:12] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 21.84, 21.06, 20.20 [05:46:31] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:47:21] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [05:52:31] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6250 [05:54:31] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5464.000000 [05:55:21] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:56:12] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [05:59:32] s1 replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3524.000000 [05:59:33] MySQL slave on thyme is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3447 [06:02:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.276855/1.95, alarm hl:np_load_avg=2.309082/2.0, alarm hl:mem_free=370.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.276855/2.3, alarm hl:np_load_long=2.403809/2.5, alarm hl:cpu=100.000000/98, alarm hl:mem_free=370.000000M/150M, alarm hl:available=1/0 [06:02:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:02:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[06:05:31] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1352.000000 [06:05:31] MySQL slave on thyme is OK: Uptime: 2113331 Threads: 5 Questions: 861764302 Slow queries: 424501 Opens: 135417 Flush tables: 3 Open tables: 3407 Queries per second avg: 407.775 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1249 [06:08:31] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [06:11:21] Sun Grid Engine execd on willow is UNKNOWN: NRPE: Unable to read output [06:12:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.281250/1.95, alarm hl:np_load_avg=2.374023/2.0, alarm hl:mem_free=157.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.281250/2.3, alarm hl:np_load_long=2.397949/2.5, alarm hl:cpu=98.700000/98, alarm hl:mem_free=157.000000M/150M, alarm hl:available=1/0 [06:13:12] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.779297/1.10, alarm hl:np_load_long=0.930664/1.55, alarm hl:mem_free=16742.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.779297/1.00, alarm hl:np_load_long=0.930664/1.50, alarm hl:mem_free=16742.000000M/600M, alarm hl:available=1/0 [06:15:12] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397478 MB (7% inode=40%): [06:19:12] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [06:21:12] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [06:26:12] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.063476/1.00, alarm hl:np_load_long=0.998047/1.50, alarm hl:mem_free=17777.000000M/600M, alarm hl:available=1/0 [06:26:12] SMF on damiana is CRITICAL: ERROR - maintenance: 
svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [06:27:12] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:46:12] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.104492/1.10, alarm hl:np_load_long=0.101074/1.55, alarm hl:mem_free=1182.000000M/500M, error: no complex attribute for threshold available: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.104492/1.00, alarm hl:np_load_long=0.101074/1.50, alarm hl:mem_free=1182.000000M/600M, error: no complex [06:46:32] SMF on wolfsbane is OK: OK - all services online [06:46:32] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:47:00] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [06:47:11] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [06:47:21] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [06:52:21] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:52:34] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:53:32] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [06:54:13] Sun Grid Engine execd on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [06:54:40] SMF on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [07:02:34] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:02:34] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[07:02:41] / on wolfsbane is WARNING: DISK WARNING - free space: / 6134 MB (20% inode=93%): [07:04:42] / on wolfsbane is CRITICAL: Connection refused by host [07:06:13] Environment IPMI on wolfsbane is CRITICAL: Connection refused by host [07:06:13] Load avg. on willow is WARNING: WARNING - load average: 12.08, 15.32, 16.72 [07:06:21] Load avg. on wolfsbane is CRITICAL: Connection refused by host [07:07:13] /tmp on wolfsbane is CRITICAL: Connection refused by host [07:07:13] Cluster on wolfsbane is CRITICAL: Connection refused by host [07:10:12] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.830 second response time [07:11:12] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 3.548 second response time [07:11:31] Load avg. on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [07:11:40] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:12:11] Load avg. 
on willow is WARNING: WARNING - load average: 14.38, 13.82, 15.47 [07:12:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.574219/1.95, alarm hl:np_load_avg=1.684570/2.0, alarm hl:mem_free=96.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.574219/2.3, alarm hl:np_load_long=1.922851/2.5, alarm hl:cpu=94.800000/98, alarm hl:mem_free=96.000000M/150M, alarm hl:available=1/0 [07:12:21] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.123 second response time [07:13:12] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.025 second response time [07:14:12] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.010742/1.00, alarm hl:np_load_long=0.871094/1.50, alarm hl:mem_free=17038.000000M/600M, alarm hl:available=1/0 [07:15:12] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397440 MB (7% inode=40%): [07:15:12] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [07:15:12] Load avg. on willow is OK: OK - load average: 12.84, 13.22, 14.92 [07:15:20] toolserver.org HTTP on wolfsbane is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.607 second response time [07:20:21] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.022 second response time [07:24:12] Sun Grid Engine execd on wolfsbane is CRITICAL: Connection refused by host [07:24:31] SMF on wolfsbane is CRITICAL: Connection refused by host [07:27:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [07:32:14] Load avg. on willow is WARNING: WARNING - load average: 21.67, 18.94, 15.68 [07:37:12] Load avg. 
on willow is OK: OK - load average: 10.07, 14.22, 14.57 [07:40:31] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:44:21] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.012 second response time [07:45:12] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.100586/1.10, alarm hl:np_load_long=0.880860/1.55, alarm hl:mem_free=17553.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.100586/1.00, alarm hl:np_load_long=0.880860/1.50, alarm hl:mem_free=17553.000000M/600M, alarm hl:available=1/0 [07:46:13] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [07:46:31] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:48:12] Load avg. on willow is WARNING: WARNING - load average: 16.42, 15.98, 15.14 [07:48:21] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [07:52:32] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:56:13] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [08:00:13] Cluster on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:00:13] /tmp on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:00:21] Environment IPMI on wolfsbane is OK: ok: temperature ok fan ok voltage ok chassis ok [08:00:21] Load avg. on wolfsbane is OK: OK - load average: 0.33, 0.42, 0.55 [08:00:31] SMF on wolfsbane is OK: OK - all services online [08:00:40] / on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[08:01:11] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 75 MB (40% inode=99%): [08:01:12] Cluster on wolfsbane is OK: CLUSTER OK ! [08:01:41] / on wolfsbane is OK: DISK OK - free space: / 11158 MB (37% inode=93%): [08:02:12] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [08:02:20] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.398 second response time [08:03:13] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in unknown state: medium-sol@wolfsbane in unknown state [08:03:13] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.714844/1.95, alarm hl:np_load_avg=1.938965/2.0, alarm hl:mem_free=319.000000M/350M, alarm hl:available=1/0 [08:03:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:03:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [08:04:14] Sun Grid Engine execd on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:04:31] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [08:06:21] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 9.491 second response time [08:11:41] SMF on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:13:21] Load avg. 
on wolfsbane is CRITICAL: Connection refused by host [08:13:41] / on wolfsbane is CRITICAL: Connection refused by host [08:14:12] /tmp on wolfsbane is CRITICAL: Connection refused by host [08:14:12] Cluster on wolfsbane is CRITICAL: Connection refused by host [08:14:12] Environment IPMI on wolfsbane is CRITICAL: Connection refused by host [08:15:12] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397268 MB (7% inode=40%): [08:15:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [08:16:20] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.027 second response time [08:23:11] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 5742 MB (98% inode=99%): [08:23:11] Cluster on wolfsbane is OK: CLUSTER OK ! [08:23:12] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.049805/1.00, alarm hl:np_load_long=0.869140/1.50, alarm hl:mem_free=16931.000000M/600M, alarm hl:available=1/0 [08:23:12] Environment IPMI on wolfsbane is OK: ok: temperature ok fan ok voltage ok chassis ok [08:23:20] Load avg. on wolfsbane is OK: OK - load average: 1.41, 1.11, 0.77 [08:23:41] / on wolfsbane is OK: DISK OK - free space: / 11399 MB (38% inode=93%): [08:24:12] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [08:27:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [08:32:12] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:46:33] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:47:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.525391/1.95, alarm hl:np_load_avg=1.422363/2.0, alarm hl:mem_free=238.000000M/350M, alarm hl:available=1/0 [08:48:33] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:51:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:02:00] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:02:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.752930/1.95, alarm hl:np_load_avg=1.634766/2.0, alarm hl:mem_free=208.000000M/350M, alarm hl:available=1/0 [09:04:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [09:04:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:12:21] Load avg. on willow is WARNING: WARNING - load average: 15.11, 14.38, 13.09 [09:13:21] Load avg. 
on willow is OK: OK - load average: 14.23, 14.27, 13.13 [09:14:21] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.140625/1.10, alarm hl:np_load_long=0.802735/1.55, alarm hl:mem_free=17866.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.140625/1.00, alarm hl:np_load_long=0.802735/1.50, alarm hl:mem_free=17866.000000M/600M, alarm hl:available=1/0 [09:15:21] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397162 MB (7% inode=40%): [09:15:21] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [09:22:21] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.136719/1.10, alarm hl:np_load_long=0.900390/1.55, alarm hl:mem_free=18316.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.136719/1.00, alarm hl:np_load_long=0.900390/1.50, alarm hl:mem_free=18316.000000M/600M, alarm hl:available=1/0 [09:24:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:27:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [09:28:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.276367/1.95, alarm hl:np_load_avg=1.529297/2.0, alarm hl:mem_free=214.000000M/350M, alarm hl:available=1/0 [09:44:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:47:21] Load avg. 
on willow is WARNING: WARNING - load average: 21.54, 16.57, 14.18 [09:47:32] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:48:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.434570/1.95, alarm hl:np_load_avg=2.087402/2.0, alarm hl:mem_free=552.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.434570/2.3, alarm hl:np_load_long=1.791016/2.5, alarm hl:cpu=96.800000/98, alarm hl:mem_free=552.000000M/150M, alarm hl:available=1/0 [09:48:33] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [09:50:22] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:51:21] Load avg. on willow is OK: OK - load average: 12.98, 14.93, 14.09 [09:52:02] wikipedia * Re: [Toolserver-l] Blocking of hacking attacks IP addresses? [10:03:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.324707/1.95, alarm hl:np_load_avg=1.400879/2.0, alarm hl:mem_free=207.000000M/350M, alarm hl:available=1/0 [10:04:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[10:04:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:07:32] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.052734/1.00, alarm hl:np_load_long=0.820312/1.50, alarm hl:mem_free=17520.000000M/600M, alarm hl:available=1/0 [10:08:32] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [10:15:21] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 397017 MB (7% inode=40%): [10:16:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [10:27:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [10:28:32] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.176270/1.95, alarm hl:np_load_avg=1.184570/2.0, alarm hl:mem_free=305.000000M/350M, alarm hl:available=1/0 [10:29:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [10:47:40] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:48:41] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [10:57:32] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.256836/1.10, alarm hl:np_load_long=0.239258/1.55, alarm hl:mem_free=488.000000M/500M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.256836/1.00, alarm hl:np_load_long=0.239258/1.50, alarm hl:mem_free=488.000000M/600M, alarm hl:available=1/0 [10:58:32] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [11:04:34] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:04:35] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[11:05:33] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.254883/1.10, alarm hl:np_load_long=0.620117/1.55, alarm hl:mem_free=17900.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.254883/1.00, alarm hl:np_load_long=0.620117/1.50, alarm hl:mem_free=17900.000000M/600M, alarm hl:available=1/0 [11:06:32] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:09:32] Load avg. on willow is WARNING: WARNING - load average: 17.68, 18.06, 13.25 [11:13:22] Load avg. on willow is OK: OK - load average: 10.15, 14.41, 12.88 [11:15:21] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396959 MB (7% inode=40%): [11:27:22] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [11:27:33] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:28:21] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [11:44:34] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.529297/1.10, alarm hl:np_load_long=0.916015/1.55, alarm hl:mem_free=17876.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.529297/1.00, alarm hl:np_load_long=0.916015/1.50, alarm hl:mem_free=17876.000000M/600M, alarm hl:available=1/0 [11:44:35] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.979004/1.95, alarm hl:np_load_avg=1.051270/2.0, alarm hl:mem_free=118.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=0.979004/2.3, alarm hl:np_load_long=1.144043/2.5, alarm hl:cpu=76.000000/98, alarm hl:mem_free=118.000000M/150M, alarm hl:available=1/0 [11:48:41] SMF on willow is CRITICAL: ERROR - 
maintenance: svc:/network/puppetmasterd:default [11:48:41] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [11:49:32] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:55:42] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:56:32] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [11:57:32] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.101562/1.10, alarm hl:np_load_long=1.064453/1.55, alarm hl:mem_free=17870.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.101562/1.00, alarm hl:np_load_long=1.064453/1.50, alarm hl:mem_free=17870.000000M/600M, alarm hl:available=1/0 [12:04:41] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:05:32] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [12:07:42] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:07:51] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:08:00] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:08:32] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [12:08:41] SMTP on z-dat-s6-a is OK: SMTP OK - 0.003 sec. 
response time [12:15:21] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396910 MB (7% inode=40%): [12:22:31] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [12:25:32] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.074707/1.95, alarm hl:np_load_avg=1.028809/2.0, alarm hl:mem_free=200.000000M/350M, alarm hl:available=1/0 [12:27:32] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [12:28:00] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:33:31] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.413574/1.10, alarm hl:np_load_long=0.534180/1.55, alarm hl:mem_free=236.000000M/500M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.413574/1.00, alarm hl:np_load_long=0.534180/1.50, alarm hl:mem_free=236.000000M/600M, alarm hl:available=1/0 [12:34:32] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [12:34:32] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [12:48:32] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.133 second response time [12:48:51] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:48:51] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [12:49:32] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.005 second response time [12:54:12] [[Special:Log/newusers]] create * Shujenchang * (New user account) [13:04:51] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:05:41] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates).
[13:15:32] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396832 MB (7% inode=40%): [13:27:31] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [13:49:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [13:49:41] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:56:41] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.962891/1.95, alarm hl:np_load_avg=0.928711/2.0, alarm hl:mem_free=194.000000M/350M, alarm hl:available=1/0 [14:03:33] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.391113/1.10, alarm hl:np_load_long=0.291504/1.55, alarm hl:mem_free=283.000000M/500M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.391113/1.00, alarm hl:np_load_long=0.291504/1.50, alarm hl:mem_free=283.000000M/600M, alarm hl:available=1/0 [14:04:42] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [14:05:42] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). 
[14:05:52] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:14:41] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [14:16:31] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396749 MB (7% inode=40%): [14:27:41] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [14:29:41] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.944824/1.95, alarm hl:np_load_avg=0.895508/2.0, alarm hl:mem_free=296.000000M/350M, alarm hl:available=1/0 [14:35:54] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [14:38:54] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.628418/1.95, alarm hl:np_load_avg=0.789062/2.0, alarm hl:mem_free=310.000000M/350M, alarm hl:available=1/0 [14:49:10] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [14:49:52] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:52:52] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.241211/1.10, alarm hl:np_load_long=0.770508/1.55, alarm hl:mem_free=17564.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.241211/1.00, alarm hl:np_load_long=0.770508/1.50, alarm hl:mem_free=17564.000000M/600M, alarm hl:available=1/0 [14:53:51] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [15:04:01] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.101562/1.10, alarm hl:np_load_long=0.800781/1.55, alarm hl:mem_free=17544.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.101562/1.00, 
alarm hl:np_load_long=0.800781/1.50, alarm hl:mem_free=17544.000000M/600M, alarm hl:available=1/0 [15:06:02] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [15:06:03] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:07:02] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [15:08:02] Christopher David Howie * Re: [Toolserver-l] Toolserver having a bad day? [15:11:11] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.082520/1.95, alarm hl:np_load_avg=1.060547/2.0, alarm hl:mem_free=120.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.082520/2.3, alarm hl:np_load_long=1.049805/2.5, alarm hl:cpu=73.300000/98, alarm hl:mem_free=120.000000M/150M, alarm hl:available=1/0 [15:12:02] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [15:16:30] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396672 MB (7% inode=40%): [15:28:02] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [15:47:12] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:49:11] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [15:50:01] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:05:22] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.115723/1.95, alarm hl:np_load_avg=1.321289/2.0, alarm hl:mem_free=115.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.115723/2.3, alarm hl:np_load_long=1.234863/2.5, alarm hl:cpu=78.500000/98, alarm hl:mem_free=115.000000M/150M, alarm hl:available=1/0 [16:06:11] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [16:06:11] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:07:01] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [16:10:40] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:11:11] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.454590/1.95, alarm hl:np_load_avg=1.343262/2.0, alarm hl:mem_free=73.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.454590/2.3, alarm hl:np_load_long=1.271484/2.5, alarm hl:cpu=96.500000/98, alarm hl:mem_free=73.000000M/150M, alarm hl:available=1/0 [16:16:31] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396517 MB (7% inode=40%): [16:18:10] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [16:26:02] DaB. * Re: [Toolserver-l] Rules clarifications [16:28:11] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [16:31:06] DaB. * [Toolserver-announce] SGE-Maintenance at Thursday and Friday [16:32:51] Environment IPMI on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:32:51] /tmp on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:33:11] SMTP on willow is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:33:40] /tmp on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [16:34:21] /tmp on willow is OK: DISK OK - free space: / 37764 MB (36% inode=99%): [16:34:31] Environment IPMI on willow is OK: ok: temperature ok fan ok voltage ok chassis ok [16:36:40] Load avg. on willow is CRITICAL: CRITICAL - load average: 187.46, 78.43, 36.05 [16:37:01] SMTP on willow is OK: SMTP OK - 3.904 sec. response time [16:49:21] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:50:11] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:51:54] (assigned) [UTRS-100] Confirm close <https://jira.toolserver.org/browse/UTRS-100> (Chris Howie) [16:54:12] (Can't contact the database server: User oneonez5_botwiki already has more than 'max_user_connections' active connections (localhost)) [16:54:18] botwiki is down [16:54:52] oops, wrong channel :P [16:55:53] (work started) [UTRS-100] Confirm close <https://jira.toolserver.org/browse/UTRS-100> (Chris Howie) [16:55:54] (work started) [UTRS-97] Newlines are not escaped in JavaScript string values <https://jira.toolserver.org/browse/UTRS-97> (Chris Howie) [16:55:54] (commented) [UTRS-100] Confirm close <https://jira.toolserver.org/browse/UTRS-100> (Chris Howie) [17:02:40] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [17:02:40] Cluster on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:02:50] Environment IPMI on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:02:50] /tmp on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:03:41] / on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[17:03:41] SMF on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:04:12] / on willow is OK: DISK OK - free space: / 37741 MB (36% inode=99%): [17:04:12] Cluster on willow is OK: CLUSTER OK ! [17:04:21] /tmp on willow is OK: DISK OK - free space: / 37741 MB (36% inode=99%): [17:04:21] Environment IPMI on willow is OK: ok: temperature ok fan ok voltage ok chassis ok [17:06:11] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:06:11] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [17:07:51] (commented) [UTRS-65] Transfer code system to git <https://jira.toolserver.org/browse/UTRS-65> (Chris Howie) [17:10:41] Load avg. on willow is WARNING: WARNING - load average: 9.23, 9.41, 19.47 [17:16:31] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396125 MB (7% inode=40%): [17:17:41] Load avg. on willow is OK: OK - load average: 5.79, 6.74, 14.45 [17:22:01] /sql on cassia is WARNING: DISK WARNING - free space: /sql 125477 MB (10% inode=99%): [17:28:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [17:45:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.773438/1.95, alarm hl:np_load_avg=0.862305/2.0, alarm hl:mem_free=127.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=0.773438/2.3, alarm hl:np_load_long=1.032226/2.5, alarm hl:cpu=63.400000/98, alarm hl:mem_free=127.000000M/150M, alarm hl:available=1/0 [17:49:22] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [17:49:22] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [17:55:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.760742/1.95, alarm hl:np_load_avg=0.784668/2.0, alarm 
hl:mem_free=218.000000M/350M, alarm hl:available=1/0 [17:58:11] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.897461/1.10, alarm hl:np_load_long=0.904297/1.55, alarm hl:mem_free=17253.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.897461/1.00, alarm hl:np_load_long=0.904297/1.50, alarm hl:mem_free=17253.000000M/600M, alarm hl:available=1/0 [17:59:10] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [18:04:11] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:05:51] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1942.000000 [18:06:11] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [18:06:11] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:16:31] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395936 MB (7% inode=40%): [18:25:54] (commented) [UTRS-97] Newlines are not escaped in JavaScript string values <https://jira.toolserver.org/browse/UTRS-97> (Chris Howie) [18:29:11] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [18:31:53] Hello all [18:42:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [18:45:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.236816/1.95, alarm hl:np_load_avg=1.244629/2.0, alarm hl:mem_free=127.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.236816/2.3, alarm hl:np_load_long=1.311524/2.5, alarm hl:cpu=88.200000/98, alarm hl:mem_free=127.000000M/150M, alarm hl:available=1/0 [18:47:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK 
[18:49:31] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:52:11] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:53:11] Sun Grid Engine execd on wolfsbane is WARNING: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.262695/1.00, alarm hl:np_load_long=0.275879/1.50, alarm hl:mem_free=594.000000M/600M, alarm hl:available=1/0 [18:56:11] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [19:01:51] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [19:04:28] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:05:51] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2225.000000 [19:06:20] APT on yarrow is CRITICAL: APT CRITICAL: 9 packages available for upgrade (9 critical updates). [19:06:21] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:09:30] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.927246/1.95, alarm hl:np_load_avg=1.010254/2.0, alarm hl:mem_free=156.000000M/350M, alarm hl:available=1/0 [19:10:21] APT on yarrow is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[19:16:41] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395796 MB (7% inode=40%): [19:29:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [19:49:51] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [19:50:41] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:56:42] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.852539/1.95, alarm hl:np_load_avg=0.883301/2.0, alarm hl:mem_free=288.000000M/350M, alarm hl:available=1/0 [19:58:42] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [20:05:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:06:01] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3459.000000 [20:06:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:13:02] DaBPunkt: the query killer isn't running [20:13:11] and possibly the memory slayer [20:14:59] *restarted query-killer* [20:15:13] looks like only some sub-parts had crashed [20:16:42] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395647 MB (7% inode=40%): [20:17:35] memory slayer seems to work [20:17:55] on the web servers? [20:19:22] We had emailed users who ran bots on the web server [20:20:51] no process on the webservers is using more than 1GB of memory AFAIS. 
[20:23:08] slaporte is running `python revisions.py` on ortelius [20:28:06] Dispenser: I do not see the connection to the memory-slayer, but I killed this and several other "bot"-processes on the webservers now [20:29:31] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [20:31:02] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1761.000000 [20:37:11] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:50:02] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [20:52:01] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [21:05:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:07:31] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:16:42] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395571 MB (7% inode=40%): [21:29:31] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [21:31:42] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.950195/1.95, alarm hl:np_load_avg=1.304199/2.0, alarm hl:mem_free=497.000000M/350M, alarm hl:available=1/0 [21:32:42] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [21:38:01] /sql on cassia is OK: DISK OK - free space: /sql 130729 MB (11% inode=99%): [21:41:01] /sql on cassia is WARNING: DISK WARNING - free space: /sql 125825 MB (10% inode=99%): [21:50:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [21:51:41] night, ts [21:51:53] * Firebolt waves [21:55:12] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:55:12] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[21:55:12] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:55:41] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [21:56:01] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [21:56:01] SMTP on z-dat-s3-a is OK: SMTP OK - 0.003 sec. response time [22:02:41] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.916016/1.95, alarm hl:np_load_avg=1.488281/2.0, alarm hl:mem_free=332.000000M/350M, alarm hl:available=1/0 [22:05:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [22:05:41] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:07:31] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:08:41] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.085938/1.95, alarm hl:np_load_avg=1.266113/2.0, alarm hl:mem_free=181.000000M/350M, alarm hl:available=1/0 [22:16:51] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395474 MB (7% inode=40%): [22:29:30] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [22:35:41] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.064453/1.95, alarm hl:np_load_avg=1.176758/2.0, alarm hl:mem_free=183.000000M/350M, alarm hl:available=1/0 [22:38:52] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:50:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [23:05:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:07:30] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:09:51] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm 
hl:np_load_short=0.855469/1.95, alarm hl:np_load_avg=0.951172/2.0, alarm hl:mem_free=243.000000M/350M, alarm hl:available=1/0 [23:14:51] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [23:17:51] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 394302 MB (7% inode=40%): [23:29:31] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [23:31:56] (created) [ACCAPP-499] Account for Wikisource-specific IRC bot; Account Approval; New Account <https://jira.toolserver.org/browse/ACCAPP-499> (inductiveload) [23:50:11] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk