[00:06:11] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [00:10:49] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:13:49] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [00:15:11] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 308508 MB (5% inode=34%): [00:21:00] MySQL slave on thyme is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1833 [00:21:10] s1 replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1832.000000 [00:31:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:37:31] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:46:11] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 2627 MB (8% inode=93%): [00:55:10] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1799.000000 [00:56:00] MySQL slave on thyme is OK: Uptime: 3822757 Threads: 15 Questions: 1465824662 Slow queries: 606391 Opens: 236427 Flush tables: 3 Open tables: 3499 Queries per second avg: 383.446 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1722 [00:57:40] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [01:06:20] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [01:10:49] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:13:49] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [01:16:10] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 308455 MB (5% 
inode=34%): [01:27:30] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [01:29:02] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.083008/1.95, alarm hl:np_load_avg=1.258301/2.0, alarm hl:mem_free=286.000000M/350M, alarm hl:available=1/0 [01:32:00] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [01:32:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:46:10] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 2407 MB (8% inode=93%): [01:57:49] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [02:05:30] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.310547/1.10, alarm hl:np_load_long=0.729492/1.55, alarm hl:mem_free=13230.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.310547/1.00, alarm hl:np_load_long=0.729492/1.50, alarm hl:mem_free=13230.000000M/600M, alarm hl:available=1/0 [02:07:10] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [02:08:31] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [02:10:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [02:16:11] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 307663 MB (5% inode=34%): [02:22:42] (created) [MNT-1237] automatically restart tirex-master every 12 hours; Maintenance: ptolemy; Minor Minor work <https://jira.toolserver.org/browse/MNT-1237> (Kai Krueger) [02:25:43] (commented) [ACCAPP-505] Cron jobs for Hazard-Bot 
<https://jira.toolserver.org/browse/ACCAPP-505> (Hazard-SJ) [02:32:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:32:43] (commented) [ACCAPP-505] Cron jobs for Hazard-Bot <https://jira.toolserver.org/browse/ACCAPP-505> (Hazard-SJ) [02:36:20] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [02:37:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:47:11] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 2212 MB (7% inode=93%): [02:57:50] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [03:02:20] Load avg. on willow is WARNING: WARNING - load average: 19.91, 16.40, 14.25 [03:03:01] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.040527/1.95, alarm hl:np_load_avg=1.991699/2.0, alarm hl:mem_free=1051.000000M/350M, alarm hl:available=1/0 [03:04:00] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [03:05:10] Load avg. 
on willow is OK: OK - load average: 13.70, 14.87, 14.02 [03:05:30] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.228516/1.10, alarm hl:np_load_long=0.931640/1.55, alarm hl:mem_free=13046.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.228516/1.00, alarm hl:np_load_long=0.931640/1.50, alarm hl:mem_free=13046.000000M/600M, alarm hl:available=1/0 [03:07:20] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [03:07:30] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [03:08:00] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.304688/1.95, alarm hl:np_load_avg=2.167969/2.0, alarm hl:mem_free=1101.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.304688/2.3, alarm hl:np_load_long=1.895020/2.5, alarm hl:cpu=97.600000/98, alarm hl:mem_free=1101.000000M/200M, alarm hl:available=1/0 [03:09:10] Load avg. 
on willow is WARNING: WARNING - load average: 13.36, 16.09, 14.89 [03:10:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [03:17:10] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 306414 MB (5% inode=33%): [03:37:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:43:41] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.040039/1.00, alarm hl:np_load_long=0.816406/1.50, alarm hl:mem_free=12905.000000M/600M, alarm hl:available=1/0 [03:46:40] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [03:47:21] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 2004 MB (6% inode=93%): [03:48:00] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [03:57:29] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[03:57:49] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [04:07:20] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [04:07:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [04:10:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [04:17:19] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 306375 MB (5% inode=33%): [04:25:41] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.247070/1.10, alarm hl:np_load_long=0.951172/1.55, alarm hl:mem_free=12865.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.247070/1.00, alarm hl:np_load_long=0.951172/1.50, alarm hl:mem_free=12865.000000M/600M, alarm hl:available=1/0 [04:27:40] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:37:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:44:51] is the 1GB memory limit applied per process, or per user? are qsub'd tasks counted to the overall 1GB limit too? 
[04:45:40] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.316406/1.10, alarm hl:np_load_long=0.944336/1.55, alarm hl:mem_free=12710.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.316406/1.00, alarm hl:np_load_long=0.944336/1.50, alarm hl:mem_free=12710.000000M/600M, alarm hl:available=1/0 [04:48:00] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [04:48:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 1841 MB (6% inode=93%): [04:49:21] Load avg. on willow is WARNING: WARNING - load average: 16.56, 13.98, 12.17 [04:50:00] liangent: Per user, see https://wiki.toolserver.org/view/Account_limits#Memory [04:50:20] Load avg. on willow is OK: OK - load average: 12.96, 13.12, 11.96 [04:51:27] Hazard-SJ: then are qsub jobs counted? [04:55:09] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:55:29] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:55:29] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:55:30] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:56:00] SMTP on z-dat-s4-a is OK: SMTP OK - 0.003 sec. response time [04:56:00] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [04:56:21] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [04:56:21] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [04:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [05:02:52] liangent: Think so. https://wiki.toolserver.org/view/Qsub#obligatory_resources [05:05:46] Hazard-SJ: and do I get 1GB on one host, then another 1GB on another host? 
[05:07:20] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [05:09:09] liangent: I'm far from being the best person for such confirmations. [05:10:49] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [05:17:21] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 306295 MB (5% inode=33%): [05:27:31] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:37:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:42:10] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [05:44:20] Load avg. on willow is WARNING: WARNING - load average: 18.39, 14.49, 12.74 [05:45:20] Load avg. on willow is OK: OK - load average: 13.00, 13.60, 12.54 [05:47:59] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [05:49:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 1652 MB (5% inode=93%): [05:53:20] Load avg. 
on willow is WARNING: WARNING - load average: 15.91, 14.27, 13.15 [05:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [06:03:43] (created) [MAGNUS-320] Yearly statistics; Magnus' tools; Minor New Feature <https://jira.toolserver.org/browse/MAGNUS-320> (Ole Palnatoke Andersen) [06:03:50] Sun Grid Engine execd on wolfsbane is WARNING: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.440918/1.00, alarm hl:np_load_long=0.419922/1.50, alarm hl:mem_free=599.000000M/600M, alarm hl:available=1/0 [06:05:40] /tmp on willow is WARNING: DISK WARNING - free space: / 21961 MB (20% inode=99%): [06:05:40] / on willow is WARNING: DISK WARNING - free space: / 21961 MB (20% inode=99%): [06:05:49] Sun Grid Engine execd on wolfsbane is OK: testqueue@wolfsbane disabled: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [06:07:30] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [06:07:39] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.082031/1.00, alarm hl:np_load_long=0.711914/1.50, alarm hl:mem_free=11585.000000M/600M, alarm hl:available=1/0 [06:08:40] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [06:10:59] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [06:18:20] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 306216 MB (5% inode=33%): [06:37:30] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:48:05] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in 
error state: QERROR as result of job 2028034s failure [06:49:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 1430 MB (4% inode=93%): [06:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [07:05:39] / on willow is WARNING: DISK WARNING - free space: / 21639 MB (20% inode=98%): [07:05:39] /tmp on willow is WARNING: DISK WARNING - free space: / 21639 MB (20% inode=98%): [07:07:29] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [07:10:59] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [07:19:20] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 306107 MB (5% inode=33%): [07:26:20] Load avg. on willow is WARNING: WARNING - load average: 17.15, 14.59, 12.47 [07:27:21] Load avg. on willow is OK: OK - load average: 15.00, 14.23, 12.47 [07:32:20] Load avg. 
on willow is WARNING: WARNING - load average: 17.14, 16.24, 13.72 [07:33:10] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:34:09] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [07:37:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:49:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 1190 MB (3% inode=93%): [07:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:01:20] /tmp on wolfsbane is WARNING: DISK WARNING - free space: /tmp 343 MB (14% inode=97%): [08:01:50] Sun Grid Engine execd on wolfsbane is WARNING: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.310547/1.00, alarm hl:np_load_long=0.291992/1.50, alarm hl:mem_free=594.000000M/600M, alarm hl:available=1/0 [08:04:20] /tmp on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:04:20] / on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[08:05:20] /tmp on wolfsbane is CRITICAL: DISK CRITICAL - free space: /tmp 93 MB (4% inode=97%): [08:05:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 1098 MB (3% inode=93%): [08:05:40] / on willow is WARNING: DISK WARNING - free space: / 21402 MB (20% inode=98%): [08:05:40] /tmp on willow is WARNING: DISK WARNING - free space: / 21402 MB (20% inode=98%): [08:07:29] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [08:09:50] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2028822s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2028822s failure [08:11:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:13:50] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [08:19:20] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 305986 MB (5% inode=33%): [08:34:09] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [08:37:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:55:42] (created) [MAGNUS-321] End at option; Magnus' tools: FIST; Minor New Feature <https://jira.toolserver.org/browse/MAGNUS-321> (Traveler100) [08:56:50] Sun Grid Engine execd on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [08:57:40] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:58:00] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [08:58:10] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). [08:58:20] toolserver.org HTTP on wolfsbane is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 4.130 second response time [08:58:49] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 9.647 second response time [08:59:50] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2028822s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2028822s failure [09:02:41] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.450 second response time [09:03:20] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.005 second response time [09:03:20] /tmp on wolfsbane is CRITICAL: DISK CRITICAL - free space: /tmp 71 MB (3% inode=97%): [09:04:20] /tmp on wolfsbane is WARNING: DISK WARNING - free space: /tmp 542 MB (20% inode=97%): [09:05:20] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 953 MB (31% inode=97%): [09:05:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 835 MB (2% inode=93%): [09:05:29] Environment IPMI on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:05:40] / on willow is WARNING: DISK WARNING - free space: / 21166 MB (20% inode=98%): [09:05:40] /tmp on willow is WARNING: DISK WARNING - free space: / 21166 MB (20% inode=98%): [09:07:09] Environment IPMI on adenia is OK: ok: temperature ok fan ok voltage ok chassis ok [09:07:31] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [09:11:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:12:20] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:14:00] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [09:19:20] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304949 MB (5% inode=33%): [09:22:50] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2028822s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2028822s failure [09:34:09] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [09:37:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [09:58:10] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). 
[10:05:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 555 MB (1% inode=93%): [10:05:40] / on willow is WARNING: DISK WARNING - free space: / 20982 MB (20% inode=98%): [10:06:40] /tmp on willow is WARNING: DISK WARNING - free space: / 20978 MB (20% inode=98%): [10:07:29] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [10:07:39] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.115234/1.10, alarm hl:np_load_long=0.817383/1.55, alarm hl:mem_free=12760.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.115234/1.00, alarm hl:np_load_long=0.817383/1.50, alarm hl:mem_free=12760.000000M/600M, alarm hl:available=1/0 [10:09:40] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [10:11:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:14:00] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [10:19:20] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304792 MB (5% inode=33%): [10:22:49] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2028822s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2028822s failure [10:34:09] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2028034s failure: longrun-sol@willow in error state: QERROR as result of job 2028034s failure [10:37:30] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:41:50] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 42953 MB (10% inode=99%): [10:46:50] /sql on z-dat-s4-a is 
OK: DISK OK - free space: /sql 67544 MB (16% inode=99%): [10:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [10:58:10] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). [10:59:50] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.739258/1.10, alarm hl:np_load_long=0.973633/1.55, alarm hl:mem_free=12635.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.739258/1.00, alarm hl:np_load_long=0.973633/1.50, alarm hl:mem_free=12635.000000M/600M, alarm hl:available=1/0 [11:05:20] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 249 MB (0% inode=93%): [11:05:39] / on willow is WARNING: DISK WARNING - free space: / 20802 MB (19% inode=98%): [11:06:40] /tmp on willow is WARNING: DISK WARNING - free space: / 20799 MB (19% inode=98%): [11:07:29] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [11:09:50] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:11:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:14:00] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [11:15:50] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.332031/1.10, alarm hl:np_load_long=1.124024/1.55, alarm hl:mem_free=13808.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.332031/1.00, alarm hl:np_load_long=1.124024/1.50, alarm hl:mem_free=13808.000000M/600M, alarm hl:available=1/0 [11:16:49] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 41912 MB (10% 
inode=99%): [11:16:59] Sun Grid Engine execd on wolfsbane is OK: testqueue@wolfsbane disabled: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [11:17:10] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [11:18:49] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 45979 MB (11% inode=99%): [11:19:30] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304615 MB (5% inode=33%): [11:21:49] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 42468 MB (10% inode=99%): [11:23:20] /tmp on wolfsbane is CRITICAL: DISK CRITICAL - free space: /tmp 176 MB (7% inode=97%): [11:24:20] /tmp on wolfsbane is WARNING: DISK WARNING - free space: /tmp 407 MB (16% inode=97%): [11:25:20] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 788 MB (27% inode=97%): [11:37:30] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:39:49] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.095703/1.00, alarm hl:np_load_long=1.432617/1.50, alarm hl:mem_free=12056.000000M/600M, alarm hl:available=1/0 [11:52:03] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.437500/1.10, alarm hl:np_load_long=0.521973/1.55, alarm hl:mem_free=443.000000M/500M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.437500/1.00, alarm hl:np_load_long=0.521973/1.50, alarm hl:mem_free=443.000000M/600M, alarm hl:available=1/0 [11:55:03] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2029358s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2029382s failure [11:57:51] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:58:10] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [11:58:11] APT on yarrow is WARNING: APT 
WARNING: 20 packages available for upgrade (0 critical updates). [12:00:52] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.169922/1.10, alarm hl:np_load_long=1.483399/1.55, alarm hl:mem_free=13324.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.169922/1.00, alarm hl:np_load_long=1.483399/1.50, alarm hl:mem_free=13324.000000M/600M, alarm hl:available=1/0 [12:05:35] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=93%): [12:05:45] / on willow is WARNING: DISK WARNING - free space: / 20568 MB (19% inode=98%): [12:06:51] /tmp on willow is WARNING: DISK WARNING - free space: / 20635 MB (19% inode=98%): [12:07:47] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [12:11:41] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:14:58] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [12:17:44] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [12:20:26] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304470 MB (5% inode=33%): [12:23:43] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.329101/1.10, alarm hl:np_load_long=1.282226/1.55, alarm hl:mem_free=12803.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.329101/1.00, alarm hl:np_load_long=1.282226/1.50, alarm hl:mem_free=12803.000000M/600M, alarm hl:available=1/0 [12:26:49] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [12:35:43] / on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from 
daemon. Check the remote server logs for error messages. [12:36:22] / on willow is WARNING: DISK WARNING - free space: / 20582 MB (19% inode=98%): [12:38:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:40:17] / on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [12:50:05] Bad Gateway for phpmyadmin right now [12:55:58] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2029251s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2029382s failure [12:58:26] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [12:58:58] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). [13:00:18] / on willow is WARNING: DISK WARNING - free space: / 20515 MB (19% inode=98%): [13:01:27] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [13:01:27] /tmp on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [13:02:16] /tmp on willow is WARNING: DISK WARNING - free space: / 20510 MB (19% inode=98%): [13:02:17] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:03:17] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[13:05:58] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=93%): [13:08:17] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [13:11:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:14:58] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [13:21:17] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304357 MB (5% inode=33%): [13:23:07] Hello all [13:23:13] Happy Mother's Day to all [13:23:17] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:43:18] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.511719/1.10, alarm hl:np_load_long=0.946289/1.55, alarm hl:mem_free=11000.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.511719/1.00, alarm hl:np_load_long=0.946289/1.50, alarm hl:mem_free=11000.000000M/600M, alarm hl:available=1/0 [13:47:17] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [13:47:57] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2030436s failure [13:48:18] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:28] / on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [13:48:28] /tmp on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[13:49:17] / on willow is WARNING: DISK WARNING - free space: / 20377 MB (19% inode=98%): [13:49:17] /tmp on willow is WARNING: DISK WARNING - free space: / 20377 MB (19% inode=98%): [13:55:57] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2029353s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2029382s failure [13:58:27] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [13:58:58] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). [14:05:57] / on wolfsbane is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=93%): [14:08:18] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [14:11:58] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:13:17] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.999024/1.10, alarm hl:np_load_long=1.404297/1.55, alarm hl:mem_free=11425.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.999024/1.00, alarm hl:np_load_long=1.404297/1.50, alarm hl:mem_free=11425.000000M/600M, alarm hl:available=1/0 [14:14:58] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [14:15:18] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [14:21:17] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 304219 MB (5% inode=33%): [14:23:27] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:28:49] hello DaBPunkt thanks but I am no mother :p [14:29:17] but... 
http://www.youtube.com/watch?v=7_rBidCkJxo [14:35:58] Sun Grid Engine execd on wolfsbane is OK: testqueue@wolfsbane disabled: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [14:35:58] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [14:36:37] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [14:37:27] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:37:58] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [14:38:57] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2030611s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2030008s failure [14:44:58] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.280274/1.10, alarm hl:np_load_long=0.286621/1.55, alarm hl:mem_free=4263.000000M/500M, alarm hl:tmp_free=0M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.280274/1.00, alarm hl:np_load_long=0.286621/1.50, alarm hl:mem_free=4263.000000M/600M, alarm hl:tmp_free=0M [14:46:27] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.144531/1.10, alarm hl:np_load_long=0.894531/1.55, alarm hl:mem_free=12633.000000M/500M, alarm hl:tmp_free=14325M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.144531/1.00, alarm hl:np_load_long=0.894531/1.50, alarm hl:mem_free=12633.000000M/600M, alarm hl:tmp_free [14:47:27] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [14:49:27] / on willow is WARNING: DISK WARNING - free space: / 20198 MB (19% inode=98%): [14:49:27] /tmp on willow is WARNING: DISK WARNING - free space: / 20198 MB (19% inode=98%): [14:53:57] / on wolfsbane is OK: DISK OK - free 
space: / 11357 MB (37% inode=93%): [14:54:57] Sun Grid Engine execd on wolfsbane is OK: testqueue@wolfsbane disabled: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [14:58:36] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [14:58:57] APT on yarrow is WARNING: APT WARNING: 20 packages available for upgrade (0 critical updates). [14:59:42] (created) [MNT-1238] Deleted audit-files on wolfsbane; Maintenance; Emergency work <https://jira.toolserver.org/browse/MNT-1238> (DaB.) [15:00:58] APT on yarrow is OK: APT OK: 0 packages available for upgrade (0 critical updates). [15:09:17] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [15:11:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:15:04] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [15:15:06] /tmp on wolfsbane is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[15:15:58] /tmp on wolfsbane is CRITICAL: DISK CRITICAL - free space: /tmp 217 MB (9% inode=97%): [15:16:57] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 887 MB (30% inode=97%): [15:16:57] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2030745s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2030745s failure [15:20:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:21:17] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303958 MB (5% inode=33%): [15:22:06] Sun Grid Engine execd on wolfsbane is CRITICAL: short-sol@wolfsbane in error state: QERROR as result of job 2030745s failure: medium-sol@wolfsbane in error state: QERROR as result of job 2030745s failure [15:22:42] (commented) [TS-1083] noboardwiki_p / noboard_chapterswikimedia_p should be removed <https://jira.toolserver.org/browse/TS-1083> (Jesse Plamondon-Willard) [15:23:07] toolserver.org HTTP on wolfsbane is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 8.698 second response time [15:23:17] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 7.814 second response time [15:27:57] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.005 second response time [15:28:20] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.007 second response time [15:29:46] (commented) [TS-1083] noboardwiki_p / noboard_chapterswikimedia_p should be removed <https://jira.toolserver.org/browse/TS-1083> (DaB.)
[15:29:57] SMF on wolfsbane is CRITICAL: ERROR - maintenance: svc:/application/sge/execd:toolserver [15:32:57] SMF on wolfsbane is OK: OK - all services online [15:34:43] (commented) [TS-1083] noboardwiki_p / noboard_chapterswikimedia_p should be removed <https://jira.toolserver.org/browse/TS-1083> (Jesse Plamondon-Willard) [15:38:26] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:39:45] (resolved) [TS-1083] noboardwiki_p / noboard_chapterswikimedia_p should be removed <https://jira.toolserver.org/browse/TS-1083> (DaB.) [15:45:27] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.069336/1.10, alarm hl:np_load_long=0.892578/1.55, alarm hl:mem_free=12904.000000M/500M, alarm hl:tmp_free=14305M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.069336/1.00, alarm hl:np_load_long=0.892578/1.50, alarm hl:mem_free=12904.000000M/600M, alarm hl:tmp_free [15:47:27] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [15:49:27] /tmp on willow is WARNING: DISK WARNING - free space: / 20019 MB (19% inode=98%): [15:49:27] / on willow is WARNING: DISK WARNING - free space: / 20019 MB (19% inode=98%): [15:58:37] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:06:27] / on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[16:07:27] / on willow is WARNING: DISK WARNING - free space: / 19953 MB (19% inode=98%): [16:09:17] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [16:11:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:15:57] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [16:21:18] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303812 MB (5% inode=33%): [16:38:27] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:39:56] when I want to close the screen in toolserver or terminate a code, what do I do? [16:42:37] Mahdiz, a running screen session or a finished one? [16:42:47] if you have a prompt, you can just write: exit [16:42:57] finished one [16:43:04] if you want to detach from screen, and leave it running in the background, do Ctrl+A, then D [16:43:35] or close the screen or finished? [16:43:54] Mahdiz, write exit [16:45:28] thanks [16:45:39] but when the screen is running a code [16:45:52] it's not working [16:46:51] you want to stop the running code? [16:46:58] or leave it running, but not view it? [16:47:03] yes:D [16:47:14] yes to which one? :/ [16:47:22] Ctrl+C to stop it [16:49:37] /tmp on willow is WARNING: DISK WARNING - free space: / 19829 MB (18% inode=98%): [16:49:47] first [16:52:50] then press Ctrl+C [16:54:28] ok. thanks ;) [16:55:52] DaBPunkt: around ? [16:58:37] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:59:25] Toto_Azero|away: : no, dinner is now.
in 1h if you like [17:00:44] ok [17:07:27] / on willow is WARNING: DISK WARNING - free space: / 19776 MB (18% inode=98%): [17:09:18] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [17:12:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:15:57] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [17:22:17] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303595 MB (5% inode=33%): [17:33:35] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:38:14] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [17:39:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:50:05] /tmp on willow is WARNING: DISK WARNING - free space: / 19619 MB (18% inode=98%): [17:59:14] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:08:05] / on willow is WARNING: DISK WARNING - free space: / 19397 MB (18% inode=98%): [18:10:14] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [18:13:14] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:16:13] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [18:23:13] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303438 MB (5% inode=33%): [18:25:06] re [18:32:34] DaBPunkt: here ? [18:32:38] yes [18:32:52] I had two questions actually [18:32:58] sure [18:33:09] 1) I'm getting more and more cron mail error [18:33:21] is this issue known ? 
[18:33:52] errors about cron itself or errors because a program/script of yours failed? [18:34:22] err… I'd say strange errors xD I'll copy some there [18:34:31] please use a pastebin [18:34:37] ok [18:34:43] or open a jira-bugreport [18:36:25] http://pastebin.com/n786XMf1 [18:37:52] Toto_Azero|busy: yes, that's a known issue [18:38:03] ok [18:38:09] so 2nd question xD [18:38:32] It will get better when the boxes are added in a week or two [18:38:47] 2) the deadline is coming soon (two days) and I've sent a mail to Luckas Blade ten days ago… but he didn't answer yet :p may I be allowed to run my interwiki-bot after May 15th ? [18:38:58] since the MMP isn't set up yet [18:39:24] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.136 second response time [18:40:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:40:24] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.009 second response time [18:57:12] quieter [18:57:33] DaBPunkt: not the bot, 'just' the cookies [18:57:37] Toto_Azero|busy: ^ [18:58:13] but, of course, this is a bit of a problem for a global bot ;-) [18:58:27] yes… [18:59:14] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:59:29] I'm looking to use SGE, question: How do I schedule a script to run while the externallinks table is still "hot" (in memory) from another script [18:59:52] That's not possible [18:59:57] Wrapper script to make sure the jobs run sequentially? [19:00:11] jobs --> processes or whatever [19:00:16] I already do that with cron [19:00:31] SGE is ordered, but not linear, right? [19:00:42] Jobs go in order, but can start whenever resources are available?
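The wrapper-script idea raised above can be sketched roughly as follows. This is only an illustration, not anything used on the Toolserver: the function name `run_sequentially` and the placeholder commands are made up for the example, and running two steps back to back only makes it more likely that cached table data is still warm; nothing here guarantees it.

```python
import subprocess
import sys


def run_sequentially(commands):
    """Run each command in order and stop at the first failure.

    Returns the list of commands that completed successfully.
    """
    completed = []
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a later step never starts after an earlier one has failed.
        subprocess.run(cmd, check=True)
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    # Placeholder commands (assumption): replace with the two real scripts.
    steps = [
        [sys.executable, "-c", "print('first script')"],
        [sys.executable, "-c", "print('second script')"],
    ]
    run_sequentially(steps)
```

The second command starts as soon as the first one exits, which keeps the gap between the two as short as a single process can make it.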
[19:00:53] Dispenser: you HOPE that the table is still in memory ;) [19:01:23] DaBPunkt: It cuts out 25 minutes [19:02:26] it depends very much on what other users (or the mysql-slave) run in parallel with or after your query [19:03:12] but you can use a wrapper script like Joan suggested to make the 2 scripts run one after the other [19:03:21] Joan: in general yes, but you can add dependencies to your jobs (e.g. job B has to wait until job A has finished) [19:04:10] Merlissimo: Hmm, all right. [19:05:01] qsub -N A commandA && qsub -N B -hold_jid A commandB [19:06:46] Merlissimo: AFAICS the && already handles the sequentiality here ;) [19:07:26] doesn't qsub return before job A is done? [19:07:34] DaBPunkt: no, qsub only submits, but does not start the job [19:07:50] mm, yes you are right [19:07:55] Joan: another solution for running sequential but unordered is to use the user-slots resource [19:08:05] / on willow is WARNING: DISK WARNING - free space: / 18989 MB (18% inode=98%): [19:08:20] Dispenser: ^ [19:10:24] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [19:13:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:13:43] (created) [JARRY-36] Image existence checker program fails due to image in black list with not English characters; Jarry's Tools; Bug <https://jira.toolserver.org/browse/JARRY-36> (Traveler100) [19:14:03] DaBPunkt: Do we still have backups of our /home directory? [19:14:05] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.153320/1.10, alarm hl:np_load_long=0.780273/1.55, alarm hl:mem_free=12527.000000M/500M, alarm hl:tmp_free=14016M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.153320/1.00, alarm hl:np_load_long=0.780273/1.50, alarm hl:mem_free=12527.000000M/600M, alarm hl:tmp_free [19:14:24] Dispenser: yes.
Back 1 week for normal [19:15:05] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [19:16:24] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [19:23:24] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303273 MB (5% inode=33%): [19:40:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:44:05] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.480469/1.10, alarm hl:np_load_long=0.889648/1.55, alarm hl:mem_free=12562.000000M/500M, alarm hl:tmp_free=13976M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.480469/1.00, alarm hl:np_load_long=0.889648/1.50, alarm hl:mem_free=12562.000000M/600M, alarm hl:tmp_free [19:44:47] (commented) [OSM-11] WIWOSM https <https://jira.toolserver.org/browse/OSM-11> (Kai Krueger) [19:47:05] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [19:50:04] /tmp on willow is WARNING: DISK WARNING - free space: / 20304 MB (19% inode=98%): [19:51:05] / on willow is OK: DISK OK - free space: / 39092 MB (37% inode=99%): [19:51:05] /tmp on willow is OK: DISK OK - free space: / 39092 MB (37% inode=99%): [19:59:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [20:10:24] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [20:13:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:16:24] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [20:23:24] /aux0 on hemlock is
CRITICAL: DISK CRITICAL - free space: /aux0 303131 MB (5% inode=33%): [20:25:49] (created) [ACCAPP-509] Account to run a bot for es.wikipedia and other Wikimedia projects, which would make use of Toolserver.; Account Approval; New Account <https://jira.toolserver.org/browse/ACCAPP-509> [20:40:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:46:05] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.071289/1.00, alarm hl:np_load_long=0.810547/1.50, alarm hl:mem_free=12371.000000M/600M, alarm hl:tmp_free=13934M/100M, alarm hl:available=1/0 [20:48:04] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [20:59:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [21:10:24] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [21:13:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:16:24] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [21:23:24] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302986 MB (5% inode=33%): [21:41:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:55:29] night, ts [21:59:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [22:03:25] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2032296s failure: longrun-sol@willow in error state: QERROR as result of job 2032296s failure [22:10:25] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [22:13:24] SMF on turnera is CRITICAL: ERROR - offline:
svc:/system/cluster/scsymon-srv:default [22:16:24] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [22:19:04] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.077149/1.00, alarm hl:np_load_long=0.750977/1.50, alarm hl:mem_free=12388.000000M/600M, alarm hl:tmp_free=13855M/100M, alarm hl:available=1/0 [22:20:04] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [22:23:25] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303421 MB (5% inode=33%): [22:25:05] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.860351/1.10, alarm hl:np_load_long=1.034180/1.55, alarm hl:mem_free=12844.000000M/500M, alarm hl:tmp_free=13860M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.860351/1.00, alarm hl:np_load_long=1.034180/1.50, alarm hl:mem_free=12844.000000M/600M, alarm hl:tmp_free [22:41:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [22:59:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [23:03:24] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in error state: QERROR as result of job 2032296s failure: longrun-sol@willow in error state: QERROR as result of job 2032296s failure [23:10:23] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [23:12:24] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [23:13:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:16:24] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d42 of mirror d40 is Needs and submirror d32 of 
mirror d30 is Needs and submirror d22 of mirror d20 is Needs and submirror d12 of mirror d10 is Needs [23:23:25] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303363 MB (5% inode=33%): [23:41:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default