[00:03:05] Merlissimo: /bin/sh: qcronsub: not found [00:03:16] Merlissimo: getting e-mail bomb [00:03:30] first hit: 20 minutes ago [00:03:40] 01:40 in EU timezone [00:03:47] CET [00:03:49] whatever [00:03:55] hawthorn [00:04:02] cron [00:04:11] i think DaB. changed something because of Platonides. i don't know what [00:05:05] Krinkle: which qcronsub ? [00:05:13] all the ones I have [00:05:27] including some other projects outside my user account [00:05:37] some at *:02 some at :17 [00:05:37] etc. [00:05:43] spread around the hour as I was asked [00:06:20] no, could you run the command "which qcronsub"? [00:06:32] lol [00:06:54] Merlissimo: empty string [00:07:11] (which exits with status code 1) [00:07:24] Free Memory on damiana is WARNING: WARNING - 5.9% (494524 kB) free! [00:07:30] mmh, for me it is set correctly: [00:07:30] rmerl@hawthorn:~$ which qcronsub [00:07:30] /sge/GE/bin/sol-amd64/qcronsub [00:07:30] rmerl@hawthorn:~$ sh [00:07:31] $ which qcronsub [00:07:32] /sge/GE/bin/sol-amd64/qcronsub [00:09:09] indeed, now it works on willow/nightshade/hawthorn [00:09:22] cron is different though [00:09:31] try from crontab [00:09:41] (or cronie to be specific) [00:09:46] Krinkle: you used cron and not cronie? [00:09:54] Mail is sent from cron [00:09:55] I use cronie [00:10:07] Cron Daemon [00:10:12] i cannot reproduce your wrong path. and because i am not an admin i cannot check your config [00:10:27] I just got another mail report, same error [00:10:36] Cron qcronsub [00:10:58] from crontie at krinkle@submit.toolserver.org [00:11:02] cronie* [00:11:34] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [00:12:48] i only know that new jobs are still submitted to sge [00:13:03] I just got the same error as Krinkle is having [00:13:08] "/bin/sh: qcronsub: not found" [00:13:34] Free Memory on damiana is OK: OK - 7.1% (596152 kB) free. 
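[editor's note] The exchange above turns on one point: `which qcronsub` succeeds in a login shell, yet the same command fails when the cron daemon runs it, probably because the daemon has a different PATH. The sketch below shows one way to compare the two lookups. It is an illustration only. `CRON_LIKE_PATH` is an assumed value (the log never states the PATH the cron daemon used), and `resolve` and `report` are hypothetical helper names.

```python
import os
import shutil

# ASSUMPTION: a minimal PATH of the kind a cron daemon may hand to its jobs.
# The real value on hawthorn is not given anywhere in the log.
CRON_LIKE_PATH = "/usr/bin:/bin"


def resolve(command, path):
    """Return the full path of `command` as found on `path`, or None."""
    return shutil.which(command, path=path)


def report(command, cron_path=CRON_LIKE_PATH):
    """Look `command` up on the current PATH and on a cron-like PATH."""
    login_path = os.environ.get("PATH", os.defpath)
    return {
        "login": resolve(command, login_path),
        "cron": resolve(command, cron_path),
    }


if __name__ == "__main__":
    # A None under "cron" alongside a real path under "login" would match
    # the "/bin/sh: qcronsub: not found" mails described above.
    print(report("qcronsub"))
```

If the two lookups disagree, the usual workarounds are to give the command's absolute path in the crontab entry or to set PATH at the top of the crontab. Whether either applies on this host is for the admins to confirm.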
[00:22:34] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60436 MB (9% inode=99%): [00:23:54] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [00:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [00:24:43] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [00:24:53] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [00:25:24] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [00:28:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [00:29:11] Krinkle / legoktm: open a jira ticket so that the ts admins know about this problem [00:29:24] ok [00:29:25] will do [00:29:59] I was locked out of JIRA (like many other users in SSO) last week [00:30:04] Ah, I see it is fixed now [00:30:09] legoktm: go ahead :) [00:32:18] i only have sge manager privileges which allow me to change some config on a running sge cluster, but i have no access to the sge process itself or any os related stuff [00:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [00:33:24] https://jira.toolserver.org/browse/TS-1544 [00:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: 
[00:54:33] what's happened? /bin/sh: qcronsub: not found [00:54:51] liangent: https://jira.toolserver.org/browse/TS-1544 [00:58:34] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [00:59:54] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [01:00:34] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1942 [01:00:54] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1949.000000 [01:11:14] HTTP proxy on ha-proxy.esi is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:11:24] ethernet 0/1/17 on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:24] ethernet 0/1/18 [far2-n1-oe16-a-esams.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:24] ethernet 0/1/4 [ts-array5 controller A] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:24] ethernet 0/1/23 [rosemary.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:24] ethernet 0/1/13 [nightshade.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:24] ethernet 0/1/14 [mayapple] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:24] ethernet 0/1/5 [ts-array5 controller B] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:34] ethernet 0/1/19 on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. 
[01:11:34] ethernet 0/1/2 [daphne] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:34] ethernet 0/1/7 [damiana bge0] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:34] ethernet 0/1/3 [rosemary] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:34] ethernet 0/1/15 [ts-array3.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:34] ethernet 0/1/10 [cassia] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:34] ethernet 0/1/24 [ortelius.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:35] ethernet 0/1/16 [yarrow.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:11:35] ethernet 0/1/1 on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:36] ethernet 0/1/6 [turnera bge0] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:11:36] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [01:11:37] ethernet 0/1/8 [turnera bge1] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:03] ethernet 0/1/11 [msw1-oe10-esams] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:03] ethernet 0/1/2 [nightshade] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. 
[01:12:03] ethernet 0/1/7 [hemlock] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:03] ethernet 0/1/20 [hemlock.mgmt] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:04] ethernet 0/1/15 [far1-n1-oe16-a-esams.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:04] ethernet 0/1/24 [daphne.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:04] ethernet 0/1/10 [hawthorn] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:05] ethernet 0/1/8 [yarrow] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:05] ethernet 0/1/14 [cassini.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:06] ethernet 0/1/23 [asw-oe10-esams:0/1/6 LACP secondary] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:06] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:07] ethernet 0/1/16 [far1-n1-oe16-b-esams.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [01:12:07] ethernet 0/1/9 [ortelius] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [01:12:08] ethernet 0/1/3 [cassini] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. 
[01:15:33] MySQL slave on rosemary is CRITICAL: (Service Check Timed Out) [01:15:54] s1 replag on rosemary is CRITICAL: (Service Check Timed Out) [01:16:04] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2121.000000 [01:16:14] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2120 [01:21:14] ethernet 0/1/5 [ts-array5 controller B] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/5:UP:1 UP: OK [01:21:14] ethernet 0/1/14 [mayapple] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/14:UP:1 UP: OK [01:21:14] ethernet 0/1/18 [far2-n1-oe16-a-esams.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/18:UP:1 UP: OK [01:21:14] ethernet 0/1/13 [nightshade.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/13:DOWN:1 UP: OK [01:21:14] ethernet 0/1/23 [rosemary.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/23:DOWN:1 UP: OK [01:21:14] ethernet 0/1/17 on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/17:DOWN:1 UP: OK [01:21:14] ethernet 0/1/4 [ts-array5 controller A] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/4:UP:1 UP: OK [01:21:24] ethernet 0/1/19 on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/19:DOWN:1 UP: OK [01:21:24] ethernet 0/1/2 [daphne] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/2:UP:1 UP: OK [01:21:24] ethernet 0/1/7 [damiana bge0] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/7:UP:1 UP: OK [01:21:24] ethernet 0/1/16 [yarrow.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/16:DOWN:1 UP: OK [01:21:24] ethernet 0/1/10 [cassia] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/10:UP:1 UP: OK [01:21:24] ethernet 0/1/15 [ts-array3.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/15:UP:1 UP: OK [01:21:24] ethernet 0/1/24 [ortelius.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/24:DOWN:1 UP: OK [01:21:25] ethernet 0/1/3 [rosemary] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/3:UP:1 UP: OK [01:21:25] ethernet 0/1/1 on asw-oe16-esams.mgmt 
is OK: GigabitEthernet0/1/1:DOWN:1 UP: OK [01:21:26] ethernet 0/1/6 [turnera bge0] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/6:UP:1 UP: OK [01:21:26] ethernet 0/1/9 [damiana bge1] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/9:UP:1 UP: OK [01:21:27] ethernet 0/1/8 [turnera bge1] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/8:UP:1 UP: OK [01:21:34] ethernet 0/1/18 [scs-oe10-esams] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/18:UP:1 UP: OK [01:21:34] ethernet 0/1/4 [wolfsbane] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/4:UP:1 UP: OK [01:21:34] ethernet 0/1/20 on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/20:DOWN:1 UP: OK [01:21:34] ethernet 0/1/17 [zedler.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/17:DOWN:1 UP: OK [01:21:34] ethernet 0/1/11 [thyme] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/11:UP:1 UP: OK [01:21:34] ethernet 0/1/5 [asw-oe16:0/1/22 LACP primary] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/5:UP:1 UP: OK [01:21:34] ethernet 0/1/12 [fsw1-n1-oe16-esams.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/12:UP:1 UP: OK [01:21:35] ethernet 0/1/21 [csw2-esams:2/0/47] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/21:UP:1 UP: OK [01:21:35] ethernet 0/1/6 [asw-oe16:0/1/23 LACP secondary] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/6:UP:1 UP: OK [01:21:37] ethernet 0/1/19 [ptolemy.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/19:DOWN:1 UP: OK [01:21:54] ethernet 0/1/2 [nightshade] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/2:UP:1 UP: OK [01:21:54] ethernet 0/1/11 [msw1-oe10-esams] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/11:UP:1 UP: OK [01:21:54] ethernet 0/1/7 [hemlock] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/7:UP:1 UP: OK [01:21:54] ethernet 0/1/20 [hemlock.mgmt] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/20:DOWN:1 UP: OK [01:21:54] ethernet 0/1/8 [yarrow] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/8:UP:1 UP: OK [01:21:54] ethernet 0/1/10 [hawthorn] on 
asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/10:UP:1 UP: OK [01:21:54] ethernet 0/1/15 [far1-n1-oe16-a-esams.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/15:UP:1 UP: OK [01:21:55] ethernet 0/1/14 [cassini.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/14:UP:1 UP: OK [01:21:55] ethernet 0/1/24 [daphne.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/24:UP:1 UP: OK [01:21:56] ethernet 0/1/23 [asw-oe10-esams:0/1/6 LACP secondary] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/23:UP:1 UP: OK [01:21:56] ethernet 0/1/9 [ortelius] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/9:UP:1 UP: OK [01:21:57] ethernet 0/1/21 [ptolemy] on asw-oe10-esams.mgmt is OK: GigabitEthernet0/1/21:UP:1 UP: OK [01:22:34] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60221 MB (9% inode=99%): [01:23:54] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [01:24:23] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [01:24:43] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [01:24:54] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [01:25:24] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[01:25:55] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [01:28:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [01:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [01:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [01:58:34] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [01:59:54] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [02:11:34] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [02:22:34] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60079 MB (9% inode=99%): [02:23:53] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [02:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [02:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [02:24:54] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [02:25:23] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[02:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [02:28:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [02:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [02:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [02:55:34] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 70574 MB (7% inode=98%): [02:55:34] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 70574 MB (7% inode=98%): [02:56:24] /sql on z-dat-s7-a is WARNING: DISK WARNING - free space: /sql 36056 MB (8% inode=99%): [02:58:34] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [02:59:54] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [03:11:34] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [03:22:34] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59939 MB (9% inode=99%): [03:23:54] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [03:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [03:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [03:24:54] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) 
[03:25:24] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [03:28:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [03:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [03:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [03:58:34] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [03:59:54] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [04:11:34] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [04:22:34] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59859 MB (9% inode=99%): [04:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [04:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [04:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [04:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [04:25:34] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [04:28:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [04:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [04:37:09] @replag [04:37:09] Matthew_: s1-rr-a: 1m 55s [-0.01 s/s]; s1-user: 1m 55s [-0.01 s/s]; s2-user: 53s [+0.00 s/s]; s3-rr-a: 1m 6s [-0.00 s/s]; s3-user: 1m 6s [-0.00 s/s]; s5-rr-a: 10s [-0.00 s/s]; s5-user: 10s [-0.00 s/s] [04:37:44] /sql on ptolemy is OK: DISK OK - free space: /sql 178490 MB (29% inode=99%): [04:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [04:54:54] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [05:00:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [05:11:43] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [05:14:54] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [05:24:03] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [05:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [05:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [05:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [05:25:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [05:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [05:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [05:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [05:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [06:00:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [06:10:34] Free Memory on damiana is 
WARNING: WARNING - 5.1% (423824 kB) free! [06:11:33] Free Memory on damiana is CRITICAL: CRITICAL - 3.5% (290160 kB) free! [06:11:45] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [06:14:34] Free Memory on damiana is OK: OK - 10.4% (874324 kB) free. [06:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [06:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [06:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [06:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [06:25:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [06:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [06:32:23] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [06:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [06:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [07:00:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [07:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [07:23:34] Free 
Memory on damiana is WARNING: WARNING - 6.7% (563700 kB) free! [07:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [07:24:23] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [07:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [07:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [07:25:34] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [07:28:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.8% (402120 kB) free! [07:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [07:30:34] Free Memory on damiana is OK: OK - 7.6% (640320 kB) free. [07:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [07:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [07:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [08:00:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [08:04:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.9% (413772 kB) free! [08:05:33] Free Memory on damiana is WARNING: WARNING - 5.1% (430728 kB) free! 
[08:11:43] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [08:15:16] nosy, Merlissimo: ping [08:15:23] qcronsub disappeared on hawthorn [08:18:14] I was going to report that [08:18:24] DaB was removing sge62 last night [08:18:40] something seems to have gone wrong [08:19:05] especially given that if hawthorn had been using sge62, it wouldn't have worked [08:20:39] I filed a bug [08:24:03] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [08:24:22] cronied probably needs restarting to get the new path [08:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [08:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [08:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [08:25:33] Free Memory on damiana is OK: OK - 7.1% (593488 kB) free. [08:25:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:25:45] this is strange [08:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [08:25:54] $PATH for hawthorn is /sge/GE/bin/sol-amd64 [08:26:01] *contains [08:26:12] which is where qcronsub lives [08:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [08:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [08:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [08:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [09:00:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [09:11:43] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [09:18:05] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 102748 MB (10% inode=99%): [09:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [09:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [09:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [09:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [09:25:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [09:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [09:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [09:34:14] toolserver.org HTTP on wolfsbane is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 5.551 second response time [09:39:14] Sun Grid Engine execd on wolfsbane is CRITICAL: NRPE: Unable to read output [09:40:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [09:42:13] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [09:44:04] Sun Grid Engine execd on wolfsbane is CRITICAL: Connection refused by host [09:44:34] Free Memory on damiana is WARNING: WARNING - 5.1% (428136 kB) free! [09:46:34] Free Memory on damiana is OK: OK - 8.6% (721420 kB) free. [09:49:07] Bryan: https://jira.toolserver.org/browse/TS-1544 , i am not an admin, so i cannot help [09:54:33] Load avg. on wolfsbane is CRITICAL: Connection refused by host [09:54:44] Environment IPMI on wolfsbane is CRITICAL: Connection refused by host [09:56:13] /tmp on wolfsbane is CRITICAL: Connection refused by host [09:57:39] Free Memory on damiana is CRITICAL: CRITICAL - 4.7% (394244 kB) free! 
[09:58:03] / on wolfsbane is CRITICAL: Connection refused by host [09:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [10:05:04] Sun Grid Engine execd on wolfsbane is CRITICAL: Connection refused by host [10:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [10:12:03] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.057 second response time [10:24:03] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [10:24:23] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [10:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [10:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [10:25:34] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [10:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [10:31:33] Free Memory on damiana is WARNING: WARNING - 6.2% (515612 kB) free! 
[10:32:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [10:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [10:45:34] Free Memory on damiana is OK: OK - 7.3% (608496 kB) free. [10:47:13] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:54:34] Load avg. on wolfsbane is CRITICAL: Connection refused by host [10:54:44] Environment IPMI on wolfsbane is CRITICAL: Connection refused by host [10:56:14] /tmp on wolfsbane is CRITICAL: Connection refused by host [10:58:04] / on wolfsbane is CRITICAL: Connection refused by host [10:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [11:05:04] Sun Grid Engine execd on wolfsbane is CRITICAL: Connection refused by host [11:07:04] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.021 second response time [11:07:06] DaB. * [Toolserver-announce] /sge62 is gone [11:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [11:14:13] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:17:04] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.376 second response time [11:22:36] any ts-wiki-admin? 
[11:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [11:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [11:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [11:25:03] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [11:25:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [11:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [11:32:34] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [11:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [11:43:14] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:54:34] Load avg. 
on wolfsbane is CRITICAL: Connection refused by host [11:54:44] Environment IPMI on wolfsbane is CRITICAL: Connection refused by host [11:56:14] /tmp on wolfsbane is CRITICAL: Connection refused by host [11:58:04] / on wolfsbane is CRITICAL: Connection refused by host [11:58:43] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [12:02:33] hello all [12:05:04] SSH on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:05:04] Sun Grid Engine execd on wolfsbane is CRITICAL: Connection refused by host [12:05:04] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.010 second response time [12:05:54] SSH on wolfsbane is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:08:24] DaBPunkt: hello. saw my comment on that jira ticket? [12:09:18] liangent: yes, did you see my reply? [12:11:35] DaBPunkt: just now ... but no mail? [12:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [12:11:46] is it a separate issue that jira mail is broken? [12:11:55] yes [12:12:29] already known? [12:12:40] yes [12:14:13] toolserver.org HTTP on wolfsbane is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 6.276 second response time [12:14:34] Free Memory on damiana is CRITICAL: CRITICAL - 3.9% (326076 kB) free! [12:14:35] wolfsbane looks unhappy [12:15:24] DaBPunkt: mine is: /opt/local/bin:/opt/ts/gnu/bin:/opt/ts/bin:/opt/ts/mysql/5.1/bin:/opt/ts/perl/5.10/bin:/opt/ts/python/2.6/bin:/opt/ts/php/5.3/bin:/opt/ts/ruby/1.9/bin:/opt/ts/mono/2.0/bin:/opt/ts/tcl/8.5/bin:/usr/ccs/bin:/sge/GE/bin/sol-amd64:/usr/bin:/usr/sbin:/usr/sfw/bin:/usr/postgres/8.3/bin:/opt/jobserver/bin [12:15:54] exactly the same [12:16:32] and it still fails? [12:16:34] Free Memory on damiana is WARNING: WARNING - 5.4% (451732 kB) free! 
[12:17:42] DaBPunkt: yes [12:18:35] strange [12:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [12:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [12:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [12:25:03] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [12:25:46] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [12:26:33] Free Memory on damiana is OK: OK - 7.2% (607248 kB) free. [12:26:53] Environment IPMI on wolfsbane is OK: ok: temperature ok fan ok voltage ok chassis ok [12:27:04] / on wolfsbane is OK: DISK OK - free space: / 16565 MB (55% inode=93%): [12:27:14] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.012 second response time [12:27:14] /tmp on wolfsbane is OK: DISK OK - free space: / 16537 MB (55% inode=93%): [12:27:23] liangent: try now [12:27:34] Load avg. on wolfsbane is OK: OK - load average: 2.11, 0.89, 0.34 [12:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [12:31:13] DaBPunkt: what can I do to try ... I think I should just wait and see if things are done on wiki now... [12:33:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [12:33:23] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [12:41:13] DaBPunkt, psw recovery still not working [12:41:20] can you restore for me, plz? 
[12:41:25] *restore it [12:41:49] for jira or the wiki? [12:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [12:42:39] wiki, DaBPunkt [12:43:33] Gnumarcoo: I can try. Which name? [12:43:38] Gnumarcoo [12:44:46] DaBPunkt, ^ [12:44:51] DaBPunkt: the issue is now resolved. what was the cause? just out of curiosity [12:45:15] liangent: the only thing I did was restart cron [12:45:19] cronie [12:45:31] maybe it cached the wrong path [12:45:38] I will restart it everywhere [12:46:19] ok thanks [12:54:25] Gnumarcoo: sorry, we have to find the problem first. I cannot manually change your password at the moment [12:55:34] Free Memory on damiana is WARNING: WARNING - 6.8% (568896 kB) free! [12:56:43] DaBPunkt, your recovery system has a bug so I cannot do it by myself. I'll call you back later... [12:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [13:01:33] Free Memory on damiana is OK: OK - 7.3% (608300 kB) free. [13:02:54] DaBPunkt, i've found my old psw, so there's no prob [13:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [13:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [13:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [13:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [13:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [13:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
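Editor's note on the qcronsub failure discussed above: the log shows `which qcronsub` succeeding in a login shell while cron reported `/bin/sh: qcronsub: not found`, and the only stated fix was a cronie restart. The root cause is not confirmed in the log. One common way such a mismatch arises is that the cron daemon resolves commands against a different PATH than an interactive shell does. The Python sketch below only illustrates that lookup difference. Both directories are temporary stand-ins created by the script (hypothetical, not the Toolserver's real configuration), and `resolve` and `demo` are names invented for this illustration.

```python
import os
import shutil
import stat
import tempfile


def resolve(command, path):
    """Return the file that a PATH lookup would pick for `command`, or None."""
    return shutil.which(command, path=path)


def demo():
    """Look up a fake qcronsub against two different PATH values.

    Both directories are temporary stand-ins (assumptions for illustration):
    `cron_dir` plays a minimal daemon PATH without the SGE bin directory,
    `sge_dir` plays a directory such as /sge/GE/bin/sol-amd64.
    """
    with tempfile.TemporaryDirectory() as cron_dir, \
            tempfile.TemporaryDirectory() as sge_dir:
        # Create an executable placeholder named qcronsub in the "SGE" directory.
        tool = os.path.join(sge_dir, "qcronsub")
        with open(tool, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

        daemon_path = cron_dir                       # SGE directory missing
        login_path = sge_dir + os.pathsep + cron_dir  # SGE directory present

        from_cron = resolve("qcronsub", daemon_path)
        from_login = resolve("qcronsub", login_path)
        return from_cron, from_login


if __name__ == "__main__":
    from_cron, from_login = demo()
    print("lookup with daemon-style PATH:", from_cron)
    print("lookup with login-style PATH: ", from_login)
```

If a job must not depend on the daemon's environment, two usual workarounds are calling the tool by its absolute path in the crontab entry or setting PATH explicitly at the top of the crontab. Whether either was needed here is not stated in the log.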
[13:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [13:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [13:33:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [13:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [13:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [13:58:43] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [14:04:34] Free Memory on damiana is WARNING: WARNING - 5.7% (479324 kB) free! [14:10:33] Free Memory on damiana is WARNING: WARNING - 6.2% (516436 kB) free! [14:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [14:15:44] Hi. Toolserver-Java guys here (except me)? [14:17:13] toolserver.org/lalm/navigate/way/16765068 returns an exception, but works locally. Differences exist in the used libraries and probably the database server (schema is the same), but I'm not sure how I could solve that. 
[14:23:43] jongleur: that's a casting-problem – I doubt it is related to the database-server [14:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [14:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [14:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [14:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [14:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [14:25:55] DaBPunkt: it may be a problem with the jar files, but the only difference I can see between local and remote stuff is postgresql-9.1-901.jdbc4.jar (toolserver) vs. postgresql-9.1-902.jdbc4.jar (locally), and I'm not able to change that, because it's provided by the toolserver's tomcat. [14:26:59] I can update the jar, but I doubt that such a small version step is important. Do you have a URL? [14:27:09] wait a moment [14:27:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.4% (367700 kB) free! [14:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [14:29:34] Free Memory on damiana is WARNING: WARNING - 5.1% (425364 kB) free! [14:30:33] Free Memory on damiana is CRITICAL: CRITICAL - 4.5% (378504 kB) free! 
[14:30:38] DaBPunkt: http://jdbc.postgresql.org/download/postgresql-9.1-902.jdbc4.jar [14:33:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [14:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [14:34:47] jongleur: didn't help. Did you see http://hibernate-spatial.1140993.n2.nabble.com/PGobject-cannot-be-cast-to-PGgeometry-td1141096.html ? [14:35:38] DaBPunkt: no, thanks... [14:36:05] should I create a ticket to get the postgis-jar? [14:39:01] yes please [14:39:15] k, thanks [14:41:48] DaBPunkt, did wolfsbane change its ip recently? [14:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [14:51:33] Platonides: rebooted wolfsbane today, but that shouldn't change its ip – why? 
[14:54:28] 91.198.174.210 is mayapple now, but I had an entry in known_hosts for that ip linked to wolfsbane [14:54:51] the webserver configuration also complains about not binding to some ips [14:54:59] I opened https://jira.toolserver.org/browse/TS-1548 with them [14:55:44] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 71353 MB (7% inode=98%): [14:55:44] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 71353 MB (7% inode=98%): [14:55:50] I'm also getting a different Warning: the ECDSA host key for 'wolfsbane.toolserver.org' differs from the key for the IP address '91.198.174.203' [14:56:10] it looks like it moved ips [14:56:34] /sql on z-dat-s7-a is WARNING: DISK WARNING - free space: /sql 35499 MB (8% inode=99%): [14:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [15:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [15:14:21] Platonides: I assigned 91.198.174.210 to mayapple around 2 August. The IP was free in the DNS-setup at that time. Maybe river once assigned it to wolfsbane (too) but nowadays it is not assigned there. [15:15:21] by the way, where did mayapple come from? [15:15:52] Platonides: an old dell-server the wmf planned to throw away [15:15:53] it suddenly appeared without notice, and if it wasn't bought... [15:16:23] taken for the toolserver! :) [15:16:35] is it supposed to be like a login server? [15:16:42] I convinced Mark to give it to us so I can install debian on it and test apache [15:17:05] (we got a second one but it broke a power-up) [15:17:12] a → at [15:18:32] Platonides: at the moment it is a test-server. 
Its future is unclear (it has no warranty so if something breaks the chance is high that we have to throw it away) [15:18:59] I guess we still have the pieces of that second server that broke :P [15:19:18] yes, it waste space in our rack ;) [15:19:44] mayapple is not configured as submit host, and its /mnt/user-store needs a remount [15:19:55] I'm not sure if it's worth opening you a task for that [15:19:56] yes to both [15:20:35] you mean, open two jira tickets... ? [15:20:50] no, I mean that I know both things :) [15:21:01] ah [15:22:17] I thought that would be news, given that it's not fixed ;) [15:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [15:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [15:24:36] I neglected mayapple in the last time because I was busy with other ts-stuff [15:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [15:25:02] well, it's kind of a "bonus server" [15:25:03] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [15:25:15] I think there was something else broken... [15:25:26] server switching [15:25:37] you can't jump from mayapple to other hosts [15:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [15:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [15:29:26] I logged them in a generic mayapple task: https://jira.toolserver.org/browse/TS-1549 [15:33:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [15:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [15:34:22] * DaBPunkt fills in now a general task for making dinner – cu :-) [15:35:09] ENOCOOK: Resource unavailable [15:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [15:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [16:07:44] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1946 [16:08:13] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1946.000000 [16:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [16:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [16:24:14] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1787.000000 [16:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [16:24:44] MySQL slave on rosemary is OK: Uptime: 15728840 
Threads: 14 Questions: 6974580029 Slow queries: 2323843 Opens: 342278 Flush tables: 6 Open tables: 4142 Queries per second avg: 443.426 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1786 [16:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [16:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [16:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:25:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [16:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [16:33:14] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [16:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [16:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [16:43:33] Free Memory on damiana is WARNING: WARNING - 6.0% (504972 kB) free! [16:49:44] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2042 [16:49:54] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:50:13] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2049.000000 [16:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [17:02:33] Free Memory on damiana is OK: OK - 7.1% (598736 kB) free. [17:11:43] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [17:17:07] Dr. Trigon * Re: [Toolserver-l] When to execute cron-tasks [17:22:05] Dr. Trigon * Re: [Toolserver-l] /sge62 is gone [17:24:03] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [17:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [17:24:34] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [17:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [17:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [17:25:43] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[17:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [17:28:06] Toto Azéro * Re: [Toolserver-l] /sge62 is gone [17:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [17:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [17:34:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [17:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [17:49:44] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1859 [17:50:13] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1834.000000 [17:51:43] MySQL slave on rosemary is OK: Uptime: 15734061 Threads: 11 Questions: 6977090755 Slow queries: 2325937 Opens: 342735 Flush tables: 6 Open tables: 4143 Queries per second avg: 443.438 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1798 [17:52:14] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1799.000000 [17:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [18:03:44] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2114 [18:04:34] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2128.000000 [18:11:44] Sun Grid Engine 
execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [18:22:34] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1781.000000 [18:22:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.9% (409712 kB) free! [18:22:44] MySQL slave on rosemary is OK: Uptime: 15735921 Threads: 11 Questions: 6977648277 Slow queries: 2326621 Opens: 342797 Flush tables: 6 Open tables: 4143 Queries per second avg: 443.421 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1764 [18:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [18:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [18:24:34] Free Memory on damiana is WARNING: WARNING - 5.2% (431824 kB) free! [18:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [18:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [18:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [18:27:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.5% (377528 kB) free! 
[18:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [18:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [18:34:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [18:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [18:43:53] @replag [18:43:53] Merlissimo: s1-rr-a: 18m 55s [+0.02 s/s]; s1-user: 18m 55s [+0.02 s/s]; s3-rr-a: 1m 14s [+0.00 s/s]; s3-user: 1m 14s [+0.00 s/s] [18:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [19:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [19:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [19:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [19:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [19:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [19:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [19:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [19:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [19:34:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [19:38:34] Free Memory on damiana is WARNING: WARNING - 6.1% (514484 kB) free! [19:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [19:51:33] Free Memory on damiana is OK: OK - 7.2% (602864 kB) free. [19:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [20:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [20:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [20:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [20:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [20:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [20:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [20:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [20:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [20:34:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [20:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [20:46:50] @replag [20:46:51] scfc_de`: s1-rr-a: 22m 27s [+0.03 s/s]; s1-user: 22m 27s [+0.03 s/s]; s3-rr-a: 11s [-0.01 s/s]; s3-user: 11s [-0.01 s/s] [20:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [21:05:33] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1870.000000 [21:05:44] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1867 [21:08:34] Free Memory on damiana is WARNING: WARNING - 6.7% (559032 kB) free! [21:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [21:18:03] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 102812 MB (10% inode=99%): [21:21:33] Free Memory on damiana is OK: OK - 7.1% (597700 kB) free. 
[21:23:43] MySQL slave on rosemary is OK: Uptime: 15746780 Threads: 8 Questions: 6982238186 Slow queries: 2331181 Opens: 344055 Flush tables: 6 Open tables: 4136 Queries per second avg: 443.407 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1773 [21:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [21:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [21:24:34] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1755.000000 [21:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [21:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [21:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [21:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [21:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [21:33:34] Free Memory on damiana is WARNING: WARNING - 6.7% (560424 kB) free! [21:34:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [21:34:33] Free Memory on damiana is OK: OK - 7.3% (609596 kB) free. 
[21:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [21:54:34] Free Memory on damiana is WARNING: WARNING - 6.0% (504480 kB) free! [21:57:08] @replag [21:57:08] Merlissimo: s1-rr-a: 18m 15s [-0.06 s/s]; s1-user: 18m 15s [-0.06 s/s]; s3-rr-a: 24s [+0.00 s/s]; s3-user: 24s [+0.00 s/s]; s4-rr-a: 7m 6s [+0.00 s/s]; s4-user: 7m 6s [+0.00 s/s] [21:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [22:02:02] Merlissimo, can you make mayapple a submit host? [22:03:00] why? [22:04:51] i don't know what the plan with this host is. but submit is the last option i would guess [22:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [22:13:34] Free Memory on damiana is OK: OK - 7.2% (603148 kB) free. [22:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [22:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [22:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [22:25:03] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [22:25:44] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[22:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [22:28:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [22:33:25] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [22:34:03] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [22:37:33] Free Memory on damiana is WARNING: WARNING - 5.5% (459152 kB) free! [22:38:17] Merlissimo, I mean "make qsub work there" [22:38:34] not to replace clematis/hawthorn pair [22:40:26] Platonides: the question is: why should somebody use qsub there? no cron, other input possibilities like webscripts, not declared as login server. But that's DaB's decision. [22:41:17] IMHO you should be able to submit jobs from anywhere [22:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [22:42:21] seems to be for testing right now [22:42:24] I convinced Mark to give it to us so I can install debian on it and test apache [22:42:49] although it'd probably work quite well as login if things worked [22:48:14] DaBPunkt told me that it is not reliable. So perhaps it should not be promoted to a login server. if it's a sge execution server only it can go without any problem any time. people are always crying if a login server fails. [22:48:45] i'll ask DaB. about his plans. 
[22:52:23] night ts [22:55:04] Platonides: can you tell me any use case how you would like to submit jobs? [22:58:17] * Merlissimo has run a program on mayapple which shows a file in the default editor which is joe and now he does not know how to exit [22:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [22:59:21] ah, found [23:11:44] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [23:13:02] Merlissimo, Ctrl-C quits you from joe... [23:13:30] ctrl-k-z worked for me [23:13:38] currently, the most appealing feature of mayapple is its 99.5% idle time [23:13:41] but i never used joe before [23:14:17] Platonides: other cpus are idle most of the time, too [23:14:19] Ctrl-K, H shows you the help [23:14:33] which is quite helpful [23:14:40] it is hinted at the top right [23:16:26] night [23:16:50] either vi or nano. anything in between is an unnecessary mix [23:17:28] I like joe, but if choosing between vi and nano for a default editor, nano [23:17:40] it shows you the hint to exit on the screen [23:18:06] trying to exit vi when you don't know the incantation can be exasperating [23:18:15] :q! [23:18:40] no ctrl ;-) [23:18:44] which is completely obvious :P [23:18:58] Ctrl-C closes most programs [23:19:34] Free Memory on damiana is OK: OK - 7.2% (600776 kB) free. [23:20:14] ctrl-c normally does not close a program, it simply kills the process. 
[23:21:21] lets you exit from it [23:21:42] vi traps you and doesn't want to leave the screen [23:22:06] editor wars :P [23:24:04] SRaid on nightshade is CRITICAL: NRPE: Unable to read output [23:24:24] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [23:24:44] MySQL on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [23:25:04] MySQL slave on z-dat-s4-a is CRITICAL: Cant connect to MySQL server on z-dat-s4-a (146) [23:25:43] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:25:54] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [23:28:44] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [23:33:24] DiskSuite on damiana is CRITICAL: CRITICAL - submirror d52 of mirror d50 is Needs and submirror d32 of mirror d30 is Needs and submirror d12 of mirror d10 is Needs and submirror d22 of mirror d20 is Needs [23:34:04] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat [23:42:14] CAM on hemlock is WARNING: WARNING - Storage ts-array5 (2 warnings): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.A:S3: 42:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_BATTERY_NEAR_EXPIRATION.description:S17:Tray.85.Battery.B:S3: 42: [23:58:44] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge62/bin/sol-amd64/qstat