[00:27:54] 3(resolved) [UTRS-63] Test interface <10https://jira.toolserver.org/browse/UTRS-63> (Hersfold) [00:33:12] Load avg. on willow is WARNING: WARNING - load average: 12.98, 15.96, 14.93 [00:34:12] Load avg. on willow is OK: OK - load average: 11.49, 14.99, 14.65 [00:41:52] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:42:52] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:42:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:50:32] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 33540 MB (3% inode=99%): [00:55:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:03:12] Load avg. on willow is WARNING: WARNING - load average: 13.54, 15.41, 14.45 [01:04:13] Load avg. on willow is OK: OK - load average: 13.38, 15.00, 14.37 [01:13:12] Load avg. on willow is WARNING: WARNING - load average: 17.68, 16.33, 15.11 [01:25:53] 3(work started) [UTRS-64] System/Tool notice <10https://jira.toolserver.org/browse/UTRS-64> (Hersfold) [01:39:03] MZMcBride * Re: [Toolserver-l] enwiki_p corrupt? [01:41:53] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:42:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:42:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:52:32] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.067383/1.00, alarm hl:np_load_long=0.812500/1.50, alarm hl:mem_free=16320.000000M/300M, alarm hl:available=1/0 [01:53:33] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [01:55:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:03:12] Load avg. on willow is WARNING: WARNING - load average: 14.27, 15.48, 14.37 [02:04:12] Load avg. 
on willow is OK: OK - load average: 12.97, 14.92, 14.25 [02:13:42] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.352539/1.10, alarm hl:np_load_long=0.924805/1.55, alarm hl:mem_free=15711.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.352539/1.00, alarm hl:np_load_long=0.924805/1.50, alarm hl:mem_free=15711.000000M/300M, alarm hl:available=1/0 [02:14:43] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [02:33:12] Load avg. on willow is WARNING: WARNING - load average: 13.93, 15.61, 14.93 [02:33:22] Sun Grid Engine execd on wolfsbane is WARNING: short@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.285156/1.10, alarm hl:np_load_long=0.276855/1.55, alarm hl:mem_free=130.000000M/300M, alarm hl:available=1/0: all.q@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.285156/1.00, alarm hl:np_load_long=0.276855/1.50, alarm hl:mem_free=130.000000M/300M, alarm hl:available=1/0 [02:34:22] Sun Grid Engine execd on wolfsbane is OK: short@wolfsbane OK: all.q@wolfsbane OK [02:41:53] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:42:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:42:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:44:43] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.405274/1.10, alarm hl:np_load_long=0.923828/1.55, alarm hl:mem_free=15464.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.405274/1.00, alarm hl:np_load_long=0.923828/1.50, alarm hl:mem_free=15464.000000M/300M, alarm hl:available=1/0 [02:53:44] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [02:55:53] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to 
read output [02:59:12] Load avg. on willow is OK: OK - load average: 11.61, 13.79, 14.88 [03:00:43] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.217774/1.10, alarm hl:np_load_long=1.020508/1.55, alarm hl:mem_free=15663.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.217774/1.00, alarm hl:np_load_long=1.020508/1.50, alarm hl:mem_free=15663.000000M/300M, alarm hl:available=1/0 [03:26:26] [[Special:Log/newusers]] create 10 * Nyseoinc * (New user account) [03:27:14] [[User:Nyseoinc]] !N 10https://wiki.toolserver.org/w/index.php?oldid=6819&rcid=8991 * Nyseoinc * (+1984) (Created page with "During Manhattan, isn't an easy task to be effective an organization correctly. If you can't contain a web-site to your online business, that you're in reality neglecting prospec...") [03:39:24] Sun Grid Engine execd on wolfsbane is WARNING: short@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.184082/1.10, alarm hl:np_load_long=0.263184/1.55, alarm hl:mem_free=247.000000M/300M, alarm hl:available=1/0: all.q@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.184082/1.00, alarm hl:np_load_long=0.263184/1.50, alarm hl:mem_free=247.000000M/300M, alarm hl:available=1/0 [03:42:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:43:03] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:43:03] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:44:53] 3(created) [DBQ-177] Items with OTRS permission confirmed on enwiki_p; Database Queries; Minor Task <10https://jira.toolserver.org/browse/DBQ-177> (madman) [03:48:23] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 36705 MB (9% inode=99%): [03:50:22] Sun Grid Engine execd on wolfsbane is OK: short@wolfsbane OK: all.q@wolfsbane OK [03:53:22] /sql on z-dat-s4-a is CRITICAL: DISK 
CRITICAL - free space: /sql 21960 MB (5% inode=99%): [03:55:22] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 47940 MB (11% inode=99%): [03:56:03] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:02:12] Load avg. on willow is WARNING: WARNING - load average: 18.91, 17.46, 14.89 [04:02:53] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.304688/1.10, alarm hl:np_load_long=0.957031/1.55, alarm hl:mem_free=16142.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.304688/1.00, alarm hl:np_load_long=0.957031/1.50, alarm hl:mem_free=16142.000000M/300M, alarm hl:available=1/0 [04:04:52] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [04:08:11] Load avg. on willow is OK: OK - load average: 12.52, 14.56, 14.39 [04:12:11] Load avg. on willow is WARNING: WARNING - load average: 15.82, 15.51, 14.81 [04:13:51] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.099609/1.00, alarm hl:np_load_long=0.990235/1.50, alarm hl:mem_free=15531.000000M/300M, alarm hl:available=1/0 [04:15:52] 3(work started) [DBQ-177] Items with OTRS permission confirmed on enwiki_p <10https://jira.toolserver.org/browse/DBQ-177> (madman) [04:42:07] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:43:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:43:08] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:43:47] Sun Grid Engine execd on wolfsbane is WARNING: short@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.391601/1.10, alarm hl:np_load_long=0.262695/1.55, alarm hl:mem_free=232.000000M/300M, alarm hl:available=1/0: all.q@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.391601/1.00, alarm hl:np_load_long=0.262695/1.50, alarm 
hl:mem_free=232.000000M/300M, alarm hl:available=1/0 [04:45:48] Sun Grid Engine execd on wolfsbane is OK: short@wolfsbane OK: all.q@wolfsbane OK [04:56:07] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [05:02:17] Load avg. on willow is WARNING: WARNING - load average: 15.01, 15.00, 13.97 [05:03:17] Load avg. on willow is OK: OK - load average: 12.79, 14.38, 13.82 [05:04:53] 3(resolved) [DBQ-177] Items with OTRS permission confirmed on enwiki_p <10https://jira.toolserver.org/browse/DBQ-177> (madman) [05:22:16] Load avg. on willow is WARNING: WARNING - load average: 13.53, 15.02, 14.39 [05:32:49] Sun Grid Engine execd on wolfsbane is WARNING: short@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.083008/1.10, alarm hl:np_load_long=0.147949/1.55, alarm hl:mem_free=218.000000M/300M, alarm hl:available=1/0: all.q@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.083008/1.00, alarm hl:np_load_long=0.147949/1.50, alarm hl:mem_free=218.000000M/300M, alarm hl:available=1/0 [05:33:49] Sun Grid Engine execd on wolfsbane is OK: short@wolfsbane OK: all.q@wolfsbane OK [05:38:49] Sun Grid Engine execd on wolfsbane is WARNING: short@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.102051/1.10, alarm hl:np_load_long=0.125488/1.55, alarm hl:mem_free=217.000000M/300M, alarm hl:available=1/0: all.q@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.102051/1.00, alarm hl:np_load_long=0.125488/1.50, alarm hl:mem_free=217.000000M/300M, alarm hl:available=1/0 [05:42:08] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:43:07] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:43:08] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:49:27] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 38403 MB (9% inode=99%): [05:51:27] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 66048 MB (16% 
inode=99%): [05:56:08] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:00:26] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37786 MB (9% inode=99%): [06:01:17] Load avg. on willow is WARNING: WARNING - load average: 21.55, 18.05, 16.61 [06:19:17] Load avg. on willow is OK: OK - load average: 11.65, 13.85, 14.97 [06:21:17] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 581818 MB (10% inode=50%): [06:24:17] Load avg. on willow is WARNING: WARNING - load average: 16.96, 14.88, 15.00 [06:34:49] [[~quentinv57/sulinfo]] !N 10https://wiki.toolserver.org/w/index.php?oldid=6820&rcid=8992 * Jayvdb * (+382) (start) [06:35:35] [[~luxo/contributions/contributions.php]] ! 10https://wiki.toolserver.org/w/index.php?diff=6821&oldid=6430&rcid=8993 * Jayvdb * (+29) (Use [[~quentinv57/sulinfo]]) [06:36:00] [[~luxo/contributions/contributions.php]] !M 10https://wiki.toolserver.org/w/index.php?diff=6822&oldid=6821&rcid=8994 * Jayvdb * (-29) (Undo revision 6821 by [[Special:Contributions/Jayvdb|Jayvdb]] ([[User talk:Jayvdb|talk]]) : wrong one) [06:36:39] [[~vvv/sulutil.php]] ! 10https://wiki.toolserver.org/w/index.php?diff=6823&oldid=3044&rcid=8995 * Jayvdb * (+0) (use 'Example' as example user) [06:37:09] [[~vvv/sulutil.php]] ! 10https://wiki.toolserver.org/w/index.php?diff=6824&oldid=6823&rcid=8996 * Jayvdb * (+65) (+This tool currently redirects to [[~quentinv57/sulinfo]]) [06:37:29] [[Category:Tools by quentinv57]] !N 10https://wiki.toolserver.org/w/index.php?oldid=6825&rcid=8997 * Jayvdb * (+30) (Created page with "[[Category: Tools by authors]]") [06:42:07] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:43:09] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:43:18] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:50:17] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 31.97, 19.37, 17.52 [06:51:16] Load avg. on willow is WARNING: WARNING - load average: 26.04, 20.08, 17.90 [06:56:17] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:06:17] Load avg. on willow is CRITICAL: CRITICAL - load average: 32.63, 23.02, 19.95 [07:21:17] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 579915 MB (10% inode=50%): [07:26:17] Load avg. on willow is WARNING: WARNING - load average: 15.94, 15.21, 16.54 [07:31:17] Load avg. on willow is CRITICAL: CRITICAL - load average: 31.80, 21.43, 18.64 [07:32:17] Load avg. on willow is WARNING: WARNING - load average: 28.43, 22.31, 19.13 [07:35:28] Load avg. on willow is CRITICAL: CRITICAL - load average: 30.55, 24.79, 20.66 [07:42:08] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:44:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:44:08] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:55:27] Load avg. on willow is WARNING: WARNING - load average: 22.54, 20.20, 19.59 [07:57:08] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:02:27] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.41, 21.11, 20.03 [08:17:27] Load avg. on willow is WARNING: WARNING - load average: 12.23, 18.51, 19.94 [08:21:27] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 578905 MB (10% inode=50%): [08:31:36] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:31:58] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:31:58] /sql on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:58] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:17] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:32:17] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:27] / on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:27] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [08:32:28] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:28] /sql on z-dat-s7-a is OK: DISK OK - free space: /sql 117765 MB (29% inode=99%): [08:32:28] /tmp on z-dat-s7-a is OK: DISK OK - free space: /tmp 3061 MB (99% inode=99%): [08:32:48] SMTP on z-dat-s3-a is OK: SMTP OK - 0.022 sec. response time [08:32:48] SMF on z-dat-s6-a is OK: OK - all services online [08:32:48] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [08:32:57] / on z-dat-s7-a is OK: DISK OK - free space: / 11673 MB (38% inode=87%): [08:32:58] SMF on z-dat-s3-a is OK: OK - all services online [08:42:07] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:43:57] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.080078/1.00, alarm hl:np_load_long=0.714844/1.50, alarm hl:mem_free=16362.000000M/300M, alarm hl:available=1/0 [08:44:07] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:44:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:44:57] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [08:57:17] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:17:27] Load avg. on willow is WARNING: WARNING - load average: 15.33, 15.80, 15.95 [09:21:27] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 568811 MB (10% inode=49%): [09:22:57] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:16] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:23:27] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:28] / on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:28] Load avg. on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:28] Load avg. on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:28] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:37] /tmp on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:37] / on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:37] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:37] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:37] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:37] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:38] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:23:38] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:48] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:48] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:23:48] SMTP on z-dat-s4-a is OK: SMTP OK - 2.916 sec. response time [09:23:48] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:23:57] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 207850 MB (21% inode=99%): [09:23:57] / on z-dat-s7-a is OK: DISK OK - free space: / 11673 MB (38% inode=87%): [09:23:58] Load avg. on z-dat-s7-a is OK: OK - load average: 1.69, 2.64, 3.36 [09:24:07] Load avg. 
on z-dat-s3-a is OK: OK - load average: 1.74, 2.65, 3.36 [09:24:07] SMF on z-dat-s3-a is OK: OK - all services online [09:24:08] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [09:24:08] /tmp on z-dat-s3-a is OK: DISK OK - free space: /tmp 3108 MB (99% inode=99%): [09:24:08] / on z-dat-s3-a is OK: DISK OK - free space: / 11673 MB (38% inode=87%): [09:24:08] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 207847 MB (21% inode=99%): [09:24:08] Load avg. on z-dat-s4-a is OK: OK - load average: 2.07, 2.70, 3.37 [09:24:17] MySQL on z-dat-s3-a is OK: Uptime: 777238 Threads: 18 Questions: 1012420908 Slow queries: 41434 Opens: 6585905 Flush tables: 1 Open tables: 16384 Queries per second avg: 1302.588 [09:24:17] SMF on z-dat-s7-a is OK: OK - all services online [09:24:17] SMF on z-dat-s4-a is OK: OK - all services online [09:24:27] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:24:27] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:24:27] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:24:27] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:42:27] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:44:28] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:44:28] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:57:28] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:17:28] Load avg. 
on willow is WARNING: WARNING - load average: 15.61, 17.02, 17.18 [10:21:27] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 554414 MB (10% inode=49%): [10:23:27] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.382812/1.10, alarm hl:np_load_long=0.774414/1.55, alarm hl:mem_free=16462.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.382812/1.00, alarm hl:np_load_long=0.774414/1.50, alarm hl:mem_free=16462.000000M/300M, alarm hl:available=1/0 [10:26:27] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [10:33:27] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.399414/1.10, alarm hl:np_load_long=0.920898/1.55, alarm hl:mem_free=16432.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.399414/1.00, alarm hl:np_load_long=0.920898/1.50, alarm hl:mem_free=16432.000000M/300M, alarm hl:available=1/0 [10:39:36] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 36940 MB (9% inode=99%): [10:42:28] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:43:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 50080 MB (12% inode=99%): [10:44:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:44:29] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:57:28] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:01:58] 3(created) [MNT-1217] Hard disk failure in SAN; Maintenance; Minor Minor work <10https://jira.toolserver.org/browse/MNT-1217> (Marlen Caemmerer) [11:17:29] Load avg. 
on willow is WARNING: WARNING - load average: 16.69, 17.50, 17.34 [11:17:37] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37260 MB (9% inode=99%): [11:20:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 64071 MB (15% inode=99%): [11:21:28] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 533974 MB (10% inode=48%): [11:30:28] Load avg. on willow is CRITICAL: CRITICAL - load average: 30.89, 19.96, 18.09 [11:31:28] Load avg. on willow is WARNING: WARNING - load average: 22.54, 19.61, 18.09 [11:38:05] [[User:Dab/Debian-Packages]] ! 10https://wiki.toolserver.org/w/index.php?diff=6826&oldid=6809&rcid=8998 * Junkie.dolphin * (+72) (/* Python */ adding common scientific packages for Python ) [11:38:26] [[User:Dab/Debian-Packages]] ! 10https://wiki.toolserver.org/w/index.php?diff=6827&oldid=6826&rcid=8999 * Junkie.dolphin * (+5) (/* General Tools */ Adding ZSH) [11:38:37] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37982 MB (9% inode=99%): [11:42:29] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:44:29] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:44:29] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:44:37] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 22550 MB (5% inode=99%): [11:45:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 48289 MB (11% inode=99%): [11:58:28] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:59:27] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1935.000000 [12:00:28] Load avg. on willow is CRITICAL: CRITICAL - load average: 33.84, 19.33, 17.15 [12:01:28] Load avg. 
on willow is WARNING: WARNING - load average: 24.35, 19.30, 17.29 [12:12:00] 3(commented) [MNT-1217] Hard disk failure in SAN <10https://jira.toolserver.org/browse/MNT-1217> (Marlen Caemmerer) [12:21:29] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 488092 MB (9% inode=45%): [12:23:37] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 36875 MB (9% inode=99%): [12:25:12] Hello all [12:26:09] DaBPunkt: Hi! [12:28:36] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 23326 MB (5% inode=99%): [12:30:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 44941 MB (11% inode=99%): [12:35:28] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3643.000000 [12:42:28] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:45:28] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:45:28] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:50:47] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 33479 MB (3% inode=99%): [12:58:29] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [13:01:29] Load avg. 
on willow is WARNING: WARNING - load average: 20.26, 18.46, 17.34 [13:04:28] /tmp on willow is WARNING: DISK WARNING - free space: /tmp 105 MB (20% inode=99%): [13:22:28] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 453946 MB (8% inode=44%): [13:35:29] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 6453.000000 [13:42:29] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:45:28] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:45:28] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:46:21] @replag all [13:46:21] nosy: s1-rr-a: 1s [-]; s1-rr-a-c: 1h 56m 27s [+0.42 s/s]; s1-user: 34s [+0.00 s/s]; s2-user: 9s [-0.00 s/s]; s2-user-c: 6s [+0.00 s/s]; s3-rr-a: 6s [-0.01 s/s]; s3-user: 6s [-0.01 s/s]; s4-rr-a: 6s [+0.00 s/s] [13:46:22] nosy: s4-user: 6s [-0.02 s/s]; s5-rr-a: 3s [+0.00 s/s]; s5-user: 3s [+0.00 s/s]; s5-user-c: 6s [+0.00 s/s]; s6-rr-a: 1s [-0.00 s/s]; s6-user: 1s [-0.00 s/s]; s7-rr-a: 1s [-0.00 s/s]; s7-user: 1s [-0.00 s/s] [13:49:37] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 39608 MB (9% inode=99%): [13:52:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 61868 MB (15% inode=99%): [13:56:53] 3(created) [MNT-1218] Stopped trainwreck on rosemary; Maintenance; Minor work <10https://jira.toolserver.org/browse/MNT-1218> (Marlen Caemmerer) [13:59:28] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:00:28] SMF on rosemary is CRITICAL: ERROR - maintenance: svc:/network/trainwreck:default [14:02:29] Load avg. 
on willow is WARNING: WARNING - load average: 14.70, 15.48, 15.84 [14:05:26] /tmp on willow is WARNING: DISK WARNING - free space: /tmp 63 MB (12% inode=99%): [14:17:26] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 55 MB (10% inode=99%): [14:19:26] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.046875/1.00, alarm hl:np_load_long=0.811523/1.50, alarm hl:mem_free=15818.000000M/300M, alarm hl:available=1/0 [14:19:26] /tmp on willow is WARNING: DISK WARNING - free space: /tmp 77 MB (15% inode=99%): [14:20:25] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [14:21:10] @replag all [14:21:10] nosy: s1-rr-a: 0s [-0.00 s/s]; s1-rr-a-c: 2h 30m 0s [+0.96 s/s]; s1-user: 0s [-0.02 s/s]; s2-user: 2s [-0.00 s/s]; s2-user-c: 1s [-0.00 s/s]; s3-rr-a: 1m 10s [+0.03 s/s]; s3-user: 1m 10s [+0.03 s/s]; s4-rr-a: 1s [-0.00 s/s] [14:21:11] nosy: s4-user: 1s [-0.00 s/s]; s5-rr-a: 3s [-]; s5-user: 3s [-]; s5-user-c: 1s [-0.00 s/s]; s6-rr-a: 1m 9s [+0.03 s/s]; s6-user: 1m 9s [+0.03 s/s]; s7-rr-a: 5s [+0.00 s/s]; s7-user: 5s [+0.00 s/s] [14:21:16] @replag all [14:21:17] nosy: s1-rr-a: 0s [-]; s1-rr-a-c: 2h 29m 56s [-0.62 s/s]; s1-user: 0s [-]; s2-user: 0s [-0.31 s/s]; s2-user-c: 0s [-0.15 s/s]; s3-rr-a: 1m 16s [+0.93 s/s]; s3-user: 1m 16s [+0.93 s/s]; s4-rr-a: 0s [-0.15 s/s] [14:21:18] nosy: s4-user: 0s [-0.15 s/s]; s5-rr-a: 3s [-]; s5-user: 3s [-]; s5-user-c: 0s [-0.15 s/s]; s6-rr-a: 1m 2s [-1.08 s/s]; s6-user: 1m 2s [-1.08 s/s]; s7-rr-a: 0s [-0.77 s/s]; s7-user: 0s [-0.77 s/s] [14:21:36] SMF on rosemary is OK: OK - all services online [14:22:25] Load avg. on willow is OK: OK - load average: 13.55, 14.27, 14.95 [14:22:35] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 412563 MB (7% inode=41%): [14:23:58] 3(resolved) [MNT-1218] Stopped trainwreck on rosemary <10https://jira.toolserver.org/browse/MNT-1218> (Marlen Caemmerer) [14:32:26] Load avg. 
on willow is WARNING: WARNING - load average: 14.88, 15.04, 14.88 [14:33:36] Load avg. on willow is OK: OK - load average: 13.16, 14.50, 14.70 [14:35:56] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5550.000000 [14:42:54] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3171.000000 [14:42:54] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:45:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:45:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:46:54] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1784.000000 [14:46:56] 3(closed) [OSM-8] Removed PostgreSQL log files <10https://jira.toolserver.org/browse/OSM-8> (Marlen Caemmerer) [14:48:54] 3(commented) [OSM-5] hstore operator ? no longer uses indexes <10https://jira.toolserver.org/browse/OSM-5> (Marlen Caemmerer) [14:48:55] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=2.994140/1.10, alarm hl:np_load_long=1.230469/1.55, alarm hl:mem_free=15685.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=2.994140/1.00, alarm hl:np_load_long=1.230469/1.50, alarm hl:mem_free=15685.000000M/300M, alarm hl:available=1/0 [14:50:54] 3(closed) [OSM-7] Add extra indexes to make tile rendering faster <10https://jira.toolserver.org/browse/OSM-7> (Marlen Caemmerer) [14:52:55] 3(commented) [OSM-9] Periodic "Connection refused" on tirex mod_tile socket <10https://jira.toolserver.org/browse/OSM-9> (Marlen Caemmerer) [14:54:56] 3(commented) [OSM-3] Ptolemy Postgres crashed twice <10https://jira.toolserver.org/browse/OSM-3> (Marlen Caemmerer) [14:56:56] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [14:59:56] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 55 MB (10% inode=99%): 
[14:59:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:07:55] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.056641/1.00, alarm hl:np_load_long=1.043945/1.50, alarm hl:mem_free=15740.000000M/300M, alarm hl:available=1/0 [15:18:55] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 38971 MB (9% inode=99%): [15:20:54] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 65103 MB (16% inode=99%): [15:22:55] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 404579 MB (7% inode=41%): [15:42:55] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:44:03] DaB. * [Toolserver-announce] Maintenance at Wednesday [15:45:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:46:55] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:52:55] Load avg. on willow is WARNING: WARNING - load average: 23.19, 17.88, 15.41 [15:59:55] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 17 MB (3% inode=99%): [15:59:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [16:12:54] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.235351/1.10, alarm hl:np_load_long=0.850586/1.55, alarm hl:mem_free=15624.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.235351/1.00, alarm hl:np_load_long=0.850586/1.50, alarm hl:mem_free=15624.000000M/300M, alarm hl:available=1/0 [16:22:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395524 MB (7% inode=40%): [16:23:54] Load avg. 
on willow is OK: OK - load average: 10.62, 12.70, 14.76 [16:29:54] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37198 MB (9% inode=99%): [16:31:55] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 67210 MB (16% inode=99%): [16:32:55] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK [16:42:55] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:45:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:47:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:00:03] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [17:00:55] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 0 MB (0% inode=99%): [17:01:25] HTTPS isn't responding anymore [17:03:11] Seems to firewall/router issue as I can access it locally [17:03:19] to be* [17:05:10] Ok, working again [17:13:33] Merlissimo: are you around ? [17:18:06] or is there anybody else who can help me about SGE ? [17:19:04] doh ! never mind, I found what I was looking for… ^^ [17:19:49] Toto_Azero: re (but no so much time) [17:20:42] Merlissimo: I was just asking myself if the new qsub choose by itself the queue [17:20:58] eg. I want to run a job in the short queue [17:22:06] so I has nothing else to do than list memory usage I need and a delay for the job, is that right ? [17:22:27] if you have a short runtime of few hours sge will choose the short queue if there is free cpu power and memory available [17:22:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 393104 MB (7% inode=40%): [17:22:58] you also have to add runtime limit (-l h_rt) [17:24:01] ok all right :) I believe if it's only a few minutes long, the short queue will mostly be the chosen one ? [17:25:41] yes, sge will choose a queue that gives your short job many cpu shares [17:26:41] ok great… :) and thanks a much for your work [17:30:42] [[Job scheduling]] ! 
https://wiki.toolserver.org/w/index.php?diff=6828&oldid=6815&rcid=9000 * Totoazero * (+0) (/* optional resources */ mistake ("ob" -> "of"))
[17:35:55] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=3.699219/1.10, alarm hl:np_load_long=1.448242/1.55, alarm hl:mem_free=14705.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=3.699219/1.00, alarm hl:np_load_long=1.448242/1.50, alarm hl:mem_free=14705.000000M/300M, alarm hl:available=1/0
[17:38:54] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK
[17:42:54] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[17:43:54] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.169922/1.10, alarm hl:np_load_long=1.254883/1.55, alarm hl:mem_free=14734.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.169922/1.00, alarm hl:np_load_long=1.254883/1.50, alarm hl:mem_free=14734.000000M/300M, alarm hl:available=1/0
[17:46:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[17:47:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[17:52:58] (commented) [OSM-5] hstore operator ?
no longer uses indexes <https://jira.toolserver.org/browse/OSM-5> (Kai Krueger)
[17:54:53] (commented) [OSM-3] Ptolemy Postgres crashed twice <https://jira.toolserver.org/browse/OSM-3> (Kai Krueger)
[18:00:04] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[18:01:59] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 8 MB (1% inode=99%):
[18:10:05] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 38232 MB (9% inode=99%):
[18:16:04] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 22825 MB (5% inode=99%):
[18:17:04] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 53417 MB (13% inode=99%):
[18:22:57] (commented) [OSM-3] Ptolemy Postgres crashed twice <https://jira.toolserver.org/browse/OSM-3> (Marlen Caemmerer)
[18:23:53] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 390776 MB (7% inode=40%):
[18:39:28] @replag all
[18:39:28] nosy: s1-rr-a: 1s [+0.00 s/s]; s1-rr-a-c: 1s [-0.58 s/s]; s1-user: 3s [+0.00 s/s]; s2-user: 5s [+0.00 s/s]; s2-user-c: 4s [+0.00 s/s]; s3-rr-a: 27s [-0.00 s/s]; s3-user: 27s [-0.00 s/s]; s4-rr-a: 1s [+0.00 s/s]
[18:39:29] nosy: s4-user: 1s [+0.00 s/s]; s5-rr-a: 4s [+0.00 s/s]; s5-user: 4s [+0.00 s/s]; s5-user-c: 4s [+0.00 s/s]; s6-rr-a: 1s [-0.00 s/s]; s6-user: 1s [-0.00 s/s]; s7-rr-a: 3s [+0.00 s/s]; s7-user: 3s [+0.00 s/s]
[18:42:54] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.009766/1.00, alarm hl:np_load_long=0.820312/1.50, alarm hl:mem_free=15731.000000M/300M, alarm hl:available=1/0
[18:43:04] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[18:43:52] (commented) [OSM-5] hstore operator ? no longer uses indexes <https://jira.toolserver.org/browse/OSM-5> (Marlen Caemmerer)
[18:44:00] (commented) [OSM-5] hstore operator ?
no longer uses indexes <https://jira.toolserver.org/browse/OSM-5> (Marlen Caemmerer)
[18:44:54] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK
[18:46:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[18:47:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[19:00:04] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[19:02:54] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 0 MB (0% inode=99%):
[19:23:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 388418 MB (7% inode=40%):
[19:43:14] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[19:44:54] Sun Grid Engine execd on ortelius is WARNING: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.082031/1.00, alarm hl:np_load_long=0.860352/1.50, alarm hl:mem_free=15951.000000M/300M, alarm hl:available=1/0
[19:45:54] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK
[19:46:15] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[19:47:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[19:52:31] I will reboot willow now, because /tmp is finally full
[19:53:19] now → in 15 minutes
[19:53:46] oh gosh :-/
[19:53:54] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[19:54:08] i won't be able to save everything in 15 mins :-/
[19:54:18] DaBPunkt: can you wait a bit more?
[19:54:19] Danny_B|backup: how much time do you need?
[19:54:49] max 30, but 15 seems not enough for my needs
[19:55:07] ok, I will wait for you.
Ping me, when you are ready
[19:55:18] thanks
[19:55:23] i appreciate it
[19:55:54] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[20:00:14] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[20:02:55] /tmp on willow is CRITICAL: DISK CRITICAL - free space: /tmp 0 MB (0% inode=99%):
[20:11:37] DaBPunkt: in two mins - i'm detaching the irssi now, so i won't be able to tell you exactly
[20:11:53] (created) [MNT-1219] Reboot willow because /tmp is full; Maintenance; Emergency work <https://jira.toolserver.org/browse/MNT-1219> (DaB.)
[20:11:53] Danny_B|backup: ok,
[20:15:10] DaBPunkt: cronie has some issues on willow on restart so please let me know when the machine is up, thank you
[20:15:33] Danny_B|webchat: will send you a ping
[20:15:57] thanks
[20:19:14] Sun Grid Engine execd on willow is CRITICAL: Connection refused by host
[20:20:37] Danny_B|webchat: four other users used irssi ;-)
[20:21:03] obviously
[20:21:19] mine for some odd reason does not save configuration :-/
[20:21:54] /tmp on willow is OK: DISK OK - free space: / 32491 MB (29% inode=99%):
[20:22:29] Danny_B|webchat: ok, we are back
[20:23:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 411031 MB (7% inode=41%):
[20:24:07] why couldn't we just simply delete some stuff on tmp?
[20:24:15] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK
[20:24:22] Danny_B|webchat: because there was nothing there
[20:24:31] ah
[20:24:44] I think it is a memory-leak in solaris
[20:24:49] and i wonder why cronie @reboot still does not work
[20:24:53] (updated) [MNT-1219] Reboot willow because /tmp is full <https://jira.toolserver.org/browse/MNT-1219> (DaB.)
[20:25:00] looking forward to being back on linux where cron worked
[20:25:14] yes, I am too
[20:27:03] :o)
[20:31:47] and they have a sane(r) implementation of malloc
[20:36:12] hello
[20:37:05] DaBPunkt: good you rebooted willow.
deleted a big file in tmp today that was owned by sgeadmin and merl said it was ok to do this but the space was not freed
[20:37:49] ah ok
[20:38:44] DaBPunkt: that was the log of the sqlslots load sensor
[20:39:11] is disabled the log. it run because of mapnik
[20:41:15] is it possible cronie starts before the nfs mount of /home is there?
[20:42:03] found a ticket with @reboot jobs not working because of this
[20:42:04] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:42:15] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:42:15] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:42:44] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[20:43:03] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:43:04] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:43:14] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[20:43:28] nosy: I have this feeling too, yes. But some user-processes were started
[20:43:51] hehe :)
[20:44:13] this is cronie?
[20:44:14] Jan 19 20:35:17 willow /opt/ts/sbin/crond[524]: [ID 529850 cron.info] (CRON) INFO (@reboot jobs will be run at computer's startup.)
[20:45:20] I think so
[20:45:36] (enough plus of debian: only 1(!) cron ;))
[20:46:06] really strange...i found another log entry
[20:46:08] Jan 19 20:44:03 willow /opt/ts/sbin/crond[431]: [ID 529850 cron.info] (CRON) INFO (@reboot jobs will be run at computer's startup.)
[20:46:35] yes this works on debian as 64bit architecture and licensing is straight - oh and updates :D
[20:47:15] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[20:47:15] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[20:47:19] Jan 19 20:43:54 willow bnx: [ID 517869 kern.info] NOTICE: Broadcom NetXtreme II Gigabit Ethernet Driver v6.0.3
[20:47:53] really...i cannot get the sense here:
[20:47:55] root@willow:~# svcs|grep cron
[20:47:55] online 20:21:37 svc:/system/cron:default
[20:47:55] online 20:21:37 svc:/system/cronie:default
[20:48:16] oh man...i can
[20:48:21] look at the date...
[20:50:18] there is the network some seconds before: Mar 9 20:21:30 willow bnx: [ID 219732 kern.info] NOTICE: bnx0: (v6.0.3) BCM5708 device with F/W Ver305000c is initialized (1 Fixed)
[21:17:54] Sun Grid Engine execd on ortelius is WARNING: short@ortelius exceedes load threshold: alarm hl:np_load_short=1.109375/1.10, alarm hl:np_load_long=0.965820/1.55, alarm hl:mem_free=15779.000000M/300M, alarm hl:available=1/0: all.q@ortelius exceedes load threshold: alarm hl:np_load_short=1.109375/1.00, alarm hl:np_load_long=0.965820/1.50, alarm hl:mem_free=15779.000000M/300M, alarm hl:available=1/0
[21:18:55] Sun Grid Engine execd on ortelius is OK: short@ortelius OK: all.q@ortelius OK
[21:23:04] apmon: still awake?
[21:23:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 416044 MB (7% inode=42%):
[21:32:56] (commented) [OSM-5] hstore operator ?
no longer uses indexes <https://jira.toolserver.org/browse/OSM-5> (Marlen Caemmerer)
[21:43:14] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37383 MB (9% inode=99%):
[21:43:14] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[21:46:14] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 58499 MB (14% inode=99%):
[21:48:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[21:48:14] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[21:54:15] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 35146 MB (8% inode=99%):
[22:08:57] (commented) [OSM-5] hstore operator ? no longer uses indexes <https://jira.toolserver.org/browse/OSM-5> (Kai Krueger)
[22:23:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 411979 MB (7% inode=41%):
[22:43:14] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[22:48:15] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[22:48:15] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[23:24:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 407108 MB (7% inode=41%):
[23:43:25] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[23:48:24] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[23:48:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
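
Editor's note on the 20:37 remark that deleting the sgeadmin-owned log in /tmp did not free any space: one common reason for this on POSIX systems is that the process writing the file still holds it open, so the name disappears but the data stays allocated until the last descriptor is closed. The log does not confirm that this was the cause on willow (a Solaris memory leak was also suspected in the channel), so the following is only a minimal sketch of the general effect. The function name `unlink_while_open` and the temp-file prefix are hypothetical and not taken from the log.

```python
import os
import tempfile


def unlink_while_open(payload: bytes) -> dict:
    """Delete a file that is still open and report what the kernel still holds."""
    # Hypothetical stand-in for a log file that a running process keeps open.
    fd, path = tempfile.mkstemp(prefix="demo-sensor-log-")
    try:
        os.write(fd, payload)
        os.unlink(path)  # same effect as running `rm` on the path
        info = os.fstat(fd)  # the open descriptor still refers to the file
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,
            "size_still_held": info.st_size,
            "data_still_readable": data == payload,
        }
    finally:
        # Only after the last descriptor is closed can the space be reclaimed.
        os.close(fd)


if __name__ == "__main__":
    print(unlink_while_open(b"x" * 4096))
```

Under this assumption, the space would come back once the writing process closes the file or exits, which a reboot also achieves.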