[00:01:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1295748.000000
[00:02:35] Load avg. on willow is WARNING: WARNING - load average: 25.70, 21.25, 16.92
[00:02:35] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.371582/1.75, alarm hl:np_load_avg=2.666992/2.0, alarm hl:mem_free=130.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.371582/2.1, alarm hl:np_load_long=2.112305/2.5, alarm hl:cpu=94.100000/99, alarm hl:mem_free=130.000000M/200M, alarm hl:available=1/0
[00:03:45] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474767 MB (8% inode=44%):
[00:23:35] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[00:26:55] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2096634
[00:29:35] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK
[00:29:55] (commented) [DBQ-183] SQL Query Issue <https://jira.toolserver.org/browse/DBQ-183> (Hoo man)
[00:32:37] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.289062/1.75, alarm hl:np_load_avg=2.206543/2.0, alarm hl:mem_free=137.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.289062/2.1, alarm hl:np_load_long=2.204590/2.5, alarm hl:cpu=94.600000/99, alarm hl:mem_free=137.000000M/200M, alarm hl:available=1/0
[00:48:36] Load avg. on willow is CRITICAL: CRITICAL - load average: 34.39, 22.75, 19.24
[00:53:37] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.607422/1.10, alarm hl:np_load_long=0.763672/1.55, alarm hl:mem_free=19821.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.607422/1.00, alarm hl:np_load_long=0.763672/1.50, alarm hl:mem_free=19821.000000M/350M, alarm hl:available=1/0
[00:55:35] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[00:55:35] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[00:55:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[00:56:35] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[01:01:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1299352.000000
[01:04:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 476043 MB (8% inode=44%):
[01:24:36] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[01:26:55] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2065212
[01:33:36] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.224609/1.75, alarm hl:np_load_avg=2.714844/2.0, alarm hl:mem_free=169.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.224609/2.1, alarm hl:np_load_long=3.031738/2.5, alarm hl:cpu=96.000000/99, alarm hl:mem_free=169.000000M/200M, alarm hl:available=1/0
[01:49:36] Load avg. on willow is CRITICAL: CRITICAL - load average: 21.76, 21.05, 21.59
[01:55:35] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[01:55:44] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[01:56:36] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[01:57:35] Load avg. on willow is WARNING: WARNING - load average: 15.37, 17.81, 19.94
[02:00:35] Load avg. on willow is CRITICAL: CRITICAL - load average: 29.53, 20.95, 20.67
[02:01:56] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1302951.000000
[02:03:36] Load avg. on willow is WARNING: WARNING - load average: 16.51, 18.70, 19.82
[02:04:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 476012 MB (8% inode=44%):
[02:04:48] @replag
[02:04:50] Dispenser: s2-user: 1m 22s [+0.00 s/s]; s3-rr-a: 1m 43s [+0.00 s/s]; s3-user: 1m 43s [+0.00 s/s]
[02:13:34] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.054688/1.00, alarm hl:np_load_long=0.747070/1.50, alarm hl:mem_free=19729.000000M/350M, alarm hl:available=1/0
[02:14:36] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[02:23:37] Load avg. on willow is CRITICAL: CRITICAL - load average: 15.67, 19.04, 20.16
[02:24:37] Load avg. on willow is WARNING: WARNING - load average: 15.81, 18.48, 19.89
[02:24:37] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[02:25:35] Load avg. on willow is CRITICAL: CRITICAL - load average: 21.37, 19.38, 20.10
[02:26:56] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2032990
[02:34:36] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.340332/1.75, alarm hl:np_load_avg=2.520020/2.0, alarm hl:mem_free=278.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.340332/2.1, alarm hl:np_load_long=2.513184/2.5, alarm hl:cpu=95.100000/99, alarm hl:mem_free=278.000000M/200M, alarm hl:available=1/0
[02:45:36] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.389649/1.10, alarm hl:np_load_long=0.779297/1.55, alarm hl:mem_free=19994.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.389649/1.00, alarm hl:np_load_long=0.779297/1.50, alarm hl:mem_free=19994.000000M/350M, alarm hl:available=1/0
[02:46:36] Load avg. on willow is CRITICAL: CRITICAL - load average: 20.70, 21.34, 21.34
[02:46:36] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[02:55:44] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[02:55:55] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[02:56:44] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[03:02:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1306611.000000
[03:04:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 475566 MB (8% inode=44%):
[03:10:44] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.168945/1.10, alarm hl:np_load_long=0.811523/1.55, alarm hl:mem_free=20239.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.168945/1.00, alarm hl:np_load_long=0.811523/1.50, alarm hl:mem_free=20239.000000M/350M, alarm hl:available=1/0
[03:14:46] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[03:24:45] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[03:26:53] (commented) [UTRS-98] Account creation is impossible <https://jira.toolserver.org/browse/UTRS-98> (Ben Kurtovic)
[03:27:04] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2008128
[03:34:47] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.055664/1.75, alarm hl:np_load_avg=3.156250/2.0, alarm hl:mem_free=313.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.055664/2.1, alarm hl:np_load_long=3.110840/2.5, alarm hl:cpu=99.300000/99, alarm hl:mem_free=313.000000M/200M, alarm hl:available=1/0
[03:46:48] Load avg. on willow is CRITICAL: CRITICAL - load average: 19.34, 22.05, 23.07
[03:55:46] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[03:55:56] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[03:56:49] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[04:03:04] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1310221.000000
[04:03:47] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.223633/1.10, alarm hl:np_load_long=0.835938/1.55, alarm hl:mem_free=19259.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.223633/1.00, alarm hl:np_load_long=0.835938/1.50, alarm hl:mem_free=19259.000000M/350M, alarm hl:available=1/0
[04:04:47] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474986 MB (8% inode=44%):
[04:14:44] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[04:16:14] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[04:17:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.838379/1.75, alarm hl:np_load_avg=2.997070/2.0, alarm hl:mem_free=64.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.838379/2.1, alarm hl:np_load_long=2.914551/2.5, alarm hl:cpu=99.900000/99, alarm hl:mem_free=64.000000M/200M, alarm hl:available=1/0
[04:24:47] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[04:27:05] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1996754
[04:31:52] (created) [UTRS-100] Confirm close; UTRS; Minor Improvement <https://jira.toolserver.org/browse/UTRS-100> (Ryan (Rjd0060))
[04:44:52] (commented) [UTRS-100] Confirm close <https://jira.toolserver.org/browse/UTRS-100> (Ryan (Rjd0060))
[04:46:53] Load avg. on willow is CRITICAL: CRITICAL - load average: 22.21, 22.91, 23.36
[04:55:51] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[04:56:31] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[04:56:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[05:03:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1313832.000000
[05:04:53] (commented) [UTRS-100] Confirm close <https://jira.toolserver.org/browse/UTRS-100> (Ben Kurtovic)
[05:04:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474270 MB (8% inode=44%):
[05:16:21] Load avg. on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[05:16:22] SMF on willow is UNKNOWN: NRPE: Call to fork() failed
[05:16:22] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:17:11] Load avg. on willow is CRITICAL: CRITICAL - load average: 17.98, 21.41, 22.09
[05:17:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[05:17:22] /tmp on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[05:17:22] / on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[05:18:23] Cluster on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:18:23] SMF on willow is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[05:18:24] Environment IPMI on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:18:25] /tmp on willow is OK: DISK OK - free space: / 30872 MB (29% inode=99%):
[05:18:31] / on willow is OK: DISK OK - free space: / 30872 MB (29% inode=99%):
[05:19:10] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=5.914062/1.75, alarm hl:np_load_avg=3.456543/2.0, alarm hl:mem_free=34.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=5.914062/2.1, alarm hl:np_load_long=3.017578/2.5, alarm hl:cpu=91.200000/99, alarm hl:mem_free=34.000000M/200M, alarm hl:available=1/0
[05:19:51] Cluster on willow is OK: CLUSTER OK !
[05:19:51] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.344726/1.10, alarm hl:np_load_long=0.876953/1.55, alarm hl:mem_free=18775.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.344726/1.00, alarm hl:np_load_long=0.876953/1.50, alarm hl:mem_free=18775.000000M/350M, alarm hl:available=1/0
[05:21:09] Environment IPMI on willow is OK: ok: temperature ok fan ok voltage ok chassis ok
[05:23:51] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[05:24:51] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[05:27:22] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1990646
[05:31:51] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.073242/1.00, alarm hl:np_load_long=0.893555/1.50, alarm hl:mem_free=19798.000000M/350M, alarm hl:available=1/0
[05:46:00] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[05:56:31] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[05:57:01] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[06:03:21] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1317434.000000
[06:05:50] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474086 MB (8% inode=44%):
[06:14:20] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[06:17:50] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.14, 24.10, 24.15
[06:20:01] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.212402/1.75, alarm hl:np_load_avg=2.766113/2.0, alarm hl:mem_free=256.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.212402/2.1, alarm hl:np_load_long=2.928223/2.5, alarm hl:cpu=97.900000/99, alarm hl:mem_free=256.000000M/200M, alarm hl:available=1/0
[06:24:51] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[06:28:20] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1968940
[06:28:55] (assigned) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:29:01] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[06:33:57] (commented) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:34:01] (commented) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:38:54] (updated) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:44:20] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[06:46:01] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[06:52:52] (work started) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:54:55] (resolved) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Manish Goregaokar)
[06:56:52] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[06:57:01] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[07:04:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1321093.000000
[07:05:52] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474031 MB (8% inode=44%):
[07:18:50] Load avg. on willow is CRITICAL: CRITICAL - load average: 18.41, 21.27, 22.10
[07:20:59] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.051270/1.75, alarm hl:np_load_avg=2.804688/2.0, alarm hl:mem_free=296.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.051270/2.1, alarm hl:np_load_long=2.802246/2.5, alarm hl:cpu=95.800000/99, alarm hl:mem_free=296.000000M/200M, alarm hl:available=1/0
[07:25:50] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[07:29:19] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1950089
[07:47:00] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[07:57:50] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[07:58:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[08:05:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1324755.000000
[08:05:51] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 473059 MB (8% inode=44%):
[08:18:50] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.80, 24.53, 25.04
[08:21:00] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.585938/1.75, alarm hl:np_load_avg=2.922852/2.0, alarm hl:mem_free=142.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.585938/2.1, alarm hl:np_load_long=3.070312/2.5, alarm hl:cpu=96.400000/99, alarm hl:mem_free=142.000000M/200M, alarm hl:available=1/0
[08:25:50] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[08:29:20] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1931045
[08:34:55] (created) [UTRS-101] Periodic cleanup of open request awaiting user; UTRS; New Feature <https://jira.toolserver.org/browse/UTRS-101> (Martijn Hoekstra)
[08:44:51] Load avg. on willow is WARNING: WARNING - load average: 17.34, 18.41, 19.89
[08:47:00] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[08:49:55] (commented) [UTRS-99] dupe merging <https://jira.toolserver.org/browse/UTRS-99> (Martijn Hoekstra)
[08:53:55] (commented) [UTRS-98] Account creation is impossible <https://jira.toolserver.org/browse/UTRS-98> (Martijn Hoekstra)
[08:54:20] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:57:51] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[08:58:00] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[08:58:00] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[09:03:51] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[09:05:51] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 471461 MB (8% inode=44%):
[09:06:32] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1328420.000000
[09:14:51] Load avg. on willow is CRITICAL: CRITICAL - load average: 22.07, 21.34, 20.24
[09:16:51] Load avg. on willow is WARNING: WARNING - load average: 18.82, 20.33, 19.99
[09:17:51] Load avg. on willow is CRITICAL: CRITICAL - load average: 21.04, 20.56, 20.09
[09:21:00] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.538574/1.75, alarm hl:np_load_avg=2.561035/2.0, alarm hl:mem_free=132.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.538574/2.1, alarm hl:np_load_long=2.518066/2.5, alarm hl:cpu=96.800000/99, alarm hl:mem_free=132.000000M/200M, alarm hl:available=1/0
[09:26:00] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[09:29:31] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1895540
[09:47:09] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[09:54:00] Load avg. on willow is WARNING: WARNING - load average: 17.97, 19.46, 19.66
[09:55:50] Load avg. on willow is CRITICAL: CRITICAL - load average: 29.21, 22.88, 20.88
[09:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[09:58:01] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[09:58:01] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[09:59:50] Load avg. on willow is WARNING: WARNING - load average: 15.95, 19.29, 19.85
[10:00:51] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.41, 20.86, 20.37
[10:05:51] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 468393 MB (8% inode=44%):
[10:06:30] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1332021.000000
[10:20:49] Load avg. on willow is WARNING: WARNING - load average: 15.67, 16.31, 17.64
[10:21:11] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.935059/1.75, alarm hl:np_load_avg=2.032715/2.0, alarm hl:mem_free=654.000000M/350M, alarm hl:available=1/0
[10:26:00] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[10:30:30] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1867461
[10:47:11] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[10:47:11] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK
[10:49:20] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[10:49:56] hi
[10:50:12] I have a problem with etherpad (the wikimedia one). Can you help?
[10:52:11] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.628418/1.75, alarm hl:np_load_avg=1.740234/2.0, alarm hl:mem_free=287.000000M/350M, alarm hl:available=1/0
[10:54:00] Load avg. on willow is OK: OK - load average: 12.83, 13.64, 14.89
[10:54:31] Aktron: #wikimedia-tech would be more appropriate, afaik the toolserver has nothing to do with etherpad
[10:54:39] Snowolf: ok, thanks
[10:58:00] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[10:58:11] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[10:58:11] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[10:59:10] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK
[11:01:00] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.045899/1.00, alarm hl:np_load_long=0.634765/1.50, alarm hl:mem_free=18911.000000M/350M, alarm hl:available=1/0
[11:04:01] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[11:06:01] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 468238 MB (8% inode=44%):
[11:07:00] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK
[11:07:30] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1335685.000000
[11:14:01] Load avg. on willow is WARNING: WARNING - load average: 16.02, 15.71, 15.17
[11:17:01] Load avg. on willow is OK: OK - load average: 13.75, 14.84, 14.91
[11:21:00] Load avg. on willow is WARNING: WARNING - load average: 17.27, 15.89, 15.31
[11:21:10] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.119629/1.75, alarm hl:np_load_avg=1.985840/2.0, alarm hl:mem_free=504.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.119629/2.1, alarm hl:np_load_long=1.914551/2.5, alarm hl:cpu=93.600000/99, alarm hl:mem_free=504.000000M/200M, alarm hl:available=1/0
[11:26:10] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[11:30:31] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1831405
[11:47:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[11:55:21] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[11:55:41] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[11:55:51] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:55:52] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:55:52] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:55:52] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:55:52] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:56:21] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[11:56:51] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out)
[11:57:01] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out)
[11:57:01] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[11:57:01] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[11:57:29] SMF on hyacinth is OK: OK - all services online
[11:57:30] MySQL on z-dat-s6-a is OK: Uptime: 751016 Threads: 10 Questions: 188141442 Slow queries: 45447 Opens: 1835352 Flush tables: 2 Open tables: 1870 Queries per second avg: 250.515
[11:57:41] SMTP on z-dat-s4-a is OK: SMTP OK - 0.002 sec. response time
[11:57:42] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[11:57:42] MySQL slave on z-dat-s6-a is OK: Uptime: 751025 Threads: 11 Questions: 188143439 Slow queries: 45447 Opens: 1835352 Flush tables: 2 Open tables: 1870 Queries per second avg: 250.515 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 251
[11:57:42] MySQL on z-dat-s3-a is OK: Uptime: 3810446 Threads: 16 Questions: 4314212480 Slow queries: 222443 Opens: 31327025 Flush tables: 1 Open tables: 16384 Queries per second avg: 1132.206
[11:57:42] MySQL slave on z-dat-s3-a is OK: Uptime: 3810446 Threads: 16 Questions: 4314212480 Slow queries: 222443 Opens: 31327025 Flush tables: 1 Open tables: 16384 Queries per second avg: 1132.206 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 263
[11:57:42] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[11:57:42] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[11:57:43] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[11:57:51] SMF on z-dat-s4-a is OK: OK - all services online
[11:57:51] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[11:58:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[11:58:20] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[11:59:09] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[12:05:20] s5 replag on cassia is CRITICAL: (Service Check Timed Out)
[12:06:00] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 468051 MB (8% inode=44%):
[12:07:40] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1339295.000000
[12:20:31] MySQL on cassia is CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:20:31] MySQL slave on cassia is CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:20:41] s4 replag on cassia is CRITICAL: QUERY CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:21:00] Load avg. on willow is WARNING: WARNING - load average: 17.51, 17.02, 16.85
[12:21:10] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.278320/1.75, alarm hl:np_load_avg=2.141602/2.0, alarm hl:mem_free=384.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.278320/2.1, alarm hl:np_load_long=2.110840/2.5, alarm hl:cpu=95.500000/99, alarm hl:mem_free=384.000000M/200M, alarm hl:available=1/0
[12:26:09] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[12:30:31] MySQL on cassia is OK: Uptime: 2481729 Threads: 28 Questions: 2794500323 Slow queries: 364526 Opens: 4686873 Flush tables: 2 Open tables: 13366 Queries per second avg: 1126.29
[12:30:31] MySQL slave on cassia is OK: Uptime: 2481729 Threads: 29 Questions: 2794500354 Slow queries: 364526 Opens: 4686873 Flush tables: 2 Open tables: 13366 Queries per second avg: 1126.29 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[12:30:31] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1790893
[12:30:41] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 2.000000
[12:47:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[12:49:30] MySQL on cassia is CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:49:31] MySQL slave on cassia is CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:49:41] s4 replag on cassia is CRITICAL: QUERY CRITICAL: User nagios has exceeded the max_user_connections resource (current value: 15)
[12:54:51] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:55:20] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[12:56:30] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[12:56:30] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[12:56:42] MySQL slave on z-dat-s3-a is OK: Uptime: 3813981 Threads: 11 Questions: 4317607577 Slow queries: 222555 Opens: 31338152 Flush tables: 1 Open tables: 16384 Queries per second avg: 1132.47 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 231
[12:56:43] MySQL on z-dat-s3-a is OK: Uptime: 3813981 Threads: 11 Questions: 4317607580 Slow queries: 222555 Opens: 31338152 Flush tables: 1 Open tables: 16384 Queries per second avg: 1132.47
[12:56:43] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[12:56:51] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[12:58:20] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[12:58:30] MySQL on cassia is OK: Uptime: 2483409 Threads: 38 Questions: 2795927835 Slow queries: 364766 Opens: 4695361 Flush tables: 2 Open tables: 13328 Queries per second avg: 1125.842
[12:58:31] MySQL slave on cassia is OK: Uptime: 2483410 Threads: 38 Questions: 2795927926 Slow queries: 364766 Opens: 4695361 Flush tables: 2 Open tables: 13328 Queries per second avg: 1125.842 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[12:58:42] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 3.000000
[12:59:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[12:59:10] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[13:01:30] MySQL slave on cassia is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3585
[13:02:10] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3416.000000
[13:06:59] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 467920 MB (8% inode=44%):
[13:07:10] s5 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1352.000000
[13:07:41] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1342895.000000
[13:19:57] (closed) [ACC-245] password reset should link to http://en.wikipedia.org/wiki/Special:PasswordReset <https://jira.toolserver.org/browse/ACC-245> (Simon Walker)
[13:21:01] Load avg. on willow is WARNING: WARNING - load average: 20.50, 18.15, 17.43
[13:21:10] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.553711/1.75, alarm hl:np_load_avg=2.268066/2.0, alarm hl:mem_free=511.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.553711/2.1, alarm hl:np_load_long=2.178223/2.5, alarm hl:cpu=97.000000/99, alarm hl:mem_free=511.000000M/200M, alarm hl:available=1/0
[13:26:09] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[13:31:30] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1770855
[13:47:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[13:54:09] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK
[13:58:09] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.092285/1.75, alarm hl:np_load_avg=2.017578/2.0, alarm hl:mem_free=337.000000M/350M, alarm hl:available=1/0
[13:59:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[13:59:10] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[13:59:20] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[14:07:03] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 467773 MB (8% inode=44%):
[14:07:41] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1346496.000000
[14:22:01] Load avg. on willow is WARNING: WARNING - load average: 16.62, 17.81, 17.27
[14:26:09] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default
[14:31:30] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1723252
[14:48:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[14:58:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.798340/1.75, alarm hl:np_load_avg=2.103516/2.0, alarm hl:mem_free=363.000000M/350M, alarm hl:available=1/0
[14:59:10] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk
[14:59:20] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[14:59:20] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates).
[15:03:56] (created) [ACCAPP-495] Wikipedia mapping tool in collaboration with Oxford University; Account Approval; New Account <https://jira.toolserver.org/browse/ACCAPP-495> (Gavin Baily)
[15:07:14] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 467640 MB (8% inode=44%):
[15:07:41] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1350097.000000
[15:22:10] Load avg.
on willow is WARNING: WARNING - load average: 22.22, 19.58, 18.19 [15:26:09] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [15:31:40] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1694286 [15:48:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:58:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.099609/1.75, alarm hl:np_load_avg=2.008789/2.0, alarm hl:mem_free=311.000000M/350M, alarm hl:available=1/0 [15:59:20] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [15:59:20] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:59:20] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). [16:07:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1353701.000000 [16:08:11] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 467483 MB (8% inode=44%): [16:12:52] 3(created) [TS-1355] Please reactivate my account (diebuche); Toolserver; Task <10https://jira.toolserver.org/browse/TS-1355> (Leo k) [16:22:11] Load avg. 
on willow is WARNING: WARNING - load average: 14.31, 15.65, 16.27 [16:23:11] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.377930/1.10, alarm hl:np_load_long=0.757812/1.55, alarm hl:mem_free=17830.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.377930/1.00, alarm hl:np_load_long=0.757812/1.50, alarm hl:mem_free=17830.000000M/350M, alarm hl:available=1/0 [16:23:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [16:24:11] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [16:26:11] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [16:26:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.435547/1.75, alarm hl:np_load_avg=2.157227/2.0, alarm hl:mem_free=245.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.435547/2.3, alarm hl:np_load_long=2.091309/2.5, alarm hl:cpu=98.000000/98, alarm hl:mem_free=245.000000M/200M, alarm hl:available=1/0 [16:31:51] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1660229 [16:36:10] Load avg. on willow is CRITICAL: CRITICAL - load average: 33.85, 22.62, 18.87 [16:49:30] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:58:11] Load avg. on willow is WARNING: WARNING - load average: 15.84, 17.46, 19.95 [16:59:22] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:59:22] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:59:22] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). 
[17:07:52] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1357307.000000 [17:08:12] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 466366 MB (8% inode=44%): [17:26:20] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [17:26:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.251953/1.75, alarm hl:np_load_avg=2.060547/2.0, alarm hl:mem_free=163.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.251953/2.3, alarm hl:np_load_long=2.091797/2.5, alarm hl:cpu=95.800000/98, alarm hl:mem_free=163.000000M/200M, alarm hl:available=1/0 [17:31:51] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1626897 [17:34:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [17:38:22] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.868164/1.75, alarm hl:np_load_avg=1.899414/2.0, alarm hl:mem_free=167.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.868164/2.3, alarm hl:np_load_long=1.995117/2.5, alarm hl:cpu=88.500000/98, alarm hl:mem_free=167.000000M/200M, alarm hl:available=1/0 [17:50:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [17:50:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:58:11] Load avg. on willow is WARNING: WARNING - load average: 15.77, 15.19, 15.29 [17:59:30] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:00:21] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:00:21] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). 
[18:08:51] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1360967.000000 [18:09:12] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 465910 MB (8% inode=44%): [18:12:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.101074/1.75, alarm hl:np_load_avg=2.064453/2.0, alarm hl:mem_free=153.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.101074/2.3, alarm hl:np_load_long=2.010254/2.5, alarm hl:cpu=93.400000/98, alarm hl:mem_free=153.000000M/200M, alarm hl:available=1/0 [18:14:21] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:22:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [18:26:11] Load avg. on willow is OK: OK - load average: 12.88, 13.85, 14.90 [18:26:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [18:32:50] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1580885 [18:33:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.567383/1.75, alarm hl:np_load_avg=1.672851/2.0, alarm hl:mem_free=334.000000M/350M, alarm hl:available=1/0 [18:37:20] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [18:46:12] Load avg. on willow is WARNING: WARNING - load average: 15.61, 14.75, 14.36 [18:50:10] Load avg. on willow is OK: OK - load average: 13.97, 14.86, 14.55 [18:50:20] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:58:10] Load avg. 
on willow is WARNING: WARNING - load average: 15.00, 15.07, 14.76 [18:59:30] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [19:00:21] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:00:21] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). [19:04:12] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [19:09:51] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1364628.000000 [19:10:11] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 465717 MB (8% inode=44%): [19:15:20] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.175293/1.75, alarm hl:np_load_avg=2.200195/2.0, alarm hl:mem_free=966.000000M/350M, alarm hl:available=1/0 [19:16:54] 3(updated) [DBQ-133] some data for the selected English articles <10https://jira.toolserver.org/browse/DBQ-133> (Nathan Marshall) [19:19:40] hello all [19:20:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:21:11] Load avg. on willow is WARNING: WARNING - load average: 13.47, 15.29, 15.86 [19:25:21] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.997070/1.75, alarm hl:np_load_avg=1.960449/2.0, alarm hl:mem_free=332.000000M/350M, alarm hl:available=1/0 [19:26:20] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [19:26:56] 3(created) [MNT-1230] Add rev_sha1 to the public view; Maintenance; Minor work <10https://jira.toolserver.org/browse/MNT-1230> (DaB.) [19:26:57] 3(work started) [MNT-1230] Add rev_sha1 to the public view <10https://jira.toolserver.org/browse/MNT-1230> (DaB.) 
[19:27:21] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:32:50] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1553636 [19:33:11] Load avg. on willow is OK: OK - load average: 12.64, 13.99, 15.00 [19:46:40] s4 replag on rosemary is CRITICAL: (Service Check Timed Out) [19:48:30] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 628.000000 [19:49:20] Load avg. on willow is WARNING: WARNING - load average: 16.33, 15.51, 14.98 [19:50:21] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:51:23] Load avg. on willow is OK: OK - load average: 14.14, 14.98, 14.84 [19:51:23] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.760742/1.75, alarm hl:np_load_avg=1.871582/2.0, alarm hl:mem_free=572.000000M/350M, alarm hl:available=1/0 [19:57:22] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:57:53] 3(commented) [MNT-1227] Re-Import of enwiki <10https://jira.toolserver.org/browse/MNT-1227> (DaB.) [19:59:30] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [19:59:53] 3(commented) [MNT-1227] Re-Import of enwiki <10https://jira.toolserver.org/browse/MNT-1227> (DaB.) [20:00:21] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:00:30] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). 
[20:01:53] 3(created) [TS-1356] Public key update needed; Toolserver; Critical Task <10https://jira.toolserver.org/browse/TS-1356> (Ahmed Medhat Mohamed) [20:02:22] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.916504/1.75, alarm hl:np_load_avg=1.787109/2.0, alarm hl:mem_free=133.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.916504/2.3, alarm hl:np_load_long=1.798828/2.5, alarm hl:cpu=94.700000/98, alarm hl:mem_free=133.000000M/200M, alarm hl:available=1/0 [20:03:56] 3(resolved) [MNT-1230] Add rev_sha1 to the public view <10https://jira.toolserver.org/browse/MNT-1230> (DaB.) [20:09:22] Load avg. on willow is WARNING: WARNING - load average: 15.66, 14.83, 14.56 [20:10:20] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 476386 MB (8% inode=44%): [20:10:40] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1368272.000000 [20:17:31] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [20:22:30] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.872559/1.75, alarm hl:np_load_avg=1.908203/2.0, alarm hl:mem_free=265.000000M/350M, alarm hl:available=1/0 [20:24:32] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [20:25:21] Load avg. 
on willow is OK: OK - load average: 13.00, 14.51, 14.93 [20:26:22] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [20:33:00] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1548013 [20:41:53] 3(assigned) [TS-1354] Change SSH public key on my toolserver account <10https://jira.toolserver.org/browse/TS-1354> (Vladimir Zapolskiy) [20:41:55] 3(commented) [TS-1354] Change SSH public key on my toolserver account <10https://jira.toolserver.org/browse/TS-1354> (Vladimir Zapolskiy) [20:43:54] 3(commented) [DBQ-181] Need to access deletion logs of articles by a specific user <10https://jira.toolserver.org/browse/DBQ-181> (Nathan Marshall) [20:43:56] 3(resolved) [TS-1354] Change SSH public key on my toolserver account <10https://jira.toolserver.org/browse/TS-1354> (Vladimir Zapolskiy) [20:47:31] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.675293/1.95, alarm hl:np_load_avg=1.683594/2.0, alarm hl:mem_free=211.000000M/350M, alarm hl:available=1/0 [20:50:31] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:53:30] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [20:56:30] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.685059/1.95, alarm hl:np_load_avg=1.642578/2.0, alarm hl:mem_free=160.000000M/350M, alarm hl:available=1/0 [20:59:40] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [21:00:30] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:00:30] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). [21:00:36] why is willow so slow in last days? [21:04:21] Load avg. 
on willow is WARNING: WARNING - load average: 17.07, 15.07, 14.16 [21:08:13] /tmp on wolfsbane is WARNING: DISK WARNING - free space: /tmp 726 MB (19% inode=99%): [21:10:30] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 476191 MB (8% inode=44%): [21:10:41] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1371874.000000 [21:11:12] /tmp on wolfsbane is OK: DISK OK - free space: /tmp 1050 MB (26% inode=99%): [21:26:29] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [21:33:59] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1523627 [21:50:38] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:56:37] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.998535/1.95, alarm hl:np_load_avg=2.078125/2.0, alarm hl:mem_free=187.000000M/350M, alarm hl:available=1/0 [22:00:08] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [22:00:38] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:00:38] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). [22:02:37] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:04:29] Load avg. 
on willow is WARNING: WARNING - load average: 14.39, 15.21, 16.39 [22:08:38] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.883789/1.95, alarm hl:np_load_avg=1.845703/2.0, alarm hl:mem_free=181.000000M/350M, alarm hl:available=1/0 [22:10:40] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 475134 MB (8% inode=44%): [22:11:38] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1375536.000000 [22:15:52] 3(commented) [DBQ-183] SQL Query Issue <10https://jira.toolserver.org/browse/DBQ-183> (Hoo man) [22:21:38] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:23:27] Load avg. on willow is OK: OK - load average: 14.29, 14.38, 14.99 [22:26:28] Load avg. on willow is WARNING: WARNING - load average: 16.84, 15.64, 15.38 [22:26:38] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.079590/1.95, alarm hl:np_load_avg=1.954590/2.0, alarm hl:mem_free=371.000000M/350M, alarm hl:available=1/0 [22:26:38] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [22:34:07] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1486491 [22:42:30] night, ts [22:50:48] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:00:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [23:00:47] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:00:47] APT on yarrow is CRITICAL: APT CRITICAL: 6 packages available for upgrade (6 critical updates). 
[23:03:47] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.818359/1.95, alarm hl:np_load_avg=1.860840/2.0, alarm hl:mem_free=187.000000M/350M, alarm hl:available=1/0 [23:04:20] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:08:57] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [23:10:37] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 474779 MB (8% inode=44%): [23:11:47] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1379143.000000 [23:26:37] Load avg. on willow is WARNING: WARNING - load average: 17.25, 16.34, 15.91 [23:26:47] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [23:30:40] good night [23:35:09] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1449197 [23:46:49] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [23:50:49] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:57:49] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.929688/1.95, alarm hl:np_load_avg=1.875488/2.0, alarm hl:mem_free=294.000000M/350M, alarm hl:available=1/0 [23:59:48] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK