[00:00:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [00:03:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 90896.000000 [00:08:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [00:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36011.000000 [00:28:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:44:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 36312 [00:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 91884 [00:51:04] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 274199 MB (5% inode=30%): [01:00:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[01:03:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 91941.000000 [01:08:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [01:17:44] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36735.000000 [01:28:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:45:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 36832 [01:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 92464 [01:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 271937 MB (5% inode=30%): [02:00:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[02:03:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 92474.000000 [02:08:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [02:11:34] /sql on thyme is WARNING: DISK WARNING - free space: /sql 178727 MB (18% inode=99%): [02:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36763.000000 [02:28:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:45:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 36975 [02:50:14] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 92765 [02:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 269697 MB (5% inode=30%): [03:00:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [03:03:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 92712.000000 [03:09:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [03:13:34] Load avg. on willow is WARNING: WARNING - load average: 15.94, 16.18, 13.79 [03:16:33] Load avg. on willow is OK: OK - load average: 14.16, 14.82, 13.64 [03:17:42] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 37450.000000 [03:26:33] Load avg. 
on willow is WARNING: WARNING - load average: 14.19, 15.11, 14.46 [03:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:46:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 37530 [03:48:24] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106846 MB (10% inode=98%): [03:48:34] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106846 MB (10% inode=98%): [03:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 93459 [03:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 269592 MB (5% inode=30%): [03:52:53] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:53:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106806 MB (10% inode=98%): [03:59:33] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [04:00:34] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [04:03:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 93623.000000 [04:06:33] Load avg. on willow is WARNING: WARNING - load average: 19.42, 17.66, 14.60 [04:07:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.222656/1.95, alarm hl:tmp_free=48486M/100M, alarm hl:np_load_avg=2.177246/2.0, alarm hl:mem_free=727.000000M/350M, alarm hl:available=1/0 [04:09:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [04:11:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:11:53] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:12:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 105685 MB (10% inode=98%): [04:12:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 105685 MB (10% inode=98%): [04:15:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [04:17:33] Load avg. on willow is OK: OK - load average: 13.65, 14.90, 14.89 [04:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 37413.000000 [04:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:38:34] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 108522 MB (11% inode=98%): [04:39:24] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107193 MB (11% inode=98%): [04:42:23] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.281250/1.10, alarm hl:np_load_long=1.083984/1.55, alarm hl:mem_free=17314.000000M/500M, alarm hl:tmp_free=13431M/200M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.281250/1.00, alarm hl:np_load_long=1.083984/1.50, alarm hl:mem_free=17314.000000M/600M, alarm hl:tmp_free [04:46:24] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [04:46:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 37479 [04:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 94511 [04:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 269442 MB (5% inode=30%): [04:56:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106574 MB (10% inode=98%): [04:56:33] /sql on z-dat-s6-a is 
WARNING: DISK WARNING - free space: /sql 106556 MB (10% inode=98%): [04:57:24] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107151 MB (11% inode=98%): [04:57:34] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107129 MB (11% inode=98%): [04:59:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [05:00:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [05:03:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 94733.000000 [05:06:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.913086/1.95, alarm hl:tmp_free=48262M/100M, alarm hl:np_load_avg=2.036133/2.0, alarm hl:mem_free=773.000000M/350M, alarm hl:available=1/0 [05:06:33] Load avg. on willow is WARNING: WARNING - load average: 17.51, 16.71, 14.67 [05:06:59] @replag [05:06:59] Froggis: s1-rr-a: 10h 29m 1s [+0.13 s/s]; s1-user: 1d 2h 20m 14s [+0.20 s/s]; s3-rr-a: 11m 19s [+0.01 s/s]; s3-user: 11m 19s [+0.01 s/s] [05:07:18] @replag [05:07:19] Froggis: s1-rr-a: 10h 29m 7s [+0.31 s/s]; s1-user: 1d 2h 20m 23s [+0.47 s/s] [05:07:42] @replag [05:07:42] Froggis: s1-rr-a: 10h 29m 10s [+0.13 s/s]; s1-user: 1d 2h 20m 31s [+0.34 s/s]; s2-user: 31s [+0.00 s/s]; s3-rr-a: 29s [-15.20 s/s]; s3-user: 29s [-15.23 s/s] [05:08:07] @replag [05:08:08] Froggis: s1-rr-a: 10h 29m 13s [+0.12 s/s]; s1-user: 1d 2h 20m 38s [+0.27 s/s]; s3-rr-a: 55s [+1.01 s/s]; s3-user: 55s [+1.00 s/s] [05:08:15] @replag [05:08:15] Froggis: s1-rr-a: 10h 29m 10s [-0.39 s/s]; s1-user: 1d 2h 20m 41s [+0.39 s/s]; s2-user: 11s [-0.60 s/s]; s3-rr-a: 1m 3s [+1.05 s/s]; s3-user: 1m 3s [+1.06 s/s] [05:08:24] @replag [05:08:24] Froggis: s1-rr-a: 10h 29m 12s [+0.22 s/s]; s1-user: 1d 2h 20m 46s [+0.55 s/s]; s2-user: 20s [+0.99 s/s]; s3-rr-a: 1m 12s [+0.99 s/s]; s3-user: 1m 12s [+0.99 s/s]; s6-rr-a: 12s [-0.00 s/s]; s6-user: 12s [-0.00 s/s]; s7-rr-a: 15s [-0.00 s/s] 
[05:08:25] Hi Froggis. [05:08:26] Froggis: s7-user: 15s [-0.00 s/s] [05:08:28] hi [05:08:43] You've discovered the @replag command? [05:08:56] yes :D [05:09:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [05:09:36] I have browsed Wikipedia articles, beginning with Windows operating systems and ending at edit statistics, and from there came here to test @replag :) [05:12:25] http://toolserver.org/~bryan/stats/replag/ what is this thing? [05:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:19:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [05:23:43] /sql on cassia is CRITICAL: DISK CRITICAL - free space: /sql 61012 MB (5% inode=98%): [05:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 37996.000000 [05:29:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.879395/1.95, alarm hl:tmp_free=48214M/100M, alarm hl:np_load_avg=2.043945/2.0, alarm hl:mem_free=1124.000000M/350M, alarm hl:available=1/0 [05:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:30:34] Load avg. on willow is CRITICAL: CRITICAL - load average: 47.01, 23.21, 18.66 [05:39:33] Load avg. on willow is WARNING: WARNING - load average: 13.36, 19.77, 19.72 [05:46:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38092 [05:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 95764 [05:50:33] Load avg. 
on willow is OK: OK - load average: 9.79, 11.16, 14.93 [05:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 269262 MB (5% inode=30%): [06:00:33] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [06:01:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [06:03:23] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 95902.000000 [06:07:23] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 41600 MB (10% inode=99%): [06:09:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [06:17:33] Load avg. on willow is WARNING: WARNING - load average: 14.75, 15.98, 15.43 [06:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:18:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.888672/1.95, alarm hl:tmp_free=48060M/100M, alarm hl:np_load_avg=2.004883/2.0, alarm hl:mem_free=1025.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.888672/2.3, alarm hl:np_load_long=1.931152/2.5, alarm hl:cpu=98.900000/98, alarm hl:mem_free=1025.000000M/200M, [06:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:19:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106651 MB (10% inode=98%): [06:19:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106625 MB (10% inode=98%): [06:20:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [06:20:33] Load avg. 
on willow is OK: OK - load average: 12.21, 14.53, 15.00 [06:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 37993.000000 [06:29:04] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:33:23] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 58748 MB (14% inode=99%): [06:36:34] Load avg. on willow is WARNING: WARNING - load average: 15.58, 16.09, 15.72 [06:40:04] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:40:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106566 MB (10% inode=98%): [06:46:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38145 [06:47:24] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107002 MB (11% inode=98%): [06:47:33] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 106935 MB (11% inode=98%): [06:50:12] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 96859 [06:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 268901 MB (5% inode=30%): [07:00:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [07:01:34] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[07:03:34] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 96964.000000 [07:09:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/network/rpc-100235_1/rpc_ticotsord:default svc:/application/jira:default [07:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 38410.000000 [07:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:46:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38353 [07:46:33] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.223144/1.10, alarm hl:np_load_long=0.241699/1.55, alarm hl:mem_free=149.000000M/500M, alarm hl:tmp_free=13248M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.223144/1.00, alarm hl:np_load_long=0.241699/1.50, alarm hl:mem_free=149.000000M/600M, alarm hl:tmp_free= [07:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 97460 [07:50:33] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [07:51:03] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 268485 MB (5% inode=30%): [07:53:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 332298 MB (6% inode=35%): [07:55:03] /aux0 on hemlock is OK: DISK OK - free space: /aux0 609028 MB (11% inode=49%): [08:00:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [08:01:33] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[08:04:23] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 36743 MB (9% inode=99%): [08:04:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 97563.000000 [08:09:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [08:10:34] Load avg. on willow is WARNING: WARNING - load average: 18.40, 16.81, 15.11 [08:14:23] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 24430 MB (5% inode=99%): [08:15:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.468750/1.95, alarm hl:tmp_free=47636M/100M, alarm hl:np_load_avg=2.497070/2.0, alarm hl:mem_free=788.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.468750/2.3, alarm hl:np_load_long=2.084473/2.5, alarm hl:cpu=99.700000/98, alarm hl:mem_free=788.000000M/200M, al [08:15:34] Load avg. on willow is CRITICAL: CRITICAL - load average: 30.48, 21.58, 17.38 [08:16:33] Load avg. 
on willow is WARNING: WARNING - load average: 18.54, 19.86, 17.04 [08:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:22:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [08:23:23] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 67938 MB (16% inode=99%): [08:25:54] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 38395.000000 [08:27:13] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2027.000000 [08:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:46:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38490 [08:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 98077 [09:00:43] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:13] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:24] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:24] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:24] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:24] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:24] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:24] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:33] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:34] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:34] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[09:01:34] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:34] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:34] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:01:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [09:01:43] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:43] / on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:54] /sql on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:54] / on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:03] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:03] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:03] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:04] Load avg. on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:02:43] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [09:02:53] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107944 MB (11% inode=98%): [09:02:53] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 1643 MB (99% inode=99%): [09:02:53] / on z-dat-s6-a is OK: DISK OK - free space: / 8164 MB (27% inode=85%): [09:02:53] Load avg. 
on z-dat-s6-a is OK: OK - load average: 0.14, 0.64, 1.82 [09:02:53] MySQL slave on z-dat-s3-a is OK: Uptime: 1651729 Threads: 27 Questions: 1862011704 Slow queries: 160149 Opens: 19265526 Flush tables: 1 Open tables: 16384 Queries per second avg: 1127.310 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 563 [09:02:54] SMF on z-dat-s3-a is OK: OK - all services online [09:02:54] SMF on z-dat-s4-a is OK: OK - all services online [09:03:03] SMTP on z-dat-s3-a is OK: SMTP OK - 0.003 sec. response time [09:03:03] SMF on z-dat-s6-a is OK: OK - all services online [09:03:03] SMF on z-dat-s7-a is OK: OK - all services online [09:03:13] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [09:03:13] s4 replag on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [09:03:14] SMTP on z-dat-s7-a is OK: SMTP OK - 0.005 sec. response time [09:03:14] SMTP on z-dat-s6-a is OK: SMTP OK - 0.003 sec. response time [09:03:14] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:03:14] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:03:14] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:03:14] SMF on hyacinth is OK: OK - all services online [09:03:15] / on z-dat-s4-a is OK: DISK OK - free space: / 8164 MB (27% inode=85%): [09:03:23] MySQL on z-dat-s7-a is OK: Uptime: 3560877 Threads: 9 Questions: 1071775695 Slow queries: 156745 Opens: 8454669 Flush tables: 1 Open tables: 6642 Queries per second avg: 300.986 [09:03:23] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:03:24] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:03:24] s4 replag on z-dat-s4-a is OK: QUERY OK: SELECT ts_rc_age() returned 441.000000 [09:03:24] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 56382 MB (13% inode=99%): [09:03:24] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:03:34] /tmp on z-dat-s4-a is OK: DISK OK - free space: /tmp 1595 MB (99% 
inode=99%): [09:03:34] Load avg. on z-dat-s4-a is OK: OK - load average: 1.94, 1.02, 1.90 [09:05:34] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 98271.000000 [09:08:13] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3639.000000 [09:09:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [09:11:33] Load avg. on willow is WARNING: WARNING - load average: 14.46, 15.10, 13.78 [09:12:34] Load avg. on willow is OK: OK - load average: 13.80, 14.69, 13.71 [09:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:25:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 38427.000000 [09:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:46:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38570 [09:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 99287 [10:01:34] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [10:01:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[10:05:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 99588.000000 [10:08:13] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5184.000000 [10:10:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [10:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:18:30] hello all [10:18:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 38802.000000 [10:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:32:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106858 MB (10% inode=98%): [10:32:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106856 MB (10% inode=98%): [10:43:23] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107570 MB (11% inode=98%): [10:43:33] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107791 MB (11% inode=98%): [10:47:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38866 [10:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 100631 [11:02:34] APT on nightshade is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). [11:02:34] APT on yarrow is CRITICAL: APT CRITICAL: 2 packages available for upgrade (2 critical updates). 
[11:05:34] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 100776.000000 [11:08:13] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5948.000000 [11:09:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106451 MB (10% inode=98%): [11:10:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106451 MB (10% inode=98%): [11:10:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [11:10:34] APT on yarrow is OK: APT OK: 0 packages available for upgrade (0 critical updates). [11:11:34] APT on nightshade is OK: APT OK: 0 packages available for upgrade (0 critical updates). [11:17:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:17:44] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:17:54] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:18:53] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106299 MB (10% inode=98%): [11:18:53] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106299 MB (10% inode=98%): [11:18:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 38942.000000 [11:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:39:36] DabPunkt, is there any estimate at all of how long the s1 situation will continue? [11:40:16] russblau: between a week and two [11:40:29] OK, thanks [11:47:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38912 [11:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 101534 [12:05:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 101847.000000 [12:06:33] Load avg. 
on willow is WARNING: WARNING - load average: 15.61, 15.49, 12.92 [12:08:13] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7325.000000 [12:10:33] Load avg. on willow is CRITICAL: CRITICAL - load average: 31.02, 19.71, 15.05 [12:11:34] Load avg. on willow is WARNING: WARNING - load average: 22.11, 19.11, 15.12 [12:11:34] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [12:11:53] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:12:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:12:03] Sun Grid Engine execd on willow is WARNING: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.477051/2.3, alarm hl:np_load_long=1.896973/2.5, alarm hl:cpu=97.700000/98, alarm hl:mem_free=755.000000M/200M, alarm hl:tmp_free=46650M/100M, alarm hl:available=1/0: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.477051/1.95, alarm hl:tmp_free=46650M/100M, alarm hl:np_load_avg=2.358887/2.0, alarm h [12:13:24] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 105956 MB (10% inode=98%): [12:13:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 105955 MB (10% inode=98%): [12:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:18:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: longrun-sol@willow OK: medium-sol@willow OK [12:19:52] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39071.000000 [12:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:36:04] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:36:53] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:37:44] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106472 MB (10% inode=98%): [12:37:44] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106472 MB (10% inode=98%): [12:46:33] Load avg. on willow is WARNING: WARNING - load average: 18.53, 17.19, 15.70 [12:46:53] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:47:04] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:47:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39204 [12:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 102518 [12:52:13] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:23] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:24] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:24] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:24] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:33] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:33] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:34] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:34] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:34] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:34] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:52:43] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:43] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:44] / on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:44] / on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:52:53] /sql on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:53] / on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:52:53] / on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:03] /tmp on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:03] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:03] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:03] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:03] Load avg. on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:04] /sql on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:04] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:04] Load avg. on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:04] Load avg. on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:13] Environment IPMI on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:13] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:53:23] Load avg. on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:53:23] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:53:23] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:53:33] Load avg. 
on willow is OK: OK - load average: 12.29, 14.46, 14.99 [12:53:43] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [12:54:04] MySQL slave on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [12:54:04] s4 replag on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [12:54:13] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [12:54:13] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [12:54:13] MySQL slave on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [12:54:33] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [12:54:33] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [12:54:43] MySQL on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [12:56:04] NTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:57:53] Load avg. on hyacinth is OK: OK - load average: 0.01, 0.31, 1.24 [12:58:13] SMTP on z-dat-s6-a is OK: SMTP OK - 5.565 sec. response time [12:58:13] SMTP on z-dat-s7-a is OK: SMTP OK - 5.594 sec. response time [12:58:13] SMF on z-dat-s4-a is OK: OK - all services online [12:58:13] SMF on z-dat-s6-a is OK: OK - all services online [12:58:13] SMF on z-dat-s3-a is OK: OK - all services online [12:58:13] SMF on hyacinth is OK: OK - all services online [12:58:13] / on z-dat-s4-a is OK: DISK OK - free space: / 8163 MB (27% inode=85%): [12:58:14] / on z-dat-s3-a is OK: DISK OK - free space: / 8163 MB (27% inode=85%): [12:58:23] SMTP on hyacinth is OK: SMTP OK - 0.002 sec. 
response time [12:58:23] SMF on z-dat-s7-a is OK: OK - all services online [12:58:23] MySQL on z-dat-s4-a is OK: Uptime: 3596612 Threads: 2 Questions: 218485222 Slow queries: 39380 Opens: 66329 Flush tables: 1 Open tables: 1066 Queries per second avg: 60.747 [12:58:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106234 MB (10% inode=98%): [12:58:23] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 51176 MB (12% inode=99%): [12:58:24] / on z-dat-s6-a is OK: DISK OK - free space: / 8163 MB (27% inode=85%): [12:58:24] MySQL on z-dat-s6-a is OK: Uptime: 1617675 Threads: 7 Questions: 359353860 Slow queries: 145773 Opens: 3354535 Flush tables: 1 Open tables: 3073 Queries per second avg: 222.142 [12:58:24] MySQL slave on z-dat-s6-a is OK: Uptime: 1617675 Threads: 8 Questions: 359353866 Slow queries: 145773 Opens: 3354535 Flush tables: 1 Open tables: 3073 Queries per second avg: 222.142 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 769 [12:58:24] MySQL slave on z-dat-s7-a is OK: Uptime: 3574981 Threads: 4 Questions: 1077933402 Slow queries: 157334 Opens: 8478376 Flush tables: 1 Open tables: 6640 Queries per second avg: 301.521 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1036 [12:58:25] / on z-dat-s7-a is OK: DISK OK - free space: / 8163 MB (27% inode=85%): [12:58:25] MySQL slave on z-dat-s3-a is OK: Uptime: 1665865 Threads: 14 Questions: 1875074808 Slow queries: 161137 Opens: 19332262 Flush tables: 1 Open tables: 16384 Queries per second avg: 1125.586 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1001 [12:58:26] MySQL on z-dat-s7-a is OK: Uptime: 3574983 Threads: 4 Questions: 1077933673 Slow queries: 157334 Opens: 8478376 Flush tables: 1 Open tables: 6640 Queries per second avg: 301.521 [12:58:53] MySQL slave on z-dat-s4-a is OK: Uptime: 3596639 Threads: 6 Questions: 218489627 Slow queries: 39382 Opens: 66331 Flush tables: 1 Open tables: 1066 Queries per second avg: 60.748 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 619 
[12:58:53] s4 replag on z-dat-s4-a is OK: QUERY OK: SELECT ts_rc_age() returned 619.000000 [12:58:54] Environment IPMI on hyacinth is OK: ok: temperature ok fan ok voltage ok chassis ok [12:58:54] NTP on hyacinth is OK: NTP OK: Offset -0.005035 secs [12:59:02] SMTP on z-dat-s4-a is OK: SMTP OK - 0.031 sec. response time [12:59:03] SMTP on z-dat-s3-a is OK: SMTP OK - 0.130 sec. response time [12:59:13] qstat -f [12:59:14] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:59:14] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:59:24] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:59:24] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:05:53] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 102658.000000 [13:06:34] Load avg. on willow is WARNING: WARNING - load average: 11.18, 14.93, 15.57 [13:08:14] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7261.000000 [13:11:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [13:11:54] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:12:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:14:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106382 MB (10% inode=98%): [13:14:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106381 MB (10% inode=98%): [13:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:19:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39225.000000 [13:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:32:33] are there any performance/load monitoring tools on TS? 
[13:41:25] MaxSem: there is nagios (tsnag)
[13:41:29] tsnag:
[13:41:57] puppet is in use too, and ganglia I think, though I don't know the address of the ganglia instance
[13:42:22] MaxSem: Or do you mean individually for your tools?
[13:43:02] Krinkle, no, I need to see DB/appserver load
[13:45:11] MaxSem: munin.toolserver.org
[13:45:32] whee, thanks
[13:47:34] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39188
[13:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 103229
[14:05:53] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 103350.000000
[14:08:13] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4082.000000
[14:11:13] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3367.000000
[14:12:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[14:12:33] /sql on thyme is WARNING: DISK WARNING - free space: /sql 177346 MB (18% inode=99%):
[14:15:13] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1467.000000
[14:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default
[14:20:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[14:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39354.000000
[14:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[14:47:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39482
[14:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 103751
[15:05:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106480 MB (10% inode=98%):
[15:05:33] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql
106469 MB (10% inode=98%): [15:06:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 103882.000000 [15:12:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [15:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:19:23] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107446 MB (11% inode=98%): [15:19:33] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107446 MB (11% inode=98%): [15:20:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:26:53] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39897.000000 [15:29:03] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:48:33] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 40302 [15:50:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 104333 [16:06:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 104554.000000 [16:08:23] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 104022 MB (10% inode=98%): [16:08:34] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 104002 MB (10% inode=98%): [16:13:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [16:17:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:20:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:27:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 40612.000000 [16:27:20] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107122 MB (11% inode=98%): [16:27:40] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107043 MB (11% inode=98%): [16:29:10] SMF on damiana is CRITICAL: ERROR - offline: 
svc:/system/cluster/scsymon-srv:default [16:34:20] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 105740 MB (10% inode=98%): [16:34:39] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 105740 MB (10% inode=98%): [16:48:40] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 41049 [16:50:19] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 105068 [17:06:59] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105186.000000 [17:13:39] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [17:21:49] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:21:59] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:23:49] /sql on cassia is CRITICAL: DISK CRITICAL - free space: /sql 64407 MB (5% inode=99%): [17:27:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 41959.000000 [17:29:09] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:40:59] Load avg. on adenia is WARNING: WARNING - load average: 17.84, 16.14, 13.28 [17:42:59] Load avg. 
on adenia is OK: OK - load average: 11.53, 14.43, 13.00 [17:48:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 42292 [17:50:19] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 105616 [17:55:09] Sun Grid Engine execd on willow is WARNING: medium-sol@willow in orphaned state [18:00:20] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106916 MB (10% inode=98%): [18:00:40] /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106916 MB (10% inode=98%): [18:06:59] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105844.000000 [18:13:39] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [18:21:49] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:21:59] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:27:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43080.000000 [18:29:09] SMF on damiana is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:29:20] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:29:20] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107305 MB (11% inode=98%): [18:29:40] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:36:19] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106321 MB (10% inode=98%): [18:38:19] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107245 MB (11% inode=98%): [18:48:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 43457 [18:50:20] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 106285 [18:55:09] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:58:49] Getting 2-3 
of these every minute: Unable to run job: SGE maintance.
[18:58:50] "Output from "cron" command"
[18:58:52] is it possible to suppress those during mainTenance
[18:59:39] Or something I could change on my end?
[18:59:47] Krinkle: disable those cron jobs?
[19:00:15] Krinkle: yes, but in a few minutes you'll get a "command not found" error that cannot be hidden
[19:00:18] well I don't want to have to edit crontab files on various servers for 3 different projects every time someone touches buttons
[19:00:41] Merlissimo: Suppose I do this from submit.toolserver.org
[19:00:56] perhaps those can be disabled completely during SGE maintenance?
[19:01:14] ask DaBPunkt
[19:01:15] (assuming disabling cron on willow is not an option since other things should just continue to run during SGE maint)
[19:01:18] it would probably make sense to stop cron on submit.toolserver
[19:01:34] yeah, other crons are excused.
[19:02:16] any suggestions for a regex for converting {{{UNmembership|1976-12-01}}} to 1976-12-01 ?
[19:02:34] I want to run it in AWB
[19:02:43] disabled crontab and cronie on submit
[19:03:05] for people who use willow's cron: your own fault ;P
[19:03:15] DaBPunkt: fair nuff :)
[19:03:49] DaBPunkt: so to migrate to submit.ts.o, can I just paste the same line that I have on willow (and remove it on willow)
[19:03:51] e.g. stuff like `cronsub -l -s dbbot_wm $HOME/bots/dbbot-wm-start.sh`
[19:03:55] * * * ..
[19:04:11] or do I need something special on submit
[19:04:39] no. But you have to use cronie on submit
[19:05:03] * Krinkle reads [[tswiki:submit]]
[19:05:09] DaBPunkt: syntax compatible?
[19:05:16] should be
[19:05:21] k
[19:06:26] ToAruShiroiNeko: http://txt2re.com/
[19:06:59] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 106467.000000
[19:07:01] DaBPunkt: Hm.. ssh submit brings me to clematis?
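For the regex asked about at 19:02:16, here is a minimal sketch in Python. It assumes the date is always written as YYYY-MM-DD and that only the `UNmembership` parameter should be unwrapped; the function name is hypothetical. AWB uses .NET regular expressions, so the same pattern should carry over there with `$1` as the replacement, but that has not been checked here.

```python
import re

# Assumption: the value inside the braces is always an ISO-style date (YYYY-MM-DD).
# The literal braces and the pipe must be escaped because they are regex metacharacters.
UN_MEMBERSHIP = re.compile(r"\{\{\{UNmembership\|(\d{4}-\d{2}-\d{2})\}\}\}")


def strip_un_membership(text: str) -> str:
    """Replace every {{{UNmembership|DATE}}} with the bare DATE (hypothetical helper)."""
    return UN_MEMBERSHIP.sub(r"\1", text)


if __name__ == "__main__":
    print(strip_un_membership("{{{UNmembership|1976-12-01}}}"))
```

Text that does not match the pattern, such as other triple-brace parameters, is left untouched.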
[19:07:06] yes
[19:07:07] Krinkle: and you can convert to "qcronsub" if you are already working on it
[19:07:40] ok
[19:07:57] Merlissimo: What would I use instead of cronsub -l -s dbbot_wm $HOME/bots/dbbot-wm-start.sh ?
[19:08:20] It should check every minute if a task by that name is running and if not, start it in SGE, and put output in a single nice file
[19:08:59] Krinkle: what is your expected maximum runtime?
[19:09:37] irc bot, so "indefinite". basically until either I restart for deploying update or if ts reboots
[19:09:57] history tells me ~ 5 weeks average
[19:10:10] low activity php process. mostly idling.
[19:10:20] 1 query to sql.ts.org per day.
[19:11:15] qcronsub -b -N dbbot_wm -l h_rt=INFINITY -l virtual_free=50M $HOME/bots/dbbot-wm-start.sh
[19:11:27] (if you need 50M memory)
[19:11:41] https://wiki.toolserver.org/view/Job_scheduling
[19:13:01] Merlissimo: thx
[19:13:15] I'll measure the memory in a moment and set it accordingly with a margin
[19:14:39] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[19:16:39] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107012 MB (11% inode=98%):
[19:17:20] /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106808 MB (10% inode=98%):
[19:18:23] Merlissimo: cronie or cronietab? docs say cronietab, sample mentions cronie
[19:18:39] 	 cronie exists but that may be the executable instead of the tab
[19:19:32] 	 "cron" command changes the crontab; "cronie" command changes the cronietab
[19:20:20] 	 "cronie -e" should open the cronietab in your preferred editor
[19:20:50] 	 thx
[19:21:06] 	 on willow I always use `crontab -e`
[19:21:09] 	 is that wrong?
[19:21:49] 	 SMF on willow is CRITICAL: ERROR - maintenance:  svc:/network/puppetmasterd:default  
[19:22:00] 	 SMF on turnera is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[19:22:17] 	 [05.07.2012 21:04:39]  no. But you have to use cronie on submit
[19:22:55] 	 got the cronie scheduled on submit now. thx.
[19:23:39] 	 /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106801 MB (10% inode=98%):  
[19:23:42] 	 I'll gradually start moving more schedulers there, for now I'll see how this runs with dbbot-wm after SGE maintenance is over and make sure it works for me.
[19:23:51] 	 Thx for the help DaBPunkt Merlissimo 
[19:24:00] 	 np
[19:24:26] 	 btw, I noticed wikibot is no longer reporting here. known?bug?feature?
[19:24:33] * Krinkle  made an edit a few minutes ago
[19:24:50] 	 If one of the databases is already corrupt then why are we replicating the hash updates?
[19:26:39] 	 Load avg. on willow is WARNING: WARNING - load average: 18.88, 15.68, 10.81  
[19:26:49] 	 Load avg. on ortelius is CRITICAL: CRITICAL - load average: 96.15, 68.65, 34.13  
[19:27:19] 	 s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43714.000000  
[19:27:50] 	 Load avg. on ortelius is UNKNOWN: CHECK_NRPE: Error receiving data from daemon.  
[19:28:39] 	 Load avg. on willow is CRITICAL: CRITICAL - load average: 271.80, 81.52, 34.39  
[19:29:09] 	 SMF on damiana is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[19:29:40] 	 Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output  
[19:33:59] 	 Load avg. on wolfsbane is WARNING: WARNING - load average: 1.83, 15.76, 12.30  
[19:35:00] 	 Load avg. on wolfsbane is OK: OK - load average: 1.36, 13.12, 11.58  
[19:37:20] 	 Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output  
[19:49:39] 	 MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 43907  
[19:50:19] 	 MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 106639  
[19:55:09] 	 Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output  
[19:56:49] 	 Load avg. on ortelius is CRITICAL: CRITICAL - load average: 16.04, 28.36, 35.16  
[20:02:38] 	 Load avg. on willow is WARNING: WARNING - load average: 4.30, 7.71, 19.70  
[20:06:59] 	 s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 106792.000000  
[20:08:39] 	 Load avg. on willow is OK: OK - load average: 3.42, 4.59, 14.29  
[20:14:39] 	 SMF on web.amaranth is CRITICAL: ERROR - maintenance:  svc:/application/jira:default  
[20:21:49] 	 SMF on willow is CRITICAL: ERROR - maintenance:  svc:/network/puppetmasterd:default  
[20:21:59] 	 SMF on turnera is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[20:26:49] 	 Load avg. on ortelius is WARNING: WARNING - load average: 2.33, 6.91, 19.12  
[20:27:19] 	 /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107220 MB (11% inode=98%):  
[20:28:20] 	 s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 44673.000000  
[20:29:09] 	 SMF on damiana is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[20:29:39] 	 Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output  
[20:31:49] 	 Load avg. on ortelius is OK: OK - load average: 2.09, 3.97, 14.38  
[20:34:19] 	 /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106701 MB (10% inode=98%):  
[20:37:20] 	 Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output  
[20:49:40] 	 MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 45149  
[20:51:20] 	 MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 107284  
[20:55:09] 	 Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output  
[20:55:30] 	 my mailbox is filled up with SGE error mails
[21:01:50] 	 Load avg. on ortelius is WARNING: WARNING - load average: 25.46, 23.15, 14.82  
[21:02:49] 	 Load avg. on ortelius is CRITICAL: CRITICAL - load average: 33.32, 26.00, 16.38  
[21:03:41] 	 liangent: SGE maintenance, but cronie@submit is disabled
[21:04:57] 	 @replag
[21:04:58] 	 Earwig: s1-rr-a: 12h 36m 22s [+0.23 s/s]; s1-user: 1d 5h 49m 52s [+0.13 s/s]
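The bot above reports replag as days, hours, minutes and seconds, while the Nagios checks (`ts_rc_age()`, `Seconds Behind Master`) report raw seconds. The following is a small sketch for converting one into the other so the two can be compared; the function name and the output format are assumptions modelled on the bot's reply, not its actual code.

```python
def format_lag(seconds: float) -> str:
    """Render a lag in seconds as e.g. '1d 5h 50m 32s' (hypothetical helper)."""
    total = int(seconds)  # ts_rc_age() returns values like 107432.000000
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)

    parts = []
    if days:
        parts.append(f"{days}d")
    if days or hours:
        parts.append(f"{hours}h")
    if days or hours or minutes:
        parts.append(f"{minutes}m")
    parts.append(f"{secs}s")  # seconds are always shown
    return " ".join(parts)


if __name__ == "__main__":
    print(format_lag(107432.0))  # 1d 5h 50m 32s
    print(format_lag(16))        # 16s
```

For example, the 107432 seconds that `ts_rc_age()` returns on thyme at 21:07:08 come out as 1d 5h 50m 32s.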
[21:07:08] 	 s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 107432.000000  
[21:15:39] 	 SMF on web.amaranth is CRITICAL: ERROR - maintenance:  svc:/application/jira:default  
[21:19:50] 	 Load avg. on ortelius is WARNING: WARNING - load average: 3.37, 15.20, 19.48  
[21:21:49] 	 SMF on willow is CRITICAL: ERROR - maintenance:  svc:/network/puppetmasterd:default  
[21:21:59] 	 SMF on turnera is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[21:24:49] 	 Load avg. on ortelius is OK: OK - load average: 1.53, 6.78, 14.53  
[21:28:19] 	 s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 45412.000000  
[21:29:09] 	 SMF on damiana is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[21:30:40] 	 Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output  
[21:37:42] 	 Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output  
[21:41:31] 	 Merlissimo: are you sure about cronie@submit, i get e-mails from time to time
[21:42:32] 	 [21:02:43]  disabled crontab and cronie on submit
[21:49:39] 	 MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 45375  
[21:51:19] 	 MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 108260  
[21:55:10] 	 Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output  
[22:07:49] 	 s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108352.000000  
[22:15:40] 	 SMF on web.amaranth is CRITICAL: ERROR - maintenance:  svc:/application/jira:default  
[22:21:49] 	 SMF on willow is CRITICAL: ERROR - maintenance:  svc:/network/puppetmasterd:default  
[22:22:00] 	 SMF on turnera is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[22:22:47] 	 !replag
[22:25:06] 	 @replag
[22:25:06] 	 Earwig: s1-rr-a: 12h 39m 48s [+0.04 s/s]; s1-user: 1d 6h 8m 48s [+0.24 s/s]; s3-rr-a: 16s [-0.00 s/s]; s3-user: 16s [-0.00 s/s]; s6-rr-a: 11s [-0.00 s/s]; s6-user: 11s [-0.00 s/s]
[22:27:20] 	 /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 106993 MB (11% inode=98%):  
[22:27:40] 	 /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107182 MB (11% inode=98%):  
[22:28:19] 	 s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 45618.000000  
[22:29:10] 	 SMF on damiana is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[22:30:40] 	 Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output  
[22:34:20] 	 /sql on z-dat-s3-a is WARNING: DISK WARNING - free space: /sql 106545 MB (10% inode=98%):  
[22:34:39] 	 /sql on z-dat-s6-a is WARNING: DISK WARNING - free space: /sql 106542 MB (10% inode=98%):  
[22:37:19] 	 Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output  
[22:44:49] 	 Load avg. on ortelius is CRITICAL: CRITICAL - load average: 31.18, 26.39, 14.02  
[22:50:40] 	 MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 45764  
[22:51:20] 	 MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 108824  
[22:55:09] 	 Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output  
[22:58:49] 	 Load avg. on ortelius is WARNING: WARNING - load average: 8.46, 21.21, 20.00  
[23:01:31] 	 Does anyone know the IPv6 addresses of the Polish toolserver?
[23:03:20] 	 because 2001:41d0::/32, the webhost that hosts it, is globally blocked now
[23:03:48] 	 and to avoid the effects of that block you will probably have to name the subnet that should be given an exemption
[23:03:49] 	 Load avg. on ortelius is OK: OK - load average: 2.37, 9.17, 14.94  
[23:07:49] 	 s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108941.000000  
[23:10:14] 	 Jasper_Deng: may I ask why 79228162514264337593543950336 addresses were globally blocked?
[23:10:26] 	 DaBPunkt: webhost
[23:10:31] 	 has two open proxies on it
[23:11:18] 	 Jasper_Deng: 2001:41d0::/32 is a range, not a single host
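The address count quoted at 23:10:14 can be checked with Python's standard `ipaddress` module: a /32 in IPv6 leaves 128 - 32 = 96 free bits, so the range holds 2^96 addresses. A short sketch (the variable name is only illustrative):

```python
import ipaddress

# The globally blocked range mentioned in the discussion.
blocked = ipaddress.ip_network("2001:41d0::/32")

# An IPv6 address has 128 bits; a /32 prefix fixes 32 of them.
free_bits = blocked.max_prefixlen - blocked.prefixlen

print(free_bits)              # 96
print(blocked.num_addresses)  # 79228162514264337593543950336
print(blocked.num_addresses == 2 ** free_bits)  # True
```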
[23:11:26] 	 yeah
[23:11:35] 	 it's no different than other webhost blocks
[23:12:41] 	 is it normal to block the whole provider because one host has an open proxy?
[23:12:54] 	 yeah
[23:13:02] 	 b/c:
[23:13:24] 	 1.Some people actually choose to edit from their webhost addresses, so the whole webhost may be treated as an open proxy
[23:13:42] 	 2.Would you rather host a proxy@home or @a 100% reliable host?
[23:14:14] 	 3.If you edit from a webhost you're probably up to no good
[23:14:41] 	 Jasper_Deng: can you please define the word "webhost"?
[23:14:44] 	 (i.e. no collateral damage)
[23:14:59] 	 it seems we understand different things by it
[23:15:31] 	 webhost=a public service that hosts servers for others, normally webservers but sometimes dedicated servers as well
[23:15:40] 	 SMF on web.amaranth is CRITICAL: ERROR - maintenance:  svc:/application/jira:default  
[23:16:38] 	 Jasper_Deng: so if I rent a server in the Amazon cloud and run an open proxy on it, you will block ALL IP addresses of Amazon?
[23:16:48] 	 DaBPunkt: yeah
[23:16:52] 	 because other people also do
[23:16:59] 	 that's stupid
[23:17:09] 	 if anyone has legitimate business editing from a webhost
[23:17:15] 	 (i.e., a bot)
[23:17:17] 	 they can be given exemption
[23:17:36] 	 the /vast majority/ of users can use their home connection
[23:17:59] 	 this has been common practice for years now
[23:18:18] 	 luckily not in dewp
[23:18:45] 	 unfortunately for dewp, then, these large global blocks apply there too.
[23:19:25] 	 is there a list of these global blocks somewhere?
[23:19:49] 	 on any wiki, Special:GlobalBlockList
[23:20:03] 	 (though you will have a hard time finding IPv6 blocks)
[23:20:10] 	 (so I can request an exception for me and my hosting provider…)
[23:20:17] 	 you host a bot?
[23:20:33] 	 then go to http://meta.wikimedia.org/wiki/Steward_requests/Global_permissions
[23:20:33] 	 no
[23:20:45] 	 and request global IP block exemption
[23:21:12] 	 though I don't see a legitimate reason to use a webhost IP except to run a bot
[23:21:37] 	 Jasper_Deng: using a VPN-system?
[23:21:49] 	 SMF on willow is CRITICAL: ERROR - maintenance:  svc:/network/puppetmasterd:default  
[23:21:59] 	 DaBPunkt: that's also considered to be an OP
[23:21:59] 	 SMF on turnera is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[23:22:06] 	 (open proxy)
[23:22:27] 	 but it is neither open nor a proxy
[23:22:39] 	 vandals love VPNs
[23:23:02] 	 I doubt that a vandal will use my vpn-system
[23:23:16] 	 he'd want his own
[23:23:32] 	 he?
[23:23:37] 	 they'd*
[23:23:44] 	 they?
[23:23:47] 	 the vandals
[23:23:51] 	 would use their own VPN
[23:24:14] 	 and it's too much work to give an exemption to a very small part of the address space
[23:24:28] 	 so if you want to use your VPN to edit WP you should just
[23:24:36] 	 request the global IP block exemption flag
[23:25:18] 	 The idea behind a wiki is to make it easy to edit, so you should block as FEW addresses as possible, not block as much as possible and add exceptions afterwards
[23:26:01] 	 who uses webhosts to edit?
[23:26:18] 	 Probably less than 1% of users absolutely /have/ to use a webhost to edit
[23:26:33] 	 They can use their ISP addresses fine
[23:28:20] 	 s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 45724.000000  
[23:29:09] 	 SMF on damiana is CRITICAL: ERROR -  offline:  svc:/system/cluster/scsymon-srv:default  
[23:29:18] 	 I remember the time enwp blocked the German telecom because a vandal used its address range (~16 million addresses). Now that we have IPv6, I have the feeling that the people doing the blocking have gone totally crazy…
[23:29:55] 	 DaBPunkt: no block can affect that many addresses in IPv4, only in IPv6
[23:30:06] 	 because IPv4 blocks are limited
[23:30:12] 	 to 65536 max.
[23:30:29] 	 Jasper_Deng: yes, nowadays
[23:30:31] 	 if your local community really wants to, an administrator can disable this block on that wiki only
[23:30:39] 	 Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output  
[23:30:59] 	 DaBPunkt: 16 million addresses requires a minimum of 256 blocks
[23:31:04] 	 in IPv4
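The figure of 256 follows from the limits quoted above: about 16 million addresses is 2^24 (a /8), and the largest single IPv4 block is 65536 addresses (a /16), so 2^24 / 2^16 = 256. A minimal check with the standard `ipaddress` module; `10.0.0.0/8` is only a stand-in for "some /8" and is not the range from the anecdote:

```python
import ipaddress

# Stand-in /8 (about 16.7 million addresses); the actual range is not given in the log.
big_range = ipaddress.ip_network("10.0.0.0/8")

# Split it into /16 pieces, the maximum block size of 65536 addresses quoted above.
max_size_blocks = list(big_range.subnets(new_prefix=16))

print(big_range.num_addresses)           # 16777216
print(max_size_blocks[0].num_addresses)  # 65536
print(len(max_size_blocks))              # 256
```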
[23:32:00] 	 Jasper_Deng: it needed just 1 block. The range limit was added later
[23:32:09] 	 when was this?
[23:32:15] 	 ~2005
[23:32:52] 	 yeah, that was 3 years before I created an account
[23:34:00] 	 you probably will see more IPv6 webhost blocks this year than other years
[23:34:39] 	 as long as you do not block the SixXS range…
[23:34:53] 	 No vandal has started abusing it yet
[23:35:05] 	 it'll probably get blocked eventually, though. I g2g
[23:37:20] 	 Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output  
[23:50:39] 	 MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 45704  
[23:51:20] 	 MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 109359  
[23:55:09] 	 Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output  
[23:59:19] 	 /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 107428 MB (11% inode=98%):  
[23:59:40] 	 /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 107714 MB (11% inode=98%):