[00:50:23] This is the fourth worst replag in the past year. toolserver.org/~bryan/stats/replag/s1-yearly.png [00:57:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 86689.000000 [00:57:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 86689.000000 [01:00:28] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 349052 [01:02:28] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 355018 [01:04:18] SSH on nightshade.mgmt is CRITICAL: Server answer: [01:05:49] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 38401 MB (3% inode=99%): [01:06:28] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.805664/1.75, alarm hl:np_load_avg=0.693359/2.00, alarm hl:mem_free=280.000000M/300M [01:07:38] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:08:39] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:12:28] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [01:57:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 90289.000000 [01:57:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 90290.000000 [02:00:28] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 327537 [02:02:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 344796 [02:04:18] SSH on nightshade.mgmt is CRITICAL: Server answer: [02:04:27] replag on s1 is now over 90,000 seconds and over 24 hours. 
:( [02:05:54] :( [02:05:59] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 38655 MB (3% inode=99%): [02:06:11] @replag [02:06:16] matthewrbowker: s1-pri: 1d 1h 13m 49s [+1.00 s/s]; s1-sec: 1d 1h 13m 49s [+1.00 s/s]; s2-pri: 57s [+0.00 s/s]; s3-rr: error; s3-user: error; s4-user: 1m 11s [+0.00 s/s]; s5-rr: 11s [+0.00 s/s]; s5-user: 11s [+0.00 s/s] [02:06:17] matthewrbowker: s6-rr: error; s6-user: error; s7-rr: error; s7-user: error [02:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:08:39] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:09:01] JeffQuassel: replag is broken [02:09:42] JeffQuassel: check your email about the s1 issues [02:09:54] thx [02:13:28] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.420410/1.75, alarm hl:np_load_avg=0.452149/2.00, alarm hl:mem_free=282.000000M/300M [02:14:28] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [02:29:39] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:29:50] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:29:50] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:29:59] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:08] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:30:19] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [02:30:28] SMTP on hyacinth is OK: SMTP OK - 0.002 sec. response time [02:30:39] SMTP on z-dat-s7-a is OK: SMTP OK - 0.065 sec. response time [02:30:49] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [02:30:58] SMTP on z-dat-s6-a is OK: SMTP OK - 0.007 sec. 
response time [02:57:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 93890.000000 [02:57:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 93889.000000 [03:00:29] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 304890 [03:02:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 322824 [03:04:19] SSH on nightshade.mgmt is CRITICAL: Server answer: [03:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 39452 MB (4% inode=99%): [03:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:09:38] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:12:39] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:12:39] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:12:49] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:12:58] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:13:19] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [03:13:28] SMTP on z-dat-s3-a is OK: SMTP OK - 0.004 sec. response time [03:13:28] SMTP on hyacinth is OK: SMTP OK - 0.003 sec. 
response time [03:13:49] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [03:57:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 97490.000000 [03:57:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 97489.000000 [04:01:28] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 296847 [04:02:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 307222 [04:05:19] SSH on nightshade.mgmt is CRITICAL: Server answer: [04:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 39321 MB (4% inode=99%): [04:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:09:39] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:18:27] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.396484/1.75, alarm hl:np_load_avg=0.429688/2.00, alarm hl:mem_free=273.000000M/300M [04:19:28] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [04:57:19] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 101091.000000 [04:57:19] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 101090.000000 [05:01:39] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 287984 [05:03:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 298802 [05:05:20] SSH on nightshade.mgmt is CRITICAL: Server answer: [05:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 40135 MB (4% inode=99%): [05:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:09:39] SMF on damiana.esi is CRITICAL: ERROR - offline: 
svc:/system/cluster/scsymon-srv:default [05:09:41] [[Special:Log/newusers]] create * Rosakaye * (New user account) [05:16:13] [[User talk:Rosakaye]] !N https://wiki.toolserver.org/w/index.php?oldid=6577&rcid=8658 * Rosakaye * (+69) (Created page with "[[http://www.arnoldeng.com/html/company-profile.html|aerostructures]]") [05:17:39] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.565430/1.75, alarm hl:np_load_avg=0.588379/2.00, alarm hl:mem_free=253.000000M/300M [05:17:59] [[User talk:Rosakaye]] ! https://wiki.toolserver.org/w/index.php?diff=6578&oldid=6577&rcid=8659 * Rosakaye * (+0) () [05:19:15] [[User talk:Rosakaye]] ! https://wiki.toolserver.org/w/index.php?diff=6579&oldid=6578&rcid=8660 * Rosakaye * (+2831) (/* Aerostructures - What Do Aerostructures Do? */ new section) [05:19:39] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [05:20:55] [[User talk:Rosakaye]] ! https://wiki.toolserver.org/w/index.php?diff=6580&oldid=6579&rcid=8661 * Rosakaye * (+2) () [05:22:33] [[User talk:Rosakaye]] ! https://wiki.toolserver.org/w/index.php?diff=6581&oldid=6580&rcid=8662 * Rosakaye * (+2833) (/* Aerostructures - What Do Aerostructures Do? */ new section) [05:39:50] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:09] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:19] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:39] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:39] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [05:40:49] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:40:59] SMTP on z-dat-s6-a is OK: SMTP OK - 0.002 sec. 
response time [05:41:09] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [05:41:20] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [05:41:28] SMTP on hyacinth is OK: SMTP OK - 0.047 sec. response time [05:57:19] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 104690.000000 [05:57:19] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 104690.000000 [06:01:39] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 277072 [06:03:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 287845 [06:05:20] SSH on nightshade.mgmt is CRITICAL: Server answer: [06:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 40933 MB (4% inode=99%): [06:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:09:39] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:19:49] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:20:40] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [06:57:20] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108290.000000 [06:57:20] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108291.000000 [07:01:39] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 256330 [07:03:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 271495 [07:05:30] SSH on nightshade.mgmt is CRITICAL: Server answer: [07:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 40825 MB (4% inode=99%): [07:07:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:10:39] SMF on damiana.esi is CRITICAL: ERROR - 
offline: svc:/system/cluster/scsymon-srv:default [07:58:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 111952.000000 [07:58:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 111952.000000 [08:01:39] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 242735 [08:03:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 252770 [08:06:28] SSH on nightshade.mgmt is CRITICAL: Server answer: [08:06:58] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 41648 MB (4% inode=99%): [08:08:40] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:10:39] /aux0 on daphne is CRITICAL: DISK CRITICAL - free space: /aux0 32520 MB (3% inode=99%): [08:11:40] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:58:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 115552.000000 [08:58:19] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 115553.000000 [09:02:38] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 227763 [09:03:39] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 237655 [09:06:28] SSH on nightshade.mgmt is CRITICAL: Server answer: [09:06:59] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 42473 MB (4% inode=99%): [09:08:48] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:11:49] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:58:19] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 119152.000000 [09:58:19] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 119153.000000 
[10:02:38] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 217928 [10:04:40] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 224176 [10:06:28] SSH on nightshade.mgmt is CRITICAL: Server answer: [10:07:59] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 43283 MB (4% inode=99%): [10:08:49] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:12:49] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:58:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 122751.000000 [10:58:28] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 122757.000000 [11:03:38] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 202063 [11:04:24] @replag [11:04:25] closedmouth: s1-pri: 1d 10h 12m 3s [+1.00 s/s]; s1-sec: 1d 10h 12m 3s [+1.00 s/s]; s3-rr: 13m 54s [+0.02 s/s]; s3-user: 13m 54s [+0.02 s/s]; s6-rr: 8m 35s [+0.01 s/s]; s6-user: 8m 35s [+0.01 s/s]; s7-rr: 5m 54s [+0.01 s/s]; s7-user: 5m 54s [+0.01 s/s] [11:05:38] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 213658 [11:06:38] SSH on nightshade.mgmt is CRITICAL: Server answer: [11:08:08] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 43142 MB (4% inode=99%): [11:09:48] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:13:49] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:55:28] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:56:20] SMTP on z-dat-s4-a is OK: SMTP OK - 0.022 sec. 
response time [11:58:19] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 126351.000000 [11:58:29] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 126360.000000 [12:03:49] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 175501 [12:06:05] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 197750 [12:08:28] SSH on nightshade.mgmt is CRITICAL: Server answer: [12:08:28] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 43965 MB (4% inode=99%): [12:10:13] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:14:04] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:31:20] (resolved) [TS-1265] primary and secondary s1 database slaves aren't replication anymore <https://jira.toolserver.org/browse/TS-1265> (merl) [12:50:20] (commented) [TS-1265] primary and secondary s1 database slaves aren't replication anymore <https://jira.toolserver.org/browse/TS-1265> (Marlen Caemmerer) [12:58:24] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 129954.000000 [12:59:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 129995.000000 [13:04:04] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 157079 [13:06:04] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 176949 [13:07:04] SSH on nightshade.mgmt is CRITICAL: Server answer: [13:07:42] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:08:13] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 44790 MB (4% inode=99%): [13:08:13] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [13:10:13] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:15:03] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:37:41] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:38:13] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:38:25] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:38:25] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:38:33] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [13:39:03] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:39:12] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:39:12] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:58:25] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 133558.000000 [14:00:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 133654.000000 [14:05:03] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 140447 [14:07:02] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 161754 [14:08:03] SSH on nightshade.mgmt is CRITICAL: Server answer: [14:08:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 45593 MB (4% inode=99%): [14:10:34] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:15:12] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:58:34] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 137163.000000 [15:01:02] s1 replag on thyme is 
CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 127935.000000 [15:03:14] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.384277/1.75, alarm hl:np_load_avg=0.421387/2.00, alarm hl:mem_free=292.000000M/300M [15:05:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 127178 [15:05:14] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [15:07:14] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 147920 [15:08:13] SSH on nightshade.mgmt is CRITICAL: Server answer: [15:08:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 45460 MB (4% inode=99%): [15:10:34] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:15:13] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:34:21] (resolved) [TS-1266] Quota for la2 <https://jira.toolserver.org/browse/TS-1266> (DaB.) [15:39:22] (assigned) [TS-1264] please extend account <https://jira.toolserver.org/browse/TS-1264> (DaB.) [15:41:21] (resolved) [TS-1264] please extend account <https://jira.toolserver.org/browse/TS-1264> (DaB.) 
[15:58:35] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 137746.000000 [16:01:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 119568.000000 [16:05:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 119512 [16:07:32] zzz =_= [16:08:14] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 136660 [16:09:14] SSH on nightshade.mgmt is CRITICAL: Server answer: [16:09:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 46238 MB (4% inode=99%): [16:11:33] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:15:13] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:54:35] / on willow is WARNING: DISK WARNING - free space: / 23319 MB (20% inode=99%): [16:58:34] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 130938.000000 [17:01:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105064.000000 [17:05:14] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 104495 [17:08:12] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 130610 [17:09:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 46115 MB (4% inode=99%): [17:10:14] SSH on nightshade.mgmt is CRITICAL: Server answer: [17:11:34] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:16:12] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:55:33] / on willow is WARNING: DISK WARNING - free space: / 23266 MB (20% inode=99%): [17:59:33] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 128019.000000 [18:01:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: 
SELECT ts_rc_age() returned 95322.000000 [18:02:34] Sun Grid Engine execd on nightshade is WARNING: longrun@nightshade exceedes load threshold: alarm hl:np_load_short=1.530274/1.50, alarm hl:np_load_long=0.708496/1.75, alarm hl:mem_free=1755.000000M/250M [18:03:33] Sun Grid Engine execd on nightshade is OK: all.q@nightshade OK: longrun@nightshade OK [18:05:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 94617 [18:08:13] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 127294 [18:09:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 46927 MB (4% inode=99%): [18:10:13] SSH on nightshade.mgmt is CRITICAL: Server answer: [18:12:33] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:16:13] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:38:13] After a high of over 130,000 seconds, the replag on s1 finally started going down at about 14:20 UTC (about 4 hours 18 minutes ago). 
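For anyone reading the @replag figures in this log, here is a minimal Python sketch of the arithmetic behind them: converting between a "1d 1h 13m 49s" style duration and raw seconds, and estimating how long the lag would take to clear at a reported rate. The function names are hypothetical, this is not the bot's actual code, and the estimate assumes the rate stays constant, which it generally will not.

```python
import re

# Seconds per unit, largest first (the order matters when formatting).
UNITS = (("d", 86400), ("h", 3600), ("m", 60), ("s", 1))


def parse_duration(text: str) -> int:
    """Convert a duration such as '1d 1h 13m 49s' into seconds."""
    sizes = dict(UNITS)
    return sum(int(n) * sizes[u] for n, u in re.findall(r"(\d+)([dhms])", text))


def format_duration(seconds: int) -> str:
    """Format seconds as 'Xd Xh Xm Xs', omitting zero-valued units."""
    parts = []
    for label, size in UNITS:
        count, seconds = divmod(seconds, size)
        if count or (label == "s" and not parts):
            parts.append(f"{count}{label}")
    return " ".join(parts)


def catch_up_eta(lag_seconds: float, rate: float):
    """Estimate seconds until the lag reaches zero.

    rate is the change in lag per wall-clock second, as in '[-2.46 s/s]'.
    Returns None when the lag is not shrinking (rate >= 0).
    Assumes a constant rate, which is only a rough approximation.
    """
    if rate >= 0:
        return None
    return lag_seconds / -rate


# Example with figures taken from @replag replies in this log.
lag = parse_duration("21h 15m 34s")      # 76534 seconds
eta = catch_up_eta(lag, -2.46)           # roughly 31111 seconds
print(format_duration(round(eta)))       # prints "8h 38m 31s"
print(catch_up_eta(parse_duration("1d 1h 13m 49s"), 1.00))  # prints "None"
```

A rate of +1.00 s/s means the lag grows by one second per second, i.e. replication has stopped entirely, so no estimate is possible.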
[18:55:34] / on willow is WARNING: DISK WARNING - free space: / 23210 MB (20% inode=99%): [18:59:33] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 118052.000000 [19:01:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 90014.000000 [19:05:14] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 89872 [19:08:14] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 115662 [19:09:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 46681 MB (4% inode=99%): [19:10:13] SSH on nightshade.mgmt is CRITICAL: Server answer: [19:12:42] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:16:22] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:53:43] MySQL on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [19:53:44] MySQL slave on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [19:55:43] / on willow is WARNING: DISK WARNING - free space: / 23155 MB (20% inode=99%): [20:00:33] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 107063.000000 [20:01:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 82578.000000 [20:06:14] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 82269 [20:08:13] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 105856 [20:09:23] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 47437 MB (4% inode=99%): [20:10:12] SSH on nightshade.mgmt is CRITICAL: Server answer: [20:11:13] /aux0 on daphne is CRITICAL: DISK CRITICAL - free space: /aux0 30929 MB (3% inode=99%): [20:12:43] SMF on 
turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:16:23] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:53:44] MySQL on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [20:53:44] MySQL slave on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [20:55:42] / on willow is WARNING: DISK WARNING - free space: / 23103 MB (20% inode=99%): [20:57:52] @replag [20:57:52] Joan: s1-pri: 21h 15m 34s [-2.46 s/s]; s1-sec: 1d 4h 27s [-2.01 s/s]; s3-rr: 19s [-0.00 s/s]; s3-user: 19s [-0.00 s/s] [20:58:23] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.395996/1.75, alarm hl:np_load_avg=0.364746/2.00, alarm hl:mem_free=230.000000M/300M: longrun@willow exceedes load threshold: alarm hl:np_load_short=0.395996/1.50, alarm hl:np_load_long=0.353516/1.75, alarm hl:mem_free=230.000000M/250M [21:00:22] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [21:01:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 76012.000000 [21:01:32] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 100582.000000 [21:06:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 75522 [21:08:13] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 99860 [21:10:13] SSH on nightshade.mgmt is CRITICAL: Server answer: [21:10:22] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 47260 MB (4% inode=99%): [21:12:43] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:17:22] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:30:34] [[Category:Tools]] ! 
https://wiki.toolserver.org/w/index.php?diff=6582&oldid=6368&rcid=8663 * 70.40.185.99 * (+19) (Minds) [21:31:00] [[Category:Tools]] ! https://wiki.toolserver.org/w/index.php?diff=6583&oldid=6582&rcid=8664 * 70.40.185.99 * (-19) (O) [21:32:46] [[Query service]] ! https://wiki.toolserver.org/w/index.php?diff=6584&oldid=5890&rcid=8665 * 70.40.185.99 * (+35) (Gas) [21:54:43] MySQL slave on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [21:54:43] MySQL on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [21:55:52] / on willow is WARNING: DISK WARNING - free space: / 23054 MB (20% inode=99%): [22:01:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 66559.000000 [22:01:32] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 93780.000000 [22:06:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 65468 [22:08:12] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 93522 [22:10:13] SSH on nightshade.mgmt is CRITICAL: Server answer: [22:10:23] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 47025 MB (4% inode=99%): [22:12:53] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:16:23] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.381836/1.75, alarm hl:np_load_avg=0.418457/2.00, alarm hl:mem_free=236.000000M/300M: longrun@willow exceedes load threshold: alarm hl:np_load_short=0.381836/1.50, alarm hl:np_load_long=0.448730/1.75, alarm hl:mem_free=236.000000M/250M [22:17:23] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:18:23] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK [22:54:43] MySQL on 
z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [22:54:43] MySQL slave on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [22:55:54] / on willow is WARNING: DISK WARNING - free space: / 23004 MB (20% inode=99%): [23:01:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 54417.000000 [23:01:33] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 86441.000000 [23:06:23] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 53925 [23:09:12] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 86296 [23:10:24] SSH on nightshade.mgmt is CRITICAL: Server answer: [23:10:24] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 47788 MB (4% inode=99%): [23:12:54] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:17:24] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:54:53] MySQL slave on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [23:54:53] MySQL on z-dat-s7-a is CRITICAL: Access denied for user nagios@damiana-bge0.esi.toolserver.org (using password: YES) [23:55:54] / on willow is WARNING: DISK WARNING - free space: / 22957 MB (20% inode=99%):