[00:17:13] hmmm... that's probably not good...
[00:23:43] :D
[00:32:21] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[00:32:32] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[00:33:32] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[00:33:41] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[00:37:41] SSH on nightshade.mgmt is CRITICAL: Server answer:
[00:40:04] Heh. I did not expect that scripts would be copied to a new directory when scheduled. Whoopsy.
[00:40:11] Makes sense in retrospect.
[00:53:52] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51548 MB (5% inode=99%):
[00:59:17] DaBPunkt: Your results in pastebin looked fine as far as character set; these do not: https://toolserver.org/~madman/dbq-results/dbq-174.txt
[00:59:40] What character set should I be using? In MySQL the table's using latin1 so I'm setting iso-8859-1 (same).
[01:07:41] AMadman: the output looks fine for me
[01:07:55] AMadman: the data in the tables is utf8
[01:08:55] Yeah. Problem was in my .htaccess.
[01:09:23] I figured out it was utf-8 and was using AddCharset to set the character set, forgetting we were using ZWS.
[01:09:33] Though looking at the headers, it's still not serving a character set.
[01:10:31] Now it is, after clearing cache. All is well in the world.
[01:11:06] Using the toolserver is not like riding a bicycle, apparently. Have to get resituated. :p
[01:18:30] nacht ts
[01:32:31] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[01:33:33] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[01:33:33] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[01:33:41] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[01:38:42] SSH on nightshade.mgmt is CRITICAL: Server answer:
[01:52:05] Huh. Warning: mysql_connect(): Lost connection to MySQL server at 'reading initial communication packet', system error: 0 in /home/madman/dbq/dbq-174.php on line 29
[01:52:55] Ah. Just a blip; never mind.
[01:53:50] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52320 MB (5% inode=99%):
[02:07:32] a blip that's happening annoyingly often these days
[02:17:55] Yeah, I noticed it a few times. Guess I'll build a few retries and a reasonable retry interval into my DBQ scripts.
[02:18:21] (updated) [DBQ-174] Emijrp/List of Wikipedians by number of edits <https://jira.toolserver.org/browse/DBQ-174> (madman)
[02:18:22] (commented) [DBQ-174] Emijrp/List of Wikipedians by number of edits <https://jira.toolserver.org/browse/DBQ-174> (madman)
[02:20:51] That was a fun DBQ to take. I learned how to format reports for other projects. :D https://toolserver.org/~madman/dbq-results/dbq-174.txt
[02:23:21] (updated) [DBQ-174] Emijrp/List of Wikipedians by number of edits <https://jira.toolserver.org/browse/DBQ-174> (madman)
[02:32:31] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[02:33:41] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[02:33:41] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[02:33:41] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[02:38:42] SSH on nightshade.mgmt is CRITICAL: Server answer:
[02:47:42] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:42] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:42] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:42] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:51] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:51] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:47:51] /sql on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[02:47:51] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[02:48:01] /tmp on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[02:48:01] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:48:01] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[02:48:11] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[02:48:21] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:48:21] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
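(Editor's sketch, not from the channel: the 02:17:55 message mentions building "a few retries and a reasonable retry interval" into the DBQ scripts. One way that could look is shown below, in Python rather than the PHP of dbq-174.php. The function name, its parameters, and the use of ConnectionError are assumptions made for illustration; a real MySQL driver raises its own exception class, which would need to be caught instead.)

```python
import time


def connect_with_retries(connect, retries=3, interval=2.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect  -- hypothetical zero-argument callable that opens the connection
    retries  -- number of extra attempts after the first failure
    interval -- seconds to wait between attempts
    sleep    -- injectable so the wait can be replaced in tests
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return connect()
        except ConnectionError as error:  # assumption: driver errors map to this
            last_error = error
            if attempt < retries:
                sleep(interval)  # fixed pause before the next attempt
    raise last_error
```

(A fixed interval keeps the sketch short; if blips tend to last longer than a few seconds, lengthening the pause on each attempt may be the safer choice.)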
[02:48:41] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:49:21] MySQL slave on z-dat-s4-a is CRITICAL: (Service Check Timed Out)
[02:49:21] MySQL slave on z-dat-s7-a is CRITICAL: (Service Check Timed Out)
[02:49:21] MySQL on z-dat-s4-a is CRITICAL: (Service Check Timed Out)
[02:49:31] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out)
[02:49:41] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[02:49:42] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out)
[02:49:42] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out)
[02:49:51] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 3498 MB (99% inode=99%):
[02:49:51] SMTP on z-dat-s6-a is OK: SMTP OK - 0.937 sec. response time
[02:49:51] /tmp on z-dat-s4-a is OK: DISK OK - free space: /tmp 3528 MB (99% inode=99%):
[02:49:51] MySQL on z-dat-s6-a is OK: Uptime: 5025940 Threads: 13 Questions: 951557423 Slow queries: 334503 Opens: 9691469 Flush tables: 2 Open tables: 2887 Queries per second avg: 189.329
[02:49:52] /tmp on z-dat-s3-a is OK: DISK OK - free space: /tmp 3529 MB (99% inode=99%):
[02:49:52] MySQL on z-dat-s4-a is OK: Uptime: 4943160 Threads: 9 Questions: 219892382 Slow queries: 53958 Opens: 39085 Flush tables: 1 Open tables: 449 Queries per second avg: 44.484
[02:49:52] MySQL slave on z-dat-s4-a is OK: Uptime: 4943160 Threads: 9 Questions: 219892382 Slow queries: 53958 Opens: 39085 Flush tables: 1 Open tables: 449 Queries per second avg: 44.484 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 230
[02:49:52] MySQL on z-dat-s3-a is OK: Uptime: 3759024 Threads: 19 Questions: 4163364439 Slow queries: 323083 Opens: 52775182 Flush tables: 2 Open tables: 16384 Queries per second avg: 1107.565
[02:49:52] MySQL slave on z-dat-s3-a is OK: Uptime: 3759024 Threads: 19 Questions: 4163364439 Slow queries: 323083 Opens: 52775182 Flush tables: 2 Open tables: 16384 Queries per second avg: 1107.565 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 283
[02:49:53] MySQL on z-dat-s7-a is OK: Uptime: 2531459 Threads: 5 Questions: 592474547 Slow queries: 101687 Opens: 7125026 Flush tables: 1 Open tables: 3681 Queries per second avg: 234.44
[02:50:01] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out)
[02:50:01] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[02:50:01] MySQL slave on z-dat-s7-a is OK: Uptime: 2531465 Threads: 4 Questions: 592475906 Slow queries: 101687 Opens: 7125026 Flush tables: 1 Open tables: 3681 Queries per second avg: 234.44 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 239
[02:50:11] MySQL slave on z-dat-s6-a is OK: Uptime: 5025953 Threads: 5 Questions: 951558978 Slow queries: 334505 Opens: 9691479 Flush tables: 2 Open tables: 2887 Queries per second avg: 189.329 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 229
[02:50:11] SMTP on z-dat-s7-a is OK: SMTP OK - 0.059 sec. response time
[02:50:21] /sql on z-dat-s7-a is OK: DISK OK - free space: /sql 125336 MB (31% inode=99%):
[02:50:21] /tmp on z-dat-s7-a is OK: DISK OK - free space: /tmp 3554 MB (99% inode=99%):
[02:50:31] SMTP on hyacinth is OK: SMTP OK - 0.003 sec. response time
[02:50:31] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[02:50:31] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[02:50:31] SMTP on z-dat-s3-a is OK: SMTP OK - 0.003 sec. response time
[02:50:31] SMTP on z-dat-s4-a is OK: SMTP OK - 0.003 sec. response time
[02:50:41] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[02:50:42] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[02:53:51] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52178 MB (5% inode=99%):
[03:32:30] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[03:33:41] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[03:33:42] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[03:33:42] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[03:38:41] SSH on nightshade.mgmt is CRITICAL: Server answer:
[03:40:10] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.282715/1.75, alarm hl:np_load_avg=0.338379/2.00, alarm hl:mem_free=254.000000M/300M
[03:45:10] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK
[03:53:11] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.379395/1.75, alarm hl:np_load_avg=0.399902/2.00, alarm hl:mem_free=257.000000M/300M
[03:53:50] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52046 MB (5% inode=99%):
[04:20:20] (created) [PATHOSCHILD-6] Add regex tool to TemplateScript; Pathoschild's tools; New Feature <https://jira.toolserver.org/browse/PATHOSCHILD-6> (Jesse Plamondon-Willard)
[04:32:31] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[04:33:42] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[04:33:42] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[04:33:42] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[04:38:41] SSH on nightshade.mgmt is CRITICAL: Server answer:
[04:53:51] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51915 MB (5% inode=99%):
[05:27:43] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:27:43] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:27:52] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:27:52] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:28:02] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:28:02] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:28:12] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:28:22] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[05:28:42] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:28:43] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:28:52] /tmp on z-dat-s4-a is OK: DISK OK - free space: /tmp 3507 MB (99% inode=99%):
[05:28:53] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 3505 MB (99% inode=99%):
[05:28:53] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[05:28:53] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[05:28:53] SMTP on z-dat-s6-a is OK: SMTP OK - 0.002 sec. response time
[05:28:53] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[05:29:31] SMTP on hyacinth is OK: SMTP OK - 0.003 sec. response time
[05:29:31] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[05:29:31] SMTP on z-dat-s4-a is OK: SMTP OK - 0.003 sec. response time
[05:29:31] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[05:32:32] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[05:33:43] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[05:33:43] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[05:33:43] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[05:35:21] (commented) [DBQ-174] Emijrp/List of Wikipedians by number of edits <https://jira.toolserver.org/browse/DBQ-174> (Rahuldeshmukh101)
[05:37:20] (assigned) [DBQ-174] Emijrp/List of Wikipedians by number of edits <https://jira.toolserver.org/browse/DBQ-174> (Rahuldeshmukh101)
[05:38:43] SSH on nightshade.mgmt is CRITICAL: Server answer:
[05:53:52] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51795 MB (5% inode=99%):
[06:32:31] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[06:34:43] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[06:34:43] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[06:34:43] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[06:39:41] SSH on nightshade.mgmt is CRITICAL: Server answer:
[06:53:52] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51720 MB (5% inode=99%):
[07:32:31] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[07:34:43] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[07:34:44] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[07:34:51] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[07:39:42] SSH on nightshade.mgmt is CRITICAL: Server answer:
[07:54:01] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51555 MB (5% inode=99%):
[08:09:52] /aux0 on daphne is CRITICAL: DISK CRITICAL - free space: /aux0 35596 MB (3% inode=99%):
[08:32:32] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[08:34:44] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[08:34:45] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[08:35:51] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[08:39:42] SSH on nightshade.mgmt is CRITICAL: Server answer:
[08:54:01] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52357 MB (5% inode=99%):
[09:19:21] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.616699/1.75, alarm hl:np_load_avg=0.591309/2.00, alarm hl:mem_free=245.000000M/300M: longrun@willow exceedes load threshold: alarm hl:np_load_short=0.616699/1.50, alarm hl:np_load_long=0.570312/1.75, alarm hl:mem_free=245.000000M/250M
[09:20:21] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK
[09:32:32] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[09:34:42] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[09:35:43] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[09:36:51] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[09:39:42] SSH on nightshade.mgmt is CRITICAL: Server answer:
[09:54:01] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52226 MB (5% inode=99%):
[10:33:32] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[10:34:52] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[10:35:53] ethernet 0/1/24 on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/24:UP: 1 int NOK : CRITICAL
[10:37:02] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[10:39:51] SSH on nightshade.mgmt is CRITICAL: Server answer:
[10:54:02] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52082 MB (5% inode=99%):
[11:33:47] ethernet 0/1/18 on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/18:UP: 1 int NOK : CRITICAL
[11:35:48] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[11:37:36] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[11:40:37] SSH on nightshade.mgmt is CRITICAL: Server answer:
[11:54:37] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51939 MB (5% inode=99%):
[11:57:19] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51931 MB (5% inode=99%):
[11:57:27] SSH on nightshade.mgmt is CRITICAL: Server answer:
[11:57:27] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[12:30:14] hi
[12:56:27] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[12:57:16] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51778 MB (5% inode=99%):
[12:57:27] SSH on nightshade.mgmt is CRITICAL: Server answer:
[12:57:27] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[13:56:37] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[13:58:16] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51628 MB (5% inode=99%):
[13:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[13:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[14:02:24] [[Special:Log/newusers]] create 10 * 3385586 * (New user account)
[14:57:36] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[14:58:16] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51465 MB (5% inode=99%):
[14:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[14:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[15:57:37] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[15:58:15] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52232 MB (5% inode=99%):
[15:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[15:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[16:58:15] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 52046 MB (5% inode=99%):
[16:58:25] SSH on nightshade.mgmt is CRITICAL: Server answer:
[16:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[16:58:36] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[17:58:15] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51878 MB (5% inode=99%):
[17:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[17:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[17:58:36] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[18:58:15] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51705 MB (5% inode=99%):
[18:58:25] SSH on nightshade.mgmt is CRITICAL: Server answer:
[18:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[18:58:37] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[19:09:24] strace on the toolserver (nightshade) throws "ERROR: unable to open /dev/log", why?
[19:09:46] It would come in pretty handy to debug an issue I'm having (in particular strace -eopen
[19:09:48] )
[19:10:17] Sun Grid Engine execd on willow is WARNING: all.q@willow exceedes load threshold: alarm hl:np_load_short=0.421875/1.75, alarm hl:np_load_avg=0.414551/2.00, alarm hl:mem_free=258.000000M/300M
[19:10:22] ok, I mean I can see :
[19:10:25] crw-r----- 1 root sys 21, 5 Jan 3 2011 /devices/pseudo/log@0:log
[19:10:30] but why?
[19:10:37] sun thing?
[19:10:58] alternatives to quickly find out which file open fails in a program?
[19:11:12] (buried deep down in a library without debug symbols)
[19:11:17] Sun Grid Engine execd on willow is OK: all.q@willow OK: longrun@willow OK
[19:12:03] will be back in 30mins. You better come up with an answer until then, channel! ;-)
[19:37:05] mh
[19:47:14] truss is the answer
[19:58:16] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51531 MB (5% inode=99%):
[19:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[19:58:26] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[19:58:47] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[20:10:06] /aux0 on daphne is CRITICAL: DISK CRITICAL - free space: /aux0 34716 MB (3% inode=99%):
[20:13:18] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:13:27] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:13:36] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:13:36] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:13:57] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:14:17] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:14:17] SMTP on z-dat-s6-a is OK: SMTP OK - 0.461 sec. response time
[20:14:26] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[20:14:26] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:14:48] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:15:07] SMTP on hyacinth is OK: SMTP OK - 0.080 sec. response time
[20:15:07] SMTP on z-dat-s7-a is OK: SMTP OK - 0.002 sec. response time
[20:27:07] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1891
[20:37:07] MySQL slave on z-dat-s3-a is OK: Uptime: 3823053 Threads: 20 Questions: 4219482394 Slow queries: 325858 Opens: 53363565 Flush tables: 2 Open tables: 16377 Queries per second avg: 1103.694 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1730
[20:41:27] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:41:36] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:42:17] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[20:42:17] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:47:27] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:47:27] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:47:36] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:47:57] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:48:07] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:48:17] /sql on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:48:17] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:48:18] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:48:26] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:48:27] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:48:47] /sql on z-dat-s7-a is OK: DISK OK - free space: /sql 126044 MB (31% inode=99%):
[20:48:47] /tmp on z-dat-s7-a is OK: DISK OK - free space: /tmp 3599 MB (99% inode=99%):
[20:48:47] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:48:57] SMTP on z-dat-s3-a is OK: SMTP OK - 0.020 sec. response time
[20:49:06] SMTP on z-dat-s7-a is OK: SMTP OK - 0.004 sec. response time
[20:49:33] any OSM map guys in here?
[20:58:17] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51357 MB (5% inode=99%):
[20:58:26] SSH on nightshade.mgmt is CRITICAL: Server answer:
[20:58:27] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[20:59:46] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[21:12:36] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[21:13:26] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[21:35:45] why does slayerd kill processes that mmap a large file?!
[21:35:52] this does not use physical memory
[21:58:28] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51189 MB (5% inode=99%):
[21:58:28] SSH on nightshade.mgmt is CRITICAL: Server answer:
[21:58:38] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[21:59:48] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[22:09:12] zzz =_=
[22:22:08] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1923
[22:54:28] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:55:18] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[22:58:39] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51956 MB (5% inode=99%):
[22:58:40] SSH on nightshade.mgmt is CRITICAL: Server answer:
[22:58:40] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[22:59:49] SMF on damiana.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default
[23:22:20] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3103
[23:29:19] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1917.000000
[23:38:21] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3627
[23:59:20] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1790.000000
[23:59:39] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 51804 MB (5% inode=99%):
[23:59:39] SSH on nightshade.mgmt is CRITICAL: Server answer:
[23:59:39] SMF on turnera.esi is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default