[01:38:23] 2014/03/08 01:29 CRIT cassia s4 replag (Service Check Timed Out) [01:39:23] 2014/03/08 01:38 OK cassia s4 replag QUERY OK: 'SELECT ts_rc_age()' returned 0.000000 [02:35:26] 2014/03/08 02:29 WARN hawthorn NTP NTP WARNING: Server has the LI_ALARM bit set, Offset -0.130588 secs [02:46:26] 2014/03/08 02:46 CRIT hawthorn NTP NTP CRITICAL: Server not synchronized, Offset unknown [02:47:26] 2014/03/08 02:47 WARN hawthorn NTP NTP WARNING: Server has the LI_ALARM bit set, Offset -0.002217 secs [02:51:35] Hm.. something up with svn.toolserver.org? I can't commit anything [02:51:59] At least.. it won't take any of my passwords [02:52:16] I'm assuming it's the LDAP password [02:55:05] It should be IIRC. [03:04:27] 2014/03/08 03:04 OK hawthorn NTP NTP OK: Offset 0.000318 secs [03:12:28] 2014/03/08 03:11 OK ortelius / DISK OK - free space: / 7415 MB (24% inode=91%): [03:49:30] 2014/03/08 03:42 WARN yarrow Load avg. WARNING - load average: 24.60, 18.95, 10.92 [03:53:31] 2014/03/08 03:52 CRIT yarrow Load avg. CRITICAL - load average: 30.35, 24.65, 15.03 [07:25:43] 2014/03/08 07:24 WARN yarrow Load avg. WARNING - load average: 1.11, 1.17, 19.89 [07:30:43] 2014/03/08 07:29 OK yarrow Load avg. 
OK - load average: 1.13, 1.14, 14.70 [08:13:45] 2014/03/08 08:07 WARN z-dat-s5-b /sql DISK WARNING - free space: /sql 119882 MB (10% inode=99%): [12:18:05] 2014/03/08 12:17 OK z-dat-s5-b /sql DISK OK - free space: /sql 137068 MB (12% inode=99%): [13:22:07] 2014/03/08 13:15 WARN z-dat-s5-b /sql DISK WARNING - free space: /sql 117834 MB (10% inode=99%): [15:33:14] 2014/03/08 15:32 OK z-dat-s5-b /sql DISK OK - free space: /sql 136752 MB (12% inode=99%): [16:55:19] 2014/03/08 16:48 WARN z-dat-s5-b /sql DISK WARNING - free space: /sql 120955 MB (10% inode=99%): [17:26:20] 2014/03/08 17:19 WARN ortelius / DISK WARNING - free space: / 6274 MB (20% inode=91%): [19:03:26] 2014/03/08 19:02 OK z-dat-s5-b /sql DISK OK - free space: /sql 137021 MB (12% inode=99%): [20:00:46] hello [20:00:50] maintenance tonight [20:00:57] going to update the solaris machines [20:01:21] it's ortelius, wolfsbane, willow, damiana, clematis and hawthorn [20:03:00] amette: I'd like to do damiana first [20:03:09] what's your first box? [20:03:44] I'd prefer something Linuxy for warm-up [20:03:55] oh no :) [20:04:00] let's do linux another day [20:04:02] ;) [20:04:15] ok, then I don't have any preference [20:04:34] then your hosts are clematis, hawthorn and willow [20:04:39] roger that [20:07:17] started the update processes [20:11:07] * amette too [20:15:30] 2014/03/08 20:08 WARN wolfsbane / DISK WARNING - free space: / 6075 MB (20% inode=92%): [20:15:30] 2014/03/08 20:08 WARN wolfsbane /tmp DISK WARNING - free space: / 6013 MB (20% inode=92%): [20:20:30] 2014/03/08 20:13 ?? hawthorn Load avg. CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [20:21:30] 2014/03/08 20:14 ?? hawthorn / CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [20:21:30] 2014/03/08 20:14 ?? hawthorn /tmp CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [20:21:30] 2014/03/08 20:14 ?? 
hawthorn Environment IPMI CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [20:21:30] 2014/03/08 20:14 CRIT hawthorn SSH CRITICAL - Socket timeout after 10 seconds [20:25:30] 2014/03/08 20:25 OK ptolemy /sql DISK OK - free space: /sql 173926 MB (21% inode=99%): [20:27:30] 2014/03/08 20:26 OK hawthorn Load avg. OK - load average: 0.02, 0.03, 0.05 [20:27:30] 2014/03/08 20:27 OK hawthorn SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [20:28:30] 2014/03/08 20:27 OK hawthorn / DISK OK - free space: / 18567 MB (61% inode=93%): [20:28:31] 2014/03/08 20:27 OK hawthorn /tmp DISK OK - free space: /tmp 84 MB (75% inode=95%): [20:28:31] 2014/03/08 20:27 OK hawthorn Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [20:38:58] i will reboot ortelius now [20:40:30] 2014/03/08 20:39 ?? ortelius / CHECK_NRPE: Error receiving data from daemon. [20:40:30] 2014/03/08 20:39 ?? ortelius Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [20:41:30] 2014/03/08 20:40 CRIT ortelius / Connection refused or timed out [20:41:30] 2014/03/08 20:40 CRIT ortelius /tmp Connection refused or timed out [20:41:30] 2014/03/08 20:40 CRIT ortelius DiskSuite Timeout while attempting connection [20:41:30] 2014/03/08 20:40 CRIT ortelius Environment IPMI Timeout while attempting connection [20:41:30] 2014/03/08 20:40 CRIT ortelius Load avg. 
Timeout while attempting connection [20:41:30] 2014/03/08 20:40 CRIT ortelius NTP CRITICAL - Socket timeout after 10 seconds [20:41:31] 2014/03/08 20:40 CRIT ortelius PING PING CRITICAL - Packet loss = 100% [20:41:31] 2014/03/08 20:40 CRIT ortelius SMTP CRITICAL - Socket timeout after 10 seconds [20:41:32] 2014/03/08 20:40 CRIT ortelius SSH No route to host [20:41:32] 2014/03/08 20:40 CRIT ortelius Sun Grid Engine execd Connection refused or timed out [20:41:33] 2014/03/08 20:40 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [20:43:30] 2014/03/08 20:42 OK ortelius /tmp DISK OK - free space: /tmp 27932 MB (100% inode=99%): [20:43:30] 2014/03/08 20:43 OK ortelius DiskSuite OK - No disk failures detected [20:43:30] 2014/03/08 20:43 OK ortelius Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [20:43:30] 2014/03/08 20:43 OK ortelius Load avg. OK - load average: 2.12, 0.86, 0.32 [20:43:30] 2014/03/08 20:43 OK ortelius NTP NTP OK: Offset 0.061706 secs [20:44:30] 2014/03/08 20:43 WARN ortelius / DISK WARNING - free space: / 4959 MB (16% inode=91%): [20:44:30] 2014/03/08 20:43 OK ortelius PING PING OK - Packet loss = 0%, RTA = 0.12 ms [20:44:30] 2014/03/08 20:43 OK ortelius SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [20:44:30] 2014/03/08 20:43 WARN ortelius Sun Grid Engine execd NRPE: Unable to read output [20:45:01] ortelius is done [20:49:46] going to reboot wolfsbane [20:51:30] 2014/03/08 20:44 CRIT willow SMTP CRITICAL - Socket timeout after 10 seconds [20:52:30] 2014/03/08 20:45 CRIT clematis SMTP CRITICAL - Socket timeout after 10 seconds [20:52:30] 2014/03/08 20:51 CRIT wolfsbane / Connection refused by host [20:52:30] 2014/03/08 20:51 CRIT wolfsbane /tmp Timeout while attempting connection [20:52:30] 2014/03/08 20:51 CRIT wolfsbane Sun Grid Engine execd Connection refused by host [20:53:30] 2014/03/08 20:53 CRIT wolfsbane Environment IPMI Connection refused or timed out [20:53:30] 2014/03/08 20:53 CRIT wolfsbane 
Load avg. Connection refused or timed out [20:53:30] 2014/03/08 20:52 CRIT wolfsbane SMTP CRITICAL - Socket timeout after 10 seconds [20:53:30] 2014/03/08 20:52 CRIT wolfsbane SSH No route to host [20:53:30] 2014/03/08 20:52 CRIT wolfsbane toolserver.org HTTP No route to host [20:54:30] 2014/03/08 20:53 CRIT wolfsbane NTP NTP CRITICAL: No response from NTP server [20:54:30] 2014/03/08 20:53 CRIT wolfsbane PING PING CRITICAL - Packet loss = 100% [20:55:30] 2014/03/08 20:54 OK wolfsbane / DISK OK - free space: / 6320 MB (21% inode=93%): [20:55:31] 2014/03/08 20:54 OK wolfsbane /tmp DISK OK - free space: / 6323 MB (21% inode=93%): [20:55:31] 2014/03/08 20:55 OK wolfsbane Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [20:55:31] 2014/03/08 20:55 OK wolfsbane Load avg. OK - load average: 1.41, 0.71, 0.28 [20:55:31] 2014/03/08 20:55 OK wolfsbane NTP NTP OK: Offset 0.001169 secs [20:55:32] 2014/03/08 20:54 OK wolfsbane PING PING OK - Packet loss = 0%, RTA = 0.13 ms [20:55:32] 2014/03/08 20:54 ?? wolfsbane Sun Grid Engine execd Cannot execute /sge/GE/bin/sol-amd64/qstat [20:56:30] 2014/03/08 20:55 OK wolfsbane SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [20:56:31] 2014/03/08 20:55 WARN wolfsbane Sun Grid Engine execd NRPE: Unable to read output [20:56:50] wolfsbane is done [20:57:30] 2014/03/08 20:56 WARN clematis SMTP recv() failed [20:57:30] 2014/03/08 20:57 CRIT clematis Sun Grid Engine execd Connection refused by host [20:57:30] 2014/03/08 20:51 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [20:58:05] clematis rebooting [20:58:12] and willow is done and needs a reboot, too [20:58:30] 2014/03/08 20:58 CRIT clematis / Connection refused or timed out [20:58:30] 2014/03/08 20:57 CRIT clematis SMTP CRITICAL - Socket timeout after 10 seconds [20:58:34] ALL : willow will be rebooted in the next ten minutes - please save your work !!! 
[20:59:30] 2014/03/08 20:59 CRIT clematis /tmp Connection refused or timed out [20:59:30] 2014/03/08 20:58 CRIT clematis Environment IPMI Connection refused or timed out [20:59:30] 2014/03/08 20:58 CRIT clematis Load avg. Connection refused or timed out [20:59:30] 2014/03/08 20:58 CRIT clematis NTP CRITICAL - Socket timeout after 10 seconds [20:59:30] 2014/03/08 20:58 CRIT clematis PING CRITICAL - Host Unreachable (clematis) [20:59:30] 2014/03/08 20:59 CRIT clematis SSH No route to host [21:00:30] 2014/03/08 21:00 OK clematis / DISK OK - free space: / 18814 MB (62% inode=93%): [21:00:30] 2014/03/08 21:00 OK clematis /tmp DISK OK - free space: /tmp 6471 MB (100% inode=99%): [21:00:30] 2014/03/08 20:59 OK clematis PING PING OK - Packet loss = 0%, RTA = 1.92 ms [21:00:30] 2014/03/08 21:00 WARN clematis Sun Grid Engine execd option -w not recognized [21:01:30] 2014/03/08 21:00 OK clematis Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [21:01:30] 2014/03/08 21:00 OK clematis Load avg. OK - load average: 1.28, 0.61, 0.24 [21:01:30] 2014/03/08 21:00 OK clematis NTP NTP OK: Offset -0.000761 secs [21:02:23] Danny_B multichill : you guys are logged in to willow and available here in the channel - so personal warning for you: willow will be rebooted soon! :) [21:04:53] ah wait a bit [21:05:04] alright, let me know when it's ok for you [21:06:16] nightshade is going to be rebooted too? [21:07:31] 2014/03/08 21:06 OK clematis SMTP SMTP OK - 0.003 sec. response time [21:07:56] Danny_B: not planned today [21:10:14] oki, shoot [21:11:04] thanks Danny_B [21:11:43] thanks for waiting [21:12:19] np :) [21:13:30] 2014/03/08 21:13 OK willow SMTP SMTP OK - 0.003 sec. response time [21:16:42] * amette waits a bit more with the willow-reboot.... 
[21:17:30] 2014/03/08 21:10 CRIT hawthorn SMTP CRITICAL - Socket timeout after 10 seconds [21:19:18] * amette reboots now [21:21:30] 2014/03/08 21:20 CRIT willow Sun Grid Engine execd Connection refused by host [21:22:30] 2014/03/08 21:21 CRIT willow / Connection refused or timed out [21:22:30] 2014/03/08 21:21 CRIT willow /tmp Connection refused or timed out [21:22:30] 2014/03/08 21:21 CRIT willow Environment IPMI Connection refused or timed out [21:22:30] 2014/03/08 21:21 CRIT willow Load avg. Connection refused or timed out [21:22:30] 2014/03/08 21:22 CRIT willow NTP CRITICAL - Socket timeout after 10 seconds [21:22:30] 2014/03/08 21:22 CRIT willow PING CRITICAL - Host Unreachable (willow) [21:22:30] 2014/03/08 21:21 CRIT willow SMTP No route to host [21:22:31] 2014/03/08 21:21 CRIT willow SSH No route to host [21:23:30] 2014/03/08 21:22 OK willow /tmp DISK OK - free space: / 35929 MB (34% inode=99%): [21:23:30] 2014/03/08 21:22 OK willow Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [21:23:30] 2014/03/08 21:22 OK willow Load avg. OK - load average: 1.58, 0.55, 0.20 [21:23:30] 2014/03/08 21:23 OK willow NTP NTP OK: Offset 0.000499 secs [21:23:30] 2014/03/08 21:23 OK willow PING PING OK - Packet loss = 0%, RTA = 0.23 ms [21:23:30] 2014/03/08 21:23 OK willow SMTP SMTP OK - 0.133 sec. response time [21:23:30] 2014/03/08 21:23 OK willow SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [21:24:30] 2014/03/08 21:23 OK willow / DISK OK - free space: / 35926 MB (34% inode=99%): [21:24:30] 2014/03/08 21:23 WARN willow Sun Grid Engine execd NRPE: Unable to read output [21:24:46] no more reboots planned? 
[21:24:52] not for willow, no [21:25:06] great, thanks [21:25:11] Danny_B: damiana and hawthorn are still in the pipe [21:25:31] Danny_B: reboot of damiana will cause a small complete downtime for ts [21:25:37] because it does the ldap [21:25:51] btw [21:25:57] going to reboot damiana [21:27:31] 2014/03/08 21:21 WARN z-dat-s5-b /sql DISK WARNING - free space: /sql 121865 MB (10% inode=99%): [21:28:31] 2014/03/08 21:28 CRIT damiana / Connection refused or timed out [21:28:31] 2014/03/08 21:27 CRIT damiana DiskSuite Timeout while attempting connection [21:28:31] 2014/03/08 21:27 CRIT damiana Environment IPMI Timeout while attempting connection [21:28:31] 2014/03/08 21:27 CRIT damiana Free Memory Timeout while attempting connection [21:28:31] 2014/03/08 21:27 CRIT damiana Load avg. Timeout while attempting connection [21:28:31] 2014/03/08 21:27 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [21:28:31] 2014/03/08 21:28 CRIT damiana PING CRITICAL - Host Unreachable (damiana) [21:28:32] 2014/03/08 21:28 CRIT damiana SSH CRITICAL - Socket timeout after 10 seconds [21:28:33] 2014/03/08 21:28 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [21:28:33] 2014/03/08 21:28 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [21:28:33] 2014/03/08 21:27 CRIT ha-nfs.esi NFS CRITICAL - Socket timeout after 10 seconds [21:28:34] 2014/03/08 21:27 CRIT ha-nfs.esi PING CRITICAL - Host Unreachable (ha-nfs.esi) [21:28:34] 2014/03/08 21:27 CRIT ha-proxy.esi HTTP proxy CRITICAL - Socket timeout after 10 seconds [21:28:35] 2014/03/08 21:28 CRIT ha-proxy.esi PING CRITICAL - Host Unreachable (ha-proxy.esi) [21:29:31] 2014/03/08 21:28 CRIT damiana /tmp Connection refused or timed out [21:29:32] 2014/03/08 21:28 CRIT damiana ts-array5 Connection refused or timed out [21:29:32] 2014/03/08 21:28 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [21:29:32] 2014/03/08 21:28 CRIT 
ha-dns-recursor.esi PING CRITICAL - Host Unreachable (ha-dns-recursor.esi) [21:29:32] 2014/03/08 21:28 CRIT ha-ldap.esi LDAP Could not bind to the LDAP server [21:29:32] 2014/03/08 21:28 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [21:30:31] 2014/03/08 21:30 OK damiana / DISK OK - free space: / 25106 MB (34% inode=95%): [21:30:32] 2014/03/08 21:30 OK damiana /tmp DISK OK - free space: /tmp 14493 MB (99% inode=99%): [21:30:32] 2014/03/08 21:29 OK damiana Free Memory OK - 91.8% (7692648 kB) free. [21:30:32] 2014/03/08 21:30 OK damiana PING PING OK - Packet loss = 0%, RTA = 0.20 ms [21:30:32] 2014/03/08 21:30 OK damiana SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [21:30:32] 2014/03/08 21:30 OK damiana ts-array5 2/2 paths are active [21:30:33] 2014/03/08 21:29 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.27 ms [21:31:31] 2014/03/08 21:30 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [21:33:31] 2014/03/08 21:33 CRIT ha-dns-recursor.esi CRITICAL - Host Unreachable (ha-dns-recursor.esi) [21:33:31] 2014/03/08 21:32 CRIT ha-ldap.esi CRITICAL - Host Unreachable (ha-ldap.esi) [21:33:31] 2014/03/08 21:32 CRIT ha-sql.esi CRITICAL - Host Unreachable (ha-sql.esi) [21:33:31] 2014/03/08 21:27 CRIT ptolemy SMTP CRITICAL - Socket timeout after 10 seconds [21:33:31] 2014/03/08 21:27 CRIT z-dat-s5-b SMTP CRITICAL - Socket timeout after 10 seconds [21:34:32] 2014/03/08 21:27 CRIT adenia SMTP CRITICAL - Socket timeout after 10 seconds [21:34:32] 2014/03/08 21:27 CRIT clematis SMTP CRITICAL - Socket timeout after 10 seconds [21:34:32] 2014/03/08 21:27 CRIT daphne SMTP CRITICAL - Socket timeout after 10 seconds [21:34:32] 2014/03/08 21:27 CRIT hemlock /home CHECK_NRPE: Socket timeout after 30 seconds. [21:34:32] 2014/03/08 21:27 CRIT hyacinth SMTP CRITICAL - Socket timeout after 10 seconds [21:34:32] 2014/03/08 21:28 CRIT nightshade SMTP CRITICAL - Socket timeout after 10 seconds [21:34:33] 2014/03/08 21:27 ?? 
nightshade Sun Grid Engine execd Execution timeout exceeded [21:34:33] 2014/03/08 21:27 CRIT nightshade aliasd CRITICAL - Socket timeout after 10 seconds [21:34:34] 2014/03/08 21:27 CRIT ortelius SMTP CRITICAL - Socket timeout after 10 seconds [21:34:34] 2014/03/08 21:27 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [21:34:34] 2014/03/08 21:27 CRIT rosemary SMTP CRITICAL - Socket timeout after 10 seconds [21:34:35] 2014/03/08 21:27 CRIT sage SMTP CRITICAL - Socket timeout after 10 seconds [21:34:35] 2014/03/08 21:27 CRIT thyme SMTP CRITICAL - Socket timeout after 10 seconds [21:34:36] 2014/03/08 21:27 CRIT willow SMTP CRITICAL - Socket timeout after 10 seconds [21:35:31] 2014/03/08 21:28 CRIT hemlock SMTP CRITICAL - Socket timeout after 10 seconds [21:35:32] 2014/03/08 21:34 OK ortelius SMTP SMTP OK - 0.001 sec. response time [21:35:32] 2014/03/08 21:35 OK z-dat-s5-b /sql DISK OK - free space: /sql 136398 MB (12% inode=99%): [21:36:31] 2014/03/08 21:27 CRIT damiana NTP NTP CRITICAL: No response from NTP server [21:36:32] 2014/03/08 21:29 WARN nightshade Load avg. WARNING - load average: 25.72, 18.71, 9.24 [21:36:32] 2014/03/08 21:29 ?? nightshade.mgmt SSH Usage: [21:36:32] 2014/03/08 21:35 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.014 second response time [21:36:32] 2014/03/08 21:29 ?? yarrow.mgmt SSH Usage: [21:37:32] 2014/03/08 21:36 WARN wolfsbane toolserver.org HTTP HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.011 second response time [21:37:32] 2014/03/08 21:36 CRIT yarrow.mgmt SSH CRITICAL - Socket timeout after 10 seconds [21:38:31] 2014/03/08 21:38 OK ha-dns-recursor.esi PING OK - Packet loss = 0%, RTA = 0.31 ms [21:38:31] 2014/03/08 21:38 OK ha-sql.esi PING OK - Packet loss = 0%, RTA = 0.27 ms [21:38:31] 2014/03/08 21:37 OK clematis SMTP SMTP OK - 0.005 sec. 
response time [21:38:31] 2014/03/08 21:37 OK damiana NTP NTP OK: Offset -0.063823 secs [21:38:31] 2014/03/08 21:37 OK daphne SMTP SMTP OK - 5.018 sec. response time [21:39:37] 2014/03/08 21:38 OK adenia SMTP SMTP OK - 0.006 sec. response time [21:39:37] 2014/03/08 21:38 OK ha-dns-recursor.esi PING PING OK - Packet loss = 0%, RTA = 0.17 ms [21:40:37] 2014/03/08 21:39 OK hemlock /home DISK OK - free space: /home 16269 MB (32% inode=91%): [21:40:37] 2014/03/08 21:39 WARN nightshade Load avg. WARNING - load average: 12.90, 20.94, 12.68 [21:40:37] 2014/03/08 21:39 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.010 second response time [21:40:37] 2014/03/08 21:39 OK wolfsbane toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.011 second response time [21:42:38] 2014/03/08 21:41 OK nightshade Load avg. OK - load average: 1.76, 14.02, 11.14 [21:48:22] Has everything just expired? [21:50:11] Reedy: ldap went down with the upgrade [21:50:27] aha [21:50:32] which kills basically everything [21:50:35] fixing right now [21:50:49] my root account works fine ;) [21:51:03] you wanna fix it? ;) [21:52:42] still can't log in to willow [21:52:53] * Reedy kicks Danny_B|webgate [21:53:06] ? [21:54:37] 2014/03/08 21:54 OK nightshade aliasd TCP OK - 0.009 second response time on port 984 [500 Not found.] [21:54:37] 2014/03/08 21:48 CRIT yarrow Sun Grid Engine execd CRITICAL: execd not communicating [21:55:18] Danny_B|webgate: ldap on damiana died with the upgrade [21:55:37] 2014/03/08 21:55 OK yarrow Sun Grid Engine execd Host and Queues Ok [21:55:48] aha [22:00:39] 2014/03/08 22:00 OK damiana SMTP SMTP OK - 0.004 sec. 
response time [22:09:45] still fighting with the ldap [22:10:01] good luck [22:50:41] 2014/03/08 22:43 CRIT nightshade aliasd CRITICAL - Socket timeout after 10 seconds [23:02:47] nosy: i can't log in at all [23:03:19] gifti: yes problem with ldap after upgrade [23:03:22] please hang on [23:03:25] ok [23:03:41] 2014/03/08 23:03 OK ha-ldap.esi PING OK - Packet loss = 0%, RTA = 0.20 ms [23:03:41] 2014/03/08 23:02 OK nightshade aliasd TCP OK - 1.008 second response time on port 984 [500 Not found.] [23:04:41] 2014/03/08 23:03 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.25 ms [23:05:41] 2014/03/08 23:05 CRIT damiana / Connection refused or timed out [23:05:41] 2014/03/08 23:05 CRIT damiana /tmp Connection refused or timed out [23:05:42] 2014/03/08 23:05 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [23:05:42] 2014/03/08 23:05 CRIT damiana SSH CRITICAL - Socket timeout after 10 seconds [23:05:42] 2014/03/08 23:05 CRIT damiana ts-array5 Connection refused or timed out [23:05:42] 2014/03/08 23:05 CRIT ha-dns-recursor.esi PING CRITICAL - Host Unreachable (ha-dns-recursor.esi) [23:05:42] 2014/03/08 23:05 CRIT ha-sql.esi PING CRITICAL - Host Unreachable (ha-sql.esi) [23:06:41] 2014/03/08 23:05 CRIT damiana DiskSuite Connection refused or timed out [23:06:42] 2014/03/08 23:05 CRIT damiana Environment IPMI Connection refused or timed out [23:06:42] 2014/03/08 23:05 CRIT damiana Free Memory Connection refused or timed out [23:06:42] 2014/03/08 23:05 CRIT damiana Load avg. 
Connection refused or timed out [23:06:42] 2014/03/08 23:05 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [23:06:42] 2014/03/08 23:06 CRIT damiana PING CRITICAL - Host Unreachable (damiana) [23:06:42] 2014/03/08 23:06 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [23:06:43] 2014/03/08 23:06 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [23:06:43] 2014/03/08 23:06 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [23:06:44] 2014/03/08 23:05 CRIT ha-nfs.esi NFS No route to host [23:06:44] 2014/03/08 23:05 CRIT ha-nfs.esi PING CRITICAL - Host Unreachable (ha-nfs.esi) [23:06:45] 2014/03/08 23:05 CRIT ha-proxy.esi HTTP proxy No route to host [23:06:45] 2014/03/08 23:06 CRIT ha-proxy.esi PING CRITICAL - Host Unreachable (ha-proxy.esi) [23:07:41] 2014/03/08 23:07 OK damiana /tmp DISK OK - free space: /tmp 14709 MB (99% inode=99%): [23:07:42] 2014/03/08 23:07 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [23:07:42] 2014/03/08 23:07 OK damiana PING PING OK - Packet loss = 0%, RTA = 0.25 ms [23:07:42] 2014/03/08 23:07 OK damiana ts-array5 2/2 paths are active [23:07:42] 2014/03/08 23:06 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [23:08:42] 2014/03/08 23:07 OK damiana Free Memory OK - 88.1% (7386632 kB) free. [23:08:42] 2014/03/08 23:07 OK damiana Load avg. OK - load average: 1.57, 0.70, 0.27 [23:08:42] 2014/03/08 23:08 ?? 
ha-dns-auth Authoritative DNS check_dns: Invalid hostname/address - ha-dns-auth [23:08:42] 2014/03/08 23:07 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.38 ms [23:09:42] 2014/03/08 23:08 CRIT ha-proxy.esi CRITICAL - Host Unreachable (ha-proxy.esi) [23:09:42] 2014/03/08 21:28 CRIT ha-ldap.esi LDAP Could not bind to the LDAP server [23:10:42] 2014/03/08 23:10 CRIT ha-dns-auth CRITICAL - Host Unreachable (ha-dns-auth) [23:10:42] 2014/03/08 23:10 CRIT ha-dns-recursor.esi CRITICAL - Host Unreachable (ha-dns-recursor.esi) [23:10:42] 2014/03/08 23:10 CRIT ha-ldap.esi CRITICAL - Host Unreachable (ha-ldap.esi) [23:10:42] 2014/03/08 23:10 CRIT ha-sql.esi CRITICAL - Host Unreachable (ha-sql.esi) [23:11:42] 2014/03/08 23:04 CRIT hemlock /home CHECK_NRPE: Socket timeout after 30 seconds. [23:11:42] 2014/03/08 23:04 CRIT hyacinth SMTP CRITICAL - Socket timeout after 10 seconds [23:11:42] 2014/03/08 23:05 CRIT nightshade SMTP CRITICAL - Socket timeout after 10 seconds [23:11:42] 2014/03/08 23:05 CRIT nightshade aliasd Connection refused [23:11:42] 2014/03/08 23:04 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [23:11:43] 2014/03/08 23:05 CRIT ptolemy SMTP CRITICAL - Socket timeout after 10 seconds [23:11:43] 2014/03/08 23:04 CRIT rosemary SMTP CRITICAL - Socket timeout after 10 seconds [23:11:44] 2014/03/08 23:05 CRIT sage SMTP CRITICAL - Socket timeout after 10 seconds [23:11:44] 2014/03/08 23:05 CRIT willow SMTP CRITICAL - Socket timeout after 10 seconds [23:11:45] 2014/03/08 23:04 CRIT wolfsbane toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [23:11:45] 2014/03/08 23:04 CRIT yarrow SMTP CRITICAL - Socket timeout after 10 seconds [23:11:46] 2014/03/08 23:04 CRIT z-dat-s1-b SMTP CRITICAL - Socket timeout after 10 seconds [23:11:46] 2014/03/08 23:04 CRIT z-dat-s2-b SMTP CRITICAL - Socket timeout after 10 seconds [23:11:47] 2014/03/08 23:05 CRIT z-dat-s5-b SMTP CRITICAL - Socket timeout after 10 seconds [23:12:42] 2014/03/08 
23:05 CRIT adenia SMTP CRITICAL - Socket timeout after 10 seconds [23:12:42] 2014/03/08 23:05 CRIT hemlock SMTP CRITICAL - Socket timeout after 10 seconds [23:12:42] 2014/03/08 23:05 ?? nightshade Sun Grid Engine execd Execution timeout exceeded [23:12:42] 2014/03/08 23:05 CRIT ortelius SMTP CRITICAL - Socket timeout after 10 seconds [23:12:42] 2014/03/08 23:05 CRIT thyme SMTP CRITICAL - Socket timeout after 10 seconds [23:12:43] 2014/03/08 23:05 CRIT wolfsbane SMTP CRITICAL - Socket timeout after 10 seconds [23:13:42] 2014/03/08 23:05 CRIT damiana NTP NTP CRITICAL: No response from NTP server [23:13:42] 2014/03/08 23:05 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [23:13:42] 2014/03/08 23:13 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [23:13:42] 2014/03/08 23:05 CRIT ha-nfs.esi NFS Connection refused [23:14:42] 2014/03/08 23:13 OK ha-nfs.esi NFS TCP OK - 0.000 second response time on port 2049 [23:14:42] 2014/03/08 23:13 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.008 second response time [23:14:42] 2014/03/08 23:14 OK willow SMTP SMTP OK - 1.641 sec. response time [23:14:42] 2014/03/08 23:13 WARN wolfsbane toolserver.org HTTP HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.008 second response time [23:15:43] 2014/03/08 23:15 OK ha-dns-auth PING OK - Packet loss = 0%, RTA = 0.27 ms [23:15:43] 2014/03/08 23:15 OK ha-dns-recursor.esi PING OK - Packet loss = 0%, RTA = 0.18 ms [23:15:43] 2014/03/08 23:15 OK ha-ldap.esi PING OK - Packet loss = 0%, RTA = 0.23 ms [23:15:43] 2014/03/08 23:15 OK ha-proxy.esi PING OK - Packet loss = 0%, RTA = 0.13 ms [23:15:43] 2014/03/08 23:15 OK ha-sql.esi PING OK - Packet loss = 0%, RTA = 0.33 ms [23:15:44] 2014/03/08 23:15 OK adenia SMTP SMTP OK - 0.003 sec. response time [23:15:44] 2014/03/08 23:15 OK damiana SMTP SMTP OK - 0.004 sec. 
response time [23:15:45] 2014/03/08 23:15 OK ha-dns-auth Authoritative DNS DNS OK: 0.058 seconds response time. 1.www.toolserver.org returns 91.198.174.203 [23:15:45] 2014/03/08 23:15 OK ha-dns-auth PING PING OK - Packet loss = 0%, RTA = 0.23 ms [23:15:46] 2014/03/08 23:15 OK ha-dns-recursor.esi PING PING OK - Packet loss = 0%, RTA = 0.18 ms [23:15:46] 2014/03/08 23:15 OK ha-proxy.esi PING PING OK - Packet loss = 0%, RTA = 0.19 ms [23:15:47] 2014/03/08 23:15 OK ha-sql.esi PING PING OK - Packet loss = 0%, RTA = 0.15 ms [23:15:47] 2014/03/08 23:15 OK hemlock SMTP SMTP OK - 0.002 sec. response time [23:15:48] 2014/03/08 23:14 OK hyacinth SMTP SMTP OK - 5.696 sec. response time [23:16:42] 2014/03/08 23:15 OK damiana NTP NTP OK: Offset -0.00098 secs [23:16:42] 2014/03/08 23:15 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.22 ms [23:16:42] 2014/03/08 23:15 OK ha-proxy.esi HTTP proxy HTTP OK: HTTP/1.0 302 Moved Temporarily - 1247 bytes in 0.306 second response time [23:16:42] 2014/03/08 23:15 OK hemlock /home DISK OK - free space: /home 16268 MB (32% inode=91%): [23:16:42] 2014/03/08 23:16 OK nightshade aliasd TCP OK - 1.012 second response time on port 984 [500 Not found.] [23:16:43] 2014/03/08 23:15 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.013 second response time [23:16:43] 2014/03/08 23:15 OK wolfsbane toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.007 second response time [23:16:44] 2014/03/08 23:15 OK z-dat-s1-b SMTP SMTP OK - 0.003 sec. response time [23:26:42] 2014/03/08 23:25 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [23:26:56] amette: oh? [23:27:33] multichill: very sorry, reboot happened already :( [23:28:32] I tend to not leave unsaved work open on Toolserver/Toollabs because it might crash at any moment so no problem for me [23:28:48] ok, smart move! glad to hear that! [23:30:40] Weird moment to do your maintenance btw [23:31:04] why do you think so? 
[23:31:42] 2014/03/08 23:31 CRIT ha-ldap.esi CRITICAL - Host Unreachable (ha-ldap.esi) [23:32:42] 2014/03/08 23:25 CRIT nightshade aliasd CRITICAL - Socket timeout after 10 seconds [23:32:57] Saturday night. Most people will be out, not a lot of people to fall back to if something goes wrong [23:34:04] hmmm, yeah, we thought that people will be out and therefore we don't interrupt too many... choose your poison ;) [23:34:53] I meant paid for people or suppliers, volunteers tend to be online when most people are not working [23:37:31] not many paid for people or suppliers that we could/would fall back to for this kind of maintenance, I think... but will take this into consideration the next time [23:38:40] My guess is least activity in the European morning. [23:39:03] On a weekday [23:43:42] 2014/03/08 23:43 OK ha-ldap.esi PING OK - Packet loss = 0%, RTA = 0.15 ms [23:43:43] 2014/03/08 23:43 OK nightshade aliasd TCP OK - 1.008 second response time on port 984 [500 Not found.] [23:44:43] 2014/03/08 23:43 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.16 ms [23:49:59] hello all, is the loss of the tools at http://toolserver.org/~para temporary, permanent or pending movement to labs ? [23:50:42] 2014/03/08 23:49 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [23:50:53] amette nosy : maybe wanna set the topic to reflect the current situation? [23:51:30] NotASpy: it is temporary - but of course we hope that they will be moved soon ;) [23:51:42] Danny_B|webgate: good idea, don't have the rights on this channel though [23:52:05] amette: brilliant. Thanks. 
:-) [23:52:42] 2014/03/08 23:51 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.21 ms [23:55:42] 2014/03/08 23:55 CRIT damiana / Connection refused or timed out [23:55:42] 2014/03/08 23:55 CRIT damiana /tmp Connection refused or timed out [23:55:42] 2014/03/08 23:55 CRIT damiana DiskSuite Connection refused or timed out [23:55:42] 2014/03/08 23:55 CRIT damiana Environment IPMI Connection refused or timed out [23:55:42] 2014/03/08 23:54 CRIT damiana Load avg. Timeout while attempting connection [23:55:42] 2014/03/08 23:55 CRIT damiana PING PING CRITICAL - Packet loss = 100% [23:55:42] 2014/03/08 23:55 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [23:55:43] 2014/03/08 23:55 CRIT damiana SSH CRITICAL - Socket timeout after 10 seconds [23:55:43] 2014/03/08 23:55 CRIT damiana ts-array5 Connection refused or timed out [23:55:44] 2014/03/08 23:55 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [23:55:44] 2014/03/08 23:55 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [23:55:45] 2014/03/08 23:55 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [23:56:43] 2014/03/08 23:55 CRIT damiana Free Memory Connection refused or timed out [23:56:43] 2014/03/08 23:55 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [23:56:43] 2014/03/08 23:55 CRIT ha-ldap.esi PING PING CRITICAL - Packet loss = 100% [23:56:43] 2014/03/08 23:55 CRIT ha-nfs.esi NFS No route to host [23:56:43] 2014/03/08 23:55 CRIT ha-nfs.esi PING CRITICAL - Host Unreachable (ha-nfs.esi) [23:56:44] 2014/03/08 23:55 CRIT ha-proxy.esi HTTP proxy No route to host [23:57:43] 2014/03/08 23:57 OK damiana / DISK OK - free space: / 27140 MB (37% inode=95%): [23:57:43] 2014/03/08 23:57 OK damiana /tmp DISK OK - free space: /tmp 6179 MB (99% inode=99%): [23:57:43] 2014/03/08 23:57 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [23:57:43] 2014/03/08 23:57 OK damiana 
PING PING OK - Packet loss = 0%, RTA = 0.19 ms [23:57:43] 2014/03/08 23:57 OK damiana SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [23:57:44] 2014/03/08 23:57 OK damiana ts-array5 2/2 paths are active [23:58:32] amette: I think everybody can change the topic here. [23:58:42] NotASpy / amette : Didn't that account expire some time ago? [23:58:44] 2014/03/08 23:58 CRIT ha-ldap.esi CRITICAL - Host Unreachable (ha-ldap.esi) [23:58:44] 2014/03/08 23:57 CRIT ha-proxy.esi CRITICAL - Host Unreachable (ha-proxy.esi) [23:58:44] 2014/03/08 23:57 OK damiana Free Memory OK - 86.2% (7222808 kB) free. [23:58:44] 2014/03/08 23:57 OK damiana Load avg. OK - load average: 1.52, 0.72, 0.28 [23:58:44] 2014/03/08 23:57 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.12 ms [23:59:35] multichill / NotASpy : good point... might be... I was just assuming that it's due to the LDAP failure [23:59:42] thanks multichill :) [23:59:51] multichill: was using it earlier this evening (unless someone has changed a link somewhere - I'll check)