[00:00:16] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[00:00:16] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating
[00:00:16] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[00:00:16] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:00:16] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[00:00:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[00:01:06] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.014 second response time
[00:02:16] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown
[00:05:25] Load avg. on yarrow is WARNING: WARNING - load average: 5.76, 12.80, 19.16
[00:06:06] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:06:26] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:06:26] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:06:46] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:06:55] Load avg. on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:07:27] * Magog_the_Ogre kicks the toolserver
[00:07:30] work darn it
[00:07:54] be nice to toolserver or it won't serve you anymore
[00:08:45] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2206.000000
[00:09:06] well that wouldn't be much of a change from the state it's in right now
[00:09:26] Load avg. on yarrow is WARNING: WARNING - load average: 18.80, 17.14, 19.49
[00:11:26] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1172.000000
[00:12:35] s4 replag on z-dat-s5-b is OK: QUERY OK: SELECT ts_rc_age() returned 1340.000000
[00:12:39] What is more depressing is that people running jobs out of cron are better served than SGE users at the moment :-).
[00:13:01] I wonder what's wrong recently... in the last few weeks it has had continuous issues with nfs
[00:27:55] Why does PHPMyAdmin keep 504/502ing?
[00:32:16] Load avg. on nightshade is WARNING: WARNING - load average: 4.59, 12.79, 19.08
[00:32:26] nfs is broken, that's all :D
[00:32:32] the rest then too...
[00:33:45] Load avg. on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:35:09] assuming it will take much more time, I will go to sleep and wish the same to nosy and DaBPunkt ... ;-)
[00:35:28] yes that won't be a mistake
[00:35:31] "Broken" = Can't mount? Or is the file system broken?
[00:35:34] thanks
[00:35:40] fs seems fine
[00:36:18] can there be any program / script running over nfs to have an influence on it?
[00:36:20] while cluster tries to enable the nfs mount even is working
[00:36:25] these are my logs:
[00:36:27] =0, sent=0, dropped=0, active_time=305 secs
[00:36:27] May 11 00:31:59 damiana ntpd[2796]: [ID 702911 daemon.info] peers refreshed
[00:36:27] May 11 00:32:03 damiana nfs: [ID 609386 kern.warning] WARNING: lockd: cannot contact statd (error 4), continuing
[00:36:27] May 11 00:32:03 damiana rpcmod: [ID 650864 kern.warning] WARNING: svc_tli_kcreate: xprt_register failed
[00:36:27] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 837223 daemon.error] NFS daemon /usr/lib/nfs/lockd died. Will
[00:36:28] restart in 100 milliseconds.
[00:36:28] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 530938 daemon.notice] Starting NFS daemon /usr/lib/nfs/lockd.
[00:36:29] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 906922 daemon.notice] Started NFS daemon /usr/lib/nfs/lockd.
[00:36:29] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 530938 daemon.notice] Starting NFS daemon /usr/lib/nfs/mountd
[00:36:30] .
[00:36:30] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 906922 daemon.notice] Started NFS daemon /usr/lib/nfs/mountd.
[00:36:31] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 530938 daemon.notice] Starting NFS daemon /usr/lib/nfs/nfsd.
[00:36:31] May 11 00:32:03 damiana nfssrv: [ID 760318 kern.notice] NOTICE: nfs_server: server was previously quiesced; existing NFSv4 state will be re-used
[00:36:32] May 11 00:32:03 damiana SC[,SUNW.nfs:3.2,nfs,nfs-home,nfs_postnet_stop]: [ID 906922 daemon.notice] Started NFS daemon /usr/lib/nfs/nfsd.
[00:37:43] at the end of the day we'll find out that the issue is a twisted wire in the patch cable... :-/
[00:38:16] MySQL slave on z-dat-s7-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3003
[00:38:16] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1860
[00:38:16] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:38:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 118001
[00:38:36] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:38:45] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1639.000000
[00:38:55] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:38:55] /home on hemlock is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:39:09] Wasn't another NFS mounted on rosemary? Can't this be done for /home? Or are the broken volumes for the cluster hosts (LDAP & Co.)?
[00:39:15] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:39:16] MySQL slave on daphne is OK: Uptime: 1918360 Threads: 49 Questions: 226294425 Slow queries: 415622 Opens: 80649 Flush tables: 1 Open tables: 1916 Queries per second avg: 117.962 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1482
[00:39:16] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:39:36] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:39:36] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:39:46] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:06] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:06] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:06] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:06] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:15] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:16] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:16] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:16] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:25] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:25] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:25] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:40:45] /tmp on nightshade is OK: DISK OK - free space: /tmp 3489 MB (78% inode=99%):
[00:40:45] / on nightshade is OK: DISK OK - free space: / 1593 MB (89% inode=94%):
[00:40:46] Environment IPMI on nightshade is OK: ok: temperature ok fan ok voltage ok chassis ok
[00:40:46] /var on nightshade is OK: DISK OK - free space: /var 9850 MB (73% inode=48%):
[00:40:46] / on yarrow is OK: DISK OK - free space: / 1582 MB (88% inode=94%):
[00:40:47] /var/tmp on yarrow is OK: DISK OK - free space: /var/tmp 827 MB (97% inode=99%):
[00:40:47] SRaid on yarrow is OK: OK md0 status=[UU].
[00:40:47] Sensors on nightshade is OK: sensor ok
[00:40:47] aliasd on nightshade is OK: TCP OK - 0.096 second response time on port 984 [500 Not found.]
[00:40:48] /home on hemlock is OK: DISK OK - free space: /home 13525 MB (26% inode=81%):
[00:40:55] /var/tmp on nightshade is OK: DISK OK - free space: /var/tmp 872 MB (98% inode=99%):
[00:40:56] /tmp on yarrow is OK: DISK OK - free space: /tmp 4085 MB (96% inode=99%):
[00:40:56] Sensors on yarrow is OK: sensor ok
[00:41:06] /var on yarrow is OK: DISK OK - free space: /var 11652 MB (87% inode=96%):
[00:41:06] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok
[00:41:06] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.409 second response time
[00:41:06] aliasd on yarrow is OK: TCP OK - 0.004 second response time on port 984 [500 Not found.]
[00:41:16] Environment IPMI on yarrow is OK: ok: temperature ok fan ok voltage ok chassis ok
[00:41:16] MySQL slave on z-dat-s5-b is OK: Uptime: 145832 Threads: 5 Questions: 235145342 Slow queries: 1443 Opens: 44074 Flush tables: 1 Open tables: 256 Queries per second avg: 1612.439 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[00:41:45] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok
[00:41:46] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.004 second response time
[00:42:15] MySQL slave on z-dat-s7-a is OK: Uptime: 452088 Threads: 8 Questions: 482523797 Slow queries: 6770 Opens: 876138 Flush tables: 1 Open tables: 3368 Queries per second avg: 1067.322 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1674
[00:43:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 513645.000000
[00:43:36] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates).
[00:43:36] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2394702.000000
[00:43:36] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 690591.000000
[00:43:37] scfc_de: the home-volume is directly connected to both head nodes and not to others
[00:43:45] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs
[00:43:46] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3714.000000
[00:43:46] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:43:55] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1505605.000000
[00:43:55] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:43:56] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:43:56] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:43:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:00] worst workaround could be to simply not use the clustering
[00:44:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146)
[00:44:05] / on thyme is UNKNOWN: NRPE: Unable to read output
[00:44:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1344010.000000
[00:44:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:06] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:14] which may cause interruptions while the system is booting
[00:44:15] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:15] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:15] /tmp on thyme is UNKNOWN: NRPE: Unable to read output
[00:44:16] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3706
[00:44:16] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:16] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:17] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:25] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[00:44:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[00:44:26] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 357518.000000
[00:44:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 277026 MB (5% inode=64%):
[00:44:26] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1
[00:44:35] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk
[00:44:36] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output
[00:44:36] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates).
[00:44:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:37] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:46] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:44:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc:
[00:44:46] RAID on thyme is UNKNOWN: NRPE: Unable to read output
[00:44:46] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:46] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 117642.000000
[00:44:46] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 21106
[00:44:47] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL
[00:44:47] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk
[00:44:48] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1228703.000000
[00:44:48] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:56] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:56] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:44:56] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:45:04] WTF??
[00:45:06] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:45:06] /mnt on thyme is UNKNOWN: NRPE: Unable to read output
[00:45:16] SSH on mayapple is CRITICAL: Server answer:
[00:45:16] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown
[00:45:16] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output
[00:45:28] indeed
[00:46:15] "Booting" = couple of minutes every few days?
[00:46:56] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:46:56] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:06] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:06] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:06] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:06] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:16] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:16] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:36] ethernet 0/1/8 [turnera bge1] on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/8:DOWN: 1 int NOK : CRITICAL
[00:47:36] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:36] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:45] ethernet 0/1/6 [turnera bge0] on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/6:DOWN: 1 int NOK : CRITICAL
[00:47:45] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:45] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:46] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:46] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:46] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:46] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:47] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:47] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:48] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:48] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:49] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:49] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:50] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:55] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:47:56] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[00:48:36] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs
[00:51:15] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 58359 MB (9% inode=99%):
[00:51:25] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 117350.000000
[00:51:35] NTP on cassia is CRITICAL: NTP CRITICAL: Offset 10.05665 secs
[00:52:16] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:52:26] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:26] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:36] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:36] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:36] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:45] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:52:55] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:52:55] /home on hemlock is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:05] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:05] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:06] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:06] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:15] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:16] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:16] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:16] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:16] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:16] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[00:53:26] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[01:02:16] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown
[01:07:26] /home on hemlock is OK: DISK OK - free space: /home 13515 MB (26% inode=81%):
[01:07:55] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[01:08:15] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[01:08:45] ok workaround works
[01:08:45] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3597.000000
[01:08:54] seems to be back
[01:09:16] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3568
[01:09:16] Load avg. on nightshade is WARNING: WARNING - load average: 0.08, 6.16, 19.17
[01:13:16] Load avg. on nightshade is OK: OK - load average: 0.12, 2.80, 14.81
[01:16:13] how did it happen that site_stats.ss_good_articles is different from what I see on wiki’s Special:Statistics?
[01:16:56] the difference is unstable and is around 88 now
[01:17:13] (waiting for ruwiki’s millionth article)
[01:17:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[01:18:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[01:20:11] replag? script not running?
[01:20:35] replag is certainly zero, as I’m watching recentchanges in parallel
[01:20:38] which script?
[01:20:44] anyways ts seems to work somehow - having some sleep
[01:21:26] can the replicated data be inconsistent?
[01:22:49] it might be
[01:22:49] I don't know the script, thus not how it works
[01:23:07] anyway it's 3:30 my time, children will wake me up in 3 hours - I'll go get some sleep
[01:23:12] see you soon...
[01:23:36] Good night.
[01:26:04] DaB. * Re: [Toolserver-l] Maintenance: Solaris Updates
[01:27:35] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[01:28:37] (okay, I rewrote my watching script to use API)
[07:41:56] Free Memory on damiana is WARNING: WARNING - 5.2% (433516 kB) free!
[07:43:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 538242.000000
[07:43:46] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:43:55] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1530249.000000
[07:43:55] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:43:56] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:43:56] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:43:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:05] / on thyme is UNKNOWN: NRPE: Unable to read output
[07:44:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146)
[07:44:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1368372.000000
[07:44:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:06] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:16] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:16] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:17] /tmp on thyme is UNKNOWN: NRPE: Unable to read output
[07:44:17] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:17] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:17] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:26] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[07:44:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[07:44:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 276194 MB (5% inode=64%):
[07:44:26] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1
[07:44:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk
[07:44:36] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output
[07:44:37] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates).
[07:44:37] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:37] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates).
[07:44:37] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 715143.000000
[07:44:37] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:46] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:44:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc:
[07:44:46] RAID on thyme is UNKNOWN: NRPE: Unable to read output
[07:44:47] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:47] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 128330.000000
[07:44:56] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:56] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:44:56] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:45:06] /mnt on thyme is UNKNOWN: NRPE: Unable to read output
[07:45:06] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:45:16] SSH on mayapple is CRITICAL: Server answer:
[07:45:16] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown
[07:45:16] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output
[07:45:36] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2419861.000000
[07:45:46] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk
[07:45:46] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1253964.000000
[07:45:47] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:48:36] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs
[07:50:46] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs
[07:51:15] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 57488 MB (9% inode=99%):
[08:05:37] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25204.000000
[08:07:56] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[08:08:16] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates).
[08:17:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[08:18:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[08:23:06] Free Memory on damiana is CRITICAL: CRITICAL - 3.9% (323948 kB) free!
[08:27:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[08:29:56] Free Memory on damiana is WARNING: WARNING - 5.6% (467208 kB) free!
[08:30:55] Free Memory on damiana is CRITICAL: CRITICAL - 4.5% (374524 kB) free!
[08:33:56] Free Memory on damiana is WARNING: WARNING - 5.3% (446500 kB) free!
[08:43:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 541842.000000
[08:43:46] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:43:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1533850.000000
[08:43:56] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:43:56] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:43:56] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:43:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:06] / on thyme is UNKNOWN: NRPE: Unable to read output
[08:44:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146)
[08:44:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1371972.000000
[08:44:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:07] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:16] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:16] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:16] /tmp on thyme is UNKNOWN: NRPE: Unable to read output
[08:44:16] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:16] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:17] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:26] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[08:44:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[08:44:26] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1
[08:44:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 276068 MB (5% inode=64%):
[08:44:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk
[08:44:37] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output
[08:44:37] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates).
[08:44:37] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:37] NTP on cassia is OK: NTP OK: Offset -0.006517 secs
[08:44:37] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates).
[08:44:38] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 718743.000000
[08:44:38] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:46] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:44:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc:
[08:44:46] RAID on thyme is UNKNOWN: NRPE: Unable to read output
[08:44:55] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:55] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:44:55] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:45:09] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [08:45:09] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:45:15] SSH on mayapple is CRITICAL: Server answer: [08:45:16] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [08:45:16] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [08:45:36] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2423461.000000 [08:45:46] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:45:46] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 129966.000000 [08:45:46] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [08:45:46] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1257564.000000 [08:45:46] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:48:36] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [08:50:46] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [08:51:16] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 57331 MB (9% inode=99%): [08:57:56] Free Memory on damiana is CRITICAL: CRITICAL - 4.7% (392460 kB) free! [09:05:36] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28804.000000 [09:07:56] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:08:16] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:13:16] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [09:17:55] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:18:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:18:56] Free Memory on damiana is WARNING: WARNING - 5.4% (451132 kB) free! 
[09:19:15] @replag [09:19:30] hmm [09:20:56] Free Memory on damiana is CRITICAL: CRITICAL - 4.4% (368748 kB) free! [09:27:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [09:29:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 130990 [09:29:55] Free Memory on damiana is WARNING: WARNING - 5.7% (480220 kB) free! [09:31:16] MySQL slave on z-dat-s5-b is OK: Uptime: 177632 Threads: 5 Questions: 323834681 Slow queries: 1608 Opens: 79835 Flush tables: 1 Open tables: 256 Queries per second avg: 1823.64 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [09:38:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 131115 [09:43:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 545442.000000 [09:43:46] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:43:55] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1537450.000000 [09:43:55] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:43:55] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:43:55] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:43:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:44:05] / on thyme is UNKNOWN: NRPE: Unable to read output [09:44:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [09:44:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1375572.000000 [09:44:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:44:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:44:07] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:44:15] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:44:16] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [09:44:16] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:44:16] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:44:16] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:44:17] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:44:25] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:44:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:44:26] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [09:44:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 275926 MB (5% inode=64%): [09:44:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [09:44:46] RAID on thyme is UNKNOWN: NRPE: Unable to read output [09:44:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [09:44:59] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:44:59] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:44:59] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:45:06] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [09:45:06] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:45:16] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [09:45:16] SSH on mayapple is CRITICAL: Server answer: [09:45:16] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [09:45:36] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [09:45:36] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [09:45:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:45:36] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [09:45:36] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2427061.000000 [09:45:37] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 722403.000000 [09:45:37] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:45:45] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:45:45] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 131133.000000 [09:45:46] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [09:45:46] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1261163.000000 [09:45:46] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:47:45] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [09:48:36] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [09:50:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:50:46] HTTP proxy on ha-proxy.esi is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:52:45] HTTP proxy on ha-proxy.esi is OK: HTTP OK: HTTP/1.0 302 Moved Temporarily - 1114 bytes in 7.916 second response time [09:54:56] Free Memory on damiana is CRITICAL: CRITICAL - 3.0% (253836 kB) free! [10:03:46] NFS server ha-nfs.esi not responding still trying [10:12:42] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32822.000000 [10:12:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 131786.000000 [10:12:49] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:13:30] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [10:16:50] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [10:17:36] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 57202 MB (9% inode=99%): [10:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [10:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [10:18:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 391442.000000 [10:18:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:21:19] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:21:20] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[10:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [10:22:05] sigh [10:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [10:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 549056.000000 [10:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:00] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1541054.000000 [10:44:10] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:10] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [10:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [10:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1379180.000000 [10:44:11] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:12] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:12] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:19] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:19] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:20] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:44:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:44:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [10:44:29] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[10:44:29] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [10:44:39] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 275798 MB (5% inode=64%): [10:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [10:44:40] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [10:44:49] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [10:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [10:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:45:10] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [10:45:30] SSH on mayapple is CRITICAL: Server answer: [10:45:30] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [10:45:30] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [10:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [10:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[10:45:50] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 726011.000000 [10:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [10:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:45:50] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 134093.000000 [10:45:59] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1264771.000000 [10:45:59] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [10:46:09] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2430692.000000 [10:47:59] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [10:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [10:50:26] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [11:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36372.000000 [11:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [11:12:20] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[11:12:39] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36422.000000 [11:12:40] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 133159.000000 [11:12:50] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [11:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [11:17:49] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [11:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [11:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 395463.000000 [11:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [11:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 552656.000000 [11:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:00] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1544654.000000 [11:44:10] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:10] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [11:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [11:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1382780.000000 [11:44:11] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:44:12] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[11:44:12] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:44:19] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:19] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:44:19] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:44:20] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [11:44:20] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:29] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:44:29] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:44:39] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 275514 MB (5% inode=64%): [11:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [11:44:40] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [11:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [11:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [11:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[11:45:11] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [11:45:30] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [11:45:30] SSH on mayapple is CRITICAL: Server answer: [11:45:30] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [11:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [11:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [11:45:50] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 729611.000000 [11:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [11:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 137693.000000 [11:45:59] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1268372.000000 [11:46:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[11:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [11:46:09] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2434291.000000 [11:47:59] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [11:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [11:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [12:07:50] / on willow is WARNING: DISK WARNING - free space: / 17080 MB (16% inode=98%): [12:08:00] /tmp on willow is WARNING: DISK WARNING - free space: / 17080 MB (16% inode=98%): [12:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39972.000000 [12:12:13] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [12:12:20] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[12:12:39] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 40022.000000 [12:12:39] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 134655.000000 [12:12:49] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [12:15:00] /tmp on willow is CRITICAL: DISK CRITICAL - free space: / 11503 MB (10% inode=98%): [12:15:50] / on willow is CRITICAL: DISK CRITICAL - free space: / 10643 MB (10% inode=97%): [12:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [12:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [12:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [12:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [12:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 399062.000000 [12:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [12:36:50] / on willow is OK: DISK OK - free space: / 24850 MB (23% inode=99%): [12:37:00] /tmp on willow is OK: DISK OK - free space: / 24850 MB (23% inode=99%): [12:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 556256.000000 [12:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:00] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1548255.000000 [12:44:09] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:09] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:09] / on thyme is UNKNOWN: NRPE: Unable to read output [12:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [12:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1386380.000000
[12:44:10] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:10] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:19] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:19] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:20] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:44:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:44:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [12:44:30] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [12:44:30] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[12:44:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 296470 MB (5% inode=64%): [12:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [12:44:40] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [12:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [12:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [12:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:45:10] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [12:45:30] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [12:45:30] SSH on mayapple is CRITICAL: Server answer: [12:45:30] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [12:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [12:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [12:45:50] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 733211.000000 [12:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [12:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 141293.000000 [12:45:59] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1271972.000000 [12:46:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [12:46:09] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2437891.000000 [12:47:39] SMTP on wolfsbane is OK: SMTP OK - 0.005 sec. response time [12:47:49] / on willow is WARNING: DISK WARNING - free space: / 17491 MB (16% inode=98%): [12:48:00] /tmp on willow is WARNING: DISK WARNING - free space: / 17491 MB (16% inode=98%): [12:48:00] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [12:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [12:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [13:06:50] / on willow is OK: DISK OK - free space: / 26242 MB (25% inode=99%): [13:06:59] /tmp on willow is OK: DISK OK - free space: / 26242 MB (25% inode=99%): [13:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43572.000000 [13:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [13:12:19] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[13:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [13:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [13:17:49] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [13:20:40] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 44101.000000 [13:20:40] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 137176.000000 [13:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [13:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [13:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 402662.000000 [13:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [13:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 559856.000000 [13:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:43:59] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1551855.000000 [13:44:09] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:09] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:09] / on thyme is UNKNOWN: NRPE: Unable to read output [13:44:09] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [13:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1389980.000000 [13:44:10] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:10] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:44:19] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[13:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:20] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:44:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [13:44:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:44:29] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [13:44:30] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [13:44:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 296131 MB (5% inode=64%): [13:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [13:44:50] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [13:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [13:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [13:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[13:45:10] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [13:45:30] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [13:45:30] SSH on mayapple is CRITICAL: Server answer: [13:45:30] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [13:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [13:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [13:45:49] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 736811.000000 [13:45:49] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [13:45:49] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:45:50] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 144893.000000 [13:45:59] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1275571.000000 [13:45:59] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:45:59] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [13:46:09] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2441491.000000 [13:47:59] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [13:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [13:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [14:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 47172.000000 [14:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [14:12:19] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [14:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [14:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [14:20:40] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 47702.000000 [14:20:40] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 138309.000000 [14:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [14:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:24:49] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [14:25:30] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 56160 MB (9% inode=99%): [14:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 406262.000000 [14:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [14:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 563456.000000 
[14:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:43:59] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1555455.000000 [14:44:09] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:10] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [14:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [14:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1393580.000000 [14:44:11] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:20] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:20] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:44:20] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [14:44:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:44:30] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [14:44:30] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[14:44:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 295614 MB (5% inode=64%): [14:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [14:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [14:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [14:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:45:10] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:45:19] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [14:45:29] SSH on mayapple is CRITICAL: Server answer: [14:45:29] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [14:45:29] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [14:45:39] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [14:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [14:45:40] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [14:45:49] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 740411.000000 [14:45:49] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [14:45:49] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:45:50] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 148493.000000 [14:46:00] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1279172.000000 [14:46:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [14:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2445092.000000 [14:48:00] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [14:49:09] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [14:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [15:00:03] Hmmm. tslogbot stopped at 0:45Z. Any ETA on when the Linux queues will be enabled again? [15:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50772.000000 [15:12:09] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [15:12:19] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[15:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [15:17:50] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [15:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [15:20:39] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 51301.000000 [15:20:39] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 138371.000000 [15:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 409863.000000 [15:34:30] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [15:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 567056.000000 [15:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:43:59] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1559054.000000 [15:44:09] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:10] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [15:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [15:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1397180.000000 [15:44:10] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:20] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:44:20] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:44:20] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [15:44:20] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:44:30] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [15:44:30] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [15:44:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 295221 MB (5% inode=64%): [15:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [15:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [15:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [15:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:45:00] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:45:10] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [15:45:29] SSH on mayapple is CRITICAL: Server answer: [15:45:29] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [15:45:30] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [15:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [15:45:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [15:45:40] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [15:45:49] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 744011.000000 [15:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [15:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:45:51] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 152093.000000 [15:45:59] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1282772.000000 [15:46:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [15:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2448691.000000 [15:48:00] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [15:49:09] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [15:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:11:44] This channel is also logged by wm-bot at http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-toolserver/20130511.txt, so the answer seems to be: No estimate or anything since tonight. *sigh* [16:11:49] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 54372.000000 [16:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [16:12:20] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [16:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:18:24] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [16:18:24] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [16:25:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 413463.000000 [16:32:53] sge broken on nightshade and yarrow ? 
other hosts are overloaded and jobs never start [16:34:40] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [16:37:20] "qstat -u \* | wc -l": 657 :-) [16:38:57] nightshade load average: 0.10, 0.07, 0.06 [16:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 570656.000000 [16:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:09] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1562660.000000 [16:44:11] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [16:44:11] / on thyme is UNKNOWN: NRPE: Unable to read output [16:44:11] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1400780.000000 [16:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:19] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:20] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:20] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:20] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:20] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:21] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:44:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [16:44:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[16:44:40] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [16:44:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 294883 MB (5% inode=64%): [16:44:40] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [16:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc [16:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [16:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:45:10] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [16:45:20] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:45:40] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [16:45:40] SSH on mayapple is CRITICAL: Server answer: [16:45:40] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [16:45:40] /sql on z-dat-s1-b is WARNING: DISK WARNING - free space: /sql 81973 MB (8% inode=99%): [16:45:41] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [16:45:41] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[16:45:41] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [16:45:49] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 747611.000000 [16:45:49] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [16:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:45:51] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 155695.000000 [16:45:59] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [16:46:09] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:46:10] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1286383.000000 [16:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2452294.000000 [16:48:10] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [16:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [16:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:56:41] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2171 [17:02:06] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2201.000000 [18:19:03] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [18:19:03] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [18:20:40] MySQL slave on daphne is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 7208 [18:21:10] s4 replag on daphne is CRITICAL: QUERY CRITICAL: 
SELECT ts_rc_age() returned 7202.000000 [18:21:49] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:34:40] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [18:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 577856.000000 [18:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:44:10] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1569860.000000 [18:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [18:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [18:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1407980.000000 [18:44:10] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:44:10] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:44:24] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:44:24] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:44:24] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:44:25] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:44:25] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:44:25] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:44:25] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[18:44:49] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [18:44:49] RAID on thyme is UNKNOWN: NRPE: Unable to read output [18:44:59] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:45:10] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [18:45:20] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:45:20] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [18:45:20] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:45:40] SSH on mayapple is CRITICAL: Server answer: [18:45:40] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [18:45:40] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [18:45:40] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [18:45:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [18:45:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 294122 MB (5% inode=64%): [18:45:41] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [18:45:42] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[18:45:42] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [18:45:50] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [18:45:50] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 754811.000000 [18:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [18:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 162895.000000 [18:45:59] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [18:46:10] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:46:10] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1293583.000000 [18:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2459494.000000 [18:48:10] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [18:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [18:50:19] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:03:20] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2866 [19:03:50] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2877.000000 [19:10:40] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3585 [19:11:10] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3419.000000 [19:11:49] s4 replag on z-dat-s5-b is CRITICAL: QUERY 
CRITICAL: SELECT ts_rc_age() returned 65172.000000 [19:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [19:12:20] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [19:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:17:50] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [19:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [19:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [19:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [19:24:10] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1800.000000 [19:24:39] MySQL slave on daphne is OK: Uptime: 1985882 Threads: 17 Questions: 229491745 Slow queries: 431015 Opens: 81393 Flush tables: 1 Open tables: 1923 Queries per second avg: 115.561 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1727 [19:34:40] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [19:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 581456.000000 [19:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:09] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1573461.000000 [19:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [19:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [19:44:11] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1411580.000000 [19:44:11] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:20] Load avg. 
on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:20] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:20] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:44:20] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:20] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:21] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [19:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [19:44:59] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:11] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:45:11] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:45:11] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [19:45:21] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [19:45:21] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:40] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [19:45:40] SSH on mayapple is CRITICAL: Server answer: [19:45:40] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [19:45:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 293796 MB (5% inode=64%): [19:45:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [19:45:41] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [19:45:41] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [19:45:41] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [19:45:42] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [19:45:49] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 758411.000000 [19:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [19:45:50] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [19:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:51] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:45:51] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 166495.000000 [19:46:00] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [19:46:10] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:46:10] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1297184.000000 [19:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2463093.000000 [19:48:10] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [19:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [19:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:54:49] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [19:55:40] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 54652 MB (8% inode=99%): [20:03:19] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2931 [20:03:50] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2939.000000 [20:11:49] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 68773.000000 [20:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [20:12:19] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [20:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:17:49] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [20:17:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [20:21:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [20:21:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [20:34:10] Free Memory on damiana is WARNING: WARNING - 6.7% (562664 kB) free! [20:34:40] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [20:35:10] Free Memory on damiana is OK: OK - 7.2% (605220 kB) free. 
[20:42:10] Free Memory on damiana is WARNING: WARNING - 6.3% (528012 kB) free! [20:43:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 585056.000000 [20:43:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:10] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1577061.000000 [20:44:10] / on thyme is UNKNOWN: NRPE: Unable to read output [20:44:10] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [20:44:10] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1415180.000000 [20:44:10] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:44:10] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:44:19] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:44:19] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:44:19] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:44:20] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:20] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:20] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:20] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:44:50] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [20:44:50] RAID on thyme is UNKNOWN: NRPE: Unable to read output [20:45:00] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:45:10] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:45:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:45:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:45:10] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:45:20] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [20:45:20] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:45:20] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [20:45:20] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:45:40] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [20:45:40] SSH on mayapple is CRITICAL: Server answer: [20:45:40] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [20:45:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 293459 MB (5% inode=64%): [20:45:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:45:40] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [20:45:41] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [20:45:41] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:45:50] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 762011.000000 [20:45:50] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [20:45:50] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [20:45:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:45:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:45:50] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:45:51] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 170095.000000 [20:45:59] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [20:46:09] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1300784.000000 [20:46:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2466693.000000 [20:46:10] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:46:50] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [20:48:09] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [20:49:10] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [20:50:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [20:54:49] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [20:55:40] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 54359 MB (8% inode=99%): [21:00:03] Cyberpower678: Pipe it over SSH [21:00:29] https://wiki.toolserver.org/view/Database_access#External_access [21:02:09] Free Memory on damiana is CRITICAL: CRITICAL - 1.9% (163032 kB) free! [21:03:19] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2161 [21:03:49] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2156.000000 [21:11:50] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 72373.000000 [21:12:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [21:12:19] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[21:13:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:32:01] is there some major issue with TS or something? [21:38:56] Yes, and the roots are AWOL. [22:35:49] Hello all. What's going on with my cluster? [22:37:57] DaBPunkt: it's not working :'( [22:38:18] I see that [22:38:53] My ssh session just froze :-( [22:42:00] Looks like damiana is down. I can not reach it by ssh and the console just returns empty lines. I will hard-reboot it [22:47:45] ok, I have a console again. Problem now is that I have no idea how nosy configured the nfs-service yesterday [22:50:46] DaBPunkt, there's a request at https://jira.toolserver.org/browse/ACCAPP-644 that a user asked me about - do you know if they need to submit more data? [22:50:58] not now [22:51:23] oh didn't see scrollback [22:51:35] greenrosetta, looks like there are bigger issues on the toolserver atm (like it being down!) [22:54:58] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 78557.000000 [22:54:58] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [22:54:58] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [22:54:58] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 147011.000000 [22:54:58] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 293356 MB (5% inode=64%): [22:55:05] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [22:55:05] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [22:55:05] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [22:55:05] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[22:55:05] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [22:55:05] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:55:15] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:55:15] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:55:15] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:55:22] Ooooh, good sign! [22:55:25] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:55:25] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:55:25] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:55:50] I'm getting 404s for my pages, tho [22:59:48] what happened? My account have been blocked even if I did the renewal procedure and it appear (in the ssh login message) that it would have expired in october 2013... [23:00:05] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3227 [23:00:05] SSH on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:00:15] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3427.000000 [23:00:23] fale_: Some TS problems, it appears. [23:00:25] Sun Grid Engine execd on nightshade is UNKNOWN: Execution timeout exceeded [23:00:35] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:00:45] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 0.03, 12.24, 39.36 [23:00:45] MySQL slave on z-dat-s7-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3141 [23:00:45] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3067 [23:00:55] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2253 [23:00:55] Load avg. on yarrow is CRITICAL: CRITICAL - load average: 3.06, 8.17, 20.96 [23:00:55] MySQL slave on z-dat-s4-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1901 [23:00:55] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 437177.000000 [23:01:45] Matthew_: I see... I was preoccupied since: crontab fails to change directories, my ssh account is locked and my ~public_html returns 404 [23:01:55] MySQL slave on z-dat-s6-a is OK: Uptime: 533463 Threads: 4 Questions: 513986634 Slow queries: 17828 Opens: 928558 Flush tables: 1 Open tables: 3157 Queries per second avg: 963.490 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1329 [23:01:55] Load avg. on yarrow is WARNING: WARNING - load average: 3.10, 7.25, 19.84 [23:01:55] MySQL slave on z-dat-s4-a is OK: Uptime: 703470 Threads: 1 Questions: 25735856 Slow queries: 8 Opens: 77 Flush tables: 1 Open tables: 70 Queries per second avg: 36.584 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 968 [23:02:20] fale: Yeah, I understand. I'm experiencing the same problems, it appears DaB. is working on it actively. 
[23:02:56] Matthew_: thanks a lot :) I did not find anything on the ml (maybe I'm blind) and I thought it was only my problem ;) [23:03:05] MySQL slave on z-dat-s3-a is OK: Uptime: 533359 Threads: 8 Questions: 732664369 Slow queries: 15395 Opens: 5302175 Flush tables: 1 Open tables: 16384 Queries per second avg: 1373.679 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1220 [23:03:15] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1307.000000 [23:03:45] MySQL slave on z-dat-s7-a is OK: Uptime: 532577 Threads: 4 Questions: 493178031 Slow queries: 7713 Opens: 945842 Flush tables: 1 Open tables: 3470 Queries per second avg: 926.22 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1302 [23:03:45] MySQL slave on daphne is OK: Uptime: 1999029 Threads: 5 Questions: 230264530 Slow queries: 433413 Opens: 81469 Flush tables: 1 Open tables: 1923 Queries per second avg: 115.188 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 892 [23:04:39] fale: Yeah, I didn't get anything until a minute ago. [23:05:55] SSH on nightshade is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [23:07:55] Load avg. on yarrow is OK: OK - load average: 3.01, 4.36, 14.46 [23:11:44] Load avg. on nightshade is WARNING: WARNING - load average: 1.02, 2.18, 19.75 [23:11:45] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3539 [23:12:14] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3462.000000 [23:14:55] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:44] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [23:15:46] Hi! Seems that the https://wiki.toolserver.org/view/GeoHack doesn't work at the moment [23:16:45] Load avg. on nightshade is OK: OK - load average: 1.03, 1.45, 14.57 [23:17:06] rhinux: Yes, DaB. is working on Toolserver. Everything's broken... 
[23:17:10] I mean queries like http://toolserver.org/%7Egeohack/en/52_30_N_13_23_E_type:city(3431700)?title=Berlin give back 404 err page [23:17:14] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [23:17:34] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [23:18:07] rhinux: Yep... that's what they're trying to fix [23:18:40] DaB? I only have seen that status.* says ok, so I wondered [23:19:10] Yeah, he was just in here, and said he was looking at it. status.toolserver.org is /never/ updated... [23:20:11] We should probably replace it with a redirect to Nagios. [23:20:23] Yeah [23:20:27] Or a bot to auto-update. [23:20:43] what happened? someone spilled Mate over the servers? ;) [23:20:54] Well, Nagios is pretty close to a bot :-). [23:21:19] Yeah, but it's not always human-friendly. It's programmer-friendly :) [23:21:45] MySQL slave on rosemary is OK: Uptime: 516395 Threads: 4 Questions: 318288726 Slow queries: 233513 Opens: 29798 Flush tables: 1 Open tables: 3414 Queries per second avg: 616.366 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1783 [23:22:14] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1684.000000 [23:22:27] Probably somewhere in Nagios you can define a "dashboard for dummies", but I don't know. [23:23:34] @scfc_de D4D: "dashboard for dummies" nice term :) [23:40:58] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:41:28] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [23:49:58] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:25] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 596483.000000 [23:54:25] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). 
[23:54:25] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [23:54:25] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 773320.000000 [23:54:25] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 82121.000000 [23:54:25] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 181396.000000 [23:54:25] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [23:54:25] RAID on thyme is UNKNOWN: NRPE: Unable to read output [23:54:25] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [23:54:25] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [23:54:25] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [23:54:25] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [23:54:40] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:54:52] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:54:52] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [23:54:52] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:52] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:52] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:54:52] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[23:54:52] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [23:54:52] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:52] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:52] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:52] SSH on mayapple is CRITICAL: Server answer: [23:54:52] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [23:54:52] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [23:54:52] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [23:55:18] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 293356 MB (5% inode=64%): [23:55:18] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [23:55:18] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [23:55:18] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [23:55:18] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [23:55:18] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [23:55:19] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:55:19] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [23:55:20] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2478046.000000 [23:55:28] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[23:55:28] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:55:28] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:55:28] / on thyme is UNKNOWN: NRPE: Unable to read output [23:55:28] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [23:55:28] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1426661.000000 [23:55:29] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:55:38] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:55:38] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [23:55:38] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [23:55:47] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:55:47] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:55:57] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:56:07] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.