[00:02:19] phe: fix it [00:04:05] Betacommand: That's helpful. [00:04:28] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 248907 MB (4% inode=67%): [00:04:43] Susan: having a process use 14 GB of RAM means the program is broken [00:04:50] No shit. [00:04:58] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [00:05:19] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [00:05:20] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [00:05:38] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:05:38] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [00:06:08] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 117886 MB (19% inode=99%): [00:06:47] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:06:49] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [00:09:52] phe: should your program be notified by a signal or killed on high memory usage? [00:10:07] Betacommand, I fixed it by setting ulimit [00:10:16] that's not the trouble [00:10:41] phe: what caused your program to go crazy and eat that much ram? [00:10:47] what's the purpose of virtual_free=64M if it allows 14 GB?
[00:11:18] aliasd on mayapple is CRITICAL: Connection refused [00:11:32] Betacommand, a corrupted input file and a poorly written application; I can do little about that, except that I reported the trouble upstream [00:12:27] phe: it is for the scheduler to choose the right host, and there is also a script that can kill jobs if memory on the host is low. but linux allocates too much unused memory, so it is only active on solaris atm [00:13:29] Merlissimo, ok, ty, I see activating it on linux will break many bots ;( anyway I worked around that [00:13:42] but you can use h_vmem at sge submit instead of ulimit in your script [00:14:19] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:14:39] e.g. my java uses 700 MB on solaris, but 1.3 GB on linux (although max heap memory is set as java args) [00:15:57] phe: e.g. qcronsub -l arch=lx -l h_rt=INFINITY -l virtual_free=64M -l h_vmem=70M -notify [00:16:53] hmm sge doc says h_vmem by default is 2 GB [00:17:50] no, it is unlimited. it is not possible to start java or mono if set [00:19:25] it should be set at a high value, e.g. 4 GB, but with unlimited, nightshade was nearly dead for 20 minutes [00:22:01] i was not able to start java when set to 16 GB [00:23:18] (a simple hello world program) [00:32:19] night ts [00:35:18] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [00:47:48] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2145 [00:52:27] Free Memory on turnera is CRITICAL: CRITICAL - 1.7% (145136 kB) free! [00:58:19] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
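Editor's note: the ulimit workaround discussed above can be sketched in Python. This is only an illustration, not the script phe actually used. It assumes a Unix host and caps the child's virtual address space with RLIMIT_AS, which is what `ulimit -v` does in a shell. The helper name `run_with_memory_cap` and the 1 GiB figure are hypothetical.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, max_bytes):
    """Run argv with its virtual address space capped at max_bytes.

    Roughly what `ulimit -v` does in a shell before starting a job.
    """
    def apply_limit():
        # Executed in the child process just before exec(); sets both the
        # soft and the hard limit so the job cannot raise it again.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


# Hypothetical usage: a job that tries to grab 4 GiB under a 1 GiB cap
# should fail with a MemoryError instead of exhausting the host.
# result = run_with_memory_cap(
#     [sys.executable, "-c", "x = bytearray(4 * 1024**3)"], 1024**3)
# print(result.returncode)
```

As the chat points out, a cap on address space can stop runtimes such as Java or Mono from starting at all, so the limit has to be chosen per program.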
[01:04:28] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 249812 MB (4% inode=67%): [01:04:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [01:05:38] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:05:39] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [01:06:08] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 117520 MB (19% inode=99%): [01:06:18] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [01:06:19] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [01:06:49] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [01:06:49] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [01:09:57] gar [01:11:18] aliasd on mayapple is CRITICAL: Connection refused [01:15:18] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:33:48] MySQL slave on z-dat-s2-b is OK: Uptime: 15752 Threads: 8 Questions: 20135695 Slow queries: 249 Opens: 177337 Flush tables: 1 Open tables: 257 Queries per second avg: 1278.294 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1638 [01:35:18] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [01:53:28] Free Memory on turnera is CRITICAL: CRITICAL - 2.0% (169260 kB) free! 
[01:58:18] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:04:58] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [02:05:28] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 249540 MB (4% inode=67%): [02:05:38] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:06:08] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 115512 MB (19% inode=99%): [02:06:23] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [02:06:23] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [02:06:39] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [02:06:48] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:06:49] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [02:11:18] aliasd on mayapple is CRITICAL: Connection refused [02:15:18] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:35:19] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [02:53:28] Free Memory on turnera is CRITICAL: CRITICAL - 2.0% (163880 kB) free! [02:58:18] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[03:04:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [03:05:39] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:06:08] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 117067 MB (19% inode=99%): [03:06:18] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [03:06:18] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [03:06:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 248579 MB (4% inode=67%): [03:06:38] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [03:06:48] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:06:49] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [03:11:19] aliasd on mayapple is CRITICAL: Connection refused [03:15:18] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:35:19] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [03:53:28] Free Memory on turnera is CRITICAL: CRITICAL - 2.0% (170692 kB) free! [03:58:18] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:04:59] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [04:05:38] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:06:08] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 116901 MB (19% inode=99%): [04:06:18] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [04:06:18] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [04:06:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 247803 MB (4% inode=67%): [04:06:48] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:06:49] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [04:07:38] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [04:11:18] aliasd on mayapple is CRITICAL: Connection refused [04:15:18] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:35:18] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [04:53:39] Free Memory on turnera is CRITICAL: CRITICAL - 2.1% (174192 kB) free! [04:58:18] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:07:04] ummmmm [05:08:09] anyone else having issues? [05:36:31] With what? [05:37:27] webserver [05:37:42] Yeah, seems broken. [05:37:44] https://toolserver.org/~legoktm/enwiki_removals.txt gives me a 404 [05:37:46] https://toolserver.org/~mzmcbride/watcher/ isn't loading for me. 
[05:37:53] Did you e-mail ts-admins? [05:37:56] Or the mailing list? [05:38:00] Or file a ticket in JIRA? [05:38:02] No… :/ [05:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [05:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [05:46:32] aliasd on mayapple is CRITICAL: Connection refused [05:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [05:47:42] Free Memory on turnera is OK: OK - 91.9% (7702988 kB) free. 
[05:51:42] wikidata replag on z-dat-s2-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2964.000000 [05:52:02] s1 replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2526.000000 [05:52:02] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2980.000000 [05:52:02] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2605.000000 [05:52:02] wikidata replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2984.000000 [05:52:12] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3031.000000 [05:52:22] wikidata replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2997.000000 [05:52:22] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [05:52:22] MySQL slave on thyme is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2516 [05:52:32] wikidata replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3009.000000 [05:52:33] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2599 [05:52:33] aliasd on yarrow is CRITICAL: Connection refused [05:52:33] wikidata replag on z-dat-s7-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3014.000000 [05:52:33] wikidata replag on z-dat-s6-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3013.000000 [05:52:42] aliasd on nightshade is CRITICAL: Connection refused [05:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[06:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3632.000000 [06:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3609.000000 [06:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3613.000000 [06:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3613.000000 [06:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3625.000000 [06:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3641.000000 [06:03:03] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3643.000000 [06:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3657.000000 [06:06:42] Sun Grid Engine execd on yarrow is UNKNOWN: Execution timeout exceeded [06:07:23] SSH on yarrow is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:31:42] SSH on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:31:52] Sun Grid Engine execd on nightshade is UNKNOWN: Execution timeout exceeded [06:36:12] SSH on yarrow is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [06:36:23] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [06:43:52] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [06:44:02] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1786.000000 [06:44:22] MySQL slave on thyme is OK: Uptime: 29023 Threads: 9 Questions: 15782587 Slow queries: 1279 Opens: 398 Flush tables: 1 Open tables: 246 Queries per second avg: 543.795 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1777 [06:44:32] SSH on nightshade is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [06:45:32] MySQL slave on rosemary is OK: Uptime: 37784 Threads: 12 Questions: 33371264 Slow queries: 6627 Opens: 4519 Flush tables: 1 Open tables: 2864 Queries per 
second avg: 883.211 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1782 [06:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 247503 MB (4% inode=67%): [06:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [06:46:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [06:46:02] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1755.000000 [06:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [06:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [06:46:21] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 116380 MB (19% inode=99%): [06:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:46:33] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [06:46:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [06:46:33] aliasd on mayapple is CRITICAL: Connection refused [06:46:33] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [06:51:49] *Grrr*! tsnag renamed himself, ignore rules need to be adapted ... 
[06:52:22] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [06:52:32] aliasd on yarrow is CRITICAL: Connection refused [06:52:42] aliasd on nightshade is CRITICAL: Connection refused [06:52:42] @replag [06:52:44] Susan: s1-rr-a: 24m 8s [-]; s1-rr-a-wd: 1h 50m 29s [-]; s1-user: 27m 35s [-]; s1-user-wd: 1h 50m 29s [-]; s2-rr: 1h 50m 14s [-]; s2-user: 1h 50m 14s [-]; s2-user-c: error; s2-user-wd: 1h 50m 34s [-] [06:52:45] Susan: s3-user: 1m 24s [-]; s3-user-wd: 1h 50m 29s [-]; s4-user-wd: 1h 50m 29s [-]; s5-user: 4h 21m 59s [-]; s5-user-c: error; s6-user-wd: 1h 50m 31s [-]; s7-user-wd: 1h 50m 31s [-] [06:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7231.000000 [07:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7212.000000 [07:02:33] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7213.000000 [07:02:33] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7214.000000 [07:02:41] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7225.000000 [07:03:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7240.000000 [07:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7244.000000 [07:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7257.000000 [07:03:32] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [07:12:47] [[Special:Log/newusers]] create 10 * Poppy * (New user account) [07:29:32] aliasd on yarrow is OK: TCP OK - 0.003 second response time on port 984 [500 Not found.] 
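Editor's note: the replag checks report `ts_rc_age()` as a raw number, while the @replag reply uses an "1h 50m 29s" style. Assuming the raw value is in seconds (the log does not say so explicitly, but the values track "Seconds Behind Master"), a small sketch of the conversion could look like this. The function name `format_replag` is hypothetical, and how the real bot prints zero-valued units is a guess.

```python
def format_replag(seconds):
    """Format a lag given in seconds in an 'Hh Mm Ss' style."""
    seconds = int(seconds)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


# The 07:02:12 reading for s4 on cassia, assumed to be seconds:
print(format_replag(7231.0))  # 2h 0m 31s
```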
[07:29:32] aliasd on mayapple is OK: TCP OK - 0.002 second response time on port 984 [500 Not found.] [07:30:42] aliasd on nightshade is OK: TCP OK - 0.004 second response time on port 984 [500 Not found.] [07:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 247298 MB (4% inode=67%): [07:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [07:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [07:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [07:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 116230 MB (19% inode=99%): [07:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [07:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [07:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [07:52:22] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [07:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10832.000000 [08:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10813.000000 [08:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10814.000000 [08:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10814.000000 [08:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10825.000000 [08:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10841.000000 [08:03:03] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10843.000000 [08:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10857.000000 [08:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [08:04:42] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 71010 MB (7% inode=99%): [08:35:58] Who runs tsnag, BTW? [08:36:19] DaB./nosy, I guess. [08:36:23] AFAIK. [08:38:28] In any case, either the thresholds need to be adjusted or the reported issues fixed. In its present form, it hides any useful information in a bunch of noise. [08:39:16] * Susan shrugs. I've had it ignored for years. [08:39:21] ;-) [08:40:11] Me, too, but if we don't get more roots, we have to find ways to reduce the existing ones' workload :-).
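Editor's note: one way to cut the noise complained about above is to show only state changes per service and host, rather than every hourly repeat. The sketch below is an assumption-laden illustration, not how tsnag works: the regular expression is inferred from the message shape seen in this log ("[HH:MM:SS] <service> on <host> is <STATE>: ..."), and the names `ALERT` and `state_changes` are hypothetical.

```python
import re

# Matches the start of a bot message. The service name may not contain
# square brackets, which keeps a match from spanning two messages.
ALERT = re.compile(
    r"\[(\d\d:\d\d:\d\d)\] ([^\[\]]+?) on (\S+) is (OK|WARNING|CRITICAL|UNKNOWN): "
)


def state_changes(log_text):
    """Return (time, service, host, state) only when the state changed."""
    last_state = {}
    changes = []
    for when, service, host, state in ALERT.findall(log_text):
        key = (service, host)
        if last_state.get(key) != state:
            changes.append((when, service, host, state))
            last_state[key] = state
    return changes


sample = (
    "[04:53:39] Free Memory on turnera is CRITICAL: CRITICAL - 2.1% (174192 kB) free! "
    "[04:58:18] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. "
    "[05:47:42] Free Memory on turnera is OK: OK - 91.9% (7702988 kB) free. "
    "[05:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds."
)
for change in state_changes(sample):
    print(change)
```

On the sample, the repeated IPMI alert at 05:58:32 is dropped while the turnera recovery is kept. A human chat line that happens to contain "on X is OK: " would also match, so a real filter would need to key on the bot's nick, which this log excerpt does not preserve.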
[08:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 247228 MB (4% inode=67%): [08:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [08:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [08:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [08:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 116025 MB (19% inode=99%): [08:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [08:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [08:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [08:52:21] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [08:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14431.000000 [09:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14414.000000 [09:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14413.000000 [09:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14413.000000 [09:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14425.000000 [09:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14440.000000 [09:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14443.000000 [09:03:21] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 14457.000000 [09:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:45:41] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 247123 MB (4% inode=67%): [09:45:51] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [09:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [09:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [09:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 115849 MB (19% inode=99%): [09:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: 
svc:/application/jira:default [09:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [09:52:22] MySQL slave on z-dat-s2-b is CRITICAL: (Service Check Timed Out) [09:58:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18031.000000 [10:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18014.000000 [10:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18014.000000 [10:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18014.000000 [10:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18025.000000 [10:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18040.000000 [10:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18043.000000 [10:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 18057.000000 [10:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:17:42] aliasd on nightshade is CRITICAL: Connection refused [10:35:22] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:36:02] Environment IPMI on thyme is CRITICAL: Connection refused by host [10:37:01] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. 
Check the remote server logs for error messages. [10:38:41] aliasd on nightshade is OK: TCP OK - 0.003 second response time on port 984 [500 Not found.] [10:41:12] Load avg. on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:41:32] / on thyme is CRITICAL: Connection refused by host [10:41:42] /mnt on thyme is CRITICAL: Connection refused by host [10:41:43] RAID on thyme is CRITICAL: Connection refused by host [10:41:43] /sql on thyme is CRITICAL: Connection refused by host [10:42:12] Load avg. on thyme is CRITICAL: Connection refused by host [10:42:42] /mnt on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:42:42] RAID on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:42:42] /sql on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:43:14] Load avg. on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [10:43:41] / on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[10:44:32] / on thyme is CRITICAL: Connection refused by host [10:44:42] /mnt on thyme is CRITICAL: Connection refused by host [10:44:42] RAID on thyme is CRITICAL: Connection refused by host [10:44:42] /sql on thyme is CRITICAL: Connection refused by host [10:44:52] /tmp on thyme is CRITICAL: Connection refused by host [10:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 246390 MB (4% inode=67%): [10:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [10:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [10:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [10:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 115611 MB (19% inode=99%): [10:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [10:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [10:46:33] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [10:48:02] SSH on thyme is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:48:52] /tmp on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. 
Check the remote server logs for error messages. [10:49:52] /tmp on thyme is OK: DISK OK - free space: /tmp 133 MB (99% inode=99%): [10:49:52] SSH on thyme is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:53:02] MySQL slave on z-dat-s2-b is CRITICAL: (Service Check Timed Out) [10:58:02] SSH on thyme is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21632.000000 [11:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21614.000000 [11:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21613.000000 [11:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21614.000000 [11:02:41] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21625.000000 [11:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21641.000000 [11:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21643.000000 [11:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21657.000000 [11:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [11:11:42] / on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:11:43] /mnt on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:11:43] RAID on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:11:43] /sql on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:12:01] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[11:13:02] SSH on thyme is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [11:13:31] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:13:32] / on thyme is OK: DISK OK - free space: / 9941 MB (49% inode=92%): [11:13:42] /mnt on thyme is OK: DISK OK - free space: / 9941 MB (49% inode=92%): [11:13:43] RAID on thyme is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [11:13:43] /sql on thyme is OK: DISK OK - free space: /sql 126535 MB (13% inode=99%): [11:19:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [11:19:22] MySQL on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [11:19:22] MySQL slave on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [11:20:21] MySQL on thyme is OK: Uptime: 434 Threads: 2 Questions: 1703 Slow queries: 0 Opens: 46 Flush tables: 1 Open tables: 39 Queries per second avg: 3.923 [11:20:22] MySQL slave on thyme is OK: Uptime: 441 Threads: 3 Questions: 2685 Slow queries: 0 Opens: 53 Flush tables: 1 Open tables: 46 Queries per second avg: 6.88 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 441 [11:21:02] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 410.000000 [11:44:12] MySQL on z-dat-s2-b is CRITICAL: Too many connections [11:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 245799 MB (4% inode=67%): [11:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [11:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [11:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [11:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 115387 MB 
(19% inode=99%): [11:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:46:33] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [11:46:33] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [11:46:33] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [11:53:42] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [11:56:12] MySQL on z-dat-s2-b is OK: Uptime: 53100 Threads: 501 Questions: 68690977 Slow queries: 441 Opens: 366737 Flush tables: 1 Open tables: 244 Queries per second avg: 1293.615 [12:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25232.000000 [12:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25213.000000 [12:02:33] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25213.000000 [12:02:33] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25214.000000 [12:02:41] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25225.000000 [12:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25240.000000 [12:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25243.000000 [12:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 25256.000000 [12:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read 
output [12:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 245000 MB (4% inode=67%): [12:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [12:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [12:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [12:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [12:46:21] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 115164 MB (18% inode=99%): [12:46:31] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [12:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [12:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [12:54:00] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 20495 [13:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28831.000000 [13:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28814.000000 [13:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28814.000000 [13:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY 
CRITICAL: SELECT ts_rc_age() returned 28814.000000 [13:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28825.000000 [13:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28841.000000 [13:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28844.000000 [13:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28857.000000 [13:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [13:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:45:43] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 243640 MB (4% inode=67%): [13:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [13:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [13:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [13:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [13:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 114947 MB (18% inode=99%): [13:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [13:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null 
:OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [13:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [13:53:52] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6969 [14:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32432.000000 [14:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32414.000000 [14:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32414.000000 [14:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32414.000000 [14:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32425.000000 [14:03:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32440.000000 [14:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32443.000000 [14:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32457.000000 [14:03:52] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3325 [14:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:08:52] MySQL slave on z-dat-s2-b is OK: Uptime: 7709 Threads: 8 Questions: 9185358 Slow queries: 134 Opens: 131293 Flush tables: 1 Open tables: 257 Queries per second avg: 1191.510 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1149 [14:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 242300 MB (4% inode=67%): [14:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [14:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [14:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [14:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 114711 MB (18% inode=99%): [14:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [14:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [14:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [15:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36032.000000 [15:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36016.000000 [15:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36015.000000 [15:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36016.000000 [15:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY 
CRITICAL: SELECT ts_rc_age() returned 36026.000000 [15:03:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36040.000000 [15:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36044.000000 [15:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36057.000000 [15:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [15:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 241667 MB (4% inode=67%): [15:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [15:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [15:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 114463 MB (18% inode=99%): [15:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [15:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [15:46:32] SMF on web.amaranth is 
CRITICAL: ERROR - maintenance: svc:/application/jira:default [16:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39632.000000 [16:02:32] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39615.000000 [16:02:32] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39616.000000 [16:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39615.000000 [16:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39626.000000 [16:03:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39641.000000 [16:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39644.000000 [16:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39658.000000 [16:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 241923 MB (4% inode=67%): [16:45:52] [[Special:Log/newusers]] create 10 * Arsene wenger7 * (New user account) [16:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [16:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [16:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [16:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [16:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 114107 MB (18% inode=99%): [16:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:46:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:46:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [16:46:32] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [17:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43232.000000 [17:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43215.000000 [17:02:33] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43216.000000 [17:02:33] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() 
returned 43216.000000 [17:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43226.000000 [17:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43241.000000 [17:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43243.000000 [17:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43257.000000 [17:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [17:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:32:52] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:44:51] Free Memory on damiana is WARNING: WARNING - 6.2% (517500 kB) free! [17:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 240624 MB (4% inode=67%): [17:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [17:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [17:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [17:46:12] /tmp on ortelius is WARNING: DISK WARNING - free space: /tmp 2719 MB (13% inode=99%): [17:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [17:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 113875 MB (18% inode=99%): [17:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null 
:OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [17:46:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [17:46:43] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [17:51:52] Free Memory on damiana is CRITICAL: CRITICAL - 4.7% (396660 kB) free! [17:53:06] Marc A. Pelletier * [Toolserver-l] Hello, and the short term plans [17:54:12] /tmp on ortelius is CRITICAL: DISK CRITICAL - free space: /tmp 334 MB (1% inode=99%): [18:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46831.000000 [18:02:31] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46818.000000 [18:02:42] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46818.000000 [18:02:43] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46818.000000 [18:02:43] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46826.000000 [18:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46841.000000 [18:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46844.000000 [18:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46858.000000 [18:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:05:26] scfc_de`: give a poke if you see dab [18:07:12] /tmp on ortelius is OK: DISK OK - free space: /tmp 4373 MB (21% inode=99%): [18:07:54] jeremyb_: Wilco. [18:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[18:28:51] Free Memory on damiana is OK: OK - 7.2% (600640 kB) free. [18:31:12] /tmp on ortelius is WARNING: DISK WARNING - free space: /tmp 4164 MB (20% inode=99%): [18:32:12] /tmp on ortelius is OK: DISK OK - free space: /tmp 4413 MB (21% inode=99%): [18:32:52] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:42:23] MySQL on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [18:42:23] MySQL slave on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [18:43:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [18:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 239961 MB (4% inode=67%): [18:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [18:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:46:11] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [18:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 113603 MB (18% inode=99%): [18:46:31] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:46:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [18:46:43] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [18:46:52] Free Memory on damiana is WARNING: 
WARNING - 5.6% (468104 kB) free! [18:47:02] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 643.000000 [18:47:22] MySQL on thyme is OK: Uptime: 667 Threads: 5 Questions: 20465 Slow queries: 1 Opens: 89 Flush tables: 1 Open tables: 82 Queries per second avg: 30.682 [18:47:22] MySQL slave on thyme is OK: Uptime: 672 Threads: 3 Questions: 23818 Slow queries: 1 Opens: 93 Flush tables: 1 Open tables: 86 Queries per second avg: 35.443 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 637 [18:49:52] Free Memory on damiana is CRITICAL: CRITICAL - 5.0% (417616 kB) free! [19:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50431.000000 [19:02:31] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50417.000000 [19:02:43] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50419.000000 [19:02:43] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50420.000000 [19:02:43] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50426.000000 [19:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50441.000000 [19:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50444.000000 [19:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50459.000000 [19:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:09:53] Free Memory on damiana is WARNING: WARNING - 5.2% (436860 kB) free! [19:11:52] Free Memory on damiana is CRITICAL: CRITICAL - 4.3% (360112 kB) free! [19:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:16:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:24:52] Free Memory on damiana is WARNING: WARNING - 5.1% (425808 kB) free! [19:32:51] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:39:42] SSH on yucca is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [19:45:01] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [19:45:22] MySQL on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [19:45:23] MySQL slave on thyme is CRITICAL: Cant connect to MySQL server on thyme (146) [19:45:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 239378 MB (4% inode=67%): [19:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [19:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [19:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [19:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [19:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 113302 MB (18% inode=99%): [19:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [19:46:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: 
svc:/application/jira:default [19:53:02] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 855.000000 [19:53:22] MySQL on thyme is OK: Uptime: 856 Threads: 3 Questions: 4198 Slow queries: 0 Opens: 64 Flush tables: 1 Open tables: 57 Queries per second avg: 4.904 [19:53:22] MySQL slave on thyme is OK: Uptime: 862 Threads: 2 Questions: 6811 Slow queries: 0 Opens: 65 Flush tables: 1 Open tables: 58 Queries per second avg: 7.901 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 860 [19:59:43] /mnt user-store on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [19:59:52] /sql on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [20:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 42247.000000 [20:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 35796.000000 [20:02:42] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 33873.000000 [20:02:42] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36425.000000 [20:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 47790.000000 [20:02:52] Free Memory on damiana is CRITICAL: CRITICAL - 2.6% (217712 kB) free! [20:03:03] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 35667.000000 [20:03:03] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 49161.000000 [20:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on rosemary (146) [20:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:04:12] /mnt user-store on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:04:12] /sql on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:06:12] / on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:06:12] RAID on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:06:21] /sql/data/commonswiki on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:06:31] /tmp on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:06:33] Environment IPMI on rosemary is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:08:21] MySQL on rosemary is CRITICAL: Cant connect to MySQL server on rosemary (146)
[20:08:42] MySQL slave on rosemary is CRITICAL: Cant connect to MySQL server on rosemary (146)
[20:09:02] SMTP on rosemary is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:09:02] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on rosemary (146)
[20:09:02] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 70289 MB (7% inode=99%):
[20:09:02] /sql/data/commonswiki on rosemary is OK: DISK OK - free space: /sql/data/commonswiki 151104 MB (24% inode=99%):
[20:09:02] /tmp on rosemary is OK: DISK OK - free space: /tmp 197 MB (99% inode=99%):
[20:09:02] / on rosemary is OK: DISK OK - free space: / 10170 MB (50% inode=91%):
[20:09:12] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on rosemary (146)
[20:11:12] Environment IPMI on rosemary is OK: ok: temperature ok fan ok voltage ok chassis ok
[20:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:16:12] SSH on rosemary is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:16:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[20:17:41] RAID on rosemary is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0
[20:17:51] SMTP on rosemary is OK: SMTP OK - 0.113 sec. response time
[20:18:02] SSH on rosemary is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[20:20:18] are all ts servers in amsterdam?
[20:27:12] I think so, yes
[20:34:26] I believe amaranth is in Tampa.
[20:34:52] Free Memory on damiana is WARNING: WARNING - 5.2% (431980 kB) free!
[20:35:42] MySQL slave on rosemary is OK: Uptime: 1096 Threads: 1 Questions: 3 Slow queries: 0 Opens: 16 Flush tables: 1 Open tables: 9 Queries per second avg: 0.2 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[20:35:52] Free Memory on damiana is CRITICAL: CRITICAL - 5.0% (417748 kB) free!
[20:36:22] MySQL on rosemary is OK: Uptime: 1141 Threads: 35 Questions: 555 Slow queries: 17 Opens: 88 Flush tables: 1 Open tables: 81 Queries per second avg: 0.486
[20:36:32] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2038.000000
[20:36:32] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2031.000000
[20:43:47] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2142
[20:45:52] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7
[20:45:53] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[20:46:01] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[20:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds)
[20:46:21] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 113074 MB (18% inode=99%):
[20:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:
[20:46:42] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[20:49:17] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1795.000000
[21:02:11] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36234.000000
[21:02:32] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8756.000000
[21:02:42] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 13362.000000
[21:02:43] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8514.000000
[21:02:43] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 27664.000000
[21:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 15637.000000
[21:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 52762.000000
[21:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43164.000000
[21:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[21:04:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 238261 MB (4% inode=67%):
[21:10:02] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1797.000000
[21:10:42] MySQL slave on rosemary is OK: Uptime: 3196 Threads: 6 Questions: 3661135 Slow queries: 181 Opens: 378 Flush tables: 1 Open tables: 300 Queries per second avg: 1145.536 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1764
[21:14:31] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[21:15:32] wikidata replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3236.000000
[21:16:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[21:17:42] wikidata replag on z-dat-s6-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3340.000000
[21:19:31] wikidata replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1614.000000
[21:23:42] wikidata replag on z-dat-s6-a is OK: QUERY OK: SELECT ts_rc_age() returned 1506.000000
[21:34:42] wikidata replag on z-dat-s7-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3562.000000
[21:35:52] Free Memory on damiana is CRITICAL: CRITICAL - 2.3% (189268 kB) free!
[21:43:42] wikidata replag on z-dat-s7-a is OK: QUERY OK: SELECT ts_rc_age() returned 1699.000000
[21:45:53] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[21:45:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7
[21:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[21:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds)
[21:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 112819 MB (18% inode=99%):
[21:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:
[21:46:42] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[22:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 31857.000000
[22:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 9969.000000
[22:03:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 15007.000000
[22:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 56362.000000
[22:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46764.000000
[22:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[22:04:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 237913 MB (4% inode=67%):
[22:14:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[22:16:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[22:32:06] Maarten Dammers * Re: [Toolserver-l] Hello, and the short term plans
[22:35:52] Free Memory on damiana is CRITICAL: CRITICAL - 2.1% (172392 kB) free!
[22:40:32] @replag
[22:40:36] Danny_B: s1-rr-a-wd: 16h 16m 59s [+1.00 s/s]; s1-user-wd: 13h 36m 46s [+1.00 s/s]; s2-rr: 14m 50s [+0.71 s/s]; s2-user: 14m 50s [+0.71 s/s]; s2-user-c: error; s2-user-wd: 1h 54m 29s [-0.25 s/s]; s3-user: 1m 41s [+0.06 s/s]; s3-user-wd: 1m 13s [+0.01 s/s]
[22:40:37] Danny_B: s4-user-wd: 3h 45m 59s [-0.33 s/s]; s5-user: 17h 38m 1s [+1.00 s/s]; s5-user-c: error; s6-user-wd: 1m 15s [+0.01 s/s]; s7-user: 5m 26s [+0.05 s/s]; s7-user-wd: 1m 11s [+0.01 s/s]
[22:42:06] Marc A. Pelletier * Re: [Toolserver-l] Hello, and the short term plans
[22:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[22:45:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7
[22:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[22:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds)
[22:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 112518 MB (18% inode=99%):
[22:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:
[22:46:42] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[22:49:06] Magnus Manske * Re: [Toolserver-l] Hello, and the short term plans
[22:55:02] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[22:56:32] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[23:02:12] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 22411.000000
[23:02:42] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 6716.000000
[23:03:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 13642.000000
[23:03:02] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 59961.000000
[23:03:22] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50367.000000
[23:04:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[23:04:43] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 237450 MB (4% inode=67%):
[23:04:51] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2113
[23:11:52] MySQL slave on z-dat-s2-b is OK: Uptime: 40289 Threads: 9 Questions: 49727527 Slow queries: 410 Opens: 483924 Flush tables: 1 Open tables: 256 Queries per second avg: 1234.270 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1656
[23:15:21] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[23:16:02] Environment IPMI on thyme is CRITICAL: Connection refused by host
[23:16:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[23:20:21] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[23:27:43] aliasd on nightshade is CRITICAL: Connection refused
[23:35:52] Free Memory on damiana is CRITICAL: CRITICAL - 2.1% (173108 kB) free!
[23:41:12] Load avg. on thyme is CRITICAL: Connection refused by host
[23:41:43] / on thyme is CRITICAL: Connection refused by host
[23:41:43] /mnt on thyme is CRITICAL: Connection refused by host
[23:41:43] /sql on thyme is CRITICAL: Connection refused by host
[23:41:43] RAID on thyme is CRITICAL: Connection refused by host
[23:41:51] /tmp on thyme is CRITICAL: Connection refused by host
[23:42:02] SSH on thyme is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:45:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[23:45:53] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7
[23:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[23:46:12] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds)
[23:46:22] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 112271 MB (18% inode=99%):
[23:46:32] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:46:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:
[23:46:43] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[23:48:42] wikidata replag on z-dat-s2-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3458.000000
[23:55:02] Environment IPMI on thyme is CRITICAL: Connection refused by host
[23:58:52] SSH on thyme is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[23:59:43] wikidata replag on z-dat-s2-b is OK: QUERY OK: SELECT ts_rc_age() returned 1647.000000
[23:59:52] /tmp on thyme is OK: DISK OK - free space: /tmp 13593 MB (99% inode=99%):