[00:02:05] (03PS1) 10Tim Starling: Set date.timezone=UTC [operations/puppet] - 10https://gerrit.wikimedia.org/r/93009 [00:05:55] (03CR) 10Chad: [C: 031] Set date.timezone=UTC [operations/puppet] - 10https://gerrit.wikimedia.org/r/93009 (owner: 10Tim Starling) [00:13:36] (03CR) 10Tim Starling: [C: 032] Set date.timezone=UTC [operations/puppet] - 10https://gerrit.wikimedia.org/r/93009 (owner: 10Tim Starling) [01:04:16] !log catrope synchronized wmf-config/CommonSettings.php 'Debugging 127.0.0.1 issue' [01:04:37] Logged the message, Master [01:11:03] !log catrope synchronized wmf-config/CommonSettings.php 'Debugging 127.0.0.1 issue: add XFP logging' [01:11:19] Logged the message, Master [01:16:18] (03PS1) 10Lcarr: Revert "Send OC text traffic to ulsfo" [operations/dns] - 10https://gerrit.wikimedia.org/r/93016 [01:16:51] (03CR) 10Lcarr: "Reverting since it makes all edits appear to be sourced from 127.0.0.1 - Roan can explain in further detail" [operations/dns] - 10https://gerrit.wikimedia.org/r/93016 (owner: 10Lcarr) [01:16:59] (03CR) 10Lcarr: [C: 032] "Reverting since it makes all edits appear to be sourced from 127.0.0.1 - Roan can explain in further detail" [operations/dns] - 10https://gerrit.wikimedia.org/r/93016 (owner: 10Lcarr) [01:21:22] !log catrope synchronized wmf-config/CommonSettings.php 'Clean up 127.0.0.1 logging code' [01:21:34] Logged the message, Master [01:21:55] (03PS1) 10Catrope: Log requests with wfGetIP() === '127.0.0.1' [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93017 [01:22:07] (03CR) 10Catrope: [C: 032] Log requests with wfGetIP() === '127.0.0.1' [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93017 (owner: 10Catrope) [01:22:18] (03Merged) 10jenkins-bot: Log requests with wfGetIP() === '127.0.0.1' [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93017 (owner: 10Catrope) [02:15:15] !log LocalisationUpdate completed (1.23wmf1) at Fri Nov 1 02:15:15 UTC 2013 [02:15:30] Logged the message, 
Master [02:59:31] PROBLEM - Puppet freshness on analytics1021 is CRITICAL: No successful Puppet run in the last 10 hours [03:01:11] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Nov 1 03:01:11 UTC 2013 [03:01:30] Logged the message, Master [03:02:09] anyone here know anything about mailman? [03:53:00] (03PS1) 10MZMcBride: Follow-up to I357d3ed1c1: log a timestamp. Untested. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93020 [06:33:04] PROBLEM - RAID on db47 is CRITICAL: CRITICAL: 1 failed logical drive(s) (Degraded) [07:08:38] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:38] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [07:58:08] PROBLEM - RAID on arsenic is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [07:58:58] RECOVERY - RAID on arsenic is OK: OK: no RAID installed [08:00:08] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:01:08] RECOVERY - DPKG on snapshot3 is OK: All packages OK [08:07:39] PROBLEM - Host mw1085 is DOWN: PING CRITICAL - Packet loss = 100% [08:08:49] RECOVERY - Host mw1085 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [08:09:49] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:12:39] RECOVERY - DPKG on snapshot3 is OK: All packages OK [08:13:20] * apergos growls [08:18:37] !log shot two of the forceSearchIndex on arsenic, they were using 7gb between them and arsenic was in swapdeath [08:18:59] Logged the message, Master [08:19:27] not that this will make a difference, unless the next one manages to finish before they run out of memory again [08:21:49] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:22:29] PROBLEM - Disk space on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[08:24:19] RECOVERY - Disk space on snapshot3 is OK: DISK OK [08:24:39] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:24:39] RECOVERY - DPKG on snapshot3 is OK: All packages OK [08:25:29] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [08:27:49] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [08:29:39] RECOVERY - DPKG on snapshot3 is OK: All packages OK [08:49:18] (03PS7) 10Akosiaris: Modularizing puppetmaster [operations/puppet] - 10https://gerrit.wikimedia.org/r/91353 [08:58:49] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:00:39] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:02:29] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [09:05:24] PROBLEM - Disk space on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:06:24] RECOVERY - Disk space on snapshot3 is OK: DISK OK [09:06:34] RECOVERY - DPKG on snapshot3 is OK: All packages OK [09:08:07] !log Jenkins: fixing up a race condition in MobileFrontend qunit tests. Both variant were using the same path and the first job to complete would break the other one by deleting the mediawiki install. {{gerrit|93030}} [09:08:24] Logged the message, Master [09:12:34] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:12:44] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:13:24] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [09:14:44] RECOVERY - DPKG on snapshot3 is OK: All packages OK [09:16:19] (03CR) 10Yurik: [C: 031] Further constrain W0 X-CS setting to mobile Wikipedia, for now. [operations/puppet] - 10https://gerrit.wikimedia.org/r/92818 (owner: 10Dr0ptp4kt) [09:16:24] PROBLEM - Disk space on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[09:16:34] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:17:44] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:18:24] RECOVERY - Disk space on snapshot3 is OK: DISK OK [09:18:34] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [09:20:44] RECOVERY - DPKG on snapshot3 is OK: All packages OK [09:23:34] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:23:44] (03PS1) 10ArielGlenn: use hostname for labstore1-4 in dhcp, like everything else [operations/puppet] - 10https://gerrit.wikimedia.org/r/93031 [09:23:44] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:23:52] (ignore those folks) [09:24:42] (03CR) 10ArielGlenn: [C: 032] use hostname for labstore1-4 in dhcp, like everything else [operations/puppet] - 10https://gerrit.wikimedia.org/r/93031 (owner: 10ArielGlenn) [09:27:44] PROBLEM - SSH on snapshot3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:28:34] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [09:28:44] RECOVERY - SSH on snapshot3 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [09:29:44] RECOVERY - DPKG on snapshot3 is OK: All packages OK [09:29:58] (03PS2) 10ArielGlenn: remove virt1002/3 (rt #3687 renamed), virt1009 [operations/dns] - 10https://gerrit.wikimedia.org/r/92850 [09:32:33] (03CR) 10ArielGlenn: [C: 032] remove virt1002/3 (rt #3687 renamed), virt1009 [operations/dns] - 10https://gerrit.wikimedia.org/r/92850 (owner: 10ArielGlenn) [09:32:34] PROBLEM - RAID on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [09:33:34] RECOVERY - RAID on snapshot3 is OK: OK: no RAID installed [09:33:44] PROBLEM - DPKG on snapshot3 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[09:34:44] RECOVERY - DPKG on snapshot3 is OK: All packages OK [09:42:35] (03PS1) 10ArielGlenn: completely remove tmc1,2 [operations/dns] - 10https://gerrit.wikimedia.org/r/93033 [09:43:46] (03CR) 10ArielGlenn: [C: 032] completely remove tmc1,2 [operations/dns] - 10https://gerrit.wikimedia.org/r/93033 (owner: 10ArielGlenn) [10:05:20] (03PS1) 10ArielGlenn: remove srv266, 278 from dns, decommed (rt #4534) [operations/dns] - 10https://gerrit.wikimedia.org/r/93041 [10:06:31] (03CR) 10ArielGlenn: [C: 032] remove srv266, 278 from dns, decommed (rt #4534) [operations/dns] - 10https://gerrit.wikimedia.org/r/93041 (owner: 10ArielGlenn) [10:09:48] (03PS1) 10ArielGlenn: remove srv266,278 from dhcp, decommed (rt #4534) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93042 [10:11:39] (03CR) 10ArielGlenn: [C: 032] remove srv266,278 from dhcp, decommed (rt #4534) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93042 (owner: 10ArielGlenn) [10:48:58] (03PS1) 10Springle: depool first batch of pmtpa boxes to be decommissioned/shipped [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93043 [10:49:09] yay! [10:49:54] (03CR) 10Springle: [C: 032] depool first batch of pmtpa boxes to be decommissioned/shipped [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93043 (owner: 10Springle) [10:51:10] !log springle synchronized wmf-config/db-pmtpa.php 'depool first batch of pmtpa boxes to be decommissioned/shipped' [10:51:23] Logged the message, Master [10:57:10] !log shot one more forceSearchIndex on arsenic, only two left... (using about 12gb between them) [10:57:27] Logged the message, Master [11:02:42] (03CR) 10QChris: "Thanks for the initiative! 
\o/" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93006 (owner: 10Yurik) [11:03:53] (03PS1) 10ArielGlenn: remove colby, reclaimed for shipment (rt #5120) [operations/dns] - 10https://gerrit.wikimedia.org/r/93045 [11:06:31] (03CR) 10ArielGlenn: [C: 032] remove colby, reclaimed for shipment (rt #5120) [operations/dns] - 10https://gerrit.wikimedia.org/r/93045 (owner: 10ArielGlenn) [11:07:17] (03CR) 10QChris: "Seems like this change came out of a discussion between" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93006 (owner: 10Yurik) [11:12:04] (03PS1) 10ArielGlenn: remove project2 from dsh files, decommed (rt #2637) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93047 [11:14:15] (03CR) 10ArielGlenn: [C: 032] remove project2 from dsh files, decommed (rt #2637) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93047 (owner: 10ArielGlenn) [11:15:07] (03PS1) 10Springle: pull various pmtpa db boxes from coredb pending 12th floor reorg and imminent decomm/reclaim [operations/puppet] - 10https://gerrit.wikimedia.org/r/93048 [11:18:42] (03CR) 10Springle: [C: 032] pull various pmtpa db boxes from coredb pending 12th floor reorg and imminent decomm/reclaim [operations/puppet] - 10https://gerrit.wikimedia.org/r/93048 (owner: 10Springle) [11:21:34] (03PS1) 10ArielGlenn: removing storage1, singer from dsh/dhcp, decommed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93049 [11:21:57] (03PS2) 10ArielGlenn: removing storage1, singer from dsh/dhcp, decommed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93049 [11:24:37] (03CR) 10ArielGlenn: [C: 032] removing storage1, singer from dsh/dhcp, decommed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93049 (owner: 10ArielGlenn) [11:28:17] (03CR) 10QChris: [C: 031] Further constrain W0 X-CS setting to mobile Wikipedia, for now. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/92818 (owner: 10Dr0ptp4kt) [11:30:08] apergos: mark: I am looking at the puppetmaster::dashboard class and it is not used anywhere from what git grep dashboard says. Also searched wikitech. I am pretty confident we don't use it anywhere and that the class can be safely deleted but please advise if this is not the case. [11:30:31] yeah it was on server sockpuppet but isn't used [11:30:32] I did the same hunt a while back and came to the same conclusion. BUT [11:30:36] and disabled as it used tons of disk [11:30:41] feel free to remove [11:30:43] someone who knows more should weigh in (thanks mark) [11:30:45] it was crap anyway :) [11:30:54] yes it was crap :-) [11:30:58] and it still is [11:31:03] thanks guys [11:31:06] i'm not surprised [11:31:09] yes it still does (as of a month ago when I was looking into it again) [11:31:23] sweet, more cruft gone [11:41:48] (03PS1) 10Springle: decommision db3[29] db4[2-6] db5[1235689] [operations/puppet] - 10https://gerrit.wikimedia.org/r/93052 [11:53:02] (03PS1) 10ArielGlenn: remove titanium from dsh/dhcp/netboot.cfg, wiped (rt #5854) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93054 [11:54:31] (03CR) 10ArielGlenn: [C: 032] remove titanium from dsh/dhcp/netboot.cfg, wiped (rt #5854) [operations/puppet] - 10https://gerrit.wikimedia.org/r/93054 (owner: 10ArielGlenn) [11:56:40] paravoid or mark, could you +2 https://gerrit.wikimedia.org/r/#/c/92818 -- it further reduces noise for non-wikipedia zero sites. Thx! [11:58:10] (03PS1) 10ArielGlenn: remove titanium, wiped (#rt 5854) [operations/dns] - 10https://gerrit.wikimedia.org/r/93055 [11:59:40] (03CR) 10ArielGlenn: [C: 032] remove titanium, wiped (#rt 5854) [operations/dns] - 10https://gerrit.wikimedia.org/r/93055 (owner: 10ArielGlenn) [12:24:40] (03PS8) 10Akosiaris: Modularizing puppetmaster [operations/puppet] - 10https://gerrit.wikimedia.org/r/91353 [12:37:21] why two modules? 
[12:37:30] why not just one puppet module?
[12:39:10] akosiaris: did you install those ulsfo varnish caches?
[12:44:05] the ulsfo caches have the following in /etc/hosts:
[12:44:12] 127.0.1.1 cp4008.ulsfo.wmnet cp4008
[12:44:26] normally our servers have their regular, non-loopback ip there
[12:44:36] and this is I think what causes the 127.0.0.1 to show up in the XFF :)
[12:44:54] from what I gather, the debian installer sometimes puts 127.0.1.1 in /etc/hosts if it doesn't have a fixed ip during installation
[12:45:01] so I wonder why it was different for these boxes
[12:46:33] (PS1) ArielGlenn: remove db1012 from dsh files, renamed (rt #4912) [operations/puppet] - https://gerrit.wikimedia.org/r/93058
[12:49:40] (PS2) ArielGlenn: remove db1012 from dsh files, renamed (rt #4921) [operations/puppet] - https://gerrit.wikimedia.org/r/93058
[12:51:37] (CR) ArielGlenn: [C: 2] remove db1012 from dsh files, renamed (rt #4921) [operations/puppet] - https://gerrit.wikimedia.org/r/93058 (owner: ArielGlenn)
[12:53:43] mark: yes i installed those.
[12:53:51] hey, was anything nonstandard on those?
[12:53:56] not a fully automated install or something?
[12:54:06] because all ulsfo hosts seem to have that 127.0.1.1 in /etc/hosts
[12:54:46] nope. fully automated
[12:54:52] weird
[12:55:26] i blame faidon
[12:55:30] why
[12:55:35] its debian
[12:55:41] it's ubuntu
[12:55:42] so for some reason the debian installer thought that the system did not have a permanent ip address ?
[12:55:50] its debian installer :P
[12:56:04] akosiaris: well that's what a quick google search suggested, it's far from hard evidence ;)
[12:58:09] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316099
[12:58:38] yes, d-i does that
[12:58:43] but it's also not new behaviour by any means, so I wonder what's different for ulsfo
[12:58:58] if (!empty_str(ipaddress)) {
[12:59:00] perhaps the resolvers didn't answer at the time or smt stupid like that
[12:59:01] [...]
[12:59:08] } else {
[12:59:08] if (domain_nodot && !empty_str(domain_nodot))
[12:59:08] fprintf(fp, "127.0.1.1\t%s.%s\t%s\n", hostname, domain_nodot, hostname);
[12:59:11] else
[12:59:13] fprintf(fp, "127.0.1.1\t%s\n", hostname);
[12:59:16] }
[12:59:32] PROBLEM - Puppet freshness on analytics1021 is CRITICAL: No successful Puppet run in the last 10 hours
[13:01:03] the only weird thing when I did the installation was that the /etc/apt/apt.conf was missing and so apt did not have brewster as a proxy and security.debian.org was inaccessible and puppet timed out waiting for apt-get update. Which is way down the road after installation and has nothing to do with this.
[13:02:24] might be similar
[13:02:33] that apt.conf, iirc, is conditional on which subnet/zone a host is in
[13:02:40] which is determined by puppet based on ip address
[13:03:08] i guess we should just do another install to see if problems still exist today
[13:04:09] should we reinstall a cp ?
[13:04:15] so,
[13:04:15] perhaps just a backup lvs
[13:04:18] which doesn't do anything
[13:04:22] the only subnet def we have in puppet
[13:04:28] is public1-ulsfo.cfg
[13:04:38] nah, this check is in $realm iirc
[13:04:41] checks 10.x or not
[13:04:45] that doesn't have brewster set as a proxy, because it's supposed to be a public subnet
[13:05:06] d-i mirror/http/proxy string http://brewster.wikimedia.org:8080
[13:05:09] is missing
[13:05:24] $network_zone = $main_ipaddress ? {
[13:05:25] /^10./ => "internal",
[13:05:25] default => "public"
[13:05:25] }
[13:05:25] the other thing the ulsfo config is missing is netcfg/get_domain
[13:05:42] oh you mean installer subnet
[13:06:01] er, no it's not missing, but it's wikimedia.org
[13:06:08] right
[13:06:11] but
[13:06:29] this means d-i would override the domain to wikimedia.org
[13:06:34] so it'd become cp4001.wikimedia.org
[13:06:40] gethostbyname in that would fail
[13:07:44] wait, how did the private ulsfo boxes even worked
[13:07:51] there's no subnet definition for them at all
[13:08:35] how did the private ulsfo boxes *install* even worked
[13:09:39] (this explains everything)
[13:09:56] indeed
[13:10:54] well good
[13:11:06] * mark -> doctor
[13:11:07] There were two edits as localhost on en.wikivoyage about 7 hours after Leslie reverted the ulsfo change, BTW.
[13:11:10] Not sure if that was lingering cache or what.
[13:11:20] the fact that ulsfo is out of dns doesn't prevent anyone from using it
[13:11:21] https://en.wikivoyage.org/wiki/Special:Contributions/127.0.0.1
[13:11:32] i'll fix it properly after I come back
[13:12:29] (PS1) Faidon Liambotis: autoinstall: add private1-ulsfo subnet [operations/puppet] - https://gerrit.wikimedia.org/r/93059
[13:13:05] mark: stop blaming me? :P
[13:14:07] (PS2) Faidon Liambotis: autoinstall: add private1-ulsfo subnet [operations/puppet] - https://gerrit.wikimedia.org/r/93059
[13:14:31] (CR) Faidon Liambotis: [C: 2] autoinstall: add private1-ulsfo subnet [operations/puppet] - https://gerrit.wikimedia.org/r/93059 (owner: Faidon Liambotis)
[13:14:38] (CR) Faidon Liambotis: [V: 2] autoinstall: add private1-ulsfo subnet [operations/puppet] - https://gerrit.wikimedia.org/r/93059 (owner: Faidon Liambotis)
[13:15:07] so the defaults worked before ?
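(Editor's note) The two snippets pasted above, the installer's /etc/hosts branch and the Puppet $network_zone selector, can be restated as a rough Python sketch. The function names and sample addresses are hypothetical. The branch taken when an address is known is elided in the log ("[...]"), so the sketch only assumes it writes the real address. The strict variant is not from the log; it is included because the dot in /^10./ is unescaped and so matches any character.

```python
import re

# Placeholder address the installer writes when it has no fixed IP (from the log).
LOOPBACK_PLACEHOLDER = "127.0.1.1"


def hosts_entry(hostname, domain="", ipaddress=""):
    """Build the /etc/hosts line for a freshly installed host.

    Assumption: when an IP address is known, the entry uses it. The log
    only shows the fallback branch, which writes 127.0.1.1 instead.
    """
    address = ipaddress or LOOPBACK_PLACEHOLDER
    if domain:
        return f"{address}\t{hostname}.{domain}\t{hostname}"
    return f"{address}\t{hostname}"


def network_zone(main_ipaddress):
    """Literal restatement of the pasted selector: /^10./ => internal."""
    return "internal" if re.match(r"^10.", main_ipaddress) else "public"


def network_zone_strict(main_ipaddress):
    """Hypothetical variant with the dot escaped, so only 10.x.x.x matches."""
    return "internal" if re.match(r"^10\.", main_ipaddress) else "public"


if __name__ == "__main__":
    # No address known at install time: the loopback placeholder is written.
    print(hosts_entry("cp4008", "ulsfo.wmnet"))
    # Hypothetical addresses, used only to show how the two patterns differ.
    for ip in ("10.128.0.8", "108.1.2.3", "208.80.154.1"):
        print(ip, network_zone(ip), network_zone_strict(ip))
```

With no address supplied, hosts_entry reproduces the cp4008 line quoted at 12:44:12 (tab-separated), which is the symptom discussed above.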
[13:15:27] dhcp did
[13:15:31] the rest "worked"
[13:15:50] as evidenced by the 127.0.1.1 stanza :)
[13:19:45] (PS1) ArielGlenn: remove msfe1002 from decomm, long since out of monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93060
[13:20:55] (CR) ArielGlenn: [C: 2] remove msfe1002 from decomm, long since out of monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93060 (owner: ArielGlenn)
[13:21:14] (PS9) Akosiaris: Modularizing puppetmaster [operations/puppet] - https://gerrit.wikimedia.org/r/91353
[13:43:51] (PS10) Akosiaris: Modularizing puppetmaster [operations/puppet] - https://gerrit.wikimedia.org/r/91353
[13:51:49] !log gallium : Jenkins console log compression completed, saved 80G out of 500G total disk space.
[13:52:44] Logged the message, Master
[13:58:19] paravoid: but blaming you is such a nice way to get you to do work ;p
[14:05:02] * mark ran for i in cp{4001..4020} lvs{4001..4004}; do ssh root@$i.ulsfo.wmnet 'sed -i "s/^127\\.0\\.1\\.1/$(facter ipaddress)/" /etc/hosts'; done
[14:05:27] probably needs varnish restarts :/
[14:06:03] OC is sleeping ... do it now ... nobody will notice :P
[14:06:59] yeah doing now
[14:24:55] Coren: your mail re: ipsec sounds like a volunteering email to me :P
[14:25:54] you are not the only one :-D
[14:25:55] paravoid: iff that scheme is agreed upon to be sane for out setup, which I think needs more discussion first. It worked for me, in comparable circumstances, but there is a maintenance cost involved we need to agree to pay first. :-)
[14:27:16] i've done some experiments with newer ike daemons
[14:27:18] worked well
[14:27:23] (Draft5) Akosiaris: Puppetmaster module multi-master capable [operations/puppet] - https://gerrit.wikimedia.org/r/93061
[14:27:25] (PS1) ArielGlenn: remove hosts no longer in dns from dsh files too [operations/puppet] - https://gerrit.wikimedia.org/r/93065
[14:27:29] * apergos is interested
[14:27:36] (ipsec)
[14:28:30] Yeah, I don't think the reason why we went with static cryptomaps in our setup is as important is the WMF's (we wanted to be absolutely certain that box would have a proper crypto setup on boot even if they were isolated)
[14:29:11] Also, we stored the actual keys in the TPM with tamper switch support -- again, probably overkill in our case.
[14:29:12] right
[14:30:01] Although that requires us to ask ourselves how confident we are in our physical security.
[14:30:22] (CR) ArielGlenn: [C: 2] remove hosts no longer in dns from dsh files too [operations/puppet] - https://gerrit.wikimedia.org/r/93065 (owner: ArielGlenn)
[14:31:40] The only relevant advantage I see in our setup to the concept of static cryptomaps is that it fails safely in degraded environments (you'll never lose access to boxes, because the last known good key always work)
[14:31:48] more confident than in prohibiting physical access to the entire length of our 'private' links between dcs
[14:32:44] apergos: That's a given given that the hardware isn't even ours.
[14:32:52] yup
[14:33:50] which reminds me
[14:33:53] I'd rather have it be an issue of physical security in our cages
[14:33:56] our new private link between eqiad and knams is up
[14:33:58] than some other random place
[14:34:04] ah it's there? yaaayyy
[14:34:44] mark: are you interested in seeing that that simple static mapping would look like in practice? It wouldn't be hard for me to deploy a demo in labs.
[14:35:00] Coren: not really, I've used it before myself as well
[14:35:06] I'd rather just use an IKE daemon tbh
[14:35:15] they aren't horrible anymore, unlike 10 years ago
[14:35:21] mark: Nowadays, that's a reasonable approach, yeah.
[14:35:46] mark: But how to you plan on handling possible breakage? Allow TCP 22 without ipsec?
[14:36:19] i'm actually planning JUST doing ipsec on the varnish traffic
[14:36:21] perhaps
[14:36:23] and possibly logging
[14:36:30] yeah logging definitely ;)
[14:37:05] I should think that having logs in cleartext in case of failure makes too obvious an attack point. "Make ike fail, see logs for free"
[14:37:31] Depends how paranoid we want to be, I suppose.
[14:38:12] well, traffic would be blocked if ike wouldn't work
[14:38:18] unless the daemon wouldn't be started you mean?
[14:40:30] yeah, i'm not gonna be /that/ paranoid
[14:42:38] But yeah, fundamentally, I think we only want host-to-host ipsec; using a SSL tunnel has too many moving parts.
[14:42:58] yep
[14:46:06] I don't think MTU is an issue; I see no reason why PMTU discovery won't work right on our network.
[14:47:01] MTU doesn't even need to be an issue, we could use MTU 9000 if needed
[14:50:05] Ah, we have jumbo frame supports from one end to the other on all our links? Cool.
[14:50:14] yup
[14:50:38] I remember the initial contract saying 1500, did this get changed?
[14:50:57] (iirc that's what Leslie told me a while back)
[14:51:17] yup
[14:51:39] cool
[14:53:30] but yeah, don't even think we'll use that, PMTUD should work fine
[14:53:51] and I don't have time to deal with the jumbo frame peculiarities between hosts on subnets atm ;)
[14:54:08] like, our routers have been configured for jumbo frames everywhere already for a few years, but our linux boxes don't use it
[14:54:10] more RTTs between varnish caches though :)
[14:54:25] not for established connections though
[14:54:44] right, varnish pipelining to the rescue :)
[14:54:48] but sure, it's nice to be able to use that later
[14:57:40] I suppose we could raise the MTU for IPv6 only first, as IPv6 NDP can pass supported MTU
[15:04:39] (PS1) Mark Bergsma: Send OC text traffic to ulsfo [operations/dns] - https://gerrit.wikimedia.org/r/93067
[15:10:36] * Coren probably needs to look into modern IKE implementations
[15:10:44] (CR) Mark Bergsma: [C: 2] Send OC text traffic to ulsfo [operations/dns] - https://gerrit.wikimedia.org/r/93067 (owner: Mark Bergsma)
[15:10:55] I've recovered enough from the queasiness the older ones gave me. :-)
[15:12:43] strongswan is what I was experimenting with recently
[15:13:26] strongswan 5 has gotten into Debian unstable (& wheezy-backports) now, fwiw
[15:13:54] traffic is back on ulsfo
[15:13:55] looking good
[15:26:56] why does ganglia always crash when I put traffic on a different dc? ;)
[15:36:51] yoo paravoid, (how you feeling?)
[15:37:02] i'd like to get the librdkafka and varnishkafka packages in order today
[15:48:02] his temp went up by 3 degrees when you said that ;)
[15:48:07] (degC)
[15:49:47] (PS1) Mark Bergsma: Cache donate.wikimedia.org in ulsfo as well [operations/dns] - https://gerrit.wikimedia.org/r/93071
[15:50:38] haha
[15:51:41] akosiaris: hiiii
[15:51:48] we are looking at moving forward with more kafka stuff soon
[15:51:52] what's the word on libsnappy java/
[15:51:53] ?
[15:52:39] hey [15:52:50] what do you need? [15:54:20] oh hey [15:54:29] librdkafka seems good to go [15:54:47] mind if I push and build 0.8−1 version and put in apt? [15:55:29] that would be 0.8-1~precise1 :) [15:55:33] snaps and I have some config naming issues to think about for varnishkafka, but aside from that its looking good [15:55:33] if I get both of those packages in apt, then I can puppetize and install on mobiles [15:56:00] RECOVERY - Puppet freshness on analytics1021 is OK: puppet ran at Fri Nov 1 15:55:57 UTC 2013 [15:56:03] ha, k! [15:56:03] tanks [15:56:21] ottomata, thanks for taking care of that puppet issue [15:56:32] (I'll upload 0.8-1 to Debian soon) [15:56:35] i think there are some other issues with the varnishkafka package (manpage? other things?) I'll look into those and will get final review from you [15:56:37] probably not today [15:56:38] ok awesome! [16:00:32] mwalker|away: yo, mind if I move your CN fixes that were supposed to go out yesterday to Monday at 2pm? [16:01:17] !log shot one more forceSearchIndex on arsenic because we were back in swap, only one left... [16:01:34] hi greg-g [16:01:34] Logged the message, Master [16:01:38] aude: hi there [16:01:56] did we sort out the issues from yesterday? [16:01:58] oh, aude, you may not have seen the post-mortem/plan for the next half week re deploys :) [16:02:06] * greg-g forwards [16:02:09] no i did not [16:02:44] aude: aude*@gmail or filpe*@gmail ? 
[16:02:49] s/p/b/ [16:02:52] aude [16:02:55] * greg-g nods [16:02:55] aude.wiki [16:02:58] :) [16:02:58] yah [16:03:36] paravoid, ottomata: I want to make a 0.8.0 formal of librdkafka [16:03:42] tagged [16:03:53] so, 0.8.0, not 0.8 [16:04:22] I want to follow apache kafka versioning, M.m.r [16:04:36] aude: I'm updating the [[wikitech:Deployments]] page now to reflect it (not yet saved) [16:05:56] oh ok [16:05:56] sure [16:06:04] (03CR) 10Hashar: "I am pretty sure that is overridden by MediaWiki in Setup.php by doing something like:" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93009 (owner: 10Tim Starling) [16:06:13] mwalker|away: btw, I'll assume it's ok (for you to do the CN fixes at 2pm on Monday) unless I hear from you this morning. 2pm because I want to move the LD up to 3pm and make it an hour (testing one of the suggestions in the fallout) [16:06:22] 0.8.0-1~precise1 [16:07:04] I would like the final 0.8.0 deb pkg to be based on the 0.8.0 tag. Should I do that now or are you building a non-final package? [16:07:13] greg-g: ok, thanks [16:07:37] aude: is that going to work for you/wikidata? [16:07:38] so we still have a bug in wikidata (wmf1) that we'd like to patch [16:07:39] https://gerrit.wikimedia.org/r/#/c/92985/ [16:07:52] it's a small patch, but important to not let it sit [16:07:58] * greg-g nods & looks [16:08:27] otherwise, we are not doing anything new next week except maybe try new config / stuff on beta [16:08:38] oh, aude, so, what's the actual diff that will be hitting the cluster? 
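(Editor's note) The package versions mentioned above (0.8-1, 0.8-1~precise1, 0.8.0-1~precise1) rely on Debian's version ordering, in which a tilde sorts before everything, including the end of the string. That is why a local "~precise1" build sorts below the eventual official upload of the same revision. The following is a simplified Python sketch of that comparison. The function names are hypothetical, and it is an approximation for illustration, not a replacement for `dpkg --compare-versions`.

```python
def _order(ch):
    """Sort weight of one non-digit character: '~' first, then letters, then the rest."""
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare one upstream-version or revision string, Debian style."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit runs character by character; end of string weighs 0.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            wa = _order(a[i]) if i < len(a) and not a[i].isdigit() else 0
            wb = _order(b[j]) if j < len(b) and not b[j].isdigit() else 0
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Compare the digit runs numerically; an empty run counts as 0.
        si, sj = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, _, rest = version.rpartition(":") if ":" in version else ("0", "", version)
    upstream, _, revision = rest.rpartition("-") if "-" in rest else (rest, "", "")
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 depending on whether a sorts before, equal to, or after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


if __name__ == "__main__":
    # The local precise build sorts below the official revision it shadows.
    print(compare_versions("0.8.0-1~precise1", "0.8.0-1"))
    # The 0.8 upstream version sorts below the 0.8.0 tag.
    print(compare_versions("0.8-1~precise1", "0.8.0-1~precise1"))
```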
[16:09:04] I mean, in wikibase [16:09:17] https://gerrit.wikimedia.org/r/#/c/92856/ [16:09:28] it's actually one line of js [16:09:33] + changes in selenium tests [16:10:20] probably will need to 'touch' that file once deployed to purge caches [16:10:43] * greg-g nods [16:10:53] oh make a tag Snaps [16:11:11] good idea [16:11:11] i mean, if you want us to wait until 0.8.0 is actually released by kafka [16:11:11] we can [16:11:17] maybe that will be nov 4 [16:12:06] PROBLEM - DPKG on cerium is CRITICAL: DPKG CRITICAL dpkg reports broken packages [16:14:16] PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:16] PROBLEM - RAID on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:16] PROBLEM - DPKG on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:26] PROBLEM - RAID on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:26] PROBLEM - DPKG on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:46] PROBLEM - Apache HTTP on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:14:46] PROBLEM - SSH on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:14:46] PROBLEM - Apache HTTP on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:14:46] PROBLEM - RAID on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:46] PROBLEM - RAID on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:47] PROBLEM - twemproxy process on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:47] PROBLEM - DPKG on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:48] PROBLEM - RAID on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:48] PROBLEM - RAID on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:14:56] PROBLEM - DPKG on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:14:56] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:06] PROBLEM - RAID on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:06] RECOVERY - DPKG on cerium is OK: All packages OK [16:15:07] PROBLEM - RAID on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:07] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:16] PROBLEM - twemproxy process on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:17] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:17] PROBLEM - RAID on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:17] PROBLEM - Apache HTTP on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:17] PROBLEM - Apache HTTP on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:26] PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:26] PROBLEM - twemproxy process on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:26] PROBLEM - Apache HTTP on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:26] PROBLEM - RAID on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:26] PROBLEM - DPKG on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:27] PROBLEM - DPKG on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:27] PROBLEM - DPKG on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:28] PROBLEM - RAID on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:28] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:15:29] PROBLEM - Apache HTTP on mw1126 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:29] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:30] PROBLEM - DPKG on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:36] PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:36] PROBLEM - Apache HTTP on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:37] PROBLEM - DPKG on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:37] PROBLEM - Apache HTTP on mw1128 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:46] RECOVERY - SSH on mw1121 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:15:46] PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:46] PROBLEM - RAID on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:46] PROBLEM - twemproxy process on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:46] PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:46] PROBLEM - Apache HTTP on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:47] PROBLEM - twemproxy process on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:15:56] PROBLEM - Apache HTTP on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:15:56] RECOVERY - RAID on mw1145 is OK: OK: no RAID installed [16:16:06] RECOVERY - Disk space on mw1143 is OK: DISK OK [16:16:07] PROBLEM - SSH on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:16:07] RECOVERY - Apache HTTP on mw1139 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.202 second response time [16:16:16] PROBLEM - RAID on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:16:16] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:16:16] PROBLEM - Disk space on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:16:17] RECOVERY - Apache HTTP on mw1126 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.373 second response time [16:16:17] RECOVERY - Apache HTTP on mw1146 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.293 second response time [16:16:26] PROBLEM - twemproxy process on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:16:26] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:16:26] RECOVERY - DPKG on mw1116 is OK: All packages OK [16:16:26] PROBLEM - Apache HTTP on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:16:26] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:16:36] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:16:36] RECOVERY - Apache HTTP on mw1120 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.895 second response time [16:16:37] RECOVERY - Apache HTTP on mw1128 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.554 second response time [16:16:37] RECOVERY - RAID on mw1116 is OK: OK: no RAID installed [16:16:37] RECOVERY - DPKG on mw1115 is OK: All packages OK [16:16:37] RECOVERY - RAID on mw1115 is OK: OK: no RAID installed [16:16:37] RECOVERY - Apache HTTP on mw1140 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.693 second response time [16:16:46] RECOVERY - twemproxy process on mw1141 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:16:46] PROBLEM - DPKG on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:16:56] RECOVERY - DPKG on mw1140 is OK: All packages OK [16:17:06] RECOVERY - twemproxy process on mw1126 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:17:07] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:07] RECOVERY - Disk space on mw1123 is OK: DISK OK [16:17:16] PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:17:16] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:16] RECOVERY - twemproxy process on mw1123 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:17:16] RECOVERY - Apache HTTP on mw1123 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.278 second response time [16:17:16] RECOVERY - DPKG on mw1126 is OK: All packages OK [16:17:17] RECOVERY - RAID on mw1126 is OK: OK: no RAID installed [16:17:26] PROBLEM - SSH on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:26] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:17:36] RECOVERY - Apache HTTP on mw1144 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.511 second response time [16:17:36] RECOVERY - twemproxy process on mw1143 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:17:46] PROBLEM - SSH on mw1126 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:46] RECOVERY - RAID on mw1141 is OK: OK: no RAID installed [16:17:46] PROBLEM - Disk space on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:17:46] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.957 second response time [16:17:56] RECOVERY - SSH on mw1120 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:18:06] RECOVERY - RAID on mw1120 is OK: OK: no RAID installed [16:18:07] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [16:18:07] RECOVERY - Apache HTTP on mw1143 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.370 second response time [16:18:16] RECOVERY - SSH on mw1141 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:18:16] RECOVERY - DPKG on mw1132 is OK: All packages OK [16:18:16] RECOVERY - DPKG on mw1139 is OK: All packages OK [16:18:17] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [16:18:26] RECOVERY - DPKG on mw1128 is OK: All packages OK [16:18:26] PROBLEM - DPKG on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:18:26] RECOVERY - DPKG on mw1141 is OK: All packages OK [16:18:36] RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.284 second response time [16:18:36] RECOVERY - DPKG on mw1120 is OK: All packages OK [16:18:46] PROBLEM - SSH on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:18:46] PROBLEM - RAID on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:18:46] PROBLEM - Apache HTTP on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:18:46] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:18:46] PROBLEM - Apache HTTP on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:18:46] PROBLEM - RAID on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:18:47] PROBLEM - DPKG on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:18:47] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:18:48] PROBLEM - RAID on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:18:56] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:18:56] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:06] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:06] PROBLEM - RAID on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:06] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:06] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:19:06] RECOVERY - RAID on mw1143 is OK: OK: no RAID installed [16:19:07] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:19:07] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:19:07] RECOVERY - DPKG on mw1123 is OK: All packages OK [16:19:08] PROBLEM - DPKG on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:08] PROBLEM - RAID on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:09] PROBLEM - Apache HTTP on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:09] RECOVERY - DPKG on mw1135 is OK: All packages OK [16:19:16] PROBLEM - Apache HTTP on mw1125 is CRITICAL: Connection timed out [16:19:16] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:16] PROBLEM - DPKG on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:16] PROBLEM - RAID on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:16] PROBLEM - DPKG on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:16] PROBLEM - DPKG on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:19:16] PROBLEM - DPKG on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:17] PROBLEM - Apache HTTP on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:17] PROBLEM - Apache HTTP on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:18] PROBLEM - Apache HTTP on mw1124 is CRITICAL: Connection timed out [16:19:26] RECOVERY - RAID on mw1135 is OK: OK: no RAID installed [16:19:26] RECOVERY - DPKG on mw1143 is OK: All packages OK [16:19:26] PROBLEM - Apache HTTP on mw1116 is CRITICAL: Connection timed out [16:19:26] PROBLEM - Apache HTTP on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:26] PROBLEM - RAID on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:26] PROBLEM - Apache HTTP on mw1136 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:26] PROBLEM - Apache HTTP on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:27] PROBLEM - RAID on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:27] PROBLEM - RAID on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:28] PROBLEM - Apache HTTP on mw1119 is CRITICAL: Connection timed out [16:19:28] PROBLEM - RAID on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:36] RECOVERY - SSH on mw1121 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:19:36] PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:36] PROBLEM - DPKG on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:36] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:36] PROBLEM - RAID on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:19:36] PROBLEM - Apache HTTP on mw1133 is CRITICAL: Connection timed out [16:19:37] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:19:37] PROBLEM - RAID on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:38] RECOVERY - Apache HTTP on mw1121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time [16:19:38] RECOVERY - DPKG on mw1134 is OK: All packages OK [16:19:46] RECOVERY - RAID on mw1134 is OK: OK: no RAID installed [16:19:46] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:46] PROBLEM - Apache HTTP on mw1114 is CRITICAL: Connection timed out [16:19:46] PROBLEM - RAID on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:46] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:47] PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:19:47] PROBLEM - Apache HTTP on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:19:56] PROBLEM - DPKG on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:06] RECOVERY - Apache HTTP on mw1137 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.796 second response time [16:20:06] PROBLEM - Apache HTTP on mw1130 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:06] PROBLEM - Apache HTTP on mw1118 is CRITICAL: Connection timed out [16:20:06] RECOVERY - DPKG on mw1148 is OK: All packages OK [16:20:06] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:16] PROBLEM - Apache HTTP on mw1117 is CRITICAL: Connection timed out [16:20:16] PROBLEM - twemproxy process on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:16] PROBLEM - Disk space on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:20:16] PROBLEM - SSH on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:16] PROBLEM - twemproxy process on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:17] RECOVERY - RAID on mw1131 is OK: OK: no RAID installed [16:20:22] eek [16:20:26] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:20:26] PROBLEM - twemproxy process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:26] PROBLEM - Apache HTTP on mw1126 is CRITICAL: Connection timed out [16:20:26] PROBLEM - SSH on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:26] PROBLEM - DPKG on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:27] PROBLEM - Apache HTTP on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:27] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:28] PROBLEM - RAID on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:20:28] RECOVERY - DPKG on mw1146 is OK: All packages OK [16:20:36] PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:46] PROBLEM - RAID on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:20:46] PROBLEM - Apache HTTP on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:20:46] RECOVERY - Apache HTTP on mw1132 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.283 second response time [16:20:46] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:20:56] PROBLEM - Apache HTTP on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:21:06] RECOVERY - RAID on mw1146 is OK: OK: no RAID installed [16:21:06] RECOVERY - Disk space on mw1122 is OK: DISK OK [16:21:07] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:21:07] RECOVERY - RAID on mw1148 is OK: OK: no RAID installed [16:21:07] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:21:07] PROBLEM - twemproxy process on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:21:16] RECOVERY - SSH on mw1142 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:21:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:21:16] RECOVERY - SSH on mw1123 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:21:16] RECOVERY - Apache HTTP on mw1123 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.122 second response time [16:21:26] RECOVERY - twemproxy process on mw1122 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:21:26] PROBLEM - RAID on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:21:26] RECOVERY - Apache HTTP on mw1136 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.122 second response time [16:21:26] PROBLEM - twemproxy process on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:21:26] PROBLEM - DPKG on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:21:27] RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:21:36] RECOVERY - RAID on mw1123 is OK: OK: no RAID installed [16:21:36] RECOVERY - RAID on mw1124 is OK: OK: no RAID installed [16:21:46] PROBLEM - DPKG on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:21:46] RECOVERY - RAID on mw1128 is OK: OK: no RAID installed [16:21:46] RECOVERY - DPKG on mw1122 is OK: All packages OK [16:21:46] PROBLEM - DPKG on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:21:56] PROBLEM - RAID on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:22:06] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:22:06] PROBLEM - SSH on mw1119 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:22:16] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:22:16] PROBLEM - SSH on mw1130 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:22:17] RECOVERY - RAID on mw1125 is OK: OK: no RAID installed [16:22:26] PROBLEM - DPKG on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:22:26] PROBLEM - DPKG on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:22:46] PROBLEM - SSH on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:22:46] PROBLEM - twemproxy process on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:22:46] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:06] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:07] PROBLEM - RAID on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:07] PROBLEM - Disk space on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:16] PROBLEM - twemproxy process on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:23:16] PROBLEM - DPKG on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:16] PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:16] PROBLEM - Disk space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:16] PROBLEM - Apache HTTP on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:26] PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:26] PROBLEM - Apache HTTP on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:26] PROBLEM - twemproxy process on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:26] PROBLEM - twemproxy process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:26] PROBLEM - RAID on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:36] PROBLEM - Disk space on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:36] PROBLEM - Disk space on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:36] PROBLEM - Apache HTTP on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:46] PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:46] PROBLEM - SSH on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:46] RECOVERY - twemproxy process on mw1143 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:23:46] PROBLEM - DPKG on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:23:46] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:47] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:06] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:24:07] PROBLEM - twemproxy process on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:24:16] PROBLEM - SSH on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:24:16] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:24:16] PROBLEM - RAID on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:26] RECOVERY - DPKG on mw1137 is OK: All packages OK [16:24:26] PROBLEM - Disk space on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:26] PROBLEM - Apache HTTP on mw1146 is CRITICAL: Connection timed out [16:24:26] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:26] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:36] PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:24:36] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:36] PROBLEM - Apache HTTP on mw1128 is CRITICAL: Connection timed out [16:24:36] RECOVERY - RAID on mw1137 is OK: OK: no RAID installed [16:24:37] PROBLEM - RAID on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:37] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:24:46] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:24:46] RECOVERY - Disk space on mw1144 is OK: DISK OK [16:24:46] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.453 second response time [16:24:56] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:24:56] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:24:56] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:25:06] PROBLEM - twemproxy process on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:25:07] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:25:07] PROBLEM - Disk space on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:07] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:07] RECOVERY - DPKG on mw1135 is OK: All packages OK [16:25:16] PROBLEM - twemproxy process on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:16] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:16] RECOVERY - RAID on mw1135 is OK: OK: no RAID installed [16:25:16] PROBLEM - RAID on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:17] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:17] PROBLEM - twemproxy process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:18] RECOVERY - twemproxy process on mw1135 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:25:18] RECOVERY - RAID on mw1122 is OK: OK: no RAID installed [16:25:19] RECOVERY - Disk space on mw1142 is OK: DISK OK [16:25:26] PROBLEM - RAID on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:26] PROBLEM - twemproxy process on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:26] PROBLEM - SSH on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:26] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:26] PROBLEM - DPKG on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:26] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:25:36] RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.726 second response time [16:25:36] PROBLEM - SSH on mw1128 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:37] RECOVERY - DPKG on mw1136 is OK: All packages OK [16:25:46] RECOVERY - RAID on mw1136 is OK: OK: no RAID installed [16:25:46] PROBLEM - RAID on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:46] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:25:46] PROBLEM - Apache HTTP on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:25:56] PROBLEM - twemproxy process on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:06] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:26:06] PROBLEM - DPKG on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:06] PROBLEM - DPKG on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:07] RECOVERY - DPKG on mw1132 is OK: All packages OK [16:26:07] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:16] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:26:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:26:16] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:26:16] PROBLEM - twemproxy process on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:16] PROBLEM - Disk space on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:26:17] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:26:17] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [16:26:26] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:26:26] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:26:36] RECOVERY - Apache HTTP on mw1144 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.230 second response time [16:26:46] PROBLEM - twemproxy process on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:46] PROBLEM - twemproxy process on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:26:46] RECOVERY - twemproxy process on mw1144 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:27:06] PROBLEM - Apache HTTP on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:27:07] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:27:16] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:27:16] RECOVERY - twemproxy process on mw1131 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:27:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:27:36] PROBLEM - SSH on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:27:46] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:27:46] PROBLEM - Apache HTTP on mw1122 is CRITICAL: Connection timed out [16:27:56] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:28:06] RECOVERY - Disk space on mw1117 is OK: DISK OK [16:28:06] RECOVERY - twemproxy process on mw1117 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:28:06] RECOVERY - twemproxy process on mw1130 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:28:06] RECOVERY - Disk space on mw1114 is OK: DISK OK [16:28:07] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:07] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:16] RECOVERY - SSH on mw1130 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:28:16] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:28:16] RECOVERY - Apache HTTP on mw1131 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.066 second response time [16:28:26] PROBLEM - Disk space on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:26] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:28:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:26] RECOVERY - RAID on mw1131 is OK: OK: no RAID installed [16:28:26] RECOVERY - SSH on mw1128 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:28:27] PROBLEM - DPKG on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:28:27] RECOVERY - SSH on mw1117 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:28:28] RECOVERY - Disk space on mw1124 is OK: DISK OK [16:28:36] RECOVERY - twemproxy process on mw1128 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:28:46] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:28:46] PROBLEM - RAID on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:46] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:28:56] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:28:56] RECOVERY - Disk space on mw1138 is OK: DISK OK [16:29:06] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:29:06] RECOVERY - DPKG on mw1131 is OK: All packages OK [16:29:07] RECOVERY - Disk space on mw1133 is OK: DISK OK [16:29:07] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:29:16] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:29:16] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:29:26] RECOVERY - DPKG on mw1138 is OK: All packages OK [16:29:26] PROBLEM - twemproxy process on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:29:26] PROBLEM - RAID on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:29:26] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:29:36] RECOVERY - SSH on mw1131 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:29:36] RECOVERY - RAID on mw1138 is OK: OK: no RAID installed [16:29:46] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:29:46] PROBLEM - Disk space on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:30:06] RECOVERY - Disk space on mw1143 is OK: DISK OK [16:30:06] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:30:06] RECOVERY - SSH on mw1142 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:30:06] RECOVERY - twemproxy process on mw1118 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:30:06] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:30:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:30:16] RECOVERY - Disk space on mw1148 is OK: DISK OK [16:30:26] RECOVERY - SSH on mw1133 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:30:26] RECOVERY - DPKG on mw1144 is OK: All packages OK [16:30:36] PROBLEM - SSH on mw1125 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:30:36] RECOVERY - Disk space on mw1146 is OK: DISK OK [16:30:37] RECOVERY - RAID on mw1144 is OK: OK: no RAID installed [16:30:46] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:30:46] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:30:56] RECOVERY - twemproxy process on mw1125 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:31:06] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:31:06] PROBLEM - twemproxy process on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:31:07] PROBLEM - Disk space on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:31:16] PROBLEM - twemproxy process on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:31:16] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:31:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:31:26] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:31:36] PROBLEM - SSH on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:31:36] RECOVERY - Apache HTTP on mw1128 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.139 second response time [16:31:37] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:31:46] PROBLEM - Disk space on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:32:06] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:32:06] RECOVERY - twemproxy process on mw1130 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:32:07] RECOVERY - twemproxy process on mw1117 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:32:07] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:32:16] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:32:16] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:32:17] RECOVERY - DPKG on mw1128 is OK: All packages OK [16:32:26] RECOVERY - DPKG on mw1116 is OK: All packages OK [16:32:26] RECOVERY - SSH on mw1117 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:32:26] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:32:36] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:32:36] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:32:36] RECOVERY - RAID on mw1128 is OK: OK: no RAID installed [16:32:37] RECOVERY - Disk space on mw1118 is OK: DISK OK [16:32:46] PROBLEM - RAID on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:32:46] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:33:06] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:33:06] RECOVERY - Apache HTTP on mw1137 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.232 second response time [16:33:07] RECOVERY - Disk space on mw1125 is OK: DISK OK [16:33:07] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:33:07] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:33:07] PROBLEM - Disk space on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:33:16] PROBLEM - twemproxy process on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:33:16] PROBLEM - SSH on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:33:16] RECOVERY - Apache HTTP on mw1146 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [16:33:17] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:33:17] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:33:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:33:26] RECOVERY - SSH on mw1125 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:33:26] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:33:36] RECOVERY - DPKG on mw1146 is OK: All packages OK [16:33:46] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:33:46] RECOVERY - RAID on mw1137 is OK: OK: no RAID installed [16:34:06] RECOVERY - RAID on mw1146 is OK: OK: no RAID installed [16:34:06] RECOVERY - Apache HTTP on mw1125 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.633 second response time [16:34:16] PROBLEM - Disk space on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:34:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:34:17] RECOVERY - DPKG on mw1137 is OK: All packages OK [16:34:17] RECOVERY - Disk space on mw1148 is OK: DISK OK [16:34:26] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:34:26] RECOVERY - DPKG on mw1143 is OK: All packages OK [16:34:46] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:34:56] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:34:56] RECOVERY - DPKG on mw1125 is OK: All packages OK [16:35:06] RECOVERY - Disk space on mw1114 is OK: DISK OK [16:35:07] RECOVERY - twemproxy process on mw1114 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:35:16] PROBLEM - SSH on mw1130 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:35:16] RECOVERY - RAID on mw1125 is OK: OK: no RAID installed [16:35:26] RECOVERY - DPKG on mw1114 is OK: All packages OK [16:35:36] PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:35:36] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:35:46] PROBLEM - Disk space on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:35:46] PROBLEM - Disk space on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:35:46] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:35:46] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.053 second response time [16:36:06] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:36:07] RECOVERY - twemproxy process on mw1118 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:36:07] PROBLEM - twemproxy process on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:36:07] RECOVERY - Disk space on mw1122 is OK: DISK OK [16:36:07] RECOVERY - SSH on mw1130 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:36:16] RECOVERY - twemproxy process on mw1122 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:36:17] RECOVERY - Disk space on mw1142 is OK: DISK OK [16:36:26] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:36:26] PROBLEM - SSH on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:36:36] RECOVERY - Disk space on mw1116 is OK: DISK OK [16:36:36] PROBLEM - SSH on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:36:37] RECOVERY - Disk space on mw1130 is OK: DISK OK [16:36:59] (03PS2) 10MZMcBride: Remove codereview specific config file, collaps into CommonSettings.php [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/92605 (owner: 10Reedy) [16:37:06] RECOVERY - twemproxy process on mw1133 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:37:16] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout 
after 10 seconds [16:37:16] RECOVERY - RAID on mw1122 is OK: OK: no RAID installed [16:37:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:37:26] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:37:26] PROBLEM - DPKG on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:37:46] RECOVERY - Disk space on mw1121 is OK: DISK OK [16:37:46] RECOVERY - DPKG on mw1122 is OK: All packages OK [16:37:56] RECOVERY - Apache HTTP on mw1130 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.055 second response time [16:37:56] RECOVERY - twemproxy process on mw1130 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:38:06] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:38:06] RECOVERY - Disk space on mw1127 is OK: DISK OK [16:38:06] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:38:07] RECOVERY - DPKG on mw1130 is OK: All packages OK [16:38:07] PROBLEM - Disk space on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:38:16] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:38:16] PROBLEM - twemproxy process on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:38:16] PROBLEM - Apache HTTP on mw1125 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:38:17] PROBLEM - twemproxy process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:38:17] RECOVERY - Apache HTTP on mw1138 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.058 second response time [16:38:17] RECOVERY - Disk space on mw1148 is OK: DISK OK [16:38:26] RECOVERY - DPKG on mw1138 is OK: All packages OK [16:38:26] RECOVERY - SSH on mw1117 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:38:26] RECOVERY - RAID on mw1130 is OK: OK: no RAID installed [16:38:26] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:38:27] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:38:36] RECOVERY - RAID on mw1138 is OK: OK: no RAID installed [16:38:36] RECOVERY - SSH on mw1121 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:38:37] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:38:46] RECOVERY - twemproxy process on mw1121 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:39:06] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:39:16] PROBLEM - twemproxy process on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:39:26] RECOVERY - RAID on mw1121 is OK: OK: no RAID installed [16:39:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:39:26] PROBLEM - Disk space on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:39:26] RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:39:26] RECOVERY - SSH on mw1133 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:39:26] RECOVERY - DPKG on mw1121 is OK: All packages OK [16:39:26] RECOVERY - Disk space on mw1140 is OK: DISK OK [16:39:36] RECOVERY - Apache HTTP on mw1121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.657 second response time [16:39:36] RECOVERY - Disk space on mw1118 is OK: DISK OK [16:39:46] RECOVERY - Disk space on mw1129 is OK: DISK OK [16:39:46] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:39:56] PROBLEM - Apache HTTP on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:39:56] RECOVERY - Disk space on mw1145 is OK: DISK OK [16:40:06] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:40:06] RECOVERY - Disk space on mw1117 is OK: DISK OK [16:40:07] RECOVERY - twemproxy process on mw1114 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:40:07] PROBLEM - DPKG on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:40:07] PROBLEM - twemproxy process on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:40:17] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:40:26] PROBLEM - RAID on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:40:26] PROBLEM - twemproxy process on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:40:26] PROBLEM - RAID on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:40:46] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time [16:40:56] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:40:56] RECOVERY - DPKG on mw1125 is OK: All packages OK [16:41:06] RECOVERY - Apache HTTP on mw1125 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [16:41:06] RECOVERY - Disk space on mw1133 is OK: DISK OK [16:41:16] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:41:16] RECOVERY - RAID on mw1125 is OK: OK: no RAID installed [16:41:16] RECOVERY - twemproxy process on mw1122 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:41:17] RECOVERY - RAID on mw1122 is OK: OK: no RAID installed [16:41:26] RECOVERY - twemproxy process on mw1127 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:41:26] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:41:36] PROBLEM - SSH on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:41:46] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:41:46] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.722 second response time [16:41:46] RECOVERY - DPKG on mw1122 is OK: All packages OK [16:42:06] RECOVERY - Disk space on mw1143 is OK: DISK OK [16:42:06] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:42:17] RECOVERY - DPKG on mw1114 is OK: All packages OK [16:42:17] RECOVERY - RAID on mw1114 is OK: OK: no RAID installed [16:42:26] RECOVERY - Apache HTTP on mw1142 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.030 second response time [16:42:26] RECOVERY - twemproxy process on mw1140 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:42:36] RECOVERY - SSH on mw1117 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:42:36] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:42:46] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:42:46] PROBLEM - Disk space on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:42:46] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:42:56] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:42:56] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:43:06] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:43:06] RECOVERY - twemproxy process on mw1117 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:43:07] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:43:07] RECOVERY - Apache HTTP on mw1129 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.269 second response time [16:43:07] PROBLEM - Apache HTTP on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:43:07] RECOVERY - twemproxy process on mw1118 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:43:16] RECOVERY - Apache HTTP on mw1116 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.040 second response time [16:43:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:43:26] PROBLEM - RAID on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:43:26] RECOVERY - Disk space on mw1124 is OK: DISK OK [16:43:46] RECOVERY - twemproxy process on mw1143 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:43:46] PROBLEM - RAID on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:43:46] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:43:56] RECOVERY - Disk space on mw1145 is OK: DISK OK [16:43:56] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:43:56] RECOVERY - Apache HTTP on mw1118 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [16:43:56] RECOVERY - Apache HTTP on mw1137 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.370 second response time [16:44:06] RECOVERY - Apache HTTP on mw1117 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.060 second response time [16:44:06] RECOVERY - Apache HTTP on mw1143 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.362 second response time [16:44:07] PROBLEM - Disk space on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:44:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:44:16] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:44:16] RECOVERY - DPKG on mw1117 is OK: All packages OK [16:44:26] RECOVERY - RAID on mw1118 is OK: OK: no RAID installed [16:44:26] RECOVERY - DPKG on mw1116 is OK: All packages OK [16:44:26] PROBLEM - twemproxy process on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:44:26] RECOVERY - RAID on mw1117 is OK: OK: no RAID installed [16:44:36] RECOVERY - Disk space on mw1116 is OK: DISK OK [16:44:36] RECOVERY - Apache HTTP on mw1140 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time [16:44:46] RECOVERY - Disk space on mw1129 is OK: DISK OK [16:44:46] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:44:46] RECOVERY - RAID on mw1137 is OK: OK: no RAID installed [16:45:06] RECOVERY - RAID on mw1143 is OK: OK: no RAID installed [16:45:06] RECOVERY - twemproxy process on mw1129 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:45:06] RECOVERY - Disk space on mw1133 is OK: DISK OK [16:45:06] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:45:07] RECOVERY - twemproxy process on mw1133 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:45:16] RECOVERY - DPKG on mw1129 is OK: All packages OK [16:45:16] PROBLEM - DPKG on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:45:17] RECOVERY - Apache HTTP on mw1126 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.340 second response time [16:45:26] RECOVERY - DPKG on mw1143 is OK: All packages OK [16:45:26] PROBLEM - RAID on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:45:26] RECOVERY - Apache HTTP on mw1133 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.060 second response time [16:45:36] PROBLEM - RAID on mw1130 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:45:36] RECOVERY - RAID on mw1129 is OK: OK: no RAID installed [16:45:36] RECOVERY - RAID on mw1116 is OK: OK: no RAID installed [16:45:36] RECOVERY - DPKG on mw1133 is OK: All packages OK [16:45:46] PROBLEM - Disk space on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:45:46] RECOVERY - DPKG on mw1140 is OK: All packages OK [16:45:46] PROBLEM - Apache HTTP on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:46:06] RECOVERY - twemproxy process on mw1142 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:46:06] RECOVERY - RAID on mw1140 is OK: OK: no RAID installed [16:46:06] RECOVERY - SSH on mw1142 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:46:06] RECOVERY - RAID on mw1133 is OK: OK: no RAID installed [16:46:07] RECOVERY - DPKG on mw1142 is OK: All packages OK [16:46:07] RECOVERY - DPKG on mw1130 is OK: All packages OK [16:46:16] RECOVERY - Disk space on mw1142 is OK: DISK OK [16:46:26] PROBLEM - DPKG on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:46:26] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:46:36] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [16:46:46] PROBLEM - twemproxy process on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:46:46] PROBLEM - Disk space on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:46:56] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:46:56] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:47:06] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:47:16] PROBLEM - twemproxy process on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:47:16] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:47:26] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:47:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:47:26] RECOVERY - Disk space on mw1126 is OK: DISK OK [16:47:26] PROBLEM - RAID on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:47:36] RECOVERY - RAID on mw1130 is OK: OK: no RAID installed [16:47:36] RECOVERY - SSH on mw1126 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:47:36] RECOVERY - Disk space on mw1121 is OK: DISK OK [16:47:36] RECOVERY - twemproxy process on mw1121 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:47:46] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:47:46] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:47:56] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:06] RECOVERY - twemproxy process on mw1126 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:48:06] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:48:07] PROBLEM - Disk space on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:16] PROBLEM - Apache HTTP on mw1129 is CRITICAL: Connection timed out [16:48:16] PROBLEM - RAID on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:16] PROBLEM - DPKG on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:16] PROBLEM - RAID on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:48:16] PROBLEM - Apache HTTP on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:16] RECOVERY - DPKG on mw1126 is OK: All packages OK [16:48:17] PROBLEM - Apache HTTP on mw1131 is CRITICAL: Connection timed out [16:48:17] RECOVERY - RAID on mw1126 is OK: OK: no RAID installed [16:48:18] RECOVERY - Disk space on mw1148 is OK: DISK OK [16:48:18] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:48:22] shut up [16:48:26] PROBLEM - Apache HTTP on mw1136 is CRITICAL: Connection timed out [16:48:26] PROBLEM - Apache HTTP on mw1138 is CRITICAL: Connection timed out [16:48:26] PROBLEM - twemproxy process on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:26] PROBLEM - Apache HTTP on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:26] PROBLEM - RAID on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:27] PROBLEM - Apache HTTP on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:27] PROBLEM - DPKG on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:28] PROBLEM - DPKG on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:28] PROBLEM - RAID on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:29] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:36] PROBLEM - Disk space on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:36] PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:36] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:48:36] RECOVERY - Disk space on mw1118 is OK: DISK OK [16:48:36] PROBLEM - Apache HTTP on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:37] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:48:46] PROBLEM - Apache HTTP on mw1134 is CRITICAL: Connection timed out [16:48:46] PROBLEM - RAID on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:46] PROBLEM - RAID on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:46] PROBLEM - Apache HTTP on mw1147 is CRITICAL: Connection timed out [16:48:46] PROBLEM - Apache HTTP on mw1132 is CRITICAL: Connection timed out [16:48:47] PROBLEM - Apache HTTP on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:47] PROBLEM - RAID on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:48] RECOVERY - DPKG on mw1122 is OK: All packages OK [16:48:48] PROBLEM - DPKG on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:49] PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:49] PROBLEM - RAID on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:50] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:56] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:48:56] PROBLEM - twemproxy process on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:56] PROBLEM - RAID on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:48:56] PROBLEM - DPKG on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:48:56] RECOVERY - Disk space on mw1145 is OK: DISK OK [16:49:06] RECOVERY - twemproxy process on mw1118 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:49:06] PROBLEM - Apache HTTP on mw1137 is CRITICAL: Connection timed out [16:49:07] PROBLEM - DPKG on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:16] PROBLEM - Apache HTTP on mw1139 is CRITICAL: Connection timed out [16:49:16] PROBLEM - twemproxy process on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:16] PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:16] PROBLEM - twemproxy process on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:16] PROBLEM - RAID on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:16] PROBLEM - DPKG on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:17] RECOVERY - twemproxy process on mw1127 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:49:26] RECOVERY - RAID on mw1118 is OK: OK: no RAID installed [16:49:36] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.060 second response time [16:49:46] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:49:46] PROBLEM - SSH on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:49:46] RECOVERY - twemproxy process on mw1144 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:50:06] RECOVERY - RAID on mw1120 is OK: OK: no RAID installed [16:50:06] RECOVERY - twemproxy process on mw1142 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:50:07] PROBLEM - Disk space on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:50:16] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:50:26] RECOVERY - Disk space on mw1124 is OK: DISK OK [16:50:26] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:50:26] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:50:36] PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:50:36] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [16:50:46] PROBLEM - twemproxy process on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:50:46] PROBLEM - RAID on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:50:46] PROBLEM - Disk space on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:16] PROBLEM - DPKG on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:51:26] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:51:26] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:26] PROBLEM - DPKG on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:36] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:51:46] PROBLEM - SSH on mw1126 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:51:46] PROBLEM - DPKG on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:46] PROBLEM - twemproxy process on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:51:56] PROBLEM - Apache HTTP on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:51:56] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:51:56] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:06] PROBLEM - Apache HTTP on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:06] RECOVERY - Disk space on mw1117 is OK: DISK OK [16:52:06] PROBLEM - RAID on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:06] PROBLEM - Apache HTTP on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:07] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:07] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:07] RECOVERY - DPKG on mw1142 is OK: All packages OK [16:52:16] PROBLEM - DPKG on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:16] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:17] PROBLEM - twemproxy process on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:17] PROBLEM - Apache HTTP on mw1125 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:17] PROBLEM - Apache HTTP on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:17] PROBLEM - RAID on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:18] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:52:18] PROBLEM - twemproxy process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:19] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:52:26] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:52:26] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:52:26] PROBLEM - twemproxy process on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:26] PROBLEM - Apache HTTP on mw1126 is CRITICAL: Connection timed out [16:52:26] PROBLEM - Apache HTTP on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:27] PROBLEM - RAID on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:27] PROBLEM - DPKG on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:28] PROBLEM - RAID on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:28] PROBLEM - DPKG on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:29] PROBLEM - RAID on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:36] PROBLEM - Disk space on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:36] PROBLEM - Apache HTTP on mw1128 is CRITICAL: Connection timed out [16:52:46] RECOVERY - Apache HTTP on mw1148 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.618 second response time [16:52:46] PROBLEM - Apache HTTP on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:52:46] PROBLEM - Apache HTTP on mw1114 is CRITICAL: Connection timed out [16:52:46] PROBLEM - RAID on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:46] PROBLEM - DPKG on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:46] PROBLEM - RAID on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:46] PROBLEM - RAID on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:47] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:52:56] PROBLEM - twemproxy process on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:53:06] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:53:06] PROBLEM - SSH on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:53:07] RECOVERY - Apache HTTP on mw1117 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.219 second response time [16:53:16] PROBLEM - twemproxy process on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:16] PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:53:16] PROBLEM - RAID on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:16] PROBLEM - Disk space on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:26] PROBLEM - SSH on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:53:26] PROBLEM - RAID on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:26] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:26] PROBLEM - DPKG on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:26] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:33] um, that is a lot of complaining [16:53:36] PROBLEM - Disk space on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:46] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:46] PROBLEM - Apache HTTP on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:53:46] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:53:46] PROBLEM - RAID on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:54:06] PROBLEM - Apache HTTP on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:54:07] PROBLEM - DPKG on mw1125 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:07] RECOVERY - twemproxy process on mw1114 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:54:16] PROBLEM - Disk space on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:26] RECOVERY - twemproxy process on mw1127 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:54:26] PROBLEM - twemproxy process on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:26] RECOVERY - Disk space on mw1124 is OK: DISK OK [16:54:26] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:26] PROBLEM - twemproxy process on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:26] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:26] PROBLEM - Disk space on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:36] PROBLEM - Disk space on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:36] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:54:37] PROBLEM - DPKG on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:46] PROBLEM - Disk space on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:54:46] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:54:46] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:54:46] PROBLEM - SSH on mw1147 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:54:56] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:54:56] RECOVERY - Disk space on mw1127 is OK: DISK OK [16:55:06] PROBLEM - Disk space on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:06] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:55:06] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:55:06] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:55:07] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:16] RECOVERY - Apache HTTP on mw1124 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.330 second response time [16:55:16] PROBLEM - twemproxy process on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:16] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:16] RECOVERY - Disk space on mw1148 is OK: DISK OK [16:55:17] RECOVERY - RAID on mw1117 is OK: OK: no RAID installed [16:55:17] RECOVERY - DPKG on mw1117 is OK: All packages OK [16:55:17] PROBLEM - twemproxy process on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:26] RECOVERY - DPKG on mw1127 is OK: All packages OK [16:55:26] PROBLEM - twemproxy process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:55:26] PROBLEM - SSH on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:26] PROBLEM - SSH on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:26] RECOVERY - RAID on mw1124 is OK: OK: no RAID installed [16:55:27] PROBLEM - DPKG on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:36] RECOVERY - DPKG on mw1124 is OK: All packages OK [16:55:36] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [16:55:36] RECOVERY - Disk space on mw1147 is OK: DISK OK [16:55:36] RECOVERY - RAID on mw1127 is OK: OK: no RAID installed [16:55:36] RECOVERY - SSH on mw1121 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:55:37] RECOVERY - Apache HTTP on mw1127 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time [16:55:37] RECOVERY - twemproxy process on mw1121 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:55:38] RECOVERY - Disk space on mw1121 is OK: DISK OK [16:55:46] PROBLEM - Apache HTTP on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:46] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:46] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:55:46] PROBLEM - Apache HTTP on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:56:06] RECOVERY - DPKG on mw1125 is OK: All packages OK [16:56:06] RECOVERY - twemproxy process on mw1118 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:56:06] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:56:16] PROBLEM - Disk space on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:16] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:56:17] RECOVERY - twemproxy process on mw1123 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:56:26] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:26] PROBLEM - twemproxy process on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:26] RECOVERY - SSH on mw1123 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:56:26] RECOVERY - RAID on mw1118 is OK: OK: no RAID installed [16:56:36] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:36] RECOVERY - Apache HTTP on mw1148 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.064 second response time [16:56:46] PROBLEM - SSH on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:56:46] PROBLEM - twemproxy process on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:46] PROBLEM - DPKG on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:46] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:46] PROBLEM - twemproxy process on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:56:56] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:57:06] RECOVERY - Apache HTTP on mw1118 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.666 second response time [16:57:06] RECOVERY - Apache HTTP on mw1125 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.044 second response time [16:57:16] PROBLEM - Disk space on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:57:16] RECOVERY - SSH on mw1139 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:57:16] PROBLEM - twemproxy process on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:57:17] PROBLEM - twemproxy process on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:57:26] RECOVERY - SSH on mw1141 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:57:26] RECOVERY - DPKG on mw1121 is OK: All packages OK [16:57:36] RECOVERY - Apache HTTP on mw1121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [16:57:46] RECOVERY - DPKG on mw1118 is OK: All packages OK [16:57:46] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:58:06] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:58:07] PROBLEM - twemproxy process on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:07] PROBLEM - Disk space on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:07] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:58:07] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:58:16] RECOVERY - twemproxy process on mw1135 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:58:17] RECOVERY - RAID on mw1121 is OK: OK: no RAID installed [16:58:26] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:26] PROBLEM - twemproxy process on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:26] PROBLEM - Disk space on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:36] PROBLEM - DPKG on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:36] RECOVERY - Disk space on mw1116 is OK: DISK OK [16:58:36] RECOVERY - Disk space on mw1144 is OK: DISK OK [16:58:46] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:58:46] PROBLEM - Apache HTTP on mw1127 is CRITICAL: Connection timed out [16:58:46] RECOVERY - Disk space on mw1129 is OK: DISK OK [16:58:46] PROBLEM - RAID on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:46] PROBLEM - twemproxy process on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:58:47] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:58:56] PROBLEM - SSH on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:58:56] RECOVERY - Disk space on mw1136 is OK: DISK OK [16:59:06] RECOVERY - twemproxy process on mw1133 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:59:16] PROBLEM - Disk space on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:16] PROBLEM - SSH on mw1136 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:16] PROBLEM - twemproxy process on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:16] PROBLEM - Apache HTTP on mw1117 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:26] PROBLEM - twemproxy process on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:26] RECOVERY - DPKG on mw1119 is OK: All packages OK [16:59:26] PROBLEM - twemproxy process on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:26] PROBLEM - DPKG on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:26] PROBLEM - RAID on mw1117 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:26] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:27] PROBLEM - RAID on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:36] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [16:59:36] PROBLEM - RAID on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[16:59:37] PROBLEM - Apache HTTP on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:46] PROBLEM - DPKG on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [16:59:46] PROBLEM - Apache HTTP on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:46] RECOVERY - DPKG on mw1133 is OK: All packages OK [17:00:06] RECOVERY - SSH on mw1119 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:00:06] RECOVERY - Disk space on mw1119 is OK: DISK OK [17:00:16] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:00:16] RECOVERY - Disk space on mw1148 is OK: DISK OK [17:00:16] RECOVERY - twemproxy process on mw1119 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:00:26] PROBLEM - SSH on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:00:36] RECOVERY - Apache HTTP on mw1148 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [17:00:36] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:00:36] PROBLEM - Disk space on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:00:36] RECOVERY - twemproxy process on mw1128 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:00:42] ottomata: can you do me a favor and clear out the rrds for testsearch1001? I've gummed them up while working on this change [17:00:46] RECOVERY - RAID on mw1127 is OK: OK: no RAID installed [17:00:46] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:00:46] RECOVERY - SSH on mw1147 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:00:46] PROBLEM - Disk space on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:00:56] RECOVERY - DPKG on mw1148 is OK: All packages OK [17:01:03] oh yeah sorry [17:01:06] RECOVERY - RAID on mw1148 is OK: OK: no RAID installed [17:01:06] RECOVERY - SSH on mw1136 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:06] RECOVERY - Disk space on mw1115 is OK: DISK OK [17:01:06] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:01:07] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:13] yes can do [17:01:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:01:16] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:01:26] RECOVERY - twemproxy process on mw1131 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:01:26] PROBLEM - LVS HTTP IPv4 on api.svc.eqiad.wmnet is CRITICAL: Connection timed out [17:01:29] ottomata: thanks! with all these icinga-wm_ logs I'm not sure how you see anything [17:01:30] PROBLEM - Apache HTTP on mw1142 is CRITICAL: Connection timed out [17:01:30] PROBLEM - RAID on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:01:30] PROBLEM - DPKG on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:01:36] RECOVERY - SSH on mw1131 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:36] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:46] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:01:46] RECOVERY - SSH on mw1115 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:46] PROBLEM - SSH on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:01:46] PROBLEM - twemproxy process on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:01:46] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:01:46] PROBLEM - Apache HTTP on mw1121 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:01:47] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:01:56] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:01:59] gwicke: ping [17:02:06] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:02:06] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:02:06] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:02:16] PROBLEM - DPKG on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:02:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:02:17] manybubbles: done and restarted gmetad [17:02:17] RECOVERY - SSH on mw1141 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:02:21] ottomata: thanks! [17:02:26] PROBLEM - SSH on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:02:26] PROBLEM - DPKG on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:02:45] what the hell [17:02:46] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:06] PROBLEM - Disk space on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:16] PROBLEM - Disk space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:26] PROBLEM - Apache HTTP on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:03:26] RECOVERY - SSH on mw1139 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:03:26] PROBLEM - twemproxy process on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:03:26] RECOVERY - Disk space on mw1147 is OK: DISK OK [17:03:36] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:03:46] PROBLEM - Apache HTTP on mw1148 is CRITICAL: Connection timed out [17:03:46] RECOVERY - DPKG on mw1124 is OK: All packages OK [17:03:46] RECOVERY - DPKG on mw1133 is OK: All packages OK [17:03:46] PROBLEM - twemproxy process on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:46] PROBLEM - RAID on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:47] PROBLEM - Disk space on mw1121 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:03:47] PROBLEM - SSH on mw1147 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:06] RECOVERY - twemproxy process on mw1117 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:04:06] RECOVERY - Disk space on mw1127 is OK: DISK OK [17:04:07] PROBLEM - twemproxy process on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:04:07] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:07] PROBLEM - Disk space on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:04:07] RECOVERY - Apache HTTP on mw1117 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.893 second response time [17:04:16] RECOVERY - Disk space on mw1134 is OK: DISK OK [17:04:16] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:04:16] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:17] RECOVERY - DPKG on mw1117 is OK: All packages OK [17:04:17] RECOVERY - RAID on mw1117 is OK: OK: no RAID installed [17:04:26] PROBLEM - SSH on mw1116 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:26] PROBLEM - twemproxy process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:04:36] RECOVERY - SSH on mw1147 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:04:46] RECOVERY - twemproxy process on mw1128 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:04:46] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:04:46] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:46] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:04:56] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:56] PROBLEM - SSH on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:04:56] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:05:06] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:05:06] PROBLEM - Disk space on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:05:07] RECOVERY - Apache HTTP on mw1145 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.885 second response time [17:05:07] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:05:07] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:07] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:07] PROBLEM - SSH on mw1119 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:07] RECOVERY - Disk space on mw1120 is OK: DISK OK [17:05:16] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:16] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:27] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:05:27] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:05:27] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:05:37] RECOVERY - LVS HTTP IPv4 on api.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 2919 bytes in 0.088 second response time [17:05:47] RECOVERY - Disk space on mw1129 is OK: DISK OK [17:05:47] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:05:47] RECOVERY - Disk space on mw1145 is OK: DISK OK [17:05:47] RECOVERY - SSH on mw1115 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:05:47] PROBLEM - twemproxy process on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:05:57] RECOVERY - DPKG on mw1132 is OK: All packages OK [17:05:57] RECOVERY - SSH on mw1116 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:05:57] RECOVERY - RAID on mw1145 is OK: OK: no RAID installed [17:05:57] RECOVERY - twemproxy process on mw1116 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:05:57] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:06:07] RECOVERY - DPKG on mw1145 is OK: All packages OK [17:06:07] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:06:08] RECOVERY - Disk space on mw1122 is OK: DISK OK [17:06:17] PROBLEM - SSH on mw1136 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:17] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:06:27] PROBLEM - SSH on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:27] PROBLEM - SSH on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:27] RECOVERY - DPKG on mw1116 is OK: All packages OK [17:06:37] RECOVERY - RAID on mw1116 is OK: OK: no RAID installed [17:06:38] RECOVERY - Apache HTTP on mw1116 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.066 second response time [17:06:38] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:06:47] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:06:47] RECOVERY - Disk space on mw1121 is OK: DISK OK [17:06:47] PROBLEM - DPKG on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:06:47] PROBLEM - SSH on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:47] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:06:47] RECOVERY - twemproxy process on mw1141 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:06:48] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:06:48] RECOVERY - Disk space on mw1124 is OK: DISK OK [17:06:57] RECOVERY - Disk space on mw1136 is OK: DISK OK [17:06:57] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:06:57] RECOVERY - Disk space on mw1132 is OK: DISK OK [17:07:02] greg-g: that's fine for me; 2PM centralnotice fun it is [17:07:07] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:07:07] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:07:08] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:07:08] RECOVERY - RAID on mw1133 is OK: OK: no RAID installed [17:07:17] PROBLEM - Disk space on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:07:32] mwalker: (shouting over icinga) THANKS SIR [17:07:37] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:07:47] PROBLEM - twemproxy process on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:07:47] PROBLEM - SSH on mw1147 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:08:07] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:08:17] PROBLEM - RAID on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:08:17] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:08:17] RECOVERY - SSH on mw1123 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:08:27] RECOVERY - Disk space on mw1148 is OK: DISK OK [17:08:27] RECOVERY - twemproxy process on mw1120 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:08:37] PROBLEM - Disk space on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:42] paravoid: you were talking with Jeff_Green about geolocation in IPv6 -- we sort of just assumed it was crummy based on statements from MaxMind; and as well in random samples they seemed to have more generic returns; like 'Asia Pacific Region' -- if you can wait a couple of days I should be able to get you an actual breakdown [17:08:47] RECOVERY - SSH on mw1131 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:08:47] RECOVERY - SSH on mw1121 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:08:47] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:47] RECOVERY - twemproxy process on mw1121 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:08:47] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:48] PROBLEM - DPKG on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:57] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:57] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:08:57] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:08:57] RECOVERY - Disk space on mw1139 is OK: DISK OK [17:09:07] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:07] PROBLEM - RAID on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:09:07] RECOVERY - Disk space on mw1115 is OK: DISK OK [17:09:08] RECOVERY - SSH on mw1119 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:09:08] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:09:08] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:09:08] PROBLEM - Apache HTTP on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:09:09] RECOVERY - Disk space on mw1119 is OK: DISK OK [17:09:09] RECOVERY - twemproxy process on mw1115 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:09:10] RECOVERY - DPKG on mw1142 is OK: All packages OK [17:09:10] PROBLEM - Apache HTTP on mw1145 is CRITICAL: Connection timed out [17:09:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:09:17] PROBLEM - DPKG on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:27] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:09:27] PROBLEM - SSH on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:09:37] RECOVERY - Disk space on mw1147 is OK: DISK OK [17:09:37] RECOVERY - SSH on mw1147 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:09:38] RECOVERY - DPKG on mw1133 is OK: All packages OK [17:09:38] RECOVERY - Disk space on mw1141 is OK: DISK OK [17:09:47] RECOVERY - twemproxy process on mw1147 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:09:47] RECOVERY - DPKG on mw1118 is OK: All packages OK [17:09:47] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:47] PROBLEM - RAID on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:47] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:09:47] PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:47] RECOVERY - twemproxy process on mw1119 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:09:48] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:09:48] PROBLEM - twemproxy process on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:07] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:07] PROBLEM - Disk space on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:07] RECOVERY - SSH on mw1120 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:10:07] PROBLEM - Disk space on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:07] PROBLEM - Disk space on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:08] RECOVERY - Disk space on mw1114 is OK: DISK OK [17:10:08] RECOVERY - Disk space on mw1134 is OK: DISK OK [17:10:17] PROBLEM - RAID on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:17] RECOVERY - DPKG on mw1119 is OK: All packages OK [17:10:17] RECOVERY - twemproxy process on mw1139 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:10:17] RECOVERY - SSH on mw1139 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:10:27] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:10:37] PROBLEM - SSH on mw1128 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:10:37] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:10:47] RECOVERY - Disk space on mw1135 is OK: DISK OK [17:10:47] PROBLEM - Apache HTTP on mw1116 is CRITICAL: Connection timed out [17:10:47] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:10:47] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:10:47] PROBLEM - Disk space on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:10:57] RECOVERY - RAID on mw1119 is OK: OK: no RAID installed [17:10:57] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:10:57] RECOVERY - Disk space on mw1136 is OK: DISK OK [17:11:07] RECOVERY - Disk space on mw1127 is OK: DISK OK [17:11:07] PROBLEM - DPKG on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:11:08] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:11:17] PROBLEM - Disk space on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:11:27] PROBLEM - twemproxy process on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:11:27] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:11:27] RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:11:37] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:11:47] RECOVERY - Disk space on mw1124 is OK: DISK OK [17:11:47] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:11:47] RECOVERY - Disk space on mw1129 is OK: DISK OK [17:11:57] RECOVERY - DPKG on mw1147 is OK: All packages OK [17:12:07] RECOVERY - DPKG on mw1129 is OK: All packages OK [17:12:07] PROBLEM - Disk space on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:12:07] RECOVERY - SSH on mw1136 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:12:08] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:12:08] RECOVERY - twemproxy process on mw1129 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:12:17] RECOVERY - Disk space on mw1120 is OK: DISK OK [17:12:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:12:17] PROBLEM - twemproxy process on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:12:17] PROBLEM - DPKG on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:12:17] RECOVERY - RAID on mw1118 is OK: OK: no RAID installed [17:12:18] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:12:21] (PS1) Manybubbles: Rebuild elasticsearch ganglia monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93077 [17:12:27] RECOVERY - twemproxy process on mw1136 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:12:27] RECOVERY - SSH on mw1128 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:12:27] PROBLEM - SSH on mw1114 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:12:37] PROBLEM - DPKG on mw1116 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:12:47] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:12:47] PROBLEM - DPKG on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:12:57] RECOVERY - twemproxy process on mw1135 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:12:57] PROBLEM - SSH on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:12:57] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:13:03] (PS1) Reedy: Remove loginwiki from phase1.dblist [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93078 [17:13:07] RECOVERY - Disk space on mw1139 is OK: DISK OK [17:13:08] PROBLEM - SSH on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:13:08] PROBLEM - twemproxy process on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:13:13] (CR) jenkins-bot: [V: -1] Rebuild elasticsearch ganglia monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93077 (owner: Manybubbles) [17:13:17] RECOVERY - Apache HTTP on mw1124 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.067 second response time [17:13:17] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:13:27] RECOVERY - DPKG on mw1116 is OK: All packages OK [17:13:27] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:13:27] RECOVERY - Disk space on mw1131 is OK: DISK OK [17:13:27] PROBLEM - SSH on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:13:28] RECOVERY - RAID on mw1124 is OK: OK: no RAID installed [17:13:37] RECOVERY - DPKG on mw1124 is OK: All packages OK [17:13:37] RECOVERY - RAID on mw1116 is OK: OK: no RAID installed [17:13:47] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:13:47] PROBLEM - SSH on mw1118 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:13:47] PROBLEM - twemproxy process on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:13:47] RECOVERY - twemproxy process on mw1141 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:13:47] RECOVERY - SSH on mw1115 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:13:48] PROBLEM - twemproxy process on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:13:48] RECOVERY - Disk space on mw1145 is OK: DISK OK [17:13:57] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:14:07] PROBLEM - Disk space on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:14:07] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:14:07] RECOVERY - DPKG on mw1142 is OK: All packages OK [17:14:08] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:08] PROBLEM - RAID on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:08] RECOVERY - Disk space on mw1114 is OK: DISK OK [17:14:08] PROBLEM - SSH on mw1119 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:09] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:09] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:14:17] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:17] PROBLEM - Disk space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:23] (CR) Manybubbles: [C: -1] "Now that I've rewritten a good chunk of this I'll get it to pass pep8." [operations/puppet] - https://gerrit.wikimedia.org/r/93077 (owner: Manybubbles) [17:14:27] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:14:27] PROBLEM - SSH on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:27] PROBLEM - DPKG on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:27] PROBLEM - twemproxy process on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:27] RECOVERY - Disk space on mw1137 is OK: DISK OK [17:14:28] RECOVERY - DPKG on mw1146 is OK: All packages OK [17:14:37] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:14:37] PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:37] RECOVERY - Apache HTTP on mw1116 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.156 second response time [17:14:47] PROBLEM - Disk space on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:47] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:47] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:47] PROBLEM - SSH on mw1147 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:14:47] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:14:57] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:15:07] RECOVERY - Apache HTTP on mw1118 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.816 second response time [17:15:08] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:15:08] PROBLEM - Disk space on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:08] PROBLEM - DPKG on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:17] PROBLEM - Disk space on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:17] PROBLEM - twemproxy process on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:17] PROBLEM - Disk space on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:17] PROBLEM - SSH on mw1136 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:15:17] PROBLEM - Disk space on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:17] PROBLEM - DPKG on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:15:18] RECOVERY - SSH on mw1123 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:15:18] RECOVERY - RAID on mw1121 is OK: OK: no RAID installed [17:15:19] RECOVERY - DPKG on mw1121 is OK: All packages OK [17:15:19] RECOVERY - Apache HTTP on mw1142 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.434 second response time [17:15:27] PROBLEM - RAID on mw1118 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:27] PROBLEM - twemproxy process on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:15:27] PROBLEM - SSH on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:15:37] RECOVERY - Disk space on mw1118 is OK: DISK OK [17:15:37] RECOVERY - SSH on mw1118 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:15:37] RECOVERY - RAID on mw1129 is OK: OK: no RAID installed [17:15:38] RECOVERY - Disk space on mw1129 is OK: DISK OK [17:15:38] RECOVERY - DPKG on mw1118 is OK: All packages OK [17:15:47] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [17:15:47] RECOVERY - twemproxy process on mw1128 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:15:47] RECOVERY - DPKG on mw1128 is OK: All packages OK [17:15:47] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:15:47] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:16:07] RECOVERY - twemproxy process on mw1129 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:16:07] RECOVERY - Disk space on mw1127 is OK: DISK OK [17:16:07] RECOVERY - DPKG on mw1129 is OK: All packages OK [17:16:07] RECOVERY - Disk space on mw1119 is OK: DISK OK [17:16:17] RECOVERY - RAID on mw1118 is OK: OK: no RAID installed [17:16:17] RECOVERY - SSH on mw1127 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:16:17] PROBLEM - Apache HTTP on mw1124 is CRITICAL: Connection timed out [17:16:27] 
RECOVERY - twemproxy process on mw1127 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:16:27] RECOVERY - DPKG on mw1127 is OK: All packages OK [17:16:27] PROBLEM - Disk space on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:16:27] RECOVERY - twemproxy process on mw1139 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:16:37] PROBLEM - RAID on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:16:38] RECOVERY - SSH on mw1147 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:16:47] RECOVERY - twemproxy process on mw1147 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:16:47] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:16:47] PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:16:57] PROBLEM - SSH on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:16:57] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:17:08] PROBLEM - Disk space on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:17:08] PROBLEM - Disk space on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:17:17] RECOVERY - Disk space on mw1123 is OK: DISK OK [17:17:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:17:17] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:17:17] RECOVERY - SSH on mw1141 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:17:17] RECOVERY - SSH on mw1133 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:17:27] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:17:27] PROBLEM - Disk space on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:17:37] RECOVERY - DPKG on mw1138 is OK: All packages OK [17:17:37] PROBLEM - SSH on mw1128 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:17:37] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:17:47] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:17:57] RECOVERY - Disk space on mw1136 is OK: DISK OK [17:18:07] RECOVERY - SSH on mw1136 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:07] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:08] RECOVERY - Disk space on mw1122 is OK: DISK OK [17:18:08] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:08] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:27] RECOVERY - Disk space on mw1131 is OK: DISK OK [17:18:27] RECOVERY - twemproxy process on mw1136 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:18:27] RECOVERY - Disk space on mw1148 is OK: DISK OK [17:18:47] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:47] PROBLEM - twemproxy process on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:47] PROBLEM - DPKG on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:47] PROBLEM - RAID on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:47] RECOVERY - Disk space on mw1135 is OK: DISK OK [17:18:48] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:48] PROBLEM - Disk space on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:49] PROBLEM - DPKG on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:18:57] RECOVERY - twemproxy process on mw1122 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:18:57] RECOVERY - DPKG on mw1122 is OK: All packages OK [17:18:57] PROBLEM - SSH on mw1129 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:18:57] RECOVERY - Disk space on mw1133 is OK: DISK OK [17:18:57] RECOVERY - DPKG on mw1131 is OK: All packages OK [17:18:58] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:18:58] RECOVERY - Disk space on mw1132 is OK: DISK OK [17:18:59] RECOVERY - Disk space on mw1115 is OK: DISK OK [17:19:07] RECOVERY - SSH on mw1120 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:19:07] RECOVERY - twemproxy process on mw1115 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:19:17] PROBLEM - twemproxy process on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:19:17] PROBLEM - DPKG on mw1129 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:19:17] PROBLEM - Disk space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:19:17] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:19:27] PROBLEM - SSH on mw1123 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:19:37] PROBLEM - DPKG on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:19:37] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:19:47] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:19:47] PROBLEM - twemproxy process on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:19:47] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:07] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:20:07] PROBLEM - SSH on mw1135 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:20:08] RECOVERY - Disk space on mw1119 is OK: DISK OK [17:20:08] RECOVERY - Disk space on mw1120 is OK: DISK OK [17:20:08] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:13] (PS2) Manybubbles: Rebuild elasticsearch ganglia monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93077 [17:20:17] PROBLEM - Disk space on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:27] RECOVERY - SSH on mw1128 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:20:27] PROBLEM - SSH on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:20:27] RECOVERY - DPKG on mw1119 is OK: All packages OK [17:20:27] PROBLEM - twemproxy process on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:27] RECOVERY - DPKG on mw1141 is OK: All packages OK [17:20:37] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:37] RECOVERY - DPKG on mw1133 is OK: All packages OK [17:20:47] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:47] PROBLEM - LVS HTTP IPv4 on api.svc.eqiad.wmnet is CRITICAL: Connection timed out [17:20:51] RECOVERY - SSH on mw1115 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:20:51] RECOVERY - twemproxy process on mw1119 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:20:51] PROBLEM - twemproxy process on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:20:57] RECOVERY - RAID on mw1136 is OK: OK: no RAID installed [17:20:57] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:20:57] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:20:57] RECOVERY - twemproxy process on mw1133 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:21:06] (CR) jenkins-bot: [V: -1] Rebuild elasticsearch ganglia monitoring [operations/puppet] - https://gerrit.wikimedia.org/r/93077 (owner: Manybubbles) [17:21:07] RECOVERY - Disk space on mw1123 is OK: DISK OK [17:21:17] PROBLEM - Disk space on mw1114 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:21:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:21:17] PROBLEM - SSH on mw1124 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:21:27] PROBLEM - Disk space on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:21:37] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:21:37] RECOVERY - LVS HTTP IPv4 on api.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 2919 bytes in 0.071 second response time [17:21:41] RECOVERY - twemproxy process on mw1114 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:21:47] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:21:47] PROBLEM - SSH on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:21:47] RECOVERY - twemproxy process on mw1141 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:21:47] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:21:47] PROBLEM - Disk space on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:21:57] RECOVERY - Disk space on mw1145 is OK: DISK OK [17:21:57] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:21:57] PROBLEM - twemproxy process on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:21:58] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:22:07] RECOVERY - DPKG on mw1148 is OK: All packages OK [17:22:07] RECOVERY - SSH on mw1124 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:22:07] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [17:22:08] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:22:08] PROBLEM - DPKG on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:22:17] RECOVERY - DPKG on mw1123 is OK: All packages OK [17:22:17] RECOVERY - twemproxy process on mw1123 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:22:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:22:17] PROBLEM - twemproxy process on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:22:17] RECOVERY - SSH on mw1123 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:22:18] RECOVERY - Disk space on mw1131 is OK: DISK OK [17:22:18] RECOVERY - twemproxy process on mw1120 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:22:27] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:22:27] RECOVERY - DPKG on mw1127 is OK: All packages OK [17:22:27] RECOVERY - SSH on mw1114 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:22:37] RECOVERY - twemproxy process on mw1128 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:22:38] RECOVERY - DPKG on mw1136 is OK: All packages OK [17:22:38] RECOVERY - DPKG on mw1128 is OK: All packages OK [17:22:47] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:22:47] PROBLEM - SSH on mw1147 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:22:57] RECOVERY - SSH on mw1135 is OK: SSH OK - OpenSSH_5.9p1 
Debian-5ubuntu1.1 (protocol 2.0) [17:23:07] PROBLEM - Disk space on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:23:07] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:23:07] PROBLEM - SSH on mw1120 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:23:13] ottomata: do you know why jenkins may be -1ing my commit even though it passes pep8 locally and reports no errors on jenkins? [17:23:17] RECOVERY - twemproxy process on mw1115 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:23:17] RECOVERY - Disk space on mw1114 is OK: DISK OK [17:23:17] RECOVERY - Disk space on mw1137 is OK: DISK OK [17:23:27] PROBLEM - SSH on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:23:35] hmm, no [17:23:37] RECOVERY - RAID on mw1123 is OK: OK: no RAID installed [17:23:37] PROBLEM - DPKG on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:23:38] RECOVERY - Disk space on mw1135 is OK: DISK OK [17:23:47] RECOVERY - Disk space on mw1129 is OK: DISK OK [17:23:47] PROBLEM - DPKG on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:23:47] RECOVERY - twemproxy process on mw1122 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:23:47] RECOVERY - DPKG on mw1122 is OK: All packages OK [17:23:47] RECOVERY - DPKG on mw1139 is OK: All packages OK [17:23:57] PROBLEM - RAID on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:23:57] RECOVERY - Apache HTTP on mw1123 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.370 second response time [17:24:07] RECOVERY - SSH on mw1120 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:24:07] PROBLEM - twemproxy process on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:24:07] PROBLEM - Disk space on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:24:08] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:24:08] PROBLEM - twemproxy process on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:24:08] RECOVERY - Disk space on mw1134 is OK: DISK OK [17:24:13] ottomata: I think I know. [17:24:17] PROBLEM - Disk space on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:24:27] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:24:27] PROBLEM - twemproxy process on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:24:37] ottomata: https://integration.wikimedia.org/ci/job/operations-puppet-pep8/3852/console says [17:24:39] puppet_pep8.py: No such file or directory [17:24:47] RECOVERY - Apache HTTP on mw1120 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.732 second response time [17:24:47] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:24:47] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:24:47] RECOVERY - SSH on mw1147 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:24:47] PROBLEM - twemproxy process on mw1141 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:24:51] so its unlikely to pass [17:24:57] RECOVERY - twemproxy process on mw1144 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:24:57] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:24:57] PROBLEM - Disk space on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:07] RECOVERY - Disk space on mw1136 is OK: DISK OK [17:25:07] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:25:07] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:25:08] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:25:08] RECOVERY - Disk space on mw1139 is OK: DISK OK [17:25:08] PROBLEM - DPKG on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:08] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:25:08] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:25:17] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:27] RECOVERY - SSH on mw1133 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:25:27] PROBLEM - SSH on mw1141 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:25:27] PROBLEM - twemproxy process on mw1120 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:27] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:28] PROBLEM - Disk space on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:28] RECOVERY - Apache HTTP on mw1128 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.066 second response time [17:25:37] RECOVERY - Apache HTTP on mw1135 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time [17:25:37] RECOVERY - DPKG on mw1120 is OK: All packages OK [17:25:47] RECOVERY - twemproxy process on mw1141 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:25:47] RECOVERY - RAID on mw1127 is OK: OK: no RAID installed [17:25:47] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:25:47] PROBLEM - DPKG on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:47] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:25:48] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.081 second response time [17:25:48] PROBLEM - DPKG on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:25:49] RECOVERY - Disk space on mw1145 is OK: DISK OK [17:25:57] RECOVERY - twemproxy process on mw1135 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:07] RECOVERY - Disk space on mw1128 is OK: DISK OK [17:26:07] RECOVERY - Disk space on mw1143 is OK: DISK OK [17:26:07] RECOVERY - RAID on mw1120 is OK: OK: no RAID installed [17:26:07] RECOVERY - DPKG on mw1135 is OK: All packages OK [17:26:12] !log depooling mw1114-mw1148; balancing is unfair, boxes overloaded, mw1189-mw1208 capable of handling the load [17:26:17] PROBLEM - twemproxy process on mw1115 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:26:17] PROBLEM - Disk space on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:26:17] RECOVERY - twemproxy process on mw1120 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:17] RECOVERY - SSH on mw1139 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:26:17] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:26] Logged the message, Master [17:26:27] RECOVERY - twemproxy process on mw1139 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:27] RECOVERY - Disk space on mw1131 is OK: DISK OK [17:26:27] PROBLEM - Apache HTTP on mw1142 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:26:27] PROBLEM - DPKG on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:26:27] PROBLEM - Disk space on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:26:37] RECOVERY - SSH on mw1131 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:26:37] RECOVERY - DPKG on mw1136 is OK: All packages OK [17:26:38] RECOVERY - Apache HTTP on mw1127 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.108 second response time [17:26:47] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:47] RECOVERY - SSH on mw1129 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:26:47] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:26:47] RECOVERY - twemproxy process on mw1131 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:26:47] RECOVERY - RAID on mw1135 is OK: OK: no RAID installed [17:26:57] PROBLEM - SSH on mw1115 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:26:57] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:26:57] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:26:57] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:26:57] RECOVERY - SSH on mw1119 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:27:07] RECOVERY - twemproxy process on mw1129 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:27:07] RECOVERY - Disk space on mw1119 is OK: DISK OK [17:27:07] RECOVERY - DPKG on mw1129 is OK: All packages OK [17:27:07] RECOVERY - Apache HTTP on mw1129 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [17:27:07] RECOVERY - twemproxy process on mw1115 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:27:08] PROBLEM - Disk space on mw1133 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:27:17] RECOVERY - Apache HTTP on mw1145 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.504 second response time [17:27:17] PROBLEM - Disk space on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:27:17] RECOVERY - DPKG on mw1119 is OK: All packages OK [17:27:17] RECOVERY - Apache HTTP on mw1142 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.068 second response time [17:27:27] RECOVERY - Apache HTTP on mw1119 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.064 second response time [17:27:27] RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:27:27] PROBLEM - twemproxy process on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:27:27] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:27:37] PROBLEM - Disk space on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:27:37] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [17:27:37] RECOVERY - twemproxy process on mw1147 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:27:38] RECOVERY - RAID on mw1129 is OK: OK: no RAID installed [17:27:38] RECOVERY - Apache HTTP on mw1147 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [17:27:38] RECOVERY - RAID on mw1128 is OK: OK: no RAID installed [17:27:38] RECOVERY - DPKG on mw1128 is OK: All packages OK [17:27:47] PROBLEM - twemproxy process on mw1148 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:27:47] RECOVERY - RAID on mw1122 is OK: OK: no RAID installed [17:27:47] PROBLEM - SSH on mw1137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:27:47] RECOVERY - DPKG on mw1122 is OK: All packages OK [17:27:47] RECOVERY - Apache HTTP on mw1115 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.116 second response time [17:27:57] PROBLEM - twemproxy process on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:27:57] RECOVERY - DPKG on mw1147 is OK: All packages OK [17:27:58] RECOVERY - Disk space on mw1140 is OK: DISK OK [17:28:07] RECOVERY - RAID on mw1140 is OK: OK: no RAID installed [17:28:07] RECOVERY - RAID on mw1147 is OK: OK: no RAID installed [17:28:07] RECOVERY - Apache HTTP on mw1141 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.630 second response time [17:28:08] PROBLEM - twemproxy process on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:28:08] PROBLEM - SSH on mw1122 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:17] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:17] RECOVERY - SSH on mw1141 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:28:27] PROBLEM - SSH on mw1133 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:27] RECOVERY - DPKG on mw1146 is OK: All packages OK [17:28:37] RECOVERY - twemproxy process on mw1148 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:28:47] RECOVERY - DPKG on mw1115 is OK: All packages OK [17:28:47] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:47] RECOVERY - SSH on mw1115 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:28:47] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:28:47] PROBLEM - SSH on mw1148 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:28:57] RECOVERY - twemproxy process on mw1124 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:28:57] RECOVERY - RAID on mw1119 is OK: OK: no RAID installed [17:28:57] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:29:07] RECOVERY - DPKG on mw1114 is OK: All packages OK [17:29:07] RECOVERY - SSH on mw1122 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:29:08] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:17] RECOVERY - Apache HTTP on mw1124 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [17:29:17] RECOVERY - RAID on mw1114 is OK: OK: no RAID installed [17:29:17] RECOVERY - twemproxy process on mw1136 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:29:27] PROBLEM - Disk space on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:27] PROBLEM - twemproxy process on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:27] RECOVERY - Disk space on mw1137 is OK: DISK OK [17:29:27] RECOVERY - RAID on mw1124 is OK: OK: no RAID installed [17:29:27] RECOVERY - DPKG on mw1141 is OK: All packages OK [17:29:37] PROBLEM - DPKG on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:37] RECOVERY - DPKG on mw1124 is OK: All packages OK [17:29:37] RECOVERY - RAID on mw1115 is OK: OK: no RAID installed [17:29:47] RECOVERY - SSH on mw1148 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:29:47] RECOVERY - RAID on mw1141 is OK: OK: no RAID installed [17:29:47] PROBLEM - DPKG on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:47] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:29:47] PROBLEM - twemproxy process on mw1131 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:29:57] RECOVERY - RAID on mw1145 is OK: OK: no RAID installed [17:29:57] RECOVERY - DPKG on mw1148 is OK: All packages OK [17:30:07] RECOVERY - RAID on mw1148 is OK: OK: no RAID installed [17:30:07] RECOVERY - DPKG on mw1145 is OK: All packages OK [17:30:17] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:30:17] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:30:27] RECOVERY - twemproxy process on mw1139 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:30:27] RECOVERY - Disk space on mw1148 is OK: DISK OK [17:30:37] PROBLEM - SSH on mw1140 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:30:37] RECOVERY - Apache HTTP on mw1148 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [17:30:37] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.319 second response time [17:30:38] RECOVERY - SSH on mw1137 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:30:46] not so capable [17:30:47] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:30:47] PROBLEM - RAID on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:31:03] very borderline :) [17:31:05] swapoff must be helping [17:31:07] RECOVERY - Apache HTTP on mw1139 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.667 second response time [17:31:07] PROBLEM - Disk space on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:31:17] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [17:31:17] PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:31:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:31:17] RECOVERY - twemproxy process on mw1132 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:31:37] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:31:47] RECOVERY - RAID on mw1128 is OK: OK: no RAID installed [17:31:47] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:31:47] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:31:47] PROBLEM - RAID on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:31:47] RECOVERY - DPKG on mw1139 is OK: All packages OK [17:32:07] RECOVERY - Disk space on mw1133 is OK: DISK OK [17:32:07] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:32:08] RECOVERY - Disk space on mw1134 is OK: DISK OK [17:32:17] PROBLEM - DPKG on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:32:27] PROBLEM - twemproxy process on mw1136 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:32:28] RECOVERY - DPKG on mw1146 is OK: All packages OK [17:32:28] RECOVERY - Apache HTTP on mw1133 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time [17:32:37] RECOVERY - Apache HTTP on mw1132 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.138 second response time [17:32:47] PROBLEM - SSH on mw1131 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:33:07] RECOVERY - Apache HTTP on mw1136 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.192 second response time [17:33:08] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:33:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:33:17] RECOVERY - Disk space on mw1131 is OK: DISK OK [17:33:27] RECOVERY - SSH on mw1133 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:33:38] RECOVERY - Apache HTTP on mw1121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.034 second response time [17:33:38] RECOVERY - SSH on mw1131 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:33:38] RECOVERY - twemproxy process on mw1131 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:33:47] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:33:47] RECOVERY - DPKG on mw1132 is OK: All packages OK [17:33:57] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:34:07] PROBLEM - twemproxy process on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:07] PROBLEM - RAID on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:08] PROBLEM - RAID on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:08] PROBLEM - SSH on mw1119 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:17] PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:17] PROBLEM - SSH on mw1136 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:34:17] PROBLEM - RAID on mw1147 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:17] PROBLEM - DPKG on mw1145 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:17] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [17:34:27] PROBLEM - DPKG on mw1119 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:37] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:34:38] PROBLEM - RAID on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:34:47] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:34:47] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:34:47] PROBLEM - DPKG on mw1124 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:34:57] RECOVERY - RAID on mw1119 is OK: OK: no RAID installed [17:34:57] RECOVERY - SSH on mw1119 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:35:07] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:35:07] RECOVERY - SSH on mw1136 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:35:07] RECOVERY - RAID on mw1147 is OK: OK: no RAID installed [17:35:07] RECOVERY - RAID on mw1133 is OK: OK: no RAID installed [17:35:07] RECOVERY - twemproxy process on mw1133 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:35:07] RECOVERY - DPKG on mw1135 is OK: All packages OK [17:35:08] RECOVERY - DPKG on mw1123 is OK: All packages OK [17:35:17] PROBLEM - Apache HTTP on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:35:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:35:17] RECOVERY - DPKG on mw1119 is OK: All packages OK [17:35:18] RECOVERY - twemproxy process on mw1136 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:35:27] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:35:37] RECOVERY - DPKG on mw1124 is OK: All packages OK [17:35:47] RECOVERY - DPKG on mw1136 is OK: All packages OK [17:35:47] RECOVERY - DPKG on mw1133 is OK: All packages OK [17:35:47] PROBLEM - RAID on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:35:47] PROBLEM - Apache HTTP on mw1127 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:35:57] RECOVERY - RAID on mw1136 is OK: OK: no RAID installed [17:35:57] PROBLEM - SSH on mw1145 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:36:17] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:36:27] RECOVERY - DPKG on mw1127 is OK: All packages OK [17:36:37] RECOVERY - RAID on mw1127 is OK: OK: no RAID installed [17:36:37] RECOVERY - Apache HTTP on mw1127 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.318 second response time [17:36:47] RECOVERY - SSH on mw1145 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:36:47] PROBLEM - twemproxy process on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:36:47] PROBLEM - RAID on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:36:57] RECOVERY - Disk space on mw1140 is OK: DISK OK [17:36:57] RECOVERY - Apache HTTP on mw1146 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [17:36:57] PROBLEM - DPKG on mw1122 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:36:57] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:37:07] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:37:07] RECOVERY - Disk space on mw1143 is OK: DISK OK [17:37:07] RECOVERY - Apache HTTP on mw1145 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.008 second response time [17:37:08] RECOVERY - twemproxy process on mw1140 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:37:17] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:37:17] RECOVERY - RAID on mw1131 is OK: OK: no RAID installed [17:37:27] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:37:27] RECOVERY - SSH on mw1140 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:37:28] RECOVERY - DPKG on mw1138 is OK: All packages OK [17:37:37] RECOVERY - RAID on mw1123 is OK: OK: no RAID installed [17:37:47] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:37:47] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:37:47] RECOVERY - DPKG on mw1132 is OK: All packages OK [17:37:57] RECOVERY - DPKG on mw1122 is OK: All packages OK [17:37:57] RECOVERY - twemproxy process on mw1145 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:37:57] RECOVERY - RAID on mw1145 is OK: OK: no RAID installed [17:37:57] RECOVERY - DPKG on mw1131 is OK: All packages OK [17:38:07] RECOVERY - DPKG on mw1145 is OK: All packages OK [17:38:07] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [17:38:08] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:38:17] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:38:17] PROBLEM - DPKG on mw1123 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:38:17] RECOVERY - Apache HTTP on mw1131 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.046 second response time [17:38:27] PROBLEM - twemproxy process on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:38:37] RECOVERY - RAID on mw1124 is OK: OK: no RAID installed [17:38:47] PROBLEM - RAID on mw1142 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:38:47] RECOVERY - twemproxy process on mw1137 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:39:07] RECOVERY - DPKG on mw1123 is OK: All packages OK [17:39:17] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [17:39:27] RECOVERY - twemproxy process on mw1139 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:39:37] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:39:37] RECOVERY - twemproxy process on mw1143 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:39:38] RECOVERY - RAID on mw1142 is OK: OK: no RAID installed [17:39:38] RECOVERY - DPKG on mw1134 is OK: All packages OK [17:39:47] PROBLEM - RAID on mw1128 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:39:57] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:40:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:40:17] PROBLEM - Disk space on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:40:37] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:40:37] PROBLEM - DPKG on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:40:37] RECOVERY - Apache HTTP on mw1140 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time [17:40:37] RECOVERY - RAID on mw1122 is OK: OK: no RAID installed [17:40:47] PROBLEM - Disk space on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:40:47] RECOVERY - DPKG on mw1140 is OK: All packages OK [17:40:57] PROBLEM - DPKG on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:41:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:41:17] PROBLEM - RAID on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:41:27] PROBLEM - Disk space on mw1137 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:41:37] RECOVERY - Apache HTTP on mw1134 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.094 second response time [17:41:47] PROBLEM - Apache HTTP on mw1122 is CRITICAL: Connection refused [17:42:07] RECOVERY - Disk space on mw1134 is OK: DISK OK [17:42:07] RECOVERY - RAID on mw1140 is OK: OK: no RAID installed [17:42:08] PROBLEM - Apache HTTP on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:42:08] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:42:17] PROBLEM - DPKG on mw1135 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:42:17] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:42:27] PROBLEM - twemproxy process on mw1146 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:42:37] PROBLEM - DPKG on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:42:38] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:42:47] RECOVERY - DPKG on mw1144 is OK: All packages OK [17:42:47] PROBLEM - RAID on mw1127 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:42:47] RECOVERY - Apache HTTP on mw1122 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.212 second response time [17:42:47] RECOVERY - twemproxy process on mw1144 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:42:57] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:43:07] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:43:08] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:43:27] RECOVERY - DPKG on mw1127 is OK: All packages OK [17:43:37] RECOVERY - RAID on mw1127 is OK: OK: no RAID installed [17:43:37] RECOVERY - RAID on mw1128 is OK: OK: no RAID installed [17:43:47] PROBLEM - Apache HTTP on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:43:57] PROBLEM - Apache HTTP on mw1123 is CRITICAL: Connection refused [17:43:57] RECOVERY - Disk space on mw1143 is OK: DISK OK [17:43:57] RECOVERY - RAID on mw1143 is OK: OK: no RAID installed [17:44:07] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:44:07] RECOVERY - Apache HTTP on mw1143 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.472 second response time [17:44:08] PROBLEM - Disk space on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:44:08] RECOVERY - DPKG on mw1135 is OK: All packages OK [17:44:08] PROBLEM - SSH on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:44:17] RECOVERY - DPKG on mw1143 is OK: All packages OK [17:44:27] RECOVERY - Disk space on mw1137 is OK: DISK OK [17:44:27] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:44:37] PROBLEM - Apache HTTP on mw1127 is CRITICAL: Connection refused [17:44:37] RECOVERY - Disk space on mw1146 is OK: DISK OK [17:44:47] RECOVERY - RAID on mw1137 is OK: OK: no RAID installed [17:44:47] PROBLEM - DPKG on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:44:57] RECOVERY - DPKG on mw1139 is OK: All packages OK [17:44:57] RECOVERY - Disk space on mw1132 is OK: DISK OK [17:44:57] RECOVERY - Apache HTTP on mw1137 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time [17:44:57] RECOVERY - SSH on mw1132 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:45:07] PROBLEM - Disk space on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:45:08] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:45:17] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:45:17] RECOVERY - DPKG on mw1137 is OK: All packages OK [17:45:37] RECOVERY - Apache HTTP on mw1127 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.390 second response time [17:45:47] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:45:47] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:45:47] PROBLEM - DPKG on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:45:57] PROBLEM - Apache HTTP on mw1130 is CRITICAL: Connection refused [17:45:57] PROBLEM - twemproxy process on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:46:07] PROBLEM - Apache HTTP on mw1129 is CRITICAL: Connection refused [17:46:07] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:46:17] PROBLEM - Apache HTTP on mw1131 is CRITICAL: Connection refused [17:46:17] RECOVERY - twemproxy process on mw1146 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:46:21] (CR) Reedy: [C: 2] Remove loginwiki from phase1.dblist [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93078 (owner: Reedy) [17:46:30] (Merged) jenkins-bot: Remove loginwiki from phase1.dblist [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93078 (owner: Reedy) [17:46:47] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:47:04] (PS3) Reedy: Remove CodeReview-specific config file, collapse into CommonSettings.php [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92605 [17:47:14] (CR) Reedy: [C: 2] Remove CodeReview-specific config file, collapse into CommonSettings.php [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92605
(owner: Reedy) [17:47:17] PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:47:17] PROBLEM - Apache HTTP on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:47:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:47:27] PROBLEM - SSH on mw1139 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:47:27] (Merged) jenkins-bot: Remove CodeReview-specific config file, collapse into CommonSettings.php [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/92605 (owner: Reedy) [17:47:27] PROBLEM - Apache HTTP on mw1133 is CRITICAL: Connection refused [17:47:27] PROBLEM - DPKG on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:47:37] RECOVERY - Apache HTTP on mw1132 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.071 second response time [17:47:47] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:47:57] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:48:07] RECOVERY - Disk space on mw1139 is OK: DISK OK [17:48:07] RECOVERY - RAID on mw1140 is OK: OK: no RAID installed [17:48:07] RECOVERY - Apache HTTP on mw1129 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.252 second response time [17:48:07] PROBLEM - Apache HTTP on mw1136 is CRITICAL: Connection refused [17:48:08] PROBLEM - Disk space on mw1132 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:48:08] PROBLEM - RAID on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:48:08] PROBLEM - Disk space on mw1143 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:48:17] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [17:48:17] PROBLEM - SSH on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:48:17] PROBLEM - SSH on mw1146 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:48:17] RECOVERY - SSH on mw1139 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:48:21] (PS2) Reedy: Explicitly set 'watchcreations' and 'watchdefault' options to false [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89603 (owner: Bartosz Dziewoński) [17:48:26] (CR) Reedy: [C: 2] Explicitly set 'watchcreations' and 'watchdefault' options to false [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89603 (owner: Bartosz Dziewoński) [17:48:39] (Merged) jenkins-bot: Explicitly set 'watchcreations' and 'watchdefault' options to false [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/89603 (owner: Bartosz Dziewoński) [17:48:47] PROBLEM - Apache HTTP on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:48:50] Nemo_bis: ^ [17:48:57] RECOVERY - Disk space on mw1132 is OK: DISK OK [17:48:57] PROBLEM - Apache HTTP on mw1137 is CRITICAL: Connection refused [17:49:07] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:49:07] RECOVERY - RAID on mw1132 is OK: OK: no RAID installed [17:49:08] RECOVERY - SSH on mw1143 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:49:10] Nemo_bis: now you just need to get someone to merge the core change as well.
[17:49:17] RECOVERY - twemproxy process on mw1126 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:49:17] RECOVERY - DPKG on mw1143 is OK: All packages OK [17:49:27] RECOVERY - DPKG on mw1138 is OK: All packages OK [17:49:40] (and fix it up, i guess) [17:49:47] PROBLEM - SSH on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:49:57] PROBLEM - Apache HTTP on mw1141 is CRITICAL: Connection refused [17:49:57] RECOVERY - RAID on mw1143 is OK: OK: no RAID installed [17:49:58] RECOVERY - Disk space on mw1143 is OK: DISK OK [17:50:07] RECOVERY - SSH on mw1146 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:50:07] RECOVERY - RAID on mw1146 is OK: OK: no RAID installed [17:50:17] PROBLEM - SSH on mw1144 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:50:17] PROBLEM - Apache HTTP on mw1142 is CRITICAL: Connection refused [17:50:27] RECOVERY - DPKG on mw1146 is OK: All packages OK [17:50:27] PROBLEM - twemproxy process on mw1134 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:50:47] PROBLEM - Disk space on mw1144 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:50:57] RECOVERY - Apache HTTP on mw1146 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.222 second response time [17:51:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:51:37] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:51:47] RECOVERY - DPKG on mw1132 is OK: All packages OK [17:51:57] RECOVERY - DPKG on mw1139 is OK: All packages OK [17:52:07] RECOVERY - SSH on mw1144 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:52:07] PROBLEM - Apache HTTP on mw1143 is CRITICAL: Connection refused [17:52:07] RECOVERY - Apache HTTP on mw1139 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [17:52:08] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [17:52:08] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:52:17] PROBLEM - twemproxy process on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:52:27] RECOVERY - Disk space on mw1126 is OK: DISK OK [17:52:37] RECOVERY - Disk space on mw1144 is OK: DISK OK [17:52:38] RECOVERY - RAID on mw1144 is OK: OK: no RAID installed [17:52:38] RECOVERY - Apache HTTP on mw1134 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.018 second response time [17:52:38] RECOVERY - DPKG on mw1144 is OK: All packages OK [17:52:47] RECOVERY - SSH on mw1126 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:52:47] RECOVERY - twemproxy process on mw1144 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:53:07] RECOVERY - twemproxy process on mw1126 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:53:17] RECOVERY - twemproxy process on mw1134 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:53:48] !log reedy synchronized wmf-config/ [17:54:01] Logged the message, Master [17:54:07] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:54:37] PROBLEM - DPKG on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:54:57] PROBLEM - DPKG on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:55:17] RECOVERY - RAID on mw1126 is OK: OK: no RAID installed [17:55:17] PROBLEM - RAID on mw1139 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:55:47] PROBLEM - Apache HTTP on mw1134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:56:10] mw1138: Timeout, server mw1138 not responding. [17:56:16] I was going to say, presumably that's an api box [17:56:24] I guess icinga-wm_ concurs [17:56:27] yup [17:56:32] yes [17:56:34] I'm cleaning those up [17:56:37] RECOVERY - SSH on mw1134 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:56:38] RECOVERY - DPKG on mw1134 is OK: All packages OK [17:56:38] RECOVERY - RAID on mw1134 is OK: OK: no RAID installed [17:56:47] PROBLEM - twemproxy process on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:56:47] RECOVERY - DPKG on mw1139 is OK: All packages OK [17:57:07] RECOVERY - RAID on mw1139 is OK: OK: no RAID installed [17:57:08] PROBLEM - Disk space on mw1138 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:57:17] PROBLEM - SSH on mw1138 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:57:37] RECOVERY - Apache HTTP on mw1134 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.240 second response time [17:57:38] RECOVERY - twemproxy process on mw1138 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [17:57:47] PROBLEM - SSH on mw1126 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:58:07] RECOVERY - SSH on mw1138 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:58:07] RECOVERY - Disk space on mw1138 is OK: DISK OK [17:58:17] PROBLEM - twemproxy process on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:58:27] PROBLEM - RAID on mw1126 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[17:59:27] RECOVERY - Apache HTTP on mw1144 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.386 second response time [17:59:29] just those two left [18:00:07] RECOVERY - twemproxy process on mw1126 is OK: PROCS OK: 1 process with UID = 65534 (nobody), command name nutcracker [18:01:27] RECOVERY - DPKG on mw1138 is OK: All packages OK [18:01:37] RECOVERY - RAID on mw1138 is OK: OK: no RAID installed [18:01:37] RECOVERY - Apache HTTP on mw1138 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.317 second response time [18:03:57] PROBLEM - Host mw1126 is DOWN: PING CRITICAL - Packet loss = 100% [18:04:37] RECOVERY - SSH on mw1126 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [18:04:47] RECOVERY - Host mw1126 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [18:05:07] RECOVERY - DPKG on mw1126 is OK: All packages OK [18:05:25] RECOVERY - RAID on mw1126 is OK: OK: no RAID installed [18:05:25] RECOVERY - Apache HTTP on mw1133 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.339 second response time [18:05:35] RECOVERY - Apache HTTP on mw1131 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.432 second response time [18:05:55] RECOVERY - Apache HTTP on mw1130 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.115 second response time [18:05:55] RECOVERY - Apache HTTP on mw1123 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.287 second response time [18:05:55] RECOVERY - Apache HTTP on mw1137 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.294 second response time [18:06:05] RECOVERY - Apache HTTP on mw1141 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.833 second response time [18:06:05] RECOVERY - Apache HTTP on mw1136 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.172 second response time [18:06:05] RECOVERY - Apache HTTP on mw1143 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.207 second response time 
[18:06:36] RECOVERY - Apache HTTP on mw1142 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.215 second response time [18:08:14] !log swapoff on all api appservers [18:08:27] Logged the message, Master [18:10:01] (03CR) 10Ottomata: "(1 comment)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [18:11:31] (03PS3) 10Manybubbles: Rebuild elasticsearch ganglia monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 [18:12:44] !log reedy synchronized wmf-config/ [18:12:48] (03CR) 10jenkins-bot: [V: 04-1] Rebuild elasticsearch ganglia monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [18:12:59] Certainly a lot quicker ;) [18:13:03] Logged the message, Master [18:13:38] any hosts with errors? or did it go through on them all? [18:13:40] reedy@tin:/a/common$ sync-dir wmf-config/ [18:13:40] copying to apaches [18:13:40] reedy@tin:/a/common$ [18:13:43] None at all [18:13:49] good [18:14:16] !log reenabling mw1114-mw1148 [18:14:30] of course no errors, I cleaned them up one by one :) [18:14:33] Logged the message, Master [18:15:09] Lets see if scap plays ball today [18:17:40] !log reedy Started syncing Wikimedia installation... : testwiki and test2wiki to 1.23wmf2 [18:17:56] Logged the message, Master [18:20:23] deploy! [18:22:49] (03PS1) 10Akosiaris: palladium/strontium as puppetmasters [operations/puppet] - 10https://gerrit.wikimedia.org/r/93082 [18:24:15] PROBLEM - Apache HTTP on mw1070 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:27:05] RECOVERY - Apache HTTP on mw1070 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.476 second response time [18:31:05] !log reedy Finished syncing Wikimedia installation... : testwiki and test2wiki to 1.23wmf2 [18:31:20] Logged the message, Master [18:32:07] Reedy: what were all these fatals from? 
http://ur1.ca/fyspx [18:32:42] Things being synced in the wrong order [18:32:49] which has been the case for ages [18:33:02] heh, sweet [18:33:06] which I'm going to fix now [18:33:13] Reedy: but, can you explain... nvm, just fix it [18:33:17] :) [18:33:52] Basically we're syncing the file that tells php to use the new php code first [18:34:06] Before localisation cache, and other (needed) config files are pushed [18:34:09] no files => fatal [18:34:35] oh that's smart [18:34:36] ;) [18:35:32] whew, no more fatals coming in [18:35:35] * greg-g breathes [18:36:19] Reedy: when you're done with that (no rush, can wait an hour or so), can you do what I told you not to do earlier? specificlly: BetaFeatures with CommonsMetadata and MultimediaViewer to test/test2 [18:37:02] (03PS1) 10Reedy: Run sync-wikiversions AFTER all code is deployed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93083 [18:38:49] Fixeded [18:41:30] (03PS1) 10Reedy: Revert "Revert "Add version specific extension-list"" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 [18:41:37] I'm in ur reverts reverting your reverts [18:41:55] (03PS2) 10Reedy: Revert "Revert "Add version specific extension-list"" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 [18:41:59] (03PS3) 10Reedy: Add version specific extension-list [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 [18:42:50] uhhhhh [18:43:08] ^d wasn't sure if that worked, Reedy :) [18:43:32] It wouldn't have done anything anyway [18:43:39] if ( file_exists( "$wmfConfigDir/extension-list-$wmfExtendedVersionNumber.php" ) ) { [18:43:42] Said file never existed [18:43:52] no .php prefix [18:43:56] like it is in [18:43:56] if ( file_exists( "$wmfConfigDir/extension-list-$wmfExtendedVersionNumber.php" ) ) { [18:43:59] fail [18:44:03] $wgExtensionEntryPointListFiles[] = "$wmfConfigDir/extension-list-$wmfExtendedVersionNumber"; [18:44:46] (03PS4) 10Reedy: Add version specific extension-list 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 [18:46:08] (03CR) 10Reedy: [C: 032] Add version specific extension-list [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 (owner: 10Reedy) [18:46:17] (03Merged) 10jenkins-bot: Add version specific extension-list [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93085 (owner: 10Reedy) [18:49:54] !log reedy Started syncing Wikimedia installation... : scap take 2 to build l10ncache with magical beta things [18:50:01] reedy@ubuntu64-web-esxi:~/git/operations/puppet$ git push origin HEAD:refs/for/master [18:50:01] ^C [18:50:01] reedy@ubuntu64-web-esxi:~/git/operations/puppet$ git push origin HEAD:refs/for/production [18:50:03] Bad Ryan_Lane [18:50:07] Logged the message, Master [18:50:17] Updating ExtensionMessages-1.23wmf2.php... [18:50:17] done [18:50:17] Copying to local copy... [18:50:17] done [18:50:17] Updating LocalisationCache for 1.23wmf2... done [18:50:19] /a/common/wikiversions.cdb successfully built. [18:50:21] Computers are hard [18:53:41] (03CR) 10BryanDavis: [C: 031] "This seems like a good idea to me" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93083 (owner: 10Reedy) [18:58:20] (03CR) 10Aaron Schulz: [C: 031] Run sync-wikiversions AFTER all code is deployed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93083 (owner: 10Reedy) [18:58:33] !log reedy Finished syncing Wikimedia installation... 
: scap take 2 to build l10ncache with magical beta things [18:58:34] (03CR) 10ArielGlenn: [C: 031] Run sync-wikiversions AFTER all code is deployed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93083 (owner: 10Reedy) [18:58:46] Logged the message, Master [18:58:48] heh I think you're about to get a flood of sign-ffs on that one [18:58:55] (03CR) 10Reedy: [C: 031] Run sync-wikiversions AFTER all code is deployed [operations/puppet] - 10https://gerrit.wikimedia.org/r/93083 (owner: 10Reedy) [19:01:08] (03PS1) 10Reedy: Add three Multimedia extensions to config [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93091 [19:01:15] * greg-g should -1 for the fun of it [19:01:18] (03PS2) 10Reedy: Add three Multimedia extensions to config [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93091 [19:01:32] marktraceur: ^^^ [19:01:42] Huzzah [19:01:44] marktraceur: also, my -1 joke was about the other change above [19:02:02] greg-g: Thanks for the clarity :) [19:02:07] :) [19:02:18] (03PS3) 10Reedy: Add three Multimedia extensions to config, enable on testwiki and test2wiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93091 [19:02:24] Ahaa. [19:02:30] Revert [19:02:31] Rebase [19:02:32] Amend [19:02:34] PROFIT [19:02:48] Reedy: I can +2 [19:02:53] If you'd like [19:03:08] If you want ;) [19:03:26] Reedy always self-merges [19:03:27] (03CR) 10MarkTraceur: [C: 032] "Yayyy" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93091 (owner: 10Reedy) [19:03:37] (03Merged) 10jenkins-bot: Add three Multimedia extensions to config, enable on testwiki and test2wiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93091 (owner: 10Reedy) [19:03:38] greg-g: Symbolic and all that [19:03:43] :) [19:03:57] "may I do the honor?" [19:04:03] er, have? 
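A minimal sketch of the ordering behind "Run sync-wikiversions AFTER all code is deployed", as discussed in the channel: everything the new version needs is pushed first, and the switch to the new version happens last. The four step functions are hypothetical stand-ins that only print what they would do; they are not the real scap or sync-* commands.

```shell
#!/usr/bin/env bash
# Sketch only: these functions are hypothetical stand-ins, not real scap commands.

push_code()       { echo "1. push PHP code for $1"; }
push_l10n_cache() { echo "2. push localisation cache for $1"; }
push_config()     { echo "3. push config files"; }
switch_version()  { echo "4. switch wikis to $1"; }

deploy() {
    local version="$1"
    # Chain with && so the version switch only runs if every earlier
    # step succeeded; a half-synced host never gets pointed at new code.
    push_code "$version" &&
        push_l10n_cache "$version" &&
        push_config &&
        switch_version "$version"
}

deploy 1.23wmf2
```

With this ordering, a failed or slow push leaves wikis on the old version instead of producing fatals from missing files.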
[19:04:48] * greg-g waits to go to lunch [19:06:39] !log reedy synchronized wmf-config/ 'Enable magic beta extensions' [19:06:54] Logged the message, Master [19:07:04] slow internet is slooow [19:08:04] https://www.youtube.com/watch?v=kMPF-XMyN7g [19:08:06] and that includes the new extension magic? [19:08:24] Special:Version says so [19:08:43] the extensionlist-1.23wmf2 bit? [19:08:56] ugh [19:08:57] https://test.wikipedia.org/wiki/Special:Preferences [19:08:59] A database query error has occurred. This may indicate a bug in the software. [19:09:00] Function: BetaFeaturesHooks::getUserCountsFromDb [19:09:00] Error: 1146 Table 'testwiki.betafeatures_user_counts' doesn't exist (10.64.16.8) [19:09:08] it *almost* worked this time, guys. [19:09:17] marktraceur: ^^ [19:09:20] ahah [19:09:22] That's easily fixed [19:09:27] No one mentioned about that... [19:09:38] marktraceur: Any other tables/sql files I need to find? [19:09:48] just run update.php! ;) [19:10:39] if I comment out most of MySQLUpdater it'd be fine [19:10:59] Just BetaFeatures [19:11:09] (03PS1) 10Reedy: testwiki and test2wiki to 1.23wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93093 [19:11:26] Reedy: No, just update.php [19:11:56] !log Created betafeatures_user_counts on testwiki and test2wiki [19:12:12] Logged the message, Master [19:13:09] marktraceur: why do I see VE stuff in there? is that automatically included? [19:13:22] woot, it loaded! [19:13:25] and where's CommonsMetadat? [19:13:35] * greg-g gives marktraceur a hard time [19:13:44] is it just me or are the tabs' styles gone? [19:13:44] http://i.imgur.com/SkXVmBi.png [19:13:51] greg-g: VE is loaded, yeah [19:14:04] i.e. 
if VE is on a wiki it has hook that register betafeatures [19:14:08] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1166 seconds [19:14:14] greg-g: CommonsMetadata is an API patch basically [19:14:17] (03CR) 10Reedy: [C: 032] testwiki and test2wiki to 1.23wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93093 (owner: 10Reedy) [19:14:25] But it's exposed via MMV [19:14:27] (03Merged) 10jenkins-bot: testwiki and test2wiki to 1.23wmf2 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/93093 (owner: 10Reedy) [19:14:28] British EnglishReedy0TalkMy sandboxPreferencesBetaWatchlistNew messages (none)ContributionsMy uploadsLog out [19:14:29] marktraceur: huh [19:14:38] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1197 seconds [19:14:46] marktraceur: WFM [19:14:51] Cool cool [19:14:55] marktraceur: why does "VisualEditor" (not formula) show up in test but not test2? [19:15:08] RECOVERY - MySQL Idle Transactions on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:15:10] Maybe test2 has VE enabled as opt-out? [19:15:17] * marktraceur defers to James_F|Away  [19:15:21] Damn. RoanKattouw_away ? [19:15:22] Damn. [19:15:30] Trev damn it you guys [19:15:38] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:16:12] marktraceur: when it says "one user has enabled this extension" and I have it enabled, does that mean it's "one other than you" or "just you, dawg" [19:16:23] * greg-g is being nitpicky for no good reason [19:16:28] * greg-g should go get lunch [19:16:51] * greg-g gos [19:16:55] set +e [19:17:45] Just you [19:18:02] hm. [19:24:26] So, ahm, do I need to deploy anything to testwikis? [19:24:32] It sounds like it's done [19:24:56] And if that's the case I can release my deploy window into the world [19:26:03] MatmaRex: who fixes th tests? 
:) [19:28:50] Nemo_bis: ori-l, actually. :> [19:34:42] (03CR) 10Manybubbles: "(1 comment)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [19:37:10] Reedy: Database error on trying to upload to testwiki [19:37:22] "1054 Unknown column 'rc_source' in 'field list' (10.64.16.27)" [19:37:44] I thought springle-away had done those schema changes? [19:37:45] aude: ^^ [19:38:13] Oh, no, repro'd on file page https://test.wikipedia.org/wiki/File:MMV_screenie_with_text_box_scrollable.png [19:38:19] https://gerrit.wikimedia.org/r/#/c/85787/ [19:38:21] Reedy: the request was for all wikis, and pringle said he updated them all. [19:38:28] PROBLEM - DPKG on xenon is CRITICAL: DPKG CRITICAL dpkg reports broken packages [19:38:38] ebernhardson|lch: right [19:39:33] not on test2wiki either [19:39:53] or mediawikiwiki [19:40:16] Reedy: he did [19:40:18] it is on enwiki [19:40:39] i can't imagine he forgot some wikis [19:40:56] Well, recent changes table on testwiki is tiny [19:41:08] PROBLEM - DPKG on praseodymium is CRITICAL: DPKG CRITICAL dpkg reports broken packages [19:41:21] true [19:41:26] there is no rush for the patch, reverting it again if things are not ready is fine by me [19:42:08] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 2089 seconds [19:42:38] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 2121 seconds [19:42:55] it would be bad to find out on monday or whenever we update other wikis, that they are missing the column [19:43:01] yah [19:43:08] RECOVERY - DPKG on praseodymium is OK: All packages OK [19:43:15] mysql:wikiadmin@db1038 [testwiki]> ALTER TABLE /*$wgDBprefix*/recentchanges ADD rc_source varbinary(16) NOT NULL default '' AFTER rc_type; [19:43:15] Query OK, 3628 rows affected (0.41 sec) [19:43:15] Records: 3628 Duplicates: 0 Warnings: 0 [19:43:16] lol [19:43:35] !log Added rc_source column to 
testwiki.recentchanges [19:43:51] blugh [19:43:52] Logged the message, Master [19:44:08] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 2209 seconds [19:44:38] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 2241 seconds [19:44:39] ugh [19:44:47] * marktraceur needs that deploy window after all [19:44:51] what? [19:44:59] There's at least one fix we need to push out, that apparently didn't get caught [19:45:08] wha tis it? [19:45:15] (gerrit url) [19:45:17] I see bug [19:45:38] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:45:38] greg-g: https://gerrit.wikimedia.org/r/92995 but also https://gerrit.wikimedia.org/r/93092 if tgr can merge it [19:46:08] RECOVERY - MySQL Idle Transactions on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:46:16] poor pep8 job is missing a file :/ [19:46:17] how important are those? [19:46:22] greg-g: The former is critical [19:46:29] !log Added rc_source column to test2wiki.recentchanges [19:46:31] tested on beta? [19:46:34] The latter is...eh, i'd say normal or major [19:46:35] Yeah [19:46:38] hashar: poor pep8! [19:46:44] Logged the message, Master [19:46:51] marktraceur: what happens without the former? [19:47:02] "as expected" isn't enough for me ;) [19:47:21] greg-g: It loads an empty lightbox and gives a fatal JS error in the console [19:47:24] !log Added rc_source column to mediawikiwiki.recentchanges [19:47:25] awesome [19:47:38] Logged the message, Master [19:47:40] can this week be over yet? [19:47:42] greg-g: No [19:47:53] don't forget https://gerrit.wikimedia.org/r/#/c/92985/ :) [19:47:58] our one line javascript change [19:48:01] It also looks like there's a CSS caching error on beta...if not I have another normal-level bug, I guess [19:48:11] (03CR) 10Hashar: "There is an issue with jenkins slaves scripts sorry. Fixing them." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [19:48:13] marktraceur, sorry, on it [19:48:15] aude: can that wait until monday? [19:48:18] tgr: No problem [19:48:26] We have 20 whole minutes before our window. :P [19:48:31] !log Added rc_source column to testwikidatawiki.recentchanges [19:48:33] greg-g: prefer not but if we must [19:48:44] Logged the message, Master [19:48:53] the fix is already on test wikidata [19:48:58] Without checking every wiki... [19:49:00] since it made the cut yesterday into the new branch [19:49:04] !log jenkins: refreshing all *pep8 jobs, pep8 wrapper for puppet pointing to a wrong path [19:49:13] It seems only the phase1 wikis are possibly missing said column [19:49:18] Logged the message, Master [19:49:22] aude: oh right... [19:49:45] ugh [19:49:52] +1 [19:49:53] 'd [19:49:59] it's wmf1, which seems separate from the problems [19:50:03] yeah [19:50:13] sorry, just wishing the churn would stop soon :) [19:50:16] marktraceur: see? not just me! [19:50:17] yeah, sorry [19:50:23] aude: not your fault :/ [19:50:25] MatmaRex: Wat? [19:50:30] :/ [19:50:50] [20:48] It also looks like there's a CSS caching error on beta...if not I have another normal-level bug, I guess [19:51:04] Reedy: when you're done fixing dbs, can you update wikibase for wmf1? (the patch I added you as a reviewer on just now) [19:51:07] MatmaRex: What about it? [19:51:24] I don't remember talking about that at all [19:51:28] marktraceur: i complained about css being missing earlier :( [19:51:32] Ahh. [19:51:44] Yay, tgr is amazing and merged the other CSS fix [19:51:50] Testing on Beta soon [19:52:29] (03PS4) 10Hashar: Rebuild elasticsearch ganglia monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [19:52:40] manybubbles: fixed. Thanks for the notification! [19:52:54] aude: Don't suppose you've made a commit for it already? 
[19:53:02] https://gerrit.wikimedia.org/r/#/c/92985/ [19:53:02] hashar: thanks! [19:59:03] ori-l: Are you able to ssh into instance 'flog' in the eventlogging project? [19:59:27] !log reedy synchronized php-1.23wmf1/extensions/Wikibase [19:59:41] Logged the message, Master [20:00:35] thanks Reedy and greg-g [20:03:35] MatmaRex: Was the CSS thing only on Beta? [20:04:42] marktraceur: Back, what's up? [20:04:55] Oh, we were just talking about VE [20:05:05] RoanKattouw: Test has it enabled and test2 has it enabled opt-out, right? [20:06:02] echo "explain recentchanges;" | sudo -u apache foreachwiki sql.php | grep -c rc_source [20:06:08] Is that going to tell me what I actually want to know... [20:06:35] marktraceur: I don't know offhand, you'd have to look at InitialiseSettings [20:06:44] Balls. 'kay, will do [20:08:09] Answer is yes. [20:08:29] So yeah, that's not a problem. [20:10:13] marktraceur: i saw it on testwiki preferences page [20:10:20] Oh dear. [20:10:38] http://i.imgur.com/SkXVmBi.png [20:10:46] instead of the pretty tabs [20:10:49] MatmaRex: Seems fine now, though [20:10:50] everything else looked okay everywhere [20:11:02] As opposed to beta, which has seen CSS caching since apparently yesterday [20:11:14] yeah, looks okay now [20:11:33] okay, then maybe that was not related after all :) [20:11:42] (another reason why 'beta features' is a bad name) [20:12:49] MatmaRex: Why? [20:13:35] we have too many things named 'beta' already [20:13:44] or am i confused again? [20:13:48] anyway, nevermind [20:14:24] "I"m glad I'm a delta" :-P [20:18:46] (03CR) 10Ottomata: [C: 032 V: 032] Rebuild elasticsearch ganglia monitoring [operations/puppet] - 10https://gerrit.wikimedia.org/r/93077 (owner: 10Manybubbles) [20:19:11] manybubbles: merged [20:19:56] ottomata: yay! I'll poke at ganglia in the next few hours to see if I can clear any weird values the changes cause it to get temporarily. 
[20:20:07] ottomata: ori-l gave me some magic to try [20:20:15] hmm ok [20:21:32] greg-g: 'kay, should I be worried that CSS updates aren't happening on beta? [20:21:42] chrismcmahon: Maybe you have some idea of what's happening there? [20:22:08] Once I can confirm that CSS is getting updated properly I would like to deploy our two fixes to tests [20:22:11] testwikis* [20:22:26] (03CR) 10Andrew Bogott: "I just removed all references to puppetmaster::self from labs -- now we only use role::puppet::self." [operations/puppet] - 10https://gerrit.wikimedia.org/r/91353 (owner: 10Akosiaris) [20:22:51] marktraceur: we're getting CSS from bits.beta.wmflabs.org [20:23:11] marktraceur: how so 'not happening'? [20:23:29] chrismcmahon: I'm seeing old CSS on beta.wikipedia [20:23:39] Or whatever [20:23:46] restart all the bits! [20:24:03] ugh [20:24:27] Actually I'm not seeing the updated JS from this last patch, either [20:24:28] Or touch all the files [20:24:36] wtf [20:24:58] brion's patches got in, but my last one didn't go [20:25:31] Oh, no, sorry, false alarm [20:25:33] But CSS [20:26:34] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1267 seconds [20:26:53] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1282 seconds [20:29:27] what's up with db1059? 
it's been like that for a while [20:29:33] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:29:53] RECOVERY - MySQL Idle Transactions on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:33:53] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 753 seconds [20:35:26] greg-g: Should I be bugging you to work on this, or are you busy with more important things [20:36:22] Our deployment is half over and I haven't heard an opinion from you about whether I should deploy the fixes [20:36:33] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 918 seconds [20:37:33] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:39:00] marktraceur: sorry. Yes please, go for it. Reedy, can you help? [20:39:53] RECOVERY - MySQL Idle Transactions on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:39:59] We deploy from tin now, right? [20:40:04] Ah, yeah [20:41:00] Indeed [20:41:42] <^d> Where "now" is "for the past 9 months or so" :p [20:43:41] aude: Looks like it's maybe a quarter of the wikis have rc_source :/ [20:43:54] May need to extend a bit [20:43:56] how can that be? 
[20:43:56] * Reedy waits for foreachwiki to finish running [20:44:33] reedy@tin:/a/common$ grep -c rc_source rc.sql [20:44:33] 158 [20:44:33] reedy@tin:/a/common$ grep -c rc_params rc.sql [20:44:33] 879 [20:44:33] reedy@tin:/a/common$ [20:44:41] hmmmm [20:45:15] where rc.sql is the output from explain recentchanges; via sql.php [20:45:33] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1458 seconds [20:45:53] PROBLEM - MySQL Idle Transactions on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 1473 seconds [20:45:57] s4 master [20:46:02] commons [20:46:33] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:46:48] I wonder if that's related to [20:46:53] RECOVERY - MySQL Idle Transactions on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds [20:46:54] PHP Notice: DB transaction callbacks still pending (from User::invalidateCache). in /usr/local/apache/common-local/php-1.23wmf1/includes/db/Database.php on line 3913 [20:48:58] Arghlefarghle [20:49:09] git submodule update y u take so long [20:49:58] 20 KiB/s that might be why [20:50:05] where, locally? [20:50:09] Yeah [20:50:16] I can just update the extensions to master if you want [20:50:19] might be quicker [20:50:22] Office network slow as balls today [20:50:24] Yeah it would be [20:50:33] Which one(s)? [20:50:36] Oh, picking up now [20:50:40] Reedy: Only MultimediaViewer [20:50:47] The others should all be at master anyway [20:55:38] * YuviPanda pokes LeslieCarr [20:55:42] public IP for parsoid cluster? [20:57:17] Computer says no [20:58:44] Bargh [20:58:55] I'm seeing the same CSS on testwiki too. [20:59:01] * marktraceur checks that it actually got updated [20:59:31] Yeah, it's on tin [20:59:35] What on earth [21:00:05] Reedy: Should I just touch these CSS files manually or is this a symptom of a bigger problem? [21:00:20] In the extension? 
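The grep -c counts above (158 for rc_source against 879 for rc_params) say how many wikis have the column, but not which ones are missing it. A minimal sketch of a per-wiki report follows. It assumes every line of the combined output is prefixed with "dbname:", which is an assumption about the foreachwiki output format and should be checked before use; missing_column and the sample rows are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: each input line looks like "<dbname>: <column> <type> ...".
# missing_column is a hypothetical helper, not an existing tool.

missing_column() {
    local column="$1"
    awk -v col="$column" '
        {
            wiki = $1
            sub(/:$/, "", wiki)          # strip the trailing colon of the prefix
            seen[wiki] = 1               # remember every wiki that appears
            if ($2 == col) has[wiki] = 1 # remember wikis that list the column
        }
        END {
            for (w in seen) if (!(w in has)) print w
        }
    ' | sort
}

# Made-up sample rows standing in for the real "explain recentchanges;" output.
sample='testwiki: rc_params blob
testwiki: rc_source varbinary(16)
enwiki: rc_params blob
dewiki: rc_params blob
dewiki: rc_source varbinary(16)'

# Lists the wikis from the sample that have no rc_source row.
printf '%s\n' "$sample" | missing_column rc_source
```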
[21:00:23] Yeah [21:00:25] Er [21:00:32] Not a problem in the extension, no [21:00:36] The extension is working [21:00:37] (03CR) 10Dr0ptp4kt: [C: 04-1] "(3 comments)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/93006 (owner: 10Yurik) [21:00:43] But I'm not getting the extension's files [21:01:21] version specific extensionlist hack problem? :) [21:01:32] That only affects l10ncache [21:01:40] touching wmf-config/InitialseSettings.php and resources/startup.js and resources/Resources.php [21:01:47] and then syncing them should cause stuff to be recached [21:02:31] Where's wmf-config? [21:02:43] Oh, /a/common [21:04:08] Yeah, that didn't do it [21:04:24] The JS changes made it out, so at least the interface isn't crashing [21:04:37] But there's a really old version of the CSS sticking around for I don't know why [21:04:49] Uh [21:04:54] You've not deployed any code yet [21:05:23] Reedy: I've pulled it to tin and synced it to mw1017 [21:05:30] Oh [21:06:43] What did you run on mw1017? [21:06:58] sync-common [21:11:10] Actually [21:11:28] Magically, somehow, the CSS being served isn't even from what I *thought* was deployed earlier [21:11:34] It's older than that [21:12:29] :/ [21:12:50] Back when I was using position: absolute for the image links div [21:14:22] so... [21:14:28] how old is this code? [21:14:58] Looking [21:15:24] It should've only been branched 29-30 hours or so ago [21:15:59] The CSS got changed on the 29th [21:16:07] So what we have is older [21:16:21] I can't say anything more definitive because that's already older than what was deployed [21:16:22] :/ [21:18:00] Reedy: Any thoughts? [21:21:16] So, I'm close to saying "undeploy BetaFeatures" just fyi [21:21:26] I'm also close to that [21:21:30] :( [21:21:32] sorry man [21:21:37] Unless we can figure out what the fuck is up with the CSS [21:21:46] I'm backpedalling at 14:45 [21:21:53] yah, good idea [21:22:17] Because I want my peppermint mocha at 15:00. 
Er, I mean, because it's too much time mid-deploy. [21:22:19] I want to give you time to debug and such, but I don't want to leave even test wikis in a bad state over the weekend [21:22:49] greg-g: I don't really know what's going on...I mean, clearly MMV is getting hit by it, but I don't see how it could be _causing_ this [21:22:57] yeah [21:23:15] And if it's an RL caching issue that magically reaches backwards in git history, that's a beast I don't know how to slay [21:23:18] so, Reedy I know this is unrelated, but is that version specific extension list still needed? [21:23:53] +1 for undeploy [21:24:03] !log reedy synchronized php-1.23wmf2/ [21:24:13] * greg-g wonders what was in there [21:24:17] Logged the message, Master [21:24:20] What was? [21:24:32] what you just sync'd [21:24:41] touching numerous files [21:24:44] ah [21:24:45] Probably my stuff [21:24:53] * greg-g is just curious, ya konw [21:24:56] s/on/no/ [21:25:11] !log reedy synchronized wmf-config 'touch' [21:25:15] AHA [21:25:17] THERE IT IS [21:25:22] * marktraceur hugs Reedy [21:25:26] Logged the message, Master [21:25:40] Arright [21:25:58] The messages need to be sync'd too, maybe [21:26:29] Are they missing? [21:26:38] They look like they are, let me check again [21:26:41] https://test2.wikipedia.org/wiki/Special:Preferences#mw-prefsection-betafeatures looks fine to me [21:26:55] Yeah, but MMV is missing two messages [21:27:15] multimediaviewer-about-mmv and multimediaviewer-discuss-mmv [21:27:28] Show up as s in my interface [21:28:19] I can maybe just do sync-l10nupdate-1 [21:29:13] Reedy: Yay? Nay? You're all up in tin too, so I don't want to go unless you're aware of what I'm up to [21:30:46] * marktraceur queues up mw-update-l10n [21:30:47] I'd let reedy do it for now [21:30:54] Ah, K [21:31:01] Are those messages new? [21:31:07] ie from the update to master? 
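A small sketch of why touching files and re-syncing can clear stale CSS, assuming the cache decides freshness by comparing file modification times (the log suggests this but does not spell it out). The file names and the is_stale helper are hypothetical and live in a temp directory, not the real deployment tree.

```shell
#!/usr/bin/env bash
# Sketch only: stand-in files in a temp dir; is_stale is a hypothetical helper.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# cache_stamp stands for the time a cache entry was built.
# style.css carries an older mtime, so an mtime check sees nothing to rebuild.
touch -t 201310300000 "$workdir/cache_stamp"
touch -t 201310290000 "$workdir/style.css"

is_stale() {
    # True when the source file ($1) is newer than the cache stamp ($2).
    [ "$1" -nt "$2" ]
}

if is_stale "$workdir/style.css" "$workdir/cache_stamp"; then
    echo "before touch: rebuild"
else
    echo "before touch: cache looks current"
fi

# touch changes only the mtime, not the content.
touch "$workdir/style.css"

if is_stale "$workdir/style.css" "$workdir/cache_stamp"; then
    echo "after touch: rebuild"
else
    echo "after touch: cache looks current"
fi
```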
[21:31:17] I believe so [21:31:26] (I usually assume, and it turns out correct, that he's off not responding becuase he's checking something before running a command) [21:31:50] Reedy: Yeah, they are [21:32:07] Which is why they're not showing up then [21:32:11] Yup [21:32:40] bah, battery [21:32:58] no! [21:33:02] not again! [21:33:04] :) [21:33:09] Damn it Reedy [21:33:13] Are you on another ferry [21:33:37] No, my battery was fine yesterday [21:33:44] There only seems to be like 1 socket in this room :/ [21:38:50] so.... 2 minutes [21:39:02] !log reedy Started syncing Wikimedia installation... : Rebuild l10n cache for MultimediaViewer update [21:39:06] ah [21:39:13] Huzzah [21:39:13] see what I mean? [21:39:14] noooo [21:39:16] Logged the message, Master [21:39:19] Reedy: can you break it? [21:39:24] probably [21:39:32] I'm still worried about that extension list [21:39:55] Why? [21:39:59] It's already been deployed from earlier [21:40:09] I wanted to merge your scap update [21:40:15] Haha [21:40:18] Hrm, the messages still aren't there, but...hm, let me try something [21:40:18] but I didn't want to leave it deployed without trying it out [21:40:21] Doesn't make any difference for the deploy now ;) [21:40:41] marktraceur: give it a sec, still goign [21:40:54] Oh, right [21:41:23] oh shit [21:41:28] I better fix maths before people complain [21:41:48] ef what? [21:42:05] build the stupid texvc binary [21:42:10] oh god [21:42:11] dsh -F25 -cM -g mediawiki-installation -o -oSetupTimeout=10 'sudo -u mwdeploy /usr/bin/scap-recompile' [21:42:12] hahaha [21:42:14] It's alright [21:42:20] I just hate that thihng [21:42:20] every frickin time [21:42:21] -h [21:42:25] apergos: That's not fair! [21:42:26] can we fix that? 
[21:42:32] Not every time [21:42:36] 1 in 4 maybe [21:42:37] hey I've done my share of having to build it after the fact [21:42:40] :D [21:42:43] :-D [21:42:45] I blame other people for distracting me [21:43:28] it's really a horrid system, bound to break in exactly that way (not blaming the coders, just sayin...) [21:43:48] Y'know... [21:43:55] I know a very simple workaround for this [21:43:56] ... we shoulf fix that [21:44:02] man, typos today [21:44:02] $wgTexvc = "/usr/local/apache/uncommon/$wmfVersionNumber/bin/texvc"; [21:44:16] just remove the damn version number from the path [21:44:23] :-D [21:44:30] The old binaries only get deleted when I remember to do it now and again [21:44:32] it is always the same freakin binary ain't it [21:44:40] ... is it? [21:44:47] diff awa [21:44:50] gah! [21:44:51] away [21:45:10] It might be slightly different if we upgrade ubuntu in the meantime [21:45:27] yeah ok but in that case we would want it to be different regardless [21:45:56] built with the new libraries [21:45:59] shrug [21:46:02] Just looking at /usr/bin/scap-recompile now [21:46:29] paravoid: the official graphite packages are really put together well, btw [21:46:35] i was impressed [21:46:44] mwIP=/usr/local/apache/common-local/php-"$mwVerNum" [21:46:44] # Math was moved out to an extension in MW 1.18 [21:46:44] if [ -d $mwIP/extensions/Math/math ]; then [21:46:44] MATHPATH=$mwIP/extensions/Math/math [21:46:44] else [21:46:46] MATHPATH=$mwIP/math [21:46:48] fi [21:46:52] Do we really need that back compat? [21:47:26] I think it's the extensions/Math/math dir we care about [21:47:51] I'm not competent to answer at this hour (almost midnight, day started at 7 am) [21:47:59] I'm not competent enough to answer [21:48:02] https://git.wikimedia.org/history/mediawiki%2Fextensions%2FMath.git/b3ad6b91c5ceb7a9561dcfb1cb9b99ac16fd2d03/math [21:48:13] Last change was february [21:48:17] Bar me updating a comment [21:48:34] maybe that gets changed monday instead of... 
[21:48:34] So we'd have had to build it twice in 2012 [21:48:39] And once in 2013 [21:48:43] friday afternoon? [21:49:04] I mean put the patch set in but don't merge it today [21:49:19] To fix which issue? :P [21:49:36] Reedy: marktraceur we're over time, btw [21:49:42] how's things? [21:49:46] texvc buils [21:49:49] builds [21:49:54] css works but not messages? [21:49:55] I'm just waiting for scap to finish with the localisation update [21:49:56] greg-g: We're just waiting for l10n, once that's done I'm fine [21:49:57] !log reedy Finished syncing Wikimedia installation... : Rebuild l10n cache for MultimediaViewer update [21:50:02] wooo timing [21:50:02] alright test [21:50:09] Logged the message, Master [21:50:34] ...wat [21:50:36] ./debs/wikimedia-task-appserver/scap-recompile [21:50:37] lol [21:50:43] marktraceur: that's not good sounding [21:50:46] The messages don't look like they're there still [21:50:52] * greg-g sighs [21:51:20] alright then... we've put in enough time debugging this. It's almost 3pm on a Friday [21:51:56] Couple of missing messages isn't a big deal [21:52:14] True [21:52:31] (PS1) Reedy: Remove 1.18 back compat [operations/debs/wikimedia-task-appserver] - https://gerrit.wikimedia.org/r/93116 [21:52:34] so, tell me what went wrong and why it should stay and not be reverted and done right on Monday? [21:54:03] also, why the hell we did this today after I got verbal agreements from people in the deploy meeting that we wouldn't :/ (that's mostly my fault) [21:54:14] greg-g: There are two messages that aren't showing up. It's not clear why. The links are working fine, the functionality is there, there's just a pair of messages that aren't loading. It's only on testwiki. We have no real pressure. If it doesn't get fixed by Monday we can postpone deploy to mw.o and fix it. [21:54:24] greg-g: that was my fault.
I had to make a call in your absence [21:54:44] https://test2.wikipedia.org/wiki/MediaWiki:Multimediaviewer-discuss-mmv [21:54:48] The message is there [21:54:57] marktraceur: Do you expose that message via JS? [21:55:01] Yeah [21:55:03] Right [21:55:08] So the problem is the resource loader message cache [21:55:13] Ah, crap. [21:55:16] greg-g: I'm assuming by "this" you mean deploy to test/test2. it hasn't gone out anywhere else, right? [21:55:17] localisationupdate does it [21:55:18] scap doesn't [21:55:24] robla: right and right [21:55:28] So when localisation update runs in a few hours it'll fix it [21:55:40] greg-g: So we have a well-defined issue that'll get fixed right quick, automatically. [21:55:49] marktraceur: better answer :) [21:56:13] Indeed it is [21:56:21] But I didn't have it 'til Reedy gave it to me :) [21:56:35] OK, I can stop fucking around with this now, I think. [21:56:44] marktraceur: so, now that we're in a steady state, can I ask a favor? [21:56:56] Yup! [21:57:02] marktraceur: can you write up what went wrong with this and what I can do better next time? [21:57:16] greg-g: Sure [21:57:29] (PS1) Reedy: Move texvc out of MediaWiki version specific folder [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93118 [21:57:30] It'd be from my point of view, there may be missing bits, but I'll absolutely do that next [21:57:34] !log LocalisationUpdate completed (1.23wmf1) at Fri Nov 1 21:57:34 UTC 2013 [21:57:36] Once I have my coffee [21:57:39] s/you/we/ but really I want to make sure I know how to prevent this again, other than just saying no to everyone ;) [21:57:48] Logged the message, Master [21:57:50] go get your pumpkin spice [21:57:50] Right right [21:57:57] greg-g: Hell naw, peppermint yo [21:58:12] Oh, the messages show up on test2 no problem. Awesome.
[21:58:14] * marktraceur very happy [21:58:22] marktraceur: oh, bad memory, my bad [21:58:23] not to make things more awesome but you might want https://gerrit.wikimedia.org/r/#/c/93117/ [21:58:30] before doing stuff on monday [21:58:39] and then we need to check with sean [22:00:29] aude: can you email sean now-ish about that? [22:00:39] you or Reedy, I guess [22:01:05] I've half written an email [22:01:09] I got distracted with other things [22:01:10] all the better [22:01:13] * Reedy notices a pattern [22:01:15] understandly :) [22:01:20] robla: I need a secretary! [22:01:38] They can remind me I've got meetings too [22:01:42] Reedy: I'll get right on that :-P [22:01:48] reedy can handle it but i can ask again on monday [22:02:06] aude: thanks much [22:02:29] i don't know if he thought it was only needed for wikis that have wikibase or something [22:02:39] * greg-g shrugs [22:02:39] There's more than that [22:02:40] no idea [22:02:44] oh [22:02:45] ok [22:03:25] http://p.defau.lt/?zCJx0jvocevdune_nLymMg [22:03:26] That's the list [22:04:13] strange [22:05:28] It's not even all the wikipedias [22:05:59] * aude nods [22:09:15] <^d> 11806626 / 14mil done. 
[22:09:17] <^d> aude: ^ [22:09:27] <^d> I'm so redoing this batch indexing :p [22:14:16] ^d: ok [22:16:51] !log LocalisationUpdate completed (1.23wmf2) at Fri Nov 1 22:16:51 UTC 2013 [22:17:06] Logged the message, Master [22:21:16] * marktraceur cautiously looks around [22:21:22] Did anything explode while I was gone [22:21:36] marktraceur: chech the messages thing again [22:21:48] * marktraceur will [22:21:49] wow, seriously, typos galore today [22:21:49] rl cache stuff is still waiting [22:21:53] oh [22:21:55] nvm [22:22:26] * marktraceur sends Chechans to look at the message cache [22:22:44] Actually it's working now [22:23:20] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Nov 1 22:23:20 UTC 2013 [22:23:37] Logged the message, Master [22:26:54] marktraceur: thanks for the chechen joke, I was hoping someone would do it [22:31:32] MatmaRex: really? [22:39:56] Anyway, I'mma send greg-g an email or something [22:39:59] Maybe an etherpad? [22:40:02] Up to you [22:41:44] marktraceur: whatevs [22:46:25] greg-g: When do you want me to start the narrative, yesterday? Earlier? This morning? 
[22:47:25] marktraceur: when betafeatures became a thing [22:48:05] 'kay [22:49:04] Nemo_bis: yes, sorta kinda [22:49:26] Nemo_bis: he said he'll look into the failures before he disappeared a few hours ago [22:49:37] (or some other channel) [22:58:11] asked confirmation on https://gerrit.wikimedia.org/r/#/c/89604/ :) [22:59:52] :) [23:41:35] (PS1) Ori.livneh: Set $wgResourceLoaderStorageEnabled to true on test & test2 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93134 [23:43:07] (CR) Ori.livneh: [C: 2] Set $wgResourceLoaderStorageEnabled to true on test & test2 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93134 (owner: Ori.livneh) [23:43:25] (Merged) jenkins-bot: Set $wgResourceLoaderStorageEnabled to true on test & test2 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/93134 (owner: Ori.livneh) [23:45:31] !log ori synchronized wmf-config/InitialiseSettings.php 'Ib19944b16: = true on test & test2' [23:45:48] Logged the message, Master [23:45:52] <^d> ori-l: Quotes are hard ;-) [23:46:01] even dumber [23:46:12] $foo is a variable in bash, too, not just php [23:46:36] oh, perhaps you meant that single quotes would have done the trick [23:46:37] yes. [23:46:50] <^d> :)