[00:00:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [00:00:56] hehe [00:00:59] i dunno [00:01:02] i am going to say….. yes! [00:01:09] it's gym time! [00:01:24] i think it's safe :) [00:02:15] I guess I can check back later and see if there are hundreds of icinga warnings [00:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 00:02:41 UTC 2013 [00:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [00:04:37] how can both those icinga alerts be true? [00:04:55] watching people talk about icinga-wm while ignoring icinga-wm is fun [00:05:29] Well even if I didn't want to ignore it, what could I do? puppet is both fresh and stale [00:05:40] Clearly if I log in and observe it'll collapse the wave function [00:07:48] * andrewbogott goes to dinner and not dinner [00:07:59] * YuviPanda makes andrewbogott_afk a cat [00:24:56] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 00:24:52 UTC 2013 [00:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [00:28:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 00:28:06 UTC 2013 [00:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [00:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 00:28:52 UTC 2013 [00:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [00:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 00:32:41 UTC 2013 [00:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [00:54:45] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 00:54:43 UTC 2013 [00:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [00:58:15] 
RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 00:58:11 UTC 2013 [00:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [00:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 00:58:46 UTC 2013 [00:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [01:05:35] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 01:05:32 UTC 2013 [01:05:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [01:17:45] PROBLEM - Disk space on cp1045 is CRITICAL: DISK CRITICAL - free space: /srv/sda3 12619 MB (4% inode=99%): /srv/sdb3 12437 MB (3% inode=99%): [01:24:56] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 01:24:46 UTC 2013 [01:24:56] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [01:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 01:27:38 UTC 2013 [01:27:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [01:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 01:28:50 UTC 2013 [01:29:46] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [01:32:56] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 01:32:53 UTC 2013 [01:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [01:47:10] (PS1) TTO: (bug 51715) allow sysops to add/remove confirmed group on ckbwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/74825 [01:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 01:54:47 UTC 2013 [01:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [01:58:46] RECOVERY - Puppet freshness on cp1044 is 
OK: puppet ran at Sat Jul 20 01:58:41 UTC 2013 [01:58:56] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 01:58:51 UTC 2013 [01:58:59] (CR) Faidon: [C: -1] "/etc/ssl is the right place. You just need to add the user to the ssl-cert group (the directory has --x for the group)." [operations/puppet/cdh4] - https://gerrit.wikimedia.org/r/74686 (owner: Ottomata) [01:59:38] (CR) Faidon: [C: 1] Add elasticsearch module and role. [operations/puppet] - https://gerrit.wikimedia.org/r/74534 (owner: Manybubbles) [01:59:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [01:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [02:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 02:02:40 UTC 2013 [02:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [02:07:36] info s1 [02:07:38] @info s1 [02:07:38] Krinkle: [s1] db1056: 10.64.32.26, db1043: 10.64.16.32, db1049: 10.64.16.144, db1050: 10.64.16.145, db1051: 10.64.32.21, db1052: 10.64.32.22 [02:07:43] @info centralauth [02:07:43] Krinkle: [centralauth: s7] db1041: 10.64.16.30, db1007: 10.64.0.11, db1024: 10.64.16.13, db1028: 10.64.16.17 [02:07:51] @replag db1052 [02:07:51] Krinkle: [db1052: s1] db1052: 0s [02:08:04] @docs [02:08:04] Krinkle: https://www.mediawiki.org/wiki/dbbot-wm [02:08:06] @externals [02:08:07] Krinkle: [operations/mediawiki-config.git] Checked out HEAD: 27107f197184abe1ccee3a962999c779f37c85bf - https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=commit;h=27107f197184abe1ccee3a962999c779f37c85bf [02:10:27] !log LocalisationUpdate completed (1.22wmf10) at Sat Jul 20 02:10:26 UTC 2013 [02:10:39] Logged the message, Master [02:18:55] !log LocalisationUpdate completed (1.22wmf11) at Sat Jul 20 02:18:54 UTC 2013 [02:19:06] Logged the message, Master [02:24:56] RECOVERY - Puppet freshness on cp1042 
is OK: puppet ran at Sat Jul 20 02:24:53 UTC 2013 [02:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [02:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 02:27:41 UTC 2013 [02:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [02:28:56] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 02:28:53 UTC 2013 [02:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [02:30:43] !log LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 20 02:30:42 UTC 2013 [02:30:53] Logged the message, Master [02:33:15] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 02:33:12 UTC 2013 [02:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [02:34:55] PROBLEM - Puppet freshness on neon is CRITICAL: No successful Puppet run in the last 10 hours [02:54:57] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 02:54:54 UTC 2013 [02:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [02:57:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 02:57:47 UTC 2013 [02:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [02:58:45] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 02:58:43 UTC 2013 [02:59:46] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [03:00:55] PROBLEM - Puppet freshness on analytics1019 is CRITICAL: No successful Puppet run in the last 10 hours [03:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 03:02:43 UTC 2013 [03:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [03:04:55] PROBLEM - Puppet freshness 
on analytics1018 is CRITICAL: No successful Puppet run in the last 10 hours [03:05:55] PROBLEM - Puppet freshness on analytics1020 is CRITICAL: No successful Puppet run in the last 10 hours [03:25:25] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 03:25:16 UTC 2013 [03:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [03:28:05] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 03:28:04 UTC 2013 [03:28:46] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [03:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 03:28:49 UTC 2013 [03:29:46] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [03:33:15] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 03:33:09 UTC 2013 [03:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [03:40:55] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [03:45:55] PROBLEM - Puppet freshness on ms-fe1002 is CRITICAL: No successful Puppet run in the last 10 hours [03:51:55] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: No successful Puppet run in the last 10 hours [03:54:56] 
RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 03:54:52 UTC 2013 [03:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [03:56:55] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: No successful Puppet run in the last 10 hours [03:57:57] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 03:57:45 UTC 2013 [03:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [03:58:56] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 03:58:53 UTC 2013 [03:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [04:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 04:02:36 UTC 2013 [04:02:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [04:12:55] PROBLEM - Puppet freshness on ms-fe1001 is CRITICAL: No successful Puppet run in the last 10 hours [04:13:55] PROBLEM - Puppet freshness on bast1001 is CRITICAL: No successful Puppet run in the last 10 hours [04:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 04:24:52 UTC 2013 [04:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [04:27:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 04:27:44 UTC 2013 [04:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [04:29:05] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 04:28:56 UTC 2013 [04:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [04:33:15] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 04:33:14 UTC 2013 [04:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [04:54:55] RECOVERY - Puppet 
freshness on cp1042 is OK: puppet ran at Sat Jul 20 04:54:49 UTC 2013 [04:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [04:58:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 04:58:08 UTC 2013 [04:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [04:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 04:58:44 UTC 2013 [04:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [05:02:35] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 05:02:33 UTC 2013 [05:02:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [05:26:45] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 05:26:39 UTC 2013 [05:26:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [05:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 05:27:40 UTC 2013 [05:28:46] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [05:28:46] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 05:28:41 UTC 2013 [05:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [05:32:46] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 05:32:40 UTC 2013 [05:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [05:35:43] paravoid: hi [05:35:51] paravoid: can you merge this changeset please ? 
https://gerrit.wikimedia.org/r/74827 [05:36:21] I don't know why #74826 was abandoned, but that doesn't matter [05:36:34] #74827 contains the same changes and is available for review [05:39:05] wait [05:39:08] I pushed a patchset today [05:39:20] that was https://gerrit.wikimedia.org/r/#/c/74651/ [05:39:31] how many changesets do we have for the same thing [05:40:15] average: https://gerrit.wikimedia.org/r/#/c/74651/ ps2 is your last night's work + fixes that I had to do [05:45:57] alright, I guess that will do [05:46:05] paravoid: was it merged ? [05:46:06] can you clean up the patchsets? [05:46:13] no, because I want you to test it first :) [05:46:22] note how I removed the symlinks you added [05:46:23] these were wrong [05:46:28] but please test it works without them [05:46:43] there's also https://gerrit.wikimedia.org/r/#/c/68711/ [05:46:54] and 74826, 74287, 74651 [05:47:00] it will not work without them because java does not know where to load the libs from in a dynamic way. [05:47:08] which java? [05:47:13] that shouldn't be the case [05:47:33] openjdk in precise should be multiarch aware [05:47:56] we are not relying on openjdk, our aim is to use oracle jdk6, with a view towards openjdk in the future [05:48:48] that's a bug in oracle jdk though [05:48:58] we can't ship in /usr/lib [05:49:15] that's broken for precise [05:49:43] * average is rolling a cigarette and thinking about it [05:50:27] openjdk-6 (6b23~pre8-2) unstable; urgency=low [05:50:31] * Make the installation multiarch aware. [05:50:31] -- Matthias Klose Sun, 28 Aug 2011 17:55:22 +0200 [05:50:40] fixed 2 years ago [05:50:59] we could do the symlinks manually (i.e. puppet) [05:51:33] but we'd have to do them for every node of the cluster [05:51:36] wouldn't that be a hassle ? [05:51:45] that's why I said puppet [05:52:32] but then the devs would have this problem [05:52:59] they can just use an openjdk version which is, you know, not buggy & legal to redistribute? 
[05:53:00] on their local development environments [05:53:03] :) [05:53:12] but .. but... [05:53:16] this is an oracle jdk bug [05:54:52] so I will tell you what I know [05:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 05:54:46 UTC 2013 [05:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [05:55:16] https://github.com/wikimedia/kraken/blob/master/README.md [05:55:19] paravoid: ^^ [05:55:35] if you read that document, it explains how one sets up his Kraken locally for development [05:55:42] it is clearly stated there that we use oracle jdk [05:55:50] I fully agree, oracle jdk has a bug [05:55:55] but we can't just ditch it because it has a bug [05:56:01] because openjdk might bring more problems [05:56:17] currently everyone (except for qchris I think) is using oracle jdk for development on Kraken [05:56:29] installing oracle jdk involved manual steps [05:56:37] just add a couple of ln in those steps [05:56:57] ok [05:57:06] and work towards getting rid of oracle? [05:57:41] I mean, you (your team) have made a choice and now you're facing problems because of that choice [05:58:02] I'm providing you with a workaround, but you need to work towards the solution if you don't want to hit these problems all the time :) [05:58:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 05:58:08 UTC 2013 [05:58:15] ok, I agree [05:58:46] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [05:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 05:58:46 UTC 2013 [05:59:30] Get the [...] files from an Analytics Team member and store them in /usr/share/GeoIP. [05:59:34] wtf [05:59:46] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [06:00:08] paravoid: those files are actually located on stat1.. 
[06:00:11] please don't hand out those files to non-staffers [06:00:23] I won't, I never did and never will [06:00:39] this is a licensed product, it's illegal to do so [06:00:41] okay [06:00:58] they were always handed out with password generated random from /dev/random and then whirlpooled hash and hash sent on e-mail [06:01:18] it doesn't need to be encrypted, it just needs to be contained within the wmf [06:01:32] where within the wmf do we keep such files ? [06:01:41] is there such a policy ? I would adhere to it if I knew about it [06:02:20] well, there's no policy but there's no fl/oss license attached to them so your default action should be to not share them :) [06:02:46] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 06:02:44 UTC 2013 [06:02:47] I can guarantee they were only provided to staff members [06:03:01] i.e. if you worked at a regular company and a friend asked you for the company's key of microsoft office or whatever, you wouldn't give it to them would you :) [06:03:27] nod [06:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [06:03:58] just saying, the document there kind of implied that random people could ask you :) [06:04:26] random people would be asked who they are and what they need it for [06:04:35] okay [06:05:14] I have an NDA signed .. [06:06:01] paravoid: can I ask a question though ? [06:06:05] I'm not blaming you for something, I'm just pointing it out [06:06:09] mistakes can happen [06:07:12] do we have a private debian repo ? and if so, can we package those GeoIP files in private deb packages which are only available to certain ppl ? 
[06:07:41] the licensed GeoIP files are deployed using a separate mechanism that is private [06:07:45] this is already in production [06:08:07] this isn't using packages, as these don't come in a packaged form anyway [06:08:24] we don't have a private apt repository [06:08:29] ok [06:08:37] I know about the update utility that maxmind has [06:08:51] (which, going back to the other discussion, presents a problem with oracle java .debs...)) [06:09:01] yeah, we use that utility but in one place [06:09:10] then use puppet to distribute them to the rest of the systems [06:09:20] instead of hitting maxmind from 100 different systems [06:09:32] operations/puppet, modules/geoip has all the fun [06:09:56] makes sense [06:13:45] what are you doing working? :) [06:13:51] all day [06:13:56] why? [06:14:17] because I have a strong desire to make dclass happen [06:15:31] I'm testing your changes [06:23:14] paravoid: can I ask something ? [06:24:22] why do you keep clearing the changelog ? it's like groundhog day with bill murray where he lives the same day over and over [06:24:28] it's funny [06:24:34] :) [06:24:46] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 06:24:44 UTC 2013 [06:24:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [06:27:12] but not a problem [06:28:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 06:28:05 UTC 2013 [06:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [06:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 06:28:52 UTC 2013 [06:29:37] paravoid: ok verified [06:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [06:32:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 06:32:51 UTC 2013 [06:33:18] the changelog describes the difference between versions [06:33:28] these changes are under review, 
they've never been committed much less released [06:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [06:36:23] paravoid: +1 on https://gerrit.wikimedia.org/r/#/c/74651/ [06:36:48] paravoid: tested your patchset, deployed packages in a vagrant vm, so I tested them.. [06:36:55] PROBLEM - Puppet freshness on dobson is CRITICAL: No successful Puppet run in the last 10 hours [06:37:51] paravoid: should I +2 ? am I doing this right ? [06:37:58] you should V+2 [06:38:47] if you're feeling up to it, pull request the wikimedia changes to upstream :) [06:42:14] for sure [06:47:21] (PS1) TTO: (bug 42113) remove ability to debureaucrat from enwiktionary bureaucrats [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/74828 [06:50:35] paravoid: we might need to build for both openjdk and oracle [06:50:45] paravoid: I'm checking now if they're compatible with openjdk [06:51:15] hm, looks like it's working fine with openjdk as well [06:51:18] uhm, yeah [06:52:34] paravoid: is it possible that we have the packages in apt.wikimedia.org ? [06:52:55] PROBLEM - Puppet freshness on mchenry is CRITICAL: No successful Puppet run in the last 10 hours [06:53:20] people would be happy if we did [06:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 06:54:48 UTC 2013 [06:55:22] will do [06:55:34] cool [06:55:36] thanks paravoid [06:55:40] can you clean up the old patchsets [06:55:51] paravoid: which one ? [06:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [06:56:04] https://gerrit.wikimedia.org/r/#/c/74827/ https://gerrit.wikimedia.org/r/#/c/68711/ [06:56:04] paravoid: you mean to abandon them right ? 
[06:56:07] yes [06:56:09] yes [06:57:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 06:57:35 UTC 2013 [06:57:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [06:58:27] paravoid: done [06:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 06:58:48 UTC 2013 [06:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [06:59:56] !log apt: including dclass 2.2.2-1 [07:00:07] Logged the message, Master [07:00:47] done [07:01:10] this is a dream come true [07:01:15] thanks paravoid [07:02:04] can I go have the rest of the weekend now? :) [07:02:19] I suggest you do the same [07:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 07:02:38 UTC 2013 [07:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [07:04:48] paravoid: yes, thank you, I will [07:04:52] have a nice weekend [07:07:55] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [07:08:45] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 4 logical device(s) [07:14:55] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [07:15:45] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 4 logical device(s) [07:20:15] PROBLEM - DPKG on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:21:15] RECOVERY - DPKG on searchidx1001 is OK: All packages OK [07:24:56] PROBLEM - SSH on cp1044 is CRITICAL: Server answer: [07:25:05] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 07:24:55 UTC 2013 [07:25:50] PROBLEM - search indices - check lucene status page on search20 is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern found - 60051 bytes in 0.133 second response time [07:25:56] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [07:25:56] RECOVERY - SSH on cp1044 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [07:28:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 07:28:12 UTC 2013 [07:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [07:28:56] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 07:28:53 UTC 2013 [07:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [07:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 07:32:41 UTC 2013 [07:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [07:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 07:54:53 UTC 2013 [07:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [07:58:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 07:58:47 UTC 2013 [07:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 07:58:47 UTC 2013 [07:59:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [07:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [08:04:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 08:04:39 UTC 2013 [08:05:45] PROBLEM - Puppet freshness on cp1043 is 
CRITICAL: No successful Puppet run in the last 10 hours [08:24:56] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 08:24:52 UTC 2013 [08:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [08:28:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 08:28:11 UTC 2013 [08:28:46] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [08:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 08:28:52 UTC 2013 [08:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [08:32:55] PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours [08:35:05] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 08:34:57 UTC 2013 [08:35:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [08:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 08:54:46 UTC 2013 [08:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [09:08:39] PROBLEM - search indices - check lucene status page on search1005 is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern found - 139 bytes in 0.002 second response time [09:11:46] PROBLEM - search indices - check lucene status page on search1004 is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - pattern found - 139 bytes in 0.002 second response time [14:23:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:23:34] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [14:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 14:24:50 UTC 2013 [14:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 
10 hours [14:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 14:27:36 UTC 2013 [14:27:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [14:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 14:28:51 UTC 2013 [14:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [14:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 14:32:43 UTC 2013 [14:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [14:44:51] (CR) Odder: "Bump." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/73565 (owner: Odder) [14:51:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.127 second response time [14:54:45] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 14:54:43 UTC 2013 [14:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [14:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 14:58:51 UTC 2013 [14:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [15:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [15:25:05] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 15:24:59 UTC 2013 [15:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [15:27:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 15:27:47 UTC 2013 [15:28:45] PROBLEM - Puppet freshness on 
cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [15:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 15:28:54 UTC 2013 [15:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [15:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 15:32:42 UTC 2013 [15:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [15:49:06] YuviPanda: I see you used WTFPL for lolrrit; how does this relate to the licencing of Wikimedia repos in general? [15:49:31] twkozlowski: no idea. lolrrit is something I wrote in my personal capacity as a volunteer, so can be anything, no? [15:49:39] twkozlowski: SuchABot (my other gerrit bot) is also WTFPL [15:50:17] most of my new code in general (that aren't libraries) are mostly WTFPL [15:50:20] (libraries are apache) [15:51:05] YuviPanda: yes, I just got the impression that, say, most extensions were GPL'd [15:51:15] twkozlowski: indeed, they are. [15:51:21] YuviPanda: why using Apache for libraries instead of LGPL? 
[15:51:31] I'm more of a BSDish kinda guy :) [15:51:36] hence my non-libraries are WTFPL than GPL [15:52:09] twkozlowski, some of MW extensions in production are also WTFPL [15:52:43] I know that because I wrote them :P [15:53:27] MaxSem: most + some = all [15:53:53] I specifically wrote 'most' because I assumed some might not have been GPL'd [15:54:04] that'd be just too easy [15:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 15:54:45 UTC 2013 [15:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [15:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 15:58:52 UTC 2013 [15:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [16:00:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 16:00:44 UTC 2013 [16:01:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [16:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 16:02:38 UTC 2013 [16:03:24] (CR) Odder: "Shouldn't this have also removed the now obsolete per-project settings (all set to 'true' just like the default one)?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/74262 (owner: CSteipp) [16:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [16:05:47] (CR) Odder: "Hint: This only touched testwiki, mediawikiwiki and loginwiki; wikidatawiki was not affected." 
[operations/mediawiki-config] - https://gerrit.wikimedia.org/r/74405 (owner: Reedy) [16:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time [16:24:56] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 16:24:54 UTC 2013 [16:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [16:27:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 16:27:47 UTC 2013 [16:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [16:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 16:28:49 UTC 2013 [16:29:46] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [16:29:55] PROBLEM - SSH on lvs1001 is CRITICAL: Server answer: [16:31:55] RECOVERY - SSH on lvs1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:34:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 16:34:41 UTC 2013 [16:35:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [16:37:55] PROBLEM - Puppet freshness on dobson is CRITICAL: No successful Puppet run in the last 10 hours [16:37:56] PROBLEM - SSH on lvs1001 is CRITICAL: Server answer: [16:38:56] RECOVERY - SSH on lvs1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [16:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time [16:53:55] PROBLEM - Puppet freshness on mchenry is CRITICAL: No successful Puppet run in the last 10 hours [16:54:55] RECOVERY - Puppet freshness on cp1042 
is OK: puppet ran at Sat Jul 20 16:54:45 UTC 2013 [16:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [16:58:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 16:58:40 UTC 2013 [16:59:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [16:59:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 16:59:51 UTC 2013 [17:00:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [17:02:35] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 17:02:34 UTC 2013 [17:02:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [17:16:25] PROBLEM - RAID on searchidx2 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [17:18:25] PROBLEM - SSH on searchidx2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:20:15] RECOVERY - SSH on searchidx2 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [17:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time [17:25:05] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 17:24:57 UTC 2013 [17:25:25] RECOVERY - RAID on searchidx2 is OK: OK: State is Optimal, checked 4 logical device(s) [17:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [17:28:25] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 17:28:21 UTC 2013 [17:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [17:29:45] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 17:29:42 UTC 2013 [17:30:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No 
successful Puppet run in the last 10 hours [17:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 17:32:41 UTC 2013 [17:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [17:36:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:37:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time [17:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [17:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 17:54:47 UTC 2013 [17:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [17:59:15] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 17:59:07 UTC 2013 [17:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [18:00:41] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 18:00:33 UTC 2013 [18:00:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [18:02:45] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.0476 (gt 1000) [18:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 18:02:40 UTC 2013 [18:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:04:45] RECOVERY - Solr on vanadium is OK: All OK [18:10:25] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 18:10:21 UTC 2013 [18:10:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:11:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at 
Sat Jul 20 18:11:45 UTC 2013 [18:12:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:14:59] (PS2) Andrew Bogott: Replace generic::sysctl::ipv6-disable-ra [operations/puppet] - https://gerrit.wikimedia.org/r/74799 [18:15:52] (CR) Andrew Bogott: [C: 2] Replace generic::sysctl::ipv6-disable-ra [operations/puppet] - https://gerrit.wikimedia.org/r/74799 (owner: Andrew Bogott) [18:15:53] (Merged) Andrew Bogott: Replace generic::sysctl::ipv6-disable-ra [operations/puppet] - https://gerrit.wikimedia.org/r/74799 (owner: Andrew Bogott) [18:16:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 18:16:52 UTC 2013 [18:17:35] (PS1) Ottomata: Adding role::analytics::kraken for common kraken client classes [operations/puppet] - https://gerrit.wikimedia.org/r/74848 [18:17:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:17:48] (CR) Ottomata: [C: 2 V: 2] Adding role::analytics::kraken for common kraken client classes [operations/puppet] - https://gerrit.wikimedia.org/r/74848 (owner: Ottomata) [18:17:49] (Merged) Ottomata: Adding role::analytics::kraken for common kraken client classes [operations/puppet] - https://gerrit.wikimedia.org/r/74848 (owner: Ottomata) [18:18:19] (PS5) Ottomata: Adding role::analytics::hue [operations/puppet] - https://gerrit.wikimedia.org/r/74388 [18:21:13] (PS1) Ottomata: Renaming role::analytics::kraken to role::analytics::common [operations/puppet] - https://gerrit.wikimedia.org/r/74849 [18:21:22] (CR) Ottomata: [C: 2 V: 2] Renaming role::analytics::kraken to role::analytics::common [operations/puppet] - https://gerrit.wikimedia.org/r/74849 (owner: Ottomata) [18:21:23] (Merged) Ottomata: Renaming role::analytics::kraken to role::analytics::common [operations/puppet] - https://gerrit.wikimedia.org/r/74849 (owner: Ottomata) [18:21:38] (PS6) Ottomata: Adding role::analytics::hue [operations/puppet] - 
https://gerrit.wikimedia.org/r/74388 [18:25:05] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 18:24:55 UTC 2013 [18:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [18:26:58] (PS2) Andrew Bogott: Replace generic::sysctl::lvs [operations/puppet] - https://gerrit.wikimedia.org/r/74800 [18:28:33] (CR) Andrew Bogott: [C: 2] Replace generic::sysctl::lvs [operations/puppet] - https://gerrit.wikimedia.org/r/74800 (owner: Andrew Bogott) [18:28:34] (Merged) Andrew Bogott: Replace generic::sysctl::lvs [operations/puppet] - https://gerrit.wikimedia.org/r/74800 (owner: Andrew Bogott) [18:28:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 18:28:53 UTC 2013 [18:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [18:31:05] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 18:31:01 UTC 2013 [18:31:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 18:31:07 UTC 2013 [18:31:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:31:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [18:32:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:32:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 18:32:47 UTC 2013 [18:33:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.127 second response time [18:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [18:33:55] PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours [18:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:53:15] RECOVERY - Puppetmaster HTTPS on 
stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.131 second response time [18:53:45] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.5198 (gt 1000) [18:57:39] (PS1) Andrew Bogott: Replace uses of generic::sysctl with sysctlfile module [operations/puppet] - https://gerrit.wikimedia.org/r/74852 [18:57:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 18:57:46 UTC 2013 [18:57:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 18:57:46 UTC 2013 [18:57:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [18:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [18:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 18:58:47 UTC 2013 [18:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [19:02:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 19:02:52 UTC 2013 [19:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [19:07:45] RECOVERY - Solr on vanadium is OK: All OK [19:15:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:16:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.140 second response time [19:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.142 second response time [19:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 19:24:51 UTC 2013 [19:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [19:27:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 
20 19:27:48 UTC 2013 [19:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [19:29:05] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 19:28:59 UTC 2013 [19:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [19:32:55] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 19:32:48 UTC 2013 [19:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [19:37:45] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1000.3906 (gt 1000) [19:49:35] (PS1) Yuvipanda: Add (and re-organize) TCL packages for giftpflanze [operations/puppet] - https://gerrit.wikimedia.org/r/74856 [19:51:58] (PS2) Yuvipanda: Add (and re-organize) TCL packages for giftpflanze [operations/puppet] - https://gerrit.wikimedia.org/r/74856 [19:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time [19:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 19:54:50 UTC 2013 [19:55:09] (PS3) Yuvipanda: Add (and re-organize) TCL packages for giftpflanze [operations/puppet] - https://gerrit.wikimedia.org/r/74856 [19:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [19:57:52] (PS4) Yuvipanda: Add (and re-organize) TCL packages for giftpflanze [operations/puppet] - https://gerrit.wikimedia.org/r/74856 [19:58:05] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 19:57:57 UTC 2013 [19:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [19:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 19:58:48 UTC 2013 [19:59:45] PROBLEM - Puppet freshness on cp1041 is 
CRITICAL: No successful Puppet run in the last 10 hours [20:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 20:02:38 UTC 2013 [20:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [20:21:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time [20:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 20:24:50 UTC 2013 [20:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [20:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 20:27:44 UTC 2013 [20:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [20:29:05] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 20:28:59 UTC 2013 [20:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [20:31:47] (CR) coren: [C: 2] "Ticklish!" 
[operations/puppet] - https://gerrit.wikimedia.org/r/74856 (owner: Yuvipanda) [20:31:48] (Merged) coren: Add (and re-organize) TCL packages for giftpflanze [operations/puppet] - https://gerrit.wikimedia.org/r/74856 (owner: Yuvipanda) [20:33:15] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 20:33:05 UTC 2013 [20:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [20:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.127 second response time [20:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 20:54:53 UTC 2013 [20:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [20:57:32] apergos: are you around? got error 500 on tools.wmflabs.org [20:57:59] Internal Server Error [20:57:59] The server encountered an internal error or misconfiguration and was unable to complete your request. [20:57:59] Please contact the server administrator, mpelletier@wikimedia.org and inform them of the time the error occurred, and anything you might have done that may have caused the error. [20:57:59] More information about this error may be available in the server error log. 
[20:58:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 20:58:07 UTC 2013 [20:58:31] matanya: ask in #wikimedia-labs maybe [20:58:40] good idea hashar thanks [20:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [20:58:45] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 20:58:43 UTC 2013 [20:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [21:03:25] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 21:03:22 UTC 2013 [21:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [21:04:55] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [21:04:56] PROBLEM - SSH on searchidx1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:05:55] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 4 logical device(s) [21:05:55] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [21:06:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:07:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.134 second response time [21:11:55] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[21:13:45] RECOVERY - RAID on searchidx1001 is OK: OK: State is Optimal, checked 4 logical device(s) [21:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.123 second response time [21:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 21:24:48 UTC 2013 [21:25:56] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [21:27:45] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 21:27:39 UTC 2013 [21:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [21:30:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 21:30:45 UTC 2013 [21:31:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:31:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [21:32:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.129 second response time [21:32:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 21:32:38 UTC 2013 [21:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [21:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.129 second response time [21:54:45] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 21:54:40 UTC 2013 [21:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [21:57:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:45] 
RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 21:57:44 UTC 2013 [21:58:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.126 second response time [21:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [21:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 21:58:50 UTC 2013 [21:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [22:02:56] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 22:02:54 UTC 2013 [22:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [22:05:45] RECOVERY - search indices - check lucene status page on search20 is OK: HTTP OK: HTTP/1.1 200 OK - 60075 bytes in 0.114 second response time [22:09:25] !log Debugging memory leak on vanadium; solr instance may see some intermittent outages. [22:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.133 second response time [22:24:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 22:24:52 UTC 2013 [22:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [22:27:55] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 22:27:46 UTC 2013 [22:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [22:29:05] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 22:28:58 UTC 2013 [22:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [22:32:01] (CR) Legoktm: [C: 1] (bug 50929) Remove 'visualeditor-enable' from $wgHiddenPrefs [operations/mediawiki-config] - 
https://gerrit.wikimedia.org/r/73565 (owner: Odder) [22:33:25] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 22:33:18 UTC 2013 [22:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [22:36:56] PROBLEM - Puppet freshness on neon is CRITICAL: No successful Puppet run in the last 10 hours [22:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.135 second response time [22:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 22:54:46 UTC 2013 [22:54:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [22:58:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 22:58:09 UTC 2013 [22:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [22:59:35] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 22:59:25 UTC 2013 [22:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [23:02:45] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 23:02:40 UTC 2013 [23:02:55] PROBLEM - Puppet freshness on analytics1019 is CRITICAL: No successful Puppet run in the last 10 hours [23:03:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [23:06:55] PROBLEM - Puppet freshness on analytics1018 is CRITICAL: No successful Puppet run in the last 10 hours [23:07:55] PROBLEM - Puppet freshness on analytics1020 is CRITICAL: No successful Puppet run in the last 10 hours [23:10:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:11:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes 
in 0.132 second response time [23:14:45] RECOVERY - Solr on vanadium is OK: All OK [23:17:45] PROBLEM - Solr on vanadium is CRITICAL: Average request time is 1795.125 (gt 1000) [23:22:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:23:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time [23:24:56] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 23:24:53 UTC 2013 [23:25:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [23:28:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 23:28:05 UTC 2013 [23:28:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [23:29:06] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 23:29:01 UTC 2013 [23:29:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours [23:31:22] (CR) MZMcBride: "This should be merged and deployed as soon as possible." 
[operations/mediawiki-config] - https://gerrit.wikimedia.org/r/73565 (owner: Odder) [23:33:15] RECOVERY - Puppet freshness on cp1043 is OK: puppet ran at Sat Jul 20 23:33:09 UTC 2013 [23:33:45] PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours [23:38:45] RECOVERY - Solr on vanadium is OK: All OK [23:42:55] PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours [23:42:55] PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours [23:42:55] PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours [23:42:56] PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours [23:42:56] PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours [23:42:56] PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours [23:42:56] PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours [23:47:55] PROBLEM - Puppet freshness on ms-fe1002 is CRITICAL: No successful Puppet run in the last 10 hours [23:52:25] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:53:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.131 second response time [23:53:55] PROBLEM - Puppet freshness on ms-fe1003 is CRITICAL: No successful Puppet run in the last 10 hours [23:54:55] RECOVERY - Puppet freshness on cp1042 is OK: puppet ran at Sat Jul 20 23:54:49 UTC 2013 [23:55:55] PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours [23:58:15] RECOVERY - Puppet freshness on cp1044 is OK: puppet ran at Sat Jul 20 23:58:12 UTC 2013 [23:58:45] PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours [23:58:45] PROBLEM - 
Solr on vanadium is CRITICAL: Average request time is 1044.6666 (gt 1000) [23:58:55] PROBLEM - Puppet freshness on ms-fe1004 is CRITICAL: No successful Puppet run in the last 10 hours [23:58:55] RECOVERY - Puppet freshness on cp1041 is OK: puppet ran at Sat Jul 20 23:58:48 UTC 2013 [23:59:45] PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours