[00:01:14] (CR) Rush: [C: 1] "sounds good" [operations/puppet] - https://gerrit.wikimedia.org/r/147640 (owner: Dzahn)
[00:01:42] (CR) Dzahn: [C: 2] phab-login screen, login message and old HTML [operations/puppet] - https://gerrit.wikimedia.org/r/147640 (owner: Dzahn)
[00:10:35] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[00:26:10] (PS1) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:27:40] (CR) Rush: [C: 1] "good enough :)" [operations/puppet] - https://gerrit.wikimedia.org/r/148564 (owner: Dzahn)
[00:28:25] (PS2) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:29:38] (PS3) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:31:43] (CR) Greg Grossmeier: phab - configurable login message by auth type (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148564 (owner: Dzahn)
[00:35:45] greg-g: will it work though?
[00:35:47] it's not mw
[00:36:18] (PS1) Withoutaname: Remove deprecated $wgCopyrightIcon [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/148568
[00:37:45] mutante: aren't proto relative urls a browser thing?
[00:38:38] they are
[00:38:40] it will work
[00:39:16] (other than when you preview the file locally, using the file:// protocol, in which case it probably won't work)
[00:40:08] mutante: yeah, just tested on my server, they work
[00:40:16] thanks MatmaRex
[00:40:46] (PS4) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:41:31] (PS5) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:41:32] ok :)
[00:42:06] :)
[00:42:37] https://phabricator.org .. not sure :)
[00:42:41] connecting..
[00:44:19] well, that's lame on their part...
[00:47:26] mutante: I guess leave it http for phab.org :/
[00:49:38] (PS6) Dzahn: phab - configurable login message by auth type [operations/puppet] - https://gerrit.wikimedia.org/r/148564
[00:50:33] sorry for the extra work :/
[00:50:51] np, that's normal :)
[00:55:22] (CR) Dzahn: "wth.@ fail on http://puppet-compiler.wmflabs.org/172/change/148293/html/ms-be3001.esams.wmet.html" [operations/puppet] - https://gerrit.wikimedia.org/r/148293 (owner: Dzahn)
[00:57:35] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures
[01:07:55] (PS7) Dzahn: turn RT from misc/* into puppet module [operations/puppet] - https://gerrit.wikimedia.org/r/116064
[01:12:35] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[01:19:25] (CR) Dzahn: [C: -2] "MODULES/NGINX so damn annoying...."
[operations/puppet] - https://gerrit.wikimedia.org/r/116064 (owner: Dzahn)
[01:53:14] PROBLEM - Puppet freshness on db1006 is CRITICAL: Last successful Puppet run was Tue 22 Jul 2014 23:53:01 UTC
[02:13:04] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Wed Jul 23 02:12:54 UTC 2014
[02:26:44] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 2 failures
[02:50:38] !log LocalisationUpdate completed (1.24wmf13) at 2014-07-23 02:49:34+00:00
[02:50:44] Logged the message, Master
[02:54:44] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 6 below the confidence bounds
[02:55:59] (CR) BBlack: [C: 1] "Confirmed whitespace only. Better formatting can't hurt!" [operations/dns] - https://gerrit.wikimedia.org/r/148437 (owner: Dzahn)
[03:11:54] PROBLEM - graphite.wikimedia.org on tungsten is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:13:35] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[03:13:44] RECOVERY - graphite.wikimedia.org on tungsten is OK: HTTP OK: HTTP/1.1 200 OK - 1607 bytes in 0.017 second response time
[03:21:35] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[03:21:49] !log LocalisationUpdate completed (1.24wmf14) at 2014-07-23 03:20:45+00:00
[03:21:54] Logged the message, Master
[03:45:14] (PS1) Brian Wolff: Remove flickrApiUrl from $wgUploadWizardConfig [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/148593 (https://bugzilla.wikimedia.org/67298)
[04:11:01] !log LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 23 04:09:54 UTC 2014 (duration 9m 53s)
[04:11:06] Logged the message, Master
[04:14:57] (CR) Scottlee: [C: 1] "Looks good."
[operations/dns] - https://gerrit.wikimedia.org/r/148437 (owner: Dzahn)
[04:20:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:22:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:24:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:26:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:28:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:29:32] (PS2) Ori.livneh: apache: add apache::mpm [operations/puppet] - https://gerrit.wikimedia.org/r/148542
[04:30:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:32:05] (CR) Ori.livneh: "I have a somewhat different take on MPMs that I'd like you to consider: https://gerrit.wikimedia.org/r/#/c/148542/ ."
[operations/puppet] - https://gerrit.wikimedia.org/r/148099 (owner: Giuseppe Lavagetto)
[04:32:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:34:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:34:36] (PS3) Ori.livneh: apache: add apache::mpm [operations/puppet] - https://gerrit.wikimedia.org/r/148542
[04:36:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:38:31] PROBLEM - Puppet freshness on search1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:18:28 UTC
[04:38:31] RECOVERY - Puppet freshness on search1009 is OK: puppet ran at Wed Jul 23 04:38:24 UTC 2014
[05:29:46] !log clone mariadb 10 labsdb1002 to labsdb100[13]
[05:29:50] Logged the message, Master
[05:32:15] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 21.43% of data above the critical threshold [500.0]
[05:34:32] quite a spike
[05:37:22] (CR) Gergő Tisza: [C: 2] Remove flickrApiUrl from $wgUploadWizardConfig [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/148593 (https://bugzilla.wikimedia.org/67298) (owner: Brian Wolff)
[05:37:43] (Merged) jenkins-bot: Remove flickrApiUrl from $wgUploadWizardConfig [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/148593 (https://bugzilla.wikimedia.org/67298) (owner: Brian Wolff)
[05:46:15] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0]
[05:47:45] PROBLEM - Unmerged changes on repository mediawiki_config on tin is CRITICAL: There is one unmerged change in mediawiki_config (dir /a/common/).
[06:20:45] PROBLEM - Puppet freshness on db1007 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 04:20:04 UTC
[06:20:45] RECOVERY - Puppet freshness on db1007 is OK: puppet ran at Wed Jul 23 06:20:40 UTC 2014
[06:28:35] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:35] PROBLEM - puppet last run on db1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:36] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:45] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:45] PROBLEM - puppet last run on mw1069 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:45] PROBLEM - puppet last run on mw1068 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:46] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:55] PROBLEM - puppet last run on mw1099 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:28:55] PROBLEM - puppet last run on mw1217 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:05] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 5 failures
[06:29:15] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:35] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:35] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:37] <_joe_> the 6.30 bug
[06:30:08] <_joe_> or - how can you write a daemon in 2014 and not keep log rotation into account
[06:35:15] heh
[06:40:35] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:42:25] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[06:43:06] PROBLEM - puppet last run on db1007 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:45:05] RECOVERY - puppet last run on mw1099 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[06:45:55] RECOVERY - puppet
last run on db1018 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[06:45:55] RECOVERY - puppet last run on mw1069 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[06:45:55] RECOVERY - puppet last run on mw1068 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[06:46:05] RECOVERY - puppet last run on mw1217 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[06:46:15] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[06:46:16] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[06:46:35] RECOVERY - puppet last run on db1002 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:46:35] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:46:36] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:46:36] RECOVERY - puppet last run on ms-fe1004 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:46:36] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[06:46:55] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[06:58:35] RECOVERY - puppet last run on db1009 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[07:00:25] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[07:01:06] RECOVERY - puppet last run on db1007 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[08:15:36] (CR) Alexandros Kosiaris: [C: 2] ldap: qualify vars [operations/puppet] -
https://gerrit.wikimedia.org/r/148035 (owner: Matanya)
[08:16:12] (CR) Alexandros Kosiaris: [C: 2] ganglia_view.json.erb variable qualification [operations/puppet] - https://gerrit.wikimedia.org/r/148346 (owner: Alexandros Kosiaris)
[08:16:23] (CR) Alexandros Kosiaris: [C: 2] Stabilize dnsmasq-nova hash [operations/puppet] - https://gerrit.wikimedia.org/r/148345 (owner: Alexandros Kosiaris)
[08:21:14] (CR) Alexandros Kosiaris: [C: -1] "Minor stuff" (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/148099 (owner: Giuseppe Lavagetto)
[08:21:25] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures
[08:35:07] (CR) Alexandros Kosiaris: [C: -1] "I like this. Minor comment" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148542 (owner: Ori.livneh)
[08:37:44] (CR) Giuseppe Lavagetto: [C: -1] "a couple of minor comments, I can fix those btw." (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/148542 (owner: Ori.livneh)
[08:37:54] <_joe_> I still had to post it
[08:39:10] <_joe_> akosiaris: just seen why icinga is failing on neon... very very interesting
[08:39:20] <_joe_> we're doing basically everything wrong there
[08:39:28] <_joe_> puppet-wise
[08:39:44] <_joe_> we have an exec that /bin/chown -R icinga /var/lib/icinga
[08:39:56] <_joe_> only, that dir includes icinga spool directory
[08:40:17] <_joe_> so, race conditions happen, and chown returns an exit code of 1 in that case
[08:40:25] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[08:40:33] <_joe_> (when a file is deleted while it was supposed to be chowned)
[08:41:34] <_joe_> so, my actual question: do you have any idea why we do something like that?
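(Editor's illustration of the race _joe_ describes above; this is a minimal Python sketch, not the actual puppet exec or a proposed fix. A recursive ownership change over a directory whose files can vanish mid-walk, as in a spool directory, may tolerate the disappearance instead of failing. The function name `chown_tree` and its injectable `chown` parameter are hypothetical, introduced only for this example.)

```python
import os


def chown_tree(root, uid, gid, chown=os.chown):
    """Recursively change ownership under root, skipping entries that vanish.

    Returns the list of paths that disappeared before they could be chowned.
    The `chown` parameter is a hypothetical hook so the race can be simulated.
    """
    vanished = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Each directory is handled when it is visited as dirpath.
        targets = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in targets:
            try:
                chown(path, uid, gid)
            except FileNotFoundError:
                # The file was deleted between listing and chown: record it
                # and carry on rather than aborting the whole run.
                vanished.append(path)
    return vanished
```

(With the default `os.chown`, changing ownership to another user generally requires elevated privileges; the returned list only makes the skipped paths visible to the caller.)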
[08:42:20] _joe_: nope
[08:42:27] I am surprised every time I see it
[08:43:28] I think the same approach is in the init file
[08:44:19] I am hoping to kill all of this TBH
[08:45:18] (CR) Alexandros Kosiaris: [C: 2] "LGTM too, will merge with Antoine present" [operations/puppet] - https://gerrit.wikimedia.org/r/144708 (owner: Hashar)
[08:45:39] (CR) Ori.livneh: apache: add apache::mpm (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/148542 (owner: Ori.livneh)
[08:46:27] akosiaris: you're going to break my puppet ascii art with this user, group, mode stuff!
[08:46:54] <_joe_> ORI ALERT
[08:46:54] those squiggly parens don't align themselves you know
[08:46:59] <_joe_> :)
[08:47:05] actually they do with the right vim plugin
[08:47:16] or the wrong vim plugin, depending on how you look at it
[08:48:12] <_joe_> or the wrong editor, depending on how sane you are
[08:48:24] ori: I knew I was going to hear it. But the pedantic little bastard in me won
[08:48:42] he was kind of sorry TBH
[08:48:49] heheh. it's true though, they should be there
[08:51:54] (CR) Alexandros Kosiaris: [C: 2] "LGTM to me too, will merge with Antoine present" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/144709 (owner: Hashar)
[08:52:29] "LGTM to me"?
:p
[08:54:07] Vogone: In this context, LGTM stands for 'Looks Generally Totally Marvelous'
[08:54:29] odder: thanks, I was going for something way less inspired
[08:54:42] :-)
[08:54:54] like Looks good to most (and) to me too
[08:55:12] hehe
[08:58:05] (PS4) Ori.livneh: apache: add apache::mpm [operations/puppet] - https://gerrit.wikimedia.org/r/148542
[09:02:33] i nominate "lets get that merged"
[09:02:51] * hashar points ori at the clock
[09:02:53] jouncebot, die
[09:03:04] mwalker: that was a little extreme
[09:03:17] akosiaris: I am there around :-D
[09:03:24] (PS1) QChris: Fix typo when setting hive.exec.parallel.thread.number [operations/puppet/cdh] - https://gerrit.wikimedia.org/r/148616
[09:03:25] hehe; it's the easiest way to kill it in order to update it
[09:03:40] <_joe_> mwalker: oh really? nice
[09:05:20] _joe_, it doesn't restart itself though; so you have to be part of the group on tools to restart it...
[09:10:19] (CR) Filippo Giunchedi: [C: 2 V: 2] "thanks Daniel and Matanya :))" [operations/puppet] - https://gerrit.wikimedia.org/r/148293 (owner: Dzahn)
[09:13:32] (PS1) Chmarkine: tendril -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148618 (https://bugzilla.wikimedia.org/53259)
[09:14:10] (PS2) Chmarkine: tendril -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148618 (https://bugzilla.wikimedia.org/53259)
[09:15:55] hashar: can jenkins vote please on literal tabs ?
[09:15:56] hashar: sorry, got a computer emergency. wanna start ?
[09:16:07] akosiaris: sure
[09:16:29] akosiaris: do you want to use hangout?
[09:16:39] sure
[09:16:53] https://plus.google.com/hangouts/_/wikimedia.org/akosiaris-amuss?hceid=YW11c3NvQHdpa2ltZWRpYS5vcmc.ni80sq8lvau5d3coa524c0jph8
[09:17:00] matanya: busy this morning with Alexandros :)
[09:17:12] hi akosiaris why not try my new toy?
:P
[09:18:32] (CR) Matanya: [C: 1] tendril -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148618 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[09:24:12] (PS10) Alexandros Kosiaris: zuul: migrate settings to role::zuul::configuration [operations/puppet] - https://gerrit.wikimedia.org/r/144709 (owner: Hashar)
[09:25:15] (CR) Alexandros Kosiaris: [V: 2] zuul: migrate settings to role::zuul::configuration [operations/puppet] - https://gerrit.wikimedia.org/r/144709 (owner: Hashar)
[09:25:22] (CR) Filippo Giunchedi: [C: 1] apache: add apache::mpm [operations/puppet] - https://gerrit.wikimedia.org/r/148542 (owner: Ori.livneh)
[09:25:56] (PS6) Hashar: zuul: remove $zuul_url from zuul::server [operations/puppet] - https://gerrit.wikimedia.org/r/144997
[09:26:42] (CR) Alexandros Kosiaris: [C: 2] zuul: remove $zuul_url from zuul::server [operations/puppet] - https://gerrit.wikimedia.org/r/144997 (owner: Hashar)
[09:29:57] (PS10) Alexandros Kosiaris: zuul: phase out zuulwikimedia [operations/puppet] - https://gerrit.wikimedia.org/r/145047 (owner: Hashar)
[09:30:06] (CR) Alexandros Kosiaris: [C: 2 V: 2] zuul: phase out zuulwikimedia [operations/puppet] - https://gerrit.wikimedia.org/r/145047 (owner: Hashar)
[09:33:44] (PS4) Hashar: zuul: introduce 'zuul' system user [operations/puppet] - https://gerrit.wikimedia.org/r/145278
[09:33:51] (PS1) Springle: Prepare MariaDB 10 on labsdb100[13] [operations/puppet] - https://gerrit.wikimedia.org/r/148620
[09:33:59] (PS1) Matanya: redis: qualify vars [operations/puppet] - https://gerrit.wikimedia.org/r/148621
[09:36:16] (CR) Alexandros Kosiaris: [C: 2 V: 2] zuul: introduce 'zuul' system user [operations/puppet] - https://gerrit.wikimedia.org/r/145278 (owner: Hashar)
[09:38:09] (PS2) Springle: Prepare MariaDB 10 on labsdb100[13] [operations/puppet] -
https://gerrit.wikimedia.org/r/148620
[09:39:03] (CR) Springle: [C: 2 V: 2] Prepare MariaDB 10 on labsdb100[13] [operations/puppet] - https://gerrit.wikimedia.org/r/148620 (owner: Springle)
[09:39:19] (PS2) Alexandros Kosiaris: admin: contint-admins can now sudo as 'zuul' [operations/puppet] - https://gerrit.wikimedia.org/r/145289 (owner: Hashar)
[09:39:33] (CR) Alexandros Kosiaris: [C: 2 V: 2] admin: contint-admins can now sudo as 'zuul' [operations/puppet] - https://gerrit.wikimedia.org/r/145289 (owner: Hashar)
[09:42:00] !log breaking zuul
[09:42:01] (PS4) Alexandros Kosiaris: zuul: switch to run as 'zuul' user BREAKING CHANGE [operations/puppet] - https://gerrit.wikimedia.org/r/145290 (owner: Hashar)
[09:42:04] Logged the message, Master
[09:42:56] (CR) Alexandros Kosiaris: [C: 2 V: 2] zuul: switch to run as 'zuul' user BREAKING CHANGE [operations/puppet] - https://gerrit.wikimedia.org/r/145290 (owner: Hashar)
[09:43:36] !log zuul changing file ownership on gallium for /srv/ssd/zuul/git from jenkins:root to zuul:zuul
[09:43:40] Logged the message, Master
[09:45:05] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Epic puppet fail
[09:45:42] <_joe_> "Epic puppet fail"?
[09:47:05] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[09:47:35] (PS1) Chmarkine: planet -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148624 (https://bugzilla.wikimedia.org/53259)
[09:48:33] (PS3) Alexandros Kosiaris: zuul: switch installer from setuptools to pip [operations/puppet] - https://gerrit.wikimedia.org/r/145300 (owner: Hashar)
[09:49:56] (PS2) Chmarkine: planet -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148624 (https://bugzilla.wikimedia.org/53259)
[09:51:32] (CR) Hashar: "recheck" [operations/puppet] - https://gerrit.wikimedia.org/r/145300 (owner: Hashar)
[09:54:42] (CR) Alexandros Kosiaris: [C: 2 V: 2] "@ottomata, yeah, that bugs me too (/me hates pip/setuptools). But Antoine has made a pretty good case for it and plus it is not strictly p" [operations/puppet] - https://gerrit.wikimedia.org/r/145300 (owner: Hashar)
[09:57:12] !log Zuul migrated to zuul user :)
[09:57:16] Logged the message, Master
[10:03:52] akosiaris: regarding ruby 1.9 and puppetmaster, you can ask Zelkof probably he knows about ruby
[10:04:24] hashar: Oh, I know enough, I just want a guinea pig :-)
[10:04:47] _joe_: https://gerrit.wikimedia.org/r/#/c/148394/
[10:05:51] <_joe_> lol
[10:07:14] matanya: re so making Jenkins vote on puppet tabs.. I guess we want to have puppet-lenient to pass :]
[10:07:42] yes, just vote -1 if literal tabs found
[10:08:10] i.e.
http://puppet-lint.com/checks/hard_tabs/
[10:09:00] <_joe_> anyway, I think 90% of puppet-lint rules are braindead
[10:09:09] <_joe_> but it's rubyist conventions
[10:09:15] <_joe_> so I surrender logic
[10:09:48] I only have hard feelings against line has more than 80 characters
[10:10:50] we can disable rules in .puppet-lint.rc
[10:10:56] there is one at the root of the repo already
[10:11:18] so we can probably remove some rules like the lines being 80 chars
[10:11:50] <_joe_> hashar: my hatred goes way deeper
[10:12:09] <_joe_> I find the idea of vertically aligned => horrible and wasteful
[10:12:34] it is so much more readable this way
[10:12:45] PROBLEM - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 08:11:52 UTC
[10:12:52] ^ that is me
[10:13:33] (PS1) QChris: Fix typo when setting hive ports [operations/puppet/cdh] - https://gerrit.wikimedia.org/r/148628
[10:13:54] _joe_: i can demonstrate this easily: git blame manifests/role/cache.pp
[10:13:56] (PS2) Hashar: contint: install Zuul on all CI slaves [operations/puppet] - https://gerrit.wikimedia.org/r/141758
[10:14:44] matanya: git blame -w
[10:14:49] that should ignore whitespaces
[10:14:52] yes, that too
[10:15:15] apergos: https://gerrit.wikimedia.org/r/148094 can you make sure this is getting merged before the new dump gets created on Monday?
[10:15:38] ACKNOWLEDGEMENT - Puppet freshness on labsdb1004 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 08:11:52 UTC alexandros kosiaris Testing more postgresql connections
[10:15:47] ahhh puppet
[10:16:46] Duplicate declaration: Package[python-pip]
[10:16:47] lovely
[10:17:57] https://git.wikimedia.org/blob/operations%2Fpuppet.git/96269ed926b9e5bca8a8604ad879c4254fe9a529/manifests%2Frole%2Fcache.pp line 600 onward.
I bet you can't tell what relates to what from a quick glance
[10:20:26] (PS3) Hashar: contint: install Zuul on all CI slaves [operations/puppet] - https://gerrit.wikimedia.org/r/141758
[10:24:49] (PS4) Hashar: contint: install Zuul on all CI slaves [operations/puppet] - https://gerrit.wikimedia.org/r/141758
[10:28:13] (CR) Hashar: [C: 1] "PS2 is a rebase, Zuul manifests no more depends on a Jenkins user and thus no more depends on Jenkins package to be installed. That addre" [operations/puppet] - https://gerrit.wikimedia.org/r/141758 (owner: Hashar)
[10:31:13] (PS1) Chmarkine: svn -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148631 (https://bugzilla.wikimedia.org/53259)
[10:32:44] (PS2) Chmarkine: svn -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148631 (https://bugzilla.wikimedia.org/53259)
[10:33:40] <_joe_> the fact that we need 50 patches to support PFS everywhere tells you something about how disorganized our webservers setup is
[10:34:35] PROBLEM - mysqld processes on labsdb1001 is CRITICAL: PROCS CRITICAL: 2 processes with command name mysqld
[10:37:12] (CR) Alexandros Kosiaris: Fix stdlib's min() (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148391 (owner: Ori.livneh)
[10:37:29] euh
[10:37:47] (PS1) Hashar: zuul: run install after packages installation [operations/puppet] - https://gerrit.wikimedia.org/r/148633
[10:38:26] (CR) Hashar: "Tested on labs :)" [operations/puppet] - https://gerrit.wikimedia.org/r/148633 (owner: Hashar)
[10:38:30] ACKNOWLEDGEMENT - mysqld processes on labsdb1001 is CRITICAL: PROCS CRITICAL: 2 processes with command name mysqld Sean Pringle 2 processes during migration
[10:39:24] lunnnch time
[10:42:44] (CR) Alexandros Kosiaris: [C: 2] zuul: run install after packages installation [operations/puppet] - https://gerrit.wikimedia.org/r/148633 (owner:
Hashar)
[10:50:14] hoo, did you see my question about it yesterday shortly after you mentioned it?
[10:50:23] of course he's gone
[10:50:27] sigh
[10:51:47] apergos: he is in a meeting atm but should be back in 30 mins or so
[10:51:56] (CR) ArielGlenn: "Have you had a chance to test the change on a small testdb to make sure it does what you want? I ask because I have had troubles getting " [operations/puppet] - https://gerrit.wikimedia.org/r/148094 (owner: Hoo man)
[10:52:06] I have posted on the changeset, he will see that
[10:52:10] cool
[10:52:11] thx
[11:09:46] (PS1) Hedonil: lighttpd-starter: Update default settings to oversome some issues. Bug: 68431 hange-Id: I17ef551fd75dabb60c5c47b42eb7644109acad29i [operations/puppet] - https://gerrit.wikimedia.org/r/148637
[11:17:29] (PS1) Hashar: zuul: allow gearman access from merger [operations/puppet] - https://gerrit.wikimedia.org/r/148640
[11:19:49] (PS2) Hashar: zuul: allow gearman access from merger [operations/puppet] - https://gerrit.wikimedia.org/r/148640
[11:24:06] (CR) Hashar: [C: -1] "Puppet compilation against gallium.wikimedia.org : http://puppet-compiler.wmflabs.org/174/change/148640/html/gallium.wikimedia.org.html" [operations/puppet] - https://gerrit.wikimedia.org/r/148640 (owner: Hashar)
[11:25:34] (PS3) Hashar: zuul: allow gearman access from merger [operations/puppet] - https://gerrit.wikimedia.org/r/148640
[11:27:05] PROBLEM - check google safe browsing for mediawiki.org on google is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:27:55] RECOVERY - check google safe browsing for mediawiki.org on google is OK: HTTP OK: HTTP/1.1 200 OK - 3840 bytes in 0.086 second response time
[11:30:23] (PS4) Hashar: zuul: allow gearman access from merger [operations/puppet] - https://gerrit.wikimedia.org/r/148640
[11:30:47] _joe_: that puppet catalog compiler is really useful
[11:31:34] <_joe_> glad it is
[11:33:02] <_joe_> we should make it less
gerrit-specific and it may become more useful to others
[11:33:31] <_joe_> ok, lunchtime~
[11:34:15] (CR) Hashar: "used join(.., ' ') instead and that seems correct now. Result of catalog compiler at http://puppet-compiler.wmflabs.org/176/change/148640/" [operations/puppet] - https://gerrit.wikimedia.org/r/148640 (owner: Hashar)
[11:39:56] (CR) JanZerebecki: [C: 1] Add puppet module for a tor relay [operations/puppet] - https://gerrit.wikimedia.org/r/140948 (owner: Dzahn)
[11:51:42] (CR) JanZerebecki: "@Dzahn: Sent you the key in a signed mail. I updated my gpg keys expiry date earlier this year, update it from the key server." [operations/puppet] - https://gerrit.wikimedia.org/r/144994 (owner: JanZerebecki)
[11:53:37] (CR) Alexandros Kosiaris: "I" [operations/puppet] - https://gerrit.wikimedia.org/r/148386 (owner: BryanDavis)
[11:54:21] (CR) Alexandros Kosiaris: "I don't see why not merging this. The issue you describe obviously still exists, but this patch fixes a different issue anyway" [operations/puppet] - https://gerrit.wikimedia.org/r/148386 (owner: BryanDavis)
[11:57:38] (CR) Alexandros Kosiaris: [C: 2] svn -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148631 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[11:57:49] (CR) Alexandros Kosiaris: [V: 2] svn -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148631 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[12:02:48] (PS2) JanZerebecki: Give jzerebecki access to analytics data [operations/puppet] - https://gerrit.wikimedia.org/r/144994
[12:07:38] (CR) JanZerebecki: "PS2: Do not add me to statistics-privatedata-users as all the necessary data should be on the analytics slaves."
[operations/puppet] - https://gerrit.wikimedia.org/r/144994 (owner: JanZerebecki)
[12:14:56] (CR) coren: [C: -1] "Minor tweak to make (see comment)." (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148637 (owner: Hedonil)
[12:17:49] (CR) JanZerebecki: [C: 1] tendril -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148618 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[12:20:15] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: Puppet has 1 failures
[12:21:07] (CR) JanZerebecki: [C: 1] planet -- update cipher suite list to support PFS [operations/puppet] - https://gerrit.wikimedia.org/r/148624 (https://bugzilla.wikimedia.org/53259) (owner: Chmarkine)
[12:33:52] (PS1) Alexandros Kosiaris: svn.wikimedia.org uses apache::site [operations/puppet] - https://gerrit.wikimedia.org/r/148645
[12:38:16] RECOVERY - puppet last run on mw1078 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[12:42:19] !log upgraded gdnsd on gallium (used to lint operations/dns.git changes)
[12:42:25] Logged the message, Master
[12:43:23] (CR) Hashar: [C: 1] the last tab char in any .pp file !? [operations/puppet] - https://gerrit.wikimedia.org/r/148295 (owner: Dzahn)
[12:45:42] apergos: The script works for me on snapshot1003
[12:48:19] the bash builtin echo should do the right thing
[12:48:27] sh might (probably will) act differently
[12:48:33] might not support -e or -n
[12:48:34] or both
[12:49:35] (CR) Giuseppe Lavagetto: [C: -1] "Add removal of the old vhost" (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148645 (owner: Alexandros Kosiaris)
[12:50:35] (CR) Giuseppe Lavagetto: [C: 1] "Sorry, discard my previous comment. This is correct."
[operations/puppet] - https://gerrit.wikimedia.org/r/148645 (owner: Alexandros Kosiaris)
[12:51:01] hoo: -e and -n have been part of POSIX.1 for ages; if your shell builtin doesn't speak it, you can always explicitly use /bin/echo
[12:52:08] Coren: Don't think that's needed... we have a script using bash anyway so that should just work
[12:52:28] ahhh
[12:53:22] Yep; bash echo works. :-)
[12:57:40] hoo: well that's why I was asking about testing
[12:57:57] I can shove it through but you might find it does the Wrong Thing out of cron
[12:58:44] I've had to use /bin/echo explicitly in a couple things in fact
[12:58:58] (that last is to Coren)
[13:00:04] K4-713: Sir, Please deploy Fundraising (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T1300), the time has come. At your service
[13:00:35] what a polite bot! I'm sure we'll be looking back on this as the golden age for bot behavior, once skynet gets going...
[13:06:50] apergos: Let's risk it :S
[13:06:56] ok
[13:07:15] (PS2) ArielGlenn: Make use of new lines more consistent within wikidata json dumps [operations/puppet] - https://gerrit.wikimedia.org/r/148094 (owner: Hoo man)
[13:09:38] (CR) ArielGlenn: [C: 2] Make use of new lines more consistent within wikidata json dumps [operations/puppet] - https://gerrit.wikimedia.org/r/148094 (owner: Hoo man)
[13:12:31] (CR) Alexandros Kosiaris: svn.wikimedia.org uses apache::site (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/148645 (owner: Alexandros Kosiaris)
[13:12:40] all right, your next cron job will run with that version
[13:12:52] :)
[13:13:07] (PS2) Alexandros Kosiaris: svn.wikimedia.org uses apache::site [operations/puppet] - https://gerrit.wikimedia.org/r/148645
[13:17:17] (CR) Alexandros Kosiaris: [C: 2] svn.wikimedia.org uses apache::site [operations/puppet] - https://gerrit.wikimedia.org/r/148645 (owner: Alexandros Kosiaris)
[13:22:57] (CR) Ottomata: RT 7858: datasets
Apache and Puppet edits. (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/147226 (owner: 10Scottlee) [13:23:51] (03CR) 10Ottomata: [C: 032 V: 032] Fix typo when setting hive.exec.parallel.thread.number [operations/puppet/cdh] - 10https://gerrit.wikimedia.org/r/148616 (owner: 10QChris) [13:24:06] (03CR) 10Ottomata: [C: 032 V: 032] Fix typo when setting hive ports [operations/puppet/cdh] - 10https://gerrit.wikimedia.org/r/148628 (owner: 10QChris) [13:25:17] (03CR) 10Ottomata: Give jzerebecki access to analytics data (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/144994 (owner: 10JanZerebecki) [13:25:31] (03PS1) 10Alexandros Kosiaris: Specify owner/group/mode for apache::conf [operations/puppet] - 10https://gerrit.wikimedia.org/r/148652 [14:14:16] (03PS2) 10Hedonil: lighttpd-starter: Update default settings to oversome some issues. Bug: 68431 hange-Id: I17ef551fd75dabb60c5c47b42eb7644109acad29i [operations/puppet] - 10https://gerrit.wikimedia.org/r/148637 [14:14:18] (03PS1) 10Hedonil: lighttpd-starter: Update default settings to overcome some issues. Bug: 68431 moved default php to if condition [operations/puppet] - 10https://gerrit.wikimedia.org/r/148660 [14:15:48] (03Abandoned) 10Hedonil: lighttpd-starter: Update default settings to overcome some issues. Bug: 68431 moved default php to if condition [operations/puppet] - 10https://gerrit.wikimedia.org/r/148660 (owner: 10Hedonil) [14:19:23] !log upgraded php5 on mw1017 (test.wikipedia.org) deployment-apache0{1,2} (beta) to 5.3.10-1ubuntu3.13+wmf1 [14:19:27] Logged the message, Master [14:29:20] <_joe_> akosiaris: mw1017 should be ok, but do perform a puppet run just to be sure [14:30:15] PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: Puppet has 1 failures [14:32:12] (03PS1) 10Hedonil: lighttpd-starter: Update default settings to overcome some issues. 
Bug: 68431 updated parameters moved default php to if condition [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 [14:34:45] PROBLEM - Puppet freshness on db1006 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 12:33:39 UTC [14:42:01] (03PS5) 10Scottlee: RT 7858: datasets Apache and Puppet edits. [operations/puppet] - 10https://gerrit.wikimedia.org/r/147226 [14:42:56] (03CR) 10Ottomata: "Thanks looks good! I will merge and test this later today." [operations/puppet] - 10https://gerrit.wikimedia.org/r/147226 (owner: 10Scottlee) [14:47:37] (03PS3) 10JanZerebecki: Give jzerebecki access to analytics data [operations/puppet] - 10https://gerrit.wikimedia.org/r/144994 [14:48:15] RECOVERY - puppet last run on ruthenium is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [14:49:06] (03CR) 10JanZerebecki: "PS3: replace restricted with bastiononly" [operations/puppet] - 10https://gerrit.wikimedia.org/r/144994 (owner: 10JanZerebecki) [14:51:12] (03CR) 10JanZerebecki: Give jzerebecki access to analytics data (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/144994 (owner: 10JanZerebecki) [14:55:43] Reedy: You're doing the SWAT? [14:59:28] I love that jouncebot called K4 "Sir" [15:00:04] Reedy: Sir, Please deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T1500), the time has come. At your service [15:02:53] swat time [15:02:56] marktraceur: you should make jouncebot reply 'You're fired' if you say no to it :p [15:03:19] JohnLewis: Hmm. :-) [15:03:23] JohnLewis: Not sure I like that idea, I like the people who do deploys [15:03:41] * aude is not sir :) [15:03:46] Reedy: around? :) [15:04:25] if he's not, i'd be willing to do swat [15:04:29] marktraceur: oh okay :( [15:04:33] but need a minute to prepare my submodule patch [15:05:11] greg-g: Aye [15:05:18] yay! 
[15:05:26] marktraceur is now off the hook :) [15:05:29] Reedy: We were 2 seconds away from giving the job to marktraceur. :-) [15:05:33] DAMN IT [15:05:42] * Reedy makes a note to walk slower when taking the dog out [15:06:15] Reedy: I'm not even supposed to BE HERE Today [15:06:17] today. [15:06:23] Where is here? [15:06:28] * Reedy grins [15:06:55] I dunno. The office? This chair? San Francisco? Earth? [15:07:15] … you were not planning to be on Earth today? [15:07:16] (03CR) 10Hedonil: "Sry. Abandoned the first change." [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (owner: 10Hedonil) [15:08:04] <_joe_> and... hhvm is ready [15:08:11] ooooo [15:08:22] * aude wonders if wikidata is ready for hhvm [15:08:27] James_F: I've been expecting the abduction for some time now [15:08:32] aude: you can test on beta labs now! [15:08:41] <_joe_> aude: for the next couple of hours, until we try it in production, it burns to flames, and we revert :P [15:08:50] https://gerrit.wikimedia.org/r/#/c/148593 looks like it has already been done [15:08:54] <_joe_> greg-g: right, maybe we should use our packages in beta? [15:08:54] marktraceur: Aha. GLWT. [15:08:55] greg-g: wow [15:08:59] _joe_: yes please [15:09:14] phpunit fails but no idea if it's tests that are broken or what [15:09:26] <_joe_> greg-g: eheh lemme work on taking one JR to hhvm [15:09:32] * aude setting up labs vagrant thing to debug [15:09:42] <_joe_> greg-g: I'll bug ori or bd808 about that :) [15:09:57] * bd808 sees a mention [15:10:10] <_joe_> :) [15:10:20] (03CR) 10Reedy: "This seems to have been merged, but not deployed (not even staged on tin)?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148593 (https://bugzilla.wikimedia.org/67298) (owner: 10Brian Wolff) [15:10:23] * Reedy grumbles [15:10:32] bad gi11es [15:10:33] Reedy: There's a note about it in the SWAT section. 
[15:10:42] I saw him do it and admonished him already [15:10:42] _joe_: also, what did you mean by "hhvm is ready" and "for the next couple of hours, until we try it in production, it burns to flames, and we revert :P [15:10:45] " [15:10:56] RECOVERY - Unmerged changes on repository mediawiki_config on tin is OK: No changes to merge. [15:11:01] oi, I'm not the one who +2ed it [15:11:08] Oh, right, sorry [15:11:13] haha [15:11:16] sory too [15:11:19] !log reedy Synchronized wmf-config/InitialiseSettings.php: Remove flickrApiUrl from (duration: 00m 15s) [15:11:20] <_joe_> greg-g: the package is ready, with a version that should not crash [15:11:24] Logged the message, Master [15:11:26] It was late, I'm only on my one and a halfth cup of coffee [15:11:32] ah, I was confused, I haven't finished my coffee yet [15:12:34] Reedy: thanks for SWAT today [15:12:37] greg-g: to be honest; everything works until it goes to production :p [15:12:51] i don't know if all our jobs are running on beta [15:13:01] if not, we should make them run there [15:19:42] !log reedy Synchronized php-1.24wmf14/resources/Resources.php: Fixing forgotten OOUI messages (duration: 00m 15s) [15:19:48] Logged the message, Master [15:20:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:22:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:24:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:26:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:28:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:30:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:30:38] aude: Have you made a bump commit for 
https://gerrit.wikimedia.org/r/#/c/148656/ ? [15:31:52] yes [15:31:57] https://gerrit.wikimedia.org/r/#/c/148670/ [15:32:02] that's it [15:32:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:33:29] RECOVERY - Puppet freshness on db1006 is OK: puppet ran at Wed Jul 23 15:33:24 UTC 2014 [15:34:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:34:49] !log reedy Synchronized php-1.24wmf14/extensions/Wikidata: Fix css issue in entity suggester on Wikidata (duration: 00m 17s) [15:34:55] Logged the message, Master [15:35:04] * aude verifies [15:35:17] might need to touch and resync [15:35:28] possible [15:36:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:36:38] ok in debug mode [15:37:15] probably can touch extensions/ValueView/lib/jquery.ui/jquery.ui.suggester [15:37:22] or that entire folder [15:37:39] within Wikidata 'extension' [15:38:09] PROBLEM - Puppet freshness on cp4017 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 15:17:57 UTC [15:38:39] RECOVERY - Puppet freshness on cp4017 is OK: puppet ran at Wed Jul 23 15:38:32 UTC 2014 [15:38:56] !log reedy Synchronized php-1.24wmf14/extensions/Wikidata: touch (duration: 00m 15s) [15:39:00] Logged the message, Master [15:39:47] * aude try again [15:39:56] looks good [15:43:55] (03CR) 10Reedy: "count( array_intersect( 'all.dblist', 'private.dblist' ) ) === 0" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/145743 (https://bugzilla.wikimedia.org/67910) (owner: 10Legoktm) [16:15:05] (03PS1) 10Krinkle: diamond: Enable for 'cvn' project in labs [operations/puppet] - 10https://gerrit.wikimedia.org/r/148689 (https://bugzilla.wikimedia.org/68444) [16:25:37] (03PS1) 10Giuseppe Lavagetto: jobrunners: install the first hhvm jobrunner [operations/puppet] - 10https://gerrit.wikimedia.org/r/148695 
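Editor's note on Reedy's review comment at 15:43:55: the expression there asks that 'all.dblist' and 'private.dblist' have an empty intersection. A rough shell sketch of the same check follows. It assumes each dblist is a plain text file with one wiki database name per line, which the log does not confirm, and `dblists_disjoint` is a hypothetical helper name, not part of any Wikimedia tooling.

```shell
# Hypothetical helper (not from the log): succeed only when the two
# list files share no identical line.
# Assumption: one database name per line in each file.
dblists_disjoint() {
    # -F fixed strings, -x whole-line match, -q quiet,
    # -f read the patterns (one per line) from the second file.
    # grep exits 0 when it finds a shared line, so negate it.
    ! grep -Fxq -f "$2" "$1"
}

# Example call, using the file names quoted in the review comment:
# dblists_disjoint all.dblist private.dblist && echo "no overlap"
```

The helper only reports whether an overlap exists. Dropping `-q` from the grep call would print the shared names instead.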
[16:32:22] PROBLEM - puppet last run on ms-be1012 is CRITICAL: CRITICAL: Puppet has 1 failures [16:39:42] PROBLEM - Host platinum is DOWN: PING CRITICAL - Packet loss = 100% [16:44:02] RECOVERY - Host platinum is UP: PING OK - Packet loss = 0%, RTA = 1.11 ms [16:50:22] RECOVERY - puppet last run on ms-be1012 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [16:50:38] (03CR) 10Alexandros Kosiaris: [C: 04-1] wmflib: add apt_version() (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/148512 (owner: 10Ori.livneh) [16:54:18] greg-g: I'm squatting the 1000-1100 deployment window until you can confirm or deny that this is OK... [16:56:01] (03PS2) 10Tim Landscheidt: lighttpd-starter: Update default settings to overcome some issues [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [17:05:42] greg-g: I am going ahead and self-serving a deployment... [17:05:56] (03CR) 10Tim Landscheidt: "Please don't leave information about changes between patchsets in the commit message. When someone looks at the Git history in some month" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [17:11:51] (03CR) 10Gilles: "I think the syntax is just wrong, which would explain why it's not working on beta. Fix incoming." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/145132 (https://bugzilla.wikimedia.org/67525) (owner: 10Gergő Tisza) [17:11:55] !log awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: automatic translate workflow fix for Fundraising/ pages on meta.wmo (duration: 00m 04s) [17:11:59] Logged the message, Master [17:19:06] greg-g: done with deployment [17:21:01] (03PS1) 10Gilles: Fix reference thumbnail settings syntax [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148707 [17:21:16] (03CR) 10Gilles: "https://gerrit.wikimedia.org/r/148707" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/145132 (https://bugzilla.wikimedia.org/67525) (owner: 10Gergő Tisza) [17:22:41] (03CR) 10Tim Landscheidt: wmflib: add funcs requires_realm() and requires_ubuntu() (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/148422 (owner: 10Ori.livneh) [17:30:01] Hey all, chrismcmahon is the acting greg-g today. greg-g is offsite at a training. [17:31:02] in that case; chrismcmahon - a query I sent Greg coming your way :p [17:31:37] (03CR) 10Gergő Tisza: "Uhh that was stupid :( Sorry." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/145132 (https://bugzilla.wikimedia.org/67525) (owner: 10Gergő Tisza) [17:32:55] (03CR) 10Gergő Tisza: [C: 031] Fix reference thumbnail settings syntax [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148707 (owner: 10Gilles) [17:36:00] (03PS5) 10Ori.livneh: apache: add apache::mpm [operations/puppet] - 10https://gerrit.wikimedia.org/r/148542 [17:36:11] (03CR) 10Ori.livneh: [C: 032 V: 032] apache: add apache::mpm [operations/puppet] - 10https://gerrit.wikimedia.org/r/148542 (owner: 10Ori.livneh) [17:38:25] !log launched a script on ms-fe1001 to collect thumb stats, no impact expected [17:38:30] Logged the message, Master [17:41:16] (03PS1) 10John F. 
Lewis: beta: Use grey logo for Beta Wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148709 [17:41:37] chrismcmahon ^^ [17:41:59] JohnLewis: Usually we set stuff to $stdLogo and then upload a Wiki.png locally [17:42:25] hoo: But can we do that on a wiki with uplaods disabled? :p [17:42:32] *uploads [17:42:35] good point here [17:42:36] :D [17:42:47] (03CR) 10Cmcmahon: [C: 031] "change is only for beta labs, a new logo for Wikidata" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148709 (owner: 10John F. Lewis) [17:43:53] (03CR) 10Hoo man: [C: 032] beta: Use grey logo for Beta Wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148709 (owner: 10John F. Lewis) [17:44:00] (03Merged) 10jenkins-bot: beta: Use grey logo for Beta Wikidata [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148709 (owner: 10John F. Lewis) [17:44:21] Coren or anyone, would you +2 https://gerrit.wikimedia.org/r/#/c/148709/ ? Trivial change for beta labs only. 
[17:44:35] chrismcmahon: hoo beat you :p [17:44:44] oh good [17:45:13] PROBLEM - check configured eth on platinum is CRITICAL: Connection refused by host [17:45:13] !log hoo Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 07s) [17:45:13] PROBLEM - DPKG on platinum is CRITICAL: Connection refused by host [17:45:19] Logged the message, Master [17:45:22] PROBLEM - puppet last run on platinum is CRITICAL: Connection refused by host [17:45:22] PROBLEM - RAID on platinum is CRITICAL: Connection refused by host [17:45:26] trivial thing [17:45:52] PROBLEM - Disk space on platinum is CRITICAL: Connection refused by host [17:45:52] PROBLEM - puppet disabled on platinum is CRITICAL: Connection refused by host [17:46:12] PROBLEM - check if dhclient is running on platinum is CRITICAL: Connection refused by host [17:47:13] (03PS1) 10EBernhardson: Prevent warning from logging call [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148710 [17:57:32] PROBLEM - puppet last run on strontium is CRITICAL: CRITICAL: Puppet has 3 failures [17:58:22] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: Puppet has 2 failures [17:58:23] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: Puppet has 2 failures [17:58:23] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: Puppet has 1 failures [17:58:23] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: Puppet has 1 failures [17:58:32] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 2 failures [18:00:04] yurik: Sir, Please deploy Wikipedia Zero (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T1800), the time has come. 
At your service [18:01:12] PROBLEM - puppet last run on nickel is CRITICAL: CRITICAL: Epic puppet fail [18:01:23] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 1 failures [18:03:12] RECOVERY - check if dhclient is running on platinum is OK: PROCS OK: 0 processes with command name dhclient [18:03:12] RECOVERY - check configured eth on platinum is OK: NRPE: Unable to read output [18:03:12] RECOVERY - DPKG on platinum is OK: All packages OK [18:03:22] RECOVERY - puppet last run on platinum is OK: OK: Puppet is currently enabled, last run 1203 seconds ago with 0 failures [18:03:23] RECOVERY - RAID on platinum is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0 [18:03:52] RECOVERY - Disk space on platinum is OK: DISK OK [18:03:52] RECOVERY - puppet disabled on platinum is OK: OK [18:04:08] thanks for the reminders about my non-presence, bd808 [18:05:12] greg-g: np. I saw people shouting your name into the darkness :) [18:05:46] it's a lonely place [18:08:26] (03PS1) 10QChris: Add aliases for analytics cluster [operations/dns] - 10https://gerrit.wikimedia.org/r/148714 [18:10:48] (03CR) 10QChris: "I have no clue about our dns setup, so please be extra" [operations/dns] - 10https://gerrit.wikimedia.org/r/148714 (owner: 10QChris) [18:11:13] PROBLEM - check configured eth on platinum is CRITICAL: eth1 reporting no carrier. 
[18:14:22] RECOVERY - puppet last run on mw1215 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [18:14:32] RECOVERY - puppet last run on strontium is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [18:15:22] RECOVERY - puppet last run on mw1027 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [18:15:32] RECOVERY - puppet last run on mw1090 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [18:15:34] (03PS1) 10Hoo man: Set otherProjectsLinksByDefault for kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148719 [18:16:22] RECOVERY - puppet last run on mw1107 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [18:16:23] RECOVERY - puppet last run on lvs3003 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [18:16:47] (03CR) 10Aaron Schulz: [C: 031] jobrunners: install the first hhvm jobrunner [operations/puppet] - 10https://gerrit.wikimedia.org/r/148695 (owner: 10Giuseppe Lavagetto) [18:18:23] RECOVERY - puppet last run on cp4010 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [18:18:35] (03CR) 10Ori.livneh: [C: 031] jobrunners: install the first hhvm jobrunner [operations/puppet] - 10https://gerrit.wikimedia.org/r/148695 (owner: 10Giuseppe Lavagetto) [18:19:23] (03CR) 10jenkins-bot: [V: 04-1] Set otherProjectsLinksByDefault for kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148719 (owner: 10Hoo man) [18:22:24] (03CR) 10Hoo man: "recheck" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148719 (owner: 10Hoo man) [18:22:27] wtf [18:24:39] (03PS1) 10Ori.livneh: apache::mod_conf: add explanatory comment [operations/puppet] - 10https://gerrit.wikimedia.org/r/148722 [18:24:41] (03PS1) 10Ori.livneh: wmflib: add safe_filename() [operations/puppet] - 10https://gerrit.wikimedia.org/r/148723 [18:25:04] 
(03CR) 10Ori.livneh: [C: 032 V: 032] "comment-only change" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148722 (owner: 10Ori.livneh) [18:25:12] PROBLEM - check if dhclient is running on platinum is CRITICAL: Connection refused by host [18:25:12] PROBLEM - DPKG on platinum is CRITICAL: Connection refused by host [18:25:22] PROBLEM - puppet last run on platinum is CRITICAL: Connection refused by host [18:25:22] PROBLEM - RAID on platinum is CRITICAL: Connection refused by host [18:25:52] PROBLEM - puppet disabled on platinum is CRITICAL: Connection refused by host [18:25:52] PROBLEM - Disk space on platinum is CRITICAL: Connection refused by host [18:27:03] please don't tell me jenkins died again :( [18:27:18] Reedy, ? [18:27:26] I've no idea [18:27:28] yurikSPB: https://integration.wikimedia.org/zuul/ looks fine. [18:27:52] James_F, it has been waiting on parsing tests for a while [18:28:28] yurikSPB: Hmm, yeah – https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-parser/28747/console is showing success. [18:28:38] No Krinkle, no hashar. [18:28:40] * James_F sighs. [18:29:06] lovely, i don't want to bypass it :( [18:29:27] (03CR) 10Ori.livneh: [C: 031] Specify owner/group/mode for apache::conf [operations/puppet] - 10https://gerrit.wikimedia.org/r/148652 (owner: 10Alexandros Kosiaris) [18:31:30] (03CR) 10Dzahn: [C: 031] Specify owner/group/mode for apache::conf [operations/puppet] - 10https://gerrit.wikimedia.org/r/148652 (owner: 10Alexandros Kosiaris) [18:32:38] !log Jenkins stalled [18:32:43] Logged the message, Master [18:33:04] (03PS2) 10Hoo man: Set otherProjectsLinksByDefault for kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148719 [18:33:36] hashar: Thanks! [18:33:58] !log Jenkins disabled and reenabled Gearman plugin. 
The jobs were no more registered in Zuul gearman server :-( [18:34:04] Logged the message, Master [18:35:40] !log yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 27s) [18:35:47] Logged the message, Master [18:36:02] dr0ptp4kt, ^ [18:36:02] I can't fix jenkins [18:36:05] will be back in half an hour [18:36:22] !log can't fix jenkins / zuul right now. Will be stalled for at least half an hour [18:36:24] yurikSPB: does that mean it's live on all servers? [18:36:27] Logged the message, Master [18:36:53] (03CR) 10Dzahn: [C: 032] " looks all done" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148295 (owner: 10Dzahn) [18:38:27] !log yurik Synchronized php-1.24wmf14/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 24s) [18:38:32] Logged the message, Master [18:38:46] dr0ptp4kt, both are up [18:39:35] yurikSPB: i see that. hey, the search button's text is getting escaped on http://en.zero.wikipedia.org/wiki/Special:ZeroRatedMobileAccess. the search works fine, but i think we may want to fix that :) [18:40:06] yurikSPB: i'm sure that's something you warned me about :) [18:40:33] sigh [18:40:40] dr0ptp4kt, want to do a q patch? [18:41:00] yurikSPB: yeah, i think it just needs to be wfmessage(), not wfmessage()->text(). hang on [18:41:58] dr0ptp4kt, doubt it, might be something more. How bad is it? worse rollback? [18:42:48] yurikSPB: the text() method expressly puts < and &rt; in the text, so ii'm pretty sure that's it. [18:43:04] dr0ptp4kt, i think you might want ->plain() or something else like that [18:43:14] yurikSPB: oh okay, hang on [18:44:27] yurikSPB: no, those two methods have the same body! [18:45:57] (03PS1) 10RobH: using a different server for labmon1001 [operations/dns] - 10https://gerrit.wikimedia.org/r/148730 [18:48:31] dr0ptp4kt, i will need to look. 
take a look at tail -f zero.log |grep "2014-07-23" [18:49:02] (03CR) 10RobH: [C: 032] using a different server for labmon1001 [operations/dns] - 10https://gerrit.wikimedia.org/r/148730 (owner: 10RobH) [18:49:45] Coren: so... new question [18:49:58] partitioning for 4 3TB disks [18:50:46] i have an existing recipe that is... [18:50:55] eww, these all suck. [18:51:20] RobH: What I'd need is two mirrored with the default; two striped untouched by the OS [18:51:46] ... basically the same deal as with the other setup. [18:51:54] yea, 'cept gpt so it has different recipes [18:51:59] so i gotta make one emulate that, heh [18:52:39] this does a / at 50GB and rest on two disks as lvm.. not quite there [18:53:59] ok back [18:54:02] fixing up jenkins [18:55:14] yurikSPB: i sort of wonder if the translations aren't updating [18:55:28] !log back. attempting to fix jenkins [18:55:29] dr0ptp4kt, translations might not be in yet [18:55:33] Logged the message, Master [18:55:43] dr0ptp4kt, plus in reality i should have run scap, but it takes sooo long :( [18:56:04] (03PS1) 10RobH: setting labmon1001 install stuff [operations/puppet] - 10https://gerrit.wikimedia.org/r/148734 [18:57:33] !log reenabled Gearman plugin in Jenkins. Jobs have been reregistered and seem to be proceeding again [18:57:38] Logged the message, Master [18:58:15] (03PS1) 10Yurik: Experimenting with GIF banner rendering for W0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148735 [18:58:48] (03PS2) 10Yurik: LABS: Experimenting with GIF banner rendering for W0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148735 [18:59:02] (03CR) 10Yurik: [C: 032] LABS: Experimenting with GIF banner rendering for W0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148735 (owner: 10Yurik) [18:59:05] yurikSPB: are you doing anything further on the deployment? 
[18:59:28] dr0ptp4kt, trying to get the font to work [18:59:37] might have to quick-deploy your patch later todya [19:00:05] mwalker, ejegg, awight: Sir, Please deploy CentralNotice (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T1900), the time has come. At your service [19:00:38] (03CR) 10Yurik: [V: 032] LABS: Experimenting with GIF banner rendering for W0 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148735 (owner: 10Yurik) [19:00:54] yurikSPB: cool. i mean, in all practicality either text() or escaped() should work. it seems that it's the translations that are missing. okay, i'll talk with you later this afternoon.... [19:01:10] yurikSPB: that is, the message cache isn't filled from what i see [19:01:10] dr0ptp4kt, ok. I think i should just run scap [19:01:20] i'll do it after deploying your patch a bit later [19:01:25] yurikSPB: okay, i'll be back, hit me on my cell if problems [19:01:36] dr0ptp4kt, could you look at the zero.log from above? [19:02:07] yurikSPB: i've been looking at tail -f zero.log | grep "2014-07-23" | grep -v "opera" | grep -v suwiki ...not too bad there [19:02:44] dr0ptp4kt, any thoughts why we have so much of it going through though? [19:02:53] without the -v grep [19:08:03] (03CR) 10RobH: [C: 032] setting labmon1001 install stuff [operations/puppet] - 10https://gerrit.wikimedia.org/r/148734 (owner: 10RobH) [19:21:27] Is jenkins gonna be ok? [19:21:39] it looks locked up... 
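Editor's note on the 19:02:07 message: the chain of three greps on zero.log can be folded into two, since `grep -v` accepts several `-e` patterns at once. The sketch below is only an illustration of that pipeline. `filter_zero_log` is a hypothetical name, and the file name and the "opera" and "suwiki" patterns are taken from the message itself.

```shell
# Hypothetical helper (not from the log): read log lines on stdin,
# keep those containing the given date string, and drop any line that
# mentions "opera" or "suwiki".
filter_zero_log() {
    day="$1"    # literal date string to keep, e.g. 2014-07-23
    # -F treats the date as a fixed string rather than a regex;
    # the second grep removes lines matching either -e pattern.
    grep -F -e "$day" | grep -v -e 'opera' -e 'suwiki'
}

# Following the live file as in the channel:
# tail -f zero.log | filter_zero_log "2014-07-23"
```

Running the same pipeline without the `grep -v` stage shows the full volume yurikSPB asks about at 19:02:53.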
[19:23:21] no wheels spinning and maxed-out pipeline load [19:27:19] !log awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s) [19:27:24] Logged the message, Master [19:27:56] (03CR) 10Gage: [C: 031] phab - configurable login message by auth type [operations/puppet] - 10https://gerrit.wikimedia.org/r/148564 (owner: 10Dzahn) [19:28:38] !log awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s) [19:28:43] Logged the message, Master [19:37:11] (03CR) 10BryanDavis: "akosiaris: Sure. It's broken now and it will be broken then but I could at least in theory manually register the apache nodes with salt." [operations/puppet] - 10https://gerrit.wikimedia.org/r/148386 (owner: 10BryanDavis) [19:39:12] PROBLEM - Puppet freshness on nickel is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 17:38:26 UTC [19:46:23] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [19:49:13] !log awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s) [19:49:34] !log awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s) [19:49:44] (03PS1) 10RobH: setup dns for actinium [operations/dns] - 10https://gerrit.wikimedia.org/r/148747 [19:50:06] (03PS1) 10Dzahn: contacts.wm - http->https redirect [operations/puppet] - 10https://gerrit.wikimedia.org/r/148748 [19:52:34] (03PS1) 10RobH: setting actinium install params [operations/puppet] - 10https://gerrit.wikimedia.org/r/148751 [19:58:23] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [19:58:31] hashar: seems like zuul aint doing anything, you workin on this? 
[19:58:40] everything shows queued [19:59:22] (03PS1) 10Ori.livneh: beta cluster: use luastandalone [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) [19:59:49] or anyone else who can look at why zuul is all queued and nothign is happening? [20:00:04] gwicke, subbu, cscott: Sir, Please deploy Parsoid (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T2000), the time has come. At your service [20:01:11] ah shoot; awight are you done? I thought we had two hours; but parsoid is supposed to start right now [20:01:52] mwalker: yep i noticed that [20:01:59] looks like we're done! [20:05:00] RobH: yes I am [20:05:43] (03PS1) 10Ori.livneh: apache::monitoring: add diamond support; ensure mod_status is enabled [operations/puppet] - 10https://gerrit.wikimedia.org/r/148755 [20:06:52] hashar: awesome, i just wanted to make sure that you were aware =] (i saw you had been working on it earlier) [20:07:01] and its moving, woooo [20:07:09] on that note, im going to go get lunch [20:07:09] RobH: yeah something is terribly broken in Zuul and I want to figure it out :] [20:07:16] thanks for thenotification [20:07:20] good luck! [20:07:21] (03PS2) 10BryanDavis: beta cluster: use luastandalone [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) (owner: 10Ori.livneh) [20:09:47] gwicke, subbu, cscott: I've got a patch for mediawiki-config to fix problems in beta that I'd like to pull on tin when it's safe. It's just to CommonSettings-labs.php so I don't need to sync. [20:10:07] (03CR) 10Ori.livneh: [C: 04-1] mediawiki: use mods-enabled, prepare for HAT (031 comment) [operations/puppet] - 10https://gerrit.wikimedia.org/r/148099 (owner: 10Giuseppe Lavagetto) [20:10:21] bd808, parsoid deploy only syncs stuff from /srv/deployment/parsoid/deploy onto the cluster. 
[20:10:44] subbu: I though so, but I didn't want to step on anyone's toes [20:10:57] meanwhile waiting for zuul to merge the deploy patch. [20:11:04] thanks for checking. :) [20:11:46] awight, mwalker: are you guys done in /a/common on tin? I've got a pull-only change to deploy there to fix beta. [20:12:32] It's like a party on tin right now. So many active sessions. :) [20:12:56] RoanKattouw: there are kernel stack traces in cp1045's dmesg [20:13:03] Ouch [20:13:07] Well that would explain things [20:13:12] Is it the only backend that has that problem? [20:13:52] not sure, checking [20:14:03] !log contacts.wm - set $base_url in default/settings.php to https URL, and $is_https='on' in bootstrap.inc (unpuppetized?) [20:14:08] Logged the message, Master [20:14:12] bd808: yes all done, thanks! [20:14:21] sweet. [20:14:39] RoanKattouw: nope, seeing 503s on cp1058 too [20:14:53] (03CR) 10BryanDavis: [C: 032] beta cluster: use luastandalone [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) (owner: 10Ori.livneh) [20:15:00] gwicke, RoanKattouw i will hold on deploy till this is resolved? [20:15:05] s/on/off [20:15:21] subbu: I'd say go ahead [20:15:31] it's unlikely to make it worse [20:17:10] ok .. looks like zuul is running the deploy patch tests now .. should be ready in couple mins. [20:17:47] Zuul looks to be busy busy today -- https://integration.wikimedia.org/zuul/ [20:18:45] yeah its is slowly catching up [20:19:00] we have too many changes coming in :-( [20:20:00] * subbu begins deploy [20:23:21] slow git deploy syncing day .. took a couple retries for fetch to complete .. now waiting on checkout. [20:24:19] checkout took a couple retries, but finished now. [20:25:24] (03CR) 10BryanDavis: [C: 031] "We are worst off between Tuesday and Thursday (wikipedias and sister projects running different MW branches). 
Here are some pretty picture" [operations/puppet] - 10https://gerrit.wikimedia.org/r/145397 (owner: 10Reedy) [20:27:19] !log Having no idea how to fix zuul. Restarting it and killing the whole queue :-/ [20:27:23] Logged the message, Master [20:28:03] poor zuul. [20:28:24] I am not sure why I thought I would find the root cause [20:28:47] hashar: strace? :) [20:28:47] hashar: Because you are an optimist? :) [20:28:51] haha [20:28:54] I guess so [20:28:59] I was supposed to get to bed early today :-( [20:29:09] lets retrigger a bunch of jobs yeah! [20:29:12] (03CR) 10BryanDavis: beta cluster: use luastandalone [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) (owner: 10Ori.livneh) [20:29:13] it's still early! [20:29:24] (03CR) 10BryanDavis: [C: 032] "trying again" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) (owner: 10Ori.livneh) [20:29:32] (03Merged) 10jenkins-bot: beta cluster: use luastandalone [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148754 (https://bugzilla.wikimedia.org/68413) (owner: 10Ori.livneh) [20:29:47] * bd808|deploy rushed to the head of the line [20:30:10] !log deployed parsoid version 47d4bc83 [20:30:16] Logged the message, Master [20:31:15] !log Updated /a/common to 07834a9 (beta cluster: use luastandalone); no sync needed [20:31:21] Logged the message, Master [20:31:35] ori: ^ [20:34:07] (03PS1) 10Dzahn: wikimedia.ee - own zonefile and set external MX [operations/dns] - 10https://gerrit.wikimedia.org/r/148762 [20:34:32] (03PS2) 10Dzahn: wikimedia.ee - own zonefile and set external MX [operations/dns] - 10https://gerrit.wikimedia.org/r/148762 [20:35:23] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [20:37:55] (03CR) 10Hashar: "recheck" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148723 (owner: 10Ori.livneh) [20:38:24] (03CR) 10Ottomata: [C: 031] "Dzahn is 
confirming the key, otherwise LGTM." [operations/puppet] - 10https://gerrit.wikimedia.org/r/144994 (owner: 10JanZerebecki) [20:40:25] (03Abandoned) 10coren: lighttpd-starter: Update default settings to oversome some issues. Bug: 68431 hange-Id: I17ef551fd75dabb60c5c47b42eb7644109acad29i [operations/puppet] - 10https://gerrit.wikimedia.org/r/148637 (owner: 10Hedonil) [20:42:54] (03PS3) 10Dzahn: wikimedia.ee - own zonefile and set external MX [operations/dns] - 10https://gerrit.wikimedia.org/r/148762 [20:43:37] mutante: not waiting for another reply on the ticket? [20:44:00] (re the last grrrit-wm msg) [20:47:44] jeremyb: might as well do the same thing on gerrit, if i could add him there [20:48:05] what else is he gonna say though [20:48:25] just did "dig MX" on that domain name he mentioned [20:48:26] he could give actual specific MX records [20:48:38] well, yea, but we can look them up [20:48:48] the domain name is the domain name for a webhost [20:48:51] AFAICT [20:49:07] yes, he called it their webhost too [20:49:47] i dont speak .ee but it looks very much like one [20:49:49] also, think about the situation with faidon and OTRS. this is the reverse. what if the webhost changes what MX this domain should use [20:50:09] i'm not sure i know what you mean with "faidon and OTRS" [20:50:15] i mean RT 7726 [20:50:46] that is related?? [20:50:55] no. that's the opposite situation [20:51:00] he was interested in having OTRS queues? [20:51:11] idk, we didn't offer [20:51:45] i'm just thinking "situation where an MX host doesn't control the DNS for domains that use that MX" [20:51:49] and 7726 is an example [20:52:14] sounds like you have concerns ever doing this for chapters then [20:52:21] not a technical one [20:52:24] erm? no [20:53:38] it's irrelevant if it's a chapter. but imagine if someone looked up MX for godaddy in order to see where to send mail for a personal domain hosted at godaddy. would you expect that to work? [20:53:45] something _might_ change.. 
but isnt that always the case? [20:54:41] sure [20:55:36] please raise concerns on gerrit, that's why i made it :) [20:57:39] ok :-) [21:06:41] (03CR) 10Matanya: [C: 031] phab - configurable login message by auth type [operations/puppet] - 10https://gerrit.wikimedia.org/r/148564 (owner: 10Dzahn) [21:09:30] (03PS1) 10Physikerwelt: WIP: Draft for Mathoid role [operations/puppet] - 10https://gerrit.wikimedia.org/r/148836 [21:10:22] (03PS3) 10Hedonil: lighttpd-starter: Update default settings to overcome some issues - set server.event-handler = "linux-sysepoll" - increase server.max-connections = 300 - remove lighttpd worker processes - make php-fcgi handler optional [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) [21:17:17] !log Zuul is all good. It just receives too many patches :-] [21:17:22] Logged the message, Master [21:17:24] and off to sleep [21:17:56] bonsoir [21:19:04] (03CR) 10Dzahn: [C: 032] phab - configurable login message by auth type [operations/puppet] - 10https://gerrit.wikimedia.org/r/148564 (owner: 10Dzahn) [21:19:10] _joe_: still deploying that patch soon? [21:19:44] <_joe_> AaronSchulz: nope. Hhvm keeps crashing in labs and I thought we should wait for ori to be here [21:20:17] <_joe_> I don't see any added value in having it running in production since we're constantly crashing it in labs :) [21:20:37] I have a question concerning the RT system [21:20:55] If a ticket is owned by nobody what does that mean? [21:21:07] <_joe_> we have beta running on what should be our production package - until that does not crash every other minute, we should wait for any production deploy [21:21:42] <_joe_> AaronSchulz: or - if the JR don't use luasandbox at all, we could just go on and see if we encounter other issues [21:21:56] is beta crashing with runners or servers? I guess if it's Lua stuff it could still be a problem [21:22:04] physikerwelt: assuming it was an ticket created via email? 
I believe that's the default, there hasn't been a formal assigned ops person [21:22:22] physikerwelt: but I don't believe that is used strictly in all the queues [21:22:28] _joe_: we could run lots of jobs without touched Lua, but not some of the cirrus jobs nor refreshLinks [21:24:00] chasemp: So it also means that nobody feels responsible for the ticket and most probably nothing will happen? [21:24:10] <_joe_> AaronSchulz: mh, well, let's wait for ori [21:25:28] physikerwelt: I would say that is not a charitable or accurate interpretation, but the topic for this chat indicates the RT person on duty who would be the one to ask, or update the ticket asking for status [21:25:37] (03PS2) 10QChris: Add aliases for analytics cluster [operations/dns] - 10https://gerrit.wikimedia.org/r/148714 [21:26:53] chasemp: OK. So it's OK to ask for status without to annoy people. [21:27:52] (03PS1) 10Dzahn: fix template path for phabricator login message [operations/puppet] - 10https://gerrit.wikimedia.org/r/148839 [21:28:51] (03CR) 10Dzahn: [C: 032] fix template path for phabricator login message [operations/puppet] - 10https://gerrit.wikimedia.org/r/148839 (owner: 10Dzahn) [21:30:29] (03CR) 10Dzahn: [V: 032] fix template path for phabricator login message [operations/puppet] - 10https://gerrit.wikimedia.org/r/148839 (owner: 10Dzahn) [21:31:58] (03CR) 10QChris: Add aliases for analytics cluster (031 comment) [operations/dns] - 10https://gerrit.wikimedia.org/r/148714 (owner: 10QChris) [21:32:10] (03PS2) 10Physikerwelt: WIP: Draft for Mathoid role [operations/puppet] - 10https://gerrit.wikimedia.org/r/148836 [21:33:33] ok if anybody is interested in mathoid please let me know [21:34:14] _joe_: i agree w/aaron that there would still be quite a lot of value in having a jobrunner on HHVM even if it's using luastandalone [21:34:23] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [21:34:42] we want to be as proactive as 
possible about gathering bugs and accumulating experience with different configuration values etc. [21:35:06] <_joe_> ori: ok so I'll amend the patch disabling the luasandbox extension from hhvm config - is that enough? [21:35:06] so it's not at all about having some kind of phony "mission accomplished" banner while ignoring problems [21:35:23] <_joe_> ori: I know both of you [21:35:24] _joe_: yes. well, there would need to be a wmf-config change as well but i can do that [21:35:27] <_joe_> :) [21:35:27] (03CR) 10coren: [C: 032] "Moar power!" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [21:35:41] <_joe_> I know it's not a flag but sometimes it's just inertia [21:35:57] <_joe_> ok let's go then [21:36:14] yay! i'll prep the wmf-config change [21:36:14] (03CR) 10coren: [V: 032] "Why did Jenkins only +2?" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [21:36:35] <_joe_> not sure I can follow through a lot tonight tbh [21:36:44] <_joe_> let's say I have ~ 1 hour left [21:36:58] _joe_: i'll watch it closely. worst case scenario, if it's absolutely exploding, i'll just shut it down, no? 
[21:37:06] <_joe_> yep [21:37:31] (03CR) 10Dzahn: "likely because hedonil isn't in the trusted users regex" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [21:39:33] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: Puppet has 1 failures [21:40:12] PROBLEM - Puppet freshness on nickel is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 17:38:26 UTC [21:41:13] Duplicate declaration: Apache::Mod_conf[php5] is already declared in file /etc/puppet/modules/apache/manifests/mpm.pp:40; cannot redeclare at /etc/puppet/modules/apache/manifests/mod.pp:37 [21:42:20] (03PS2) 10Giuseppe Lavagetto: jobrunners: install the first hhvm jobrunner [operations/puppet] - 10https://gerrit.wikimedia.org/r/148695 [21:42:43] <_joe_> ewww who on earth merged that? [21:43:13] is 00-dummy.conf something new from there too? [21:43:18] in that case it broke the phab's [21:43:26] and i was thinking i broke it with that other change.. [21:43:33] <_joe_> mutante: can't be [21:43:38] <_joe_> that file is empty [21:43:48] <_joe_> how can it break the apache config? [21:43:52] <_joe_> but I digress [21:44:17] which change did you mean [21:44:28] <_joe_> apache::mpm [21:44:38] is phab on trusty? [21:44:38] <_joe_> it needed testing. [21:44:46] <_joe_> ori: it is [21:44:47] did anything happen that would affect apache correctly parsing php files? 
[21:44:51] 00-dummy.conf [21:44:52] oops [21:44:56] http://legalpad.wikimedia.org/ [21:44:59] so that is the index.php [21:45:04] but it's...uhhh being dumped as text now [21:45:11] i see the bug [21:45:13] one sec [21:45:14] <_joe_> it's _not_ dummy.conf [21:45:32] <_joe_> ori: told ya [21:45:34] <_joe_> :) [21:45:42] yep, my posting of dummy.conf was an accident [21:45:44] it was just in my clipboard [21:46:08] <_joe_> chasemp: the fact is - both me and ori were unaware we had traditional mod_php on trusty [21:46:17] was chasing a red herring then [21:46:20] no no [21:46:27] it's a simpler mistake (and easier to fix) [21:46:29] (03PS1) 10Ori.livneh: Fix-up for Ic952146b5: check $mpm, not $selected_mod [operations/puppet] - 10https://gerrit.wikimedia.org/r/148843 [21:46:30] ^^ [21:46:37] <_joe_> ori: oh, ok [21:46:41] <_joe_> lol [21:46:59] though if we had gone for mpm_worker by default for trusty it would have been an issue, yeah [21:47:02] <_joe_> ori: lemme merge the JR change [21:47:09] ah yes that woul do it I suppose :) [21:47:10] <_joe_> we're running out of time [21:47:17] <_joe_> (mod_php in 2014 is lame) [21:47:53] (03CR) 10RobH: [C: 032] setup dns for actinium [operations/dns] - 10https://gerrit.wikimedia.org/r/148747 (owner: 10RobH) [21:47:55] (03PS1) 10Ori.livneh: Use luastandalone on HHVM [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148844 [21:47:58] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] jobrunners: install the first hhvm jobrunner [operations/puppet] - 10https://gerrit.wikimedia.org/r/148695 (owner: 10Giuseppe Lavagetto) [21:48:25] (03CR) 10RobH: [C: 032] setting actinium install params [operations/puppet] - 10https://gerrit.wikimedia.org/r/148751 (owner: 10RobH) [21:48:36] (03CR) 10Rush: [C: 031] "excellent, please to be making with the php" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148843 (owner: 10Ori.livneh) [21:48:38] (03CR) 10Ori.livneh: [C: 032 V: 032] Use luastandalone on HHVM 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148844 (owner: 10Ori.livneh) [21:49:01] * _joe_ running puppet [21:49:40] !log ori Synchronized wmf-config/CommonSettings.php: I2f366fa93: Use luastandalone on HHVM (duration: 00m 03s) [21:49:45] Logged the message, Master [21:50:13] (03CR) 10Ori.livneh: [C: 032 V: 032] Fix-up for Ic952146b5: check $mpm, not $selected_mod [operations/puppet] - 10https://gerrit.wikimedia.org/r/148843 (owner: 10Ori.livneh) [21:50:17] _joe_: assume the box will show up on http://ganglia.wikimedia.org/latest/?c=Jobrunners%20eqiad&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2 [21:50:41] RobH: ok to merge RobH: setting actinium install params (4d6208469f) ? [21:51:08] * ori assumes "yes". [21:51:21] <_joe_> AaronSchulz: eventually [21:51:29] <_joe_> when puppet stops failing :) [21:51:30] (03PS1) 10Yuvipanda: statistics: Add packages for rgdal [operations/puppet] - 10https://gerrit.wikimedia.org/r/148847 [21:51:35] (03PS1) 10Giuseppe Lavagetto: jobrunners: fix template [operations/puppet] - 10https://gerrit.wikimedia.org/r/148848 [21:52:00] <_joe_> AaronSchulz: ^^ [21:52:15] tries running puppet on nickel [21:52:41] (03CR) 10Aaron Schulz: [C: 031] jobrunners: fix template [operations/puppet] - 10https://gerrit.wikimedia.org/r/148848 (owner: 10Giuseppe Lavagetto) [21:52:52] RECOVERY - Puppet freshness on nickel is OK: puppet ran at Wed Jul 23 21:52:46 UTC 2014 [21:52:55] ori: yep, sorry, was in another screen [21:52:57] ori: that seems not to have fixed http://legalpad.wikimedia.org/? [21:52:59] and i got distracted, thx for merge [21:53:12] chasemp: what host is that on? did puppet run yet? 
[21:53:12] RECOVERY - puppet last run on nickel is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [21:53:14] (03PS2) 10Giuseppe Lavagetto: jobrunners: fix template [operations/puppet] - 10https://gerrit.wikimedia.org/r/148848 [21:53:33] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] jobrunners: fix template [operations/puppet] - 10https://gerrit.wikimedia.org/r/148848 (owner: 10Giuseppe Lavagetto) [21:53:37] radon.eqiad.wmnet and I ran it yep [21:53:57] (03PS1) 10EBernhardson: Enable hhvm hotprofiler [operations/debs/hhvm] - 10https://gerrit.wikimedia.org/r/148850 [21:54:19] chasemp: X-Cache: cp1043 hit (4) [21:54:42] varnish cached the bad page? :) [21:54:44] yep [21:54:52] (03PS2) 10EBernhardson: Enable hhvm hotprofiler [operations/debs/hhvm] - 10https://gerrit.wikimedia.org/r/148850 [21:54:52] anyone know how to clear it? [21:55:05] it fixed it on nickel (lucid) [21:55:19] talking about https://www.mediawiki.org/wiki/Requests_for_comment/Composer_managed_libraries_for_use_on_WMF_cluster in #wikimedia-office in 5 min [21:56:11] chasemp: you won't like it , but https://wikitech.wikimedia.org/wiki/Varnish#One-off_purges [21:56:22] PROBLEM - puppet last run on mw1053 is CRITICAL: CRITICAL: Epic puppet fail [21:56:42] <_joe_> ori: just hit a quite strange puppet problem, uhmmm [21:57:20] _joe_: i can't ssh to mw1053? [21:57:33] RECOVERY - puppet last run on mw1128 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [21:57:50] <_joe_> ori: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type alternatives::config at /etc/puppet/modules/mediawiki/manifests/jobrunner/hhvm.pp:28 [21:58:12] PROBLEM - Puppet freshness on db1009 is CRITICAL: Last successful Puppet run was Wed 23 Jul 2014 19:57:10 UTC [22:00:36] _joe_: it's because you don't have an init.pp i think [22:01:18] <_joe_> ori: I was pretty sure puppet 3 resolved this [22:01:26] pretty sure or totally sure? 
[22:01:28] PROBLEM - Host platinum is DOWN: PING CRITICAL - Packet loss = 100% [22:01:33] <_joe_> pretty :) [22:01:42] <_joe_> so you're probably right [22:02:23] chasemp: although the page was cached, the underlying problem is not in fact resolved: confirmed with "curl -H 'host: legalpad.wikimedia.org' radon.eqiad.wmnet" [22:02:29] chasemp: taking a look [22:02:30] <_joe_> ori: I'm a moron [22:02:38] _joe_: np, me too [22:02:48] * ori proposes being morons together [22:02:48] ;) [22:03:10] ok, talking about https://www.mediawiki.org/wiki/Requests_for_comment/Composer_managed_libraries_for_use_on_WMF_cluster in #wikimedia-office now [22:03:27] palladium is up for me [22:03:37] chasemp: ok, legalpad is up. i can explain what happened [22:05:15] (03PS1) 10Giuseppe Lavagetto: alternatives: correct module name [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 [22:06:16] heh [22:06:27] <_joe_> ori: told you, I'm a moron :) [22:07:08] <_joe_> ori: so, did you propagate wmf-config already? [22:07:28] (03PS1) 10Rush: phabricator => include apache::mod::php5 [operations/puppet] - 10https://gerrit.wikimedia.org/r/148858 [22:07:35] _joe_: yes. mw1053 will need the package 'lua5.1' too -- i forgot. though it's up to you whether you want to puppetize that or treat it as a one-off [22:07:46] (it's needed for luastandalone) [22:07:58] (03CR) 10Ori.livneh: [C: 031] phabricator => include apache::mod::php5 [operations/puppet] - 10https://gerrit.wikimedia.org/r/148858 (owner: 10Rush) [22:08:01] <_joe_> ori: lua is a dependency of luasandbox I guess? 
[22:08:11] <_joe_> if not, I'll install it by hand [22:08:12] (03CR) 10Rush: [C: 032 V: 032] phabricator => include apache::mod::php5 [operations/puppet] - 10https://gerrit.wikimedia.org/r/148858 (owner: 10Rush) [22:08:44] _joe_: i think luasandbox manages to depend on some -lib / -dev packages but doesn't require the self-standing lua interpreter [22:09:18] <_joe_> ok [22:09:53] <_joe_> ori: adding to the last patch I'd say [22:10:58] RECOVERY - puppet last run on mw1053 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [22:12:20] <_joe_> how's that possible? [22:13:14] (03PS2) 10Giuseppe Lavagetto: fixup for jobrunners [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 [22:13:32] _joe_: i think it mistakes puppet running as 'all ok' [22:13:46] so it's ok to be failing, not ok to have failed [22:14:15] <_joe_> btw that's what I deserve for not using the compiler before [22:14:28] <_joe_> I created it and I was too lazy to use it :( [22:14:58] PROBLEM - puppet last run on mw1053 is CRITICAL: CRITICAL: Epic puppet fail [22:15:13] (03PS3) 10Ori.livneh: fixup for jobrunners [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 (owner: 10Giuseppe Lavagetto) [22:15:20] _joe_: added trailing comma and reference to the bug # [22:15:33] <_joe_> eheh ok [22:15:35] <_joe_> merging [22:15:40] (03CR) 10Ori.livneh: [C: 031] fixup for jobrunners [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 (owner: 10Giuseppe Lavagetto) [22:15:44] (03PS4) 10Giuseppe Lavagetto: fixup for jobrunners [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 [22:15:57] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] fixup for jobrunners [operations/puppet] - 10https://gerrit.wikimedia.org/r/148856 (owner: 10Giuseppe Lavagetto) [22:16:22] this may be the day where i finally give someone in ops a heart attack [22:16:31] i've come close before [22:16:58] <_joe_> yes you did :P [22:17:07] (03CR) 10Dzahn: [C: 031] "i just talked with Ariel 
about fixing this issue for a while today.. there is still hope.." [operations/puppet] - 10https://gerrit.wikimedia.org/r/148386 (owner: 10BryanDavis) [22:18:43] <_joe_> ori: puppet is running... it will take time [22:19:04] _joe_: nod. are you logged in via console? [22:19:28] (03CR) 10Dzahn: "related: New in version 2014.7.0: Pass the minion config as a dictionary. .. we want that, so then we can read the minion confi" [operations/puppet] - 10https://gerrit.wikimedia.org/r/148386 (owner: 10BryanDavis) [22:19:37] <_joe_> ori: no via ssh [22:20:00] it's rejecting my key [22:20:22] <_joe_> mmmh [22:23:28] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [500.0] [22:24:21] _joe_: i had a stale entry in known_hosts [22:24:32] what's the 5xx alert about? not mw errors, it seems [22:24:49] <_joe_> ori: I an't look at that [22:25:05] i'm looking [22:28:00] went away [22:36:00] <_joe_> !log installed mw1053 as the first hhvm jobrunner, currently stopped. Puppet disabled so that it won't restart the jobrunner automatically [22:36:05] Logged the message, Master [22:38:18] _joe_: <3 i love you man [22:38:20] thank you! 
[22:38:48] <_joe_> ori: seems like everything ran fine apart from the starting of the jobrunner service [22:39:39] !log removed platinum from icinga [22:39:44] Logged the message, Master [22:40:06] <_joe_> which is expected, given the file is not there [22:40:28] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0] [22:40:28] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 2 failures [22:41:53] * ori nods [22:46:57] (03PS1) 10Giuseppe Lavagetto: hhvm: correct extensions path [operations/puppet] - 10https://gerrit.wikimedia.org/r/148872 [22:47:36] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] hhvm: correct extensions path [operations/puppet] - 10https://gerrit.wikimedia.org/r/148872 (owner: 10Giuseppe Lavagetto) [22:48:58] PROBLEM - puppet last run on mw1053 is CRITICAL: CRITICAL: Puppet has 1 failures [22:49:47] (03PS1) 10Ori.livneh: admin: fix my bash aliases [operations/puppet] - 10https://gerrit.wikimedia.org/r/148876 [22:50:07] the puppet run on tin shows me it's failing to set the salt grain for deployment [22:50:12] times out [22:50:59] RECOVERY - puppet last run on mw1053 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [22:51:33] <_joe_> ori: now it is really yours :) [22:51:43] * _joe_ off to bed [22:52:07] _joe_: thanks a bunch for everything [22:52:09] have a good rest [22:54:32] I'll do the SWAT [22:55:08] (03CR) 10Giuseppe Lavagetto: "Why does it needs to be enabled if it's a standard procedure? does it add overhead/debug symbols to the binary?" [operations/debs/hhvm] - 10https://gerrit.wikimedia.org/r/148850 (owner: 10EBernhardson) [22:59:21] greg-g, MaxSem, will you scap too? [22:59:34] messages need updating [22:59:38] what's up yurikSPB ? [23:00:04] RoanKattouw, mwalker, ori, MaxSem: Sir, Please deploy SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20140723T2300), the time has come. 
At your service [23:00:12] hi MaxSem, during my depl earlier - we had a few new messages that i didn't scap [23:00:19] only sync-dired [23:00:26] bad yurik:P [23:00:32] naturally ) [23:00:37] I'll see if I have time for a scap [23:00:44] thx MaxSem [23:01:00] okay, I'm starting [23:03:40] James_F, the patch you nominated for SWAT has test failures [23:04:31] MaxSem: Is that just the usual master vs. wmf14 drift? [23:04:40] dunno [23:04:57] you have devs, ask them:P [23:04:59] MaxSem: Oh, wait, no, that's just the usual jenkins-issues. [23:05:10] MaxSem: It's the CI cluster's inability to run multiple tests at once. [23:05:16] MaxSem: It's fine. :-) [23:05:31] {{somergeifitsfine}} [23:05:54] * James_F forces jenkins to go think about what it has done. [23:07:05] mutante: "22:34 andrewbogott: temporarily fixed puppet on tin by restarting salt-master and salt-minion. A proper fix would involve upgrading to a salt version that fixes https://github.com/saltstack/salt/issues/6306 " [23:07:28] mutante: that's what you're seeing, I think? [23:07:28] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [500.0] [23:11:09] andrewbogott: could be it, yea, that timeout looks like when it cant talk to salt-master [23:11:59] mutante: kicking salt seems to resolve it temporarily. Ryan suggests that we not actually upgrade salt until their next release comes out. [23:12:33] andrewbogott: i actually want the new release for a specific feature that might let us set those grains per puppet role again [23:12:52] 11:28 < mutante> New in version 2014.7.0: Pass the minion config as a dictionary. [23:13:09] I take it .7.0 will come out on the 31st? 
[23:13:28] 11:28 < Ryan_Lane> they're doing RCs right now [23:13:28] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [23:13:28] 11:29 < Ryan_Lane> 2-3 weeks [23:13:54] andrewbogott: i got that ^ when i asked how those versions related to the Ubuntu packages [23:14:03] ok [23:14:31] (03CR) 10EBernhardson: "Not sure why its a compile time option. The hotprofiler is the name internal to hhvm for their reimplementation of the xhprof php5 module" [operations/debs/hhvm] - 10https://gerrit.wikimedia.org/r/148850 (owner: 10EBernhardson) [23:14:46] andrewbogott: :) @ recovery , thx [23:15:01] mutante: did you restart salt? Or did it just recover on its own? [23:15:27] on its own... [23:15:32] thought you had done it [23:15:35] Ah, so maybe it was just coincidence [23:15:40] Nope, I didn't touch it this time [23:15:47] or unreliable monitoring [23:16:01] watches another run [23:17:08] RECOVERY - Puppet freshness on db1009 is OK: puppet ran at Wed Jul 23 23:17:05 UTC 2014 [23:18:07] RoanKattouw: sorry, should exist now [23:18:07] !log maxsem Synchronized php-1.24wmf14/extensions/MobileFrontend/: (no message) (duration: 00m 04s) [23:18:13] Logged the message, Master [23:18:21] jenkins made up some test errors, I didn't notice [23:19:01] !log maxsem Synchronized php-1.24wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 04s) [23:19:06] Logged the message, Master [23:21:25] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0] [23:22:08] MaxSem: [16:18] tgr RoanKattouw: sorry, should exist now [23:22:18] (03CR) 10EBernhardson: "also i checked with #hhvm, no fb staff responded but other users reported no noticed difference(woo anecdata)." 
[operations/debs/hhvm] - 10https://gerrit.wikimedia.org/r/148850 (owner: 10EBernhardson) [23:22:42] RoanKattouw: I'm seeing a lot of failed API requests in the parsoid logs [23:26:25] !log maxsem Synchronized php-1.24wmf14/extensions/VisualEditor/: (no message) (duration: 00m 04s) [23:26:31] Logged the message, Master [23:26:35] James_F, ^ [23:26:46] !log maxsem Synchronized php-1.24wmf14/extensions/MultimediaViewer/: (no message) (duration: 00m 03s) [23:26:49] tgr, ^ [23:26:50] Logged the message, Master [23:30:06] PROBLEM - Disk space on gallium is CRITICAL: DISK CRITICAL - free space: /var/lib/jenkins-slave/tmpfs 15 MB (2% inode=99%): [23:30:36] (03CR) 10Ori.livneh: [C: 04-1] Fix a couple warnings in beta (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148552 (owner: 10MaxSem) [23:32:23] !log maxsem Started scap: Pick up messages forgotten during Zero deployment [23:32:28] Logged the message, Master [23:32:30] yurikSPB, [23:32:44] MaxSem, thx [23:36:37] (03CR) 10Krinkle: "To avoid this being rendered as a multi-line subject line by git, put an empty line between the subject line and the body." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/148662 (https://bugzilla.wikimedia.org/68431) (owner: 10Hedonil) [23:38:26] PROBLEM - puppetmaster backend https on palladium is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 8141: HTTP/1.1 500 Internal Server Error [23:38:26] PROBLEM - puppet last run on db71 is CRITICAL: CRITICAL: Puppet has 3 failures [23:38:26] PROBLEM - puppet last run on ssl3001 is CRITICAL: CRITICAL: Puppet has 3 failures [23:38:35] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: Puppet has 8 failures [23:38:35] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: Puppet has 16 failures [23:38:35] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 23 failures [23:38:36] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: Puppet has 10 failures [23:38:45] PROBLEM - puppet last run on mw1104 is CRITICAL: CRITICAL: Puppet has 42 failures [23:38:55] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: Puppet has 14 failures [23:38:55] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: Puppet has 4 failures [23:38:55] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: Puppet has 9 failures [23:38:55] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: Puppet has 14 failures [23:38:55] PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: Puppet has 45 failures [23:38:56] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Puppet has 51 failures [23:38:56] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Puppet has 3 failures [23:38:56] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: Puppet has 45 failures [23:39:00] (03Abandoned) 10EBernhardson: Prevent warning from logging call [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/148710 (owner: 10EBernhardson) [23:39:05] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 32 failures [23:39:05] PROBLEM - puppet last run on 
virt1008 is CRITICAL: CRITICAL: Puppet has 8 failures [23:39:05] PROBLEM - puppet last run on mw1037 is CRITICAL: CRITICAL: Puppet has 2 failures [23:39:05] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: Puppet has 7 failures [23:39:06] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: Puppet has 39 failures [23:39:06] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: Puppet has 27 failures [23:39:06] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: Puppet has 20 failures [23:39:07] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: Puppet has 19 failures [23:39:07] PROBLEM - puppet last run on mw1154 is CRITICAL: CRITICAL: Puppet has 33 failures [23:39:08] PROBLEM - puppet last run on ssl1003 is CRITICAL: CRITICAL: Puppet has 25 failures [23:39:08] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: Puppet has 20 failures [23:39:09] PROBLEM - puppet last run on wtp1001 is CRITICAL: CRITICAL: Puppet has 2 failures [23:39:09] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 6 failures [23:39:15] PROBLEM - puppet last run on analytics1012 is CRITICAL: CRITICAL: Puppet has 20 failures [23:39:25] PROBLEM - puppet last run on mexia is CRITICAL: CRITICAL: Puppet has 12 failures [23:39:25] PROBLEM - puppet last run on cp4016 is CRITICAL: CRITICAL: Puppet has 27 failures [23:39:25] PROBLEM - puppetmaster https on palladium is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 8140: HTTP/1.1 500 Internal Server Error [23:39:25] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 18 failures [23:39:26] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: Puppet has 16 failures [23:39:26] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: Puppet has 15 failures [23:39:26] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 20 failures [23:39:27] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: Puppet 
has 27 failures [23:39:27] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Puppet has 15 failures [23:39:28] PROBLEM - puppet last run on zinc is CRITICAL: CRITICAL: Puppet has 18 failures [23:39:28] PROBLEM - puppet last run on search1003 is CRITICAL: CRITICAL: Puppet has 37 failures [23:39:29] PROBLEM - puppet last run on search1019 is CRITICAL: CRITICAL: Puppet has 47 failures [23:39:29] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 18 failures [23:39:35] PROBLEM - puppet last run on mw1073 is CRITICAL: CRITICAL: Puppet has 51 failures [23:39:35] PROBLEM - puppet last run on mw1018 is CRITICAL: CRITICAL: Puppet has 59 failures [23:39:35] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: Puppet has 44 failures [23:39:35] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 48 failures [23:39:35] PROBLEM - puppet last run on lanthanum is CRITICAL: CRITICAL: Puppet has 28 failures [23:39:36] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: Puppet has 21 failures [23:39:36] PROBLEM - puppet last run on mw1194 is CRITICAL: CRITICAL: Puppet has 60 failures [23:39:41] !log running sync-common on mw1053.eqiad.wmnet [23:39:45] PROBLEM - puppet last run on elastic1010 is CRITICAL: CRITICAL: Puppet has 17 failures [23:39:45] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Puppet has 18 failures [23:39:45] PROBLEM - puppet last run on mw1058 is CRITICAL: CRITICAL: Puppet has 57 failures [23:39:46] Logged the message, Master [23:39:55] PROBLEM - puppet last run on mw1015 is CRITICAL: CRITICAL: Puppet has 46 failures [23:39:56] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 46 failures [23:39:56] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: Puppet has 15 failures [23:39:56] PROBLEM - puppet last run on mw1019 is CRITICAL: CRITICAL: Puppet has 59 failures [23:39:56] PROBLEM - puppet last run on db1005 is CRITICAL: CRITICAL: Puppet has 22 failures 
[23:40:05] PROBLEM - puppet last run on search1008 is CRITICAL: CRITICAL: Puppet has 41 failures [23:40:05] PROBLEM - puppet last run on fenari is CRITICAL: CRITICAL: Puppet has 53 failures [23:40:05] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Puppet has 16 failures [23:40:05] PROBLEM - puppet last run on ms-be1014 is CRITICAL: CRITICAL: Puppet has 30 failures [23:40:06] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet has 22 failures [23:40:06] PROBLEM - puppet last run on mw1103 is CRITICAL: CRITICAL: Puppet has 59 failures [23:40:06] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: Puppet has 65 failures [23:40:06] PROBLEM - puppet last run on mw1199 is CRITICAL: CRITICAL: Puppet has 61 failures [23:40:07] PROBLEM - puppet last run on mw1020 is CRITICAL: CRITICAL: Puppet has 50 failures [23:40:07] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 66 failures [23:40:08] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: Puppet has 12 failures [23:40:09] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: Puppet has 21 failures [23:40:09] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: Puppet has 69 failures [23:40:15] PROBLEM - puppet last run on elastic1016 is CRITICAL: CRITICAL: Puppet has 29 failures [23:40:25] PROBLEM - puppet last run on cp4011 is CRITICAL: CRITICAL: Puppet has 24 failures [23:40:25] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 24 failures [23:40:25] RECOVERY - puppetmaster https on palladium is OK: HTTP OK: Status line output matched 400 - 335 bytes in 0.031 second response time [23:40:25] PROBLEM - puppet last run on tantalum is CRITICAL: CRITICAL: Puppet has 17 failures [23:40:26] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: Puppet has 22 failures [23:40:26] PROBLEM - puppet last run on virt0 is CRITICAL: CRITICAL: Puppet has 49 failures [23:40:26] PROBLEM - puppet last run on lvs3002 is CRITICAL: 
CRITICAL: Puppet has 20 failures [23:40:27] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 24 failures [23:40:27] PROBLEM - puppet last run on amssq50 is CRITICAL: CRITICAL: Puppet has 18 failures [23:40:27] PROBLEM - puppet last run on hydrogen is CRITICAL: CRITICAL: Puppet has 19 failures [23:40:28] PROBLEM - puppet last run on ms-be3004 is CRITICAL: CRITICAL: Puppet has 22 failures [23:40:28] PROBLEM - puppet last run on amssq52 is CRITICAL: CRITICAL: Puppet has 23 failures [23:40:35] PROBLEM - puppet last run on mw1083 is CRITICAL: CRITICAL: Puppet has 52 failures [23:40:35] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: Puppet has 23 failures [23:40:45] PROBLEM - puppet last run on mw1136 is CRITICAL: CRITICAL: Puppet has 58 failures [23:40:45] PROBLEM - puppet last run on db1019 is CRITICAL: CRITICAL: Puppet has 18 failures [23:40:45] PROBLEM - puppet last run on labsdb1002 is CRITICAL: CRITICAL: Puppet has 27 failures [23:40:45] PROBLEM - puppet last run on mw1070 is CRITICAL: CRITICAL: Puppet has 57 failures [23:40:45] PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: Puppet has 47 failures [23:40:46] PROBLEM - puppet last run on es1005 is CRITICAL: CRITICAL: Puppet has 17 failures [23:40:46] PROBLEM - puppet last run on mw1095 is CRITICAL: CRITICAL: Puppet has 60 failures [23:40:47] PROBLEM - puppet last run on lvs1006 is CRITICAL: CRITICAL: Puppet has 25 failures [23:40:47] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: Puppet has 57 failures [23:40:56] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: Puppet has 64 failures [23:40:56] PROBLEM - puppet last run on erbium is CRITICAL: CRITICAL: Puppet has 28 failures [23:40:57] ok, that was odd [23:41:05] PROBLEM - puppet last run on virt1009 is CRITICAL: CRITICAL: Puppet has 19 failures [23:41:05] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: Puppet has 16 failures [23:41:05] PROBLEM - puppet last run on es1003 is 
CRITICAL: CRITICAL: Puppet has 18 failures [23:41:05] PROBLEM - puppet last run on mw1094 is CRITICAL: CRITICAL: Puppet has 56 failures [23:41:06] PROBLEM - puppet last run on mw1101 is CRITICAL: CRITICAL: Puppet has 46 failures [23:41:06] PROBLEM - puppet last run on mw1085 is CRITICAL: CRITICAL: Puppet has 59 failures [23:41:06] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: Puppet has 63 failures [23:41:07] PROBLEM - puppet last run on search1009 is CRITICAL: CRITICAL: Puppet has 49 failures [23:41:07] PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: Puppet has 22 failures [23:41:08] PROBLEM - puppet last run on elastic1009 is CRITICAL: CRITICAL: Puppet has 19 failures [23:41:08] PROBLEM - puppet last run on cp4013 is CRITICAL: CRITICAL: Puppet has 29 failures [23:41:09] PROBLEM - puppet last run on db1058 is CRITICAL: CRITICAL: Puppet has 22 failures [23:41:09] PROBLEM - puppet last run on mw1102 is CRITICAL: CRITICAL: Puppet has 63 failures [23:41:15] PROBLEM - puppet last run on ssl1004 is CRITICAL: CRITICAL: Puppet has 20 failures [23:41:15] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: Puppet has 17 failures [23:41:15] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: Puppet has 59 failures [23:41:21] palladium https error then all this [23:41:24] then https clear [23:41:25] PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Puppet has 63 failures [23:41:25] PROBLEM - puppet last run on cp1057 is CRITICAL: CRITICAL: Puppet has 21 failures [23:41:25] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: Puppet has 18 failures [23:41:25] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Puppet has 30 failures [23:41:25] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 21 failures [23:41:26] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 60 failures [23:41:26] PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: Puppet has 66 failures [23:41:26] 
PROBLEM - puppet last run on amssq45 is CRITICAL: CRITICAL: Puppet has 19 failures [23:41:27] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Puppet has 22 failures [23:41:28] PROBLEM - puppet last run on mw1157 is CRITICAL: CRITICAL: Puppet has 69 failures [23:41:28] PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: Puppet has 23 failures [23:41:35] PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: Puppet has 49 failures [23:41:35] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: Puppet has 61 failures [23:41:35] PROBLEM - puppet last run on mw1127 is CRITICAL: CRITICAL: Puppet has 54 failures [23:41:43] is someone working on the puppetmaster? [23:41:45] PROBLEM - puppet last run on mc1008 is CRITICAL: CRITICAL: Puppet has 17 failures [23:41:51] i've said it before and i'll say it again, we should disable the alert [23:41:53] i ask cuz there are a ton of root logins [23:41:53] it does more harm than good [23:41:56] PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: Puppet has 61 failures [23:41:56] PROBLEM - puppet last run on mw1013 is CRITICAL: CRITICAL: Puppet has 54 failures [23:42:05] PROBLEM - puppet last run on wtp1021 is CRITICAL: CRITICAL: Puppet has 21 failures [23:42:05] PROBLEM - puppet last run on analytics1024 is CRITICAL: CRITICAL: Puppet has 12 failures [23:42:05] PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: Puppet has 19 failures [23:42:05] PROBLEM - puppet last run on mw1196 is CRITICAL: CRITICAL: Puppet has 57 failures [23:42:05] PROBLEM - puppet last run on virt1002 is CRITICAL: CRITICAL: Puppet has 15 failures [23:42:05] and i dunno wtf you folks are doing as root anyhow [23:42:06] PROBLEM - puppet last run on cp1051 is CRITICAL: CRITICAL: Puppet has 22 failures [23:42:06] PROBLEM - puppet last run on search1014 is CRITICAL: CRITICAL: Puppet has 38 failures [23:42:06] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: Puppet has 48 failures [23:42:07] PROBLEM - 
puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 56 failures [23:42:08] PROBLEM - puppet last run on es10 is CRITICAL: CRITICAL: Puppet has 10 failures [23:42:08] PROBLEM - puppet last run on mw1035 is CRITICAL: CRITICAL: Puppet has 60 failures [23:42:09] PROBLEM - puppet last run on analytics1019 is CRITICAL: CRITICAL: Puppet has 21 failures [23:42:09] PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: Puppet has 25 failures [23:42:15] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: Puppet has 20 failures [23:42:15] PROBLEM - puppet last run on mw1036 is CRITICAL: CRITICAL: Puppet has 60 failures [23:42:16] PROBLEM - puppet last run on tmh1002 is CRITICAL: CRITICAL: Puppet has 30 failures [23:42:17] RobH: salt-* [23:42:26] PROBLEM - puppet last run on ytterbium is CRITICAL: CRITICAL: Puppet has 30 failures [23:42:26] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 17 failures [23:42:26] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 24 failures [23:42:26] PROBLEM - puppet last run on db1007 is CRITICAL: CRITICAL: Puppet has 20 failures [23:42:26] PROBLEM - puppet last run on es7 is CRITICAL: CRITICAL: Puppet has 15 failures [23:42:27] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: Puppet has 21 failures [23:42:35] PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: Puppet has 12 failures [23:42:36] PROBLEM - puppet last run on cp1043 is CRITICAL: CRITICAL: Puppet has 22 failures [23:42:36] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 66 failures [23:42:36] PROBLEM - puppet last run on db1010 is CRITICAL: CRITICAL: Puppet has 18 failures [23:42:46] PROBLEM - puppet last run on zirconium is CRITICAL: CRITICAL: Puppet has 51 failures [23:42:46] PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Puppet has 18 failures [23:42:55] PROBLEM - puppet last run on es1006 is CRITICAL: CRITICAL: Puppet has 19 failures [23:42:56] PROBLEM 
- puppet last run on mw1216 is CRITICAL: CRITICAL: Puppet has 59 failures [23:42:56] PROBLEM - puppet last run on wtp1024 is CRITICAL: CRITICAL: Puppet has 23 failures [23:42:56] PROBLEM - puppet last run on mw1040 is CRITICAL: CRITICAL: Puppet has 53 failures [23:42:56] PROBLEM - puppet last run on mw1124 is CRITICAL: CRITICAL: Puppet has 50 failures [23:42:56] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: Puppet has 19 failures [23:43:05] PROBLEM - puppet last run on mw1192 is CRITICAL: CRITICAL: Puppet has 55 failures [23:43:05] PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 62 failures [23:43:05] PROBLEM - puppet last run on vanadium is CRITICAL: CRITICAL: Puppet has 21 failures [23:43:05] PROBLEM - puppet last run on pc1001 is CRITICAL: CRITICAL: Puppet has 27 failures [23:43:05] PROBLEM - puppet last run on mc1010 is CRITICAL: CRITICAL: Puppet has 21 failures [23:43:06] PROBLEM - puppet last run on es1009 is CRITICAL: CRITICAL: Puppet has 16 failures [23:43:06] PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Puppet has 59 failures [23:43:15] PROBLEM - puppet last run on mw1130 is CRITICAL: CRITICAL: Puppet has 66 failures [23:43:16] PROBLEM - puppet last run on logstash1003 is CRITICAL: CRITICAL: Puppet has 30 failures [23:43:16] PROBLEM - puppet last run on tarin is CRITICAL: CRITICAL: Puppet has 14 failures [23:43:16] PROBLEM - puppet last run on mw1147 is CRITICAL: CRITICAL: Puppet has 51 failures [23:43:16] PROBLEM - puppet last run on mc1009 is CRITICAL: CRITICAL: Puppet has 23 failures [23:43:25] PROBLEM - puppet last run on ms-fe1003 is CRITICAL: CRITICAL: Puppet has 25 failures [23:43:25] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Puppet has 19 failures [23:43:26] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: Puppet has 22 failures [23:43:26] PROBLEM - puppet last run on ssl3003 is CRITICAL: CRITICAL: Puppet has 32 failures [23:43:26] PROBLEM - puppet last run on cp3013 
is CRITICAL: CRITICAL: Puppet has 29 failures [23:43:35] PROBLEM - puppet last run on mw1062 is CRITICAL: CRITICAL: Puppet has 60 failures [23:43:35] PROBLEM - puppet last run on mw1067 is CRITICAL: CRITICAL: Puppet has 59 failures [23:43:35] PROBLEM - puppet last run on labsdb1001 is CRITICAL: CRITICAL: Puppet has 24 failures [23:43:35] PROBLEM - puppet last run on ms-be1013 is CRITICAL: CRITICAL: Puppet has 17 failures [23:43:35] PROBLEM - puppet last run on cp1059 is CRITICAL: CRITICAL: Puppet has 16 failures [23:43:36] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: Puppet has 60 failures [23:44:05] PROBLEM - puppet last run on mw1115 is CRITICAL: CRITICAL: Puppet has 59 failures [23:44:05] PROBLEM - puppet last run on rdb1004 is CRITICAL: CRITICAL: Puppet has 15 failures [23:44:06] PROBLEM - puppet last run on wtp1019 is CRITICAL: CRITICAL: Puppet has 24 failures [23:59:05] !log maxsem Finished scap: Pick up messages forgotten during Zero deployment (duration: 26m 42s) [23:59:11] Logged the message, Master [23:59:13] RECOVERY - puppet last run on fenari is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [23:59:13] RECOVERY - puppet last run on mw1094 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [23:59:13] RECOVERY - puppet last run on search1009 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [23:59:14] RECOVERY - puppet last run on search1021 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [23:59:14] RECOVERY - puppet last run on erbium is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [23:59:15] RECOVERY - puppet last run on db1058 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [23:59:15] RECOVERY - puppet last run on db1049 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [23:59:16] RECOVERY - puppet last run on cp1069 is OK: OK: 
Puppet is currently enabled, last run 8 seconds ago with 0 failures [23:59:16] RECOVERY - puppet last run on ssl1004 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [23:59:16] RECOVERY - puppet last run on elastic1016 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [23:59:17] RECOVERY - puppet last run on wtp1014 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [23:59:17] yurikSPB, please test ^ [23:59:18] RECOVERY - puppet last run on tmh1002 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [23:59:22] RECOVERY - puppet last run on mw1017 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [23:59:22] RECOVERY - puppet last run on cp1057 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [23:59:22] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [23:59:22] RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [23:59:22] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [23:59:23] RECOVERY - puppet last run on lvs4001 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [23:59:23] RECOVERY - puppet last run on db1024 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [23:59:24] RECOVERY - puppet last run on tantalum is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [23:59:24] RECOVERY - puppet last run on mw1096 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [23:59:24] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [23:59:25] RECOVERY - puppet last run on virt0 is OK: OK: Puppet is currently enabled, last run 
4 seconds ago with 0 failures [23:59:32] RECOVERY - puppet last run on amssq52 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [23:59:32] RECOVERY - puppet last run on ms-be3004 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [23:59:33] RECOVERY - puppet last run on mw1169 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [23:59:33] RECOVERY - puppet last run on mw1184 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [23:59:33] RECOVERY - puppet last run on mw1083 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [23:59:33] RECOVERY - puppet last run on ms-be1005 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [23:59:42] RECOVERY - puppet last run on cp1065 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [23:59:42] RECOVERY - puppet last run on mw1127 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [23:59:42] RECOVERY - puppet last run on mc1008 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [23:59:42] RECOVERY - puppet last run on mw1136 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [23:59:52] RECOVERY - puppet last run on analytics1034 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [23:59:52] RECOVERY - puppet last run on lvs1006 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [23:59:52] RECOVERY - puppet last run on mw1095 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [23:59:52] RECOVERY - puppet last run on mw1182 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures