[00:00:05] RoanKattouw, ^d, marktraceur, MaxSem: Respected human, time to deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141211T0000). Please do the needful. [00:00:11] still scaping - 60% cdbs is done [00:00:21] * the_nobodies bites yurikR [00:00:34] 94% [00:00:45] I'll do the swat, it's mobile stuff anyway [00:00:56] !log yurik Finished scap: ZeroBanner had some i18n changes, plus bits seems to be out of sync for it (duration: 20m 01s) [00:01:00] :) [00:01:02] Logged the message, Master [00:01:11] the_nobodies, ^ :)) [00:02:00] * bd808 guessed right on the duration and awards self a gold star 🌟 [00:03:21] :D [00:03:24] thanks bd808 ! [00:03:31] RoanKattouw, I put up all my SWAT changes. [00:03:35] There are two for Flow too. [00:03:44] Should I do the submodule bumps, or just leave that for the SWAT team this time? [00:03:50] bd808, 20:01 ! [00:04:04] superm401, please do the bumps [00:04:32] Will do [00:05:17] oh come the motherfuck on Zuul [00:08:10] the_nobodies: finally merged! [00:08:19] after I forced it [00:08:33] gotta disable stupid npm check [00:10:11] the_nobodies: wow, took 9 minutes to run the npm test [00:12:00] the_nobodies: lemme know when those patches are pushed to wmf11 on the cluster. Then I can test real quick. And then you can push that config change to activate it. [00:12:15] aha [00:12:59] !log maxsem Synchronized php-1.25wmf12/extensions/MobileFrontend/: (no message) (duration: 00m 08s) [00:13:06] Logged the message, Master [00:13:10] kaldari, I guess we can't test wmf12? ^^^ [00:13:35] the_nobodies: yeah, I'll go ahead and test that.... [00:13:45] RoanKattouw, the_nobodies, updated the deployments page. [00:13:56] To include the bumps instead of the extension repo commits. [00:14:03] thanks:) [00:14:04] the_nobodies: oh except we don't have any entries in the test database [00:14:39] the_nobodies: so you're right, no way to test :( [00:14:53] guess I should add some entries.... 
[00:15:03] !log maxsem Synchronized php-1.25wmf11/extensions/MobileFrontend/: (no message) (duration: 00m 06s) [00:15:08] kaldari, ^^ [00:15:08] Logged the message, Master [00:15:16] thanks. looking.... [00:23:20] the_nobodies: looks good so far. Leila is going to QA all the data recording and then let me know when it's OK to turn on for everyone. She says it will take a little bit (but less than half an hour) [00:31:39] !log maxsem Synchronized php-1.25wmf12/extensions/WikimediaEvents/: https://gerrit.wikimedia.org/r/#q,179018,n,z (duration: 00m 05s) [00:31:43] Logged the message, Master [00:31:49] superm401, ^^ [00:33:43] !log maxsem Synchronized php-1.25wmf11/extensions/Flow/: https://gerrit.wikimedia.org/r/#q,179020,n,z (duration: 00m 07s) [00:33:47] the_nobodies, thanks. There is one for wmf11 too. [00:33:47] Logged the message, Master [00:33:59] For WikimediaEvents. [00:34:21] !log maxsem Synchronized php-1.25wmf11/extensions/Flow/: https://gerrit.wikimedia.org/r/#q,179018,n,z (duration: 00m 07s) [00:34:24] Logged the message, Master [00:34:26] the_nobodies, are you MaxSem? [00:34:31] aha [00:35:07] !log maxsem Synchronized php-1.25wmf11/resources/Resources.php: https://gerrit.wikimedia.org/r/#/c/179014/ (duration: 00m 06s) [00:35:12] Logged the message, Master [00:35:17] superm401, and the last one for you ^^^^ [00:37:15] the_nobodies, I also requested https://gerrit.wikimedia.org/r/#q,179016,n,z [00:37:48] oh shit, I synced flow twice instead:P [00:38:35] !log maxsem Synchronized php-1.25wmf11/extensions/WikimediaEvents/: (no message) (duration: 00m 06s) [00:38:37] superm401, ^^^ [00:38:41] Logged the message, Master [00:38:42] Thanks. 
:) [00:40:19] (03CR) 10MaxSem: [C: 032] Enable MobileFrontend on wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/178990 (owner: 10MaxSem) [00:40:33] (03Merged) 10jenkins-bot: Enable MobileFrontend on wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/178990 (owner: 10MaxSem) [00:41:21] !log maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/178990/ (duration: 00m 05s) [00:41:27] Logged the message, Master [00:42:48] (03PS1) 10Ori.livneh: mediawiki: tidy `cleanup_cache` script [puppet] - 10https://gerrit.wikimedia.org/r/179027 [00:45:06] (03PS1) 10MaxSem: No mobile domain for wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179028 [00:45:26] (03CR) 10MaxSem: [C: 032] No mobile domain for wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179028 (owner: 10MaxSem) [00:45:38] (03Merged) 10jenkins-bot: No mobile domain for wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179028 (owner: 10MaxSem) [00:46:20] !log maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/179028/ (duration: 00m 05s) [00:46:25] Logged the message, Master [00:49:09] aude, https://www.wikidata.org/w/index.php?title=Q565&mobileaction=toggle_view_mobile [00:49:25] the_nobodies, could you scap at the end again? for some strange reason, one of the resources does not get properly populated - zero-interstitial-title gets sent to browser as [00:49:38] we have 11 minutes [00:49:54] jouncebot, next [00:49:55] In 15 hour(s) and 10 minute(s): Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141211T1600) [00:50:17] yurikR: have you double checked the l10n files? scap hasn't randomly broken l10n for quite a while [00:50:37] bd808, it works on beta [00:50:42] Or is it a RL thing? 
[00:50:49] might be RL [00:50:54] scap won't fix a bad RL cache [00:51:02] l10nupdate will [00:51:11] thx, good to know [00:51:40] The actual fix is at the end of l10nupdate, it runs a maintenance script to purge the RL cache [00:52:30] the_nobodies, i guess i could run it after you are done [00:53:13] greg-g: Are we deploying wmf13 to MW.org next week? [00:53:23] When I press "Login or Register" on https://phabricator.wikimedia.org, I reproducibly receive "Error: 503" (Service Unavailable at Thu, 11 Dec 2014 00:52:24 GMT), the cyan screen of WMF death. [00:54:08] (not logged-in in MediaWiki nor in Phabricator) [01:00:33] ... the MediaWiki login button is the one i press, not the blue one [01:00:49] !log maxsem Started scap: Noop, regenerating l10n cache for ZeroBanner [01:00:54] Logged the message, Master [01:01:27] yurikR, dr0ptp4kt ^^^ [01:02:01] the_nobodies, did you scap or l10nupdate? [01:03:00] nuria, still doesn't work. The version is based on $epoch (an integer) + revision. [01:03:25] $epoch + "10676430" === $epoch + 10676430 in PHP [01:03:35] superm401 [01:03:37] Beta isn't cached as aggressively so it's fine there. [01:03:39] http://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php?debug=false&lang=en&modules=schema.SendBeaconReliability&skin=vector&version=20131002T134030Z&* [01:03:42] aham [01:05:12] superm401: there is something i do not get here [01:05:44] How can i help? [01:07:00] superm401: the schema version is the one printed here, right? https://meta.wikimedia.org/wiki/Schema:SendBeaconReliability [01:07:26] nuria, it was, until I just updated it. Now production is pointing to the next oldest version. [01:08:25] superm401, ok, in this case i would expect the event to send: 10735916 [01:08:30] *event [01:08:57] nuria, the code tells it which revid to use. 
So updating the meta page alone does nothing: https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FWikimediaEvents.git/master/WikimediaEvents.php#L67 [01:09:14] That way you can work on documentation on the meta page, or even add a new field, without affecting production code until the PHP changes. [01:09:56] superm401: right, what i do not get is the epoch thing you were saying before, we should be sending just 10735916 [01:10:02] as rev-id for the schema [01:10:07] superm401: correct? [01:11:00] in the url [01:11:31] (03PS1) 10MaxSem: Revert "Enable MobileFrontend on wikidata.org" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179036 [01:11:36] nuria, there was an old bug about that: https://gerrit.wikimedia.org/r/#/c/111731/ [01:12:01] Basically, RL treats the version as max( $epoch, $revision ), so if $revision were less than the epoch, updating the revid would not invalidate anything. [01:12:37] nuria, but max( $epoch, $epoch + $revision ) is always $epoch + $revision so we don't need to worry about the epoch. [01:13:06] the_nobodies: So the analytics folks found an issue with the new WikiGrok data. I'm going to see about addressing it and tomorrow we will either do a SWAT deploy and turn it on, or hold off on the test until after the holidays. [01:13:17] (03PS1) 10MaxSem: Add mobile subdomains to wikidata.org [dns] - 10https://gerrit.wikimedia.org/r/179037 [01:15:23] ehh, I'd rather do it before quarterly reviews kaldari :) [01:15:54] me too [01:17:33] superm401: From what i can see that bug fixed the loading of schemas but how is that related to our problem? [01:17:53] nuria, the epoch part is not really key. [01:18:56] The main point is that the SendBeaconReliability part of the startup module did not change when I fixed it to be an int. 
[01:19:08] So the URL is exactly the same and old broken schema JS is still cached: [01:20:14] superm401, ah ok, very sorry, i think i get it now [01:22:09] superm401: because beta is not cached as aggressively we see it working there but in prod you need to update the "SendBeaconReliability part of the startup module" [01:23:26] nuria, yeah, exactly. If you go to https://bits.wikimedia.org/en.wikipedia.org/load.php?debug=false&lang=en&modules=startup&only=scripts you'll see: [01:23:28] ["schema.SendBeaconReliability","1380721230",["ext.eventLogging"]] [01:23:47] nuria, that value comes from: [01:23:49] wfTimestamp( TS_UNIX, '20130601000000' ) + 10676430; [01:23:52] Unfortunately: [01:24:00] wfTimestamp( TS_UNIX, '20130601000000' ) + '10676430' [01:24:04] is exactly the same in PHP. [01:24:48] nuria, that first value is the epoch, from https://git.wikimedia.org/blob/operations%2Fmediawiki-config/HEAD/wmf-config%2FCommonSettings.php#L962 [01:25:21] and ... superm401; do we change values on startup module by hand? [01:26:03] nuria, no, I'm just going to update the revid. [01:26:37] so the rev id is updated by hand in the module, there is no build process that does that, correct? [01:28:49] ^ superm401 [01:30:49] nuria, correct. [01:31:35] superm401, ok, i see, i understand now [01:33:43] nuria, up at https://gerrit.wikimedia.org/r/#/c/179040/1 [01:34:20] superm401, this has been very enlightening.... [01:34:46] !log maxsem Finished scap: Noop, regenerating l10n cache for ZeroBanner (duration: 33m 57s) [01:34:54] Logged the message, Master [01:35:58] didn't fix it [01:36:06] unless it's in varnish cache for the next 5 min [01:36:57] i wonder if that was a l10nupdate or scap [01:37:18] "Finished scap" [01:38:02] bd808, greg-g, i'm about to run l10nupdate [01:38:31] running ... [01:38:55] !log redeploy core fixes for wmf12 [01:39:00] Logged the message, Master [01:39:51] csteipp, are you deploying something? 
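[Editor's note: the version arithmetic superm401 walks nuria through above can be sketched outside the log. Python stands in for the PHP here; the key PHP behavior is that a numeric string is silently coerced to an integer in arithmetic, so the int fix did not change the computed version. The epoch value 1370044800 is derived from the log: 1380721230 - 10676430.]

```python
# Sketch of the ResourceLoader version value discussed above.
# In PHP, $epoch + '10676430' and $epoch + 10676430 are identical,
# because PHP coerces the numeric string before adding -- so the
# startup-module entry (and thus the cached URL) never changed.
epoch = 1370044800  # wfTimestamp( TS_UNIX, '20130601000000' )

version_with_int = epoch + 10676430
version_with_str = epoch + int("10676430")  # PHP performs this coercion implicitly

# Both yield the "1380721230" seen in the startup module payload:
assert version_with_int == version_with_str == 1380721230

# And since max(epoch, epoch + revision) is always epoch + revision,
# the old max()-based scheme's epoch pitfall no longer applies:
assert max(epoch, epoch + 10676430) == epoch + 10676430
```

The only cache-busting lever left is therefore bumping the revid in the PHP, which is exactly the fix superm401 puts up in gerrit.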
[01:40:01] (03CR) 10MaxSem: [C: 032] Revert "Enable MobileFrontend on wikidata.org" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179036 (owner: 10MaxSem) [01:40:10] (03Merged) 10jenkins-bot: Revert "Enable MobileFrontend on wikidata.org" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179036 (owner: 10MaxSem) [01:41:17] !log maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/179036 (duration: 00m 06s) [01:41:25] Logged the message, Master [01:41:45] grr, is max deploying without being logged into irc? [01:42:11] yurikR, I think he is the_nobodies [01:42:29] sigh [01:42:55] the_nobodies, i'm running the l10nupdate right now, and it's doing a lot of git pulling [01:43:06] I'm more omnipresent than ceiling cat [01:44:04] it's kinda worrying how much git pull/submodule update l10nupdate does [01:55:27] (03PS1) 10Springle: upgrade db1015 to trusty and mariadb 10 [puppet] - 10https://gerrit.wikimedia.org/r/179046 [01:56:27] (03CR) 10Springle: [C: 032] upgrade db1015 to trusty and mariadb 10 [puppet] - 10https://gerrit.wikimedia.org/r/179046 (owner: 10Springle) [01:56:38] !log yurik Synchronized php-1.25wmf11/cache/l10n: (no message) (duration: 00m 01s) [01:56:46] Logged the message, Master [01:56:51] !log LocalisationUpdate completed (1.25wmf11) at 2014-12-11 01:56:51+00:00 [01:56:54] Logged the message, Master [01:57:11] !log upgrade db1015 trusty [01:57:16] Logged the message, Master [02:04:29] nuria, requested a SWAT for tomorrow 1600 UTC. [02:06:33] superm401, all right, thank you! 
[02:08:30] !log l10nupdate Synchronized php-1.25wmf11/cache/l10n: (no message) (duration: 00m 01s) [02:08:34] !log LocalisationUpdate completed (1.25wmf11) at 2014-12-11 02:08:34+00:00 [02:08:40] Logged the message, Master [02:08:45] Logged the message, Master [02:09:15] (03PS1) 10Springle: repool db1015 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179050 [02:09:33] !log yurik Synchronized php-1.25wmf12/cache/l10n: (no message) (duration: 00m 01s) [02:09:36] Logged the message, Master [02:09:40] (03CR) 10Springle: [C: 032] repool db1015 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179050 (owner: 10Springle) [02:09:43] !log LocalisationUpdate completed (1.25wmf12) at 2014-12-11 02:09:43+00:00 [02:09:49] Logged the message, Master [02:10:38] !log springle Synchronized wmf-config/db-eqiad.php: repool db1015, warm up (duration: 00m 08s) [02:10:46] Logged the message, Master [02:12:00] !log l10nupdate Synchronized php-1.25wmf12/cache/l10n: (no message) (duration: 00m 01s) [02:12:04] !log LocalisationUpdate completed (1.25wmf12) at 2014-12-11 02:12:04+00:00 [02:12:05] Logged the message, Master [02:12:13] Logged the message, Master [02:12:38] bd808|BUFFER, looks like that l10nupdate script generates tons of warnings and error messages [02:13:26] e.g. This is nc from the netcat-openbsd package. An alternative nc is available [02:13:26] in the netcat-traditional package. [02:13:26] usage: nc [-46DdhklnrStUuvzC] [-i interval] [-P proxy_username] [-p source_port] [02:13:26] ... [02:13:38] lol [02:14:11] on the other hand, it has successfully done... something ) [02:18:01] hmm, no, still doing ... something (refreshing resource loader caches) [02:28:25] the_nobodies, do you know how long this thing usually runs? [02:28:48] does it even work? :P [02:34:50] (03CR) 10Krinkle: "Agree with Antoine. 
Unless ops will do it themselves or provide a convenient means to be notified of package updates with an easy way to a" [puppet] - 10https://gerrit.wikimedia.org/r/178806 (owner: 10Hashar) [02:36:11] the_nobodies, no idea - first it seemed like it pulled every git repo on the planet to some undisclosed location, followed by scary building of l10n files, successful syncing to various servers, then showing tons of yellow permission denied. [02:36:14] 02:09:32 ['/srv/deployment/scap/scap/bin/sync-common', '--no-update-l10n', '--include', 'php-1.25wmf12', '--include', 'php-1.25wmf12/cache', '--include', 'php-1.25wmf12/cache/l10n', '--include', 'php-1.25wmf12/cache/l10n/***', 'mw1010.eqiad.wmnet', 'mw1070.eqiad.wmnet', 'mw1161.eqiad.wmnet', 'mw1201.eqiad.wmnet'] on mw1183 returned [255]: Permission denied (publickey). [02:36:53] followed by tons more of green mw1258: 02:09:42 Updated 0 CDB files(s) in /srv/mediawiki/php-1.25wmf12/cache/l1 [02:36:53] l10n merge: 100% (ok: 378; fail: 0; left: 0) [02:37:12] now it has been sitting for the past 20 min in Refreshing ResourceLoader caches [02:37:22] doesn't sound very workey :| [02:37:39] i'm just hoping it will reset resource loader cache [02:37:45] but yeah, RL caches can take foreva [02:38:27] the_nobodies, try it - fake X-CS=TEST header, and navigate to http://en.m.wikipedia.org/wiki/Main_Page [02:38:42] click the external link at the bottom (CC-3.0-...) 
[02:38:56] In such cases I just touch resource files and resync, but I already did that for my scap, and it didn't help [02:38:58] for some reason, the l10n msg is not loading [02:39:10] even though it works fine on beta [02:39:11] (03PS1) 10Springle: assign db1004 to s7 [puppet] - 10https://gerrit.wikimedia.org/r/179055 [02:39:53] no idea what's going on, honestly [02:39:58] sigh [02:40:55] (03CR) 10Springle: [C: 032] assign db1004 to s7 [puppet] - 10https://gerrit.wikimedia.org/r/179055 (owner: 10Springle) [02:45:49] !log xtrabackup clone db1007 to db1004 [02:45:57] Logged the message, Master [02:57:22] !log git-deploy: Deploying integration/mediawiki-tools-codesniffer I602cb6cfe910fc0a [02:57:31] Logged the message, Master [03:00:12] Hm.. git-deploy always confuses me how it says only 0/2 or 1/2 completed fetch. Is the callback too early? If I keep pressing 'd' for details, eventually it reaches 2/2 [03:00:21] Looks broken to me [03:03:39] https://gist.github.com/Krinkle/cea42b20b46bf4ddbc7c [03:03:41] ori: [03:04:00] it's broken [03:04:04] the callback is too early [03:04:46] this is the worst thing about salt, i've complained about it. by default it is neither strictly synchronous nor asynchronous, but an unhelpful melange [03:05:07] roughly: "be synchronous unless things are taking too long in which case just switch to asynchronous" [03:06:18] i use this shell wrapper: https://github.com/wikimedia/operations-puppet/blob/production/modules/admin/files/home/ori/.hosts/tin#L44-52 [03:06:23] i typically run it twice, for good measure. [03:07:31] it calls 'git deploy finish' to spare you from having to do that yourself if someone forgot to finish a deploy, then checks out origin/master (unless --no-update is provided), then it syncs. [03:16:53] ori: since you've fought with it more than I have on sync-vs-async, do you know of a way around waiting on every other node in the system to time out when using grains as selectors? [03:17:52] e.g. 
I do salt -G 'cluster:foo' -t 30 cmd.run xxx, that picks 15/1000 hosts. If I set --verbose, it waits the full 30 seconds even if all 15 responded instantly, and then complains about all the other hosts. If I don't set --verbose, it doesn't even tell me if some of the selected 15 failed to respond. [03:18:15] greg-g: Done as https://www.mediawiki.org/w/index.php?title=MediaWiki_1.25/Roadmap&diff=1310914&oldid=1310030 – please revert if wrong. [03:34:38] !log ori Synchronized php-1.25wmf11/extensions/Math: Ic438b307a3b46: Fix for fatal caused by static call to MathRenderer::getError (duration: 00m 06s) [03:34:45] Logged the message, Master [03:37:13] (03PS2) 10Ori.livneh: mediawiki: tidy `cleanup_cache` script [puppet] - 10https://gerrit.wikimedia.org/r/179027 [03:37:37] bblack: wow, that's fucked up and lame. [03:37:46] i haven't encountered that. [03:38:10] we really need etcd [03:43:33] (03PS3) 10Ori.livneh: mediawiki: tidy `cleanup_cache` script [puppet] - 10https://gerrit.wikimedia.org/r/179027 [03:44:43] PROBLEM - Disk space on fluorine is CRITICAL: DISK CRITICAL - free space: /a 75451 MB (3% inode=99%): [03:55:07] (03PS1) 10Yuvipanda: wdq-mm: Request / not /ok [puppet] - 10https://gerrit.wikimedia.org/r/179061 [03:55:59] (03CR) 10Yuvipanda: [C: 032] wdq-mm: Request / not /ok [puppet] - 10https://gerrit.wikimedia.org/r/179061 (owner: 10Yuvipanda) [04:02:44] (03PS1) 10Yuvipanda: shinken: Add monitoring for wdq-mm project [puppet] - 10https://gerrit.wikimedia.org/r/179063 [04:03:45] (03CR) 10Yuvipanda: [C: 032] shinken: Add monitoring for wdq-mm project [puppet] - 10https://gerrit.wikimedia.org/r/179063 (owner: 10Yuvipanda) [04:08:33] !log LocalisationUpdate ResourceLoader cache refresh completed at Thu Dec 11 04:08:33 UTC 2014 (duration 30m 22s) [04:08:38] Logged the message, Master [04:08:50] !log LocalisationUpdate ResourceLoader cache refresh completed at Thu Dec 11 04:08:50 UTC 2014 (duration 8m 49s) [04:08:53] Logged the message, Master [04:37:20] (03PS1) 
10Ori.livneh: Add test for IdleConnectionMonitoringProtocol.run [debs/pybal] - 10https://gerrit.wikimedia.org/r/179065 [04:38:41] (03PS2) 10Ori.livneh: Provision HHVM source tree in /usr/src instead of /usr/local/src [puppet] - 10https://gerrit.wikimedia.org/r/176624 [04:41:04] (03PS3) 10Ori.livneh: Provision HHVM source tree in /usr/src instead of /usr/local/src [puppet] - 10https://gerrit.wikimedia.org/r/176624 [04:41:14] (03CR) 10Ori.livneh: [C: 032 V: 032] Provision HHVM source tree in /usr/src instead of /usr/local/src [puppet] - 10https://gerrit.wikimedia.org/r/176624 (owner: 10Ori.livneh) [04:58:33] There was an error collecting ganglia data (127.0.0.1:8654): fsockopen error: Connection timed out [04:59:26] OK (but slow) on reload [05:37:37] kart_: you know RT is going to go away and be replaced by phab really, really soon, right? [05:37:41] like in a week or so [05:39:34] PROBLEM - puppet last run on mw1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:40:02] PROBLEM - puppet last run on mw1050 is CRITICAL: CRITICAL: Puppet has 1 failures [05:40:03] PROBLEM - puppet last run on ms-be1008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:40:27] PROBLEM - puppet last run on cp1048 is CRITICAL: CRITICAL: Puppet has 1 failures [05:41:37] PROBLEM - puppet last run on lvs1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:41:47] PROBLEM - puppet last run on tungsten is CRITICAL: CRITICAL: Puppet has 1 failures [05:43:20] PROBLEM - puppet last run on wtp1015 is CRITICAL: CRITICAL: Puppet has 1 failures [05:43:28] PROBLEM - puppet last run on mw1201 is CRITICAL: CRITICAL: Puppet has 1 failures [05:43:48] PROBLEM - puppet last run on mw1010 is CRITICAL: CRITICAL: Puppet has 1 failures [05:43:49] PROBLEM - puppet last run on mw1022 is CRITICAL: CRITICAL: Puppet has 1 failures [05:44:43] PROBLEM - puppet last run on rbf1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:44:58] PROBLEM - puppet last run on mw1219 is CRITICAL: CRITICAL: Puppet has 1 failures 
[05:44:58] PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:44:59] PROBLEM - puppet last run on db2030 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:12] PROBLEM - puppet last run on cp3004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:12] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:21] PROBLEM - puppet last run on elastic1020 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:29] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:29] PROBLEM - puppet last run on mw1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:29] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:39] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:39] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:40] PROBLEM - puppet last run on ms-fe1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:40] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:42] do we care? 
^ [05:45:50] PROBLEM - puppet last run on cp1037 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:50] PROBLEM - puppet last run on stat1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:45:59] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:06] PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:07] PROBLEM - puppet last run on mw1236 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:07] PROBLEM - puppet last run on ms-be1011 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:07] PROBLEM - puppet last run on mw1139 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:19] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:28] PROBLEM - puppet last run on mw1154 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:28] PROBLEM - puppet last run on cp1044 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:38] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:41] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:42] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:42] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:42] PROBLEM - puppet last run on achernar is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:42] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:53] PROBLEM - puppet last run on mw1016 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:53] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:53] PROBLEM - puppet last run on mw1104 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:59] PROBLEM - puppet last run on mw1255 is CRITICAL: CRITICAL: Puppet has 1 failures [05:46:59] PROBLEM - puppet last run on db2028 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:07] (03PS1) 10Yuvipanda: wdq-mm: Add loadbalancer 
[puppet] - 10https://gerrit.wikimedia.org/r/179068 [05:47:08] PROBLEM - puppet last run on mw1193 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:08] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:19] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:19] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:19] PROBLEM - puppet last run on elastic1010 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:31] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:32] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:32] PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:33] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:38] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:40] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:41] greg-g: sounds like a transient storm [05:47:48] PROBLEM - puppet last run on mw1018 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:48] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:48] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:48] PROBLEM - puppet last run on ms-be2009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:49] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:49] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: Puppet has 1 failures [05:47:59] PROBLEM - puppet last run on search1008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:00] PROBLEM - puppet last run on mw1019 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:00] PROBLEM - puppet last run on mw1066 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:00] PROBLEM - puppet last run 
on mw1143 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:06] greg-g: I’ll investigate if there isn’t a recovery storm soon [05:48:13] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:13] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:20] PROBLEM - puppet last run on cp4009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:24] greg-g: apt-get failure. probably transient [05:48:29] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:33] * springle pokes around [05:48:38] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:39] PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:40] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:40] PROBLEM - puppet last run on elastic1025 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:42] springle: for labs we increased the apt-get timeout and our transient apt-get problems have gone away [05:48:48] PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:49] PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:52] PROBLEM - puppet last run on radon is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:52] PROBLEM - puppet last run on lanthanum is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:52] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: Puppet has 1 failures [05:48:52] PROBLEM - puppet last run on lvs2002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:00] well, gone away as in haven’t happened for about 4-5 days now, where before they used to happen every day [05:49:02] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:07] (03CR) 10Yuvipanda: [C: 032] wdq-mm: Add loadbalancer [puppet] - 10https://gerrit.wikimedia.org/r/179068 (owner: 10Yuvipanda) 
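[Editor's note: the labs fix YuviPanda mentions above, raising the apt-get timeout, is not shown in the log. A minimal apt configuration drop-in along these lines would do it; the file path and timeout value here are illustrative guesses, not the actual puppet change.]

```
// /etc/apt/apt.conf.d/99-timeout (hypothetical drop-in)
// Raise apt's HTTP fetch timeout so slow responses from the
// carbon apt server don't abort puppet-driven apt-get runs.
Acquire::http::Timeout "300";
```

Whether a longer client timeout helps depends on carbon eventually answering; springle's observation that the apt-to-carbon sockets are stuck in CLOSE_WAIT suggests these runs would recover on the next attempt either way.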
[05:49:12] PROBLEM - puppet last run on db2017 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:13] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:23] PROBLEM - puppet last run on amssq45 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:23] PROBLEM - puppet last run on lvs2005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:32] RECOVERY - puppet last run on ms-be1008 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [05:49:37] PROBLEM - puppet last run on amssq50 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:37] PROBLEM - puppet last run on amssq52 is CRITICAL: CRITICAL: Puppet has 1 failures [05:49:46] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:00] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:00] PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:01] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:02] PROBLEM - puppet last run on praseodymium is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:14] PROBLEM - puppet last run on db1058 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:30] apt > carbon sockets sitting in CLOSE_WAIT. 
presumably those will have to timeout then next run will recover [05:50:33] PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:49] PROBLEM - puppet last run on erbium is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:49] PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:49] PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: Puppet has 1 failures [05:50:49] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:00] PROBLEM - puppet last run on es1009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:02] PROBLEM - puppet last run on search1014 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:09] springle: bah, yeah, just as I said we haven’t had any apt-get timeouts we had one in labs as well. Assuming they all connect to carbon that makes sense [05:51:13] PROBLEM - puppet last run on uranium is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:20] PROBLEM - puppet last run on mw1083 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:24] PROBLEM - puppet last run on search1020 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:31] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:31] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:40] PROBLEM - puppet last run on mw1127 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:40] PROBLEM - puppet last run on mw1035 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:41] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:41] PROBLEM - puppet last run on ms-be2013 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:41] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:50] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:52] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: Puppet has 1 
failures [05:51:53] PROBLEM - puppet last run on mc1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:53] PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:54] PROBLEM - puppet last run on mw1094 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:54] PROBLEM - puppet last run on mw1221 is CRITICAL: CRITICAL: Puppet has 1 failures [05:51:54] PROBLEM - puppet last run on ms-be1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:00] PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:09] PROBLEM - puppet last run on db1007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:09] PROBLEM - puppet last run on mw1124 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:10] PROBLEM - puppet last run on mw1038 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:10] PROBLEM - puppet last run on cp1059 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:11] PROBLEM - puppet last run on es1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:11] PROBLEM - puppet last run on virt1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:19] PROBLEM - puppet last run on ytterbium is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:20] PROBLEM - puppet last run on amssq59 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:20] RECOVERY - puppet last run on mw1004 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:52:29] PROBLEM - puppet last run on mw1256 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:29] PROBLEM - puppet last run on elastic1031 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:30] PROBLEM - puppet last run on analytics1033 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:30] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:30] PROBLEM - puppet last run on logstash1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:30] PROBLEM - puppet last run on es1001 is CRITICAL: CRITICAL: Puppet has 1 
failures [05:52:30] PROBLEM - puppet last run on mw1216 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:54] RECOVERY - puppet last run on mw1050 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [05:52:55] PROBLEM - puppet last run on wtp1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on ms-be1013 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on mw1059 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on pc1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on ocg1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on dbproxy1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:55] PROBLEM - puppet last run on mw1005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:56] PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 1 failures [05:52:56] PROBLEM - puppet last run on mw1031 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on mc1009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on mw1115 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on mw1147 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on wtp1009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on mw1012 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:05] PROBLEM - puppet last run on mw1080 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:06] PROBLEM - puppet last run on mw1028 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:06] PROBLEM - puppet last run on db1029 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:07] PROBLEM - puppet last run on db1053 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:07] PROBLEM - puppet last run on db1031 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:08] 
PROBLEM - puppet last run on analytics1017 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:08] PROBLEM - puppet last run on mw1048 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:09] PROBLEM - puppet last run on mw1141 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:09] PROBLEM - puppet last run on amslvs2 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:10] PROBLEM - puppet last run on cp1047 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:18] PROBLEM - puppet last run on mw1089 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:18] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:18] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:28] PROBLEM - puppet last run on mw1140 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:32] PROBLEM - puppet last run on db2005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:33] PROBLEM - puppet last run on cp3013 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:35] PROBLEM - puppet last run on amssq43 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:35] PROBLEM - puppet last run on mw1134 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:35] PROBLEM - puppet last run on mw1233 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:35] PROBLEM - puppet last run on cerium is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:35] PROBLEM - puppet last run on ms-be2006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:45] PROBLEM - puppet last run on ms-fe1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:47] PROBLEM - puppet last run on cp1070 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:47] PROBLEM - puppet last run on platinum is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:48] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:48] PROBLEM - puppet last run on mw1197 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:48] PROBLEM - puppet last run on mw1240 is 
CRITICAL: CRITICAL: Puppet has 1 failures [05:53:48] PROBLEM - puppet last run on mw1067 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:48] PROBLEM - puppet last run on mw1072 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:49] PROBLEM - puppet last run on neptunium is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:49] PROBLEM - puppet last run on wtp1017 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:55] PROBLEM - puppet last run on analytics1020 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:58] PROBLEM - puppet last run on mc1011 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:58] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:58] PROBLEM - puppet last run on gold is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:58] PROBLEM - puppet last run on ms-fe2004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:58] PROBLEM - puppet last run on mw1250 is CRITICAL: CRITICAL: Puppet has 1 failures [05:53:59] PROBLEM - puppet last run on analytics1041 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:05] PROBLEM - puppet last run on amssq32 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:05] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:06] PROBLEM - puppet last run on cp4006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:06] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:06] PROBLEM - puppet last run on cp3006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:06] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:06] PROBLEM - puppet last run on es2008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:07] PROBLEM - puppet last run on mw1007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:16] PROBLEM - puppet last run on vanadium is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:21] PROBLEM - puppet last run on mw1106 is CRITICAL: CRITICAL: Puppet has 1 
failures [05:54:21] PROBLEM - puppet last run on wtp1019 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:24] PROBLEM - puppet last run on mw1045 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:26] PROBLEM - puppet last run on elastic1012 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on db2019 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on db2009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on magnesium is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on mw1226 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on mw1082 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:27] PROBLEM - puppet last run on ms-be1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:28] PROBLEM - puppet last run on dbstore1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:28] PROBLEM - puppet last run on mw1145 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:28] PROBLEM - puppet last run on db1073 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:29] PROBLEM - puppet last run on potassium is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:29] PROBLEM - puppet last run on search1010 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:30] PROBLEM - puppet last run on search1016 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:36] PROBLEM - puppet last run on mw1254 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:39] PROBLEM - puppet last run on mw1088 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:39] PROBLEM - puppet last run on db1066 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:39] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:40] PROBLEM - puppet last run on elastic1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:40] PROBLEM - puppet last run on mw1187 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:44] PROBLEM - 
puppet last run on cp1055 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:44] RECOVERY - puppet last run on mw1219 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:54:44] PROBLEM - puppet last run on db2039 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:44] PROBLEM - puppet last run on cp3015 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:44] PROBLEM - puppet last run on amssq44 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:48] RECOVERY - puppet last run on db2030 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:54:48] PROBLEM - puppet last run on lvs3001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:59] PROBLEM - puppet last run on xenon is CRITICAL: CRITICAL: Puppet has 1 failures [05:54:59] PROBLEM - puppet last run on rdb1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:00] PROBLEM - puppet last run on es1008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:00] RECOVERY - puppet last run on ms-be1002 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [05:55:00] PROBLEM - puppet last run on elastic1007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:00] PROBLEM - puppet last run on ms-be1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:00] PROBLEM - puppet last run on mw1009 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:00] PROBLEM - puppet last run on mw1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:01] PROBLEM - puppet last run on db1050 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:01] PROBLEM - puppet last run on db1065 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:02] PROBLEM - puppet last run on cp1049 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:02] PROBLEM - puppet last run on elastic1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on mw1174 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on mw1170 is 
CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on mw1178 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on lead is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on mw1099 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on snapshot1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:11] PROBLEM - puppet last run on db2002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:12] PROBLEM - puppet last run on mw1228 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:12] PROBLEM - puppet last run on mw1200 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:13] PROBLEM - puppet last run on ms-be2002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:13] PROBLEM - puppet last run on ms1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:14] PROBLEM - puppet last run on mw1150 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:14] PROBLEM - puppet last run on rbf1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:23] PROBLEM - puppet last run on caesium is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:23] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:23] PROBLEM - puppet last run on mw1224 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:23] PROBLEM - puppet last run on mw1026 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:27] PROBLEM - puppet last run on logstash1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:27] PROBLEM - puppet last run on ms-be2004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:28] PROBLEM - puppet last run on labstore2001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:28] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:28] PROBLEM - puppet last run on ms-be2003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:28] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: Puppet has 1 
failures [05:55:28] PROBLEM - puppet last run on elastic1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:28] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:36] PROBLEM - puppet last run on mc1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:36] PROBLEM - puppet last run on carbon is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:37] RECOVERY - puppet last run on cp1037 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [05:55:37] PROBLEM - puppet last run on mw1068 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:37] PROBLEM - puppet last run on graphite1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:37] RECOVERY - puppet last run on cp1067 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:55:48] PROBLEM - puppet last run on ms-fe1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:55:54] PROBLEM - puppet last run on mw1041 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on mw1063 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on mw1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on wtp1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on es2001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:00] PROBLEM - puppet last run on ms-be1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:01] PROBLEM - puppet last run on mc1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:01] PROBLEM - puppet last run on lvs4002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:13] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:13] PROBLEM - puppet last run on wtp1020 is 
CRITICAL: CRITICAL: Puppet has 1 failures [05:56:13] RECOVERY - puppet last run on wtp1015 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [05:56:14] RECOVERY - puppet last run on cp1044 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:56:14] RECOVERY - puppet last run on mw1154 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [05:56:15] PROBLEM - puppet last run on labstore1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:15] PROBLEM - puppet last run on analytics1025 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:15] PROBLEM - puppet last run on wtp1016 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:15] RECOVERY - puppet last run on mw1131 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [05:56:16] PROBLEM - puppet last run on mw1160 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:20] PROBLEM - puppet last run on mw1153 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:21] RECOVERY - puppet last run on mw1155 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [05:56:21] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:21] PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:22] PROBLEM - puppet last run on mw1242 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:22] PROBLEM - puppet last run on mw1176 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:23] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:23] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:23] PROBLEM - puppet last run on analytics1040 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:30] PROBLEM - puppet last run on analytics1035 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:41] PROBLEM - puppet last run on mw1120 is CRITICAL: CRITICAL: Puppet has 1 failures 
[05:56:41] PROBLEM - puppet last run on db1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:42] PROBLEM - puppet last run on ms-fe2001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:47] PROBLEM - puppet last run on mw1069 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:47] PROBLEM - puppet last run on mc1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:47] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:50] PROBLEM - puppet last run on mw1046 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:51] PROBLEM - puppet last run on mw1039 is CRITICAL: CRITICAL: Puppet has 1 failures [05:56:51] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:00] PROBLEM - puppet last run on mw1100 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:02] PROBLEM - puppet last run on mw1217 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:03] PROBLEM - puppet last run on lvs1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:03] RECOVERY - puppet last run on mw1255 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:57:03] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:03] PROBLEM - puppet last run on mw1235 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:03] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:03] PROBLEM - puppet last run on mw1222 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:04] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:04] PROBLEM - puppet last run on mw1065 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:04] PROBLEM - puppet last run on amslvs1 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:12] RECOVERY - puppet last run on sca1001 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [05:57:12] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Puppet has 1 
failures [05:57:26] PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:26] PROBLEM - puppet last run on mw1175 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:26] PROBLEM - puppet last run on sca1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:26] PROBLEM - puppet last run on elastic1008 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:26] PROBLEM - puppet last run on mw1117 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:26] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:27] PROBLEM - puppet last run on elastic1022 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:27] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:28] PROBLEM - puppet last run on amssq47 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:28] PROBLEM - puppet last run on db1051 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:29] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:29] PROBLEM - puppet last run on db2036 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:30] PROBLEM - puppet last run on cp1039 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:30] RECOVERY - puppet last run on mw1207 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:57:31] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:31] PROBLEM - puppet last run on elastic1030 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:34] PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:35] PROBLEM - puppet last run on amssq53 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:35] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [05:57:35] PROBLEM - puppet last run on amssq60 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:35] PROBLEM - puppet last run on elastic1018 is 
CRITICAL: CRITICAL: Puppet has 1 failures [05:57:36] RECOVERY - puppet last run on lvs3003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [05:57:41] RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [05:57:41] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:41] PROBLEM - puppet last run on mw1092 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] PROBLEM - puppet last run on virt1006 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] PROBLEM - puppet last run on wtp1012 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] RECOVERY - puppet last run on ms-be2009 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [05:57:51] PROBLEM - puppet last run on mw1164 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] PROBLEM - puppet last run on mw1172 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] PROBLEM - puppet last run on mw1249 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:51] PROBLEM - puppet last run on search1018 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:52] RECOVERY - puppet last run on search1008 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [05:57:52] PROBLEM - puppet last run on mw1129 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:53] PROBLEM - puppet last run on iron is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:53] PROBLEM - puppet last run on labnet1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:57:54] PROBLEM - puppet last run on cp1056 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:01] RECOVERY - puppet last run on mw1066 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:58:09] PROBLEM - puppet last run on virt1004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:09] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: Puppet has 1 failures 
[05:58:09] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:09] PROBLEM - puppet last run on mw1177 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:10] PROBLEM - puppet last run on cp1058 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:10] PROBLEM - puppet last run on labsdb1003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:10] RECOVERY - puppet last run on elastic1013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:58:11] PROBLEM - puppet last run on mw1205 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:11] PROBLEM - puppet last run on mw1060 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:12] PROBLEM - puppet last run on mw1114 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:12] PROBLEM - puppet last run on lvs2001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:13] PROBLEM - puppet last run on labcontrol2001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:13] RECOVERY - puppet last run on cp4009 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:58:14] PROBLEM - puppet last run on mw1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:26] PROBLEM - puppet last run on analytics1030 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:27] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:27] RECOVERY - puppet last run on amssq39 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:58:27] PROBLEM - puppet last run on db1067 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:27] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:27] PROBLEM - puppet last run on cp4001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:28] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:30] PROBLEM - puppet last run on logstash1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:31] PROBLEM - 
puppet last run on db1028 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:33] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:33] PROBLEM - puppet last run on mw1061 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:33] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:40] PROBLEM - puppet last run on mc1012 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:46] PROBLEM - puppet last run on mw1237 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:46] PROBLEM - puppet last run on analytics1038 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] PROBLEM - puppet last run on elastic1027 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] PROBLEM - puppet last run on dataset1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] PROBLEM - puppet last run on labmon1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures [05:58:54] RECOVERY - puppet last run on radon is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [05:58:54] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:05] RECOVERY - puppet last run on lanthanum is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:59:10] RECOVERY - puppet last run on lvs2002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:59:12] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:13] PROBLEM - puppet last run on es1007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:13] PROBLEM - puppet last run on mw1054 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:13] RECOVERY - puppet last run on db2017 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:59:13] 
PROBLEM - puppet last run on mc1005 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:14] RECOVERY - puppet last run on ms-fe1002 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [05:59:14] PROBLEM - puppet last run on cp4019 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:14] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on sodium is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on cp1050 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on cp1046 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on mw1133 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on snapshot1001 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:24] PROBLEM - puppet last run on mw1168 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:25] PROBLEM - puppet last run on db1043 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:38] (PS1) Yuvipanda: wdq-mm: Fix lb nginx config context [puppet] - https://gerrit.wikimedia.org/r/179069 [05:59:42] RECOVERY - puppet last run on lvs2005 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [05:59:42] PROBLEM - puppet last run on db1034 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:42] PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:42] PROBLEM - puppet last run on lvs3004 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:42] RECOVERY - puppet last run on amssq45 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [05:59:42] PROBLEM - puppet last run on db1042 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:42] PROBLEM - puppet last run on mw1195 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:43] PROBLEM - 
puppet last run on search1007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:43] PROBLEM - puppet last run on db2038 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:44] PROBLEM - puppet last run on mw1126 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:44] PROBLEM - puppet last run on db2007 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:45] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:52] PROBLEM - puppet last run on cp4003 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:54] PROBLEM - puppet last run on mw1227 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:54] PROBLEM - puppet last run on polonium is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:54] PROBLEM - puppet last run on pc1002 is CRITICAL: CRITICAL: Puppet has 1 failures [05:59:54] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:04] RECOVERY - puppet last run on amssq50 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:00:04] PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] RECOVERY - puppet last run on mw1107 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:00:05] PROBLEM - puppet last run on wtp1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] PROBLEM - puppet last run on antimony is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] PROBLEM - puppet last run on mw1190 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] PROBLEM - puppet last run on mw1238 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] PROBLEM - puppet last run on argon is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:05] PROBLEM - puppet last run on mw1162 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:06] PROBLEM - puppet last run on analytics1010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:06] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: Puppet has 1 failures 
[06:00:07] PROBLEM - puppet last run on mw1055 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:07] PROBLEM - puppet last run on analytics1016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:08] RECOVERY - puppet last run on cp1069 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [06:00:08] PROBLEM - puppet last run on mw1251 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:09] PROBLEM - puppet last run on mw1208 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] PROBLEM - puppet last run on cp1062 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] PROBLEM - puppet last run on search1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] PROBLEM - puppet last run on mw1247 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] RECOVERY - puppet last run on mw1096 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [06:00:19] PROBLEM - puppet last run on labstore1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] PROBLEM - puppet last run on mw1149 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:19] PROBLEM - puppet last run on mw1084 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:20] PROBLEM - puppet last run on cp4004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:22] PROBLEM - puppet last run on ms-fe2003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:22] PROBLEM - puppet last run on amssq55 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:22] PROBLEM - puppet last run on gadolinium is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:22] PROBLEM - puppet last run on lithium is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:22] PROBLEM - puppet last run on mw1044 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:30] PROBLEM - puppet last run on mw1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:30] RECOVERY - puppet last run on mw1109 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [06:00:32] PROBLEM - 
puppet last run on mw1076 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:32] PROBLEM - puppet last run on ms-be3001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:32] PROBLEM - puppet last run on nescio is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:36] PROBLEM - puppet last run on rdb1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:36] PROBLEM - puppet last run on install2001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on plutonium is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on mw1125 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on db2016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on ms-be2012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on amssq46 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on mw1111 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:46] PROBLEM - puppet last run on db1071 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:47] PROBLEM - puppet last run on mw1151 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:47] PROBLEM - puppet last run on virt1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:48] PROBLEM - puppet last run on mw1056 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:55] RECOVERY - puppet last run on mw1193 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:00:55] PROBLEM - puppet last run on ms-be2011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:00:55] PROBLEM - puppet last run on thallium is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:05] PROBLEM - puppet last run on rhenium is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:05] PROBLEM - puppet last run on snapshot1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:14] PROBLEM - puppet last run on amssq34 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:14] RECOVERY - 
puppet last run on amslvs4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:01:15] PROBLEM - puppet last run on amssq56 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:15] RECOVERY - puppet last run on analytics1034 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [06:01:15] PROBLEM - puppet last run on db1052 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:15] PROBLEM - puppet last run on labsdb1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:16] PROBLEM - puppet last run on db2004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:16] PROBLEM - puppet last run on virt1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:16] RECOVERY - puppet last run on erbium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:01:17] PROBLEM - puppet last run on db1016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:17] RECOVERY - puppet last run on lvs1003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:01:18] PROBLEM - puppet last run on db1020 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:18] RECOVERY - puppet last run on virt1005 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:01:19] RECOVERY - puppet last run on db1041 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:01:19] PROBLEM - puppet last run on ms-be2008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:20] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:20] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:21] PROBLEM - puppet last run on db1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:21] PROBLEM - puppet last run on elastic1024 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:22] PROBLEM - puppet last run on mw1011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:22] PROBLEM - puppet last run on 
analytics1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:23] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:23] PROBLEM - puppet last run on amssq36 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:24] PROBLEM - puppet last run on wtp1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:24] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:25] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:25] PROBLEM - puppet last run on amssq42 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:26] PROBLEM - puppet last run on osmium is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on elastic1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on mw1258 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on search1023 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on mw1079 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on analytics1026 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on db1036 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:37] PROBLEM - puppet last run on lvs4003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:38] PROBLEM - puppet last run on hafnium is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:38] PROBLEM - puppet last run on cp1063 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:39] RECOVERY - puppet last run on dbstore1001 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [06:01:39] PROBLEM - puppet last run on mw1049 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:40] PROBLEM - puppet last run on analytics1022 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:40] PROBLEM - puppet last run on snapshot1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:41] PROBLEM - 
puppet last run on db1060 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:41] PROBLEM - puppet last run on mc1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:52] PROBLEM - puppet last run on elastic1019 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:01:53] i'll restart puppetmaster
[06:01:56] RECOVERY - puppet last run on search1021 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:01:56] PROBLEM - puppet last run on mw1116 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] PROBLEM - puppet last run on db1026 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] PROBLEM - puppet last run on rubidium is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] PROBLEM - puppet last run on wtp1018 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] RECOVERY - puppet last run on db1024 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:01:57] PROBLEM - puppet last run on elastic1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] PROBLEM - puppet last run on search1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:57] PROBLEM - puppet last run on amssq51 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:58] RECOVERY - puppet last run on mw1035 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:01:58] PROBLEM - puppet last run on bast4001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:01:59] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:01:59] PROBLEM - puppet last run on amssq40 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:00] PROBLEM - puppet last run on ms-fe3002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:00] PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:01] RECOVERY - puppet last run on analytics1015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:02] PROBLEM - puppet
last run on ms-be2001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:20] RECOVERY - puppet last run on cp3022 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [06:02:20] RECOVERY - puppet last run on amssq58 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:02:22] RECOVERY - puppet last run on rdb1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:23] RECOVERY - puppet last run on mc1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:24] RECOVERY - puppet last run on mw1252 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:24] PROBLEM - puppet last run on mw1030 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:29] RECOVERY - puppet last run on mw1221 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:02:29] RECOVERY - puppet last run on ms-be1001 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:02:29] RECOVERY - puppet last run on mw1182 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:02:29] PROBLEM - puppet last run on mw1163 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:29] PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:29] PROBLEM - puppet last run on db1048 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:29] PROBLEM - puppet last run on ms-be1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:30] RECOVERY - puppet last run on ms-be1010 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:02:30] PROBLEM - puppet last run on mw1202 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:31] PROBLEM - puppet last run on db1069 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:31] PROBLEM - puppet last run on mw1156 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:32] RECOVERY - puppet last run on 
mw1178 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [06:02:32] RECOVERY - puppet last run on db1007 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:02:33] PROBLEM - puppet last run on elastic1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:33] PROBLEM - puppet last run on mw1081 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:34] PROBLEM - puppet last run on mw1098 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:34] RECOVERY - puppet last run on mw1038 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [06:02:35] RECOVERY - puppet last run on ms-be2002 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [06:02:35] RECOVERY - puppet last run on cp1059 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [06:02:36] RECOVERY - puppet last run on mw1191 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:02:36] RECOVERY - puppet last run on es1006 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:02:37] PROBLEM - puppet last run on lvs2006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:37] RECOVERY - puppet last run on lvs3002 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:02:38] PROBLEM - puppet last run on mw1165 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:38] RECOVERY - puppet last run on virt1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:39] RECOVERY - puppet last run on mw1161 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:43] RECOVERY - puppet last run on logstash1001 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [06:02:50] RECOVERY - puppet last run on db1049 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:02:50] RECOVERY - puppet last 
run on ytterbium is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [06:02:50] PROBLEM - puppet last run on ms-be2005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:51] PROBLEM - puppet last run on amssq62 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:51] PROBLEM - puppet last run on mc1007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:02:51] RECOVERY - puppet last run on carbon is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[06:02:54] !log restarted apache on palladium and strontium
[06:03:00] RECOVERY - puppet last run on mw1256 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:03:03] RECOVERY - puppet last run on elastic1031 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:03:03] PROBLEM - puppet last run on mw1087 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:03] PROBLEM - puppet last run on mw1181 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:03] PROBLEM - puppet last run on wtp1022 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:03] PROBLEM - puppet last run on analytics1023 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:04] RECOVERY - puppet last run on es1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:03:04] Logged the message, Master
[06:03:04] RECOVERY - puppet last run on logstash1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:03:04] RECOVERY - puppet last run on mw1216 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:05] PROBLEM - puppet last run on mw1051 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:05] PROBLEM - puppet last run on mw1180 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:06] PROBLEM - puppet last run on db2023 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:06] RECOVERY - puppet last run on ms-be1013 is OK: OK: Puppet is currently enabled, last run 2
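(The remediation logged at 06:02:54 can be sketched as below. This is a hypothetical reconstruction for readers following along, not a transcript of what was typed: the host names palladium and strontium come from the !log entry, but the exact commands were not recorded, so standard Puppet-3-era invocations are shown, printed rather than executed.)

```shell
# Sketch only: restart the Apache frontend on both puppetmasters,
# then force an agent run on an affected host to confirm recovery.
# Commands are echoed, not run, since this reconstructs an unlogged step.
masters="palladium strontium"
for master in $masters; do
    echo "ssh $master 'sudo service apache2 restart'"
done
# On a host still flagged CRITICAL, a manual run surfaces the error:
echo "sudo puppet agent --test"
```

(After the restart, the RECOVERY messages above show agents succeeding again on their next scheduled runs.)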
minutes ago with 0 failures [06:03:11] RECOVERY - puppet last run on ocg1001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:11] RECOVERY - puppet last run on pc1001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:11] PROBLEM - puppet last run on ms-be2007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:11] RECOVERY - puppet last run on mw1005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:03:22] PROBLEM - puppet last run on mw1183 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:22] RECOVERY - puppet last run on mw1132 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:23] PROBLEM - puppet last run on amssq41 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:23] PROBLEM - puppet last run on cp3012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:32] RECOVERY - puppet last run on mc1009 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:32] PROBLEM - puppet last run on mw1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:32] RECOVERY - puppet last run on mw1080 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:03:32] PROBLEM - puppet last run on analytics1011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:40] PROBLEM - puppet last run on mw1029 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:40] PROBLEM - puppet last run on db1057 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:40] PROBLEM - puppet last run on db2037 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:40] PROBLEM - puppet last run on mw1148 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:41] PROBLEM - puppet last run on db2001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:41] PROBLEM - puppet last run on db2011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:41] RECOVERY - puppet last run on db1053 is OK: OK: Puppet is currently enabled, last 
run 1 minute ago with 0 failures [06:03:41] RECOVERY - puppet last run on mw1169 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:03:42] PROBLEM - puppet last run on ms-be1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:42] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:43] PROBLEM - puppet last run on amslvs3 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:43] PROBLEM - puppet last run on db1055 is CRITICAL: CRITICAL: Puppet has 1 failures [06:03:50] PROBLEM - puppet last run on mw1159 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:04] PROBLEM - puppet last run on mw1053 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:06] PROBLEM - puppet last run on hooft is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:06] PROBLEM - puppet last run on mw1122 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:06] RECOVERY - puppet last run on mw1233 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:06] PROBLEM - puppet last run on labsdb1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:07] PROBLEM - puppet last run on ocg1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:07] PROBLEM - puppet last run on mc1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:07] PROBLEM - puppet last run on db1033 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:08] PROBLEM - puppet last run on mw1225 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:08] PROBLEM - puppet last run on mw1229 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:16] PROBLEM - puppet last run on search1022 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:18] PROBLEM - puppet last run on analytics1037 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:18] PROBLEM - puppet last run on rdb1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:18] RECOVERY - puppet last run on neptunium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 
failures [06:04:20] PROBLEM - puppet last run on analytics1032 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:20] RECOVERY - puppet last run on mc1011 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:26] PROBLEM - puppet last run on rcs1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:27] PROBLEM - puppet last run on mw1121 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:28] PROBLEM - puppet last run on mw1212 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:28] RECOVERY - puppet last run on mw1184 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:28] PROBLEM - puppet last run on acamar is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:37] PROBLEM - puppet last run on iodine is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:37] PROBLEM - puppet last run on elastic1017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:46] PROBLEM - puppet last run on titanium is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:46] RECOVERY - puppet last run on vanadium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:46] RECOVERY - puppet last run on wtp1019 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:46] PROBLEM - puppet last run on mw1108 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] RECOVERY - puppet last run on magnesium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:04:58] PROBLEM - puppet last run on search1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] PROBLEM - puppet last run on analytics1028 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] PROBLEM - puppet last run on stat1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] PROBLEM - puppet last run on lvs2003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:58] PROBLEM - puppet last 
run on db1061 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:59] PROBLEM - puppet last run on dbstore1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:04:59] PROBLEM - puppet last run on search1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:00] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:00] PROBLEM - puppet last run on mw1152 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:01] RECOVERY - puppet last run on search1020 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:05:01] PROBLEM - puppet last run on wtp1007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:02] PROBLEM - puppet last run on wtp1011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:09] PROBLEM - puppet last run on cp1038 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:09] PROBLEM - puppet last run on mw1043 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:09] PROBLEM - puppet last run on baham is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:09] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:09] PROBLEM - puppet last run on db1070 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:05:11] (CR) Yuvipanda: [C: 032] wdq-mm: Fix lb nginx config context [puppet] - https://gerrit.wikimedia.org/r/179069 (owner: Yuvipanda)
[06:05:23] RECOVERY - puppet last run on xenon is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [06:05:23] PROBLEM - puppet last run on db1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:36] RECOVERY - puppet last run on es1008 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [06:05:37] PROBLEM - puppet last run on rdb1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37] PROBLEM - puppet last run on mw1105 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37] PROBLEM - puppet last run on db1072 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37]
PROBLEM - puppet last run on es1010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37] PROBLEM - puppet last run on elastic1023 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37] PROBLEM - puppet last run on analytics1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:37] PROBLEM - puppet last run on virt1008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:38] PROBLEM - puppet last run on mw1024 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:38] PROBLEM - puppet last run on search1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:39] PROBLEM - puppet last run on rcs1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:39] PROBLEM - puppet last run on analytics1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:40] RECOVERY - puppet last run on rbf1002 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [06:05:47] PROBLEM - puppet last run on mw1231 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:49] PROBLEM - puppet last run on db2003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:05:56] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:06] PROBLEM - puppet last run on cp1040 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:06] PROBLEM - puppet last run on mw1093 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:06] PROBLEM - puppet last run on pc1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:07] PROBLEM - puppet last run on ms-fe2002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:07] PROBLEM - puppet last run on db1038 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:07] PROBLEM - puppet last run on mw1253 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:07] PROBLEM - puppet last run on nitrogen is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:13] PROBLEM - puppet last run on search1011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:13] PROBLEM - puppet last run on mw1223 is CRITICAL: CRITICAL: Puppet has 1 failures 
[06:06:13] PROBLEM - puppet last run on mw1033 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:26] PROBLEM - puppet last run on mw1071 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:26] PROBLEM - puppet last run on search1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:27] PROBLEM - puppet last run on analytics1018 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:27] RECOVERY - puppet last run on analytics1025 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:06:27] PROBLEM - puppet last run on mw1077 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:27] PROBLEM - puppet last run on wtp1008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:36] PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:36] PROBLEM - puppet last run on virt1000 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:37] PROBLEM - puppet last run on db2012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:37] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:37] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:37] PROBLEM - puppet last run on db1035 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:37] RECOVERY - puppet last run on db1031 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [06:06:38] PROBLEM - puppet last run on calcium is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:38] RECOVERY - puppet last run on mc1002 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [06:06:39] PROBLEM - puppet last run on db1047 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:39] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: Puppet has 1 failures [06:06:40] RECOVERY - puppet last run on amslvs2 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [06:06:40] RECOVERY - puppet last run on helium is OK: OK: Puppet is currently 
enabled, last run 39 seconds ago with 0 failures [06:06:59] PROBLEM - puppet last run on fluorine is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:11] RECOVERY - puppet last run on db2034 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [06:07:11] PROBLEM - puppet last run on db2010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:12] RECOVERY - puppet last run on cerium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:07:23] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:07:34] PROBLEM - puppet last run on mw1203 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:52] PROBLEM - puppet last run on osm-cp1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:52] PROBLEM - puppet last run on cp1053 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:52] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [06:07:52] RECOVERY - puppet last run on wtp1017 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:07:52] PROBLEM - puppet last run on mw1199 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:52] RECOVERY - puppet last run on sca1002 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [06:07:53] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:53] RECOVERY - puppet last run on analytics1020 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:07:54] PROBLEM - puppet last run on mw1112 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:54] PROBLEM - puppet last run on analytics1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:55] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:55] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is currently enabled, last run 56 seconds ago 
with 0 failures [06:07:56] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:56] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:57] RECOVERY - puppet last run on es2008 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [06:07:57] PROBLEM - puppet last run on mw1142 is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:58] PROBLEM - puppet last run on radium is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:58] PROBLEM - puppet last run on zinc is CRITICAL: CRITICAL: Puppet has 1 failures [06:07:59] RECOVERY - puppet last run on es1009 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:07:59] RECOVERY - puppet last run on virt1006 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [06:08:04] RECOVERY - puppet last run on db2019 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:08:04] RECOVERY - puppet last run on db2009 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:08:04] RECOVERY - puppet last run on ms-be1006 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [06:08:04] RECOVERY - puppet last run on uranium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:08:04] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [06:08:04] RECOVERY - puppet last run on potassium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:08:05] RECOVERY - puppet last run on mw1045 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [06:08:05] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: Puppet has 1 failures [06:08:06] RECOVERY - puppet last run on elastic1021 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:08:06] RECOVERY - 
[06:08:07–06:20:15] icinga-wm: sustained flood of puppet "last run" notices across the fleet — several dozen hosts (mw app servers, cp caches, db/es databases, elastic, analytics, ms-be/ms-fe swift, lvs, wtp, and misc hosts) briefly went CRITICAL with 1 failure each (mw1246 reported 6), and all subsequently posted RECOVERY with "Puppet is currently enabled, last run N ago with 0 failures" within a few minutes of each alert; no human discussion during this window.
enabled, last run 2 minutes ago with 0 failures [06:20:15] RECOVERY - puppet last run on cp4011 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:20:15] RECOVERY - puppet last run on mw1136 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:20:22] RECOVERY - puppet last run on ms-be1005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:20:22] RECOVERY - puppet last run on cp4015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:20:31] RECOVERY - puppet last run on analytics1024 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:20:50] RECOVERY - puppet last run on mw1058 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:20:50] RECOVERY - puppet last run on elastic1026 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [06:21:02] RECOVERY - puppet last run on ms-be3004 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:02] RECOVERY - puppet last run on amssq57 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [06:21:11] RECOVERY - puppet last run on mw1234 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:12] RECOVERY - puppet last run on mw1040 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [06:21:12] RECOVERY - puppet last run on mw1246 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:12] RECOVERY - puppet last run on mw1102 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:12] RECOVERY - puppet last run on cp1043 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [06:21:24] RECOVERY - puppet last run on search1014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:24] RECOVERY - 
puppet last run on mw1245 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [06:21:24] RECOVERY - puppet last run on mw1036 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:24] RECOVERY - puppet last run on cp1065 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:24] RECOVERY - puppet last run on analytics1036 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [06:21:24] RECOVERY - puppet last run on mw1083 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:21:24] RECOVERY - puppet last run on mw1019 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:25] RECOVERY - puppet last run on lvs4001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:21:25] RECOVERY - puppet last run on mc1010 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [06:21:26] RECOVERY - puppet last run on cp1051 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:21:26] RECOVERY - puppet last run on lvs1006 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:27] RECOVERY - puppet last run on mw1232 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:21:27] RECOVERY - puppet last run on mw1127 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:21:33] RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:45] RECOVERY - puppet last run on mw1094 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:21:51] RECOVERY - puppet last run on mw1124 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:22:03] RECOVERY - puppet last run on caesium is OK: OK: Puppet is currently enabled, last run 3 seconds 
ago with 0 failures [06:22:32] RECOVERY - puppet last run on analytics1033 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [06:22:43] RECOVERY - puppet last run on wtp1021 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:22:43] RECOVERY - puppet last run on dbproxy1002 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [06:22:51] RECOVERY - puppet last run on mw1031 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:22:56] RECOVERY - puppet last run on mw1115 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:22:56] RECOVERY - puppet last run on mw1147 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:23:12] RECOVERY - puppet last run on amssq43 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [06:23:16] RECOVERY - puppet last run on cp4007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:17] RECOVERY - puppet last run on mw1089 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:17] RECOVERY - puppet last run on db2035 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:23:17] RECOVERY - puppet last run on mw1134 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:17] RECOVERY - puppet last run on cp3013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:17] RECOVERY - puppet last run on analytics1040 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [06:23:17] RECOVERY - puppet last run on mw1048 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:18] RECOVERY - puppet last run on analytics1017 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [06:23:22] RECOVERY - puppet last run on mw1192 
is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:23:31] RECOVERY - puppet last run on ms-fe1003 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:23:42] RECOVERY - puppet last run on db1029 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:23:51] RECOVERY - puppet last run on mw1062 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:23:51] RECOVERY - puppet last run on cp1070 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [06:23:52] RECOVERY - puppet last run on mw1197 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:01] RECOVERY - puppet last run on gold is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:01] RECOVERY - puppet last run on mw1130 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:24:02] RECOVERY - puppet last run on analytics1041 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [06:24:12] RECOVERY - puppet last run on ms-fe2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:12] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [06:24:17] RECOVERY - puppet last run on amssq53 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [06:24:19] RECOVERY - puppet last run on cp3006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:19] RECOVERY - puppet last run on mw1007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:32] RECOVERY - puppet last run on mw1106 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:34] RECOVERY - puppet last run on mw1145 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures 
[06:24:41] RECOVERY - puppet last run on cp1055 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:24:41] RECOVERY - puppet last run on mc1016 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:24:56] RECOVERY - puppet last run on cp3015 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:01] RECOVERY - puppet last run on amssq44 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:25:02] RECOVERY - puppet last run on ms-be2013 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:15] RECOVERY - puppet last run on ms-be1004 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:25:15] RECOVERY - puppet last run on db1065 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:25:15] RECOVERY - puppet last run on cp1049 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:16] RECOVERY - puppet last run on mw1174 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:25:16] RECOVERY - puppet last run on mw1200 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:24] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [06:25:31] RECOVERY - puppet last run on mw1224 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:44] RECOVERY - puppet last run on labstore2001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:25:44] RECOVERY - puppet last run on ms-be2003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:25:44] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:25:44] RECOVERY - puppet last run on amssq59 is OK: OK: Puppet is currently 
enabled, last run 2 minutes ago with 0 failures [06:25:44] RECOVERY - puppet last run on mc1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:25:55] RECOVERY - puppet last run on mw1041 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:25:55] RECOVERY - puppet last run on mw1063 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:25:57] RECOVERY - puppet last run on mw1006 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:26:03] RECOVERY - puppet last run on lvs4002 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:26:17] RECOVERY - puppet last run on labstore1003 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:26:23] RECOVERY - puppet last run on mw1120 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:26:42] RECOVERY - puppet last run on ms-fe2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:27:13] RECOVERY - puppet last run on elastic1008 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:27:16] RECOVERY - puppet last run on db1018 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:27:22] RECOVERY - puppet last run on searchidx1001 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [06:27:34] RECOVERY - puppet last run on amssq32 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:27:44] RECOVERY - puppet last run on elastic1018 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:28:16] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:28:17] RECOVERY - puppet last run on mw1060 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:28:55] RECOVERY - 
puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:29:49] YuviPanda: what was the problem? it hit labs/beta cluster as well [06:29:57] RECOVERY - puppet last run on mw1065 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:30:13] RECOVERY - puppet last run on cp4004 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:30:14] greg-g: our prod apt-get mirror got 'stuck' [06:30:18] ah [06:30:33] and beta cluster hits that, I presume [06:30:41] greg-g: all of labs too, really. [06:30:53] RECOVERY - puppet last run on ms-be2008 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [06:31:12] greg-g: it's transient, so it should recover [06:31:25] RECOVERY - puppet last run on lvs4003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:31:32] yeah, looks like it did [06:31:51] greg-g: we need to spend some time on aggregating these alerts, so we get one saying 'hey, 400 hosts are down!' vs 400 individual ones [06:31:53] alas, time, etc [06:32:10] RECOVERY - puppet last run on mc1012 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:32:28] RECOVERY - puppet last run on db1043 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:33:28] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 1 failures [06:34:08] PROBLEM - puppet last run on db2036 is CRITICAL: CRITICAL: Puppet has 1 failures [06:34:40] PROBLEM - puppet last run on cp1056 is CRITICAL: CRITICAL: Puppet has 3 failures [06:34:43] (PS4) Ori.livneh: mediawiki: tidy `cleanup_cache` script [puppet] - https://gerrit.wikimedia.org/r/179027 [06:35:01] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:35:34] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures [06:36:13] PROBLEM - puppet last run on mw1118
is CRITICAL: CRITICAL: Puppet has 1 failures [06:36:48] PROBLEM - puppet last run on ms-fe2003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:39:06] (PS5) Ori.livneh: mediawiki: tidy `cleanup_cache` script [puppet] - https://gerrit.wikimedia.org/r/179027 [06:39:27] PROBLEM - puppet last run on sodium is CRITICAL: CRITICAL: Puppet has 1 failures [06:40:23] PROBLEM - puppet last run on virt1007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:40:36] PROBLEM - puppet last run on mw1084 is CRITICAL: CRITICAL: Puppet has 1 failures [06:41:42] PROBLEM - puppet last run on mw1198 is CRITICAL: CRITICAL: Puppet has 1 failures [06:42:21] PROBLEM - puppet last run on analytics1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:42:23] PROBLEM - puppet last run on wtp1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:42:33] PROBLEM - puppet last run on rcs1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:42:34] PROBLEM - puppet last run on lvs2006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:43:37] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: Puppet has 1 failures [06:43:43] PROBLEM - puppet last run on db1054 is CRITICAL: CRITICAL: Puppet has 1 failures [06:43:43] PROBLEM - puppet last run on labsdb1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:44:45] PROBLEM - puppet last run on analytics1032 is CRITICAL: CRITICAL: Puppet has 1 failures [06:44:59] PROBLEM - puppet last run on mw1023 is CRITICAL: CRITICAL: Puppet has 1 failures [06:45:09] PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Puppet has 1 failures [06:45:28] PROBLEM - puppet last run on mw1219 is CRITICAL: CRITICAL: Puppet has 1 failures [06:46:41] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:09] PROBLEM - puppet last run on fluorine is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:10] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:47:10] PROBLEM -
puppet last run on search1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:18] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:31] PROBLEM - puppet last run on db2010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:32] PROBLEM - puppet last run on db2012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:32] PROBLEM - puppet last run on cp4011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:39] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:49] PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:49] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:49] PROBLEM - puppet last run on mw1194 is CRITICAL: CRITICAL: Puppet has 1 failures [06:47:58] PROBLEM - puppet last run on mw1199 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:00] PROBLEM - puppet last run on analytics1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:10] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:10] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:10] PROBLEM - puppet last run on ms-be1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:11] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:11] PROBLEM - puppet last run on ms-be3004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:11] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:17] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:21] PROBLEM - puppet last run on mw1102 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:21] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:21] PROBLEM - puppet last run on mw1018 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:30] PROBLEM - puppet last run on search1008 is 
CRITICAL: CRITICAL: Puppet has 1 failures [06:48:30] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:38] PROBLEM - puppet last run on mw1019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:38] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:38] PROBLEM - puppet last run on ms-be2009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:38] PROBLEM - puppet last run on chromium is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:38] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:49] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:50] PROBLEM - puppet last run on mw1127 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:51] PROBLEM - puppet last run on elastic1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:52] PROBLEM - puppet last run on pollux is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:55] PROBLEM - puppet last run on bast2001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:48:56] PROBLEM - puppet last run on es2006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:03] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:05] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:15] PROBLEM - puppet last run on search1019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:16] PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:17] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:25] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:25] PROBLEM - puppet last run on elastic1025 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:25] PROBLEM - puppet last run on mw1021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:25] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently 
enabled, last run 3 minutes ago with 0 failures [06:49:25] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:25] PROBLEM - puppet last run on mw1101 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:26] PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:26] PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:35] PROBLEM - puppet last run on lanthanum is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:36] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:36] PROBLEM - puppet last run on elastic1016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:36] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:36] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:36] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:48] PROBLEM - puppet last run on elastic1029 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:49] PROBLEM - puppet last run on mw1157 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:49] PROBLEM - puppet last run on lvs2002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:49] PROBLEM - puppet last run on db2017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:49] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:50] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:50] PROBLEM - puppet last run on es1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:56] PROBLEM - puppet last run on mw1230 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:58] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: Puppet has 1 failures [06:49:58] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:04] PROBLEM - puppet last run on mw1015 is CRITICAL: CRITICAL: Puppet has 1 
failures [06:50:05] PROBLEM - puppet last run on mw1070 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:05] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:05] PROBLEM - puppet last run on lvs2005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:14] PROBLEM - puppet last run on amssq45 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:17] PROBLEM - puppet last run on search1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:23] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:50:24] PROBLEM - puppet last run on mw1073 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:24] PROBLEM - puppet last run on ocg1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:25] PROBLEM - puppet last run on amssq52 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:25] PROBLEM - puppet last run on amssq50 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:42] PROBLEM - puppet last run on labsdb1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:42] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:42] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:43] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:43] PROBLEM - puppet last run on hydrogen is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:43] PROBLEM - puppet last run on mc1008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:43] PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:44] PROBLEM - puppet last run on mw1095 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:44] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:45] PROBLEM - puppet last run on db1058 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:45] PROBLEM - puppet last run on praseodymium is CRITICAL: CRITICAL: Puppet has 1 failures 
[06:50:46] PROBLEM - puppet last run on ms-be2010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:46] PROBLEM - puppet last run on db1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:55] PROBLEM - puppet last run on mw1103 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:55] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:55] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:56] PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] PROBLEM - puppet last run on ms-be2015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] PROBLEM - puppet last run on cp4016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] PROBLEM - puppet last run on cp4013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] PROBLEM - puppet last run on virt1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] PROBLEM - puppet last run on mw1020 is CRITICAL: CRITICAL: Puppet has 1 failures [06:50:57] RECOVERY - puppet last run on mw1084 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [06:50:58] PROBLEM - puppet last run on mw1136 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:05] PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:12] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:13] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:20] PROBLEM - puppet last run on mw1257 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:22] PROBLEM - puppet last run on mw1085 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:23] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:23] PROBLEM - puppet last run on analytics1024 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:23] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:24] 
RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:51:34] PROBLEM - puppet last run on cp1057 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:34] PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:34] PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:34] RECOVERY - puppet last run on analytics1032 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [06:51:46] PROBLEM - puppet last run on erbium is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on mw1058 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on mw1130 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on elastic1026 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on db1019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:50] PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:51] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:57] PROBLEM - puppet last run on elastic1028 is CRITICAL: CRITICAL: Puppet has 1 failures [06:51:57] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:06] PROBLEM - puppet last run on wtp1019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on labsdb1007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on cp1043 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on mw1234 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on mw1036 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on mw1246 is CRITICAL: 
CRITICAL: Puppet has 1 failures [06:52:07] PROBLEM - puppet last run on mw1040 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:08] PROBLEM - puppet last run on es1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:08] PROBLEM - puppet last run on es1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:09] PROBLEM - puppet last run on search1014 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:09] PROBLEM - puppet last run on tmh1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:10] PROBLEM - puppet last run on mw1245 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:10] PROBLEM - puppet last run on dbstore1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:11] PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:11] PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:15] PROBLEM - puppet last run on mw1083 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:15] PROBLEM - puppet last run on mc1010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:15] PROBLEM - puppet last run on mw1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:15] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:15] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:16] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on cp1051 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on lvs1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on mw1244 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on mw1232 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:24] PROBLEM - puppet last run on mw1035 is CRITICAL: CRITICAL: Puppet has 1 failures 
[06:52:25] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:25] PROBLEM - puppet last run on cp3015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:26] PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:36] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:38] PROBLEM - puppet last run on ms-be2013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:39] PROBLEM - puppet last run on rdb1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:39] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:40] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:40] PROBLEM - puppet last run on mc1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:45] PROBLEM - puppet last run on mw1221 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:45] PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:46] PROBLEM - puppet last run on ms-be1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:46] PROBLEM - puppet last run on ms-be1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:46] PROBLEM - puppet last run on mw1094 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:57] PROBLEM - puppet last run on mw1124 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:57] PROBLEM - puppet last run on mw1178 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on mw1196 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on mw1038 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on virt1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on labsdb1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on es1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:58] PROBLEM - puppet last run on 
cp1059 is CRITICAL: CRITICAL: Puppet has 1 failures [06:52:59] PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:10] PROBLEM - puppet last run on ms-be2002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:11] PROBLEM - puppet last run on ytterbium is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:16] PROBLEM - puppet last run on ms-be2003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:16] PROBLEM - puppet last run on labstore2001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:22] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:22] PROBLEM - puppet last run on amssq59 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:35] PROBLEM - puppet last run on carbon is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:38] PROBLEM - puppet last run on mw1256 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:38] PROBLEM - puppet last run on analytics1033 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:38] PROBLEM - puppet last run on elastic1031 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on es1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on mw1216 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on logstash1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on ms-be1013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:47] PROBLEM - puppet last run on mw1063 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:48] PROBLEM - puppet last run on analytics1019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:48] PROBLEM - puppet last run on wtp1021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:49] PROBLEM - puppet last run on pc1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:49] PROBLEM - puppet last run on mw1041 is CRITICAL: 
CRITICAL: Puppet has 1 failures [06:53:49] for fuck's sake [06:53:50] PROBLEM - puppet last run on mw1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:50] PROBLEM - puppet last run on dbproxy1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:55] PROBLEM - puppet last run on ms-be1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:55] PROBLEM - puppet last run on mc1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:56] PROBLEM - puppet last run on ocg1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:56] PROBLEM - puppet last run on db1010 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:56] PROBLEM - puppet last run on wtp1024 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:56] PROBLEM - puppet last run on mw1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:56] PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:57] PROBLEM - puppet last run on lvs4002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:53:57] PROBLEM - puppet last run on mw1031 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:10] PROBLEM - puppet last run on mw1028 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:12] PROBLEM - puppet last run on mc1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:14] PROBLEM - puppet last run on mw1080 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:14] PROBLEM - puppet last run on mw1115 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:14] PROBLEM - puppet last run on zirconium is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:14] PROBLEM - puppet last run on mw1147 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:14] PROBLEM - puppet last run on cp1047 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:16] RECOVERY - puppet last run on fluorine is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [06:54:19] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:19] PROBLEM - puppet last run on wtp1009 
is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:19] PROBLEM - puppet last run on mw1089 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:19] PROBLEM - puppet last run on amslvs2 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:19] RECOVERY - puppet last run on amssq38 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:54:22] RECOVERY - puppet last run on labsdb1004 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [06:54:22] RECOVERY - puppet last run on db1054 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:54:23] PROBLEM - puppet last run on mw1134 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:23] PROBLEM - puppet last run on mw1233 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:23] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:23] PROBLEM - puppet last run on cerium is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:33] PROBLEM - puppet last run on db2035 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:34] PROBLEM - puppet last run on cp3013 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:34] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:34] PROBLEM - puppet last run on mw1048 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:34] PROBLEM - puppet last run on mw1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:35] PROBLEM - puppet last run on analytics1017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:35] PROBLEM - puppet last run on ms-be2006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:36] PROBLEM - puppet last run on mw1192 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:43] PROBLEM - puppet last run on amssq43 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:43] PROBLEM - puppet last run on db1053 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:44] PROBLEM - puppet last run on mw1141 is CRITICAL: CRITICAL: Puppet has 1 
failures [06:54:44] PROBLEM - puppet last run on db1031 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:44] PROBLEM - puppet last run on ms-fe1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:53] PROBLEM - puppet last run on db1029 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on mw1062 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on cp1070 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on mw1240 is CRITICAL: CRITICAL: Puppet has 1 failures [06:54:54] PROBLEM - puppet last run on mw1197 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on wtp1017 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on mw1072 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on neptunium is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on mw1067 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on sca1002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:07] PROBLEM - puppet last run on mc1011 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:14] PROBLEM - puppet last run on cp1039 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:14] PROBLEM - puppet last run on mw1250 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:14] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:14] RECOVERY - puppet last run on cp1068 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [06:55:24] PROBLEM - puppet last run on ms-fe2004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:24] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: Puppet has 1 failures 
[06:55:24] PROBLEM - puppet last run on amssq32 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:25] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:25] PROBLEM - puppet last run on es2008 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:25] PROBLEM - puppet last run on cp4006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:26] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:26] PROBLEM - puppet last run on cp3006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:26] PROBLEM - puppet last run on vanadium is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:33] _joe_: woken up yet? [06:55:33] PROBLEM - puppet last run on mw1106 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:33] PROBLEM - puppet last run on elastic1012 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:33] PROBLEM - puppet last run on magnesium is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:33] PROBLEM - puppet last run on uranium is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:33] PROBLEM - puppet last run on ms-be1006 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:34] PROBLEM - puppet last run on mw1226 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:43] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:46] PROBLEM - puppet last run on db2019 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:47] PROBLEM - puppet last run on mw1045 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:47] PROBLEM - puppet last run on elastic1021 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:47] PROBLEM - puppet last run on search1020 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:48] PROBLEM - puppet last run on mw1088 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:48] RECOVERY - puppet last run on chromium is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [06:55:48] PROBLEM - puppet last run on db2009 is CRITICAL: 
CRITICAL: Puppet has 1 failures [06:55:48] PROBLEM - puppet last run on cp1055 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:51] (03PS1) 10Faidon Liambotis: Remove more excess Service provider => upstarts [puppet] - 10https://gerrit.wikimedia.org/r/179073 [06:55:53] (03PS1) 10Faidon Liambotis: webserver: add || debian >= jessie to os_version [puppet] - 10https://gerrit.wikimedia.org/r/179074 [06:55:53] PROBLEM - puppet last run on mw1060 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:53] RECOVERY - puppet last run on mw1219 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:55:54] PROBLEM - puppet last run on db2039 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:54] PROBLEM - puppet last run on amssq44 is CRITICAL: CRITICAL: Puppet has 1 failures [06:55:55] (03PS1) 10Faidon Liambotis: apache: add || debian >= jessie to os_version [puppet] - 10https://gerrit.wikimedia.org/r/179075 [06:55:57] (03PS1) 10Faidon Liambotis: ganglia: remove pre-trusty/tmpfs support hacks [puppet] - 10https://gerrit.wikimedia.org/r/179076 [06:55:59] (03PS1) 10Faidon Liambotis: ganglia::web: remove ServerAlias [puppet] - 10https://gerrit.wikimedia.org/r/179077 [06:56:01] (03PS1) 10Faidon Liambotis: url_downloader: remove pre-precise/squid 2 compat [puppet] - 10https://gerrit.wikimedia.org/r/179078 [06:56:03] (03PS1) 10Faidon Liambotis: install-server: remove pre-precise/squid 2 compat [puppet] - 10https://gerrit.wikimedia.org/r/179079 [06:56:05] (03PS1) 10Faidon Liambotis: Remove apt::pin for squid packages [puppet] - 10https://gerrit.wikimedia.org/r/179080 [06:56:05] PROBLEM - puppet last run on xenon is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - puppet last run on lvs3001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - puppet last run on mw1009 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - 
puppet last run on db1050 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - puppet last run on db1065 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:06] PROBLEM - puppet last run on elastic1004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:07] (03PS1) 10Faidon Liambotis: Add a new squid3 module and replace in-grown use [puppet] - 10https://gerrit.wikimedia.org/r/179081 [06:56:09] (03PS1) 10Faidon Liambotis: Run "apt-get update" outside of/before puppet [puppet] - 10https://gerrit.wikimedia.org/r/179082 [06:56:11] (03PS1) 10Faidon Liambotis: toollabs: DTRT with both trusty and >= trusty [puppet] - 10https://gerrit.wikimedia.org/r/179083 [06:56:13] (03PS1) 10Faidon Liambotis: Fix Labs ldap/ssh for Debian [puppet] - 10https://gerrit.wikimedia.org/r/179084 [06:56:13] PROBLEM - puppet last run on cp1049 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:13] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:13] PROBLEM - puppet last run on mw1174 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:16] PROBLEM - puppet last run on snapshot1003 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] RECOVERY - puppet last run on analytics1014 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:56:18] PROBLEM - puppet last run on mw1200 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] PROBLEM - puppet last run on lead is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] PROBLEM - puppet last run on db2002 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:18] RECOVERY - puppet last run on mw1021 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [06:56:19] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:19] PROBLEM - 
puppet last run on elastic1027 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:20] RECOVERY - puppet last run on rcs1002 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:56:20] PROBLEM - puppet last run on caesium is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:24] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:27] PROBLEM - puppet last run on mw1224 is CRITICAL: CRITICAL: Puppet has 1 failures [06:56:27] PROBLEM - puppet last run on logstash1001 is CRITICAL: CRITICAL: Puppet has 1 failures [06:58:01] so [06:58:03] https://gerrit.wikimedia.org/r/179082 [06:58:09] Run "apt-get update" outside of/before puppet [06:58:13] would fix this [06:58:20] reviews welcome [06:58:56] ori: this also fixes the issue you tried to fix with apt that I had to revert because of the dependency loop [07:05:42] (03CR) 10Faidon Liambotis: [C: 032] "Trivial" [puppet] - 10https://gerrit.wikimedia.org/r/179073 (owner: 10Faidon Liambotis) [07:05:52] (03CR) 10Faidon Liambotis: [C: 032] "Trivial" [puppet] - 10https://gerrit.wikimedia.org/r/179074 (owner: 10Faidon Liambotis) [07:05:54] (03CR) 10Andrew Bogott: [C: 032] Fix Labs ldap/ssh for Debian [puppet] - 10https://gerrit.wikimedia.org/r/179084 (owner: 10Faidon Liambotis) [07:06:07] oh hey andrewbogott [07:06:14] I'm not used to seeing you around this time of the day :) [07:06:25] (03CR) 10Faidon Liambotis: [C: 032] apache: add || debian >= jessie to os_version [puppet] - 10https://gerrit.wikimedia.org/r/179075 (owner: 10Faidon Liambotis) [07:06:30] me neither :) [07:06:48] paravoid: Is nslcd on your list? The template there has lots of comparisons that don't work at all on debian. 
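[editor's note] The fix proposed above in https://gerrit.wikimedia.org/r/179082 — run "apt-get update" outside of/before puppet so Package resources never see a stale index — can be sketched as a tiny wrapper. This is a minimal illustration of the ordering only, not the actual patch: `apt_get_update` and `puppet_agent` are stand-in stub functions (the real commands would be `apt-get -qq update` and `puppet agent --onetime`).

```shell
# Stand-in stubs so the ordering is visible without touching a real system;
# these are NOT the real change in 179082, just an illustration of its idea.
apt_get_update() { echo "apt-get update"; }
puppet_agent()   { echo "puppet agent --onetime --no-daemonize"; }

run_agent() {
    # refresh the package index first, but never let a flaky mirror
    # block the agent run itself
    apt_get_update || true
    puppet_agent
}
run_agent
```

The `|| true` is the important design choice: a failed index refresh degrades to the old behaviour instead of adding a new failure mode to every agent run.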
[07:07:09] not yet [07:07:41] uniquemember [07:07:47] oops [07:08:12] 'uniquemember' does not work on Jessie, so anytime puppet runs it basically prevents all future logins :) [07:09:50] (03PS2) 10Faidon Liambotis: Fix Labs ldap/ssh for Debian [puppet] - 10https://gerrit.wikimedia.org/r/179084 [07:10:05] andrewbogott: try that +2 again :) [07:11:02] andrewbogott: do we have any lucids in Labs? [07:11:06] I guess not? [07:11:17] I'm pretty sure I killed them all during the migration to eqiad [07:11:23] if so, the fix is really easy, as all the nslcd checks are for 12.04 [07:11:52] True -- i'll just pull out the tests entirely. [07:13:42] andrewbogott: want to re+2 https://gerrit.wikimedia.org/r/#/c/179084/ ? [07:13:53] andrewbogott: and I'll fix those 12.04 checks for you ;) [07:14:35] isn't it merged already? [07:14:41] no, it needed a rebase [07:15:07] yeah, but then I merged it [07:15:11] Or so gerrit says [07:15:24] oh, sorry [07:15:30] grrrit-wm didn't say anything for some weird reason [07:16:42] <_joe_> paravoid: yes I am, just taking a walk earlier [07:16:53] <_joe_> paravoid: need something? [07:17:05] https://gerrit.wikimedia.org/r/179082 [07:17:17] it has the potential of breaking puppet across the fleet [07:17:18] <_joe_> (and yes, andrewbogott here in my morning is nice, hi Andrew!) [07:17:21] so I need careful review :) [07:17:30] <_joe_> and you ask me? [07:17:30] hi! [07:17:32] <_joe_> :P [07:17:41] <_joe_> paravoid: oh yes [07:17:52] <_joe_> (just seen the topic, I'm already sold [07:18:21] paravoid: if you have a moment this morning… I'm still stuck re: the new hp virt servers. pxe works but then the trusty installer declares that dhcp is broken. [07:18:26] (03PS1) 10Faidon Liambotis: Remove Ubuntu >= 12.04 conditionals for Labs [puppet] - 10https://gerrit.wikimedia.org/r/179086 [07:19:06] andrewbogott: ^ [07:20:01] (03CR) 10Andrew Bogott: [C: 04-1] "Thanks! One mistake where the check is < rather than >." 
(031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179086 (owner: 10Faidon Liambotis) [07:20:26] oh you're right of course [07:20:27] silly me [07:20:43] the comment also says sudo-ldap, which we don't anymore, right? [07:21:37] Hm, I'm not sure. It's certainly still installed everywhere. [07:21:37] (03PS2) 10Faidon Liambotis: Remove Ubuntu >= 12.04 conditionals for Labs [puppet] - 10https://gerrit.wikimedia.org/r/179086 [07:22:36] (03CR) 10Andrew Bogott: [C: 032] Remove Ubuntu >= 12.04 conditionals for Labs [puppet] - 10https://gerrit.wikimedia.org/r/179086 (owner: 10Faidon Liambotis) [07:22:55] awesome [07:24:04] RECOVERY - puppet last run on mw1028 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:09] still? [07:24:16] RECOVERY - puppet last run on zirconium is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [07:24:18] RECOVERY - puppet last run on analytics1025 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:18] RECOVERY - puppet last run on cp1047 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:22] oh fuck off [07:24:33] PROBLEM - puppet last run on calcium is CRITICAL: CRITICAL: Puppet has 1 failures [07:24:35] RECOVERY - puppet last run on analytics1040 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:40] RECOVERY - puppet last run on mw1012 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:41] RECOVERY - puppet last run on ms-be2006 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [07:24:43] RECOVERY - puppet last run on mw1160 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:48] I don't think bots respond to insults [07:24:53] (03CR) 10Faidon Liambotis: [C: 032] Add mobile subdomains to wikidata.org [dns] - 10https://gerrit.wikimedia.org/r/179037 (owner: 10MaxSem) [07:24:55] 
RECOVERY - puppet last run on db1031 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:24:55] RECOVERY - puppet last run on mw1141 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [07:24:55] RECOVERY - puppet last run on mw1069 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [07:24:55] RECOVERY - puppet last run on db1029 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [07:24:55] RECOVERY - puppet last run on mw1176 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [07:24:56] RECOVERY - puppet last run on wtp1016 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [07:24:56] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [07:24:56] RECOVERY - puppet last run on db1002 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [07:24:56] RECOVERY - puppet last run on mc1002 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [07:24:57] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [07:25:03] RECOVERY - puppet last run on db2034 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [07:29:11] (03PS1) 10Faidon Liambotis: Remove provider => upstart from Service [puppet/kafkatee] - 10https://gerrit.wikimedia.org/r/179087 [07:29:28] (03CR) 10Faidon Liambotis: [C: 032] Remove provider => upstart from Service [puppet/kafkatee] - 10https://gerrit.wikimedia.org/r/179087 (owner: 10Faidon Liambotis) [07:32:41] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "1 comment, apart from that, I do agree with the philosophy and the implementation." 
(031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179082 (owner: 10Faidon Liambotis) [07:33:05] _joe_: and you are correct of course [07:34:17] (03Abandoned) 10Giuseppe Lavagetto: ssl_ciphersuite: remove RC4 [puppet] - 10https://gerrit.wikimedia.org/r/178488 (owner: 10Giuseppe Lavagetto) [07:34:48] <_joe_> paravoid: I was wondering if we shouldn't move the apt resources to the first stage [07:35:06] <_joe_> I mean the apt class and the most important ones [07:35:11] Preparing to replace libc6 2.11.1-0ubuntu7.10 (using .../libc6_2.11.1-0ubuntu7.19_amd64.deb) ... [07:35:14] [: 399: Illegal number: 3.16-3-amd64 [07:35:15] upgrading my lucid chroot [07:35:15] heh [07:35:18] /var/lib/dpkg/tmp.ci/preinst: 399: arithmetic expression: expecting EOF: "3.16-3-amd64" [07:35:21] dpkg: error processing /var/cache/apt/archives/libc6_2.11.1-0ubuntu7.19_amd64.deb (--unpack): [07:35:32] "setarch x86_64 --uname-2.6 chroot /srv/chroots/64lucid/" to the rescue [07:35:41] Linux gearloose 2.6.56-3-amd64 #1 SMP Debian 3.16.5-1 (2014-10-10) x86_64 GNU/Linux [07:48:58] (03CR) 10Giuseppe Lavagetto: [C: 04-2] "Things I like:" [puppet] - 10https://gerrit.wikimedia.org/r/179027 (owner: 10Ori.livneh) [08:01:11] (03CR) 10Giuseppe Lavagetto: [C: 04-2] "The correct way to do this is to run the prune script in the pre-start stanza of our upstart job for hhvm. Let's do that." [puppet] - 10https://gerrit.wikimedia.org/r/178794 (owner: 10Ori.livneh) [08:02:23] <_joe_> I think we have a lot of things mixed up between a global "hhvm" puppet module and what we use for mediawiki [08:02:32] <_joe_> that is also the reason why it' [08:02:42] <_joe_> s so hard to use it anywhere but there [08:02:52] * _joe_ spring cleaning [08:15:25] (03PS1) 10Faidon Liambotis: Multiple cleanups [debs/quickstack] - 10https://gerrit.wikimedia.org/r/179089 [08:15:27] _joe_: ^ [08:17:06] <_joe_> paravoid: oh thanks, I did that in quite a hurry tbh [08:19:18] ok to merge? 
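[editor's note] The lucid-chroot failure pasted above (`[: 399: Illegal number: 3.16-3-amd64`) is the classic case of an old maintainer script feeding a modern `uname -r` string into `[`'s integer comparison. The exact preinst logic is assumed here; this sketch only reproduces the failure mode, and the comment shows the `setarch --uname-2.6` workaround quoted in the log:

```shell
# Assumed shape of the failing check: a legacy preinst compares the kernel
# version numerically, which only works when the string is a bare integer.
kernel="3.16-3-amd64"
if [ "$kernel" -ge 2 ] 2>/dev/null; then
    echo "kernel new enough"
else
    echo "comparison failed: not an integer"
fi

# The workaround used above: fake a 2.6-style uname inside the chroot so
# the legacy scripts parse something they understand, e.g.:
#   setarch x86_64 --uname-2.6 chroot /srv/chroots/64lucid apt-get dist-upgrade
```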
[08:19:20] <_joe_> paravoid: "Drop elfutils build-dependency as it's superfluous" [08:19:23] <_joe_> are you sure? [08:19:26] yes [08:19:31] libelf-dev is what you need [08:19:37] <_joe_> meh, right [08:19:55] (03CR) 10Giuseppe Lavagetto: [C: 031] Multiple cleanups [debs/quickstack] - 10https://gerrit.wikimedia.org/r/179089 (owner: 10Faidon Liambotis) [08:20:34] <_joe_> (and closes #1234 is there from the helper I used, I don't remember which one - I don't care about lintian bogus warnings) [08:22:12] (03CR) 10Faidon Liambotis: [C: 032 V: 032] Multiple cleanups [debs/quickstack] - 10https://gerrit.wikimedia.org/r/179089 (owner: 10Faidon Liambotis) [08:22:28] _joe_: if you want this to land in Debian, I'd be happy to upload it [08:26:34] <_joe_> paravoid: yes! [08:26:44] <_joe_> I think quickstack is truly useful [08:26:46] (03PS1) 10Faidon Liambotis: Add quickstack to all-distros standard-packages [puppet] - 10https://gerrit.wikimedia.org/r/179092 [08:27:03] _joe_: file an ITP, add it to d/changelog [08:27:33] <_joe_> paravoid: will do :) [08:30:27] (03CR) 10Alexandros Kosiaris: [C: 04-1] "No, kill the entire ganglia::web class. It has been replaced by ganglia_new::web and more importantly role::ganglia::web in 6836e69 and it" [puppet] - 10https://gerrit.wikimedia.org/r/179076 (owner: 10Faidon Liambotis) [08:31:25] (03CR) 10Alexandros Kosiaris: [C: 04-2] "Niah, kill the entire ganglia::web class as proposed in https://gerrit.wikimedia.org/r/#/c/179076/" [puppet] - 10https://gerrit.wikimedia.org/r/179077 (owner: 10Faidon Liambotis) [08:32:03] (03PS2) 10Faidon Liambotis: Add quickstack to all-distros standard-packages [puppet] - 10https://gerrit.wikimedia.org/r/179092 [08:32:16] <_joe_> paravoid: we should do something about the hhvm package in debian... [08:32:23] it's in NEW waiting... 
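[editor's note] On the template comparisons discussed earlier (the nslcd checks that "don't work at all on debian", and the "< rather than >" review comment): one way such checks misfire is treating version strings as plain strings. A minimal sketch of the difference, using `sort -V` for version-aware ordering (the "8" vs "12.04" pair is an illustrative Debian-release-number vs Ubuntu-release-number example, not the actual template code):

```shell
# Plain lexical comparison gets versions wrong: "8" sorts after "12.04"
if [ "8" \> "12.04" ]; then
    echo "lexical: 8 > 12.04 (wrong for versions)"
fi

# Version-aware ordering via GNU sort -V gets it right: 8 < 12.04
smallest="$(printf '8\n12.04\n' | sort -V | head -n1)"
echo "version-aware smallest: $smallest"
```

On a Debian/Ubuntu host, `dpkg --compare-versions 8 lt 12.04` does the same job with package-version semantics.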
[08:32:35] (03CR) 10Faidon Liambotis: [C: 032] Add quickstack to all-distros standard-packages [puppet] - 10https://gerrit.wikimedia.org/r/179092 (owner: 10Faidon Liambotis) [08:32:36] <_joe_> sigh [08:32:57] (03CR) 10Faidon Liambotis: [V: 032] Add quickstack to all-distros standard-packages [puppet] - 10https://gerrit.wikimedia.org/r/179092 (owner: 10Faidon Liambotis) [08:32:57] <_joe_> paravoid: did you rebuild the package for anything besides trusty? [08:33:03] yes [08:33:05] <_joe_> because I didn't [08:33:05] lucid, precise, jessie [08:33:10] <_joe_> :))) [08:33:15] <_joe_> <3 [08:33:19] that's why I found all these issues too [08:33:30] <_joe_> ok [08:33:35] but I first built lucid & precise, then fixed all of them [08:33:40] so now lucid/precise/trusty have 1.0-1 [08:33:57] and jessie has 20121211-1 [08:34:05] <_joe_> ok [08:34:20] (well 1.0-1~lucid1, 1.0-1~precise1, 1.0-1, 20121211-1~bpo8+1) [08:35:25] akosiaris: ganglia was functional in labs at some point [08:35:35] I think it's not right now, but I'm not sure if it will be revived? [08:35:48] andrewbogott, YuviPanda: any future plans to have ganglia in labs? [08:36:05] (03CR) 10Alexandros Kosiaris: [C: 04-1] "LGTM, one minor pedantic lint issue. Glad to see our manifests getting simpler" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179078 (owner: 10Faidon Liambotis) [08:36:21] paravoid: not really. [08:37:03] just diamond+graphite for the foreeseable future [08:37:17] (03CR) 10Faidon Liambotis: url_downloader: remove pre-precise/squid 2 compat (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179078 (owner: 10Faidon Liambotis) [08:39:42] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. [08:40:58] (03CR) 10Alexandros Kosiaris: [C: 04-1] "LGTM, minor pedantic issues." 
(032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/179079 (owner: 10Faidon Liambotis) [08:41:33] akosiaris: look at the whole patchset first, these are very shortlived [08:41:54] I'm just doing one commit per functional for the sake of clarity [08:42:05] hmm, ok [08:42:21] no need for pinning anymore, I reprepro removed squid from precise-wikimedia earlier [08:43:12] (03PS1) 10Steinsplitter: Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179093 [08:43:17] (03CR) 10jenkins-bot: [V: 04-1] Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179093 (owner: 10Steinsplitter) [08:43:19] (03CR) 10Alexandros Kosiaris: [C: 032] Remove apt::pin for squid packages [puppet] - 10https://gerrit.wikimedia.org/r/179080 (owner: 10Faidon Liambotis) [08:44:11] (03Abandoned) 10Steinsplitter: Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179093 (owner: 10Steinsplitter) [08:47:18] (03PS1) 10Steinsplitter: Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179094 [08:48:56] (03PS2) 10Steinsplitter: Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179094 [08:49:58] (03PS1) 10Faidon Liambotis: Kill ganglia::web, replaced by ganglia_new::web [puppet] - 10https://gerrit.wikimedia.org/r/179095 [08:50:00] (03PS1) 10Faidon Liambotis: ganglia: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179096 [08:50:02] (03PS1) 10Faidon Liambotis: ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 [08:50:05] akosiaris: ^^^ [08:50:57] (03CR) 10jenkins-bot: [V: 04-1] ganglia: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179096 (owner: 10Faidon Liambotis) [08:51:04] (03CR) 10jenkins-bot: [V: 04-1] ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 (owner: 
10Faidon Liambotis) [08:51:32] paravoid: jenkins doesn't love you :P [08:52:19] (03PS2) 10Faidon Liambotis: ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 [08:52:21] (03PS2) 10Faidon Liambotis: ganglia: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179096 [08:53:08] (03CR) 10jenkins-bot: [V: 04-1] ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 (owner: 10Faidon Liambotis) [08:53:13] oh ffs [08:54:42] (03PS3) 10Faidon Liambotis: ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 [08:54:44] (03PS1) 10Faidon Liambotis: ganglia_new::web: switch to apache-2.4 ssl ciphers [puppet] - 10https://gerrit.wikimedia.org/r/179098 [08:55:10] (03CR) 10Faidon Liambotis: [V: 032] Remove provider => upstart from Service [puppet/kafkatee] - 10https://gerrit.wikimedia.org/r/179087 (owner: 10Faidon Liambotis) [08:55:36] (03CR) 10Alexandros Kosiaris: [C: 032] Add a new squid3 module and replace in-grown use [puppet] - 10https://gerrit.wikimedia.org/r/179081 (owner: 10Faidon Liambotis) [09:01:24] (03CR) 10Alexandros Kosiaris: [C: 032] Kill ganglia::web, replaced by ganglia_new::web [puppet] - 10https://gerrit.wikimedia.org/r/179095 (owner: 10Faidon Liambotis) [09:01:47] PROBLEM - puppet last run on d-i-test is CRITICAL: CRITICAL: Puppet has 1 failures [09:03:22] (03PS3) 10Steinsplitter: Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179094 [09:03:39] mark: Can I get a hand with these new virt servers? The trusty installer says that dhcp isn't set up properly… I've no idea if there's an actual network issue or what. [09:04:06] (I'm not totally sure that Trusty will work with our OpenStack version, but I'd like to give it a try) [09:04:57] <_joe_> andrewbogott: I reinstalled 230 servers in the last week, so I guess the network and dhcp should be fine - maybe the dhcp config is wrong? 
[09:05:13] _joe_: this is on the labs subnet, so possibly a different setup. [09:05:27] But, yeah, probably the dhcp config /is/ wrong, but I don't know what's wrong about it. [09:05:35] I wouldn't expect pxe to work at all if it were too wrong... [09:12:41] _joe_: there's nothing to a dhcp config other than a mac address, is there? [09:14:08] andrewbogott: actually there is, but it should be automatic. Wanna give me an example of what fails ? [09:14:16] hm, I guess it could pxe boot on something other than eth0, and then the trusty installer finds its primary nic broken [09:14:27] akosiaris: sure [09:15:43] akosiaris: virt1010, 1011 and 1012 all fail. [09:15:49] Well, they work well enough for pxe to start. [09:15:51] But fail subsequently. [09:16:15] ok, I will reboot virt1010 to debug it, is that ok ? [09:16:21] yep! [09:16:29] thank you [09:16:38] don't mention it [09:17:08] akosiaris: some notes about hp mgmt here: https://wikitech.wikimedia.org/wiki/HP_DL3N0 [09:17:12] oh... hp... [09:17:15] * akosiaris sigh [09:17:58] andrewbogott: thanks, that will prove handy [09:18:23] good morning [09:18:58] akosiaris: right after reset there's a dramatic pause -- may feel like a hang but it gets going after a few. [09:26:36] greetings [09:27:24] http://bellard.org/bpg/ [09:28:12] I am always impressed... 
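For reference, the per-host DHCP entry being debugged above is little more than a MAC-to-boot-file mapping. A sketch of an ISC dhcpd host stanza (all values here are illustrative, not the actual cluster config):

```
host virt1010 {
    hardware ethernet f0:92:1c:05:6b:30;  # MAC the server is expected to DHCP from
    fixed-address virt1010.example.wmnet; # hostname/address are assumptions
    filename "pxelinux.0";                # boot loader handed to PXE clients
}
```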
[09:29:54] (03CR) 10Rillke: [C: 031] Adding *.wmflabs.org to wgCopyUploadsDomains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179094 (owner: 10Steinsplitter) [09:35:08] (03PS1) 10Giuseppe Lavagetto: mediawiki: enhancements to hhvm_cleanup_cache [puppet] - 10https://gerrit.wikimedia.org/r/179102 [09:35:21] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 623 [09:35:49] (03PS1) 10Yuvipanda: Add gitreview [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179103 [09:36:13] <_joe_> ciao godog [09:36:51] (03PS1) 10Yuvipanda: Add .gitignore [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179104 [09:37:35] hey _joe_ [09:37:53] <_joe_> I don't own guns [09:38:04] andrewbogott: so you are right. It seems PXE DHCPs from a different card than d-i DHCPs from [09:38:35] systemd ftw :P [09:38:42] akosiaris: ok, so that means it needs to be cabled up to a different interface, and we need a different MAC in the dhcp config, correct? [09:39:14] not sure that would work either [09:39:24] so, these boxes, got 2 internal ifces [09:39:31] and 2 via a PCI card or something ? 
[09:39:45] (03CR) 10Yuvipanda: [C: 032 V: 032] Add gitreview [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179103 (owner: 10Yuvipanda) [09:39:48] I think 2 10g interfaces built in and 4 1g on a card [09:39:57] (03CR) 10Yuvipanda: [C: 032 V: 032] Add .gitignore [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179104 (owner: 10Yuvipanda) [09:39:59] But I'm pretty sure they're connected to the outside network via a 1g [09:40:20] RECOVERY - check_mysql on db1008 is OK: Uptime: 4911205 Threads: 82 Questions: 125285290 Slow queries: 32913 Opens: 94136 Flush tables: 2 Open tables: 64 Queries per second avg: 25.510 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [09:40:47] akosiaris: so maybe we need to disable the 10g nics in the bios [09:41:35] RECOVERY - Disk space on fluorine is OK: DISK OK [09:42:27] andrewbogott: entirely ? or just the pxe part ? [09:42:41] entirely [09:43:00] As I think they're not connected to anything, and most likely trusty wants the 10g interfaces to be eth0 and eth1 [09:43:08] do you know the mac that it was using? [09:43:26] f0:92:1c:05:6b:30 [09:43:30] well, that is eth0 [09:43:35] it is the first 10g [09:44:52] yep, and if you watch the startup screen, pxe is happening via 40 A8 F0 38 06 40 [09:45:25] so, we need to either cable up a 10g or disable them. My understanding is that the labs subnet can't currently talk to 10g ports [09:48:04] RECOVERY - https://phabricator.wikimedia.org on iridium is OK: HTTP OK: HTTP/1.1 200 OK - 17450 bytes in 0.319 second response time [09:48:47] ok, so I 've encountered this before [09:48:57] gimme 5 mins to refresh my memory... [09:49:09] ok [09:56:34] can you change the order of boot devices in the bios so that the 10g nic comes before the 1g? or disable the 1g for the install? 
[09:57:51] or change the pci initialization order :P [09:58:11] jgage: init order might help [10:01:34] the problem is not the boot order [10:01:59] d-i is reinitializing all the network cards and reexecuting the DHCP [10:02:12] oh *that* problem [10:02:12] from a different card, that is the problem... [10:02:13] bleh [10:02:41] I was looking again at biosdevname [10:03:21] last time I met the issue the order was vice versa... 4x1G cards on board and 2x10G cards on extra card [10:03:54] so it had helped then... but now, if we don't use the on board cards... I don't think it can [10:04:36] akosiaris: I'm a bit surprised that a 10g nic can't talk to a 1g network. Is that really right? [10:04:46] is it connected ? [10:04:53] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: puppet fail [10:05:12] no, but it's not connected /because/ of thinking it wouldn't work [10:06:11] I'm trying to understand why we didn't just plug the cable into the 10g (which all components agree /should/ be eth0) and just have done. [10:06:44] copper vs fiber perhaps ? [10:07:26] oh, maybe [10:07:47] RECOVERY - puppet last run on virt1005 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [10:08:04] I am running a manual dhclient on eth2 to download some tools and see what is the config there [10:08:37] ok [10:09:15] paravoid, you were at least briefly involved in the config of these boxes… do you have anything to add? [10:10:51] I can't understand why it detects a link on eth0 though... [10:11:05] we have that in d-i [10:11:08] # Select interface that has a link [10:11:08] d-i netcfg/choose_interface select auto [10:11:14] but ... noooo... [10:21:33] not sure if it can help with installs, but have you guys seen /lib/udev/write_net_rules which generates mac-to-ethX mappings in /etc/udev/rules.d/70-persistent-net.rules ? what a place for a script :P [10:23:33] yeah, that is after the install...
in d-i it is /etc/udev/rules.d/010_net-hotplug.rules and it has "SUBSYSTEM=="net", RUN+="/etc/hotplug.d/net/hw-detect.hotplug" [10:24:13] which does not do much... [10:24:21] bleh [10:24:30] case $ACTION in [10:24:30] add|register) [10:24:30] log "Detected hotpluggable network interface $INTERFACE" [10:24:30] mkdir -p /etc/network [10:24:31] echo "$INTERFACE" >>/etc/network/devhotplug [10:24:32] ;; [10:24:33] esac [10:24:34] that is about it .... :-( [10:24:51] akosiaris: I'm not 100% positive that Trusty will work for my use case, so if this starts to look trusty-specific I'll switch to precise [10:28:33] ok, so supposedly interface=eth2 should work... let's see if it does [10:28:49] <_joe_> win 18 [10:28:53] <_joe_> even 20 [10:31:19] akosiaris: so when that box is fully installed and booted it will still show eth0 and eth1 as disconnected [10:31:21] ? [10:32:01] I think so... [10:32:33] let's see [10:33:27] Hm, I bet the existing puppet code won't be able to handle that. I can customize… but would be nicer to just have eth0. [10:33:35] (I realize you're experimenting atm, just thinking ahead.) [10:34:22] springle: around? [10:40:45] hmm, can’t connect to the labsdb upstream databases over the network. [10:40:57] (03CR) 10Dzahn: [C: 032] stats.wm.org: use ssl_ciphersuite [puppet] - 10https://gerrit.wikimedia.org/r/178833 (owner: 10Dzahn) [10:45:31] (03CR) 10Dzahn: "since we also enabled STS we need to load mod_headers or Invalid command 'Header'" [puppet] - 10https://gerrit.wikimedia.org/r/178833 (owner: 10Dzahn) [10:45:44] PROBLEM - puppet last run on ms-be2015 is CRITICAL: CRITICAL: Puppet has 1 failures [10:47:02] YuviPanda: ahoy hoy [10:47:47] springle: hey! just realized that the upstream dbs for labsdb (db1069, etc) don’t have mysql listening on network. 
[10:47:52] that, or I am not looking properly [10:50:45] (03PS1) 10Dzahn: stats.wm: load mod_headers in Apache for STS [puppet] - 10https://gerrit.wikimedia.org/r/179107 [10:51:22] (03PS1) 10Giuseppe Lavagetto: hhvm: make the puppet module more configurable [puppet] - 10https://gerrit.wikimedia.org/r/179108 [10:52:19] (03CR) 10Dzahn: [C: 032] stats.wm: load mod_headers in Apache for STS [puppet] - 10https://gerrit.wikimedia.org/r/179107 (owner: 10Dzahn) [10:53:56] (03CR) 10Dzahn: "Apache::Mod::Headers/Apache::Mod_conf[headers]/Exec[ensure_present_mod_headers]/returns: executed successfully" [puppet] - 10https://gerrit.wikimedia.org/r/179107 (owner: 10Dzahn) [10:54:30] (03CR) 10Dzahn: "needed https://gerrit.wikimedia.org/r/#/c/179107/" [puppet] - 10https://gerrit.wikimedia.org/r/178833 (owner: 10Dzahn) [10:55:14] (03CR) 10Hashar: "I noticed your addition of" [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [10:56:00] (03CR) 10Dzahn: "that said, there is no http->https enforcing redirect here yet" [puppet] - 10https://gerrit.wikimedia.org/r/178833 (owner: 10Dzahn) [10:58:49] (03CR) 10Dzahn: move mediawiki maintenance scripts to module (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/178873 (owner: 10Dzahn) [10:59:07] (03PS1) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [10:59:26] (03PS4) 10Hashar: Basic rspec setup [puppet] - 10https://gerrit.wikimedia.org/r/178810 [10:59:53] (03CR) 10Dzahn: "Yuvipanda: indirect IRC ping via gerrit :)" [puppet] - 10https://gerrit.wikimedia.org/r/178835 (owner: 10Dzahn) [11:00:34] RECOVERY - puppet last run on ms-be2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:00:52] (03CR) 10Yuvipanda: "This is unfortunately going to be more complicated, since misc::labsdebrepo was a role you could pick in wikitech, and lots of hosts do ha" [puppet] - 10https://gerrit.wikimedia.org/r/178835 
(owner: 10Dzahn) [11:00:54] (03PS4) 10Giuseppe Lavagetto: admins: use concat() instead of an inline template [puppet] - 10https://gerrit.wikimedia.org/r/177757 [11:02:24] (03CR) 10Dzahn: [C: 04-2] "oh, ok, thanks..hmm yea, just part of a general attempt to remove the entire manifests/misc/ if possible" [puppet] - 10https://gerrit.wikimedia.org/r/178835 (owner: 10Dzahn) [11:03:38] andrewbogott: hahaha... so no eth0 is a 10g, eth1 the first 1g and eth5 the second 10g ... [11:03:42] now* [11:03:58] I was actually afraid of something similar but this surpassed my expectations :-( [11:04:03] wow [11:04:23] anyway, this should be fixable with biosdevname=1 [11:06:11] (03CR) 10Hashar: "PS4 uses module_path='modules', i.e. no more rely on having a clone of labs/private.git" [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [11:06:23] (03CR) 10Yuvipanda: "So the way to do this is to..." [puppet] - 10https://gerrit.wikimedia.org/r/178835 (owner: 10Dzahn) [11:06:50] I wonder if there's something smart we could do with said udev rules, e.g. set some kind of priority/preference [11:07:40] godog: in d-i ? [11:08:51] akosiaris: yep, to set some kind of consistency [11:09:00] I am wondering too... can't say I see much space for doing stuff in d-i ... [11:09:17] I actively dislike d-i btw [11:10:19] yeah I'm not a fan of partman mainly, I think because it is hard to test (and write) recipes [11:11:10] <_joe_> godog: partman is the sendmail of the 21st century [11:11:21] <_joe_> (I think brandon said this, not sure) [11:11:44] (03CR) 10Dzahn: "AndrewBogott: did you still want to follow-up on this one? i'd also be fine abandoning it i think, because after the rebase what is left i" [puppet] - 10https://gerrit.wikimedia.org/r/173470 (owner: 10Dzahn) [11:11:45] hehehe no m4 involved yet [11:12:37] well FAI's setup-storage is friendlier, but also simpler... it is debuggable though [11:12:59] FAI though is not without its shortcomings...
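A sketch of how the fixes discussed above would be expressed, in standard debian-installer syntax (the interface name and the combination are taken from this conversation, not from the actual install config):

```
# Installer kernel command line additions (sketch):
#   biosdevname=1  - name NICs by firmware/BIOS position instead of probe order
#   interface=eth2 - tell d-i's netcfg which NIC to DHCP from
biosdevname=1 interface=eth2

# The same interface pin as a preseed line, replacing the link-detection
# default ("d-i netcfg/choose_interface select auto") quoted earlier:
d-i netcfg/choose_interface select eth2
```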
[11:14:22] (03CR) 10Dzahn: [C: 031] "ping to move up in queue. per comment from Alex above" [puppet] - 10https://gerrit.wikimedia.org/r/96424 (owner: 10Dzahn) [11:17:54] requests jenkins-partman-recipe-check from hashar, then hides quickly [11:19:32] akosiaris: hehe feels like a fairly short blanket any way you look at it [11:19:54] FAI might be a good idea , yea [11:20:40] (03PS2) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [11:21:25] (it's from University of Cologne IT dept, so i saw their presentations) [11:22:10] (03PS3) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [11:22:56] Begin: 1999 / Status: ongoing [11:23:32] http://www.informatik.uni-koeln.de/ls_juenger/research/fai/ [11:26:58] (03PS2) 10Dzahn: remove slauerhoff and slauerhoff-array [dns] - 10https://gerrit.wikimedia.org/r/176868 [11:28:47] mark: ^ should i? (remove slauerhoff from DNS) you deactivated its switch port in 2013 some time [11:29:05] is it in racktables? 
[11:29:41] yes, it is, esams OE14 [11:29:52] then probably not [11:29:55] i made a ticket asking if it should be decom [11:30:05] (03PS6) 10Nemo bis: phabricator: community metrics stats mail [puppet] - 10https://gerrit.wikimedia.org/r/177792 (owner: 10Dzahn) [11:30:14] it also appears in some "server cleanup list" from Ariel but as a question [11:30:17] ok [11:31:43] there's an esams ticket, 8956 [11:33:16] updated that [11:33:32] thanks [11:40:28] (03Abandoned) 10Dzahn: remove slauerhoff and slauerhoff-array [dns] - 10https://gerrit.wikimedia.org/r/176868 (owner: 10Dzahn) [11:44:42] (03PS3) 10Dzahn: apache-graceful-all: drop dsh, use salt [puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [11:46:47] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: Puppet has 1 failures [11:47:29] (03CR) 10Giuseppe Lavagetto: [C: 032] admins: use concat() instead of an inline template [puppet] - 10https://gerrit.wikimedia.org/r/177757 (owner: 10Giuseppe Lavagetto) [11:48:09] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 1 failures [11:48:38] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: Puppet has 1 failures [11:49:18] <_joe_> apt-get again [11:50:13] it’s been on and off constantly today on labs too [11:51:11] RECOVERY - puppet last run on palladium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:54:45] (03CR) 10Filippo Giunchedi: [C: 031] "scap/scap seems to approximate mediawiki_installation so I think it'd work (to my untrained eye at least)" [puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [11:55:40] YuviPanda: blah. I know :) [11:56:06] YuviPanda: hold on, replace it with whatever needed. [11:57:41] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "This script is used on tin, which is not the salt master; it is also runnable by non-roots." 
[puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [11:57:54] kart_: phab’s operations project will be public by default :) [11:58:27] RECOVERY - puppet last run on db1027 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [11:59:25] number of appservers you get when: a) matching mw* hostname: 258 b) matching salt grain deployment_target scap/scap: 269 c) using dsh group mediawiki-installation: 229 [11:59:29] godog: ^ :p [12:00:48] mutante: heh [12:01:22] i'm going to paste the full lists in a phabbin [12:01:53] <_joe_> mutante, godog don't waste time on apache-graceful-all [12:02:00] <_joe_> it's basically useless nowadays [12:02:23] that too [12:02:28] <_joe_> apache config changes are going to refresh apache automagically [12:02:29] would still be useful to find the salt grain and replace dsh [12:02:43] <_joe_> mutante: dsh for what? [12:02:52] scap [12:02:54] <_joe_> and with what? [12:02:57] salt [12:03:30] when people deploy they still rely on the dsh groups [12:03:30] <_joe_> mh, let's speak with the scap maintainers about this? it doesn't seem like it's a sensible idea [12:03:32] RECOVERY - puppet last run on search1021 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [12:03:45] <_joe_> so, we need to replace that file [12:03:47] eh? i think bd808 is rewriting scap for that reason [12:03:56] <_joe_> to use salt? [12:04:06] <_joe_> why not using git-deploy then? [12:04:08] yea, trebuchet [12:04:10] * _joe_ is puzzled [12:04:35] i don't know, the deployment question is always puzzling [12:04:40] <_joe_> oh ok, so when we do that, we'd have a deployment-target "mediawiki" wherever it's sensible [12:04:49] <_joe_> so, why do we care now?
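The three counts above come down to set differences between host lists; a minimal sketch of the comparison (the hostnames here are made up, not the real P144 data):

```python
# Compare the hosts reached by two targeting methods (illustrative names).
dsh_group = {"mw1010", "mw1011", "mw1012"}
grain_targets = {"mw1010", "mw1011", "mw1012", "osmium", "virt1000"}

only_in_grain = sorted(grain_targets - dsh_group)  # hit via the salt grain, missed by the dsh file
only_in_dsh = sorted(dsh_group - grain_targets)    # listed manually but not a deploy target

print(only_in_grain)  # ['osmium', 'virt1000']
print(only_in_dsh)    # []
```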
[12:04:52] _joe_: that's exactly what i amended in that change [12:05:01] to use the deployment_target scap salt grain [12:05:06] <_joe_> mutante: no you searched wherever scap is [12:05:29] <_joe_> which may be a different place than where mediawiki is effectively maintained and installed [12:05:39] because i was told not to add my own grain called mediawiki-install ..sigh [12:05:54] <_joe_> mutante: by me [12:05:59] <_joe_> so, rewind a second [12:06:09] <_joe_> what problem do you want to solve originally? [12:06:52] <_joe_> having a list of servers for mediawiki installation in a dsh group, that we need to maintain? [12:06:58] originally originally, i wanted to have one grain per puppet role [12:07:03] https://gerrit.wikimedia.org/r/#/c/107831/ [12:07:17] yes, that too, what you said [12:07:28] i want to remove the need for the manual dsh group file [12:07:31] <_joe_> mutante: we already have one per cluster [12:07:50] because it has been outdated a million times [12:08:01] <_joe_> so, for the role => grain thing, I do have a good solution I think [12:08:02] and then people deploy and miss a server or there is an old one [12:08:42] afaik the problem was solved with a newer salt version [12:09:46] the problem was https://gerrit.wikimedia.org/r/#/c/123834/ [12:10:08] <_joe_> and also for the dsh group mediawiki-installation being automagically populated, "the right way" now is what I'll show you in a few [12:11:06] (03CR) 10Dzahn: "list of appservers you get when:" [puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [12:11:16] _joe_: alright [12:11:31] <_joe_> for the role => grain thing, hold on until I have merged https://gerrit.wikimedia.org/r/#/c/176334/ [12:12:10] ok [12:15:06] i'm not sure if this helps at all then, but here are the detailed lists of servers you get via the different methods [12:15:11] https://phabricator.wikimedia.org/P144 [12:16:04] (03CR) 10Dzahn: "https://phabricator.wikimedia.org/P144" [puppet] -
10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [12:17:38] (03PS1) 10KartikMistry: Added initial Debian packaging [debs/contenttranslation/apertium-en-ca] - 10https://gerrit.wikimedia.org/r/179117 [12:18:44] (03PS2) 10KartikMistry: Added initial Debian packaging [debs/contenttranslation/apertium-en-ca] - 10https://gerrit.wikimedia.org/r/179117 [12:19:06] (03CR) 10Dzahn: "dsh, salt grain and hostname match compared:" [puppet] - 10https://gerrit.wikimedia.org/r/177801 (owner: 10Dzahn) [12:19:27] (03Abandoned) 10Dzahn: add salt grain 'mediawiki-installation' in mw role [puppet] - 10https://gerrit.wikimedia.org/r/177801 (owner: 10Dzahn) [12:20:40] (03CR) 10Dzahn: "since Giuseppe commented on Change-Id: I6d889dbbc07ccf2 that apache-graceful-all is now useless i expect similar comments on this change a" [puppet] - 10https://gerrit.wikimedia.org/r/177080 (owner: 10Dzahn) [12:22:04] (03PS4) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [12:22:49] (03CR) 10Dzahn: "you might as well go back to this then because Change-Id: I4c7db8319c1158e is probably not going to be merged anyways. i don't know what t" [puppet] - 10https://gerrit.wikimedia.org/r/164508 (owner: 10Ori.livneh) [12:25:36] (03PS5) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [12:26:04] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "I don't think we're getting rid of apache-fast-test soon, it's still useful. I don't think the rest of all this is useful today. 
So just s" [puppet] - 10https://gerrit.wikimedia.org/r/164508 (owner: 10Ori.livneh) [12:29:29] (03CR) 10Dzahn: "this would have saved apache-fast-test by moving it into apache module, it could also be changed to just touch that one" [puppet] - 10https://gerrit.wikimedia.org/r/177080 (owner: 10Dzahn) [12:30:06] (03CR) 10Dzahn: "ok, i don't know how to 1) and 4) and i don't think it's worth that effort" [puppet] - 10https://gerrit.wikimedia.org/r/178835 (owner: 10Dzahn) [12:31:03] (03Abandoned) 10Dzahn: labsdebrepo: move out of misc [puppet] - 10https://gerrit.wikimedia.org/r/178835 (owner: 10Dzahn) [12:39:39] YuviPanda: good to know. Thanks! [12:40:00] no more RTs :) [12:41:48] (03Abandoned) 10Dzahn: LDAP: rm pmtpa, +codfw, gluster/NFS server undef [puppet] - 10https://gerrit.wikimedia.org/r/173470 (owner: 10Dzahn) [12:42:30] akosiaris: Are you still poking at virt1010 or is that in my court now? (I'm about done for the day, so no rush if you're still working on it) [12:47:49] akosiaris: for some reason, I can't send you PM. So email :) [12:51:26] (03PS1) 10Giuseppe Lavagetto: dsh: create files based on exported resources [puppet] - 10https://gerrit.wikimedia.org/r/179121 [12:51:34] <_joe_> mutante: ^^ [12:53:03] (03PS2) 10Dzahn: move apache helper scripts and kill apachesync [puppet] - 10https://gerrit.wikimedia.org/r/177080 [12:54:14] _joe_: generating the dsh groups sounds like a good thing, it's just that until today whenever dsh has been mentioned it was "why you still using dsh, replace it with salt" [12:54:42] andrewbogott: I went to lunch, sorry. 
No I am still working on it [12:54:44] also, see above, moving apache-fast-test, removing apache-graceful-all [12:54:52] kart_: I just PMed you [12:55:21] PROBLEM - puppet last run on amssq46 is CRITICAL: CRITICAL: Puppet has 1 failures [12:56:10] <_joe_> mutante: dsh groups are basically used by scap in this case, so this should be a good thing [13:06:56] (03CR) 10Hashar: "The rspec-puppet matcher in v1.0.1 uses the wrong signature for failure_message_for_should and failure_message_for_should_not, it excepts " [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [13:07:05] (03PS5) 10Hashar: Basic rspec setup [puppet] - 10https://gerrit.wikimedia.org/r/178810 [13:07:54] RECOVERY - puppet last run on amssq46 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:08:16] akosiaris: andrewbogott: hello! I got some basic rspec test for operations/puppet.git ( https://gerrit.wikimedia.org/r/#/c/178810/ ). Paired a bit yesterday with someone at my coworking place and Dan Duvall seems to love the idea :D [13:08:30] (03CR) 10Dzahn: "there is a "mediawiki-installation" vs. "mediawiki-install" naming conflict" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [13:08:38] _joe_: so I am back from lunch but I guess you are heading out to lunch yourself aren't you? [13:09:11] hashar: yeah, I 've seen it too. 
it is in my backlog to study it more carefully [13:09:42] akosiaris: the last patchset should works properly now :] [13:10:05] (03CR) 10Dzahn: "see this please: https://gerrit.wikimedia.org/r/#/c/177080/" [puppet] - 10https://gerrit.wikimedia.org/r/164508 (owner: 10Ori.livneh) [13:11:48] <_joe_> hashar: yes [13:11:55] PROBLEM - puppet last run on amssq44 is CRITICAL: CRITICAL: Puppet has 1 failures [13:17:59] (03CR) 10Dzahn: "correction:" [puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [13:19:03] (03CR) 10Giuseppe Lavagetto: mediawiki: allow use of mpm_worker instead of mpm_prefork (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/178470 (owner: 10Giuseppe Lavagetto) [13:20:34] (03PS3) 10Giuseppe Lavagetto: mediawiki: allow use of mpm_worker instead of mpm_prefork [puppet] - 10https://gerrit.wikimedia.org/r/178470 [13:21:00] (03CR) 10Dzahn: "diff between dsh group and scap target: osmium, searchidx1001, virt1000" [puppet] - 10https://gerrit.wikimedia.org/r/160953 (owner: 10Alexandros Kosiaris) [13:23:59] RECOVERY - puppet last run on amssq44 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [13:27:19] (03PS1) 10Springle: pool db1004 in s7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179128 [13:28:50] (03PS6) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [13:28:57] (03CR) 10Springle: [C: 032] pool db1004 in s7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179128 (owner: 10Springle) [13:29:07] (03Merged) 10jenkins-bot: pool db1004 in s7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179128 (owner: 10Springle) [13:30:57] !log springle Synchronized wmf-config/db-eqiad.php: pool db1004 in s7, warm up (duration: 00m 06s) [13:31:05] Logged the message, Master [13:35:11] (03PS7) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 
10https://gerrit.wikimedia.org/r/179110 [13:42:39] getting an OAuth error when trying to move files from en.wp to Commons using Commonshelper [13:43:33] got it https://phabricator.wikimedia.org/T78209 [13:46:37] JollyOldStNick: thanks for reporting, it's this one: [13:46:48] https://phabricator.wikimedia.org/T78223#841468 [13:47:06] people are looking at it now [13:47:15] confirmed https://www.mediawiki.org/w/index.php?title=Special:OAuth/ [13:47:42] ah, T87209 is a duplicate [13:47:59] as are others, I notice [13:48:27] yes, it's caused by a schema change and [13:48:39] https://gerrit.wikimedia.org/r/#/c/153983/ [13:50:38] JollyOldStNick: can you please try again? [13:50:46] we made a change [13:51:22] "This tool facilitates file uploads to Wikimedia Commons, under jour user name. You will have to authorise it first." [13:51:53] then there's some key stuff, which I guess I shouldn't post in an open channel [13:53:45] i tried this one http://tools.wmflabs.org/oauth-hello-world/ [13:53:54] and it appears to work [13:57:27] tools doesn't want to load for me now [13:58:01] hmmm. would be great if you can put the remaining issue on that phab ticket [13:58:11] because for others it looks fixed now [13:58:18] (the phab login users f.e.) [13:58:23] PROBLEM - Host d-i-test is DOWN: PING CRITICAL - Packet loss = 100% [13:58:28] I wonder if it's specific to Commonshelper [13:59:05] the oauth-hello-world works perfectly for me now [14:14:48] akosiaris: I'm going to sleep -- can you email me or leave me a message here if you make progress with booting those servers? Most likely partman will fail if we ever get that far. [14:19:14] (03CR) 10Aude: [C: 031] "i can't think of a reason why we need to put this SetupAfterCache. callback should be okay." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/179008 (owner: 10Hoo man) [14:19:47] (03CR) 10Alexandros Kosiaris: [C: 032] "There is a general consensus to try and avoid exported_resources due to all the problems that come with them (mostly performance issues a" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [14:20:25] (03CR) 10Hoo man: "Doing the fix up in a follow up is fine with me" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/178873 (owner: 10Dzahn) [14:23:07] (03CR) 10Alexandros Kosiaris: [C: 032] "LGTM and I like mpm_worker, let's see if it is going to be better :-). Btw... mpm_event later on ;)" [puppet] - 10https://gerrit.wikimedia.org/r/178470 (owner: 10Giuseppe Lavagetto) [14:32:20] (03CR) 10Filippo Giunchedi: [C: 031] mediawiki: allow use of mpm_worker instead of mpm_prefork [puppet] - 10https://gerrit.wikimedia.org/r/178470 (owner: 10Giuseppe Lavagetto) [14:32:34] (03CR) 10Alexandros Kosiaris: [C: 032] ganglia_new::web: switch to apache-2.4 ssl ciphers [puppet] - 10https://gerrit.wikimedia.org/r/179098 (owner: 10Faidon Liambotis) [14:36:54] akosiaris: it has dependencies [14:37:41] (03CR) 10Filippo Giunchedi: [C: 031] "agreed with Alexandros on the rationale" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [14:38:30] (03CR) 10Giuseppe Lavagetto: "If we want to take less of a performance hit we may try to use the "naggen2" way later on." [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [14:38:48] akosiaris: so you have to merge all the rest :D [14:40:05] (03CR) 10Giuseppe Lavagetto: "@Filippo: the reason why we can't use salt here is that this specific dsh group is used by scap, which is run from a deployment host. 
I am" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [14:40:46] paravoid: I know [14:41:32] <_joe_> akosiaris: btw, I'm waiting for bryan to be around, so we can decide with him what's the best strategy here [14:42:11] _joe_: ok [14:47:57] (03CR) 10Dzahn: "there is role::deployment::salt_masters::production but it's only on palladium, not on tin. salt_masters::labs appears but is commented ou" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [14:49:26] If I want logrotate for custom log files... I have to configure that by hand? [14:49:39] Or how is logrotate configured in the cluster [14:50:41] hoo: in puppet. but it's not a generic logrotate module unfortunately [14:50:49] mh [14:50:56] it's just several modules all dumping their own logrotate config file [14:51:03] doesn't it just work for stuff in /a/mw-log? [14:51:07] Maybe I should just overwrite my log files [14:51:15] I don't care for more than the last run [14:51:33] modules/icinga/manifests/init.pp: file { '/etc/logrotate.d/icinga': [14:51:36] modules/icinga/manifests/init.pp: source => 'puppet:///modules/icinga/logrotate.conf', [14:51:40] ^ example [14:52:01] make a file resource and dump a new file into logrotate.d [14:52:12] I'm sure when Timo and I added the "error" logs, they got rotated out automagically [14:52:26] Reedy: I'm logging to /var/log locally [14:52:32] mutante: Yeah, ok [14:52:38] will probably do that [14:53:40] No idea why I should keep old log files... so just overwriting files might also be ok [14:53:49] (03PS1) 10Filippo Giunchedi: add LICENSE [software/swift-utils] - 10https://gerrit.wikimedia.org/r/179137 [14:55:07] that would be "rotate 0" [14:55:15] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] add LICENSE [software/swift-utils] - 10https://gerrit.wikimedia.org/r/179137 (owner: 10Filippo Giunchedi) [14:56:04] Reedy: this might be the reason it "just worked" ?
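The pattern mutante pastes above (a puppet file resource dropping a file into /etc/logrotate.d) ultimately ships a plain logrotate stanza; for the keep-only-the-current-log behaviour hoo asks about, it might look like this (the path and frequency are assumptions):

```
# Illustrative /etc/logrotate.d/ drop-in; the path is made up.
/var/log/mytool/*.log {
    daily
    rotate 0        # keep no rotated copies: old logs are deleted, not archived
    missingok
    notifempty
}
```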
[14:56:15] manifests/role/logging.pp: cron { "mw-log-cleanup": [14:56:15] manifests/role/logging.pp: command => "/usr/local/bin/mw-log-cleanup", [14:57:06] yea, that's a custom script using find [14:57:39] # Remove logs which are beyond the logrotate expiry time [14:58:05] # logrotate will miss archived logs which were deleted in the parent directory. [14:58:24] ah [14:58:25] fair enough [14:59:22] puppet files/misc/scripts/mw-log-cleanup [15:01:10] (03PS2) 10Giuseppe Lavagetto: dsh: create files based on exported resources [puppet] - 10https://gerrit.wikimedia.org/r/179121 [15:01:12] !log saved Jenkins configuration via the web interface to reset the interface language from Chinese to English [15:01:17] Logged the message, Master [15:01:31] <_joe_> hashar: lol [15:01:41] <_joe_> you know these messages get tweeted right? [15:01:55] I know they are public indeed [15:02:09] <_joe_> yeah but the "sal" has its purpose [15:02:49] <_joe_> so it's easy to understand the context [15:02:56] <_joe_> on the contrary, https://twitter.com/wikimediatech/status/540626921210798080 [15:03:04] <_joe_> is plainly funny :P [15:03:12] :) [15:03:15] aahha [15:03:18] <_joe_> or https://twitter.com/wikimediatech/status/543057942334177280 [15:03:27] <_joe_> it's the out-of-context effect [15:03:48] (03CR) 10Alexandros Kosiaris: [C: 032] ganglia: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179096 (owner: 10Faidon Liambotis) [15:03:57] so potentially I could !log @someone to ping him in the name of the Wikimedia Tech Staff ? [15:04:15] (03CR) 10Alexandros Kosiaris: [C: 032] ganglia_new: remove Labs support [puppet] - 10https://gerrit.wikimedia.org/r/179097 (owner: 10Faidon Liambotis) [15:04:46] !log @damons we love you! 
[15:04:50] Logged the message, Master [15:05:02] sounds good https://twitter.com/wikimediatech/status/543058834143838208 [15:05:20] lol, i saw that on actual twitter before i switched back to IRC [15:06:49] so you can refer to each gerrit change that has been mentioned in SAL as a t.co URL: https://t.co/XopgIwDDDC [15:10:05] (03PS8) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [15:12:20] Coren: can I get a review of https://gerrit.wikimedia.org/r/#/c/179083/ ? [15:13:10] (03Abandoned) 10Faidon Liambotis: ganglia: remove pre-trusty/tmpfs support hacks [puppet] - 10https://gerrit.wikimedia.org/r/179076 (owner: 10Faidon Liambotis) [15:13:17] (03Abandoned) 10Faidon Liambotis: ganglia::web: remove ServerAlias [puppet] - 10https://gerrit.wikimedia.org/r/179077 (owner: 10Faidon Liambotis) [15:13:46] https://twitter.com/intent/favorite?tweet_id=509559476492304384 [15:14:19] _joe_: heh, the nodjes sucks thing makes sense even with no context, I think :) [15:14:53] <_joe_> YuviPanda: indeed [15:15:57] YuviPanda: there is .. panda.js [15:18:10] paravoid: Looking. [15:20:12] (03CR) 10coren: [C: 031] "Clearly sane." [puppet] - 10https://gerrit.wikimedia.org/r/179083 (owner: 10Faidon Liambotis) [15:20:22] Coren: then +2/merge it please :) [15:20:30] (03PS1) 10RobH: adding Matthias Mullie to statstics-admins [puppet] - 10https://gerrit.wikimedia.org/r/179140 [15:21:44] (03CR) 10RobH: [C: 04-2] "Do not submit until full 3 day wait has passed on associated RT access request ticket. 
RobH will merge this as Ops Clinic duty person on " [puppet] - 10https://gerrit.wikimedia.org/r/179140 (owner: 10RobH) [15:22:01] (03PS3) 10Dzahn: gerrit: Output space in commentlink "commit" before the link [puppet] - 10https://gerrit.wikimedia.org/r/177106 (owner: 10Krinkle) [15:23:51] (03CR) 10Dzahn: [C: 032] gerrit: Output space in commentlink "commit" before the link [puppet] - 10https://gerrit.wikimedia.org/r/177106 (owner: 10Krinkle) [15:24:28] (03PS2) 10Alexandros Kosiaris: url_downloader: remove pre-precise/squid 2 compat [puppet] - 10https://gerrit.wikimedia.org/r/179078 (owner: 10Faidon Liambotis) [15:24:30] (03PS2) 10Alexandros Kosiaris: install-server: remove pre-precise/squid 2 compat [puppet] - 10https://gerrit.wikimedia.org/r/179079 (owner: 10Faidon Liambotis) [15:25:31] gerrit problems or is it just me ? [15:25:35] is gerrit broken? [15:25:40] not just you [15:25:45] i just merged a change to the gerrit config [15:25:46] ok, restarting it then [15:25:46] hold on [15:25:50] mutante: ok [15:25:51] ok [15:26:06] works for me [15:26:06] back [15:26:12] it was just the restart from puppet [15:26:16] ok [15:31:48] (03PS8) 10Giuseppe Lavagetto: hiera: role-based backend, role keyword [puppet] - 10https://gerrit.wikimedia.org/r/176334 [15:32:39] (03PS1) 10Hoo man: Rotate Wikidata json dump logs [puppet] - 10https://gerrit.wikimedia.org/r/179141 [15:32:44] mutante: ^ [15:32:49] untested, of course [15:33:15] (03CR) 10Alexandros Kosiaris: [C: 032] "Given the short-lived commits as Faidon pointed out, I remove my -1 and give a +2" [puppet] - 10https://gerrit.wikimedia.org/r/179078 (owner: 10Faidon Liambotis) [15:33:20] (03CR) 10Tim Landscheidt: "(See T77987 for reference.)" [puppet] - 10https://gerrit.wikimedia.org/r/178493 (owner: 10Dzahn) [15:35:17] (03CR) 10Alexandros Kosiaris: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/179078 (owner: 10Faidon Liambotis) [15:35:43] (03CR) 10Ottomata: [C: 04-1] "Ah ok, no. 
You are looking at the puppetization of public-datasets directory on stat1001 (the webserver) not on the cruncher machines (st" [puppet] - 10https://gerrit.wikimedia.org/r/179140 (owner: 10RobH) [15:36:42] (03CR) 10Alexandros Kosiaris: "Given the short life of the commit as pointed out by Faidon, I remove the -1 and +2 this" [puppet] - 10https://gerrit.wikimedia.org/r/179079 (owner: 10Faidon Liambotis) [15:36:54] (03CR) 10Alexandros Kosiaris: [C: 032] "Given the short life of the commit as pointed out by Faidon, I remove the -1 and +2 this" [puppet] - 10https://gerrit.wikimedia.org/r/179079 (owner: 10Faidon Liambotis) [15:37:16] (03PS9) 10Giuseppe Lavagetto: hiera: role-based backend, role keyword [puppet] - 10https://gerrit.wikimedia.org/r/176334 [15:38:24] (03PS2) 10Alexandros Kosiaris: Remove apt::pin for squid packages [puppet] - 10https://gerrit.wikimedia.org/r/179080 (owner: 10Faidon Liambotis) [15:38:47] am I late to the party or ganglia seems in trouble? [15:42:49] (03PS9) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [15:45:07] (03CR) 10Faidon Liambotis: [C: 04-1] "/run is volatile. Contents there may not survive a reboot. You have to create subdirs there from the upstart script (systemd has a special" [puppet] - 10https://gerrit.wikimedia.org/r/179108 (owner: 10Giuseppe Lavagetto) [15:46:18] paravoid: to continue, do you think I should not do this logster-varnishkafka-statsd thing I am going to do soon? [15:46:53] ottomata: I think it'd be better for varnishkafka to gain the ability to push to statsd directly [15:46:54] (03PS10) 10Giuseppe Lavagetto: hiera: role-based backend, role keyword [puppet] - 10https://gerrit.wikimedia.org/r/176334 [15:47:10] ottomata: or at least have a hook to run something everytime it writes its statistics [15:47:18] ottomata: but I guess logster could be acceptable, okay... [15:48:59] (03CR) 10Giuseppe Lavagetto: "Ah, true. 
I'll restore that part of the upstart script (just for what is in /run by default)" [puppet] - 10https://gerrit.wikimedia.org/r/179108 (owner: 10Giuseppe Lavagetto) [15:49:11] paravoid, i started to look into modifying statsd, and sorta talked to snaps about it, and he recommended I parse the json file! :) and after looking into it, i realized that would be easier...especially since I had pretty much already done the work for it (a year ago) [15:50:08] (03CR) 10Giuseppe Lavagetto: [C: 032] hiera: role-based backend, role keyword [puppet] - 10https://gerrit.wikimedia.org/r/176334 (owner: 10Giuseppe Lavagetto) [15:50:32] <_joe_> paravoid: thanks, that was clearly a brainfart [15:50:45] (03CR) 10Faidon Liambotis: "Furthermore, what's the use case for making log_dir/tmp_dir parameters? This makes the manifest harder to read, so I'd prefer to not do th" [puppet] - 10https://gerrit.wikimedia.org/r/179108 (owner: 10Giuseppe Lavagetto) [15:50:53] marktraceur, manybubbles, ^demon|zzz: Who wants to SWAT this morning? [15:51:00] paravoid: There's a dependency on https://gerrit.wikimedia.org/r/#/c/179082/1; want me to cherry pick it away or are you going to merge that one soon? [15:51:12] Coren: yeah I'm waiting for akosiaris to finish with his merge [15:51:17] and I'll rebase [15:51:38] anomie: I'm not in a place with consistent wifi so I'll decline this morning if possible..... Monday I'll totally pull my weight [15:54:14] bd808, hoo|lecture: Ping for SWAT in about 6 minutes. [15:54:19] PROBLEM - Kafka Broker Messages In on analytics1021 is CRITICAL: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 712.405987608 [15:54:21] around [15:54:45] ottomata: dude, thank you for correcting me! [15:54:54] (not sarcastic, that would have been a bad access merge!) [15:55:00] so i appreciate it =] [15:55:14] * robh would have caught hell for it [15:55:31] heh, np!
[15:55:46] yeah, in general, if people ask for access to statistics servers, its like this: [15:56:01] anomie: I can if you don't want to [15:56:06] stat1003 -> statistics-users [15:56:06] stat1002 -> statistics-privatedata-users [15:56:15] marktraceur: Go for it [15:56:17] Neat. [15:56:30] if they want hadoop -> analytics-privatedata-users (if they have signed NDA and want access to webrequest logs), otherwise just analytics-users [15:56:40] if they want research db password from stat1003 -> researchers [15:56:44] (03PS2) 10RobH: adding Matthias Mullie to statstics-users [puppet] - 10https://gerrit.wikimedia.org/r/179140 [15:57:06] good to know [15:57:35] (03CR) 10RobH: "Do not submit until full 3 day wait has passed on associated RT access request ticket. RobH will merge this as Ops Clinic duty person on F" [puppet] - 10https://gerrit.wikimedia.org/r/179140 (owner: 10RobH) [15:58:39] (03CR) 10Giuseppe Lavagetto: "The use case to make those configurable is that we made those configurable in the default file; else we should've chosen to hardwire them." [puppet] - 10https://gerrit.wikimedia.org/r/179108 (owner: 10Giuseppe Lavagetto) [15:58:59] Two config patches <3 [15:59:08] ja, -admins groups generally have some kind of special rights, usually sudo [15:59:37] !log starting trusty upgrade of analytics1033 [15:59:41] Logged the message, Master [15:59:46] bd808: Need you to ping back before we can go [15:59:58] Whoever's doing SWAT, let me know if you have questions about the WikimediaEvents fix [15:59:59] Thanks for being on time, hoo|lecture [16:00:04] manybubbles, anomie, ^d, marktraceur: Dear anthropoid, the time has come. Please deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20141211T1600). [16:00:13] superm401: What WikimediaEvents fix? 
[16:00:18] (03CR) 10BryanDavis: "scap does not use dsh directly anymore, but it does use the mediawiki-installation and scap-proxies dsh group files to know which hosts to" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [16:01:07] (03PS2) 10MarkTraceur: Compute $wgWBClientSettings['excludeNamespaces'] on demand [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179008 (owner: 10Hoo man) [16:01:12] (03CR) 10MarkTraceur: [C: 032] Compute $wgWBClientSettings['excludeNamespaces'] on demand [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179008 (owner: 10Hoo man) [16:01:23] marktraceur: pong [16:01:38] mine should probably be last as it could melt logging [16:01:42] Sweet [16:01:46] Sounds like a fun morning [16:02:05] (03Merged) 10jenkins-bot: Compute $wgWBClientSettings['excludeNamespaces'] on demand [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179008 (owner: 10Hoo man) [16:02:06] bd808: I *was* going to ask you "how badly will this break the servers?" but thanks for being forthcoming [16:02:11] I need an update to the private settings too I realized. I'll get the info for that ready [16:02:40] marktraceur: it should not break the servers themselves but may make us blind to errors in the worst case [16:03:10] I accidentally tested it with very broken config in beta a couple weeks ago and there were no user facing errors [16:03:15] marktraceur, the one that was on the calendar until Max deleted it by accident: https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=next&oldid=137531 [16:03:32] Who's Max? Oh well, superm401, please add it back and I'll do it next [16:03:58] marktraceur, MaxSem; see the link I just put. 
[16:04:03] Will fix page [16:04:06] !log marktraceur Synchronized wmf-config/Wikibase.php: [SWAT] [config] Compute ['excludeNamespaces'] on demand (duration: 00m 05s) [16:04:23] (I only saw that it was deleted just now) [16:04:53] superm401: It looks like the wikidata one is done [16:05:15] And...the other one enables a pretty big feature, it looks like? [16:05:30] (03PS1) 10Ottomata: Include CDH mahout package in apt [puppet] - 10https://gerrit.wikimedia.org/r/179144 [16:05:30] Oh, re-enable. Fine. [16:05:36] (03PS2) 10MarkTraceur: Reenable WikiGrok UI on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179010 (owner: 10MaxSem) [16:05:38] hoo|lecture: Test plox [16:05:47] (03CR) 10MarkTraceur: [C: 032] Reenable WikiGrok UI on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179010 (owner: 10MaxSem) [16:05:50] marktraceur: Not really something we can test [16:05:57] (03Merged) 10jenkins-bot: Reenable WikiGrok UI on enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179010 (owner: 10MaxSem) [16:06:02] Can look whether it broke, though [16:06:10] (03PS2) 10Giuseppe Lavagetto: hhvm: make the puppet module more configurable [puppet] - 10https://gerrit.wikimedia.org/r/179108 [16:06:10] hoo|lecture: OK...well...look at it thoughtfully for a second and then tell me everything will be OK [16:06:22] <^demon|zzz> /nick ^d [16:06:32] In^deed. [16:06:38] (03CR) 10Ottomata: [C: 032] Include CDH mahout package in apt [puppet] - 10https://gerrit.wikimedia.org/r/179144 (owner: 10Ottomata) [16:06:40] (03PS4) 10Giuseppe Lavagetto: mediawiki: allow use of mpm_worker instead of mpm_prefork [puppet] - 10https://gerrit.wikimedia.org/r/178470 [16:06:44] <^d> marktraceur: Waking up is hard. [16:07:01] The patch made sense to me for two consecutive days, so it should be fine [16:07:01] superm401: OK, you're next. Ready to go? [16:07:21] marktraceur, yeah, page fixed too. 
[16:07:45] !log marktraceur Synchronized wmf-config/InitialiseSettings.php: [SWAT] [config] Reenable WikiGrok on enwiki (duration: 00m 07s) [16:07:46] superm401: Test! [16:07:54] marktraceur: Seems to work :) [16:07:54] Logged the message, Master [16:07:58] (03CR) 10Giuseppe Lavagetto: [C: 032] mediawiki: allow use of mpm_worker instead of mpm_prefork [puppet] - 10https://gerrit.wikimedia.org/r/178470 (owner: 10Giuseppe Lavagetto) [16:08:03] You broke the deployment page [16:08:17] I did? [16:08:28] No, superm401 did [16:09:16] Fixed I think [16:09:29] Oh. [16:09:35] superm401: I pushed the WikiGrok thing. [16:09:45] marktraceur, thanks, sorry. [16:10:05] superm401: Was the WikiGrok thing supposed to be excluded from today's SWAT? [16:10:33] There's a big red "Not done" next to it in yesterday's evening SWAT [16:10:41] But no explanation [16:11:30] marktraceur, well, presumably it wasn't meant to be done this morning since no one requested it this morning. [16:11:34] Sigh [16:11:39] OK, well, crap, I'll revert it. [16:11:51] (03PS1) 10MarkTraceur: Revert "Reenable WikiGrok UI on enwiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179145 [16:11:52] Did they leave the core submodule there? [16:11:53] (03CR) 10Faidon Liambotis: [C: 04-1] "It's very unusual for a package to have its log or directories configurable using /etc/default. I don't like it much, I'd prefer it if we " [puppet] - 10https://gerrit.wikimedia.org/r/179108 (owner: 10Giuseppe Lavagetto) [16:12:00] Oh, it's just a config change. [16:12:00] What? [16:12:03] Yeah. 
[16:12:15] (03CR) 10MarkTraceur: [C: 032] Revert "Reenable WikiGrok UI on enwiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179145 (owner: 10MarkTraceur) [16:12:23] (03Merged) 10jenkins-bot: Revert "Reenable WikiGrok UI on enwiki" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179145 (owner: 10MarkTraceur) [16:12:49] !log marktraceur Synchronized wmf-config/InitialiseSettings.php: [SWAT] [config] Redisable WikiGrok on enwiki (duration: 00m 05s) [16:12:55] Logged the message, Master [16:13:01] OK, now superm401's patches for real this time [16:15:05] Waiting on Jenkins. [16:16:20] (03CR) 10Dzahn: "hmm. do "nocreate" and "rotate 5" contradict each other? anyways, i'd be fine merging if you are fine with not keeping old logs because th" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179141 (owner: 10Hoo man) [16:17:23] mutante: snapshot1003 has that role [16:17:42] (03PS1) 10Ottomata: Add mahout.pp class to install mahout package [puppet/cdh] - 10https://gerrit.wikimedia.org/r/179147 [16:18:22] (03CR) 10Ottomata: [C: 032] Add mahout.pp class to install mahout package [puppet/cdh] - 10https://gerrit.wikimedia.org/r/179147 (owner: 10Ottomata) [16:19:46] (03PS2) 10Hoo man: Rotate Wikidata json dump logs [puppet] - 10https://gerrit.wikimedia.org/r/179141 [16:19:59] My god, what is Jenkins even doing [16:20:13] (03PS1) 10Ottomata: Include mahout package on analytics client nodes [puppet] - 10https://gerrit.wikimedia.org/r/179150 [16:20:19] (03CR) 10Hoo man: "Removed nocreate... that doesn't make to much sense here anyway." [puppet] - 10https://gerrit.wikimedia.org/r/179141 (owner: 10Hoo man) [16:21:37] (03CR) 10Ottomata: [C: 032] Include mahout package on analytics client nodes [puppet] - 10https://gerrit.wikimedia.org/r/179150 (owner: 10Ottomata) [16:22:03] (03CR) 10Ottomata: "T78016" [puppet] - 10https://gerrit.wikimedia.org/r/179150 (owner: 10Ottomata) [16:22:37] (03PS1) 10Ottomata: Need to add mahout.pp role! 
[puppet] - 10https://gerrit.wikimedia.org/r/179151 [16:23:00] (03CR) 10Ottomata: [C: 032 V: 032] "T78016" [puppet] - 10https://gerrit.wikimedia.org/r/179151 (owner: 10Ottomata) [16:23:36] superm401: The long wait is over, syncing file [16:23:51] !log marktraceur Synchronized php-1.25wmf11/extensions/WikimediaEvents/WikimediaEvents.php: [SWAT] [wmf11] Bump sendBeacon schema revision so new URL will be generated (duration: 00m 14s) [16:23:54] Logged the message, Master [16:24:07] superm401: Test on a Wikipedia or something? [16:24:57] Sure, testing now. [16:25:05] (03PS1) 10Papaul: Change labs-support1-b-codfw subnet from /22 to /24 [puppet] - 10https://gerrit.wikimedia.org/r/179152 [16:25:34] !log marktraceur Synchronized php-1.25wmf12/extensions/WikimediaEvents/WikimediaEvents.php: [SWAT] [wmf12] Bump sendBeacon schema revision so new URL will be generated (duration: 00m 16s) [16:25:35] superm401: And on mw.org when you're done. [16:25:37] Logged the message, Master [16:25:43] And now on too bd808's silly change. [16:25:59] * bd808 sulks [16:26:03] it's not silly [16:26:05] :) [16:26:17] It's... the FUTURE! [16:27:02] (03CR) 10Hashar: "To make hhvm uses the env variable HHVM_REPO_LOCAL_PATH and HHVM_REPO_CENTRAL_PATH, we would need to define empty values in php.ini:" (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/178806 (owner: 10Hashar) [16:28:31] !log marktraceur Synchronized private/PrivateSettings.php: [SWAT] [config] Add password for logstash (duration: 00m 10s) [16:28:33] (03CR) 10Dzahn: [C: 032] Rotate Wikidata json dump logs [puppet] - 10https://gerrit.wikimedia.org/r/179141 (owner: 10Hoo man) [16:28:34] * marktraceur makes sure things still run [16:28:35] Logged the message, Master [16:28:46] mutante: Thanks :) [16:28:47] OK, on to the real patch. 
[16:29:12] hoo|lecture: there are currently 4 files, so rotating 5 seems just right [16:29:30] (03CR) 10MarkTraceur: [C: 032] Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/178978 (owner: 10BryanDavis) [16:29:43] Oops, didn't rebase. Sigh. [16:29:47] (03Merged) 10jenkins-bot: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/178978 (owner: 10BryanDavis) [16:29:50] mutante: Mh... I think it will treat each one individually [16:30:03] thus we'll get 5 * 4 logs [16:30:38] hoo|lecture: you are right, but still not an issue with space, as long as we rotate at all.. each was like 4M [16:30:50] yep [16:30:55] superm401: Is it working? :) [16:31:10] marktraceur, we're good on Wikipedia (had to wait for RL to give you a final answer). [16:31:13] KK [16:31:14] About to check MW.org. [16:31:26] I'll wait for confirmation before breaking logging. [16:33:30] that seems wise. [16:33:39] marktraceur, can't tell if it's working on MW.org yet. Might just be lower traffic (it's sampled). [16:33:40] One sec. [16:33:50] KK, fun times. [16:34:12] (03CR) 10Dzahn: "checked on snapshot1003: config file has been created and in the log dir we have files -0 thru -3 as before (for now)" [puppet] - 10https://gerrit.wikimedia.org/r/179141 (owner: 10Hoo man) [16:34:23] MaxSem: Sorry I accidentally re-deployed WikiGrok to enwiki. [16:34:28] For about five minutes. [16:34:47] :P [16:37:55] MaxSem: All part of my plan to get people slowly used to it. [16:39:17] Something's not right. [16:39:20] Trying to figure out what. [16:39:22] Uh oh. [16:39:34] superm401: In a crashy way, or just a...that didn't help anything way? [16:39:41] The latter [16:39:52] OK, well, that's a good thing overall [16:40:07] superm401: Think you'll have a solution in ten minutes? 
I want to get bd808's patch on its way soonish [16:40:13] (not that we're in a huge time crunch) [16:40:25] (03CR) 10Papaul: [C: 031] Change labs-support1-b-codfw subnet from /22 to /24 [puppet] - 10https://gerrit.wikimedia.org/r/179152 (owner: 10Papaul) [16:40:26] marktraceur, yeah, I think I know what it was. [16:41:50] (03CR) 10coren: [C: 032] "Yep. Fix to bad copypaste from when the net was said to be a /22 (but later updated to /24)" [puppet] - 10https://gerrit.wikimedia.org/r/179152 (owner: 10Papaul) [16:42:08] (03PS1) 10KartikMistry: Added initial Debian packaging [debs/contenttranslation/hfst] - 10https://gerrit.wikimedia.org/r/179153 [16:42:30] I need a root to look at the contents of /var/log/hhvm/error.log on mw1121.eqiad.wmnet to see if they can get the full error message for https://phabricator.wikimedia.org/T78309 [16:43:05] Never mind, I think we're good. [16:43:23] I just forgot that it wouldn't be in an anon page without purging (this is a known consideration for the deployment though, not a problem) [16:43:29] superm401: Perfect! Thanks. [16:43:33] I think it's just that MW.org has radically lower traffic. [16:43:41] So it will take a while to start sampling data (it's 1/10,000) [16:43:44] bd808: Yours is going out [16:43:46] !log marktraceur Synchronized wmf-config/: [SWAT] [config] Configure logging to use MWLoggerMonologSpi (duration: 00m 07s) [16:43:51] Logged the message, Master [16:43:51] bd808: Testy test :) [16:43:56] revert [16:43:57] marktraceur: explode! [16:43:58] The other ways of checking look fine, though. 
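The rotate-count question above hinges on how logrotate handles globs: every file matched by the pattern is treated as its own rotation set, so `rotate 5` over four dump logs can leave up to 5 * 4 = 20 archived copies, as mutante and hoo|lecture conclude. A sketch of such a snippet in the same file-resource pattern, with invented paths, not the merged 179141 change:

```puppet
# Illustrative only -- not the actual Wikidata dump logrotate change.
# logrotate rotates each file matched by the glob independently, so
# "rotate 5" keeps up to 5 old copies *per matched file*.
file { '/etc/logrotate.d/wikidatadump':
    ensure  => present,
    mode    => '0444',
    content => "/var/log/wikidatadump-*.log {\n    weekly\n    rotate 5\n    compress\n    missingok\n}\n",
}
```

At roughly 4M per log, as noted above, even the per-file multiplication is not a space concern so long as rotation happens at all.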
[16:44:00] revert [16:44:01] revert [16:44:07] marktraceur: revert [16:44:10] Shoot [16:44:40] !log starting trusty upgrade of analytics1011 [16:44:43] Logged the message, Master [16:44:45] Whew\ [16:44:46] !log marktraceur Synchronized wmf-config/: [SWAT] [config] Revert Configure logging to use MWLoggerMonologSpi (duration: 00m 05s) [16:44:50] Logged the message, Master [16:44:52] thx :( [16:45:00] Gotta actually revert it, sec [16:45:20] Back up [16:45:25] cu [16:45:28] * bd808 will look at the obviously bad code and try to figure out what went wrong [16:45:35] "PHP fatal error: [16:45:35] Undefined index: " [16:45:38] ... [16:45:43] Bsadowski1: We know, it's bd808's fault [16:45:51] Reverted now, should be back up [16:46:26] marktraceur: sorry you pushed the buttons for that :/ [16:46:36] 's all right [16:46:36] YuviPanda: how often will the shinken email change? :P Now I see a moderated email from "shinken@shinken-01" [16:46:39] Better than coffee [16:46:47] (03PS1) 10MarkTraceur: Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179154 [16:46:51] greg-g: huh? it’s always been from shinken-01, no? [16:47:11] greg-g: did I brainfart and have you add shinken-server-01? [16:47:13] no.... [16:48:01] * greg-g looks [16:48:07] * greg-g waits for mailman [16:48:56] bd808: do you still need a root? [16:49:03] (03CR) 10MarkTraceur: [C: 032] Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179154 (owner: 10MarkTraceur) [16:49:06] {{selfmerge}} yay [16:49:12] (03Merged) 10jenkins-bot: Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179154 (owner: 10MarkTraceur) [16:49:25] YuviPanda: it was "shinken@shinken-01.eqiad.wmflabs" now it's "shinken@shinken-01" [16:49:36] greg-g: huh, that’s… weird. 
[16:49:42] or at least, one message was [16:49:43] YuviPanda: Sure, if you have time to dig for the error message [16:50:01] !log marktraceur Synchronized wmf-config/: [SWAT] [config] Actually Revert 'Configure logging to use MWLoggerMonologSpi' (duration: 00m 10s) [16:50:01] greg-g: hmm, my current emails are also coming from shinken-01.eqiad.wmflabs [16:50:03] Logged the message, Master [16:50:18] One apache didn't like that sync, but we should be set now [16:50:20] greg-g: hmm, probably something terrible and transient. do let me know if there are more [16:50:25] bd808: sure, looking [16:50:42] YuviPanda: god I hate those terrible and transient issues :P [16:50:44] PROBLEM - Apache HTTP on mw1237 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:50:56] greg-g: sooo many of them, like the ones filling up that mailing list all day [16:51:42] * greg-g nods [16:53:16] PROBLEM - HHVM rendering on mw1237 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:54:33] bd808: hmm, interesting. there’s a pubkey for that in /etc/ssh/ssh_known_hosts but that’s invalid [16:54:35] for mw1121 [16:54:38] probably reimaging [16:55:04] YuviPanda: A pubkey for what? [16:55:15] bd808: mw1121 [16:55:49] YuviPanda: Oh I can log into the host, the problem is that the perms on the log file are syslog:adm 0640 [16:56:05] Which I just filed https://phabricator.wikimedia.org/T78310 for [16:56:29] bd808: is this /var/log/hhvm/error.log? [16:56:44] YuviPanda: yes. 
And the logs that make it to fluorine are truncated at 64k per line [16:56:57] because they are sent as udp packets [16:57:06] bd808: heh, similar errors in that log file are also truncated before the data ends [16:57:07] actually probably a lot less than 64k [16:57:11] boo [16:57:22] PROBLEM - HHVM queue size on mw1237 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [80.0] [16:57:33] so in this case the hhvm log message is not very useful :( [16:57:58] bd808: yeah, formatted wrong [16:58:00] PROBLEM - HHVM busy threads on mw1237 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [115.2] [16:58:02] New ResourceLoader startup module compression is neat: https://gerrit.wikimedia.org/r/#/c/168732/ [16:58:42] bd808: so not sure what to do there. [16:59:37] YuviPanda: maybe just comment on the phab task that the local log file is not helpful [16:59:42] !log restarted gmond on ms-fe1001, all swift machines under this aggregator were showing offline [16:59:44] yeah did [16:59:47] Logged the message, Master [17:00:08] (03CR) 10Hashar: contint: provision hhvm on CI slaves (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/178806 (owner: 10Hashar) [17:00:51] <_joe_> opsens: when you see that message (HHVM queue size on mw1237 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold) that means that server needs an HHVM restart [17:01:17] (03PS8) 10Hashar: contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 [17:01:28] _joe_, aaand, that means apache restart? or something else? (sorry, haven't messed with hhvm yet) [17:01:52] _joe_: I’ve been meaning to make the messages from check_graphite configurable so they provide useful context by themselves, should get to it next week [17:02:27] <_joe_> ottomata: service hhvm restart :) [17:02:43] <_joe_> ottomata: not now, I'm on it, but just FYI [17:03:08] k cool [17:04:11] <_joe_> did we have a few issues with the latest deploy? 
[17:04:14] what happened to ganglia at ~15:16? I see gmond restarted because of (blank) configuration change, I2d7f45b5f perhaps paravoid ? [17:04:28] <_joe_> http://gdash.wikimedia.org/dashboards/reqerror/ looks wrong [17:04:35] (03PS1) 10Christopher Johnson (WMDE): Phabricator Sprint (0.6.1.4) [puppet] - 10https://gerrit.wikimedia.org/r/179155 [17:06:46] <_joe_> !log restarting HHVM on mw1237, stuck in HPHP::StatCache::refresh [17:06:49] Logged the message, Master [17:07:13] <_joe_> mmmh [17:07:56] <_joe_> ori: ^^ looks like a statcache deadlock [17:08:04] (03PS2) 10Qgil: Phabricator Sprint (0.6.1.4) [puppet] - 10https://gerrit.wikimedia.org/r/179155 (owner: 10Christopher Johnson (WMDE)) [17:08:41] RECOVERY - HHVM rendering on mw1237 is OK: HTTP OK: HTTP/1.1 200 OK - 64572 bytes in 0.100 second response time [17:09:31] RECOVERY - Apache HTTP on mw1237 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 440 bytes in 0.041 second response time [17:13:38] (03PS2) 10Faidon Liambotis: Run "apt-get update" outside of/before puppet [puppet] - 10https://gerrit.wikimedia.org/r/179082 [17:13:40] (03PS2) 10Faidon Liambotis: toollabs: DTRT with both trusty and >= trusty [puppet] - 10https://gerrit.wikimedia.org/r/179083 [17:13:42] (03PS2) 10Faidon Liambotis: Add a new squid3 module and replace in-grown use [puppet] - 10https://gerrit.wikimedia.org/r/179081 [17:13:57] (03CR) 10Faidon Liambotis: [C: 032] toollabs: DTRT with both trusty and >= trusty [puppet] - 10https://gerrit.wikimedia.org/r/179083 (owner: 10Faidon Liambotis) [17:15:09] (03CR) 10Dzahn: [C: 04-1] "i should use a separate SQL user for this , requested GRANT in T78311" [puppet] - 10https://gerrit.wikimedia.org/r/177792 (owner: 10Dzahn) [17:15:44] RECOVERY - HHVM queue size on mw1237 is OK: OK: Less than 30.00% above the threshold [10.0] [17:16:15] (03CR) 10Hashar: "Gave it a try and noticed hhvm uses:" [puppet] - 10https://gerrit.wikimedia.org/r/178806 (owner: 10Hashar) [17:16:23] RECOVERY - HHVM busy threads on 
mw1237 is OK: OK: Less than 30.00% above the threshold [76.8] [17:18:06] Could not retrieve file metadata for puppet://puppet/plugins: Error 500 on SERVER: [17:18:41] !log restarting apache on strontium [17:18:45] Logged the message, Master [17:20:51] PROBLEM - puppet last run on mw1241 is CRITICAL: CRITICAL: Puppet has 30 failures [17:20:52] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: puppet fail [17:21:04] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: Puppet has 42 failures [17:21:14] PROBLEM - puppet last run on achernar is CRITICAL: CRITICAL: Puppet has 15 failures [17:21:39] PROBLEM - puppet last run on wtp1001 is CRITICAL: CRITICAL: puppet fail [17:21:46] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: puppet fail [17:21:49] PROBLEM - puppet last run on radium is CRITICAL: CRITICAL: puppet fail [17:22:01] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: puppet fail [17:22:02] PROBLEM - puppet last run on strontium is CRITICAL: CRITICAL: Puppet has 33 failures [17:22:03] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: Puppet has 29 failures [17:22:03] PROBLEM - puppet last run on mw1086 is CRITICAL: CRITICAL: Puppet has 36 failures [17:22:03] PROBLEM - puppet last run on mw1066 is CRITICAL: CRITICAL: Puppet has 68 failures [17:22:03] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: Puppet has 26 failures [17:22:03] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: Puppet has 42 failures [17:22:03] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: Puppet has 19 failures [17:22:06] uh oh [17:22:13] PROBLEM - puppet last run on virt1008 is CRITICAL: CRITICAL: Puppet has 22 failures [17:22:20] :) [17:22:26] PROBLEM - puppet last run on elastic1020 is CRITICAL: CRITICAL: Puppet has 8 failures [17:22:27] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: Puppet has 67 failures [17:22:27] PROBLEM - puppet last run on mw1143 is CRITICAL: CRITICAL: Puppet has 74 
failures [17:22:28] PROBLEM - puppet last run on ms-fe1002 is CRITICAL: CRITICAL: Puppet has 25 failures [17:22:28] PROBLEM - puppet last run on bast2001 is CRITICAL: CRITICAL: puppet fail [17:22:28] PROBLEM - puppet last run on baham is CRITICAL: CRITICAL: Puppet has 26 failures [17:22:28] puppet should give you a grace period before shaming you in here [17:22:32] PROBLEM - puppet last run on radon is CRITICAL: CRITICAL: puppet fail [17:22:32] PROBLEM - puppet last run on cp4002 is CRITICAL: CRITICAL: Puppet has 27 failures [17:22:32] PROBLEM - puppet last run on mw1231 is CRITICAL: CRITICAL: Puppet has 13 failures [17:22:32] PROBLEM - puppet last run on cp4009 is CRITICAL: CRITICAL: Puppet has 26 failures [17:22:32] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: Puppet has 18 failures [17:22:43] PROBLEM - puppet last run on cp1040 is CRITICAL: CRITICAL: Puppet has 26 failures [17:22:43] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: Puppet has 27 failures [17:22:50] that's the 500s [17:22:52] PROBLEM - puppet last run on stat1001 is CRITICAL: CRITICAL: Puppet has 18 failures [17:22:54] PROBLEM - puppet last run on elastic1003 is CRITICAL: CRITICAL: Puppet has 20 failures [17:22:56] PROBLEM - puppet last run on elastic1029 is CRITICAL: CRITICAL: Puppet has 23 failures [17:22:56] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: puppet fail [17:23:02] PROBLEM - puppet last run on mw1236 is CRITICAL: CRITICAL: Puppet has 65 failures [17:23:03] PROBLEM - puppet last run on mw1154 is CRITICAL: CRITICAL: puppet fail [17:23:03] PROBLEM - puppet last run on mw1021 is CRITICAL: CRITICAL: puppet fail [17:23:03] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: Puppet has 45 failures [17:23:03] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: Puppet has 76 failures [17:23:03] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: puppet fail [17:23:04] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: Puppet 
has 27 failures [17:23:04] PROBLEM - puppet last run on mw1101 is CRITICAL: CRITICAL: puppet fail [17:23:05] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: puppet fail [17:23:05] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: puppet fail [17:23:13] PROBLEM - puppet last run on cp1044 is CRITICAL: CRITICAL: Puppet has 7 failures [17:23:13] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: Puppet has 71 failures [17:23:13] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: puppet fail [17:23:14] PROBLEM - puppet last run on mw1253 is CRITICAL: CRITICAL: Puppet has 74 failures [17:23:15] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: Puppet has 23 failures [17:23:15] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: puppet fail [17:23:15] PROBLEM - puppet last run on labsdb1002 is CRITICAL: CRITICAL: puppet fail [17:23:15] PROBLEM - puppet last run on mw1071 is CRITICAL: CRITICAL: Puppet has 31 failures [17:23:15] PROBLEM - puppet last run on wtp1008 is CRITICAL: CRITICAL: puppet fail [17:23:15] PROBLEM - puppet last run on search1003 is CRITICAL: CRITICAL: Puppet has 47 failures [17:23:23] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: puppet fail [17:23:24] PROBLEM - puppet last run on db2017 is CRITICAL: CRITICAL: puppet fail [17:23:26] PROBLEM - puppet last run on mw1073 is CRITICAL: CRITICAL: puppet fail [17:23:26] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 58 failures [17:23:27] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: puppet fail [17:23:27] PROBLEM - puppet last run on lvs2002 is CRITICAL: CRITICAL: puppet fail [17:23:27] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: Puppet has 22 failures [17:23:27] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: Puppet has 22 failures [17:23:27] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: Puppet has 26 failures [17:23:28] PROBLEM - puppet last run on 
eeden is CRITICAL: CRITICAL: Puppet has 25 failures [17:23:28] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Puppet has 29 failures [17:23:29] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: puppet fail [17:23:29] PROBLEM - puppet last run on lvs2005 is CRITICAL: CRITICAL: puppet fail [17:23:30] PROBLEM - puppet last run on db1005 is CRITICAL: CRITICAL: Puppet has 24 failures [17:23:30] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: Puppet has 64 failures [17:23:31] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: puppet fail [17:23:36] PROBLEM - puppet last run on virt1009 is CRITICAL: CRITICAL: puppet fail [17:23:36] PROBLEM - puppet last run on db1035 is CRITICAL: CRITICAL: Puppet has 23 failures [17:23:36] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: puppet fail [17:23:36] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: Puppet has 62 failures [17:23:36] PROBLEM - puppet last run on mw1103 is CRITICAL: CRITICAL: Puppet has 70 failures [17:23:36] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: Puppet has 19 failures [17:23:36] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: puppet fail [17:23:37] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: Puppet has 24 failures [17:23:42] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: Puppet has 82 failures [17:23:42] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: puppet fail [17:23:42] PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: Puppet has 26 failures [17:23:45] PROBLEM - puppet last run on osm-cp1001 is CRITICAL: CRITICAL: Puppet has 15 failures [17:23:46] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: Puppet has 66 failures [17:23:47] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: Puppet has 56 failures [17:23:47] PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: Puppet has 19 failures [17:23:47] PROBLEM - puppet last run on amssq45 is 
CRITICAL: CRITICAL: puppet fail [17:23:47] PROBLEM - puppet last run on mw1037 is CRITICAL: CRITICAL: puppet fail [17:23:47] PROBLEM - puppet last run on mw1255 is CRITICAL: CRITICAL: Puppet has 67 failures [17:23:47] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: Puppet has 22 failures [17:23:48] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: Puppet has 78 failures [17:23:49] PROBLEM - puppet last run on wtp1010 is CRITICAL: CRITICAL: Puppet has 24 failures [17:23:56] PROBLEM - puppet last run on analytics1018 is CRITICAL: CRITICAL: Puppet has 13 failures [17:23:56] PROBLEM - puppet last run on mw1085 is CRITICAL: CRITICAL: puppet fail [17:24:15] ... [17:25:44] (03PS1) 10Faidon Liambotis: install-server: restore squid3-apt-proxy.conf [puppet] - 10https://gerrit.wikimedia.org/r/179156 [17:26:29] (03CR) 10Faidon Liambotis: [C: 032 V: 032] install-server: restore squid3-apt-proxy.conf [puppet] - 10https://gerrit.wikimedia.org/r/179156 (owner: 10Faidon Liambotis) [17:26:33] (03PS3) 10Dzahn: Phabricator Sprint (0.6.1.4) [puppet] - 10https://gerrit.wikimedia.org/r/179155 (owner: 10Christopher Johnson (WMDE)) [17:30:15] (03PS4) 10Faidon Liambotis: grub: use augeas to modify the config [puppet] - 10https://gerrit.wikimedia.org/r/178897 [17:35:20] (03CR) 10Ori.livneh: grub: use augeas to modify the config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/178897 (owner: 10Faidon Liambotis) [17:36:28] (03PS5) 10Faidon Liambotis: grub: use augeas to modify the config [puppet] - 10https://gerrit.wikimedia.org/r/178897 [17:39:43] (03CR) 10Ori.livneh: "IMO there should still be an 'apt-get update resource', but it should only be executed by Puppet when adding apt sources." (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/179082 (owner: 10Faidon Liambotis) [17:40:08] ori: ? 
[17:40:25] ori: that's exactly what the patch does [17:40:30] unless I misunderstood you [17:40:42] oh, let me look again, maybe i misread [17:40:49] yes, yes i did [17:40:51] carry on [17:40:53] i'll get coffee [17:40:55] also look at the last paragraph of the commit [17:41:08] (commit message) [17:41:10] (03CR) 10Ori.livneh: "Disregard comment, I can't read" [puppet] - 10https://gerrit.wikimedia.org/r/179082 (owner: 10Faidon Liambotis) [17:41:34] glad you came up with the same idea on your own though [17:41:49] must be sane :) [17:41:58] !log starting trusty upgrade of analytics1019 [17:42:02] Logged the message, Master [17:42:41] (and re: whitespace you're right, but I just moved the lines so only partially my fault!) [17:42:46] what do you think of the idea in general? [17:44:26] (03CR) 10Ori.livneh: "It's a bit rude to +2 your own changes and then -2 mine. If you'd rather make corrections yourself, then wait for review." [puppet] - 10https://gerrit.wikimedia.org/r/179027 (owner: 10Ori.livneh) [17:45:22] <_joe_> ori: we kinda needed that in production [17:45:36] paravoid: i think it's a good idea. the other thing i was thinking of was having a local apt proxy that updates asynchronously from puppet [17:45:55] why would you need an apt proxy [17:46:00] <_joe_> and I find kind of irritating you reorganized the python script completely, and marked it as "pep8 fixes", as stated this morning. [17:46:11] that was one bullet point in my commit message [17:46:15] <_joe_> but then again, I said enough on this. 
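[Editor's note] The idea Ori and Faidon converge on above — keep an `apt-get update` resource, but have Puppet run it only when an apt source is added or changed — could be sketched roughly like this (a sketch only, with hypothetical resource names and paths, not the actual contents of Gerrit change 179082):

```puppet
# Run apt-get update only when explicitly notified, never on routine runs.
exec { 'apt-get-update':
    command     => '/usr/bin/apt-get update',
    refreshonly => true,
}

# Adding or changing a source list triggers exactly one update;
# agent runs with no source changes skip the exec entirely.
file { '/etc/apt/sources.list.d/example.list':
    ensure => file,
    source => 'puppet:///modules/apt/example.list',
    notify => Exec['apt-get-update'],
}
```

This keeps routine agent runs fast and stops transient mirror hiccups from showing up as Puppet failures; the trade-off, as noted later in the discussion, is that `ensure => latest` packages then depend on an out-of-band (e.g. cron-driven) `apt-get update`.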
[17:46:41] and at least i added you as a reviewer when modifying your code, whereas you didn't think to include me [17:46:50] <_joe_> also, I think the file entry removal is harmful and useless (it's at most some kb) [17:47:26] <_joe_> so my -2 had a precise reason [17:47:43] <_joe_> but then, I'll live with you thinking I'm rude [17:47:53] <_joe_> whatever [17:50:13] * _joe_ meeting [17:52:00] paravoid: i'm not sure if it could be used for that purpose, but i thought if apt is always connecting to localhost then the failures could be disguised from puppet [17:52:16] i think the approach of your patch is good, though [17:53:33] if I wanted to disguise the errors I could just tell puppet to ignore non-zero exit codes [17:55:06] well, true. but it's a bit strange for every single server to ping canonical. it'd be nice if there was a way for a single host to fetch package listing updates and then sync those to the rest of the cluster [17:55:43] we're not pinging canonical [17:55:46] we have a local mirror [17:55:54] so why is apt-get failing? [17:56:36] we ping canonical for security.ubuntu.com which is unmirrorable, but these go via squid [17:57:15] are security.ubuntu.com updates the reason why apt-get update fails? [17:57:38] I don't actually know [17:58:07] (03CR) 10Giuseppe Lavagetto: "@bryan: as said on IRC, the solution would be to have scap check on error - if the host is down in icinga, then suppress the error. 
Not" [puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [17:59:08] paravoid: i think your patch is a good idea regardless, for the other reasons you mentioned (the fact that it slows down puppet runs substantially), but it'd be interesting to get to the bottom of why it's failing [17:59:25] it's not failing very often [17:59:31] and that can happen for a number of reasons [17:59:43] for instance, the apt repository format isn't very http caching-friendly [17:59:52] if Packages and Release get out of sync for instance [17:59:59] cryptographic signatures will fail [18:01:31] another option is to customize the apt package provider [18:01:40] so that it runs 'apt-get update' on the first package install [18:01:59] so apt-get updates would only get run when there are packages that need to be installed [18:02:16] for the ensure => latest cases, you'd rely on a cron-jobbed apt-get update [18:05:24] ^ paravoid [18:05:40] why, though? [18:05:49] and apt-get operation isn't very expensive [18:05:56] especially if it hits local mirrors [18:06:48] so your patch is probably the best approach [18:09:16] (03CR) 10Ori.livneh: [C: 031] "We discussed alternatives on IRC and on reflection I think this is the best approach." [puppet] - 10https://gerrit.wikimedia.org/r/179082 (owner: 10Faidon Liambotis) [18:10:26] (03CR) 10Ori.livneh: [C: 032] "small, test-only change" [debs/pybal] - 10https://gerrit.wikimedia.org/r/179065 (owner: 10Ori.livneh) [18:10:47] (03Merged) 10jenkins-bot: Add test for IdleConnectionMonitoringProtocol.run [debs/pybal] - 10https://gerrit.wikimedia.org/r/179065 (owner: 10Ori.livneh) [18:11:14] puppet runs on my test Debian system are 30% faster btw [18:11:30] it has a newer puppet, and more importantly a newer ruby [18:12:02] i'm not opposed to using debian at all [18:12:18] i thought it'd be a lot of work [18:12:29] but since it looks like you've already done a lot of it.. 
[18:12:44] puppet runs cleanly now [18:13:06] and very few changes were really Debian-specific [18:13:20] most of them were either manifest bugs that just manifested now [18:13:22] or upstart vs. systemd [18:13:28] which we'll need to tackle eventually anyway [18:14:02] nice pun [18:14:16] ;) [18:14:19] :) [18:14:33] https://gerrit.wikimedia.org/r/#/q/project:operations/puppet+topic:jessie,n,z [18:14:41] (03PS1) 10Filippo Giunchedi: mwprof: use graphite-in.eqiad.wmnet [puppet] - 10https://gerrit.wikimedia.org/r/179164 [18:15:03] ori: ^ [18:15:19] (03CR) 10Ori.livneh: [C: 031] mwprof: use graphite-in.eqiad.wmnet [puppet] - 10https://gerrit.wikimedia.org/r/179164 (owner: 10Filippo Giunchedi) [18:15:35] only a handful are Debian-specific [18:15:50] now of course that's for the basic install [18:16:11] ori: sweet, thanks! [18:16:12] depending on the service, we'll discover bits that are broken I guess [18:16:21] ori: I'll change manually the hhvm profiler [18:16:22] but we did so with trusty as well, it's part of the game [18:16:35] (03PS2) 10Filippo Giunchedi: mwprof: use graphite-in.eqiad.wmnet [puppet] - 10https://gerrit.wikimedia.org/r/179164 [18:16:41] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] mwprof: use graphite-in.eqiad.wmnet [puppet] - 10https://gerrit.wikimedia.org/r/179164 (owner: 10Filippo Giunchedi) [18:16:48] we have os_version() and $lsbdistrelease checks all over the place [18:19:32] I can't wait for our glorious jessie-backports future [18:20:21] !log ori Synchronized php-1.25wmf12/resources/src/mediawiki.action/mediawiki.action.edit.stash.js: Ib2de3f15: Stash edit when user idles (duration: 00m 05s) [18:20:24] Logged the message, Master [18:31:25] (03PS1) 10Dzahn: udp2log - stop using old iptables classes [puppet] - 10https://gerrit.wikimedia.org/r/179166 [18:37:16] <_joe_> ori: do you think we can repurpose osmium, or it's still needed? 
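[Editor's note] The `os_version()` / `$lsbdistrelease` gating mentioned above — mostly needed for the upstart vs. systemd split — typically looks something like the following generic sketch (not actual repo code; the `example` service name and file paths are made up):

```puppet
# Pick upstart vs. systemd bits based on the distro release.
if $::operatingsystem == 'Debian' and versioncmp($::operatingsystemrelease, '8') >= 0 {
    # jessie and later boot with systemd: ship a unit file
    file { '/lib/systemd/system/example.service':
        source => 'puppet:///modules/example/example.service',
    }
} else {
    # precise/trusty use upstart: ship a job definition instead
    file { '/etc/init/example.conf':
        source => 'puppet:///modules/example/example.conf',
    }
}
```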
[18:37:31] <_joe_> I have other people in need of a sandbox :) [18:39:49] (03CR) 10Dzahn: move mediawiki maintenance scripts to module (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/178873 (owner: 10Dzahn) [18:41:05] !log restart profiler-to-carbon on tungsten to pick up changes, including hhvm-profiler-to-carbon [18:41:11] Logged the message, Master [18:45:37] (03PS1) 10Anomie: Enable $wgExtractsExtendOpenSearchXml [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179168 [18:50:29] (03PS2) 10Dzahn: move mediawiki maintenance scripts to module [puppet] - 10https://gerrit.wikimedia.org/r/178873 [18:56:32] (03CR) 10Dzahn: "http://puppet-compiler.wmflabs.org/546/change/178873/html/terbium.eqiad.wmnet.html" [puppet] - 10https://gerrit.wikimedia.org/r/178873 (owner: 10Dzahn) [19:00:46] (03PS1) 10RobH: setting hostname mgmt address for host einsteinium [dns] - 10https://gerrit.wikimedia.org/r/179171 [19:12:34] (03PS1) 10RobH: adding unified project cert from globalsign [puppet] - 10https://gerrit.wikimedia.org/r/179173 [19:21:06] !log rescuing revisions on frwiki (https://phabricator.wikimedia.org/T76979) [19:21:11] Logged the message, Master [19:22:46] (03PS1) 10RobH: adding einsteinium production dns entry [dns] - 10https://gerrit.wikimedia.org/r/179175 [19:23:03] (03CR) 10RobH: [C: 032] setting hostname mgmt address for host einsteinium [dns] - 10https://gerrit.wikimedia.org/r/179171 (owner: 10RobH) [19:23:29] (03CR) 10RobH: [C: 032] adding einsteinium production dns entry [dns] - 10https://gerrit.wikimedia.org/r/179175 (owner: 10RobH) [19:26:48] (03CR) 10BryanDavis: "Filed T78319 to track need for some sort of error triage or suppression for known down hosts." 
[puppet] - 10https://gerrit.wikimedia.org/r/179121 (owner: 10Giuseppe Lavagetto) [19:30:28] !log powering down tmh1002 to replace failed disk [19:30:35] Logged the message, Master [19:39:46] (03PS2) 10BBlack: adding unified project cert from globalsign [puppet] - 10https://gerrit.wikimedia.org/r/179173 (owner: 10RobH) [19:41:38] (03CR) 10BBlack: [C: 032] adding unified project cert from globalsign [puppet] - 10https://gerrit.wikimedia.org/r/179173 (owner: 10RobH) [19:47:06] (03PS10) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [19:53:09] (03PS11) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [19:54:24] (03PS12) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [19:56:09] (03PS13) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [19:58:04] (03PS14) 10Yuvipanda: [WIP] Add dblist / shard support for bootstrapper [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/179110 [19:59:34] greg-g: we're doing a surprise CentralNotice deployment, in 30 sec :) [19:59:43] bog willing! [19:59:50] surprise! [20:05:00] I have to complain about this every time I deploy, but... Doesn't it seem wasteful to allow the "test" jobs to run when identical gate-and-submit jobs are also running? 
[20:11:48] (03PS1) 10BBlack: test uni.wm.o cert on cp1008 [puppet] - 10https://gerrit.wikimedia.org/r/179183 [20:13:05] (03PS1) 10Dzahn: smokeping: fix minor compiler warnings [puppet] - 10https://gerrit.wikimedia.org/r/179184 [20:13:23] !log awight Synchronized php-1.25wmf11/extensions/CentralNotice: IE fix for CentralNotice hide cookies (duration: 00m 07s) [20:13:25] Logged the message, Master [20:13:34] !log awight Synchronized php-1.25wmf12/extensions/CentralNotice: IE fix for CentralNotice hide cookies (duration: 00m 06s) [20:13:36] Logged the message, Master [20:14:42] (03PS2) 10Dzahn: smokeping: fix minor compiler warnings [puppet] - 10https://gerrit.wikimedia.org/r/179184 [20:18:12] (03PS2) 10BBlack: test uni.wm.o cert on cp1008 [puppet] - 10https://gerrit.wikimedia.org/r/179183 [20:21:22] (03CR) 10BBlack: [C: 032] test uni.wm.o cert on cp1008 [puppet] - 10https://gerrit.wikimedia.org/r/179183 (owner: 10BBlack) [20:27:01] (03PS1) 10Dzahn: monitoring: move decom_host define into autoload [puppet] - 10https://gerrit.wikimedia.org/r/179188 [20:30:06] (03PS2) 10Dzahn: monitoring: move decom_host define into autoload [puppet] - 10https://gerrit.wikimedia.org/r/179188 [20:37:44] (03PS1) 10RobH: setting analytics1001-1002 productoin dns entries [dns] - 10https://gerrit.wikimedia.org/r/179189 [20:37:54] (03CR) 10jenkins-bot: [V: 04-1] setting analytics1001-1002 productoin dns entries [dns] - 10https://gerrit.wikimedia.org/r/179189 (owner: 10RobH) [20:38:33] (03PS2) 10RobH: setting analytics1001-1002 productoin dns entries [dns] - 10https://gerrit.wikimedia.org/r/179189 [20:42:00] (03PS1) 10Dzahn: zuul: move parameters around for lint [puppet] - 10https://gerrit.wikimedia.org/r/179190 [20:42:52] (03CR) 10jenkins-bot: [V: 04-1] zuul: move parameters around for lint [puppet] - 10https://gerrit.wikimedia.org/r/179190 (owner: 10Dzahn) [20:43:14] Is there still a sync-l10nupdate? Seems like no. 
[20:43:38] (03PS2) 10Dzahn: zuul: move parameters around for lint [puppet] - 10https://gerrit.wikimedia.org/r/179190 [20:44:24] (03CR) 10jenkins-bot: [V: 04-1] zuul: move parameters around for lint [puppet] - 10https://gerrit.wikimedia.org/r/179190 (owner: 10Dzahn) [20:45:24] (03PS3) 10Dzahn: zuul: move parameters around for lint [puppet] - 10https://gerrit.wikimedia.org/r/179190 [20:48:12] Reedy: fyi, after all my complaining, I'm actually running l10nupdate. Anything I should be worried about? [20:48:22] awight: it doesn't work? [20:49:03] Um, it probably works, I'm just scared :) I don't see docs for this tool. [20:49:21] Maybe docs should go here? https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Alternative_to_scap [20:51:09] awight: It's broken [20:51:21] It won't sync the files to the apaches [20:51:27] /mw app servers [20:51:44] oooh ok [20:51:48] no worries. [20:52:03] I need to do a full scap in order to sync l10n, then, eh? [20:52:16] yeah [20:52:26] but you need to backport message changes [20:52:51] or, in your case, update the branch pointer [20:53:05] Reedy: gah. I'd kind of forgotten that l10nupdate was still broken [20:55:27] Reedy: ok I think I follow. Yeah, I've deployed the msgs in extension code, so now I'm in position to scap... [20:55:30] (03CR) 10Dzahn: "try rebasing this now, i think the class that Yuvi commented on and based his -1 on has been removed meanwhile" [puppet] - 10https://gerrit.wikimedia.org/r/170477 (owner: 10John F. Lewis) [20:58:17] !log awight Synchronized php-1.25wmf11/cache/l10n: (no message) (duration: 00m 03s) [20:58:22] Logged the message, Master [20:58:25] !log LocalisationUpdate completed (1.25wmf11) at 2014-12-11 20:58:24+00:00 [20:58:29] Logged the message, Master [21:01:55] (03PS2) 10Dzahn: base: lint fixes [puppet] - 10https://gerrit.wikimedia.org/r/170477 (owner: 10John F. 
Lewis) [21:08:30] (03Abandoned) 10Manybubbles: Lower full text search queue [mediawiki-config] - 10https://gerrit.wikimedia.org/r/176932 (owner: 10Manybubbles) [21:09:21] !log awight Synchronized php-1.25wmf12/cache/l10n: (no message) (duration: 00m 01s) [21:09:25] Logged the message, Master [21:09:26] !log LocalisationUpdate completed (1.25wmf12) at 2014-12-11 21:09:26+00:00 [21:09:28] Logged the message, Master [21:10:27] (03CR) 10Dzahn: [C: 04-1] "please remove the change in bacula::director::schedule then let's see again if we can get the rest in" [puppet] - 10https://gerrit.wikimedia.org/r/170476 (owner: 10John F. Lewis) [21:13:06] (03PS1) 10Ottomata: Putting stats.wikimedia.org and datasets.wikimedia.org behind misc-web-lb [puppet] - 10https://gerrit.wikimedia.org/r/179197 [21:13:12] (03CR) 10Dduvall: "Nice job factoring out the private modules; that seems like a good place to start stubbing/mocking." [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [21:15:35] mutante: https://gerrit.wikimedia.org/r/#/c/179197/, look ok? i'll remove ssl settings from apache configs in a separate change if this works [21:15:48] none of the sites force ssl now, except in one specific instance of stats.wm.org (geowiki) [21:16:08] (03CR) 10Dzahn: "this looks right for the varnish side, just don't forget that on the Apache side you should remove the entire SSL config and especially ch" [puppet] - 10https://gerrit.wikimedia.org/r/179197 (owner: 10Ottomata) [21:16:14] :) [21:16:44] how's this work, btw? with ssl? does nginx do ssl for this? [21:16:56] yes, it's nginx [21:17:02] is it just nginx on the misc varnish boxes that decrypts and proxies to local varnish? 
[21:17:05] you also need the DNS change of course [21:17:06] yes [21:17:13] going to test with explicit Host header first [21:17:20] yes, good [21:17:37] going to test with host and with the ssl removal before I change DNS [21:17:39] yes to the question above as well, afaict [21:17:59] (03CR) 10Ottomata: [C: 032] "Cool, that will come in a separate commit after I test. Same for DNS." [puppet] - 10https://gerrit.wikimedia.org/r/179197 (owner: 10Ottomata) [21:18:00] cool [21:18:03] it can be tricky if your backend has a public IP still [21:18:13] because then people can also still talk to it directly [21:18:25] so the order of things.. removing SSL config and switching DNS... [21:19:10] yeah [21:19:13] it does still have public IP [21:19:18] IP change will happen next week [21:19:26] you want puppet on cp1043/1044 after that change [21:19:38] doing that now :) thanks! found that in pybal :) [21:20:26] in case you want to enforce http->https ( i think you do) after it's behind misc-web [21:20:30] RewriteCond %{HTTP:X-Forwarded-Proto} !https [21:20:33] use that [21:20:46] RewriteRule (.*) https://... [21:22:56] mutante, i think i don't want to enforce https, not sure though, at least not yet. [21:22:58] but [21:23:12] in the meantime...is there anything else I have to do to get https to work from cp1043, for example? 
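[Editor's note] Filling out the rewrite mutante quotes above: once TLS terminates at nginx on the misc-web caches, the backend Apache never sees an HTTPS connection itself, so the redirect has to key on the `X-Forwarded-Proto` request header rather than on the local scheme. A generic sketch, assuming the terminating proxy sets that header:

```apache
RewriteEngine On
# Only redirect requests that reached the terminator over plain HTTP;
# requests decrypted by nginx arrive with X-Forwarded-Proto: https.
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteRule ^/(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
```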
[21:23:17] there's one more thing about your varnish change [21:23:18] it proxies the http request fine [21:23:22] but https gives me this: [21:23:24] in case you wanted "no caching" for stats [21:23:28] curl: (51) SSL peer certificate or SSH remote key was not OK [21:23:29] am doing [21:23:32] but caching for datasets [21:23:39] then you'd have to define them separately [21:23:42] curl -H 'Host: datasets.wikimedia.org' https://localhost [21:23:56] ah [21:24:01] caching is fine i think, for both of these [21:27:41] hm --insecure works, so at least the proxy is working [21:28:20] yea, its subjectAltName does not match datasets.wikimedia.org [21:28:28] it's kind of surprising though [21:28:41] try adding lots of -vvv [21:29:59] cp1043:~# curl -H 'Host: datasets.wikimedia.org' https://datasets.wikimedia.org -S -vvv -I 10.64.0.171 [21:30:09] sorry, 10.30 pm here [21:30:18] but something like that and i get a 200 OK [21:30:44] i'd try from local computer though, by hacking /etc/hosts to point it to misc-web [21:30:54] and then just browser [21:32:31] https://datasets.wikimedia.org/ won't work cause that will hit stat1001 [21:32:34] dns is changed [21:32:43] misc-web is an addy? [21:32:44] oh! [21:33:00] ah! perfect that is easier [21:33:10] Hi all! I'd like to make sure Special:HideBanners isn't being cached for too long (more than a day). I'm afraid it may be sticking around for a while, as it's got max-age=0 in the response headers now. Should we make a code change to set that header in the special page, or limit its cache duration some other way? [21:33:21] ottomata: host misc-web-lb.eqiad.wikimedia.org [21:33:37] like it like it... [21:33:42] like this: [21:33:59] 208.80.154.241 stats.wikimedia.org [21:34:14] mutante: curl -vvv -H 'Host: datasets.wikimedia.org' https://misc-web-lb.eqiad.wikimedia.org [21:34:56] ottomata: yep, works :) [21:34:59] works! [21:35:06] ? [21:35:08] i get bad key [21:35:15] without --insecure [21:35:59] running it where? 
[21:36:16] i did on cp1043 [21:36:33] * SSL connection using ECDHE-RSA-AES128-SHA [21:36:43] on my local [21:36:57] then you may need to specify the path to CA cert [21:36:58] Hi operations... can anyone tell me how long Special:HideBanners is cached, and how to fiddle with that cache time if necessary? I see the exclusion condition in puppet/templates/varnish/text-backend.inc.vcl.erb... [21:37:02] ejegg: ^ [21:37:06] oh hm [21:37:09] it works on cp1043 now [21:37:10] intresteing! [21:37:11] ok... [21:37:24] meh, whatever is happening, i think it is fine [21:37:27] i will make the DNS change [21:37:32] ottomata: this :* successfully set certificate verify locations: [21:37:38] CApath: /etc/ssl/certs [21:38:13] might be different if that exists on your OS and has the CA cert [21:39:33] AndyRussG: it looks like we /should/ be setting s-maxage to 86400 (1 day) [21:39:42] when the user is not logged in [21:39:57] however, I'm not seeing that [21:41:11] (03PS9) 10Hashar: contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 [21:41:37] ejegg: hmmm... [21:41:49] (03PS1) 10Ottomata: Point stats and datasets .wikmedia.org at misc-web-lb [dns] - 10https://gerrit.wikimedia.org/r/179215 [21:42:14] AndyRussG: I /am/ seeing it locally, but perhaps varnish resets that header? [21:43:15] (03PS1) 10BryanDavis: Revert "Revert "Configure logging to use MWLoggerMonologSpi"" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [21:43:27] (03CR) 10BryanDavis: [C: 04-2] "Needs fixes" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [21:45:12] AndyRussG: yup, being reset in varnish, with exceptions for BannerController and BannerListLoader [21:45:32] ejegg: ah.... where didja find that? 
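[Editor's note] The curl: (51) failure seen above is expected with `-H 'Host: …' https://localhost`: curl verifies the certificate against the hostname in the URL (and sends that name via SNI), not against the Host header. A cleaner way to smoke-test a vhost behind misc-web before flipping DNS, without hand-editing `/etc/hosts`, is curl's `--resolve` option; a sketch, using the misc-web-lb address quoted in the log:

```shell
# Pin the real name to the frontend IP so SNI, the Host header, and
# certificate verification all use datasets.wikimedia.org.
curl -sv --resolve datasets.wikimedia.org:443:208.80.154.241 \
    https://datasets.wikimedia.org/ -o /dev/null
```

If the unified cert's SAN list includes the name, this should verify cleanly with no need for `--insecure`.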
[21:45:44] text-frontend.inc.vcl.erb [21:45:44] (03CR) 10Hashar: "PS9 removes the path, unfortunately that causes the ::hhvm class to use the built in default:" [puppet] - 10https://gerrit.wikimedia.org/r/178806 (owner: 10Hashar) [21:45:57] (03PS2) 10BryanDavis: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [21:46:20] ejegg: it gets thru to the hhvm_is_cacheable call u think? [21:46:35] bblack: yt? [21:46:52] yes, but engaged in something else deep at the moment, but I'll come back and look above in a few [21:47:07] !log Jenkins re adding [https://integration.wikimedia.org/ci/computer/integration-slave1009/ integration-slave1009] to the pool of slaves [21:47:10] Logged the message, Master [21:47:35] AndyRussG: not sure about that, but the reset is in vcl_deliver [21:50:01] AndyRussG: So as far as varnish is concerned, it will see the s-maxage param and only cache for a day [21:50:22] (03CR) 10Matanya: "duplicate of https://gerrit.wikimedia.org/r/#/c/169691/ ?" [puppet] - 10https://gerrit.wikimedia.org/r/179166 (owner: 10Dzahn) [21:50:30] Then caches closer to the user will see max-age=0; must-revalidate and will always go back to varnish [21:51:25] By tomorrow the new header should be on that page across all domains [21:53:13] ok, what's the s-maxage deal? I don't have the whole context here. [21:53:48] bblack: I think ejegg figured it out... 
:) we just wanted to check how long Special:HideBanners is being cached [21:54:25] That's a special page created by CentralNotice to set "hide" cookies when users donate or when they click on a banner's hide button [21:54:57] ok :) [21:54:57] Mmrrgg with this varnish DSL I feel like I'm reading minified javascript [21:55:25] I like to think of it as assembly language for HTTP :) [21:56:11] (03PS3) 10BryanDavis: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [21:56:21] (03PS1) 10Ottomata: Put stat1001.wikimedia.org behind misc-web-lb in preparation for move to private IP next week. [puppet] - 10https://gerrit.wikimedia.org/r/179220 [21:56:27] the first rule of varnish club is: any time you think you understand exactly what's going on, you don't [21:56:38] the second rule of varnish club is: any time you think you understand exactly what's going on, YOU DON'T [21:57:07] (03CR) 10Ottomata: [C: 032] Put stat1001.wikimedia.org behind misc-web-lb in preparation for move to private IP next week. [puppet] - 10https://gerrit.wikimedia.org/r/179220 (owner: 10Ottomata) [21:57:54] (03CR) 10Ottomata: [V: 032] Put stat1001.wikimedia.org behind misc-web-lb in preparation for move to private IP next week. [puppet] - 10https://gerrit.wikimedia.org/r/179220 (owner: 10Ottomata) [21:58:08] (03PS6) 10Hashar: Basic rspec setup [puppet] - 10https://gerrit.wikimedia.org/r/178810 [22:00:19] uh oh puppet didn't like that last change, uhh [22:01:47] doh [22:01:49] (03PS1) 10Ottomata: Fix for missing = in == [puppet] - 10https://gerrit.wikimedia.org/r/179222 [22:02:04] (03CR) 10Ottomata: [C: 032 V: 032] Fix for missing = in == [puppet] - 10https://gerrit.wikimedia.org/r/179222 (owner: 10Ottomata) [22:03:21] bblack: that's complicated [22:06:36] bblack: thanks BTW! do you want to add anything to ejegg's analysis (above)? [22:07:31] greg-g, robh: according to the deploy calendar there doesn't seem to be any deploying right now. 
I'd like to deploy Parsoid, if that's okay. [22:09:47] kk [22:10:37] yes, but i also will be going afk @ some point in the next 30 minutes (currently staying in a single car household, thus i have to drive to pick someone up) [22:11:11] (not that i do much other than page folks ;) [22:11:45] (03CR) 10Hashar: "I have edited the commit summary to point to T78342" [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [22:14:57] (03PS4) 10BryanDavis: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [22:19:56] AndyRussG: no, I have nothing to add, but I haven't really even looked. [22:20:28] bblack: OK thanks!! ejegg ^ We'll keep an eye on it and ping u if there are issuz, then... :) [22:20:44] marktraceur: RoanKattouw: MaxSem: ^demon|lunch: quick question: are u planning to run scap during the SWAT deploy later today? Just asking because awight did a CentralNotice deploy and there were some very, very obscure new i18n messages that didn't go out... But if ur gonna run scap in your deploy later, that's good enough I think... (also cc ejegg) [22:21:19] I probably won't be SWATting, I usually don't in the evening [22:21:21] AndyRussG: SWATs are generally scap-less [22:21:41] And currently no one's registered for the SWAT [22:22:03] But if you want one of us to run scap in that window, we can [22:22:17] mobile will definitely push stuff for scap [22:22:38] RoanKattouw: marktraceur: ah OK thanks... Uh lemme see then [22:23:00] s/for scap/for swat/ [22:24:45] OuKB: ah will that involve running scap (which is the standard thing for adding new i18n messages I think?) [22:25:20] RoanKattouw: if no other scap gets run and someone will be on the cluster and can easily run scap without much bother, that'd be fantastic [22:25:40] If it's inconvenient we can also take care of it tho [22:25:56] (I think it's long and one has to babysit it? never done one...) 
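[Editor's note] For reference, the manual fallback discussed here — needed because `l10nupdate` wasn't syncing its rebuilt caches out to the app servers — is a full scap from the deployment host, which rebuilds and pushes the l10n caches along with everything else. A hypothetical invocation (exact scap usage and paths varied by version; the argument is the log message that ends up in the SAL via !log, matching what appears later in this log):

```shell
# From the MediaWiki staging checkout on the deployment host (tin at the time):
scap 'i18n update for CentralNotice'
```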
[22:25:58] greg-g, can we just scap i18n for AndyRussG to avoid spending time on it during swat window? [22:26:03] ^^^^ [22:26:51] MaxSem: yeah [22:27:05] okay, I'll scap now [22:28:23] MaxSem: greg-g: thanks much!! [22:28:29] RoanKattouw: thanks much also! [22:29:10] oh wait AndyRussG - https://wikitech.wikimedia.org/wiki/Server_Admin_Log shows that there was an l10n update 40 minutes after Adam's pushes [22:29:19] can you check if scap is still needed? [22:29:29] MaxSem: OK one sec [22:31:10] MaxSem: yeah apparently l10nupdate is still broken, and does not push to cache successfully. [22:31:14] AndyRussG: ^ [22:31:21] errrg [22:31:39] same shit yurik was complaining about yesterday [22:32:23] okay, I'll try [22:32:35] MaxSem: awight: yeah indeed our very obscure page is still unmesageful: http://en.wikipedia.org/wiki/Special:HideBanners/P3P?action=purge [22:33:26] lol you can't purge a special page because it's already not cached [22:33:27] (03CR) 10BryanDavis: "Diff PS4 against PS1 to see the changes made since the initial deploy to correct the errors seen in production." (033 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [22:33:37] MaxSem: it's tracked here, https://phabricator.wikimedia.org/T76061 [22:33:49] MaxSem: not quite true! [22:34:26] Though it's usually a good assumption :) [22:34:57] in any case, action=purge shouldn't work:P [22:35:14] (03CR) 10BryanDavis: "retest" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [22:36:02] MaxSem: yeah I think you're right about that one [22:36:40] !log maxsem Started scap: i18n update for CentralNotice [22:36:44] Logged the message, Master [22:36:50] (03CR) 10BryanDavis: "recheck" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [22:37:35] Re: caching that page see backscroll, ejegg and bblack's messages... 
in /puppet/templates/varnish/text-backend.inc.vcl.erb: there's a line that says "req.url !~ "^/wiki/Special:HideBanners"" [22:38:00] MaxSem: if you have any more thoughts on the caching of that exact page, they're much appreciated :) and thanks again [22:42:16] !log Zuul stuck [22:42:21] Logged the message, Master [22:45:25] !log Disconnected/reconnected the Jenkins Gearman client which unstuck Zuul magically. [22:45:28] Logged the message, Master [22:50:23] !log updated OCG to version bfc3812ef346c9f767135b339cedd123a1bcac98 [22:50:26] Logged the message, Master [22:50:52] (03Abandoned) 10BryanDavis: beta: enable php processing for bits/w [puppet] - 10https://gerrit.wikimedia.org/r/177700 (owner: 10BryanDavis) [22:55:36] (03CR) 10Hashar: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [22:56:11] !log updated Parsoid to version d16dd2db [22:56:13] Logged the message, Master [22:58:36] (03CR) 10Hashar: "And now we have a Jenkins job to play with (just comment 'check experimental' on this change), though it invokes 'bundle exec rspec' and " [puppet] - 10https://gerrit.wikimedia.org/r/178810 (owner: 10Hashar) [23:04:19] (03PS5) 10BryanDavis: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [23:05:50] !log maxsem Finished scap: i18n update for CentralNotice (duration: 29m 09s) [23:05:54] Logged the message, Master [23:07:36] (03PS1) 10BryanDavis: [Just In Case] Disable Monolog logger on testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179338 [23:08:18] AndyRussG, awight: seems to work now [23:08:39] (03CR) 10BryanDavis: [C: 04-1] "Do not merge unless I99a032faf8c422f3a443bd91b9afbd77f80729db is blowing up testwiki (and not the rest of the cluster)." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179338 (owner: 10BryanDavis) [23:10:41] MaxSem: fantastic, thx! 
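(Editor's note: the VCL rule quoted above, `req.url !~ "^/wiki/Special:HideBanners"`, is a negated regex match on the request URL — the caching rule applies only to URLs that do *not* start with the Special:HideBanners path. As a quick illustration of how that anchored pattern behaves (a Python sketch of the regex test only, not the actual Varnish evaluation; the `matches_hidebanners` helper is hypothetical), note that subpages like the `/P3P` one discussed here still match, since the pattern is anchored only at the start:)

```python
import re

# Pattern quoted from /puppet/templates/varnish/text-backend.inc.vcl.erb.
# In VCL, `req.url !~ "..."` is true when the URL does NOT match this regex,
# so the surrounding caching rule skips Special:HideBanners URLs.
HIDEBANNERS = re.compile(r"^/wiki/Special:HideBanners")

def matches_hidebanners(url):
    """Mimic the VCL `~` regex test: anchored at the start of req.url."""
    return HIDEBANNERS.search(url) is not None

# The base page and its subpages both match, because only the start is anchored:
print(matches_hidebanners("/wiki/Special:HideBanners"))       # True
print(matches_hidebanners("/wiki/Special:HideBanners/P3P"))   # True
# Anything else falls through to the normal caching behavior:
print(matches_hidebanners("/wiki/Main_Page"))                 # False
```

(In real Varnish, `req.url` also includes the query string and VCL regexes are PCRE, but for a start-anchored literal prefix like this one, Python's `re` behaves the same way.)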
[23:13:11] greg-g: ok, done with OCG and Parsoid deploy [23:13:19] anyone else who wants to break things, feel free! [23:14:13] * MaxSem breaks parsoid and ocg [23:14:26] no no no! [23:14:33] *other* things [23:15:26] I could try to break all the things again if tin is clear for mwf-config changes. I'm 98% sure I fixed the monolog configuration problems form this morning [23:15:40] *from this morning [23:18:05] MaxSem, AndyRussG: Are you guys done on tin? [23:18:12] yup [23:18:28] ^ not me tho [23:18:54] AndyRussG: you're still syncing things? [23:19:12] bd808: no... [23:20:19] * bd808 is confused [23:23:51] bd808: the CentralNotice deploy went out a couple hours ago, and a scap was run later by MaxSem to sync i18n, but it's all done now it seems :) [23:24:41] *nod* [23:27:16] I'm going to try the monolog config again. This time it should only affect testwiki [23:28:36] (03PS6) 10BryanDavis: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 [23:28:43] (03CR) 10BryanDavis: [C: 032] Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [23:28:52] (03Merged) 10jenkins-bot: Configure logging to use MWLoggerMonologSpi [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179217 (owner: 10BryanDavis) [23:30:56] !log bd808 Synchronized wmf-config: Configure logging to use MWLoggerMonologSpi (I99a032f) (duration: 00m 09s) [23:31:01] Logged the message, Master [23:32:32] (03PS1) 10MaxSem: Second attempt at mobile wikidata, now with a subdomain [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179341 [23:33:08] bah [23:33:11] !log bd808 Synchronized wmf-config: quick revert -- Configure logging to use MWLoggerMonologSpi (I99a032f) (duration: 00m 07s) [23:33:15] Logged the message, Master [23:33:30] bd808|deploy, have you tried on labs? 
[23:33:41] MaxSem: yes [23:33:46] It works in beta [23:33:53] :D [23:34:04] (03PS1) 10BryanDavis: Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179342 [23:34:15] (03CR) 10BryanDavis: [C: 032] Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179342 (owner: 10BryanDavis) [23:34:23] (03Merged) 10jenkins-bot: Revert "Configure logging to use MWLoggerMonologSpi" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179342 (owner: 10BryanDavis) [23:35:56] !log bd808 Synchronized wmf-config: Revert Configure logging to use MWLoggerMonologSpi (Ib8ddd86) (duration: 00m 05s) [23:36:01] Logged the message, Master [23:36:13] ssh: connect to host mw1041 port 22: No route to host [23:36:39] doesn't ping from tin but in dsh group [23:37:16] it's down in icinga [23:37:23] done trying for today. I have some new error messages to look at [23:42:51] !T 60196 [23:44:09] !log restarted logstash on logstash1001; fatalmonitor report was empty since ~20:30z [23:44:15] Logged the message, Master [23:53:22] (03Abandoned) 10BryanDavis: [Just In Case] Disable Monolog logger on testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/179338 (owner: 10BryanDavis) [23:54:19] ori: do any of our servers still run puppet w/ ruby 1.8? [23:54:54] marxarelli: yes, sadly [23:54:57] ori: i'm helping hashar with experimental puppet rspec and was curious whether i should suggest we just target >= 1.9 [23:55:12] nope :( [23:55:41] most are on 1.9, but there are some that aren't [23:56:25] ori: got it. thanks!