[00:05:14] (Abandoned) Hoo man: Install 'tidy' for wikitech [puppet] - https://gerrit.wikimedia.org/r/162619 (owner: Hoo man)
[00:14:18] jackmcbarn: Just use github :P
[00:14:31] then why do we have git.wikimedia.org? :P
[00:14:41] We had it before github ;)
[00:14:47] hoo: HTMLTidy is pretty janky. Wikitech might be a decent place to explore why it's still needed.
[00:14:51] Antimony needs some love it seems
[00:15:11] Carmela: Because people don't close tags and WTF when it fucks up their userpage
[00:15:12] Carmela: Nope, beta could be such a place, but wikitech is not supposed to be a playground
[00:15:20] and what Reedy says
[00:15:28] It is just unclosed tags?
[00:15:32] Probably not just
[00:15:37] But that's one user facing ones
[00:15:37] HTMLTidy causes lots of problem.s
[00:15:39] problems
[00:15:42] Per hoo, wikitech no, beta yes
[00:15:58] What's the issue on wikitech? Unclosed tags?
[00:16:04] Is that common?
[00:16:11] Someone came in complaining about it
[00:16:32] Right, but people fuck up wiki pages all the time. That doesn't mean we need an external package to chase after them. :-)
[00:16:33] could we turn it off on test. or test2.wikipedia.org, so the beta cluster doesn't diverge more in config from the prod cluster?
[00:16:53] jackmcbarn: After an RfC maybe
[00:16:55] https://wikitech.wikimedia.org/w/index.php?title=User:Negative24&diff=prev&oldid=128364
[00:17:04] but that would need someone to assess the benefits etc.
[00:17:21] hoo: You've seen the Tidy sucks bug?
[00:17:24] and we probably want an alternate solution for the lost functionally
[00:17:37] hoo: VisualEditor!
[00:17:39] * Reedy hides
[00:17:44] https://bugzilla.wikimedia.org/show_bug.cgi?id=2542
[00:17:46] isn't the only "functionality" that Tidy offers is to clean up after users who screw up?
[00:17:47] Carmela: Nope, but did you see the "enwiki is screwed" one? :P
[00:17:49] well, parsoid I guess
[00:17:49] * hoo hides
[00:17:59] https://bugzilla.wikimedia.org/showdependencytree.cgi?id=2542&hide_resolved=1
[00:18:03] jackmcbarn: And possibly the odd developer
[00:18:08] how so?
[00:18:14] I'm not sure HTMLTidy is still developed.
[00:18:25] that hasn't stopped us before
[00:18:48] > HTMLTidy is pretty janky. Wikitech might be a decent place to explore why it's still needed.
[00:18:51] I still agree with me.
[00:19:06] So... wikitech is the new beta?
[00:19:18] * hoo hides
[00:19:21] :-)
[00:19:45] I think it's reasonable in any environment to assess whether the proposed tool is the correct tool before installing/enabling. ;-)
[00:20:11] If tidy is needed on wikitech.wikimedia.org, it'd be good to figure out why that is. If it's only unclosed tags, then it may make sense to fix that problem instead.
[00:20:11] the decision was to make wikitech a standard WMF deployment wiki
[00:20:14] so that's answered
[00:20:46] Wheel reinvention?
[00:21:07] Reedy: Sanitizer.php
[00:21:19] And MediaWiki sometimes has unique problems, yes.
[00:21:33] * Reedy awaits Carmelas patch
[00:21:33] I'm not sure many sites allow span and div.
[00:22:09] I'm awaiting the requirements first. ;-)
[00:22:13] PROBLEM - puppet last run on ssl3003 is CRITICAL: CRITICAL: puppet fail
[00:22:13] Is it just unclosed tags?
[00:22:21] I've nfi
[00:22:28] Probably not
[00:22:44] Right. Hard to kill something that has a laundry list of hidden features that we may or may not be relying on.
[00:22:46] It'd be amazing if it is
[00:22:49] I'd actually bet there are more more or less funny side effects you don't want to face
[00:22:59] Don't we have wonky unit tests around html tidy?
[00:23:09] Probably.
[00:23:17] Wonky is the word
[00:23:35] I thought wikitech was intentionally kept off the cluster.
[00:23:45] Guess those days are gone.
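The user-facing case Reedy describes above (an unclosed tag breaking a userpage) is the class of repair Tidy performs. Below is a minimal illustration of tag balancing using only Python's stdlib HTML parser; it is a sketch of the idea only, not MediaWiki's Sanitizer.php or HTMLTidy's actual algorithm:

```python
# Illustrative sketch (NOT the real Tidy/Sanitizer code): close tags
# the author left open, so one user's markup can't leak into the rest
# of the page.
from html.parser import HTMLParser

class TagBalancer(HTMLParser):
    """Re-emits the input, tracking open tags and closing stragglers."""
    VOID = {"br", "hr", "img", "input", "meta", "link"}  # never need a close

    def __init__(self):
        super().__init__()
        self.out = []
        self.stack = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # close any inner tags still open before closing this one
            while self.stack and self.stack[-1] != tag:
                self.out.append("</%s>" % self.stack.pop())
            self.stack.pop()
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

def balance(html):
    p = TagBalancer()
    p.feed(html)
    p.close()
    # close whatever is still open at end of input, innermost first
    while p.stack:
        p.out.append("</%s>" % p.stack.pop())
    return "".join(p.out)
```

For example, `balance('<div><span>hi</div>')` closes the dangling `<span>` before the `</div>`, which is roughly what Tidy does to keep an unclosed tag from eating the rest of the page.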
[00:23:45] It is off the cluster, ish
[00:23:55] But it's more common config and such
[00:23:59] It's special, but not too special!
[00:24:05] yus
[00:24:05] I guess it depends on the definition of cluster
[00:24:13] it's on the wikimedia server cluster
[00:24:21] but it's not on the cluster that serves our primary sites
[00:24:35] All right.
[00:24:46] I usually am.
[00:25:52] hoo: FWIW, I sometimes think of wikitech as part of the group0 wikis.
[00:25:56] Even if it technically isn't.
[00:26:14] because it's broken often?
[00:26:14] it gets changed with the group1
[00:26:20] lol
[00:26:20] Because it's weird.
[00:26:30] And does weird development-y things.
[00:40:42] RECOVERY - puppet last run on ssl3003 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[00:51:02] PROBLEM - MySQL Slave Delay on db1016 is CRITICAL: CRIT replication delay 305 seconds
[00:52:06] PROBLEM - MySQL Replication Heartbeat on db1016 is CRITICAL: CRIT replication delay 361 seconds
[00:53:06] RECOVERY - MySQL Replication Heartbeat on db1016 is OK: OK replication delay -1 seconds
[00:53:07] RECOVERY - MySQL Slave Delay on db1016 is OK: OK replication delay 0 seconds
[01:50:16] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:57:21] (PS1) Springle: s4 api traffic prefer db1059 [mediawiki-config] - https://gerrit.wikimedia.org/r/166364
[01:57:56] (CR) Springle: [C: 2] s4 api traffic prefer db1059 [mediawiki-config] - https://gerrit.wikimedia.org/r/166364 (owner: Springle)
[01:58:03] (Merged) jenkins-bot: s4 api traffic prefer db1059 [mediawiki-config] - https://gerrit.wikimedia.org/r/166364 (owner: Springle)
[01:58:56] !log springle Synchronized wmf-config/db-eqiad.php: s4 api traffic prefer db1059 (duration: 00m 08s)
[01:59:09] Logged the message, Master
[02:00:37] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54237 bytes in 9.992 second response time
[02:06:37] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:10:44] !log springle Synchronized wmf-config/db-eqiad.php: repool db1062, warm up (duration: 00m 07s)
[02:10:52] Logged the message, Master
[02:12:37] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54237 bytes in 0.641 second response time
[02:15:46] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:16:03] !log LocalisationUpdate completed (1.25wmf2) at 2014-10-13 02:16:03+00:00
[02:16:09] Logged the message, Master
[02:16:37] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54237 bytes in 1.009 second response time
[02:20:47] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:28:12] !log LocalisationUpdate completed (1.25wmf3) at 2014-10-13 02:28:12+00:00
[02:28:22] Logged the message, Master
[02:36:16] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54237 bytes in 8.368 second response time
[02:36:21] (PS1) Springle: s1 api traffic prefer db1065 and db1066 [mediawiki-config] - https://gerrit.wikimedia.org/r/166365
[02:36:42] (CR) Springle: [C: 2] s1 api traffic prefer db1065 and db1066 [mediawiki-config] - https://gerrit.wikimedia.org/r/166365 (owner: Springle)
[02:36:49] (Merged) jenkins-bot: s1 api traffic prefer db1065 and db1066 [mediawiki-config] - https://gerrit.wikimedia.org/r/166365 (owner: Springle)
[02:37:54] !log springle Synchronized wmf-config/db-eqiad.php: s1 api traffic prefer db1065 and db1066 (duration: 00m 07s)
[02:38:00] Logged the message, Master
[02:47:27] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:53:26] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54237 bytes in 1.976 second response time
[03:29:19] PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: Puppet has 2 failures
[03:33:52] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Oct 13 03:33:51 UTC 2014 (duration 33m 50s)
[03:33:57] Logged the message, Master
[03:46:47] RECOVERY - puppet last run on dbproxy1001 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[04:25:17] (PS6) KartikMistry: Apertium service configuration for Beta [puppet] - https://gerrit.wikimedia.org/r/165485
[06:21:57] <_joe_> morning
[06:26:41] Morning.
[06:28:09] PROBLEM - puppet last run on mw1039 is CRITICAL: CRITICAL: puppet fail
[06:28:09] PROBLEM - puppet last run on mw1126 is CRITICAL: CRITICAL: puppet fail
[06:28:28] PROBLEM - puppet last run on mw1114 is CRITICAL: CRITICAL: puppet fail
[06:28:57] PROBLEM - puppet last run on search1007 is CRITICAL: CRITICAL: puppet fail
[06:29:18] PROBLEM - puppet last run on cp1056 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:39] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:29:57] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:29:58] PROBLEM - puppet last run on labsdb1003 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:08] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:28] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:47] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:48] PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:30:57] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:58] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:07] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:07] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:08] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:08] PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:08] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:08] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Puppet has 3 failures
[06:31:08] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:28] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:47] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:45:28] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:45:31] RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[06:45:31] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:45:37] RECOVERY - puppet last run on labsdb1003 is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures
[06:45:47] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[06:45:48] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[06:45:57] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[06:46:07] RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:46:11] RECOVERY - puppet last run on cp1056 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[06:46:18] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:46:19] RECOVERY - puppet last run on lvs2004 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[06:46:27] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[06:46:27] RECOVERY - puppet last run on mw1052 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[06:46:28] RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[06:46:38] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:46:47] RECOVERY - puppet last run on search1007 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:46:52] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[06:46:52] RECOVERY - puppet last run on dbproxy1001 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[06:46:58] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:46:58] RECOVERY - puppet last run on mw1039 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[06:47:18] RECOVERY - puppet last run on mw1114 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[06:47:50] RECOVERY - puppet last run on cp4014 is OK: OK: Puppet is currently enabled, last run 60 seconds ago with 0 failures
[06:48:13] RECOVERY - puppet last run on mw1126 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[07:52:43] (CR) Alexandros Kosiaris: [C: 2] role::servermon: Convert to package<|provider==trebuchet|> [puppet] - https://gerrit.wikimedia.org/r/163369 (owner: BryanDavis)
[08:01:49] (PS2) Alexandros Kosiaris: Default to UNKNOWN when NRPE checks timeout [puppet] - https://gerrit.wikimedia.org/r/165732
[08:03:20] (CR) Giuseppe Lavagetto: [C: -1] "I would give +1 if we were giving any attention to the unknowns; since it seems no one cares about those, and nrpe timeouts _can_ be signi" [puppet] - https://gerrit.wikimedia.org/r/165732 (owner: Alexandros Kosiaris)
[08:04:56] _joe_: when this happens, it is usually a flurry of criticals that don't really help cause the hide the actual problem
[08:05:00] (PS1) Giuseppe Lavagetto: hhvm: monitor busy threads and queued requests. [puppet] - https://gerrit.wikimedia.org/r/166376
[08:05:42] like the host being down, lockups and so on
[08:05:54] <_joe_> akosiaris: yes, of course
[08:06:07] <_joe_> if we do have something still sending us a critical in that case it's ok
[08:06:20] <_joe_> my fear is some server dying of load
[08:06:36] <_joe_> an no one notices because there just is a timeout in the check
[08:07:03] we have the actual service check for that
[08:07:17] <_joe_> nrpe-based?
[08:07:28] no, I mean check_http
[08:07:32] and so on
[08:07:53] and if we don't... we should
[08:07:59] <_joe_> ^^
[08:08:01] <_joe_> :)
[08:08:18] <_joe_> we should check for each service if the fundamental service check is nrpe
[08:08:30] <_joe_> and override the -u in that specific case
[08:09:39] <_joe_> so that was basically my point :)
[08:09:56] it is wrong to have the fundamental service check via nrpe ...
[08:10:05] I sincerely hope we don't do that
[08:10:07] <_joe_> akosiaris: we agree again
[08:10:17] <_joe_> but I'd like to check that
[08:10:35] <_joe_> akosiaris: mysql/databases can be major offenders there
[08:10:54] probably
[08:11:22] I am pretty sure mysql/mariadb modules define their own checks as well
[08:12:23] <_joe_> akosiaris: so, my -1 was not to the concept, but to the real-world monitoring we do; we should probably give a look
[08:12:35] <_joe_> but I'm overcautious with monitoring
[08:15:02] yeah I get it. I think you are right. It can probably be a starting point for taking another look at some of the ways we monitor stuff
[08:15:19] <_joe_> yes
[08:15:31] <_joe_> but I guess neither of us has a lot of time for that :)
[08:15:46] <_joe_> not now at least
[08:17:05] can't argue with that
[08:19:48] PROBLEM - Disk space on ms-be1013 is CRITICAL: DISK CRITICAL - free space: / 1999 MB (3% inode=83%):
[08:20:44] greetings
[08:21:44] <_joe_> godog: you wake up and swift salutes
[08:21:56] <_joe_> godog: ms-be1013 is having issues since friday btw
[08:21:57] we're connected!
[08:22:23] the disk contention thing, indeed
[08:23:30] <_joe_> godog: ms-be101[3-5] have very high iowait loads
[08:23:59] !log stopping swift on ms-be1013, out of disk space on /
[08:24:09] Logged the message, Master
[08:26:39] PROBLEM - swift-account-auditor on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-auditor
[08:26:47] PROBLEM - swift-account-replicator on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[08:26:47] PROBLEM - swift-container-auditor on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[08:26:48] PROBLEM - swift-object-replicator on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[08:26:58] PROBLEM - swift-container-replicator on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[08:26:59] PROBLEM - swift-account-reaper on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[08:27:18] PROBLEM - swift-object-server on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server
[08:27:18] PROBLEM - swift-container-server on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[08:27:18] PROBLEM - swift-account-server on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[08:27:19] PROBLEM - swift-object-auditor on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[08:27:19] PROBLEM - swift-container-updater on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-container-updater
[08:27:19] PROBLEM - swift-object-updater on ms-be1013 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/swift-object-updater
[08:27:37] downtimed ^
[08:28:13] godog: good to know, thanks
[08:28:51] akosiaris: ye I was pondering about being able to downtime hosts/services from irc, it is too painful from icinga
[08:29:09] _joe_: indeed because they have bigger disks with more partitions on them
[08:29:34] <_joe_> godog: I was thinking of the same right now
[08:30:21] indeed that'd go a long way I think
[08:33:04] RECOVERY - Disk space on ms-be1013 is OK: DISK OK
[08:40:29] (PS2) Giuseppe Lavagetto: hhvm: monitor busy threads and queued requests. [puppet] - https://gerrit.wikimedia.org/r/166376
[08:40:36] (CR) Giuseppe Lavagetto: [C: 2] hhvm: monitor busy threads and queued requests. [puppet] - https://gerrit.wikimedia.org/r/166376 (owner: Giuseppe Lavagetto)
[08:46:49] (PS3) Christopher Johnson (WMDE): added git fetch --tags service for tagged phabricator extensions installation [puppet] - https://gerrit.wikimedia.org/r/166181
[08:46:53] (PS1) Alexandros Kosiaris: netmon1001's ganglia aggregator pmtpa removal [puppet] - https://gerrit.wikimedia.org/r/166380
[08:47:32] (CR) jenkins-bot: [V: -1] added git fetch --tags service for tagged phabricator extensions installation [puppet] - https://gerrit.wikimedia.org/r/166181 (owner: Christopher Johnson (WMDE))
[08:52:16] (CR) Giuseppe Lavagetto: [C: -2] "fetching all tags on every puppet run is a very bad idea, as I stated in my comment to the preceding patchset." [puppet] - https://gerrit.wikimedia.org/r/166181 (owner: Christopher Johnson (WMDE))
[08:53:18] RECOVERY - swift-container-auditor on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[08:53:18] RECOVERY - swift-object-replicator on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[08:53:28] RECOVERY - swift-container-replicator on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[08:53:29] RECOVERY - swift-account-reaper on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[08:53:48] RECOVERY - swift-account-server on ms-be1013 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[08:53:50] RECOVERY - swift-object-server on ms-be1013 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server
[08:53:50] RECOVERY - swift-container-server on ms-be1013 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[08:53:51] RECOVERY - swift-object-auditor on ms-be1013 is OK: PROCS OK: 3 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[08:53:51] RECOVERY - swift-container-updater on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater
[08:53:51] RECOVERY - swift-object-updater on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater
[08:54:08] RECOVERY - swift-account-auditor on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor
[08:54:17] RECOVERY - swift-account-replicator on ms-be1013 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[08:55:53] (CR) Giuseppe Lavagetto: [C: 1] First of (hopefully many) es-tool commands [puppet] - https://gerrit.wikimedia.org/r/163945 (owner: Chad)
[08:57:28] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:57:41] (CR) Alexandros Kosiaris: [C: 2] netmon1001's ganglia aggregator pmtpa removal [puppet] - https://gerrit.wikimedia.org/r/166380 (owner: Alexandros Kosiaris)
[08:59:29] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54220 bytes in 4.944 second response time
[08:59:58] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[09:03:34] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: Puppet last ran 329559 seconds ago, expected 14400
[09:04:33] RECOVERY - puppet last run on lvs1003 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[09:11:23] ACKNOWLEDGEMENT - puppet last run on ms-be1013 is CRITICAL: CRITICAL: Puppet has 1 failures Filippo Giunchedi /dev/sdl broken
[09:49:12] !log test enable container sync on wikibooks-it-local-thumb
[09:49:18] Logged the message, Master
[09:53:25] Has anybody investigated the up and down of git.wikimedia.org in the meantime? https://bugzilla.wikimedia.org/show_bug.cgi?id=71974
[09:53:30] * andre__ wasn't much online this weekend
[09:55:13] <_joe_> andre__: yes I did a few months ago
[09:55:27] <_joe_> andre__: "gitlbit is horrible and should be burnt with fire"
[09:55:30] heh. I referred to this weekend :)
[09:55:40] <_joe_> so, ops basically just restart java in case
[09:55:50] <_joe_> gitblit is unstable in itself
[09:56:50] <_joe_> we've noticed sometimes it gets in a loop of malfunction after on thursday some tree pruning gets done in the git hierarchies
[09:57:00] _joe_, heh, ok. if Java was restarted, fine. I just want to provide an update on https://bugzilla.wikimedia.org/show_bug.cgi?id=71974 and whether it's being handled or has been handled.
[09:57:39] <_joe_> "handled" as in paravoid restarted java, yes
[09:59:01] <_joe_> andre__: sorry, just setting expectations here. We can't expect we use something broken (which has been reported broken since basically its first day online AFAIK) and don't have stability issues and/or have ops support it
[09:59:18] sure, totally understood
[09:59:28] thanks, I'll update the ticket!
[10:06:25] <_joe_> thanks to you :)
[10:16:28] (PS1) Alexandros Kosiaris: Revert "WIP: Added initial Debian package" [debs/contenttranslation/apertium-apy] - https://gerrit.wikimedia.org/r/166386
[10:16:42] (CR) Alexandros Kosiaris: [C: 2] Revert "WIP: Added initial Debian package" [debs/contenttranslation/apertium-apy] - https://gerrit.wikimedia.org/r/166386 (owner: Alexandros Kosiaris)
[10:16:50] (CR) Alexandros Kosiaris: [V: 2] Revert "WIP: Added initial Debian package" [debs/contenttranslation/apertium-apy] - https://gerrit.wikimedia.org/r/166386 (owner: Alexandros Kosiaris)
[10:26:04] (CR) Alexandros Kosiaris: Apertium service configuration for Beta (1 comment) [puppet] - https://gerrit.wikimedia.org/r/165485 (owner: KartikMistry)
[10:26:50] (CR) Alexandros Kosiaris: [C: -1] "Aside from the missing module class call inside the role class this is +1" [puppet] - https://gerrit.wikimedia.org/r/165485 (owner: KartikMistry)
[10:30:11] !log enable container sync as a test on wiktionary*-local-public
[10:30:19] Logged the message, Master
[10:37:28] (PS1) Dan-nl: Add edh-www.adw.uni-heidelberg.de to the wgCopyUploadsDomains whitelist [mediawiki-config] - https://gerrit.wikimedia.org/r/166390 (https://bugzilla.wikimedia.org/71990)
[10:40:11] (CR) Zfilipin: [C: 1] contint: ruby2.0 on Trusty slaves [puppet] - https://gerrit.wikimedia.org/r/166046 (owner: Hashar)
[10:40:16] (PS2) Dan-nl: Add edh-www.adw.uni-heidelberg.de to the wgCopyUploadsDomains whitelist [mediawiki-config] - https://gerrit.wikimedia.org/r/166390 (https://bugzilla.wikimedia.org/71990)
[10:45:49] PROBLEM - HTTP error ratio anomaly detection on tungsten is CRITICAL: CRITICAL: Anomaly detected: 10 data above and 0 below the confidence bounds
[10:48:22] (PS7) KartikMistry: Apertium service configuration for Beta [puppet] - https://gerrit.wikimedia.org/r/165485
[10:55:41] (PS1) KartikMistry: Added initial Debian packaging [debs/contenttranslation/apertium-apy] - https://gerrit.wikimedia.org/r/166393
[10:56:35] anyone online able to approve a domain whitelist reuest? https://gerrit.wikimedia.org/r/#/c/166390/
[10:58:07] I guess so
[10:59:11] I'm not sure about the standard process on this... if you can show me an old bug about whitelisting URLs, I'll probably do it
[10:59:12] thanks hoo, just need a +2 if you're okay with what i added
[10:59:20] sure, one moment
[10:59:22] "just" a +2 isn't much use
[10:59:25] it won't be deployed then :P
[10:59:26] Yep
[10:59:27] :D
[11:00:38] Ok, sounds good to me
[11:00:52] (CR) Hoo man: [C: 2] "Trivial change is trivial" [mediawiki-config] - https://gerrit.wikimedia.org/r/166390 (https://bugzilla.wikimedia.org/71990) (owner: Dan-nl)
[11:00:59] (Merged) jenkins-bot: Add edh-www.adw.uni-heidelberg.de to the wgCopyUploadsDomains whitelist [mediawiki-config] - https://gerrit.wikimedia.org/r/166390 (https://bugzilla.wikimedia.org/71990) (owner: Dan-nl)
[11:01:26] hoo, here is a previous example https://bugzilla.wikimedia.org/show_bug.cgi?id=70771
[11:01:41] !log hoo Synchronized wmf-config/InitialiseSettings.php: Add edh-www.adw.uni-heidelberg.de to the wgCopyUploadsDomains whitelist (duration: 00m 08s)
[11:01:43] dan-nl: I found the tracking bug... but thanks :)
[11:01:46] Logged the message, Master
[11:01:46] Done now
[11:02:12] cool, thanks hoo
[11:04:04] You're welcome
[11:04:06] hoo, is there anything else you'd need from me? Reedy, so the +2 isn't enough to deploy it? did the log synchronized wmf-config take care of that?
[11:04:17] dan-nl: right
[11:04:20] dan-nl: Yep :)
[11:04:23] just +2-ing it won't deploy it
[11:04:26] unless it's for beta
[11:04:30] Needs to be +2ed and then (manually) rolled out
[11:04:30] excellent thanks both
[11:04:37] that's what I just did, hence the log message
[11:18:19] (PS1) KartikMistry: Add .gitreview file [debs/contenttranslation/apertium-apy] - https://gerrit.wikimedia.org/r/166396
[11:24:16] hashar: any reason /srv/deployment/cxserver is empty on newly created cxserver instance?
[11:24:35] kart_: cxserver02 is broken
[11:24:38] should come via cxserver-deploy repo there.
[11:24:42] 02 too? :(
[11:25:07] I thought bug is resolved.
[11:25:10] yeah we had a bug in labs that prevented the creation of a mount on /srv
[11:25:17] and until there is a mount, nothing is populated
[11:25:21] https://bugzilla.wikimedia.org/show_bug.cgi?id=71873
[11:25:28] 02 should be deleted and another instance created :D
[11:25:42] maybe 03
[11:26:01] hashar: can you show me how to do? possible to copy config from 02 anyway?
[11:26:15] na just delete 02
[11:26:28] create a fresh 03 using Trusty and whatever size was used (I think it was m1.medium)
[11:26:49] once instance is created and puppet run fine: change it to point to the local puppet&salt masters following the doc at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated#Converting_a_host_to_use_local_puppetmaster_and_salt_master
[11:26:57] run puppet a bunch till it pass
[11:27:06] then apply the role class ( role::beta::cxserver or something)
[11:27:10] and hopefully that will do
[11:27:14] okay!
[11:27:16] Thanks.
[11:27:26] and saw your mail, sorry!
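The config deployment flow Reedy and hoo spell out above (a +2 merges the change but does not deploy it) can be summarized as a workflow sketch. This is pseudocode reconstructed from the conversation and the `!log` lines in this log, not an exact command transcript:

```text
1. Upload the change to Gerrit; a +2 review lets jenkins-bot merge it
   into mediawiki-config.
2. On the deployment host, pull the newly merged change.
3. Sync the file out to the app servers, roughly:
     sync-file wmf-config/InitialiseSettings.php 'Add ... to the wgCopyUploadsDomains whitelist'
4. The sync tool !logs "Synchronized wmf-config/..." to this channel,
   as seen at [11:01:41] above.
```

The exception Reedy notes is beta, where merged mediawiki-config changes are picked up automatically without a manual sync.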
[11:27:47] then, maybe it can be added to Jenkins :]
[11:27:57] anyway, I caught a flu this weekend, so I am going to crash to bed again
[11:30:36] kart_: oh and once puppet ran and you get cxserver cloned, you can repool the Jenkins slave node :d
[11:30:58] kart_: https://integration.wikimedia.org/ci/computer/deployment-cxserver02/ need a WMF ldap account, then press configure, change the name of the node and the IP address. Done (hopefully )
[11:31:10] I am off to bed. Flu is getting rid of me
[11:31:30] hashar: take care!
[11:37:50] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:51:07] (PS2) Glaisher: Add several domains to wgCopyUploadsDomains for commons [mediawiki-config] - https://gerrit.wikimedia.org/r/166176 (https://bugzilla.wikimedia.org/71195)
[11:52:50] (CR) Glaisher: "I think it would work just fine if you tried to upload without the www in url. Anyway, just to be on the safe side, I've added * to the do" [mediawiki-config] - https://gerrit.wikimedia.org/r/166176 (https://bugzilla.wikimedia.org/71195) (owner: Glaisher)
[11:53:01] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54220 bytes in 0.418 second response time
[11:59:15] hmm. I tried to follow as close as hashar said :D
[12:03:51] It seems Ganglia is extreme slow at the moment
[12:04:05] http://ganglia.wikimedia.org/latest/?r=day&c=Miscellaneous+pmtpa&h=professor.pmtpa.wmnet is taking several minutes to run
[12:04:09] tried different hosts
[12:04:10] all timing out
[12:04:33] (I know prof is dead, wanted to check since which month)
[12:05:21] https://ganglia.wikimedia.org/latest/?c=Miscellaneous%20eqiad&h=labsdb1004.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2
[12:05:27] loads pretty fast Krinkle
[12:05:41] maybe something related to pmtpa ?
[12:05:59] https://ganglia.wikimedia.org/latest/?c=Miscellaneous%20eqiad&h=professor.pmtpa.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2
[12:06:07] gives me the first few bytes immediately but then it hangs
[12:06:15] it worked a few minutes ago, and then it stopped
[12:06:18] (Abandoned) Christopher Johnson (WMDE): added git fetch --tags service for tagged phabricator extensions installation [puppet] - https://gerrit.wikimedia.org/r/166181 (owner: Christopher Johnson (WMDE))
[12:08:02] miscellaneous eqiad but professor.pmtpa.wmnet ?
[12:08:45] for me this page hangs the entire chrome tab
[12:11:33] RECOVERY - HTTP error ratio anomaly detection on tungsten is OK: OK: No anomaly detected
[12:12:52] Seems better now
[12:12:57] akosiaris: yeah, me too.
[12:13:01] 100% CPU
[12:13:03] amazing
[12:14:17] Tidy should be enabled on wikitech
[12:14:22] https://wikitech.wikimedia.org/wiki/User:Peachey88
[12:14:24] https://wikitech.wikimedia.org/wiki/User:Peachey88?action=edit
[12:19:43] <_joe_> bbl
[12:20:01] (PS1) ArielGlenn: fix test of trebuchet provider [puppet] - https://gerrit.wikimedia.org/r/166402
[12:21:35] (CR) ArielGlenn: [C: 2] fix test of trebuchet provider [puppet] - https://gerrit.wikimedia.org/r/166402 (owner: ArielGlenn)
[12:31:02] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:42:11] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54241 bytes in 1.108 second response time
[12:54:10] (PS9) Filippo Giunchedi: First of (hopefully many) es-tool commands [puppet] - https://gerrit.wikimedia.org/r/163945 (owner: Chad)
[12:54:15] (CR) Filippo Giunchedi: [C: 2 V: 2] First of (hopefully many) es-tool commands [puppet] - https://gerrit.wikimedia.org/r/163945 (owner: Chad)
[12:57:57] (CR) Filippo Giunchedi: "with pmtpa going away this can be abandoned I think?" [dns] - https://gerrit.wikimedia.org/r/138568 (owner: Filippo Giunchedi)
[13:00:31] PROBLEM - puppet last run on elastic1016 is CRITICAL: CRITICAL: Puppet has 1 failures
[13:04:24] (CR) Filippo Giunchedi: base: add checks for 127.0.1.1 in /etc/hosts (1 comment) [puppet] - https://gerrit.wikimedia.org/r/157795 (owner: Filippo Giunchedi)
[13:05:49] (CR) Filippo Giunchedi: [C: 1] role::elasticsearch: Convert to package<|provider==trebuchet|> [puppet] - https://gerrit.wikimedia.org/r/163364 (owner: BryanDavis)
[13:06:11] (CR) Filippo Giunchedi: [C: 1] role::analytics: Convert to package<|provider==trebuchet|> [puppet] - https://gerrit.wikimedia.org/r/163366 (owner: BryanDavis)
[13:07:29] actually we could do sth smarter, akosiaris you reckon the trebuchet provider deprecation patches can be merged ?
[13:08:01] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:08:26] godog: aside from https://gerrit.wikimedia.org/r/163376 I think the rest should be merged, yes
[13:08:40] servermon had zero issues
[13:08:45] citoid also
[13:08:56] I will do mathoid today, probably also parsoid
[13:09:07] (CR) John Erling Blad: [C: 1] "Looks fine!" [mediawiki-config] - https://gerrit.wikimedia.org/r/166176 (https://bugzilla.wikimedia.org/71195) (owner: Glaisher)
[13:09:48] akosiaris: ack, and if it doesn't work we'd see puppet failures
[13:10:16] !log enable container sync on wikibooks originals as test
[13:10:24] Logged the message, Master
[13:11:31] godog: no we wont. We need to trigger a deployment run for every package
[13:11:47] so yeah we should be careful doing it
[13:13:41] (CR) Alexandros Kosiaris: [C: 2 V: 2] Add initial Debian packaging [debs/contenttranslation/apertium-lex-tools] - https://gerrit.wikimedia.org/r/164050 (owner: KartikMistry)
[13:17:46] akosiaris: true, heh less trivial than I imagined, I'll give mwprof a shot
[13:19:45] (PS4) Filippo Giunchedi: mwprof: Convert to package<|provider==trebuchet|> [puppet] - https://gerrit.wikimedia.org/r/163372 (owner: BryanDavis)
[13:19:53] (CR) Filippo Giunchedi: [C: 2 V: 2] mwprof: Convert to package<|provider==trebuchet|> [puppet] - https://gerrit.wikimedia.org/r/163372 (owner: BryanDavis)
[13:24:21] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54220 bytes in 9.840 second response time
[13:27:26] (PS1) Christopher Johnson (WMDE): Change phab_update_tag script to remove library lock file Change library_lock path to /var/run/$library_lock Update library tag [puppet] - https://gerrit.wikimedia.org/r/166406
[13:29:21] there's a huge log-mess going on on fluorine... anyone around who is firm with that or might have an idea?
[13:29:37] paravoid: can I tag you in for RT duty?
[13:29:41] I suspect that the cause might be messed up HHVM jit, but then again I have no idea
[13:29:44] (welcome back, btw! Back in Athens?)
[13:31:46] (03PS2) 10Christopher Johnson (WMDE): Change phab_update_tag.erb to remove library lock file [puppet] - 10https://gerrit.wikimedia.org/r/166406 [13:32:14] ori: ^ [13:34:09] (03PS1) 10Alexandros Kosiaris: servermon: Set a cron for the make_updates command [puppet] - 10https://gerrit.wikimedia.org/r/166408 [13:35:29] (03CR) 10Alexandros Kosiaris: [C: 032] servermon: Set a cron for the make_updates command [puppet] - 10https://gerrit.wikimedia.org/r/166408 (owner: 10Alexandros Kosiaris) [13:36:21] (03PS1) 10Physikerwelt: Allow the default rendering modes on beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 [13:41:33] (03CR) 10Christopher Johnson (WMDE): "Sprint needs a way to update. Continuing to use the lock file makes sense because the library should be kept in sync with updates to the " [puppet] - 10https://gerrit.wikimedia.org/r/166406 (owner: 10Christopher Johnson (WMDE)) [13:42:15] (03PS2) 10Physikerwelt: Allow the default rendering modes on beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 [13:43:01] (03Abandoned) 10Christopher Johnson (WMDE): Icinga: IRC notification event handler and Wikidata configuration file [puppet] - 10https://gerrit.wikimedia.org/r/139193 (owner: 10Christopher Johnson (WMDE)) [13:46:21] (03PS3) 10Physikerwelt: Allow the default rendering modes on beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 [13:53:56] (03CR) 10He7d3r: "Is this better than adding the missing option to the list?" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 (owner: 10Physikerwelt) [13:54:33] (03CR) 10Krinkle: [C: 031] Remove $wgCategoryTreeDynamicTag [mediawiki-config] - 10https://gerrit.wikimedia.org/r/164015 (owner: 10PleaseStand) [13:55:54] (03CR) 10Krinkle: [C: 031] Remove various settings removed in mediawiki/core [mediawiki-config] - 10https://gerrit.wikimedia.org/r/164014 (owner: 10PleaseStand) [13:59:19] (03CR) 10Physikerwelt: "I think it's best to use the default option. 
Production uses the default option as well. Otherwise we would have to copy all default setti" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 (owner: 10Physikerwelt) [14:05:31] bd808|BUFFER: I've merged your mwprof trebuchet patch, however trying git deploy start / git deploy sync doesn't report success on tungsten, running manually salt-call deploy.fetch 'mwprof/mwprof' on tungsten shows git searching for a tag that doesn't exist [14:14:27] <_joe_> hoo: what do you mean 'log-mess'? [14:14:34] (03CR) 10Alexandros Kosiaris: [C: 032] role::mathoid: Convert to package<|provider==trebuchet|> [puppet] - 10https://gerrit.wikimedia.org/r/163374 (owner: 10BryanDavis) [14:23:19] (03CR) 10Alexandros Kosiaris: [C: 032] role::librenms: Convert to package<|provider==trebuchet|> [puppet] - 10https://gerrit.wikimedia.org/r/163367 (owner: 10BryanDavis) [14:25:21] PROBLEM - OCG health on ocg1001 is CRITICAL: CRITICAL: ocg_job_status 217529 msg: ocg_render_job_queue 30656 msg (=500 critical) [14:26:41] PROBLEM - OCG health on ocg1002 is CRITICAL: CRITICAL: ocg_job_status 217606 msg: ocg_render_job_queue 30696 msg (=500 critical) [14:29:24] (03PS1) 10Brion VIBBER: Add HiDPI PNG variants for 'A Wikimedia Project' logo [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166417 (https://bugzilla.wikimedia.org/63872) [14:30:24] <_joe_> !log rolling restart of the ocg cluster [14:30:33] Logged the message, Master [14:30:41] PROBLEM - OCG health on ocg1003 is CRITICAL: CRITICAL: ocg_job_status 217896 msg: ocg_render_job_queue 30727 msg (=500 critical) [14:32:21] hello, ops people. ocg seems to be having problems. anyone working on this already, before i dive in? [14:32:40] _joe_: looks like you're working on OCG? [14:32:52] <_joe_> cscott: yea right now [14:33:03] what's up? [14:33:17] <_joe_> queue is getting larger and larger [14:33:25] which queue? [14:33:28] and how large is large? 
[14:33:31] <_joe_> the jobs queue [14:33:40] <_joe_> 30K messages [14:33:40] i'm logging in from bermuda right now, and i can get to irc but not (yet) to http [14:33:54] yeah, the jobs queue growing means that the service isn't running, i think. [14:33:58] <_joe_> I restarted ocg1001 and guess what? [14:34:02] have you tried service ocg restart? [14:34:07] <_joe_> now it's dequeueing [14:34:17] <_joe_> cscott: that's what I'm doing now [14:34:24] yeah, good. so something caused all three of the services to stop processing jobs, huh? [14:34:29] <_joe_> I would've preferred to understand what's going on [14:34:31] PROBLEM - Kafka Broker Messages In on analytics1021 is CRITICAL: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 955.308702153 [14:34:34] <_joe_> cscott: yes [14:34:44] can you figure out a rough timestamp when that happened? then i can dig into logstash and see what was going on at that time. [14:35:23] if you haven't restarted ocg1002 and ocg1003, maybe leave one of those for me to look at [14:35:28] <_joe_> cscott: the usual problem [14:35:35] and see if i can figure out anything from its crashed state. [14:35:38] <_joe_> cscott: 100% cpu, no work done [14:35:43] <_joe_> it happened yesterday [14:35:48] <_joe_> cscott: ocg1003 is all yours [14:35:49] orly? [14:36:03] but not before yesterday, right? i don't remember seeing this at all last week, etc. [14:37:06] the last messages in ocg1003's dmesg are all about replacing apparmor profiles -- was puppet run on these hosts recently? [14:37:07] <_joe_> cscott: the problem began on friday I'd say, but the services stopped doing anything useful around sunday [14:37:22] <_joe_> cscott: dmesg -T [14:37:33] <_joe_> cscott: puppet runs continuously, yes [14:38:00] hm, i was working online all day friday, i guess no one noticed things were going south until later? 
[14:38:26] <_joe_> cscott: the user-facing problem (the queue filling up) started on sunday [14:39:02] there's some raid messages on ocg1003 starting [Sun Oct 5 02:42:05 2014] [14:39:06] <_joe_> cscott: exactly what happened last weekend http://ganglia.wikimedia.org/latest/graph.php?r=month&z=large&c=PDF+servers+eqiad&m=cpu_report&s=by+name&mc=2&g=load_report [14:39:24] and two segfaults: [Fri Oct 3 14:41:37 2014] nodejs-ocg[1214]: segfault at 18 ip 00007f92b8898451 sp 00007f92b37fde50 error 4 in node_sqlite3.node[7f92b886c000+d3000] [14:39:24] [Sat Oct 4 01:24:08 2014] xdvipdfmx[40354]: segfault at 2c10000 ip 0000000000443160 sp 00007fff06497170 error 4 in xdvipdfmx[400000+94000] [14:39:45] hm, wait, i fixed last weekend's bug. let me try to remember what it was. [14:39:57] <_joe_> cscott: apparently not :) [14:40:30] oh, right, gerrit I1da07ac9433ece717635694e290cf94fb8b91735 [14:40:55] <_joe_> cscott: anyways, it seems you didn't [14:40:56] <_joe_> :) [14:41:04] well, i fixed *one* 100% cpu bug ;) [14:42:55] can you tell me what the job status queue length is? (i still don't have http working) [14:43:17] <_joe_> now? 1 sec [14:43:42] <_joe_> it' [14:43:48] <_joe_> s going down [14:44:12] hm. that seems that maybe it is the gc process that got stuck. [14:44:42] so -- does ops have an actual modern redis service running anywhere? [14:45:20] <_joe_> cscott: what do you mean with "actual modern redis service"? [14:45:32] part of the issue with last weekend's bug was that the redis hkeys command takes a huge amount of memory. redis 2.8.x has 'hscan' to replace it. but we're running 2.6 afaict [14:46:17] 'modern' may be a bit strong, i have no idea what their release cycle is like. but one quick fix here might be to point ocg at a redis 2.8 server. 
[14:46:27] <_joe_> oh ok so you mean "a newer version of redis compared with what ubuntu ships and we already use" [14:46:33] <_joe_> cscott: I was unaware [14:46:47] <_joe_> of this problem of yours [14:46:53] <_joe_> lemme take a look [14:47:03] otherwise i probably have to recode the gc to manually chunk keys or something. assuming that this is the root cause, i'm trying to confirm that. [14:47:16] <_joe_> cscott: redis on trusty is 2.8 [14:47:19] gerrit I1da07ac9433ece717635694e290cf94fb8b91735 has a lengthy essay. (again, apologies for no http link yet) [14:47:34] <_joe_> cscott: so you just need one trusty redis instance [14:47:51] <_joe_> cscott: np, thanks for the help :) [14:48:22] the redis config is more stuff that mwalker set up, so i'm not 100% familiar with the details. i think ocg's redis host is configured via a puppet config var? [14:48:58] puppet:/modules/ocg/templates/mw-collection-ocg.js.erb says config.redis.host = "<%= @redis_host %>"; [14:49:37] i don't know if that's one host or a small cluster or a raspberry pi sitting under mwalker's old desk [14:50:38] hm, i might be premature to blame redis here [14:50:52] <_joe_> cscott: eheh yea I don't think it's redis, honestly [14:50:54] ocg1003 has a bunch of stuck 100% cpu processes in /usr/bin/nodejs-ocg /srv/deployment/ocg/ocg/mw-ocg-latexer/bin/mw-ocg-latexer -T /mnt/tmpfs/6e7d095bd260d0305153725ddaf415593df2921a.rdf2latex -o /srv/deployment/ocg/output/6e/7d095bd260d0305153725ddaf415593df2921a.pdf /mnt/tmpfs/6e7d095bd260d0305153725ddaf415593df2921a.zip [14:51:08] let me look at that file, that seems like a bug in the latexer [14:51:18] does app armor have any cpu-limiting provisions? 
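The HKEYS-vs-HSCAN distinction cscott raises above can be sketched as follows. This is a pure-Python simulation over a dict, not real redis calls: HKEYS materialises every field of the job-status hash at once (the memory blowup described in the gerrit change), while HSCAN (redis >= 2.8) walks the same hash in bounded batches. The job field names are made up for illustration.

```python
# Simulated over a plain dict; real HSCAN uses server-side cursors, but the
# client-side memory behaviour is the point being illustrated here.

def hkeys(job_hash):
    """All keys at once -- O(n) client memory, like redis HKEYS."""
    return list(job_hash)

def hscan_chunks(job_hash, count=100):
    """Yield keys in bounded batches, like iterating redis HSCAN cursors."""
    batch = []
    for key in job_hash:
        batch.append(key)
        if len(batch) == count:
            yield batch
            batch = []
    if batch:
        yield batch

# 250 hypothetical OCG job-status fields
jobs = {"job:%d" % i: "status" for i in range(250)}
chunks = list(hscan_chunks(jobs, count=100))
print(len(chunks), max(len(c) for c in chunks))  # → 3 100
```

Manually chunking on the client (the fallback cscott mentions for redis 2.6) amounts to reimplementing the second function without the server's help, which is why a 2.8 instance is the cleaner fix.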
[14:51:25] <_joe_> but if you will find it handly to have 2.8, we can work on that [14:51:37] <_joe_> cscott: if it had, they're not working on those machines [14:51:42] <_joe_> 100% cpu [14:51:58] 2.8 might be useful, but let's wait until i get back from bermuda for that, since it doesn't seem to be on the critical path here [14:52:10] last week's bug does appear to be fixed, it looks like this is a new fancy special bug [14:53:21] there's a wall-clock time limit available in the ocg settings. mwalker left it disabled, and i've never really tested it. [14:53:36] but one quick fix here might be to enable that, which should kill the long-lived 100% cpu processes. [14:53:51] <_joe_> cscott: wait 1 min please [14:54:29] <_joe_> cscott: I don't think this is a latexer problem [14:54:39] o? [14:54:44] <_joe_> I mean it runs at 100% of cpu but always on different files [14:54:53] <_joe_> it just doesn't produce any output [14:55:32] <_joe_> I'm trying to see something more out of it [14:56:14] i guess i'm deferring the question of "why does the latexer spin forever at 100% cpu" and trying to figure out if just setting `config.backend.writer.rdf2latex.max_execution_time=3600` will fix this enough that i can do the full portmortem next week. [14:56:53] it seems like setting an execution time limit would be a good idea in any case. [14:57:28] <_joe_> cscott: I guess so, not sure if it's that simple but we can try [14:57:53] <_joe_> cscott: or - for this week I schedule myself to do a restart around thursday [14:58:07] yeah, that also seems like it will help us limp along. [14:58:10] <_joe_> and we can live until you're back from vacation [14:58:14] let's maybe do both ;) [14:58:41] <_joe_> cscott: it's your call in the end, but if you never tested that config, I'd wait [14:59:19] discretion is probably the better part of valor here. [14:59:53] were you successful in getting any more info from ocg1003? 
[15:00:10] i'd like to at least write down all the hung job ids before i restart, so i'll recognize them if they happen again. [15:00:31] <_joe_> yeah I'm actually trying to get a backtrace from one of those processes [15:00:56] so let's say i do about 10 minutes more postmortem there, then whatever you want to do if it's more, then we restart ocg1003. [15:01:13] since that seems to fix the problem, i can do a little bit more digging tonight after i get back from my bermuda-ing. [15:01:22] <_joe_> cscott: ok, we don't have user-facing issues right now [15:01:28] in particular i'll try to test the max_timeout stuff locally and make sure it seems to work as intended. [15:01:34] <_joe_> ok [15:01:54] if that works, then maybe i'll try to deploy that tonight, when i can be online for a little while to keep an eye on it [15:02:04] (and after i figure out this http port issue) [15:03:17] subbu: ocg issue with some threads stuck at 100% cpu, eventually causing the service to stop accepting jobs. restarting the ocg service 'fixes' the problem, but _joe_ and I are trying to dig into it more, find the root cause. [15:03:59] subbu: if arlo isn't taking today as a holiday (is he?) maybe he could try testing the max_execution_time setting in the ocg configuration locally and verify that it seems to work correctly. [15:04:24] that would prevent the 100% cpu processes from eventually crashing the process (still not fixing the root cause, but a better short-term fix) [15:04:30] <_joe_> cscott: ping me when you're done [15:04:45] _joe_: did we figure otu the 'log mess'? 
[15:05:07] (03CR) 10Andrew Bogott: [C: 032] Make sure openstack::openstack-manager creates /a [puppet] - 10https://gerrit.wikimedia.org/r/166360 (owner: 10Hoo man) [15:05:27] <_joe_> aude: no, I've seen hoo saying that but I found nothing particularly interesting on fluorine [15:05:32] on fluorine, there is a bunch of garbage, a lot of it with wikibase lua stuff [15:05:34] <_joe_> aude: I got distracted by ocg btw [15:05:43] <_joe_> aude: which log? [15:05:47] 374 Oct 13 11:09 632.log [15:05:54] for example, in /a/mw-log [15:06:02] and a lot of similar files [15:06:13] since thursday [15:06:27] <_joe_> ok, what exactly happened on thursday? [15:06:31] no clue [15:06:33] * _joe_ tries to remember [15:06:35] subbu: i'm hesitant to just enable the max_execution_time in production blindly because it's mwalker's code which i've never tested. i'd hate to find out the hard way that the value is not in seconds but in milliseconds or minutes or something. but if arlo could audit and bang on that code a bit, i'd have greater confidence. if arlo's not around, i'll do this myself later tonight. [15:06:42] <_joe_> aude: do you have a timing for that? [15:06:58] cscott, just woke up .. checking email .. let me read your messages here. [15:06:59] no i don't [15:07:29] <_joe_> aude: it looks like those files are created all at the same time [15:08:16] <_joe_> so, no idea and we will look into that, but it's not nearly as important as ocg atm [15:08:23] _joe_: thanks [15:08:32] definitely not urgent but something strange going on [15:08:56] cscott, ok, i'm not sure about arlo, but, let me email him and check what his plans are. 
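The wall-clock limit cscott wants Arlo to verify follows a standard pattern: spawn the renderer as a child process and kill it once it exceeds `max_execution_time`. The sketch below shows that pattern in Python rather than the actual OCG/node code; the command is illustrative, and the limit is assumed to be in seconds, which is exactly the units question (seconds vs. milliseconds vs. minutes) raised above.

```python
import subprocess

MAX_EXECUTION_TIME = 5  # assumed to be seconds -- the very thing to verify

def run_with_limit(cmd, limit=MAX_EXECUTION_TIME):
    """Run cmd under a wall-clock cap; return (returncode, timed_out).

    subprocess.run() kills the child before raising TimeoutExpired, so a
    renderer spinning at 100% CPU cannot outlive the limit.
    """
    try:
        proc = subprocess.run(cmd, timeout=limit)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        return None, True

rc, timed_out = run_with_limit(["sleep", "0.1"], limit=2)
print(rc, timed_out)  # → 0 False
```

A stuck `mw-ocg-latexer` child under this wrapper would come back as `(None, True)` instead of pinning a core until the Thursday restart.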
[15:11:51] with limited availability until tomorrow evening [15:12:00] packing a house, flying etc :) [15:13:13] <_joe_> paravoid: I thought they taught you how to be ready to move within 5 minutes :P [15:13:47] riiight [15:15:40] <_joe_> cscott: I [15:15:59] <_joe_> cscott: I'm restarting ocg on 1003 as well [15:16:30] ok [15:16:35] i'm done with what i can do there [15:17:48] <_joe_> yeah I tried to enable the debug mod for node, but that doesn't work with children processes apparently [15:19:06] cscott, emailed arlo .. cced you. [15:20:54] ok, thanks. [15:21:08] and thanks _joe_! [15:24:28] _joe_: i think i've got a SOCKS tunnel set up for http now. anything you want me to look at before i go find a nice bermudan beach? [15:25:33] <_joe_> cscott: no, but if you say "bermudan beach" once more, I'll surely figure something out to waste your morning :) [15:25:42] ;) [15:50:31] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:52:26] aaah ^ [15:53:28] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54251 bytes in 4.041 second response time [15:56:22] RECOVERY - OCG health on ocg1002 is OK: OK: ocg_job_status 224001 msg: ocg_render_job_queue 11 msg [15:56:22] RECOVERY - OCG health on ocg1003 is OK: OK: ocg_job_status 224005 msg: ocg_render_job_queue 0 msg [15:57:03] RECOVERY - OCG health on ocg1001 is OK: OK: ocg_job_status 224043 msg: ocg_render_job_queue 0 msg [15:57:32] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:00:51] what's up with antimony? [16:02:43] <_joe_> greg-g: gitblit is broken, and it's a known issue [16:02:52] it affects jenkins [16:02:59] at least for wikibase [16:02:59] <_joe_> does it? 
[16:03:21] https://gerrit.wikimedia.org/r/#/c/166425/ [16:03:22] <_joe_> then jenkins should use something different, or we ought migrate away from gitblit [16:03:27] jzerebecki: ^ [16:03:50] suppose it can use github but i don't know where /why it's using gitblit [16:03:52] <_joe_> I said repeatedly in the past it's simply not working; it's not managed by anyone and we can basically restart java [16:03:58] _joe_: it seems it's using git.w.o urls [16:04:11] <_joe_> jzerebecki: who is using those? [16:04:29] <_joe_> jzerebecki: jenkins? [16:04:43] _joe_: looking [16:04:50] https://integration.wikimedia.org/ci/job/mwext-Wikibase-client-tests/6655/console [16:04:53] <_joe_> I'm pretty sure that's not so common [16:04:53] for example [16:05:00] might be just our setup [16:05:11] <_joe_> aude: ok, then that should be changed, IMHO [16:05:15] i think we had issues with github in the past, although seems the better choice now [16:05:20] <_joe_> git.w.o is _not_ production ready [16:05:32] yeah [16:05:38] <_joe_> that's my opinion, completely my opinion mind it [16:06:09] _joe_: agree [16:06:11] yes it should gerrit.w.o urls instead [16:06:32] i'm on it [16:06:47] <_joe_> jzerebecki: yes exactly [16:06:51] <_joe_> thanks ! [16:08:53] (03CR) 10JanZerebecki: [C: 04-1] "Aproach looks good, see inline comments." 
(033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/166406 (owner: 10Christopher Johnson (WMDE)) [16:09:53] jzerebecki: https://github.com/wikimedia/integration-jenkins/blob/82f40360cb3e71687a58e73144ebfa8492d9fe7e/bin/mw-core-get.sh [16:10:11] if that's where/why [16:17:17] there is another one in tools/fetch-mw-ext [16:17:52] it fetches the archive of the source to test with, when the job does not run on an instance with a local gerrit mirror [16:18:05] lets see if git archive also works remotely [16:18:24] <_joe_> jzerebecki: let's wait for someone who knows why we're doing it like that [16:18:44] (gerrit afaik does not have a download source archive function) [16:19:28] <_joe_> !log restarting gitlbit, stuck in GC probably [16:19:35] Logged the message, Master [16:20:11] _joe_: i do not think there is a reason besides that it was the easiest way to implement it. i'll try to come up with a patch for hashar to review. [16:21:02] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54251 bytes in 0.116 second response time [16:22:42] <_joe_> it won't last [16:45:13] (03CR) 10Frédéric Wang: [C: 031] "This looks ok to me, but the release engineers probably know better what's best for the options." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166410 (owner: 10Physikerwelt) [16:48:06] aude, _joe_: those jenkins jobs work again after the restart. the easy solution to remove the dependency i had in mind does not work. filed a bug instead, with possible solutions: https://bugzilla.wikimedia.org/show_bug.cgi?id=72001 lets see what hashar has to say [16:48:39] jzerebecki: thanks [16:49:31] <_joe_> thanks a lot [16:53:37] real 0m9.855s [16:53:45] for a --depth 1 clone of MW core from gerrit [16:53:49] not fast enough? 
[17:02:39] (03PS1) 10Giuseppe Lavagetto: change wgPercentHHVM to 2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166437 [17:02:41] (03PS1) 10Giuseppe Lavagetto: change wgPercentHHVM to 5 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166438 [17:04:07] \o/ [17:05:11] <_joe_> paravoid: I could use some guidance in distributing the change, I'm reading wikitech right now [17:06:42] sync-file [17:06:46] fairly easy [17:07:02] <_joe_> paravoid: yeah the "how to deploy code" part seemed complicated [17:07:14] <_joe_> paravoid: shouldn't I also merge something locally on tin? [17:07:52] PROBLEM - git.wikimedia.org on antimony is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:08:12] (03CR) 10Giuseppe Lavagetto: [C: 032] change wgPercentHHVM to 2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166437 (owner: 10Giuseppe Lavagetto) [17:08:19] (03Merged) 10jenkins-bot: change wgPercentHHVM to 2 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166437 (owner: 10Giuseppe Lavagetto) [17:08:21] yeah [17:08:42] wait let me find the proper doc for you [17:09:01] of course the whole /srv move happened since last time I synced anything :) [17:09:12] <_joe_> paravoid: yes [17:09:16] https://wikitech.wikimedia.org/wiki/Heterogeneous_deployment [17:09:49] /srv/mediawiki-staging/wmf-config apparently [17:09:54] <_joe_> paravoid: I'll update it [17:09:55] fetch/diff/merge there [17:09:58] <_joe_> if it's needed [17:09:59] <_joe_> ok [17:10:03] then sync-file [17:10:34] https://wikitech.wikimedia.org/wiki/Heterogeneous_deployment#In_your_own_repo_via_gerrit [17:10:36] that section [17:11:25] <_joe_> paravoid: ok thanks a lot :) [17:12:38] it's very easy [17:12:45] ...once you know what to do :) [17:12:54] !log oblivian Synchronized wmf-config/CommonSettings.php: Serving 2% of anons with HHVM (duration: 00m 06s) [17:13:00] Logged the message, Master [17:15:04] <_joe_> in a couple of hours, I'll deploy to 5% of users [17:15:36] <_joe_> btw, playing with 
grafana I came to the conclusion we should collect stats differently in graphite [17:16:07] <_joe_> the servers.* hierarchy needs a first-level subtree that reports the cluster [17:16:18] <_joe_> so servers.appserver_hhvm.mw1017 [17:16:24] <_joe_> like we have in ganglia [17:17:19] mhm I'm not sure [17:17:21] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 51.733334 [17:17:48] renaming isn't easy, and we may want to move a host into a different cluster without losing history [17:18:03] ganglia does that; that graphite hierarchy wouldn't allow us to [17:18:42] I'm assuming appserver_hhvm will soon be called "appserver" :) [17:18:54] (or "mediawiki" or something) [17:18:58] <_joe_> paravoid: yeah [17:19:26] <_joe_> paravoid: or something like servers.$host.$cluster may work as well [17:20:06] <_joe_> mmh I'm not seeing any immediate rise in usage like we had when we switched hhvm on for anons [17:20:21] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [17:20:30] <_joe_> I thought that the js ori wrote should've addressed this [17:21:44] <_joe_> (https://github.com/wikimedia/mediawiki-extensions-NavigationTiming/blob/wmf/1.25wmf2/modules/ext.navigationTiming.HHVM.js) [17:21:45] where is ori? [17:22:03] <_joe_> paravoid: at the mwcore offsite where wireless signal isn't working [17:22:06] lol [17:22:08] fail [17:22:12] <_joe_> they're trying to get online somehow [17:22:45] no mifis? 
[17:24:53] <_joe_> also, the js console in chrome shows me the correct value for mw.config.get('wgPercentHHVM') [17:26:07] <_joe_> ok, it's starting to increase [17:31:17] :) [17:32:18] <_joe_> Nemo_bis: well last time it took ~ 5 mins for the traffic to increase [17:32:26] <_joe_> it's taking much much longer this time [17:33:32] <_joe_> Nemo_bis: and if you look at the code, it should be [18:01:32] RECOVERY - git.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 54251 bytes in 1.685 second response time [18:15:26] (03PS1) 10ArielGlenn: deployment: turn off gitfat for testrepo [puppet] - 10https://gerrit.wikimedia.org/r/166447 [18:16:45] (03CR) 10ArielGlenn: [C: 032] deployment: turn off gitfat for testrepo [puppet] - 10https://gerrit.wikimedia.org/r/166447 (owner: 10ArielGlenn) [18:23:31] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 359.333344 [18:30:51] _joe_: you have to wait five minutes for the resourceloader cache. when that five-minute period is up, people aren't on 2% yet -- it simply means that 2% will be bucketed on the next page-view. but then they have to visit another page before they actually start utilizing HHVM. 
[18:31:33] i have a weak mifi signal atm, trying to purchase a bigger data allowance from my provider [18:35:31] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 89.333336 [18:38:31] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1192.033325 [18:41:41] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [18:43:41] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 973.366638 [18:50:31] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [18:55:41] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [18:59:41] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 486.399994 [19:08:31] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1076.133301 [19:09:41] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 469.766663 [19:10:51] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 226.399994 [19:11:42] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [19:11:42] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0 [19:12:16] (03CR) 10Giuseppe Lavagetto: [C: 032] change wgPercentHHVM to 5 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166438 (owner: 10Giuseppe Lavagetto) [19:12:24] (03Merged) 10jenkins-bot: change wgPercentHHVM to 5 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/166438 (owner: 10Giuseppe Lavagetto) [19:12:32] RECOVERY - 
Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:13:46] !log oblivian Synchronized wmf-config/CommonSettings.php: Serving 5% of anons with HHVM (duration: 00m 12s)
[19:13:57] Logged the message, Master
[19:14:01] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: puppet fail
[19:16:42] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:20:25] <_joe_> both ganglia and https://grafana.wikimedia.org/#/dashboard/db/hhvm-health confirm HHVM usage is catching up, traffic is still moderate enough anywyas
[19:20:42] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 235.03334
[19:26:41] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1011.166687
[19:32:01] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 558.633362
[19:32:33] RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[19:34:54] PROBLEM - Varnishkafka Delivery Errors on cp3009 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 54.666668
[19:38:52] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 78.833336
[19:40:52] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:43:29] Oh, 5 :)
[19:43:52] PROBLEM - Varnishkafka Delivery Errors on cp3009 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 50.733334
[19:43:53] PROBLEM - Varnishkafka Delivery Errors on cp3003 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1058.866699
[19:44:51] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 53.666668
[19:46:32] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1339.466675
[19:48:39] Network activity didn't quite double, did it? It sounds like the cache layer does its job? :P
[19:49:32] RECOVERY - Varnishkafka Delivery Errors on cp3007 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:50:02] RECOVERY - Varnishkafka Delivery Errors on cp3003 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:50:02] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 889.93335
[19:51:33] <_joe_> !log load test on HHVM starting
[19:51:40] Logged the message, Master
[19:56:51] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:58:41] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 67.133331
[19:59:01] RECOVERY - Varnishkafka Delivery Errors on cp3009 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:59:51] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 37.966667
[19:59:51] PROBLEM - Varnishkafka Delivery Errors on cp3016 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1754.333374
[20:00:12] PROBLEM - Varnishkafka Delivery Errors on cp3018 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 117.466667
[20:01:51] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:02:51] RECOVERY - Varnishkafka Delivery Errors on cp3016 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:03:12] RECOVERY - Varnishkafka Delivery Errors on cp3018 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:04:12] PROBLEM - Varnishkafka Delivery Errors on cp3017 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 323.399994
[20:06:54] RECOVERY - Varnishkafka Delivery Errors on cp3017 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:07:15] PROBLEM - Varnishkafka Delivery Errors on cp3010 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 905.799988
[20:07:54] PROBLEM - Varnishkafka Delivery Errors on cp3009 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1392.599976
[20:08:54] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1176.233276
[20:09:54] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 936.133362
[20:10:05] RECOVERY - Varnishkafka Delivery Errors on cp3010 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:10:55] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1059.099976
[20:13:04] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:13:44] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 33.066666
[20:14:54] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:16:54] PROBLEM - Varnishkafka Delivery Errors on cp3009 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 31.4
[20:20:04] PROBLEM - Varnishkafka Delivery Errors on cp3003 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1264.199951
[20:20:04] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:20:45] <_joe_> !log load test done. HHVM is awesome
[20:20:51] Logged the message, Master
[20:23:56] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1090.5
[20:25:45] PROBLEM - Varnishkafka Delivery Errors on cp3008 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1369.966675
[20:25:54] PROBLEM - Varnishkafka Delivery Errors on cp3003 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 37.033333
[20:26:54] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1192.866699
[20:28:45] RECOVERY - Varnishkafka Delivery Errors on cp3008 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:28:45] RECOVERY - Varnishkafka Delivery Errors on cp3007 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:28:54] RECOVERY - Varnishkafka Delivery Errors on cp3009 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:29:04] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1161.733276
[20:30:55] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 299.600006
[20:33:54] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:35:54] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 216.399994
[20:38:01] (03PS1) 10Nemo bis: Graph User::pingLimiter() actions in gdash [puppet] - 10https://gerrit.wikimedia.org/r/166511 (https://bugzilla.wikimedia.org/65478)
[20:39:04] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 159.633331
[20:41:14] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 36.599998
[20:42:58] (03PS2) 10Nemo bis: [gdash] Use logscale 10 for reqerror graph, again [puppet] - 10https://gerrit.wikimedia.org/r/117021 (https://bugzilla.wikimedia.org/41754)
[20:46:48] PROBLEM - Varnishkafka Delivery Errors on cp3008 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 469.066681
[20:47:14] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1161.599976
[20:47:14] PROBLEM - Varnishkafka Delivery Errors on cp3003 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 55.866665
[20:49:04] (03PS3) 10Nemo bis: [gdash] Use logscale 10 for reqerror graph, again [puppet] - 10https://gerrit.wikimedia.org/r/117021 (https://bugzilla.wikimedia.org/41754)
[20:49:58] (03CR) 10Nemo bis: "PS2 rebase, PS3 filtering out zeros per previous -1." [puppet] - 10https://gerrit.wikimedia.org/r/117021 (https://bugzilla.wikimedia.org/41754) (owner: 10Nemo bis)
[20:52:45] RECOVERY - Varnishkafka Delivery Errors on cp3008 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:53:15] PROBLEM - Varnishkafka Delivery Errors on cp3003 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 418.799988
[20:55:45] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 438.533325
[20:56:23] <_joe_> can someone who knows something about varnishkafka take a look?
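[Editorial note: the Varnishkafka alerts above flip between PROBLEM and RECOVERY for the same cp30xx hosts over and over, which is the flapping _joe_ is asking someone to look at. A minimal sketch of how one might quantify that from the raw log lines — `flap_counts` is a hypothetical helper, and the line format is assumed from the icinga-wm messages in this log, not from any documented schema.]

```python
import re
from collections import Counter

# Matches icinga-wm lines as they appear in this log, e.g.:
# [20:55:45] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: ...
ALERT_RE = re.compile(
    r"\[(?P<ts>\d\d:\d\d:\d\d)\] (?P<state>PROBLEM|RECOVERY) - "
    r"Varnishkafka Delivery Errors on (?P<host>\S+) is"
)

def flap_counts(log_lines):
    """Count PROBLEM -> RECOVERY transitions (flaps) per cache host."""
    last = {}
    flaps = Counter()
    for line in log_lines:
        m = ALERT_RE.search(line)
        if not m:
            continue  # skip human chatter, !log entries, gerrit bot lines
        host, state = m.group("host"), m.group("state")
        if last.get(host) == "PROBLEM" and state == "RECOVERY":
            flaps[host] += 1
        last[host] = state
    return flaps

lines = [
    "[20:55:45] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 438.533325",
    "[20:58:54] RECOVERY - Varnishkafka Delivery Errors on cp3007 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0",
]
print(flap_counts(lines))  # Counter({'cp3007': 1})
```

Run over the full evening's log, this would show which esams caches flapped most and whether the flapping correlates with the load-test window.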
[20:56:30] <_joe_> it's pretty late here
[20:57:04] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 42.166668
[20:58:25] PROBLEM - Varnishkafka Delivery Errors on cp3010 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1486.733276
[20:58:54] RECOVERY - Varnishkafka Delivery Errors on cp3007 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:59:57] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 45.0
[21:00:25] PROBLEM - Varnishkafka Delivery Errors on cp3018 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1591.133301
[21:01:24] RECOVERY - Varnishkafka Delivery Errors on cp3010 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:01:54] PROBLEM - Varnishkafka Delivery Errors on cp3008 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 1350.233276
[21:05:14] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 991.733337
[21:06:34] RECOVERY - Varnishkafka Delivery Errors on cp3018 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:07:54] RECOVERY - Varnishkafka Delivery Errors on cp3008 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:12:04] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 237.166672
[21:16:54] PROBLEM - Varnishkafka Delivery Errors on cp3007 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 913.866638
[21:17:25] RECOVERY - Varnishkafka Delivery Errors on cp3003 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:19:44] RECOVERY - Varnishkafka Delivery Errors on cp3007 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:26:25] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:30:15] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:33:24] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:37:56] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: Puppet has 9 failures
[21:38:01] (03CR) 10Hashar: "Aaron, Ori, that follow an incident in which our MediaWiki config rate limited our own media servers. https://wikitech.wikimedia.org/wiki" [puppet] - 10https://gerrit.wikimedia.org/r/166511 (https://bugzilla.wikimedia.org/65478) (owner: 10Nemo bis)
[21:38:05] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: puppet fail
[21:38:14] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 32 failures
[21:38:14] PROBLEM - puppetmaster backend https on palladium is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 8141: HTTP/1.1 500 Internal Server Error
[21:38:15] PROBLEM - puppet last run on analytics1021 is CRITICAL: CRITICAL: puppet fail
[21:38:15] PROBLEM - puppet last run on chromium is CRITICAL: CRITICAL: Puppet has 8 failures
[21:38:24] PROBLEM - puppet last run on mw1058 is CRITICAL: CRITICAL: puppet fail
[21:38:24] PROBLEM - puppet last run on labsdb1002 is CRITICAL: CRITICAL: puppet fail
[21:38:24] PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: Puppet has 68 failures
[21:38:24] PROBLEM - puppet last run on analytics1012 is CRITICAL: CRITICAL: puppet fail
[21:38:34] PROBLEM - puppet last run on mw1103 is CRITICAL: CRITICAL: Puppet has 59 failures
[21:38:34] PROBLEM - puppet last run on mw1128 is CRITICAL: CRITICAL: puppet fail
[21:38:35] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 33 failures
[21:38:35] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:38:36] PROBLEM - puppet last run on db2017 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:38:44] PROBLEM - puppet last run on mw1021 is CRITICAL: CRITICAL: Puppet has 3 failures
[21:38:45] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: Puppet has 3 failures
[21:38:45] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: puppet fail
[21:38:45] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: Puppet has 8 failures
[21:38:45] PROBLEM - puppet last run on osm-cp1001 is CRITICAL: CRITICAL: Puppet has 11 failures
[21:38:45] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: puppet fail
[21:38:45] PROBLEM - puppet last run on ssl1003 is CRITICAL: CRITICAL: Puppet has 16 failures
[21:38:46] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Puppet has 11 failures
[21:38:46] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 35 failures
[21:38:47] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: Puppet has 50 failures
[21:38:47] PROBLEM - puppet last run on mw1019 is CRITICAL: CRITICAL: puppet fail
[21:38:48] PROBLEM - puppet last run on mw1194 is CRITICAL: CRITICAL: puppet fail
[21:38:48] PROBLEM - puppet last run on ms-be1014 is CRITICAL: CRITICAL: puppet fail
[21:38:49] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Puppet has 13 failures
[21:38:49] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: Puppet has 11 failures
[21:38:54] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: Puppet has 37 failures
[21:38:54] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: puppet fail
[21:38:54] PROBLEM - puppet last run on mw1199 is CRITICAL: CRITICAL: Puppet has 66 failures
[21:38:54] PROBLEM - puppet last run on bast2001 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:38:55] PROBLEM - puppet last run on cp3019 is CRITICAL: CRITICAL: Puppet has 13 failures
[21:38:55] PROBLEM - puppet last run on pollux is CRITICAL: CRITICAL: puppet fail
[21:38:55] PROBLEM - puppet last run on mw1037 is CRITICAL: CRITICAL: Puppet has 37 failures
[21:38:56] PROBLEM - puppet last run on lvs2002 is CRITICAL: CRITICAL: puppet fail
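[Editorial note: the icinga-wm puppet alerts come in two flavours — "puppet fail" (the whole run failed, e.g. the puppetmaster returned HTTP 500 as seen on palladium above) and "Puppet has N failures" (the run completed but N resources failed). A hedged sketch of splitting the flood along that line; `summarize_puppet_alerts` is a hypothetical helper and the message format is assumed from this log only.]

```python
import re

# Two alert shapes observed in this log:
#   PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: puppet fail
#   PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 32 failures
PUPPET_RE = re.compile(
    r"PROBLEM - puppet last run on (?P<host>\S+) is CRITICAL: CRITICAL: "
    r"(?:puppet fail|Puppet has (?P<n>\d+) failures)"
)

def summarize_puppet_alerts(log_lines):
    """Return (hosts whose whole run failed, {host: resource failure count})."""
    total_fail, partial = [], {}
    for line in log_lines:
        m = PUPPET_RE.search(line)
        if not m:
            continue
        if m.group("n") is None:
            total_fail.append(m.group("host"))
        else:
            partial[m.group("host")] = int(m.group("n"))
    return total_fail, partial

lines = [
    "[21:38:05] PROBLEM - puppet last run on db1027 is CRITICAL: CRITICAL: puppet fail",
    "[21:38:14] PROBLEM - puppet last run on palladium is CRITICAL: CRITICAL: Puppet has 32 failures",
]
print(summarize_puppet_alerts(lines))  # (['db1027'], {'palladium': 32})
```

With the whole flood piped in, a cluster-wide event (every agent failing against a broken puppetmaster) shows up as hundreds of hosts in one sweep, as opposed to a handful of genuinely broken nodes.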
[21:38:56] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: Puppet has 10 failures
[21:38:57] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:38:57] PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: puppet fail
[21:38:58] PROBLEM - puppet last run on mw1073 is CRITICAL: CRITICAL: Puppet has 53 failures
[21:39:04] PROBLEM - puppet last run on mw1015 is CRITICAL: CRITICAL: puppet fail
[21:39:04] PROBLEM - puppet last run on es1004 is CRITICAL: CRITICAL: Puppet has 18 failures
[21:39:05] PROBLEM - puppet last run on amssq45 is CRITICAL: CRITICAL: puppet fail
[21:39:05] PROBLEM - puppet last run on amssq50 is CRITICAL: CRITICAL: Puppet has 25 failures
[21:39:06] PROBLEM - puppet last run on elastic1009 is CRITICAL: CRITICAL: puppet fail
[21:39:06] PROBLEM - puppet last run on amssq52 is CRITICAL: CRITICAL: puppet fail
[21:39:06] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: Puppet has 16 failures
[21:39:06] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: Puppet has 8 failures
[21:39:18] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: puppet fail
[21:39:24] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:39:25] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: puppet fail
[21:39:25] PROBLEM - puppet last run on mw1157 is CRITICAL: CRITICAL: puppet fail
[21:39:25] PROBLEM - puppet last run on mw1020 is CRITICAL: CRITICAL: puppet fail
[21:39:25] PROBLEM - puppet last run on mw1102 is CRITICAL: CRITICAL: puppet fail
[21:39:25] PROBLEM - puppet last run on db1005 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:39:25] PROBLEM - puppet last run on mw1018 is CRITICAL: CRITICAL: Puppet has 76 failures
[21:39:26] PROBLEM - puppet last run on mw1047 is CRITICAL: CRITICAL: Puppet has 58 failures
[21:39:27] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:39:27] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:39:27] PROBLEM - puppet last run on mw1184 is CRITICAL: CRITICAL: puppet fail
[21:39:34] PROBLEM - puppet last run on lvs2005 is CRITICAL: CRITICAL: puppet fail
[21:39:34] PROBLEM - puppet last run on hydrogen is CRITICAL: CRITICAL: puppet fail
[21:39:34] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:39:34] PROBLEM - puppet last run on zinc is CRITICAL: CRITICAL: Puppet has 20 failures
[21:39:34] PROBLEM - puppet last run on mw1070 is CRITICAL: CRITICAL: Puppet has 74 failures
[21:39:35] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: puppet fail
[21:39:35] PROBLEM - puppet last run on mw1137 is CRITICAL: CRITICAL: Puppet has 56 failures
[21:39:44] PROBLEM - puppet last run on mw1075 is CRITICAL: CRITICAL: puppet fail
[21:39:45] PROBLEM - puppet last run on cp4011 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:39:45] PROBLEM - puppet last run on lanthanum is CRITICAL: CRITICAL: Puppet has 37 failures
[21:39:45] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: Puppet has 29 failures
[21:39:45] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: puppet fail
[21:39:45] PROBLEM - puppet last run on cp4016 is CRITICAL: CRITICAL: Puppet has 34 failures
[21:39:46] PROBLEM - puppet last run on ms-be2010 is CRITICAL: CRITICAL: puppet fail
[21:39:46] PROBLEM - puppet last run on cp1008 is CRITICAL: CRITICAL: Puppet has 27 failures
[21:39:46] PROBLEM - puppet last run on elastic1010 is CRITICAL: CRITICAL: puppet fail
[21:39:47] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:39:54] PROBLEM - puppet last run on erbium is CRITICAL: CRITICAL: puppet fail
[21:39:54] PROBLEM - puppet last run on db2010 is CRITICAL: CRITICAL: Puppet has 13 failures
[21:39:54] PROBLEM - puppet last run on mw1078 is CRITICAL: CRITICAL: Puppet has 66 failures
[21:40:04] PROBLEM - puppet last run on search1003 is CRITICAL: CRITICAL: Puppet has 43 failures
[21:40:12] PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: puppet fail
[21:40:12] PROBLEM - puppet last run on praseodymium is CRITICAL: CRITICAL: puppet fail
[21:40:12] PROBLEM - puppet last run on ocg1002 is CRITICAL: CRITICAL: puppet fail
[21:40:12] PROBLEM - puppet last run on search1008 is CRITICAL: CRITICAL: Puppet has 47 failures
[21:40:12] PROBLEM - puppet last run on elastic1013 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:40:14] PROBLEM - puppet last run on search1019 is CRITICAL: CRITICAL: Puppet has 44 failures
[21:40:14] PROBLEM - puppet last run on es1005 is CRITICAL: CRITICAL: puppet fail
[21:40:14] PROBLEM - puppet last run on mw1182 is CRITICAL: CRITICAL: puppet fail
[21:40:24] PROBLEM - puppet last run on mw1095 is CRITICAL: CRITICAL: Puppet has 60 failures
[21:40:24] PROBLEM - puppet last run on nickel is CRITICAL: CRITICAL: puppet fail
[21:40:24] PROBLEM - puppet last run on mw1101 is CRITICAL: CRITICAL: Puppet has 69 failures
[21:40:25] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: puppet fail
[21:40:25] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: puppet fail
[21:40:25] PROBLEM - puppet last run on mw1085 is CRITICAL: CRITICAL: Puppet has 57 failures
[21:40:25] PROBLEM - puppet last run on mw1094 is CRITICAL: CRITICAL: puppet fail
[21:40:34] PROBLEM - puppet last run on tmh1002 is CRITICAL: CRITICAL: puppet fail
[21:40:36] PROBLEM - puppet last run on analytics1024 is CRITICAL: CRITICAL: Puppet has 19 failures
[21:40:37] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: Puppet has 25 failures
[21:40:37] PROBLEM - puppet last run on ssl1004 is CRITICAL: CRITICAL: puppet fail
[21:40:47] PROBLEM - puppet last run on db1058 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:40:54] PROBLEM - puppet last run on ms-be2009 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:40:54] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: puppet fail
[21:40:54] PROBLEM - puppet last run on virt1005 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:40:55] PROBLEM - puppet last run on cp1069 is CRITICAL: CRITICAL: puppet fail
[21:40:55] PROBLEM - puppet last run on virt1009 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:41:04] PROBLEM - puppet last run on search1021 is CRITICAL: CRITICAL: puppet fail
[21:41:05] PROBLEM - puppet last run on db1019 is CRITICAL: CRITICAL: puppet fail
[21:41:05] PROBLEM - puppet last run on mw1191 is CRITICAL: CRITICAL: puppet fail
[21:41:05] PROBLEM - puppet last run on labsdb1007 is CRITICAL: CRITICAL: puppet fail
[21:41:05] PROBLEM - puppet last run on mw1136 is CRITICAL: CRITICAL: puppet fail
[21:41:05] PROBLEM - puppet last run on lvs1006 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:41:05] PROBLEM - puppet last run on ms-be3004 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:41:15] PROBLEM - puppet last run on db2033 is CRITICAL: CRITICAL: puppet fail
[21:41:15] PROBLEM - puppet last run on search1009 is CRITICAL: CRITICAL: Puppet has 52 failures
[21:41:15] PROBLEM - puppet last run on es1003 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:41:16] PROBLEM - puppet last run on mw1127 is CRITICAL: CRITICAL: Puppet has 66 failures
[21:41:25] PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Puppet has 76 failures
[21:41:30] PROBLEM - puppet last run on cp1057 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:41:30] PROBLEM - puppet last run on amssq57 is CRITICAL: CRITICAL: puppet fail
[21:41:30] PROBLEM - puppet last run on mw1083 is CRITICAL: CRITICAL: Puppet has 56 failures
[21:41:34] PROBLEM - puppet last run on analytics1015 is CRITICAL: CRITICAL: puppet fail
[21:41:34] PROBLEM - puppet last run on mw1124 is CRITICAL: CRITICAL: puppet fail
[21:41:35] PROBLEM - puppet last run on mw1096 is CRITICAL: CRITICAL: puppet fail
[21:41:35] PROBLEM - puppet last run on mw1036 is CRITICAL: CRITICAL: puppet fail
[21:41:44] PROBLEM - puppet last run on db1041 is CRITICAL: CRITICAL: puppet fail
[21:41:47] PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: puppet fail
[21:41:47] PROBLEM - puppet last run on mw1013 is CRITICAL: CRITICAL: puppet fail
[21:41:48] PROBLEM - puppet last run on labsdb1001 is CRITICAL: CRITICAL: puppet fail
[21:41:48] PROBLEM - puppet last run on wtp1024 is CRITICAL: CRITICAL: puppet fail
[21:41:48] PROBLEM - puppet last run on search1014 is CRITICAL: CRITICAL: Puppet has 57 failures
[21:41:48] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:41:48] PROBLEM - puppet last run on wtp1021 is CRITICAL: CRITICAL: puppet fail
[21:41:55] PROBLEM - puppet last run on mc1008 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:41:56] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: Puppet has 27 failures
[21:42:05] PROBLEM - puppet last run on db2035 is CRITICAL: CRITICAL: puppet fail
[21:42:15] PROBLEM - puppet last run on cp4013 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:42:15] PROBLEM - puppet last run on mw1196 is CRITICAL: CRITICAL: Puppet has 72 failures
[21:42:24] PROBLEM - puppet last run on mw1062 is CRITICAL: CRITICAL: puppet fail
[21:42:29] PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: Puppet has 16 failures
[21:42:29] PROBLEM - puppet last run on mc1010 is CRITICAL: CRITICAL: puppet fail
[21:42:30] PROBLEM - puppet last run on analytics1034 is CRITICAL: CRITICAL: Puppet has 17 failures
[21:42:30] PROBLEM - puppet last run on amslvs4 is CRITICAL: CRITICAL: Puppet has 18 failures
[21:42:30] PROBLEM - puppet last run on ssl3003 is CRITICAL: CRITICAL: puppet fail
[21:42:30] PROBLEM - puppet last run on analytics1019 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:42:31] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 55 failures
[21:42:31] PROBLEM - puppet last run on amssq58 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:42:31] PROBLEM - puppet last run on cp3022 is CRITICAL: CRITICAL: puppet fail
[21:42:32] PROBLEM - puppet last run on mw1216 is CRITICAL: CRITICAL: Puppet has 80 failures
[21:42:32] PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:42:33] PROBLEM - puppet last run on db1010 is CRITICAL: CRITICAL: puppet fail
[21:42:33] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: puppet fail
[21:42:44] PROBLEM - puppet last run on cp1051 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:42:44] PROBLEM - puppet last run on db1007 is CRITICAL: CRITICAL: Puppet has 22 failures
[21:42:44] PROBLEM - puppet last run on zirconium is CRITICAL: CRITICAL: Puppet has 50 failures
[21:42:45] PROBLEM - puppet last run on mw1035 is CRITICAL: CRITICAL: Puppet has 62 failures
[21:42:45] PROBLEM - puppet last run on es1009 is CRITICAL: CRITICAL: puppet fail
[21:42:45] PROBLEM - puppet last run on mw1040 is CRITICAL: CRITICAL: Puppet has 66 failures
[21:42:58] PROBLEM - puppet last run on mw1178 is CRITICAL: CRITICAL: puppet fail
[21:42:58] PROBLEM - puppet last run on mc1011 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:42:59] PROBLEM - puppet last run on cp1043 is CRITICAL: CRITICAL: puppet fail
[21:43:04] PROBLEM - puppet last run on cp4007 is CRITICAL: CRITICAL: puppet fail
[21:43:15] PROBLEM - puppet last run on analytics1039 is CRITICAL: CRITICAL: puppet fail
[21:43:15] PROBLEM - puppet last run on mw1028 is CRITICAL: CRITICAL: puppet fail
[21:43:15] PROBLEM - puppet last run on ms-fe1003 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:43:16] PROBLEM - puppet last run on ytterbium is CRITICAL: CRITICAL: Puppet has 31 failures
[21:43:16] PROBLEM - puppet last run on es1006 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:43:16] PROBLEM - puppet last run on mw1161 is CRITICAL: CRITICAL: Puppet has 68 failures
[21:43:16] PROBLEM - puppet last run on mw1080 is CRITICAL: CRITICAL: puppet fail
[21:43:16] PROBLEM - puppet last run on cp1059 is CRITICAL: CRITICAL: puppet fail
[21:43:17] PROBLEM - puppet last run on mc1009 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:43:17] PROBLEM - puppet last run on rdb1004 is CRITICAL: CRITICAL: puppet fail
[21:43:25] PROBLEM - puppet last run on mw1192 is CRITICAL: CRITICAL: puppet fail
[21:43:25] PROBLEM - puppet last run on amssq44 is CRITICAL: CRITICAL: puppet fail
[21:43:25] PROBLEM - puppet last run on virt1002 is CRITICAL: CRITICAL: puppet fail
[21:43:34] PROBLEM - puppet last run on mw1132 is CRITICAL: CRITICAL: Puppet has 61 failures
[21:43:35] PROBLEM - puppet last run on mw1072 is CRITICAL: CRITICAL: puppet fail
[21:43:37] PROBLEM - puppet last run on mw1147 is CRITICAL: CRITICAL: Puppet has 67 failures
[21:43:38] PROBLEM - puppet last run on pc1001 is CRITICAL: CRITICAL: puppet fail
[21:43:38] PROBLEM - puppet last run on mw1048 is CRITICAL: CRITICAL: puppet fail
[21:43:38] PROBLEM - puppet last run on mw1115 is CRITICAL: CRITICAL: Puppet has 48 failures
[21:43:38] PROBLEM - puppet last run on dbstore1001 is CRITICAL: CRITICAL: Puppet has 25 failures
[21:43:38] PROBLEM - puppet last run on mw1130 is CRITICAL: CRITICAL: Puppet has 56 failures
[21:43:38] PROBLEM - puppet last run on uranium is CRITICAL: CRITICAL: puppet fail
[21:43:50] PROBLEM - puppet last run on logstash1003 is CRITICAL: CRITICAL: puppet fail
[21:43:51] PROBLEM - puppet last run on db1029 is CRITICAL: CRITICAL: puppet fail
[21:43:51] PROBLEM - puppet last run on wtp1017 is CRITICAL: CRITICAL: puppet fail
[21:43:51] PROBLEM - puppet last run on cerium is CRITICAL: CRITICAL: puppet fail
[21:43:54] PROBLEM - puppet last run on mw1089 is CRITICAL: CRITICAL: Puppet has 65 failures
[21:43:54] PROBLEM - puppet last run on cp3006 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:43:54] PROBLEM - puppet last run on cp3015 is CRITICAL: CRITICAL: puppet fail
[21:43:54] PROBLEM - puppet last run on vanadium is CRITICAL: CRITICAL: puppet fail
[21:43:54] PROBLEM - puppet last run on mw1038 is CRITICAL: CRITICAL: puppet fail
[21:43:55] PROBLEM - puppet last run on mw1145 is CRITICAL: CRITICAL: puppet fail
[21:43:55] PROBLEM - puppet last run on mw1067 is CRITICAL: CRITICAL: puppet fail
[21:43:56] PROBLEM - puppet last run on mw1031 is CRITICAL: CRITICAL: Puppet has 61 failures
[21:43:56] PROBLEM - puppet last run on search1020 is CRITICAL: CRITICAL: puppet fail
[21:43:57] PROBLEM - puppet last run on wtp1019 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:44:04] PROBLEM - puppet last run on ms-be1001 is CRITICAL: CRITICAL: Puppet has 19 failures
[21:44:04] PROBLEM - puppet last run on mc1004 is CRITICAL: CRITICAL: puppet fail
[21:44:05] PROBLEM - puppet last run on cp4006 is CRITICAL: CRITICAL: puppet fail
[21:44:14] PROBLEM - puppet last run on ocg1001 is CRITICAL: CRITICAL: puppet fail
[21:44:15] PROBLEM - puppet last run on caesium is CRITICAL: CRITICAL: Puppet has 37 failures
[21:44:15] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: puppet fail
[21:44:15] PROBLEM - puppet last run on mw1012 is CRITICAL: CRITICAL: puppet fail
[21:44:15] PROBLEM - puppet last run on mw1006 is CRITICAL: CRITICAL: Puppet has 52 failures
[21:44:15] PROBLEM - puppet last run on magnesium is CRITICAL: CRITICAL: Puppet has 25 failures
[21:44:24] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: Puppet has 79 failures
[21:44:25] PROBLEM - puppet last run on db1017 is CRITICAL: CRITICAL: puppet fail
[21:44:25] PROBLEM - puppet last run on mw1005 is CRITICAL: CRITICAL: Puppet has 53 failures
[21:44:26] PROBLEM - puppet last run on amslvs2 is CRITICAL: CRITICAL: puppet fail
[21:44:36] PROBLEM - puppet last run on cp1070 is CRITICAL: CRITICAL: puppet fail
[21:44:37] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: puppet fail
[21:44:44] PROBLEM - puppet last run on db1053 is CRITICAL: CRITICAL: Puppet has 25 failures
[21:44:44] PROBLEM - puppet last run on amssq43 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:44:44] PROBLEM - puppet last run on cp3013 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:44:44] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: puppet fail
[21:44:44] PROBLEM - puppet last run on mw1187 is CRITICAL: CRITICAL: puppet fail
[21:44:45] PROBLEM - puppet last run on logstash1001 is CRITICAL: CRITICAL: puppet fail
[21:44:45] PROBLEM - puppet last run on mw1197 is CRITICAL: CRITICAL: Puppet has 67 failures
[21:44:46] PROBLEM - puppet last run on ms-be1004 is CRITICAL: CRITICAL: puppet fail
[21:44:46] PROBLEM - puppet last run on mw1140 is CRITICAL: CRITICAL: Puppet has 66 failures
[21:44:47] PROBLEM - puppet last run on db1065 is CRITICAL: CRITICAL: puppet fail
[21:44:47] PROBLEM - puppet last run on carbon is CRITICAL: CRITICAL: puppet fail
[21:44:48] PROBLEM - puppet last run on mw1106 is CRITICAL: CRITICAL: puppet fail
[21:44:48] PROBLEM - puppet last run on dbproxy1002 is CRITICAL: CRITICAL: Puppet has 10 failures
[21:44:49] PROBLEM - puppet last run on amssq59 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:44:49] PROBLEM - puppet last run on mw1063 is CRITICAL: CRITICAL: Puppet has 47 failures
[21:44:54] PROBLEM - puppet last run on searchidx1001 is CRITICAL: CRITICAL: puppet fail
[21:44:54] PROBLEM - puppet last run on es1001 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:44:54] PROBLEM - puppet last run on neptunium is CRITICAL: CRITICAL: puppet fail
[21:44:54] PROBLEM - puppet last run on mw1174 is CRITICAL: CRITICAL: puppet fail
[21:44:54] PROBLEM - puppet last run on analytics1033 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:44:55] PROBLEM - puppet last run on mw1026 is CRITICAL: CRITICAL: puppet fail
[21:44:55] PROBLEM - puppet last run on analytics1017 is CRITICAL: CRITICAL: Puppet has 19 failures
[21:45:04] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: Puppet has 18 failures
[21:45:05] PROBLEM - puppet last run on xenon is CRITICAL: CRITICAL: puppet fail
[21:45:06] PROBLEM - puppet last run on ms-be2002 is CRITICAL: CRITICAL: Puppet has 29 failures
[21:45:07] PROBLEM - puppet last run on db2009 is CRITICAL: CRITICAL: puppet fail
[21:45:11] PROBLEM - puppet last run on mw1007 is CRITICAL: CRITICAL: puppet fail
[21:45:14] PROBLEM - puppet last run on mw1134 is CRITICAL: CRITICAL: Puppet has 63 failures
[21:45:24] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: puppet fail
[21:45:31] PROBLEM - puppet last run on mw1141 is CRITICAL: CRITICAL: Puppet has 71 failures
[21:45:31] PROBLEM - puppet last run on wtp1009 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:45:31] PROBLEM - puppet last run on mw1045 is CRITICAL: CRITICAL: Puppet has 67 failures
[21:45:31] PROBLEM - puppet last run on db1073 is CRITICAL: CRITICAL: Puppet has 16 failures
[21:45:31] PROBLEM - puppet last run on lvs4002 is CRITICAL: CRITICAL: Puppet has 22 failures
[21:45:31] PROBLEM - puppet last run on elastic1007 is CRITICAL: CRITICAL: puppet fail
[21:45:34] PROBLEM - puppet last run on mw1150 is CRITICAL: CRITICAL: puppet fail
[21:45:34] PROBLEM - puppet last run on db2019 is CRITICAL: CRITICAL: puppet fail
[21:45:34] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: Puppet has 29 failures
[21:45:35] PROBLEM - puppet last run on es1008 is CRITICAL: CRITICAL: puppet fail
[21:45:35] PROBLEM - puppet last run on mw1082 is CRITICAL: CRITICAL: puppet fail
[21:45:35] PROBLEM - puppet last run on analytics1040 is CRITICAL: CRITICAL: puppet fail
[21:45:35] PROBLEM - puppet last run on amssq32 is CRITICAL: CRITICAL: puppet fail
[21:45:36] PROBLEM - puppet last run on rbf1002 is CRITICAL: CRITICAL: puppet fail
[21:45:36] PROBLEM - puppet last run on cp1049 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:45:37] PROBLEM - puppet last run on amssq49 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:45:37] PROBLEM - puppet last run on mc1006 is CRITICAL: CRITICAL: puppet fail
[21:45:44] PROBLEM - puppet last run on elastic1001 is CRITICAL: CRITICAL: puppet fail
[21:45:44] PROBLEM - puppet last run on analytics1041 is CRITICAL: CRITICAL: puppet fail
[21:45:44] PROBLEM - puppet last run on mw1068 is CRITICAL: CRITICAL: puppet fail
[21:45:44] PROBLEM - puppet last run on cp1039 is CRITICAL: CRITICAL: puppet fail
[21:45:45] PROBLEM - puppet last run on ms-be1003 is CRITICAL: CRITICAL: Puppet has 30 failures
[21:45:45] PROBLEM - puppet last run on platinum is CRITICAL: CRITICAL: puppet fail
[21:45:45] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: puppet fail
[21:45:46] PROBLEM - puppet last run on gold is CRITICAL: CRITICAL: Puppet has 18 failures
[21:45:46] PROBLEM - puppet last run on mw1059 is CRITICAL: CRITICAL: Puppet has 55 failures
[21:45:47] PROBLEM - puppet last run on db1031 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:45:47] PROBLEM - puppet last run on search1016 is CRITICAL: CRITICAL: Puppet has 52 failures
[21:45:48] PROBLEM - puppet last run on mw1088 is CRITICAL: CRITICAL: puppet fail
[21:45:53] hi icinga-wm
[21:45:54] PROBLEM - puppet last run on wtp1006 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:45:55] PROBLEM - puppet last run on mw1205 is CRITICAL: CRITICAL: puppet fail
[21:45:56] PROBLEM - puppet last run on ms-be1006 is CRITICAL: CRITICAL: Puppet has 35 failures
[21:45:56] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: Puppet has 20 failures
[21:45:56] PROBLEM - puppet last run on elastic1008 is CRITICAL: CRITICAL: puppet fail
[21:45:56] PROBLEM - puppet last run on lead is CRITICAL: CRITICAL: Puppet has 24 failures
[21:45:56] PROBLEM - puppet last run on wtp1020 is CRITICAL: CRITICAL: puppet fail
[21:45:56] PROBLEM - puppet last run on mw1200 is CRITICAL: CRITICAL: Puppet has 69 failures
[21:46:04] PROBLEM - puppet last run on db2002 is CRITICAL: CRITICAL: puppet fail
[21:46:07] PROBLEM - puppet last run on cp1047 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:46:07] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: puppet fail
[21:46:07] PROBLEM - puppet last run on mw1160 is CRITICAL: CRITICAL: Puppet has 68 failures
[21:46:14] PROBLEM - puppet last run on analytics1025 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:46:15] PROBLEM - puppet last run on potassium is CRITICAL: CRITICAL: Puppet has 21 failures
[21:46:15] PROBLEM - puppet last run on db2005 is CRITICAL: CRITICAL: puppet fail
[21:46:16] PROBLEM - puppet last run on snapshot1003 is CRITICAL: CRITICAL: Puppet has 59 failures
[21:46:16] PROBLEM - puppet last run on mw1041 is CRITICAL: CRITICAL: Puppet has 62 failures
[21:46:16] PROBLEM - puppet last run on cp1055 is CRITICAL: CRITICAL: Puppet has 26 failures
[21:46:17] PROBLEM - puppet last run on elastic1004 is CRITICAL: CRITICAL: Puppet has 19 failures
[21:46:17] PROBLEM - puppet last run on db2039 is CRITICAL: CRITICAL: Puppet has 19 failures
[21:46:24] PROBLEM - puppet last run on analytics1035 is CRITICAL: CRITICAL: Puppet has 25 failures
[21:46:24] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: puppet fail
[21:46:24] PROBLEM - puppet last run on mw1120 is CRITICAL: CRITICAL: puppet fail
[21:46:24] PROBLEM - puppet last run on ms-be2003 is CRITICAL: CRITICAL: Puppet has 22 failures
[21:46:24] PROBLEM - puppet last run on lvs3001 is CRITICAL: CRITICAL: puppet fail
[21:46:25] PROBLEM - puppet last run on mw1060 is CRITICAL: CRITICAL: puppet fail
[21:46:25] PROBLEM - puppet last run on elastic1012 is CRITICAL: CRITICAL: Puppet has 24 failures
[21:46:26] PROBLEM - puppet last run on amssq53 is CRITICAL: CRITICAL: Puppet has 27 failures
[21:46:26] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: puppet fail
[21:46:27] PROBLEM - puppet last run on ssl1002 is CRITICAL: CRITICAL: Puppet has 17 failures
[21:46:34] PROBLEM - puppet last run on mw1217 is CRITICAL: CRITICAL: Puppet has 71 failures
[21:46:34] PROBLEM - puppet last run on sca1002 is CRITICAL: CRITICAL: puppet fail
[21:46:34] PROBLEM - puppet last run on lvs1002 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:46:34] PROBLEM - puppet last run on ms-be2004 is CRITICAL: CRITICAL: puppet fail
[21:46:35] PROBLEM - puppet last run on ms-fe1001 is CRITICAL: CRITICAL: puppet fail
[21:46:45] PROBLEM - puppet last run on mc1002 is CRITICAL: CRITICAL: puppet fail
[21:46:46] PROBLEM - puppet last run on mw1100 is CRITICAL: CRITICAL: Puppet has 72 failures
[21:46:49] PROBLEM - puppet last run on db1050 is CRITICAL: CRITICAL: Puppet has 15 failures
[21:46:49] PROBLEM - puppet last run on db1066 is CRITICAL: CRITICAL: Puppet has 17 failures
[21:46:56] PROBLEM - puppet last run on mw1003 is CRITICAL: CRITICAL: puppet fail
[21:46:56] PROBLEM - puppet last run on mw1117 is CRITICAL: CRITICAL: puppet fail
[21:46:57] PROBLEM - puppet last run on labstore1003 is CRITICAL: CRITICAL: Puppet has 5 failures
[21:46:57] PROBLEM - puppet last run on mw1176 is CRITICAL: CRITICAL: Puppet has 62 failures
[21:47:05] PROBLEM - puppet last run on mw1046 is CRITICAL: CRITICAL: puppet fail
[21:47:05] PROBLEM - puppet last run on analytics1020 is CRITICAL: CRITICAL: Puppet has 21 failures
[21:47:05] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: puppet fail
[21:47:05] PROBLEM - puppet last run on db1015 is CRITICAL: CRITICAL: puppet fail
[21:47:06] PROBLEM - puppet last run on mw1061 is CRITICAL: CRITICAL: puppet fail
[21:47:14] PROBLEM - puppet last run on mc1003 is CRITICAL: CRITICAL: Puppet has 27 failures
[21:47:14] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:47:15] PROBLEM - puppet last run on cp1056 is CRITICAL: CRITICAL: puppet fail
[21:47:15] PROBLEM - puppet last run on ms-fe2004 is CRITICAL: CRITICAL: Puppet has 23 failures
[21:47:15] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: puppet fail
[21:47:15] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: puppet fail
[21:47:15] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: puppet fail
[21:47:16] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: puppet fail
[21:47:16] PROBLEM - puppet last run on mw1069 is CRITICAL: CRITICAL: Puppet has 70 failures
[21:47:17] PROBLEM - puppet last run on ms-fe2001 is
CRITICAL: CRITICAL: puppet fail [21:47:17] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: puppet fail [21:47:18] PROBLEM - puppet last run on wtp1016 is CRITICAL: CRITICAL: Puppet has 21 failures [21:47:18] PROBLEM - puppet last run on ms-be2006 is CRITICAL: CRITICAL: Puppet has 22 failures [21:47:25] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: puppet fail [21:47:25] PROBLEM - puppet last run on mw1164 is CRITICAL: CRITICAL: Puppet has 78 failures [21:47:25] PROBLEM - puppet last run on cp4003 is CRITICAL: CRITICAL: puppet fail [21:47:25] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: Puppet has 72 failures [21:47:25] PROBLEM - puppet last run on mw1002 is CRITICAL: CRITICAL: puppet fail [21:47:35] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: puppet fail [21:47:35] PROBLEM - puppet last run on mw1099 is CRITICAL: CRITICAL: Puppet has 52 failures [21:47:35] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: puppet fail [21:47:35] PROBLEM - puppet last run on mw1177 is CRITICAL: CRITICAL: puppet fail [21:47:35] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: Puppet has 22 failures [21:47:35] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 30 failures [21:47:35] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: puppet fail [21:47:36] PROBLEM - puppet last run on mw1009 is CRITICAL: CRITICAL: Puppet has 48 failures [21:47:36] PROBLEM - puppet last run on mw1092 is CRITICAL: CRITICAL: puppet fail [21:47:37] PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: puppet fail [21:47:39] !log restarting apache on palladium, strontium [21:47:44] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: puppet fail [21:47:45] PROBLEM - puppet last run on search1010 is CRITICAL: CRITICAL: Puppet has 52 failures [21:47:45] Logged the message, Master [21:47:45] PROBLEM - puppet last run on amssq47 is CRITICAL: CRITICAL: puppet fail [21:47:45] PROBLEM - puppet last run on 
db1028 is CRITICAL: CRITICAL: puppet fail [21:47:54] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: Puppet has 18 failures [21:47:54] PROBLEM - puppet last run on amssq55 is CRITICAL: CRITICAL: puppet fail [21:47:54] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: puppet fail [21:47:55] PROBLEM - puppet last run on labsdb1003 is CRITICAL: CRITICAL: Puppet has 16 failures [21:47:55] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: Puppet has 70 failures [21:47:55] PROBLEM - puppet last run on mw1153 is CRITICAL: CRITICAL: Puppet has 69 failures [21:47:55] PROBLEM - puppet last run on mw1175 is CRITICAL: CRITICAL: puppet fail [21:47:56] PROBLEM - puppet last run on iron is CRITICAL: CRITICAL: puppet fail [21:47:56] PROBLEM - puppet last run on db1034 is CRITICAL: CRITICAL: puppet fail [21:47:57] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 64 failures [21:47:57] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: puppet fail [21:48:05] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: Puppet has 59 failures [21:48:06] PROBLEM - puppet last run on search1018 is CRITICAL: CRITICAL: Puppet has 50 failures [21:48:06] PROBLEM - puppet last run on mw1172 is CRITICAL: CRITICAL: puppet fail [21:48:06] PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: puppet fail [21:48:07] PROBLEM - puppet last run on db1002 is CRITICAL: CRITICAL: Puppet has 17 failures [21:48:07] PROBLEM - puppet last run on mw1065 is CRITICAL: CRITICAL: Puppet has 58 failures [21:48:07] RECOVERY - puppetmaster backend https on palladium is OK: HTTP OK: Status line output matched 400 - 335 bytes in 0.052 second response time [21:48:15] PROBLEM - puppet last run on analytics1030 is CRITICAL: CRITICAL: Puppet has 22 failures [21:48:15] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: Puppet has 26 failures [21:48:24] PROBLEM - puppet last run on db1042 is CRITICAL: CRITICAL: puppet fail [21:48:25] PROBLEM - puppet last 
run on dbproxy1001 is CRITICAL: CRITICAL: puppet fail [21:48:26] PROBLEM - puppet last run on logstash1002 is CRITICAL: CRITICAL: Puppet has 32 failures [21:48:26] PROBLEM - puppet last run on analytics1010 is CRITICAL: CRITICAL: puppet fail [21:48:26] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: Puppet has 32 failures [21:48:26] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: puppet fail [21:48:26] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: Puppet has 16 failures [21:48:26] PROBLEM - puppet last run on mw1126 is CRITICAL: CRITICAL: puppet fail [21:48:27] PROBLEM - puppet last run on mw1114 is CRITICAL: CRITICAL: puppet fail [21:48:27] PROBLEM - puppet last run on search1007 is CRITICAL: CRITICAL: puppet fail [21:48:28] PROBLEM - puppet last run on install2001 is CRITICAL: CRITICAL: puppet fail [21:48:28] PROBLEM - puppet last run on mw1054 is CRITICAL: CRITICAL: Puppet has 55 failures [21:48:29] PROBLEM - puppet last run on mw1011 is CRITICAL: CRITICAL: puppet fail [21:48:29] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: Puppet has 67 failures [21:48:30] PROBLEM - puppet last run on db1016 is CRITICAL: CRITICAL: puppet fail [21:48:34] PROBLEM - puppet last run on labcontrol2001 is CRITICAL: CRITICAL: Puppet has 35 failures [21:48:34] PROBLEM - puppet last run on ms-fe2003 is CRITICAL: CRITICAL: Puppet has 26 failures [21:48:34] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: puppet fail [21:48:35] PROBLEM - puppet last run on db1067 is CRITICAL: CRITICAL: Puppet has 24 failures [21:48:35] PROBLEM - puppet last run on amssq60 is CRITICAL: CRITICAL: Puppet has 20 failures [21:48:45] PROBLEM - puppet last run on analytics1016 is CRITICAL: CRITICAL: puppet fail [21:48:45] PROBLEM - puppet last run on mc1012 is CRITICAL: CRITICAL: puppet fail [21:48:45] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Puppet has 32 failures [21:48:45] PROBLEM - puppet last run on db1051 is CRITICAL: 
CRITICAL: Puppet has 25 failures [21:48:45] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 24 failures [21:48:45] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: puppet fail [21:48:45] PROBLEM - puppet last run on amssq46 is CRITICAL: CRITICAL: puppet fail [21:48:54] PROBLEM - puppet last run on virt1003 is CRITICAL: CRITICAL: puppet fail [21:49:00] PROBLEM - puppet last run on wtp1012 is CRITICAL: CRITICAL: puppet fail [21:49:07] PROBLEM - puppet last run on lvs2001 is CRITICAL: CRITICAL: puppet fail [21:49:07] PROBLEM - puppet last run on db2036 is CRITICAL: CRITICAL: Puppet has 16 failures [21:49:07] PROBLEM - puppet last run on mw1129 is CRITICAL: CRITICAL: puppet fail [21:49:07] PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: Puppet has 66 failures [21:49:14] PROBLEM - puppet last run on antimony is CRITICAL: CRITICAL: puppet fail [21:49:14] PROBLEM - puppet last run on mw1055 is CRITICAL: CRITICAL: puppet fail [21:49:15] PROBLEM - puppet last run on mw1076 is CRITICAL: CRITICAL: puppet fail [21:49:15] PROBLEM - puppet last run on wtp1005 is CRITICAL: CRITICAL: Puppet has 25 failures [21:49:18] is there a plan for fixing this? as in, really fixing it? [21:49:24] PROBLEM - puppet last run on polonium is CRITICAL: CRITICAL: Puppet has 29 failures [21:49:24] PROBLEM - puppet last run on db2007 is CRITICAL: CRITICAL: puppet fail [21:49:26] PROBLEM - puppet last run on mw1208 is CRITICAL: CRITICAL: puppet fail [21:49:26] PROBLEM - puppet last run on db1052 is CRITICAL: CRITICAL: Puppet has 45 failures [21:49:35] PROBLEM - puppet last run on db2038 is CRITICAL: CRITICAL: puppet fail [21:49:35] PROBLEM - puppet last run on ms-be2011 is CRITICAL: CRITICAL: puppet fail [21:49:35] palladium going haywire you mean ? 
[21:49:35] PROBLEM - puppet last run on cp4019 is CRITICAL: CRITICAL: puppet fail [21:49:35] PROBLEM - puppet last run on es1007 is CRITICAL: CRITICAL: Puppet has 27 failures [21:49:35] PROBLEM - puppet last run on cp4001 is CRITICAL: CRITICAL: puppet fail [21:49:35] PROBLEM - puppet last run on cp4004 is CRITICAL: CRITICAL: Puppet has 29 failures [21:49:35] PROBLEM - puppet last run on snapshot1001 is CRITICAL: CRITICAL: Puppet has 104 failures [21:49:35] or is it like the flooding of the nile? [21:49:36] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: Puppet has 24 failures [21:49:36] PROBLEM - puppet last run on db1060 is CRITICAL: CRITICAL: puppet fail [21:49:37] PROBLEM - puppet last run on mw1162 is CRITICAL: CRITICAL: Puppet has 78 failures [21:49:37] PROBLEM - puppet last run on labstore1001 is CRITICAL: CRITICAL: puppet fail [21:49:38] PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: Puppet has 34 failures [21:49:39] yeah [21:49:44] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: puppet fail [21:49:44] PROBLEM - puppet last run on mw1195 is CRITICAL: CRITICAL: puppet fail [21:49:44] PROBLEM - puppet last run on virt1001 is CRITICAL: CRITICAL: puppet fail [21:49:44] PROBLEM - puppet last run on elastic1006 is CRITICAL: CRITICAL: puppet fail [21:49:45] PROBLEM - puppet last run on mw1039 is CRITICAL: CRITICAL: Puppet has 72 failures [21:49:45] PROBLEM - puppet last run on db1036 is CRITICAL: CRITICAL: puppet fail [21:49:45] PROBLEM - puppet last run on pc1002 is CRITICAL: CRITICAL: puppet fail [21:49:46] PROBLEM - puppet last run on mw1202 is CRITICAL: CRITICAL: puppet fail [21:49:46] PROBLEM - puppet last run on amssq34 is CRITICAL: CRITICAL: puppet fail [21:49:54] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: puppet fail [21:49:59] PROBLEM - puppet last run on lithium is CRITICAL: CRITICAL: Puppet has 40 failures [21:49:59] PROBLEM - puppet last run on db1026 is CRITICAL: CRITICAL: puppet fail [21:49:59] 
PROBLEM - puppet last run on plutonium is CRITICAL: CRITICAL: puppet fail [21:49:59] PROBLEM - puppet last run on mw1051 is CRITICAL: CRITICAL: puppet fail [21:50:00] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: Puppet has 44 failures [21:50:00] PROBLEM - puppet last run on amslvs1 is CRITICAL: CRITICAL: Puppet has 15 failures [21:50:00] PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: Puppet has 43 failures [21:50:01] PROBLEM - puppet last run on amssq36 is CRITICAL: CRITICAL: puppet fail [21:50:01] PROBLEM - puppet last run on mw1044 is CRITICAL: CRITICAL: Puppet has 67 failures [21:50:03] if we can debug it, yes [21:50:04] PROBLEM - puppet last run on mw1149 is CRITICAL: CRITICAL: puppet fail [21:50:04] PROBLEM - puppet last run on mw1014 is CRITICAL: CRITICAL: puppet fail [21:50:04] PROBLEM - puppet last run on amssq51 is CRITICAL: CRITICAL: puppet fail [21:50:04] PROBLEM - puppet last run on analytics1038 is CRITICAL: CRITICAL: Puppet has 44 failures [21:50:15] PROBLEM - puppet last run on mw1190 is CRITICAL: CRITICAL: puppet fail [21:50:15] PROBLEM - puppet last run on lvs2006 is CRITICAL: CRITICAL: puppet fail [21:50:15] PROBLEM - puppet last run on labnet1001 is CRITICAL: CRITICAL: Puppet has 43 failures [21:50:15] PROBLEM - puppet last run on dataset1001 is CRITICAL: CRITICAL: Puppet has 58 failures [21:50:16] PROBLEM - puppet last run on labmon1001 is CRITICAL: CRITICAL: puppet fail [21:50:16] PROBLEM - puppet last run on db1003 is CRITICAL: CRITICAL: puppet fail [21:50:29] PROBLEM - puppet last run on mc1014 is CRITICAL: CRITICAL: puppet fail [21:50:29] PROBLEM - puppet last run on db1043 is CRITICAL: CRITICAL: Puppet has 42 failures [21:50:34] PROBLEM - puppet last run on search1002 is CRITICAL: CRITICAL: puppet fail [21:50:36] PROBLEM - puppet last run on mw1180 is CRITICAL: CRITICAL: puppet fail [21:50:36] RECOVERY - puppet last run on db2039 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures 
[21:54:45] RECOVERY - puppet last run on osm-cp1001 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[21:55:35–22:01:34] (icinga-wm posts the matching flood of "RECOVERY - puppet last run on … is OK: OK: Puppet is currently enabled, last run N seconds ago with 0 failures" messages as puppet runs succeed again fleet-wide)
[21:56:15] RECOVERY - puppet last run on palladium is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
Puppet is currently enabled, last run 6 seconds ago with 0 failures [22:01:34] RECOVERY - puppet last run on mw1147 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [22:01:34] RECOVERY - puppet last run on mw1048 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [22:01:35] RECOVERY - puppet last run on mw1216 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [22:01:35] RECOVERY - puppet last run on dbstore1001 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [22:01:35] RECOVERY - puppet last run on mw1130 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [22:01:35] RECOVERY - puppet last run on ssl3003 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [22:01:44] RECOVERY - puppet last run on cp3013 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [22:01:50] RECOVERY - puppet last run on uranium is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:01:50] RECOVERY - puppet last run on logstash1003 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:01:50] RECOVERY - puppet last run on wtp1017 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [22:01:50] RECOVERY - puppet last run on mw1089 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [22:01:51] RECOVERY - puppet last run on vanadium is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:01:51] RECOVERY - puppet last run on mw1038 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [22:01:54] RECOVERY - puppet last run on analytics1033 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [22:01:54] RECOVERY - puppet last run on mw1031 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 
failures [22:01:54] RECOVERY - puppet last run on mw1040 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:01:54] RECOVERY - puppet last run on search1020 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [22:01:54] RECOVERY - puppet last run on wtp1019 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [22:01:55] RECOVERY - puppet last run on ms-be1001 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [22:02:04] RECOVERY - puppet last run on mw1178 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:02:08] RECOVERY - puppet last run on ms-be2002 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [22:02:08] RECOVERY - puppet last run on mc1011 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [22:02:08] RECOVERY - puppet last run on mw1134 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [22:02:09] RECOVERY - puppet last run on ocg1001 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:02:09] RECOVERY - puppet last run on ms-fe1003 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [22:02:09] RECOVERY - puppet last run on caesium is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [22:02:14] RECOVERY - puppet last run on mw1080 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [22:02:15] RECOVERY - puppet last run on cp1059 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [22:02:15] RECOVERY - puppet last run on magnesium is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [22:02:24] RECOVERY - puppet last run on db1017 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [22:02:25] RECOVERY - puppet last run on mw1005 is 
OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [22:02:26] RECOVERY - puppet last run on cp1049 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:02:26] RECOVERY - puppet last run on cp1070 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:02:44] RECOVERY - puppet last run on amssq44 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [22:02:49] RECOVERY - puppet last run on amslvs2 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [22:02:49] RECOVERY - puppet last run on db1053 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [22:02:49] RECOVERY - puppet last run on mw1115 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [22:02:49] RECOVERY - puppet last run on mw1063 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:02:50] RECOVERY - puppet last run on amssq43 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [22:02:50] RECOVERY - puppet last run on logstash1001 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [22:02:50] RECOVERY - puppet last run on mw1197 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:02:51] RECOVERY - puppet last run on ms-be1004 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [22:02:51] RECOVERY - puppet last run on db1065 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [22:02:52] RECOVERY - puppet last run on carbon is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:02:52] RECOVERY - puppet last run on dbproxy1002 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [22:02:53] RECOVERY - puppet last run on db1029 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 
0 failures [22:02:54] RECOVERY - puppet last run on searchidx1001 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [22:02:54] RECOVERY - puppet last run on mw1067 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [22:02:54] RECOVERY - puppet last run on cp3006 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:02:55] RECOVERY - puppet last run on es1001 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:02:55] RECOVERY - puppet last run on amssq59 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [22:02:56] RECOVERY - puppet last run on cp1047 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:02:56] RECOVERY - puppet last run on analytics1017 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [22:03:04] RECOVERY - puppet last run on mc1016 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [22:03:04] RECOVERY - puppet last run on cp3015 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [22:03:05] RECOVERY - puppet last run on neptunium is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [22:03:05] RECOVERY - puppet last run on mw1041 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [22:03:05] RECOVERY - puppet last run on mw1007 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [22:03:05] RECOVERY - puppet last run on elastic1004 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [22:03:05] RECOVERY - puppet last run on wtp1009 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [22:03:15] RECOVERY - puppet last run on mw1045 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [22:03:24] RECOVERY - puppet last run on 
mw1141 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [22:03:24] RECOVERY - puppet last run on ms-be2003 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [22:03:24] RECOVERY - puppet last run on lvs4002 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:03:25] RECOVERY - puppet last run on ssl1002 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [22:03:25] RECOVERY - puppet last run on mw1012 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [22:03:25] RECOVERY - puppet last run on mw1028 is OK: OK: Puppet is currently enabled, last run 63 seconds ago with 0 failures [22:03:25] RECOVERY - puppet last run on cp4006 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:03:26] RECOVERY - puppet last run on mw1006 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [22:03:26] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 70 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on es1008 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on mw1082 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on analytics1040 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on db2019 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on rbf1002 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [22:03:37] RECOVERY - puppet last run on mc1006 is OK: OK: Puppet is currently enabled, last run 23 
seconds ago with 0 failures [22:03:38] RECOVERY - puppet last run on analytics1041 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [22:03:38] RECOVERY - puppet last run on elastic1001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [22:03:39] RECOVERY - puppet last run on cp1039 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [22:03:39] RECOVERY - puppet last run on amssq49 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [22:03:40] RECOVERY - puppet last run on ms-be1003 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [22:03:45] RECOVERY - puppet last run on gold is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:03:45] RECOVERY - puppet last run on platinum is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:03:45] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:03:45] RECOVERY - puppet last run on mw1059 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [22:03:45] RECOVERY - puppet last run on mw1187 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [22:03:46] RECOVERY - puppet last run on db1031 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [22:03:46] RECOVERY - puppet last run on search1016 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [22:03:47] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 62 seconds ago with 0 failures [22:03:47] RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:03:48] RECOVERY - puppet last run on wtp1006 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [22:03:48] RECOVERY - puppet 
last run on mw1140 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [22:03:49] RECOVERY - puppet last run on mw1106 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:03:49] RECOVERY - puppet last run on ms-be1006 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:03:50] RECOVERY - puppet last run on cerium is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:03:50] RECOVERY - puppet last run on analytics1020 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [22:03:55] RECOVERY - puppet last run on lead is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on db2034 is OK: OK: Puppet is currently enabled, last run 65 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on mw1145 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on wtp1020 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on mw1160 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [22:03:56] RECOVERY - puppet last run on mw1200 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [22:03:57] RECOVERY - puppet last run on mw1174 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [22:03:57] RECOVERY - puppet last run on ms-fe2004 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:04:06] RECOVERY - puppet last run on analytics1025 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:04:07] RECOVERY - puppet last run on potassium is OK: OK: Puppet is currently enabled, 
last run 48 seconds ago with 0 failures [22:04:07] RECOVERY - puppet last run on xenon is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on cp1055 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on snapshot1003 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on db2005 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on db2009 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on wtp1016 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:04:16] RECOVERY - puppet last run on helium is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [22:04:17] RECOVERY - puppet last run on db1073 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [22:04:18] RECOVERY - puppet last run on ms-be2006 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [22:04:25] RECOVERY - puppet last run on elastic1007 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [22:04:25] RECOVERY - puppet last run on mw1150 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [22:04:35] RECOVERY - puppet last run on sca1002 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:04:35] RECOVERY - puppet last run on lvs1002 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [22:04:35] RECOVERY - puppet last run on amssq53 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [22:04:35] RECOVERY - puppet last run on mw1009 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [22:04:35] RECOVERY - 
puppet last run on ms-be2004 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [22:04:36] RECOVERY - puppet last run on ms-fe1001 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [22:04:36] RECOVERY - puppet last run on search1010 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [22:04:37] RECOVERY - puppet last run on mw1100 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [22:04:37] RECOVERY - puppet last run on amssq32 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [22:04:38] RECOVERY - puppet last run on db1066 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:04:38] RECOVERY - puppet last run on db1050 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [22:04:50] RECOVERY - puppet last run on mw1003 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [22:04:55] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [22:04:55] RECOVERY - puppet last run on labsdb1003 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [22:04:56] RECOVERY - puppet last run on labstore1003 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [22:04:56] RECOVERY - puppet last run on mw1046 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:04:56] RECOVERY - puppet last run on elastic1008 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [22:04:57] RECOVERY - puppet last run on mc1003 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:05:05] RECOVERY - puppet last run on mw1026 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [22:05:07] RECOVERY - puppet last run on lvs1005 is OK: OK: Puppet is currently 
enabled, last run 36 seconds ago with 0 failures [22:05:07] RECOVERY - puppet last run on db2002 is OK: OK: Puppet is currently enabled, last run 64 seconds ago with 0 failures [22:05:08] RECOVERY - puppet last run on mw1069 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:05:16] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on mw1189 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on mw1120 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on analytics1035 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on mw1164 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [22:05:17] RECOVERY - puppet last run on mw1060 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [22:05:18] RECOVERY - puppet last run on lvs3001 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [22:05:25] RECOVERY - puppet last run on elastic1012 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [22:05:38] RECOVERY - puppet last run on mw1099 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [22:05:38] RECOVERY - puppet last run on mw1217 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [22:05:38] RECOVERY - puppet last run on db1018 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [22:05:38] RECOVERY - puppet last run on cp3016 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:05:39] 
RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [22:05:39] RECOVERY - puppet last run on mw1092 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [22:05:39] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [22:05:45] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:05:50] RECOVERY - puppet last run on mc1002 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [22:05:50] RECOVERY - puppet last run on mw1068 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [22:05:55] RECOVERY - puppet last run on mw1117 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:05:55] RECOVERY - puppet last run on mw1153 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [22:05:55] RECOVERY - puppet last run on mw1088 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [22:05:55] RECOVERY - puppet last run on mw1176 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [22:05:56] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 68 seconds ago with 0 failures [22:05:56] RECOVERY - puppet last run on mw1205 is OK: OK: Puppet is currently enabled, last run 67 seconds ago with 0 failures [22:05:56] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:05:57] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [22:06:05] RECOVERY - puppet last run on db1015 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:06:05] RECOVERY - puppet last run on search1018 is OK: OK: Puppet is currently enabled, 
last run 39 seconds ago with 0 failures [22:06:05] RECOVERY - puppet last run on mw1061 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [22:06:06] RECOVERY - puppet last run on cp1056 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [22:06:06] RECOVERY - puppet last run on db1002 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [22:06:06] RECOVERY - puppet last run on mw1065 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [22:06:06] RECOVERY - puppet last run on analytics1030 is OK: OK: Puppet is currently enabled, last run 61 seconds ago with 0 failures [22:06:07] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [22:06:15] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [22:06:16] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:06:22] RECOVERY - puppet last run on lvs2004 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [22:06:22] RECOVERY - puppet last run on ms-fe1004 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [22:06:22] RECOVERY - puppet last run on logstash1002 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [22:06:22] RECOVERY - puppet last run on dbproxy1001 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [22:06:29] RECOVERY - puppet last run on mw1211 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures [22:06:29] RECOVERY - puppet last run on ms-fe2001 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [22:06:29] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [22:06:29] 
RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 63 seconds ago with 0 failures [22:06:35] RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [22:06:45] RECOVERY - puppet last run on mw1039 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [22:06:45] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:06:45] RECOVERY - puppet last run on mw1052 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [22:06:45] RECOVERY - puppet last run on labcontrol2001 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [22:06:45] RECOVERY - puppet last run on ms-fe2003 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:06:46] RECOVERY - puppet last run on db1067 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [22:06:46] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [22:06:47] RECOVERY - puppet last run on mw1177 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [22:06:47] RECOVERY - puppet last run on amssq60 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [22:06:55] RECOVERY - puppet last run on db1051 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:06:55] RECOVERY - puppet last run on db1028 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [22:06:58] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [22:06:59] RECOVERY - puppet last run on amssq55 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [22:06:59] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is 
currently enabled, last run 43 seconds ago with 0 failures [22:07:05] RECOVERY - puppet last run on db1034 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [22:07:05] RECOVERY - puppet last run on analytics1038 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [22:07:05] RECOVERY - puppet last run on mw1129 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on mw1213 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on mw1175 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 65 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on lvs2001 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on db2036 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [22:07:16] RECOVERY - puppet last run on ruthenium is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [22:07:17] RECOVERY - puppet last run on amslvs1 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [22:07:17] RECOVERY - puppet last run on amssq47 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [22:07:18] RECOVERY - puppet last run on mw1172 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [22:07:18] RECOVERY - puppet last run on db1021 is OK: OK: Puppet is currently enabled, last run 62 seconds ago with 0 failures [22:07:19] RECOVERY - puppet last run on db1003 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [22:07:19] 
RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 63 seconds ago with 0 failures
[22:07:20] RECOVERY - puppet last run on db1042 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[22:07:20] RECOVERY - puppet last run on polonium is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[22:07:25] RECOVERY - puppet last run on mw1208 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures
[22:07:25] RECOVERY - puppet last run on wtp1005 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[22:07:25] RECOVERY - puppet last run on mw1126 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[22:07:25] RECOVERY - puppet last run on db1052 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[22:07:26] RECOVERY - puppet last run on db2007 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[22:07:26] RECOVERY - puppet last run on mw1114 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[22:07:26] RECOVERY - puppet last run on es1007 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[22:07:27] RECOVERY - puppet last run on search1007 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[22:07:27] RECOVERY - puppet last run on db2038 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[22:07:28] RECOVERY - puppet last run on mw1054 is OK: OK: Puppet is currently enabled, last run 66 seconds ago with 0 failures
[22:07:28] RECOVERY - puppet last run on analytics1010 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[22:07:35] RECOVERY - puppet last run on mw1162 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[22:07:35] RECOVERY - puppet last run on mw1011 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[22:07:36] RECOVERY - puppet last run on mw1002 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[22:07:36] RECOVERY - puppet last run on db1016 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures
[22:07:36] RECOVERY - puppet last run on cp4019 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[22:07:36] RECOVERY - puppet last run on cp4004 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[22:07:37] RECOVERY - puppet last run on labstore1001 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[22:07:37] RECOVERY - puppet last run on cp4014 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[22:07:37] RECOVERY - puppet last run on db2029 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[22:07:45] RECOVERY - puppet last run on mw1195 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[22:07:45] RECOVERY - puppet last run on amssq48 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[22:07:46] RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 64 seconds ago with 0 failures
[22:07:46] RECOVERY - puppet last run on pc1002 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[22:07:55] RECOVERY - puppet last run on analytics1016 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[22:07:58] RECOVERY - puppet last run on mc1012 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[22:07:58] RECOVERY - puppet last run on lithium is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[22:07:58] RECOVERY - puppet last run on cp3010 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[22:07:58] RECOVERY - puppet last run on amssq46 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[22:07:59] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[22:08:06] RECOVERY - puppet last run on wtp1012 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[22:08:09] RECOVERY - puppet last run on mw1149 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[22:08:10] RECOVERY - puppet last run on antimony is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[22:08:10] RECOVERY - puppet last run on virt1003 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[22:08:10] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[22:08:15] RECOVERY - puppet last run on mw1190 is OK: OK: Puppet is currently enabled, last run 1 seconds ago with 0 failures
[22:08:15] RECOVERY - puppet last run on mw1044 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[22:08:15] RECOVERY - puppet last run on mw1076 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[22:08:15] RECOVERY - puppet last run on labnet1001 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures
[22:08:15] RECOVERY - puppet last run on dataset1001 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[22:08:26] RECOVERY - puppet last run on mc1014 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[22:08:26] RECOVERY - puppet last run on db1043 is OK: OK: Puppet is currently enabled, last run 67 seconds ago with 0 failures
[22:08:34] RECOVERY - puppet last run on snapshot1001 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[22:08:45] RECOVERY - puppet last run on db1060 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[22:08:46] RECOVERY - puppet last run on ms-be2011 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[22:08:50] RECOVERY - puppet last run on install2001 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[22:08:50] RECOVERY - puppet last run on cp4001 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[22:08:50] RECOVERY - puppet last run on mw1206 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[22:08:50] RECOVERY - puppet last run on virt1001 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[22:08:51] RECOVERY - puppet last run on elastic1006 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[22:08:51] RECOVERY - puppet last run on db1036 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[22:09:05] RECOVERY - puppet last run on db1026 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[22:09:05] RECOVERY - puppet last run on mw1051 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[22:09:05] RECOVERY - puppet last run on plutonium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[22:09:05] RECOVERY - puppet last run on amssq34 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[22:09:06] RECOVERY - puppet last run on ms-be3002 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[22:09:06] RECOVERY - puppet last run on amssq36 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[22:09:06] RECOVERY - puppet last run on mw1014 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[22:09:07] RECOVERY - puppet last run on mw1055 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[22:09:16] RECOVERY - puppet last run on amssq51 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[22:09:17] RECOVERY - puppet last run on lvs2006 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[22:09:35] RECOVERY - puppet last run on labmon1001 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[22:09:36] RECOVERY - puppet last run on search1002 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[22:09:36] RECOVERY - puppet last run on mw1180 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[22:09:55] RECOVERY - puppet last run on mw1202 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[22:10:10] RECOVERY - puppet last run on cp4005 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures
[22:17:30] !log ran update-rc.d -f puppetmaster remove on palladium/strontium
[22:17:38] Logged the message, Master
[22:40:08] (03PS1) 10Alexandros Kosiaris: puppetmaster's pid ensured absent [puppet] - 10https://gerrit.wikimedia.org/r/166516