[00:00:04] RoanKattouw, ^d, marktraceur, jackmcbarn: Dear anthropoid, the time has come. Please deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20150109T0000). [00:02:03] hey guys, I'm preparing a change to swat [00:03:23] * jackmcbarn waits anxiously [00:03:29] (CR) Dzahn: [C: 2] Duplicate -qa notifcations to -releng [puppet] - https://gerrit.wikimedia.org/r/183382 (https://phabricator.wikimedia.org/T86053) (owner: Hashar) [00:04:33] * legoktm is here [00:06:51] Anyone volunteering for the SWAT deploys today? [00:07:28] marktraceur: ^d: swat swat? [00:07:34] I think maybe OuKB is on it [00:07:45] i'm not a swatter anymore [00:07:57] I never remember the OuKB nick [00:07:59] * ^d gives OuKB the "honorary swatter for a day" cap [00:09:39] so......who's doing it? :P [00:10:54] I guess I can do it... [00:11:32] cool [00:12:10] !swat add https://gerrit.wikimedia.org/r/#/c/180451/ [00:12:32] legoktm: OuKB is putting together a last minute submodule update for MF. Should be ready in a sec... [00:12:42] ok, I'll start with jackmcbarn [00:13:57] legoktm: i should warn you that jenkins tests for scribunto are broken right now [00:14:00] oh [00:17:04] (PS5) Dzahn: cache: install the planet SSL cert on misc-web [puppet] - https://gerrit.wikimedia.org/r/181415 (https://phabricator.wikimedia.org/T60048) [00:17:10] legoktm, done. requires a scap though:( [00:17:52] hmm, didn't we usually keep all the gerrit votes when a new PS was simply rebasing via gerrit button [00:18:59] :| [00:21:28] you say scap was an issue because of file permissions? [00:21:39] I think that was fixed [00:21:41] if so, i ran the fix as requested by Reedy [00:22:06] it just makes legoktm stay at his keyboard for 30 minutes longer than he'd like. [00:23:09] if there's a problem i can do it [00:23:14] ARE YOU SERIOUS JENKINS [00:23:29] what did jenkins do? 
[00:23:30] it ran the tests on top of some Scribunto change which failed [00:23:39] and now it's re-doing all the mw core cherry-picks [00:23:49] eww [00:24:22] blame zuul, not jenkins. poor mr jenkins is just a pawn in zuul's evil plan to control the universe [00:24:30] ARE YOU SERIOUS ZUUL [00:24:47] better :) [00:24:56] the whole root cause of this mess was that a core change broke unit tests in scribunto, it was able to be merged anyway, and now we can't agree on how to fix them, so they're still broken [00:25:40] core unfair to extensions! [00:25:59] maybe jenkins should include all wmf-deployed extensions when it's testing core changes [00:26:24] thats already being worked on afaik [00:26:35] good [00:27:01] and it will only make a merge take like 45 minutes :( [00:27:12] !log legoktm Synchronized php-1.25wmf14/extensions/Scribunto/engines/LuaCommon/lualib/mw.title.lua: SWAT: https://gerrit.wikimedia.org/r/#/c/183552/ (duration: 00m 06s) [00:27:16] Logged the message, Master [00:27:31] But yes, hashar is working towards full suite testing before merging [00:29:19] !log legoktm Synchronized php-1.25wmf13/extensions/Scribunto/engines/LuaCommon/lualib/mw.title.lua: SWAT: https://gerrit.wikimedia.org/r/#/c/183552/ (duration: 00m 07s) [00:29:22] Logged the message, Master [00:29:23] jackmcbarn: ^ [00:29:27] testing [00:29:49] didn't work [00:30:11] uhh [00:30:27] well I'm pretty sure I deployed it right :P [00:30:44] * OuKB looks for mushroom clouds [00:32:17] the fix works in my environment, and it worked in anomie's, and it worked on the beta cluster. it just doesn't work in production, where you just deployed it to [00:32:58] should we revert it? [00:33:04] can you look at one server as a sanity check and make sure the fix is there? [00:33:08] i don't think so [00:33:42] my guess is the code change isn't active [00:34:11] hmm [00:34:16] it didn't get deployed [00:34:23] any idea why not? [00:35:00] legoktm: submodule update... 
:P [00:35:25] (03CR) 10Dzahn: [C: 032] cache: install the planet SSL cert on misc-web [puppet] - 10https://gerrit.wikimedia.org/r/181415 (https://phabricator.wikimedia.org/T60048) (owner: 10Dzahn) [00:35:40] ......yup >.< [00:36:18] reason 1087 that submodules suck [00:36:30] !log legoktm Synchronized php-1.25wmf14/extensions/Scribunto/engines/LuaCommon/lualib/mw.title.lua: SWAT: https://gerrit.wikimedia.org/r/#/c/183552/ again (duration: 00m 06s) [00:36:38] jackmcbarn: try now and then I'll do 13? [00:36:39] bd808, we should've stayed on svn [00:36:50] works on 14 now [00:37:39] !log legoktm Synchronized php-1.25wmf13/extensions/Scribunto/engines/LuaCommon/lualib/mw.title.lua: SWAT: https://gerrit.wikimedia.org/r/#/c/183552/ again (duration: 00m 07s) [00:37:47] jackmcbarn: ^ [00:37:51] ok, 13 works now too [00:38:01] woot. [00:38:07] thanks! [00:40:18] !log legoktm Synchronized php-1.25wmf14/extensions/CentralAuth/: SWAT: https://gerrit.wikimedia.org/r/#/c/183554/ (duration: 00m 06s) [00:40:21] Logged the message, Master [00:40:57] legoktm: tested [00:41:49] (03PS4) 10Dzahn: planets: add Varnish statement [puppet] - 10https://gerrit.wikimedia.org/r/181419 (https://phabricator.wikimedia.org/T60048) (owner: 10John F. 
Lewis) [00:41:59] !log legoktm Synchronized php-1.25wmf13/extensions/CentralAuth/: SWAT: https://gerrit.wikimedia.org/r/#/c/183554/ (duration: 00m 06s) [00:42:02] Logged the message, Master [00:43:02] (03PS5) 10Legoktm: Added ang.wikibooks and ie.wikibooks to closed.dblist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/180451 (https://phabricator.wikimedia.org/T78667) (owner: 10Dzahn) [00:43:11] (03CR) 10Legoktm: [C: 032] Added ang.wikibooks and ie.wikibooks to closed.dblist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/180451 (https://phabricator.wikimedia.org/T78667) (owner: 10Dzahn) [00:44:32] * legoktm waits [00:44:50] (03CR) 10Legoktm: [V: 032] Added ang.wikibooks and ie.wikibooks to closed.dblist [mediawiki-config] - 10https://gerrit.wikimedia.org/r/180451 (https://phabricator.wikimedia.org/T78667) (owner: 10Dzahn) [00:45:10] legoktm: :)) [00:45:27] the "!swat add" trigger worked. amazing [00:45:51] !log legoktm Synchronized closed.dblist: SWAT: https://gerrit.wikimedia.org/r/#/c/180451/ (duration: 00m 08s) [00:45:54] Logged the message, Master [00:46:16] !log legoktm Synchronized wmf-config/InitialiseSettings.php: SWAT: https://gerrit.wikimedia.org/r/#/c/180451/ (duration: 00m 06s) [00:46:19] Logged the message, Master [00:46:49] mutante: all deployed [00:47:30] legoktm: thanks, i wonder how to confirm it's closed [00:47:39] mutante: I tried editing something and it told me I couldn't :P [00:48:28] great :) same here [00:48:38] "This wiki has been locked" . 
resolving [00:50:08] CONFLICT (content): Merge conflict in .gitreview [00:50:09] wtf [00:51:43] PROBLEM - MySQL Replication Heartbeat on db1016 is CRITICAL: CRIT replication delay 324 seconds [00:51:51] PROBLEM - MySQL Slave Delay on db1016 is CRITICAL: CRIT replication delay 329 seconds [00:52:52] RECOVERY - MySQL Replication Heartbeat on db1016 is OK: OK replication delay -1 seconds [00:53:02] RECOVERY - MySQL Slave Delay on db1016 is OK: OK replication delay 0 seconds [00:53:28] kaldari: OuKB: what's up with https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/wmf/1.25wmf13/.gitreview ?? [00:53:32] -1 seconds is nice too [00:53:48] !log Updated the Wikidata property suggester with data from Monday's JSON dump [00:53:50] Logged the message, Master [00:54:05] ergh [00:54:48] after a bunch of cherrypicks, we just recreated wmf13 from wmf14:P [00:55:16] it doesn't submodule update cleanly :/ [00:55:33] lemme take a look [00:55:50] it has a rebase conflict in the .gitreview [00:56:19] (03CR) 10Dzahn: [C: 032] planets: add Varnish statement [puppet] - 10https://gerrit.wikimedia.org/r/181419 (https://phabricator.wikimedia.org/T60048) (owner: 10John F. Lewis) [00:57:17] should I just fix that manually? [00:58:08] OuKB is fixing… [00:58:46] done for wmf13. i can just finish the deployment myself [00:59:52] legoktm, ^^^ [01:00:16] ergh [01:00:18] The authenticity of host '[gerrit.wikimedia.org]:29418 ([2620:0:861:3:208:80:154:81]:29418)' can't be established. [01:00:19] RSA key fingerprint is dc:e9:68:7b:99:1b:27:d0:f9:fd:ce:6a:2e:bf:92:e1. [01:00:22] OuKB: see _security [01:01:05] eh, wasn't in it [01:07:49] !log maxsem Started scap: SWAT: MobileFrontend and WikiGrok updates [01:07:55] Logged the message, Master [01:10:11] JohnFLewis: hello? 
wondering where you put that BZ dump file and how large it is [01:15:42] PROBLEM - puppet last run on wtp1007 is CRITICAL: CRITICAL: Puppet has 1 failures [01:23:02] RECOVERY - puppet last run on wtp1007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [01:25:26] !log maxsem Finished scap: SWAT: MobileFrontend and WikiGrok updates (duration: 17m 36s) [01:25:30] Logged the message, Master [01:42:48] (03PS1) 10Dzahn: bugzilla: add Apache site for static BZ version [puppet] - 10https://gerrit.wikimedia.org/r/183758 (https://phabricator.wikimedia.org/T85140) [01:45:03] (03PS1) 10Dzahn: bugzilla: add varnish config for static-bugzilla [puppet] - 10https://gerrit.wikimedia.org/r/183759 (https://phabricator.wikimedia.org/T85140) [01:49:40] !log maxsem Synchronized php-1.25wmf13/extensions/MobileFrontend: touch (duration: 00m 09s) [01:49:47] Logged the message, Master [01:57:04] (03PS1) 10Dzahn: add static-bugzilla name, point to misc-web [dns] - 10https://gerrit.wikimedia.org/r/183760 [02:02:09] (03CR) 10Dzahn: "re: 'no reason to be redirected to HTTPS'. i think the default should be using it and we should have a reason not to." [puppet] - 10https://gerrit.wikimedia.org/r/181949 (owner: 10Hoo man) [02:07:02] (03CR) 10Dzahn: "what kind of notification do you expect from this? is it about IRC or mail or even paging? also: does this really have to be modules/beta" [puppet] - 10https://gerrit.wikimedia.org/r/183454 (https://phabricator.wikimedia.org/T54867) (owner: 10Hashar) [02:07:14] legoktm: You're a SWATter now!? [02:09:52] Looks like our MobileFrontend deployment resulted in a minor breakage, so we’re about to deploy a follow-up fix. [02:14:43] marktraceur: somehow [02:15:26] (03PS1) 10Springle: depool db1003 db1005 db1006 db1009. repool db1050 in s6, db1015 in s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183764 [02:17:02] (03CR) 10Springle: [C: 032] depool db1003 db1005 db1006 db1009. 
repool db1050 in s6, db1015 in s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183764 (owner: 10Springle) [02:17:06] (03Merged) 10jenkins-bot: depool db1003 db1005 db1006 db1009. repool db1050 in s6, db1015 in s3 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183764 (owner: 10Springle) [02:18:19] !log l10nupdate Synchronized php-1.25wmf13/cache/l10n: (no message) (duration: 00m 01s) [02:18:23] !log LocalisationUpdate completed (1.25wmf13) at 2015-01-09 02:18:23+00:00 [02:18:29] Logged the message, Master [02:18:32] Logged the message, Master [02:21:12] (03CR) 10Dzahn: "you all don't care because it's planet, right" [puppet] - 10https://gerrit.wikimedia.org/r/183007 (https://phabricator.wikimedia.org/T47806) (owner: 10Dzahn) [02:24:11] !log l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 01s) [02:24:14] Logged the message, Master [02:24:15] !log LocalisationUpdate completed (1.25wmf14) at 2015-01-09 02:24:15+00:00 [02:24:18] Logged the message, Master [02:27:31] 3Wikimedia-Git-or-Gerrit, Code-Review, operations: Chrome warns about insecure certificate on gerrit.wikimedia.org - https://phabricator.wikimedia.org/T76562#965081 (10Krinkle) >>! In T76562#939750, @Dzahn wrote: >>>! In T76562#846225, @Seb35 wrote: >> The certificate *.wikimedia.org used on phabricator.wikimedi... [02:30:25] !log springle Synchronized wmf-config/db-eqiad.php: depool db1003 db1005 db1006 db1009. repool db1050 in s6, db1015 in s3 (duration: 00m 06s) [02:30:28] Logged the message, Master [02:38:17] 3ops-requests, operations: Package for mobile jobs (androidsdk, libdclass) missing in Trusty - https://phabricator.wikimedia.org/T84164#965087 (10Krinkle) Continuous integration slaves running Ubuntu Trusty exist at all (and aren't failing) because we couldn't wait 6 months on this issue in order to be able to g... 
[02:42:01] !log maxsem Synchronized php-1.25wmf13/extensions/Mantle: (no message) (duration: 00m 05s) [02:42:06] Logged the message, Master [02:42:16] !log maxsem Synchronized php-1.25wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 07s) [02:42:18] Logged the message, Master [02:42:30] !log maxsem Synchronized php-1.25wmf14/extensions/MobileFrontend/: (no message) (duration: 00m 06s) [02:42:32] Logged the message, Master [02:49:47] !log maxsem Synchronized php-1.25wmf13/extensions/Mantle: (no message) (duration: 00m 07s) [02:49:49] Logged the message, Master [02:50:02] !log maxsem Synchronized php-1.25wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 07s) [03:01:46] !log Running mwscript extensions/WikiGrok/maintenance/refreshCampaigns.php --wiki=enwiki --version=1 in screen session on terbium, feel free to kill if causes problems [03:01:48] Logged the message, Master [03:05:29] !log upgrade db1016 trusty [03:05:32] Logged the message, Master [04:16:42] (03PS1) 10Springle: upgrade db1016 to trusty and mariadb 10 [puppet] - 10https://gerrit.wikimedia.org/r/183771 [04:17:34] (03CR) 10Springle: [C: 032] upgrade db1016 to trusty and mariadb 10 [puppet] - 10https://gerrit.wikimedia.org/r/183771 (owner: 10Springle) [04:32:31] !log LocalisationUpdate ResourceLoader cache refresh completed at Fri Jan 9 04:32:31 UTC 2015 (duration 32m 30s) [04:32:35] Logged the message, Master [05:25:26] (03CR) 10KartikMistry: [C: 031] Add job to crunch Language team data [puppet] - 10https://gerrit.wikimedia.org/r/183734 (owner: 10Milimetric) [05:29:53] (03PS1) 10Ori.livneh: memcached: set server address to localhost rather than 127.0.0.1 on mw123* [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183774 [05:30:37] (03CR) 10Ori.livneh: [C: 032] memcached: set server address to localhost rather than 127.0.0.1 on mw123* [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183774 (owner: 10Ori.livneh) [05:30:41] (03Merged) 10jenkins-bot: memcached: set server 
address to localhost rather than 127.0.0.1 on mw123* [mediawiki-config] - https://gerrit.wikimedia.org/r/183774 (owner: Ori.livneh) [05:31:59] !log ori Synchronized wmf-config/mc.php: I33ff81e6a: memcached: set server address to localhost rather than 127.0.0.1 on mw123* (duration: 00m 05s) [05:32:02] Logged the message, Master [05:36:09] !log repooled mw123[12] [05:36:14] Logged the message, Master [05:40:33] good morning [05:40:47] good morning [05:41:24] I... don't understand your change above [05:42:45] tcp6 0 0 10.2.2.22:80 10.64.0.216:49279 TIME_WAIT - [05:42:48] tcp6 0 0 127.0.0.1:9000 127.0.0.1:53645 TIME_WAIT - [05:42:51] tcp6 0 0 127.0.0.1:9000 127.0.0.1:51177 TIME_WAIT - [05:42:54] there's tons of that [05:42:55] that's why ephemeral ports are getting starved [05:43:10] root@mw1229:~# netstat -nap |grep -c :9000 [05:43:10] 11588 [05:43:36] that's fastcgi, obviously [05:43:55] do we really have that many qps per box? [05:44:14] only on the machines with weight=20 in pybal [05:44:48] if you connect to 127.0.0.1, all the connections go to 127.0.0.1. if you connect to localhost, half go to ::1, and the other half to 127.0.0.1 [05:45:08] half? 
[05:45:10] that's weird [05:46:41] the unix domain socket solution was nice, but the socket is not group- or world-writable, so it's not usable for anyone except the nutcracker user [05:47:02] ephemeral ports we can fix with various ways [05:47:13] yes, sysctl params to reap connections [05:47:20] tw_recycle [05:47:25] or adjusting local port range [05:47:30] increasing it in size [05:48:10] I thought the unix domain socket approach was for performance reasons, not this problem [05:48:33] it was a two-for-one deal [05:49:38] updating the init script so that it calls start-stop-daemon with --umask=$DAEMON_UMASK would be nice [05:50:03] (with a chance to specify $DAEMON_UMASK in the defaults file) [05:50:27] I'd rather fix nutcracker to have an option for user/group/mode [05:50:44] but that may be an okay temporary solution [05:52:04] esp. since in our brave new world of systemd and upstart there are no defaults files :) [05:53:09] * ori nods [05:56:25] (03CR) 10Smalyshev: [C: 031] Temporarily add Elasticsearch to einsteinium [puppet] - 10https://gerrit.wikimedia.org/r/181612 (owner: 10Manybubbles) [06:25:12] (03PS2) 10Yuvipanda: beta: monitor mobile main page [puppet] - 10https://gerrit.wikimedia.org/r/183454 (https://phabricator.wikimedia.org/T54867) (owner: 10Hashar) [06:25:32] (03CR) 10Yuvipanda: [C: 032] beta: monitor mobile main page [puppet] - 10https://gerrit.wikimedia.org/r/183454 (https://phabricator.wikimedia.org/T54867) (owner: 10Hashar) [06:28:35] MaxSem: where’s the new Java service you are working on? 
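The port-starvation reasoning in the 05:42 to 05:47 exchange can be sketched with a little arithmetic. This is an illustration under stated assumptions, not something taken from the log: the range 32768-61000 is assumed to be the value of `net.ipv4.ip_local_port_range` on the hosts, and `max_connections` is a hypothetical helper. A client talking to one fixed destination address and port can hold at most one connection (established or in TIME_WAIT) per local ephemeral port, so spreading connections over both 127.0.0.1 and ::1, or widening the local port range, raises that ceiling.

```python
# Assumption: the local ephemeral port range on the host
# (net.ipv4.ip_local_port_range); the log does not state the actual value.
ASSUMED_PORT_RANGE = (32768, 61000)


def max_connections(destinations, port_range=ASSUMED_PORT_RANGE):
    """Hypothetical helper: upper bound on simultaneous client connections
    (established + TIME_WAIT) to one port on the given destination addresses."""
    low, high = port_range
    ports = high - low + 1
    # A connection is identified by its full address/port 4-tuple, so each
    # distinct destination address gets its own space of local ports.
    return ports * len(set(destinations))


only_v4 = max_connections(["127.0.0.1"])
both = max_connections(["127.0.0.1", "::1"])
wider = max_connections(["127.0.0.1"], port_range=(1024, 65535))
print(only_v4, both, wider)
```

Under these assumptions, the 11588 sockets counted on mw1229 would already take up a sizeable share of the single-address ceiling.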
[06:28:56] PROBLEM - puppet last run on elastic1030 is CRITICAL: CRITICAL: Puppet has 1 failures [06:29:09] https://gerrit.wikimedia.org/r/#/c/178970/ YuviPanda|zzz [06:29:16] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: Puppet has 3 failures [06:29:45] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 3 failures [06:29:56] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:06] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:16] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: Puppet has 1 failures [06:30:25] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: Puppet has 2 failures [06:30:36] MaxSem: I shall also poke around a little and help, I suppose. Haven’t written any Java in a few months. [06:31:06] thanks :P [06:31:19] <_joe_> you said java [06:31:26] <_joe_> without "sucks" [06:31:37] <_joe_> you're ops now, you're not allowed to do that [06:31:53] <_joe_> I'm pretty sure is one of the clauses you sign in all ops contracts [06:32:32] _joe_: I dunno, PHP vs Java I will probably pick Java. [06:33:26] _joe_: at least as a language to write in, that is :) [06:33:45] <_joe_> ori: can't we make twemproxy run as the apache user then? [06:33:57] <_joe_> I'd like to use the socket, it's surely a perf gain [06:33:58] PHP isn't a language, it's an abomination. [06:34:08] * Deskana has written a patch in core recently and it was painful. 
[06:34:35] Deskana: it’s slightly more usable with PHPStorm [06:34:44] of course, Java is unusable without IntelliJ [06:34:50] and without an IDE I’d much rather write PHP than Java [06:45:45] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [06:45:56] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [06:46:25] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:46:45] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [06:47:15] RECOVERY - puppet last run on elastic1030 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [06:47:16] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [06:47:26] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [06:50:57] godog: I found at least one host where diamond was running twice, as once as root, "/usr/bin/python /usr/bin/diamond -p /var/run/diamond.pid" [06:51:02] godog: ...and once as diamond "/usr/bin/python /usr/bin/diamond --foreground --skip-fork --log-stdout --skip-pidfile" [06:53:43] (03PS1) 10Yuvipanda: shinken: Make betalabs http checks specify address explicitly [puppet] - 10https://gerrit.wikimedia.org/r/183779 [07:00:10] springle: hi [07:00:35] <_joe_> ori: if using the socket is not an option, I'd use net.ipv4.tcp_tw_reuse rather than playing with localhost/127.0.0.1 [07:00:57] <_joe_> paravoid: morning [07:01:00] hi [07:02:23] paravoid: hey [07:02:29] hi :) [07:03:37] we're 100% mariadb nowadays, right? [07:04:24] yes, except for virt1001 which i think someone recently said is mysql 5.1 [07:04:59] lol :) [07:05:09] <_joe_> springle: do you think this can help you? 
https://logstash.wikimedia.org/#/dashboard/elasticsearch/hhvm_slow_timer [07:05:20] I'm cleaning up lucid cruft [07:05:32] and mysql-fb cruft [07:05:49] we have the mysql, mysql_wmf, coredb_mysql puppet modules and I'm kinda lost :) [07:07:04] modules/coredb_mysql/manifests/packages.pp seems especially crufty [07:07:05] _joe_: that's nice. won't help enormously as tendril does that, but having multiple sources could be handy in an outage or partition [07:08:08] paravoid: i'm ignoring coredb and migrating stuff away to the mariadb submodule. eventually mysql_wmf should go too, in favour of the generic mysql module for anything not production [07:09:01] so we'd have mysql module for whatever, misc, labs. mariadb for production with fewer hooks [07:09:24] paravoid: the day we delete coredb i'll let you click +2 if you like ;) [07:09:29] haha :) [07:09:53] still a long way to go, though, right? [07:10:26] about 50% there i guess [07:10:50] nod [07:10:52] https://tendril.wikimedia.org/host .. the 10.x boxes are mariadb, the 5.x are still coredb. roughly [07:11:22] poor you [07:11:29] why? :) [07:11:35] lots of work :) [07:11:47] yeah but not complex [07:12:03] avoiding refactoring coredb is the lazy option [07:12:56] YuviPanda: still here? [07:13:01] which timezone are you in? :) [07:13:09] YST? [07:13:25] paravoid: today, IST [07:13:34] YST being Yuvi Standard Time [07:13:38] it was a joke :) [07:13:42] :D [07:13:53] commit 220fb322e6c889800c81941aeee98477f95db231 is very wrong [07:14:02] paravoid: incidentally, if you're looking at lucid stuff in coredb, it can be removed. all precise or trusty afaik [07:14:25] springle: I was, but I thought of asking whether all the non-mariadb can go as well [07:14:39] springle: modules/coredb_mysql/manifests/packages.pp is branching twice for this [07:14:56] YuviPanda: so, I should explain [07:15:17] uh oh. 
yes, please do [07:16:02] when amd64 was introduced [07:16:12] (CR) Yuvipanda: [C: 2] shinken: Make betalabs http checks specify address explicitly [puppet] - https://gerrit.wikimedia.org/r/183779 (owner: Yuvipanda) [07:16:20] it all worked fine more or less, except people had the need of running purely 32-bit apps [07:16:24] mostly binary [07:16:29] such as skype, or wine [07:16:37] so ia32-libs was born [07:17:11] this was initially an ugly ass package that embedded in the source a bunch of "basic" libraries, mostly on a per-request basis [07:17:44] and shipped them to /usr/lib32, iirc [07:18:04] so /usr/lib32/libfoo.so.0, while /usr/lib/libfoo.so.0 had the 64-bit variant [07:18:08] (on Debian/Ubuntu systems, that is) [07:18:40] right. that much I’m aware of. [07:19:18] sometime after this lib32foo packages were introduced [07:19:25] for random packages [07:20:01] at some point, though [07:20:12] Ubuntu started working to solve this properly [07:20:16] (and the work later landed in Debian) [07:20:21] the proper solution was multiarch [07:20:45] in essence, packages now ship libraries only in locations such as /usr/lib/ [07:20:58] so, /usr/lib/x86_64-linux-gnu/ for amd64 [07:21:13] and on the same system, you can tell dpkg to "add" an architecture [07:21:28] so the same library from different architectures can coexist on the same system [07:21:54] so on my system for example [07:21:54] ii libc6:amd64 2.13-38+deb7u6 amd64 Embedded GNU C Library: Shared libraries [07:21:57] ii libc6:i386 2.13-38+deb7u6 i386 Embedded GNU C Library: Shared libraries [07:22:20] there's nothing restricting you to only amd64/i386, btw [07:22:30] you can do any other compatible combination, such as armhf/armel [07:22:40] or if you're feeling crazy, you can also do amd64/arm for example [07:22:51] by using qemu-user :) [07:23:02] that sounds nice and also terrible :) [07:23:06] dpkg & apt support this natively [07:23:23] so you can install a :i386 package and it will fetch 
its :i386 dependencies [07:23:27] so I suppose I should add i386 as an arch and then let apt install the i386 variants [07:23:33] rather than do the lib32* versions? [07:23:35] yeah [07:23:52] I'm guessing that what you saw was those :i386 packages not working in trusty [07:24:00] yeah, and did the simplest fix. [07:24:06] that was probably because the i386 architecture wasn't added on the trusty system [07:24:18] but now, I think we don’t need it on precise anymore. [07:24:30] does multiarch work on precise? [07:24:35] yeah [07:24:43] that's why the precise stanza works [07:24:57] and it’s added by default? [07:25:19] no [07:25:30] on precise, I mean? [07:25:39] it's not added by default on precise [07:25:56] hmm, so how did the :i386 work? do we have multiarch set up for precise somewhere? [07:26:12] it's added by default on saucy+, but we disable it because we don't really need it (almost :) anywhere, cf. f1b93ba4 [07:26:36] that's a good question [07:26:37] who knows... [07:27:01] so the class is included in two places [07:27:14] one is contint, but only under an "if precise" guard [07:27:16] the other is toollabs [07:27:34] yeah, and it’s actually used in neither atm. [07:27:37] lol [07:27:52] right now it’s on a self-hosted puppetmaster running https://gerrit.wikimedia.org/r/#/c/167198/ [07:28:03] and then I left the app team and didn’t finish that up [07:28:14] hah [07:28:45] paravoid: hmm, so I’m trying to figure out why that works on precise. 
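The multiarch layout described in the 07:20 to 07:23 messages can be illustrated by parsing the two `dpkg -l` rows quoted at 07:21. This is a minimal sketch, assuming the column order `status name[:arch] version arch description` seen in those rows; `parse_dpkg_line` and `architectures_by_package` are hypothetical helper names, not part of dpkg.

```python
from collections import defaultdict

# The two rows quoted in the log at 07:21.
DPKG_LINES = [
    "ii libc6:amd64 2.13-38+deb7u6 amd64 Embedded GNU C Library: Shared libraries",
    "ii libc6:i386 2.13-38+deb7u6 i386 Embedded GNU C Library: Shared libraries",
]


def parse_dpkg_line(line):
    """Split one `dpkg -l` row into (package, architecture, version)."""
    _status, name, version, arch, _description = line.split(None, 4)
    package = name.split(":", 1)[0]  # drop the ":arch" qualifier if present
    return package, arch, version


def architectures_by_package(lines):
    """Map each package name to the sorted architectures it is installed for."""
    result = defaultdict(set)
    for line in lines:
        package, arch, _version = parse_dpkg_line(line)
        result[package].add(arch)
    return {pkg: sorted(archs) for pkg, archs in result.items()}


installed = architectures_by_package(DPKG_LINES)
print(installed)
```

The same package name maps to two architectures, which is the coexistence that multiarch allows once the foreign architecture has been added.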
[07:33:11] dpkg --print-foreign-architectures [07:33:16] i386 [07:33:17] on precise [07:33:19] in labs [07:33:20] but not in prod [07:33:46] well, depends on what the labs bootstrap image scripts did [07:33:48] maybe the base images had it on / off [07:33:49] yeah [07:33:57] (PS1) Faidon Liambotis: monitoring: add icon for Debian hosts [puppet] - https://gerrit.wikimedia.org/r/183782 [07:33:59] (PS1) Faidon Liambotis: ipython: remove $::operatingsystem fail() check [puppet] - https://gerrit.wikimedia.org/r/183783 [07:34:01] (PS1) Faidon Liambotis: cpufrequtils: move Ubuntu stupidity under if guard [puppet] - https://gerrit.wikimedia.org/r/183784 [07:34:03] (PS1) Faidon Liambotis: diamond: don't disable IPVS by default [puppet] - https://gerrit.wikimedia.org/r/183785 [07:34:05] (PS1) Faidon Liambotis: install-server: drop lucid support for DHCP [puppet] - https://gerrit.wikimedia.org/r/183786 [07:34:25] (CR) Nikerabbit: Content Translation configuration for Production (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/181546 (owner: KartikMistry) [07:34:56] <_joe_> paravoid: IPVS was enabled by default on precise [07:35:04] comments say on trusty [07:35:26] <_joe_> well, I don't know about the comments, I remember the cronspam :) [07:35:34] and, well, the code has effect only on trusty [07:35:42] <_joe_> I didn't fix that myself, I just remember a flood of cronspam [07:35:57] <_joe_> then it's trusty, probably, and it may have been the old package? 
[07:36:03] <_joe_> I know godog repackaged it [07:36:03] maybe [07:36:06] yup [07:36:13] <_joe_> so it may be the reason it changed [07:36:34] (CR) Faidon Liambotis: [C: 2] monitoring: add icon for Debian hosts [puppet] - https://gerrit.wikimedia.org/r/183782 (owner: Faidon Liambotis) [07:36:49] (CR) Faidon Liambotis: [C: 2] ipython: remove $::operatingsystem fail() check [puppet] - https://gerrit.wikimedia.org/r/183783 (owner: Faidon Liambotis) [07:37:02] <_joe_> I was seriously thinking of repackaging apache2 [07:37:04] <_joe_> for trusty [07:37:13] (CR) Faidon Liambotis: [C: 2] cpufrequtils: move Ubuntu stupidity under if guard [puppet] - https://gerrit.wikimedia.org/r/183784 (owner: Faidon Liambotis) [07:37:43] (PS2) Faidon Liambotis: install-server: drop lucid support for DHCP [puppet] - https://gerrit.wikimedia.org/r/183786 [07:37:45] (PS2) Faidon Liambotis: diamond: don't disable IPVS by default [puppet] - https://gerrit.wikimedia.org/r/183785 [07:37:54] why? [07:38:28] (CR) Faidon Liambotis: [C: 2] install-server: drop lucid support for DHCP [puppet] - https://gerrit.wikimedia.org/r/183786 (owner: Faidon Liambotis) [07:38:29] brbfood [07:38:45] <_joe_> because a) we could communicate with fastcgi via socket b) I can silence all those horrible (bogus) errors we see in the logs [07:39:07] <_joe_> c) a 304 would not return the whole response body to varnish like it does now [07:39:28] are you still planning to move to nginx at some point? 
[07:39:34] <_joe_> (the latter two are results of a patch we should apply anyway) [07:39:36] <_joe_> I am [07:39:43] if so, it may be a better investment of your time :P [07:39:45] <_joe_> not sure I'll have the time this semester [07:40:43] <_joe_> it seems like HHVM is in the "done" box of the scrum religion, so we don't get to spend too much time on it [07:41:12] <_joe_> (sorry, my sarcasm for goal-setting exercises keeps pouring out) [07:44:47] springle: so [07:44:50] paravoid: we can blow away both lucid and mysql-fb blocks [07:44:52] heh [07:44:57] nope [07:45:00] es1008.eqiad.wmnet [07:45:07] ii mysqlfb-server-5.1 5.1.53-fb3875-wm1 MySQL database server binaries and system database setup [07:45:10] ii mysqlfb-server-core-5.1 5.1.53-fb3875-wm1 MySQL database server binaries [07:45:35] the only one, apparently [07:45:51] oh that thing [07:45:59] yes sorry, my mistake [07:47:21] one single server away from being fb-free [07:47:27] amazing :) [07:47:52] paravoid: es2008 has a 10.x package installed as it was a external storage test case before migration. we can still remove lucid/mysql-fb and i'll complete the upgrade properly [07:48:18] es1008 [07:48:36] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There are 4 unmerged changes in puppet (dir /var/lib/git/operations/puppet). [07:48:36] that numbering is going to bite me someday [07:51:27] in fact, es1008 is already role::mariadb::core. i guess just the packages are orphans [07:51:39] oh! [07:51:44] it certainly is precise [07:58:39] (03PS1) 10Faidon Liambotis: coredb_mysql/mysql_wmf: remove MySQL-fb support [puppet] - 10https://gerrit.wikimedia.org/r/183789 [07:59:32] springle: ^ [08:00:18] (03CR) 10Springle: [C: 031] coredb_mysql/mysql_wmf: remove MySQL-fb support [puppet] - 10https://gerrit.wikimedia.org/r/183789 (owner: 10Faidon Liambotis) [08:00:24] just +1? 
:) [08:00:31] I'm too scared to merge that [08:00:39] i never know when to +2 for people who have +2 :) [08:00:46] (03CR) 10Springle: [C: 032] coredb_mysql/mysql_wmf: remove MySQL-fb support [puppet] - 10https://gerrit.wikimedia.org/r/183789 (owner: 10Faidon Liambotis) [08:01:41] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. [08:01:55] paravoid: we'll find out :) [08:02:31] breaking puppet in DBs isn't bad since mariadb isn't restarted automatically [08:03:55] paravoid: no-op on several manual puppet runs. should be fine [08:04:11] awesome [08:07:13] paravoid: I filed https://phabricator.wikimedia.org/T86294?workflow=create to look into it [08:07:49] YuviPanda: there's literally no other place we want i386 right now besides the androidsdk [08:07:56] so, don't waste time on it [08:08:09] paravoid: yeah, I’m fixing that separately. [08:08:11] <_joe_> is wikibugs gone? [08:08:22] <_joe_> I commented on an issue and it's not posting here [08:08:29] _joe_: which one? [08:08:33] maybe it's WMF-NDA? 
[08:08:38] <_joe_> https://phabricator.wikimedia.org/T83328 [08:08:45] <_joe_> oh right [08:08:48] <_joe_> that may be [08:08:51] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [08:08:53] yup [08:08:58] don't see why, though [08:09:06] 3ops-requests, operations: Configure twemproxy to bind a unix domain socket - https://phabricator.wikimedia.org/T83328#965351 (10Joe) [08:09:14] <_joe_> paravoid: because I forgot to check [08:09:15] :) [08:09:15] <_joe_> :) [08:09:28] weird that ori's comments aren't migrated to his account [08:10:05] <_joe_> yeah that too [08:10:29] <_joe_> https://gerrit.wikimedia.org/r/#/c/173639/ mmmh why is this a patch to the file and not a debian patch [08:10:33] <_joe_> sigh [08:16:10] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [08:18:07] (03PS1) 10Yuvipanda: Remove androidsdk module [puppet] - 10https://gerrit.wikimedia.org/r/183790 [08:18:12] paravoid: ^ I’ll check with hashar if contint needs ant [08:18:21] nice [08:18:42] (03CR) 10Yuvipanda: [C: 04-2] "Need to check with hashar to see if contint needs ant." [puppet] - 10https://gerrit.wikimedia.org/r/183790 (owner: 10Yuvipanda) [08:18:59] thanks :)) [08:19:12] paravoid: yw! thanks for pointing it out! [08:21:43] (03PS2) 10Yuvipanda: Remove androidsdk module [puppet] - 10https://gerrit.wikimedia.org/r/183790 [08:23:09] <_joe_> YuviPanda: I think they do. [08:24:06] _joe_: ant? yeah. [08:24:13] <_joe_> yea [08:24:16] <_joe_> :/ [08:24:21] * _joe_ hates all things java [08:27:31] heh [08:27:31] I’ll amend and merge [08:28:04] good morning [08:28:13] YuviPanda: do you ever happen to sleep ? :D [08:31:51] hashar: I do :D sometimes. [08:31:56] hashar: CI needs ant? 
[08:32:50] YuviPanda: I don't think we rely on it anymore [08:33:07] once upon a time, I thought using ant / xml was smarter than using shell scripts [08:33:20] Generic Beta Cluster/English Wikipedia Mobile Main page is OK <--- Loooovely [08:33:49] hashar: hah! [08:33:51] hashar: want to +1 https://gerrit.wikimedia.org/r/183790 [08:34:49] * hashar looks for some spare quarters [08:36:04] (03CR) 10Hashar: [C: 031] "That was originally to build the Android app via Jenkins. Not much happened on that front though, so we can get rid of the class indeed." [puppet] - 10https://gerrit.wikimedia.org/r/183790 (owner: 10Yuvipanda) [08:36:23] (03PS3) 10Yuvipanda: Remove androidsdk module [puppet] - 10https://gerrit.wikimedia.org/r/183790 [08:36:36] (03CR) 10Yuvipanda: [C: 032] Remove androidsdk module [puppet] - 10https://gerrit.wikimedia.org/r/183790 (owner: 10Yuvipanda) [08:36:51] \o/ [08:37:46] paravoid: good morning! Are you getting some servers installed with Debian instead of Ubuntu ? [08:38:04] what do you mean? [08:38:14] YuviPanda: should I just hit submit? [08:38:28] paravoid: just did [08:38:38] In a few weeks I am going to ask for a few servers for CI. I thought about using Trusty, but since there is some work on the Debian front I would be happy to use Debian instead. [08:38:51] Just wondering how ready ™ Debian is for us [08:39:02] it's okay on most fronts [08:39:19] that said, there's a bunch of upstart scripts spread out all over the tree that will need to get fixed on a case-by-case basis [08:39:24] there are* [08:39:36] "fixed", as in replaced by systemd units [08:39:49] for CI, that sounds solvable [08:40:00] ohahaeou systemd damn. Yet a new horse to learn about [08:40:03] 3Multimedia, operations, ops-core: Convert Imagescalers to HHVM, Trusty - https://phabricator.wikimedia.org/T84842#965369 (10Joe) The patch that @PleaseStand referenced cleanly applies on the librsvg package in trusty, so I'll rebuild it using it alone. 
I don't think we need a repository, the source package uplo... [08:40:15] it's easily solvable in general, it's just that there's no point in doing this now en masse [08:40:27] rather than just fixing it up when we reinstall a service with jessie [08:41:29] <_joe_> I've run into quite a few systemd bugs lately on sid, but had no time to track them down properly. I hope it's just some sid-related shakiness [08:41:38] like what? [08:43:04] <_joe_> like one service that systemd thinks is started, but has been restarted manually (with "service XXX restart") that loses track of its pid and is left in an un-restartable state [08:44:22] 53 /etc/init/ files all over the place [08:44:39] so, yeah :) [08:44:41] case by case [08:44:54] <_joe_> paravoid: yeah, in most cases it should be easy [08:45:06] yup [08:45:13] paravoid _joe_ the problem with ipvs in older diamond versions was that the collector would cronspam as you pointed out, IIRC that's fixed in the trusty version we have [08:45:37] (03PS3) 10Faidon Liambotis: diamond: don't disable IPVS by default [puppet] - 10https://gerrit.wikimedia.org/r/183785 [08:45:39] godog: ^ :) [08:45:46] <_joe_> godog: ok good :) [08:46:16] hehe yep precise-ly [08:46:31] how can we handle the transition of a class using upstart which has to be applied on Ubuntu/Debian ? [08:46:38] can we get systemd on Ubuntu as well? 
[08:47:05] if os_version('debian >= jessie') [08:47:18] <_joe_> hashar: nope [08:47:18] or if os_version('debian >= jessie || ubuntu >= vivid') if you prefer [08:47:36] although I'm contemplating whether I should add a $::initsystem fact [08:47:40] it's fairly trivial [08:47:45] <_joe_> paravoid: please do [08:47:53] would be great [08:47:57] <_joe_> I love fact-based or feature-based conditionals [08:48:04] yeah [08:48:15] <_joe_> and I hate branches based on realm or on the os version [08:48:22] <_joe_> whenever it's not strictly that [08:48:23] the whole trusty-is-hhvm assumptions in modules/mediawiki aren't great [08:48:29] <_joe_> I know [08:48:34] <_joe_> that was a shortcut [08:48:39] I know :) [08:48:49] are these going to be removed when all servers have been migrated? [08:49:00] or are you keeping php5 compatibility in the module? [08:49:05] for CI or whatever [08:50:25] <_joe_> paravoid: I think we should not keep compatibility, or we should consider branching out to two separate submodules [08:50:39] <_joe_> (not git submodules, though) [08:51:10] <_joe_> right now we're 17 servers away from being able to do that, I guess [08:52:05] (03CR) 10Filippo Giunchedi: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/183785 (owner: 10Faidon Liambotis) [08:52:35] what do you mean, godog? [08:52:41] I didn't understand your comment [08:57:25] 3ops-requests, operations: Package for mobile jobs (androidsdk, libdclass) missing in Trusty - https://phabricator.wikimedia.org/T84164#965377 (10hashar) 5Open>3Resolved a:3hashar We got rid of the androidsdk related packages. There are no jobs using them anymore https://gerrit.wikimedia.org/r/#/c/183790/ [08:58:01] <_joe_> hashar: we should maybe fix the apache lint job for puppet, should we? [08:59:08] <_joe_> paravoid: did you happen to take a look at ffmpeg2theora in December? I don't really recall [09:02:32] I don't remember... 
[09:02:47] <_joe_> ok so it's two of us :P [09:02:55] <_joe_> nevermind, I do have time now [09:05:20] (03CR) 10Filippo Giunchedi: [C: 031] "LGTM for this change, please consider other improvements suggested too" [puppet] - 10https://gerrit.wikimedia.org/r/182173 (owner: 10Hoo man) [09:05:31] (03PS10) 10KartikMistry: WIP: Content Translation configuration for Production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/181546 (https://phabricator.wikimedia.org/T85144) [09:05:48] _joe_: yup would be nice. I have updated it to reflect the switch from ops/apache-config to puppet.git [09:06:07] _joe_: but the shell feels hacky and I am not even sure it is going to catch issues properly :( [09:06:46] <_joe_> hashar: I'll take a look later, is there a phab issue on this? [09:06:54] * _joe_ too lazy to search [09:07:06] paravoid: heh, it is more obvious by looking at precise's version of /usr/share/diamond/collectors/ipvs/ipvs.py which would popen() unconditionally during __init__, that's what caused the cronspam IIRC [09:07:47] so again IIRC diamond would load the collector class anyway, enabled or not [09:10:31] _joe_: why bother searching when you can ask the Antoine search engine: https://phabricator.wikimedia.org/T72068 :D [09:10:43] <_joe_> hashar: eheh :) [09:11:04] _joe_: change is https://gerrit.wikimedia.org/r/#/c/166033/ [09:11:35] and the lint is a stupidly long shell script in Jenkins. 
Should probably move it directly in the ops/puppet repo so it can be changed easily [09:15:17] (03CR) 10Faidon Liambotis: [C: 032] diamond: don't disable IPVS by default [puppet] - 10https://gerrit.wikimedia.org/r/183785 (owner: 10Faidon Liambotis) [09:19:37] (03PS1) 10Faidon Liambotis: base: add initsystem fact [puppet] - 10https://gerrit.wikimedia.org/r/183799 [09:19:45] _joe_: ^ [09:19:56] <_joe_> paravoid: <3 [09:22:19] YuviPanda: speaking of facts, I've added you as a reviewer for https://gerrit.wikimedia.org/r/#/c/183209/ [09:22:33] although I'm guessing this may be more of an andrewbogott_afk thing [09:23:05] paravoid: yeah, but he’s out with a fever atm, I think. [09:23:10] oh poor him [09:23:15] paravoid: I can test and babysit later today or tomorrow. [09:23:15] :( [09:23:37] let me file a phab task so I don’t forget. [09:23:41] it's fine [09:23:44] no, don't :) [09:24:03] phab task per gerrit changeset is going to be way too spammy imho :) [09:24:20] and it's not urgent at all [09:24:24] I can wait for andrew [09:24:47] (03CR) 10Faidon Liambotis: [C: 032] base: add initsystem fact [puppet] - 10https://gerrit.wikimedia.org/r/183799 (owner: 10Faidon Liambotis) [09:25:13] paravoid: I don’t do it ‘per change’ but only for ‘things I ought to do that I am not doing atm, and also could complicate and drag on potentially' [09:25:26] puppet certs and salt certs and godknowswhatelse use ec2id [09:25:33] so I suppose in this case a task is ok [09:25:40] I'm not ditching it [09:25:45] just renaming it :) [09:25:54] unless you mean something else calls "facter ec2id" explictly [09:26:37] no, I’m just being slightly paranoid :) [09:26:56] root@copper:~# facter --puppet |grep initsystem [09:26:56] initsystem => upstart [09:26:59] root@cp1008:~# facter --puppet |grep initsystem [09:26:59] initsystem => systemd [09:27:26] (03PS1) 10Glaisher: Create 'autopatrolled' group on dawiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183800 
(https://phabricator.wikimedia.org/T86062) [09:27:27] plus I already created the task before seeing what you said :) [09:27:32] :P [09:28:02] (03PS2) 10Faidon Liambotis: Remove custom fact ec2id, replaced by facter's ec2 [puppet] - 10https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) [09:28:09] (03PS3) 10Faidon Liambotis: Remove custom fact ec2id, replaced by facter's ec2 [puppet] - 10https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) [09:34:31] (03PS1) 10Yuvipanda: tools: Fix url handling for uwsgi [puppet] - 10https://gerrit.wikimedia.org/r/183801 (https://phabricator.wikimedia.org/T85362) [09:38:52] (03PS2) 10Yuvipanda: tools: Fix url handling for uwsgi [puppet] - 10https://gerrit.wikimedia.org/r/183801 (https://phabricator.wikimedia.org/T85362) [09:39:57] (03CR) 10Yuvipanda: [C: 032] tools: Fix url handling for uwsgi [puppet] - 10https://gerrit.wikimedia.org/r/183801 (https://phabricator.wikimedia.org/T85362) (owner: 10Yuvipanda) [09:44:26] (03PS11) 10KartikMistry: Content Translation configuration for Production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/181546 [09:48:36] 3Multimedia, operations, ops-core: Convert Imagescalers to HHVM, Trusty - https://phabricator.wikimedia.org/T84842#965437 (10Joe) Librsvg 2.40.2-1+wm1 is on apt.wikimedia.org and includes the patch at https://git.gnome.org/browse/librsvg/commit/?id=5ba4343bccc7e1765f38f87490b3d6a3a500fde1 I'll look at ffmpeg2th... [09:51:33] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet). [09:52:04] PROBLEM - Unmerged changes on repository puppet on palladium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet). [09:53:37] (^that was me, forgot to type ‘yes’ again, just did) [09:53:54] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. 
[09:54:33] RECOVERY - Unmerged changes on repository puppet on palladium is OK: No changes to merge. [09:54:42] (03PS1) 10Filippo Giunchedi: gdash: fix trailing comma [puppet] - 10https://gerrit.wikimedia.org/r/183803 [09:55:27] (03CR) 10Filippo Giunchedi: [C: 032 V: 032] gdash: fix trailing comma [puppet] - 10https://gerrit.wikimedia.org/r/183803 (owner: 10Filippo Giunchedi) [09:56:48] <_joe_> paravoid: btw, you looked at the ffmpeg2theora package in trusty and it already included a patch for the libav threads. I just remembered the moment I checked myself [09:57:06] <_joe_> we should build some external storage for our memories [09:57:19] <_joe_> one which doesn't require taking notes [09:57:28] don't worry, google probably has some secret project that is doing that already [09:57:56] <_joe_> paravoid: you think they'll index the contents of our brains next? [09:58:11] <_joe_> I thought they were in the "we tell you what to think and to want" business [09:58:28] <_joe_> it always seemed an easier way to control what people buy [09:58:42] not smart enough [10:02:44] 3Multimedia, operations, ops-core: Convert Imagescalers to HHVM, Trusty - https://phabricator.wikimedia.org/T84842#965458 (10Joe) ffmpeg2theora already includes the patch to disable libav threads in trusty, so we don't have any additional need to repackage things for trusty imagescalers. [10:05:04] PROBLEM - puppet last run on amssq54 is CRITICAL: CRITICAL: puppet fail [10:11:01] (03CR) 10Filippo Giunchedi: [C: 031] "LGTM, when did the move happen from one file to the other? perhaps other scripts broke too?" 
[puppet] - 10https://gerrit.wikimedia.org/r/183568 (https://phabricator.wikimedia.org/T1387) (owner: 10Reedy) [10:12:02] (03PS1) 10Faidon Liambotis: pdns: remove wildcards setting [puppet] - 10https://gerrit.wikimedia.org/r/183805 [10:12:04] (03PS1) 10Faidon Liambotis: ocg: remove support for lucid & precise [puppet] - 10https://gerrit.wikimedia.org/r/183806 [10:12:06] (03PS1) 10Faidon Liambotis: ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 [10:13:09] (03CR) 10Faidon Liambotis: [C: 032] pdns: remove wildcards setting [puppet] - 10https://gerrit.wikimedia.org/r/183805 (owner: 10Faidon Liambotis) [10:17:59] (03PS2) 10Faidon Liambotis: ocg: remove support for lucid & precise [puppet] - 10https://gerrit.wikimedia.org/r/183806 [10:18:01] (03PS2) 10Faidon Liambotis: ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 [10:19:19] (03PS1) 10Giuseppe Lavagetto: HAT: re-provision mw1152 as an experimental HAT imagescaler [puppet] - 10https://gerrit.wikimedia.org/r/183809 [10:19:54] <_joe_> !log reimaging mw1152 as a HAT imagescaler [10:19:55] (03PS3) 10Faidon Liambotis: ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 [10:19:58] Logged the message, Master [10:20:04] should I wait for you to merge this? :) [10:20:30] <_joe_> paravoid: merge what? 
[10:20:35] mw1152 [10:20:46] <_joe_> yes, I need to start reimaging first [10:21:07] (03CR) 10Faidon Liambotis: [C: 032] ocg: remove support for lucid & precise [puppet] - 10https://gerrit.wikimedia.org/r/183806 (owner: 10Faidon Liambotis) [10:21:51] (03PS4) 10Faidon Liambotis: ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 [10:21:58] (03PS2) 10Giuseppe Lavagetto: HAT: re-provision mw1152 as an experimental HAT imagescaler [puppet] - 10https://gerrit.wikimedia.org/r/183809 [10:22:40] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] HAT: re-provision mw1152 as an experimental HAT imagescaler [puppet] - 10https://gerrit.wikimedia.org/r/183809 (owner: 10Giuseppe Lavagetto) [10:23:00] (03PS5) 10Faidon Liambotis: ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 [10:23:05] (03CR) 10Faidon Liambotis: [C: 032 V: 032] ocg: name apparmor.d profile properly [puppet] - 10https://gerrit.wikimedia.org/r/183807 (owner: 10Faidon Liambotis) [10:23:21] <_joe_> paravoid: whenever you want, merge my change as well [10:23:44] done [10:23:44] <_joe_> hashar: jenkins is lagging behind I'd say [10:23:53] what else is new [10:23:58] * hashar looks at https://integration.wikimedia.org/zuul/ [10:24:11] seems fine [10:24:43] RECOVERY - puppet last run on amssq54 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [10:24:59] _joe_: it replies on https://gerrit.wikimedia.org/r/#/c/183809/ just a few seconds after you merged it :] [10:25:22] maybe one day we will manage to inject in the Gerrit change the progress of the jobs being run [10:25:41] needs to do some magic Javascript on Gerrit webpages to fetch the change status from https://integration.wikimedia.org/zuul/ [10:33:41] (03PS15) 10Faidon Liambotis: contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 (https://phabricator.wikimedia.org/T75356) (owner: 10Hashar) [10:34:04] (03CR) 10Faidon Liambotis: 
[C: 032] contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 (https://phabricator.wikimedia.org/T75356) (owner: 10Hashar) [10:34:40] huh [10:34:44] can merge: no [10:34:52] ah [10:35:02] (03PS8) 10Faidon Liambotis: contint: hourly auto update of wikimedia packages [puppet] - 10https://gerrit.wikimedia.org/r/183019 (owner: 10Hashar) [10:35:17] (03CR) 10Faidon Liambotis: [C: 032] contint: hourly auto update of wikimedia packages [puppet] - 10https://gerrit.wikimedia.org/r/183019 (owner: 10Hashar) [10:35:22] \O/ [10:35:24] PROBLEM - Auth DNS on labs-ns0.wikimedia.org is CRITICAL: CRITICAL - Plugin timed out while executing system call [10:35:33] quite happy to have learned about unattended upgrade [10:35:40] and apt::conf [10:35:43] PROBLEM - Disk space on labstore1001 is CRITICAL: DISK CRITICAL - free space: /srv/project 1253074 MB (3% inode=76%): /exp/project/abusefilter-global 1253074 MB (3% inode=76%): /exp/project/account-creation-assistance 1253074 MB (3% inode=76%): /exp/project/analytics 1253074 MB (3% inode=76%): /exp/project/bastion 1253074 MB (3% inode=76%): /exp/project/bots 1253074 MB (3% inode=76%): /exp/project/category-sorting 1253074 MB (3% inode [10:36:13] (03PS16) 10Faidon Liambotis: contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 (https://phabricator.wikimedia.org/T75356) (owner: 10Hashar) [10:36:22] (03CR) 10Faidon Liambotis: [C: 032] contint: provision hhvm on CI slaves [puppet] - 10https://gerrit.wikimedia.org/r/178806 (https://phabricator.wikimedia.org/T75356) (owner: 10Hashar) [10:36:34] RECOVERY - Auth DNS on labs-ns0.wikimedia.org is OK: DNS OK: 0.165 seconds response time. 
nagiostest.beta.wmflabs.org returns 208.80.155.135 [10:36:41] stupid pdns [10:38:25] (03PS1) 10Faidon Liambotis: ssh: introduce ssh::userkey resource [puppet] - 10https://gerrit.wikimedia.org/r/183814 [10:38:27] (03PS1) 10Faidon Liambotis: ssh: recurse/purge => true for /etc/ssh/userkeys [puppet] - 10https://gerrit.wikimedia.org/r/183815 [10:38:29] (03PS1) 10Faidon Liambotis: ssh: change userkeys' path hierarchy [puppet] - 10https://gerrit.wikimedia.org/r/183816 [10:38:31] (03PS1) 10Faidon Liambotis: ssh: support /etc/ssh/userkeys in production too [puppet] - 10https://gerrit.wikimedia.org/r/183817 [10:38:33] (03PS1) 10Faidon Liambotis: reprepro: transition to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183818 [10:38:35] (03PS1) 10Faidon Liambotis: openstack: transition nova to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183819 [10:38:37] (03PS1) 10Faidon Liambotis: mediawiki: transition to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183820 [10:38:39] (03PS1) 10Faidon Liambotis: authdns: transition to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183821 [10:38:41] (03PS1) 10Faidon Liambotis: puppet: transition to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183822 [10:38:43] (03PS1) 10Faidon Liambotis: admin: transition to ssh::userkey [puppet] - 10https://gerrit.wikimedia.org/r/183823 [10:38:45] (03PS1) 10Faidon Liambotis: ssh: remove .ssh/authorized_keys support from prod [puppet] - 10https://gerrit.wikimedia.org/r/183824 [10:40:25] :) [10:40:34] overhauling ssh key management [10:40:45] <_joe_> yeah we sorely needed it [10:40:46] https://gerrit.wikimedia.org/r/#/q/project:operations/puppet+topic:ssh-userkey,n,z [10:40:49] reviews welcome [10:40:57] <_joe_> eh, I don't have much time now [10:41:05] it's not a friday thing anyway [10:41:11] just putting it out there [10:45:29] paravoid: nice, might fix T85814 too? 
[10:46:02] PROBLEM - puppet last run on heze is CRITICAL: CRITICAL: Puppet last ran 1 day ago [10:46:19] it will by cleaning them up [10:46:22] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: Puppet last ran 1 day ago [10:46:28] but I have no idea how these are generated in the first place tbh [10:47:18] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [20000.0] [10:51:31] PROBLEM - Varnishkafka Delivery Errors per minute on cp3010 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [10:54:02] 3ops-core: Experiment using apache mpm_worker instead of mpm_prefork on HHVM - https://phabricator.wikimedia.org/T85996#965514 (10Joe) the canary appservers are already using mpm worker instead of prefork, and they look pretty fine. [10:55:28] ooh nice [10:55:37] in differences in utilization? [10:55:51] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [10:56:33] <_joe_> paravoid: actually, I just found a bug [10:56:39] <_joe_> so my comment was wrong [10:56:41] hah [10:56:42] <_joe_> meh [10:56:45] <_joe_> fixing it [10:57:12] PROBLEM - Disk space on labstore1001 is CRITICAL: DISK CRITICAL - free space: /srv/project 1253094 MB (3% inode=76%): /exp/project/abusefilter-global 1253094 MB (3% inode=76%): /exp/project/account-creation-assistance 1253094 MB (3% inode=76%): /exp/project/analytics 1253094 MB (3% inode=76%): /exp/project/bastion 1253094 MB (3% inode=76%): /exp/project/bots 1253094 MB (3% inode=76%): /exp/project/category-sorting 1253094 MB (3% inode [10:57:32] RECOVERY - Varnishkafka Delivery Errors per minute on cp3010 is OK: OK: Less than 1.00% above the threshold [0.0] [10:59:19] <_joe_> it's a side-effect of still installing libapache2-mod-php [10:59:28] <_joe_> which we can't really stop doing btw [11:00:44] PROBLEM - Disk space on labstore1001 is CRITICAL: DISK CRITICAL - free space: 
/srv/project 1252569 MB (3% inode=76%): /exp/project/abusefilter-global 1252569 MB (3% inode=76%): /exp/project/account-creation-assistance 1252569 MB (3% inode=76%): /exp/project/analytics 1252569 MB (3% inode=76%): /exp/project/bastion 1252569 MB (3% inode=76%): /exp/project/bots 1252569 MB (3% inode=76%): /exp/project/category-sorting 1252569 MB (3% inode [11:03:24] PROBLEM - puppet last run on mw1152 is CRITICAL: CRITICAL: Puppet has 7 failures [11:04:03] RECOVERY - puppet last run on heze is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [11:04:04] PROBLEM - Disk space on labstore1001 is CRITICAL: DISK CRITICAL - free space: /srv/project 1253163 MB (3% inode=76%): /exp/project/abusefilter-global 1253163 MB (3% inode=76%): /exp/project/account-creation-assistance 1253163 MB (3% inode=76%): /exp/project/analytics 1253163 MB (3% inode=76%): /exp/project/bastion 1253163 MB (3% inode=76%): /exp/project/bots 1253163 MB (3% inode=76%): /exp/project/category-sorting 1253163 MB (3% inode [11:04:23] RECOVERY - puppet last run on helium is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [11:04:42] 3operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965517 (10mark) Other than the old PHP 5.3 version that we're running in production, why is it crucial that we get away from Zend everywhere? 
[11:12:34] (03PS2) 10Alexandros Kosiaris: Normalize checkcommands.cfg whitespace [puppet] - 10https://gerrit.wikimedia.org/r/183514 [11:12:36] (03PS1) 10Alexandros Kosiaris: nagios_common: /usr/lib/nagios/plugins => $USER1$ [puppet] - 10https://gerrit.wikimedia.org/r/183826 [11:13:55] (03PS1) 10Giuseppe Lavagetto: mediawiki: explicitly set the mpm module we're using [puppet] - 10https://gerrit.wikimedia.org/r/183827 [11:13:58] (03PS1) 10Giuseppe Lavagetto: mediawiki: use mpm_worker everywhere [puppet] - 10https://gerrit.wikimedia.org/r/183828 [11:14:14] _joe_: why can't we? [11:14:25] <_joe_> paravoid: dependencies [11:15:03] <_joe_> paravoid: IIRC we have the alternative of installing php-fpm [11:15:15] <_joe_> which would be worse [11:15:22] ok [11:15:32] so, any differences in utilization? I'm curious [11:15:54] <_joe_> I'll know once I merge https://gerrit.wikimedia.org/r/183827 [11:15:56] <_joe_> :) [11:16:22] ah heh [11:17:24] PROBLEM - Disk space on labstore1001 is CRITICAL: DISK CRITICAL - free space: /srv/project 1252966 MB (3% inode=76%): /exp/project/abusefilter-global 1252966 MB (3% inode=76%): /exp/project/account-creation-assistance 1252966 MB (3% inode=76%): /exp/project/analytics 1252966 MB (3% inode=76%): /exp/project/bastion 1252966 MB (3% inode=76%): /exp/project/bots 1252966 MB (3% inode=76%): /exp/project/category-sorting 1252966 MB (3% inode [11:19:26] <_joe_> paravoid: I don't anticipate big changes btw, it should lower the latency when there is a traffic spike [11:19:47] might reduce system time [11:23:53] (03CR) 10Giuseppe Lavagetto: [C: 032] mediawiki: explicitly set the mpm module we're using [puppet] - 10https://gerrit.wikimedia.org/r/183827 (owner: 10Giuseppe Lavagetto) [11:25:02] <_joe_> grrr git grep failed me [11:25:24] PROBLEM - puppet last run on mw1017 is CRITICAL: CRITICAL: puppet fail [11:25:40] (03CR) 10Alexandros Kosiaris: Content Translation configuration for Production (031 comment) [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/181546 (owner: 10KartikMistry) [11:26:33] <_joe_> I already did this "the right way" and forgot [11:26:37] (03PS1) 10Giuseppe Lavagetto: Revert "mediawiki: explicitly set the mpm module we're using" [puppet] - 10https://gerrit.wikimedia.org/r/183829 [11:26:45] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] Revert "mediawiki: explicitly set the mpm module we're using" [puppet] - 10https://gerrit.wikimedia.org/r/183829 (owner: 10Giuseppe Lavagetto) [11:27:44] _joe_: When manually trying to run stuff against an application server... any way to bypass the https redirect? [11:28:16] ah, there's still a setting for that [11:29:41] <_joe_> hoo: X-Forwarded-Proto [11:30:58] (03PS1) 10Giuseppe Lavagetto: mediawiki: use worker mpm for canary appservers, not just the config [puppet] - 10https://gerrit.wikimedia.org/r/183831 [11:31:03] (03CR) 10Alexandros Kosiaris: [C: 032] Normalize checkcommands.cfg whitespace [puppet] - 10https://gerrit.wikimedia.org/r/183514 (owner: 10Alexandros Kosiaris) [11:31:18] (03PS2) 10Giuseppe Lavagetto: mediawiki: use worker mpm for canary appservers, not just the config [puppet] - 10https://gerrit.wikimedia.org/r/183831 [11:31:36] (03CR) 10Alexandros Kosiaris: [C: 032] nagios_common: /usr/lib/nagios/plugins => $USER1$ [puppet] - 10https://gerrit.wikimedia.org/r/183826 (owner: 10Alexandros Kosiaris) [11:31:46] (03CR) 10Giuseppe Lavagetto: [C: 032] mediawiki: use worker mpm for canary appservers, not just the config [puppet] - 10https://gerrit.wikimedia.org/r/183831 (owner: 10Giuseppe Lavagetto) [11:31:53] _joe_: mh, I get some information now [11:31:56] (03PS3) 10Giuseppe Lavagetto: mediawiki: use worker mpm for canary appservers, not just the config [puppet] - 10https://gerrit.wikimedia.org/r/183831 [11:31:57] but nothing really too useful [11:32:07] * Error while processing content unencoding: invalid code lengths set [11:32:35] (03CR) 10Giuseppe Lavagetto: [V: 032] mediawiki: use worker mpm for canary appservers, 
not just the config [puppet] - 10https://gerrit.wikimedia.org/r/183831 (owner: 10Giuseppe Lavagetto) [11:33:50] <_joe_> hoo: that comes from mediawiki or from hhvm? [11:33:57] from curl [11:34:02] it jokes on the reply [11:34:04] <_joe_> yeah I got that [11:34:05] * chokes [11:34:11] <_joe_> oh ok [11:34:16] trying without --compress now [11:34:19] ah, here we go [11:34:34] <_joe_> try with telnet [11:34:44] <_joe_> sorry, I have one problem to deal with [11:35:05] PHP fatal error:
[11:35:05] request has exceeded memory limit [11:35:10] that's all I wanted to see [11:35:29] <_joe_> hoo: well why does this translate to a 503? [11:35:39] <_joe_> I guess it has to do with the content-length [11:35:52] <_joe_> can you somehow pass me the curl request you were doing? [11:36:08] <_joe_> I fear we have another case of lamely setting the content-length ther [11:36:13] <_joe_> *there [11:36:28] PMed you [11:36:39] there are cookies of my test account in there [11:36:48] RECOVERY - puppet last run on mw1017 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:37:01] <_joe_> more importantly, I have your location [11:37:06] * _joe_ programming the drones [11:37:14] <_joe_> :P [11:37:24] <_joe_> thanks a lot [11:37:31] hehe... although my location is way more public than it should be :P [11:38:24] <_joe_> oh I discovered that being a contractor and having a "private enterprise" named after me made google show all the business info when searching my name [11:38:34] <_joe_> including location of my house, phone number... [11:43:29] yeah :/ [11:44:59] <_joe_> hoo: anyways, I'll take a further look to see if we can make the error get back to the user [11:45:14] _joe_: That would be great [11:45:20] <_joe_> or at least to varnish [11:45:22] do you see any way to get a bt of that error or so? [11:45:40] <_joe_> * There are processes named 'apache2' running which do not match your pid file which are left untouched in the name of safety, Please review the situation by hand. [11:45:43] <_joe_> mmmh [11:45:48] I have no idea what could drive that script out of memory... 
we have some stuff in there, but nothing that should eat 300M of memory [11:46:45] <_joe_> nice, when puppet runs, apache gets killed but not restarted [11:47:07] <_joe_> and puppet is just issuing a 'service apache restart' AFAIR [11:59:09] PROBLEM - HHVM rendering on mw1114 is CRITICAL: Connection refused [11:59:49] PROBLEM - Apache HTTP on mw1114 is CRITICAL: Connection refused [12:04:17] <_joe_> grr [12:04:39] RECOVERY - Apache HTTP on mw1114 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 440 bytes in 0.052 second response time [12:05:18] RECOVERY - HHVM rendering on mw1114 is OK: HTTP OK: HTTP/1.1 200 OK - 68687 bytes in 0.183 second response time [12:06:51] (03PS1) 10Yuvipanda: Revert "Labs: Make dynamic proxies use local resolver" [puppet] - 10https://gerrit.wikimedia.org/r/183833 [12:06:59] Coren ^ when you are around [12:32:38] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [12:37:28] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [12:48:38] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [12:49:48] PROBLEM - puppet last run on mw1231 is CRITICAL: CRITICAL: Puppet last ran 12 hours ago [12:50:38] RECOVERY - puppet last run on mw1152 is OK: OK: Puppet is currently enabled, last run 1 hour ago with 0 failures [12:54:39] RECOVERY - puppet last run on mw1231 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [12:55:37] <_joe_> so why on earth was puppet disabled on mw1231 without a SAL entry? 
[12:55:41] <_joe_> meh [12:56:59] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [13:00:21] (03PS3) 10Alexandros Kosiaris: Use hiera for cxserver port [puppet] - 10https://gerrit.wikimedia.org/r/183241 [13:00:23] (03PS3) 10Alexandros Kosiaris: Cleanup cxserver module/role [puppet] - 10https://gerrit.wikimedia.org/r/183240 [13:00:25] (03PS4) 10Alexandros Kosiaris: LVS for cxserver [puppet] - 10https://gerrit.wikimedia.org/r/183243 [13:00:27] (03PS3) 10Alexandros Kosiaris: Apply cxserver role to sca [puppet] - 10https://gerrit.wikimedia.org/r/183242 [13:00:49] PROBLEM - Apache HTTP on mw1020 is CRITICAL: Connection refused [13:00:58] PROBLEM - HHVM rendering on mw1020 is CRITICAL: Connection refused [13:07:27] (03CR) 10KartikMistry: [C: 031] Use hiera for cxserver port [puppet] - 10https://gerrit.wikimedia.org/r/183241 (owner: 10Alexandros Kosiaris) [13:08:32] (03CR) 10KartikMistry: [C: 031] Cleanup cxserver module/role [puppet] - 10https://gerrit.wikimedia.org/r/183240 (owner: 10Alexandros Kosiaris) [13:09:44] (03PS4) 10Alexandros Kosiaris: Use hiera for cxserver port [puppet] - 10https://gerrit.wikimedia.org/r/183241 [13:09:47] (03PS4) 10Alexandros Kosiaris: Cleanup cxserver module/role [puppet] - 10https://gerrit.wikimedia.org/r/183240 [13:09:49] (03PS5) 10Alexandros Kosiaris: LVS for cxserver [puppet] - 10https://gerrit.wikimedia.org/r/183243 [13:09:50] (03PS4) 10Alexandros Kosiaris: Apply cxserver role to sca [puppet] - 10https://gerrit.wikimedia.org/r/183242 [13:17:49] RECOVERY - Apache HTTP on mw1020 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 440 bytes in 0.076 second response time [13:17:49] RECOVERY - HHVM rendering on mw1020 is OK: HTTP OK: HTTP/1.1 200 OK - 68687 bytes in 0.155 second response time [13:19:53] (03CR) 10Alexandros Kosiaris: [C: 032] Cleanup cxserver module/role [puppet] - 10https://gerrit.wikimedia.org/r/183240 (owner: 10Alexandros Kosiaris) [13:20:20] (03CR) 
10Alexandros Kosiaris: [C: 032] Use hiera for cxserver port [puppet] - 10https://gerrit.wikimedia.org/r/183241 (owner: 10Alexandros Kosiaris) [13:20:57] (03CR) 10Alexandros Kosiaris: [C: 032] Apply cxserver role to sca [puppet] - 10https://gerrit.wikimedia.org/r/183242 (owner: 10Alexandros Kosiaris) [13:24:59] PROBLEM - puppet last run on mw1012 is CRITICAL: CRITICAL: puppet fail [13:25:00] PROBLEM - puppet last run on es2002 is CRITICAL: CRITICAL: puppet fail [13:25:00] PROBLEM - puppet last run on analytics1033 is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on neptunium is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on rbf1002 is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on carbon is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on ms-fe2004 is CRITICAL: CRITICAL: puppet fail [13:25:09] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: puppet fail [13:25:10] PROBLEM - puppet last run on analytics1041 is CRITICAL: CRITICAL: puppet fail [13:25:18] PROBLEM - puppet last run on gold is CRITICAL: CRITICAL: puppet fail [13:25:18] PROBLEM - puppet last run on cp1039 is CRITICAL: CRITICAL: puppet fail [13:25:19] PROBLEM - puppet last run on analytics1020 is CRITICAL: CRITICAL: puppet fail [13:25:19] PROBLEM - puppet last run on db1073 is CRITICAL: CRITICAL: puppet fail [13:25:29] PROBLEM - puppet last run on mw1200 is CRITICAL: CRITICAL: puppet fail [13:25:29] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: puppet fail [13:25:29] PROBLEM - puppet last run on mc1006 is CRITICAL: CRITICAL: puppet fail [13:25:29] PROBLEM - puppet last run on mw1160 is CRITICAL: CRITICAL: puppet fail [13:25:39] PROBLEM - puppet last run on db2005 is CRITICAL: CRITICAL: puppet fail [13:25:39] PROBLEM - puppet last run on db2009 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet 
last run on mw1254 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on mw1041 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on amssq32 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on xenon is CRITICAL: CRITICAL: puppet fail [13:25:49] PROBLEM - puppet last run on cp1055 is CRITICAL: CRITICAL: puppet fail [13:25:50] PROBLEM - puppet last run on helium is CRITICAL: CRITICAL: puppet fail [13:25:58] PROBLEM - puppet last run on db1031 is CRITICAL: CRITICAL: puppet fail [13:25:58] PROBLEM - puppet last run on wtp1020 is CRITICAL: CRITICAL: puppet fail [13:25:58] PROBLEM - puppet last run on lvs1002 is CRITICAL: CRITICAL: puppet fail [13:25:59] PROBLEM - puppet last run on es2008 is CRITICAL: CRITICAL: puppet fail [13:25:59] PROBLEM - puppet last run on db2019 is CRITICAL: CRITICAL: puppet fail [13:25:59] PROBLEM - puppet last run on heze is CRITICAL: CRITICAL: puppet fail [13:25:59] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: puppet fail [13:26:00] PROBLEM - puppet last run on db2039 is CRITICAL: CRITICAL: puppet fail [13:26:00] PROBLEM - puppet last run on ms-be2006 is CRITICAL: CRITICAL: puppet fail [13:26:01] PROBLEM - puppet last run on ms-be2003 is CRITICAL: CRITICAL: puppet fail [13:26:01] PROBLEM - puppet last run on ms-be1006 is CRITICAL: CRITICAL: puppet fail [13:26:02] PROBLEM - puppet last run on snapshot1003 is CRITICAL: CRITICAL: puppet fail [13:26:02] PROBLEM - puppet last run on mw1187 is CRITICAL: CRITICAL: puppet fail [13:26:03] PROBLEM - puppet last run on platinum is CRITICAL: CRITICAL: puppet fail [13:26:03] PROBLEM - puppet last run on elastic1001 is CRITICAL: CRITICAL: puppet fail [13:26:04] PROBLEM - puppet last run on mw1100 is CRITICAL: CRITICAL: puppet fail [13:26:04] PROBLEM - puppet last run 
on mw1082 is CRITICAL: CRITICAL: puppet fail [13:26:05] PROBLEM - puppet last run on analytics1025 is CRITICAL: CRITICAL: puppet fail [13:26:05] PROBLEM - puppet last run on mw1226 is CRITICAL: CRITICAL: puppet fail [13:26:06] PROBLEM - puppet last run on search1016 is CRITICAL: CRITICAL: puppet fail [13:26:08] PROBLEM - puppet last run on amssq35 is CRITICAL: CRITICAL: puppet fail [13:26:09] PROBLEM - puppet last run on mw1026 is CRITICAL: CRITICAL: puppet fail [13:26:09] PROBLEM - puppet last run on potassium is CRITICAL: CRITICAL: puppet fail [13:26:09] PROBLEM - puppet last run on amssq61 is CRITICAL: CRITICAL: puppet fail [13:26:09] PROBLEM - puppet last run on amssq53 is CRITICAL: CRITICAL: puppet fail [13:26:09] PROBLEM - puppet last run on dbstore2001 is CRITICAL: CRITICAL: puppet fail [13:26:10] PROBLEM - puppet last run on mw1250 is CRITICAL: CRITICAL: puppet fail [13:26:10] PROBLEM - puppet last run on analytics1040 is CRITICAL: CRITICAL: puppet fail [13:26:10] PROBLEM - puppet last run on wtp1006 is CRITICAL: CRITICAL: puppet fail [13:26:12] PROBLEM - puppet last run on cp1047 is CRITICAL: CRITICAL: puppet fail [13:26:18] PROBLEM - puppet last run on mw1224 is CRITICAL: CRITICAL: puppet fail [13:26:19] PROBLEM - puppet last run on elastic1007 is CRITICAL: CRITICAL: puppet fail [13:26:19] PROBLEM - puppet last run on search1010 is CRITICAL: CRITICAL: puppet fail [13:26:19] PROBLEM - puppet last run on mc1002 is CRITICAL: CRITICAL: puppet fail [13:26:19] PROBLEM - puppet last run on mw1222 is CRITICAL: CRITICAL: puppet fail [13:26:19] PROBLEM - puppet last run on elastic1008 is CRITICAL: CRITICAL: puppet fail [13:26:20] PROBLEM - puppet last run on mw1060 is CRITICAL: CRITICAL: puppet fail [13:26:20] PROBLEM - puppet last run on mw1174 is CRITICAL: CRITICAL: puppet fail [13:26:29] PROBLEM - puppet last run on es2001 is CRITICAL: CRITICAL: puppet fail [13:26:30] PROBLEM - puppet last run on es1008 is CRITICAL: CRITICAL: puppet fail [13:26:30] PROBLEM - 
puppet last run on ms-fe1001 is CRITICAL: CRITICAL: puppet fail [13:26:30] PROBLEM - puppet last run on mw1009 is CRITICAL: CRITICAL: puppet fail [13:26:30] PROBLEM - puppet last run on mw1069 is CRITICAL: CRITICAL: puppet fail [13:26:38] PROBLEM - puppet last run on mw1176 is CRITICAL: CRITICAL: puppet fail [13:26:38] PROBLEM - puppet last run on wtp1016 is CRITICAL: CRITICAL: puppet fail [13:26:39] PROBLEM - puppet last run on elastic1004 is CRITICAL: CRITICAL: puppet fail [13:26:39] PROBLEM - puppet last run on mw1046 is CRITICAL: CRITICAL: puppet fail [13:26:39] PROBLEM - puppet last run on db1066 is CRITICAL: CRITICAL: puppet fail [13:26:39] PROBLEM - puppet last run on db2002 is CRITICAL: CRITICAL: puppet fail [13:26:39] PROBLEM - puppet last run on mw1117 is CRITICAL: CRITICAL: puppet fail [13:26:39] uh [13:26:44] (03PS1) 10Alexandros Kosiaris: Typo fix for d7fc542 [puppet] - 10https://gerrit.wikimedia.org/r/183844 [13:26:48] PROBLEM - puppet last run on ms-be1003 is CRITICAL: CRITICAL: puppet fail [13:26:48] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: puppet fail [13:26:48] PROBLEM - puppet last run on db1050 is CRITICAL: CRITICAL: puppet fail [13:26:49] PROBLEM - puppet last run on lvs3001 is CRITICAL: CRITICAL: puppet fail [13:26:49] PROBLEM - puppet last run on lead is CRITICAL: CRITICAL: puppet fail [13:26:57] ah [13:26:58] :) [13:26:59] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on capella is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on mw1088 is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on mw1150 is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on db2018 is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on ms-be2004 is CRITICAL: CRITICAL: puppet fail [13:26:59] PROBLEM - puppet last run on mw1228 is CRITICAL: CRITICAL: puppet fail [13:27:00] PROBLEM - puppet last run on 
ms1004 is CRITICAL: CRITICAL: puppet fail [13:27:00] PROBLEM - puppet last run on sca1002 is CRITICAL: CRITICAL: puppet fail [13:27:01] PROBLEM - puppet last run on mw1242 is CRITICAL: CRITICAL: puppet fail [13:27:01] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: puppet fail [13:27:02] PROBLEM - puppet last run on cp3016 is CRITICAL: CRITICAL: puppet fail [13:27:02] PROBLEM - puppet last run on mw1003 is CRITICAL: CRITICAL: puppet fail [13:27:03] PROBLEM - puppet last run on elastic1018 is CRITICAL: CRITICAL: puppet fail [13:27:08] PROBLEM - puppet last run on mw1099 is CRITICAL: CRITICAL: puppet fail [13:27:09] PROBLEM - puppet last run on mw1164 is CRITICAL: CRITICAL: puppet fail [13:27:09] PROBLEM - puppet last run on es2009 is CRITICAL: CRITICAL: puppet fail [13:27:09] PROBLEM - puppet last run on elastic1021 is CRITICAL: CRITICAL: puppet fail [13:27:18] PROBLEM - puppet last run on elastic1012 is CRITICAL: CRITICAL: puppet fail [13:27:18] PROBLEM - puppet last run on mw1217 is CRITICAL: CRITICAL: puppet fail [13:27:19] PROBLEM - puppet last run on mw1205 is CRITICAL: CRITICAL: puppet fail [13:27:19] PROBLEM - puppet last run on mw1008 is CRITICAL: CRITICAL: puppet fail [13:27:19] PROBLEM - puppet last run on mc1003 is CRITICAL: CRITICAL: puppet fail [13:27:19] PROBLEM - puppet last run on analytics1035 is CRITICAL: CRITICAL: puppet fail [13:27:19] PROBLEM - puppet last run on cp3014 is CRITICAL: CRITICAL: puppet fail [13:27:28] PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: puppet fail [13:27:28] PROBLEM - puppet last run on search1018 is CRITICAL: CRITICAL: puppet fail [13:27:29] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: puppet fail [13:27:29] PROBLEM - puppet last run on mw1153 is CRITICAL: CRITICAL: puppet fail [13:27:29] PROBLEM - puppet last run on mw1123 is CRITICAL: CRITICAL: puppet fail [13:27:29] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: puppet fail [13:27:39] PROBLEM - puppet last run on 
analytics1030 is CRITICAL: CRITICAL: puppet fail [13:27:39] PROBLEM - puppet last run on mw1068 is CRITICAL: CRITICAL: puppet fail [13:27:39] PROBLEM - puppet last run on mw1120 is CRITICAL: CRITICAL: puppet fail [13:27:39] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: puppet fail [13:27:40] PROBLEM - puppet last run on mw1118 is CRITICAL: CRITICAL: puppet fail [13:27:49] PROBLEM - puppet last run on mw1052 is CRITICAL: CRITICAL: puppet fail [13:27:49] PROBLEM - puppet last run on mw1042 is CRITICAL: CRITICAL: puppet fail [13:27:49] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: puppet fail [13:27:49] PROBLEM - puppet last run on cp1056 is CRITICAL: CRITICAL: puppet fail [13:27:50] PROBLEM - puppet last run on mw1061 is CRITICAL: CRITICAL: puppet fail [13:28:00] PROBLEM - puppet last run on search1001 is CRITICAL: CRITICAL: puppet fail [13:28:00] PROBLEM - puppet last run on iron is CRITICAL: CRITICAL: puppet fail [13:28:00] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: puppet fail [13:28:00] PROBLEM - puppet last run on db1059 is CRITICAL: CRITICAL: puppet fail [13:28:00] PROBLEM - puppet last run on db1067 is CRITICAL: CRITICAL: puppet fail [13:28:08] PROBLEM - puppet last run on virt1006 is CRITICAL: CRITICAL: puppet fail [13:28:09] PROBLEM - puppet last run on db1002 is CRITICAL: CRITICAL: puppet fail [13:28:09] PROBLEM - puppet last run on mw1235 is CRITICAL: CRITICAL: puppet fail [13:28:09] PROBLEM - puppet last run on mw1025 is CRITICAL: CRITICAL: puppet fail [13:28:09] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: puppet fail [13:28:09] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: puppet fail [13:28:18] PROBLEM - puppet last run on elastic1027 is CRITICAL: CRITICAL: puppet fail [13:28:19] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: puppet fail [13:28:19] PROBLEM - puppet last run on ms-fe2001 is CRITICAL: CRITICAL: puppet fail [13:28:28] PROBLEM - puppet last run on 
labsdb1003 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on mw1114 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on search1007 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on mw1092 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on mw1002 is CRITICAL: CRITICAL: puppet fail [13:28:29] PROBLEM - puppet last run on mw1065 is CRITICAL: CRITICAL: puppet fail [13:28:30] PROBLEM - puppet last run on db1015 is CRITICAL: CRITICAL: puppet fail [13:28:30] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: puppet fail [13:28:31] PROBLEM - puppet last run on db1028 is CRITICAL: CRITICAL: puppet fail [13:28:38] PROBLEM - puppet last run on cp1061 is CRITICAL: CRITICAL: puppet fail [13:28:38] PROBLEM - puppet last run on amssq46 is CRITICAL: CRITICAL: puppet fail [13:28:39] PROBLEM - puppet last run on amssq60 is CRITICAL: CRITICAL: puppet fail [13:28:39] PROBLEM - puppet last run on amssq47 is CRITICAL: CRITICAL: puppet fail [13:28:39] PROBLEM - puppet last run on amssq48 is CRITICAL: CRITICAL: puppet fail [13:28:39] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: puppet fail [13:28:39] PROBLEM - puppet last run on mw1172 is CRITICAL: CRITICAL: puppet fail [13:28:40] PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: puppet fail [13:28:40] PROBLEM - puppet last run on elastic1022 is CRITICAL: CRITICAL: puppet fail [13:28:48] PROBLEM - puppet last run on db1051 is CRITICAL: CRITICAL: puppet fail [13:28:48] PROBLEM - puppet last run on ms-fe2003 is CRITICAL: CRITICAL: puppet fail [13:28:48] PROBLEM - puppet last run on db2036 is CRITICAL: CRITICAL: puppet fail [13:28:48] PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: puppet fail [13:28:48] PROBLEM - puppet last run on db1034 is CRITICAL: CRITICAL: puppet fail [13:28:49] PROBLEM - puppet last run on cp4003 is 
CRITICAL: CRITICAL: puppet fail [13:28:49] PROBLEM - puppet last run on analytics1010 is CRITICAL: CRITICAL: puppet fail [13:28:49] PROBLEM - puppet last run on mw1175 is CRITICAL: CRITICAL: puppet fail [13:28:58] PROBLEM - puppet last run on mw1039 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on mw1251 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on mw1162 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on es1007 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on pc1002 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on mw1144 is CRITICAL: CRITICAL: puppet fail [13:28:59] PROBLEM - puppet last run on labcontrol2001 is CRITICAL: CRITICAL: puppet fail [13:29:00] PROBLEM - puppet last run on db1023 is CRITICAL: CRITICAL: puppet fail [13:29:01] PROBLEM - puppet last run on cp4004 is CRITICAL: CRITICAL: puppet fail [13:29:01] PROBLEM - puppet last run on amslvs1 is CRITICAL: CRITICAL: puppet fail [13:29:01] PROBLEM - puppet last run on amssq55 is CRITICAL: CRITICAL: puppet fail [13:29:09] PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: puppet fail [13:29:09] PROBLEM - puppet last run on mw1054 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on mw1149 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on mw1044 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on db1042 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on elastic1030 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on lvs2001 is CRITICAL: CRITICAL: puppet fail [13:29:10] PROBLEM - puppet last run on mw1129 is CRITICAL: CRITICAL: puppet fail [13:29:18] PROBLEM - puppet last run on logstash1002 is CRITICAL: CRITICAL: puppet fail [13:29:19] PROBLEM - puppet last run on cp4001 is CRITICAL: CRITICAL: puppet fail [13:29:19] PROBLEM - puppet last run on mw1011 
is CRITICAL: CRITICAL: puppet fail [13:29:19] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on db2038 is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on mw1076 is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on db2042 is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on snapshot1001 is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: puppet fail [13:29:29] PROBLEM - puppet last run on cp1058 is CRITICAL: CRITICAL: puppet fail [13:29:30] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: puppet fail [13:29:39] PROBLEM - puppet last run on mc1012 is CRITICAL: CRITICAL: puppet fail [13:29:39] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: puppet fail [13:29:39] PROBLEM - puppet last run on ms-be2011 is CRITICAL: CRITICAL: puppet fail [13:29:40] PROBLEM - puppet last run on db2007 is CRITICAL: CRITICAL: puppet fail [13:29:40] PROBLEM - puppet last run on db2040 is CRITICAL: CRITICAL: puppet fail [13:29:40] PROBLEM - puppet last run on install2001 is CRITICAL: CRITICAL: puppet fail [13:29:40] PROBLEM - puppet last run on virt1001 is CRITICAL: CRITICAL: puppet fail [13:29:40] PROBLEM - puppet last run on labnet1001 is CRITICAL: CRITICAL: puppet fail [13:29:48] PROBLEM - puppet last run on analytics1038 is CRITICAL: CRITICAL: puppet fail [13:29:48] PROBLEM - puppet last run on mw1237 is CRITICAL: CRITICAL: puppet fail [13:29:48] PROBLEM - puppet last run on mw1126 is CRITICAL: CRITICAL: puppet fail [13:29:49] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: puppet fail [13:29:49] PROBLEM - puppet last run on amssq34 is CRITICAL: CRITICAL: puppet fail [13:29:49] PROBLEM - puppet last run on amssq51 is CRITICAL: CRITICAL: puppet fail [13:29:49] PROBLEM - puppet last run on polonium is CRITICAL: CRITICAL: puppet fail [13:29:52] PROBLEM - puppet last run on 
antimony is CRITICAL: CRITICAL: puppet fail [13:29:52] PROBLEM - puppet last run on mw1177 is CRITICAL: CRITICAL: puppet fail [13:29:53] PROBLEM - puppet last run on virt1003 is CRITICAL: CRITICAL: puppet fail [13:29:53] PROBLEM - puppet last run on analytics1016 is CRITICAL: CRITICAL: puppet fail [13:29:59] PROBLEM - puppet last run on db1052 is CRITICAL: CRITICAL: puppet fail [13:29:59] PROBLEM - puppet last run on mw1055 is CRITICAL: CRITICAL: puppet fail [13:30:00] PROBLEM - puppet last run on mw1249 is CRITICAL: CRITICAL: puppet fail [13:30:00] PROBLEM - puppet last run on wtp1005 is CRITICAL: CRITICAL: puppet fail [13:30:00] PROBLEM - puppet last run on nescio is CRITICAL: CRITICAL: puppet fail [13:30:00] PROBLEM - puppet last run on mc1005 is CRITICAL: CRITICAL: puppet fail [13:30:08] PROBLEM - puppet last run on search1002 is CRITICAL: CRITICAL: puppet fail [13:30:09] PROBLEM - puppet last run on db1048 is CRITICAL: CRITICAL: puppet fail [13:30:09] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: puppet fail [13:30:09] PROBLEM - puppet last run on mw1208 is CRITICAL: CRITICAL: puppet fail [13:30:09] PROBLEM - puppet last run on cp1046 is CRITICAL: CRITICAL: puppet fail [13:30:18] PROBLEM - puppet last run on db1003 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on db1026 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on lvs3004 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on mw1051 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on wtp1012 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on mw1098 is CRITICAL: CRITICAL: puppet fail [13:30:19] PROBLEM - puppet last run on snapshot1002 is CRITICAL: CRITICAL: puppet fail [13:30:20] PROBLEM - puppet last run on mw1151 is CRITICAL: CRITICAL: puppet fail [13:30:20] PROBLEM - puppet last run on mw1247 is CRITICAL: CRITICAL: puppet fail [13:30:21] PROBLEM - puppet last run on es2004 
is CRITICAL: CRITICAL: puppet fail [13:30:21] PROBLEM - puppet last run on stat1003 is CRITICAL: CRITICAL: puppet fail [13:30:22] PROBLEM - puppet last run on dataset1001 is CRITICAL: CRITICAL: puppet fail [13:30:28] PROBLEM - puppet last run on db1043 is CRITICAL: CRITICAL: puppet fail [13:30:28] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: puppet fail [13:30:29] PROBLEM - puppet last run on mw1190 is CRITICAL: CRITICAL: puppet fail [13:30:29] PROBLEM - puppet last run on mw1195 is CRITICAL: CRITICAL: puppet fail [13:30:29] PROBLEM - puppet last run on mw1227 is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on lvs2006 is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on lithium is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on cp4019 is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on db1060 is CRITICAL: CRITICAL: puppet fail [13:30:39] PROBLEM - puppet last run on mw1202 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on elastic1019 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on mc1014 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on gadolinium is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on analytics1002 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on plutonium is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on mw1125 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on mw1049 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: puppet fail [13:30:50] PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: puppet fail [13:30:51] PROBLEM - puppet last run on db2023 is CRITICAL: CRITICAL: puppet fail [13:30:51] PROBLEM - puppet last run on db2016 is 
CRITICAL: CRITICAL: puppet fail [13:30:52] PROBLEM - puppet last run on nembus is CRITICAL: CRITICAL: puppet fail [13:30:52] PROBLEM - puppet last run on virt1004 is CRITICAL: CRITICAL: puppet fail [13:30:58] PROBLEM - puppet last run on search1005 is CRITICAL: CRITICAL: puppet fail [13:30:59] PROBLEM - puppet last run on mw1079 is CRITICAL: CRITICAL: puppet fail [13:30:59] PROBLEM - puppet last run on mw1111 is CRITICAL: CRITICAL: puppet fail [13:30:59] PROBLEM - puppet last run on mw1084 is CRITICAL: CRITICAL: puppet fail [13:30:59] PROBLEM - puppet last run on mw1133 is CRITICAL: CRITICAL: puppet fail [13:31:00] PROBLEM - puppet last run on analytics1026 is CRITICAL: CRITICAL: puppet fail [13:31:00] PROBLEM - puppet last run on analytics1013 is CRITICAL: CRITICAL: puppet fail [13:31:00] PROBLEM - puppet last run on amssq56 is CRITICAL: CRITICAL: puppet fail [13:31:00] PROBLEM - puppet last run on labmon1001 is CRITICAL: CRITICAL: puppet fail [13:31:01] PROBLEM - puppet last run on elastic1024 is CRITICAL: CRITICAL: puppet fail [13:31:01] PROBLEM - puppet last run on ms-be3001 is CRITICAL: CRITICAL: puppet fail [13:31:02] PROBLEM - puppet last run on amssq36 is CRITICAL: CRITICAL: puppet fail [13:31:02] PROBLEM - puppet last run on amssq40 is CRITICAL: CRITICAL: puppet fail [13:31:03] PROBLEM - puppet last run on ms-fe3002 is CRITICAL: CRITICAL: puppet fail [13:31:03] PROBLEM - puppet last run on sodium is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on mw1238 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on db1036 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on search1017 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on mw1014 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on lvs4003 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on virt1007 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run 
on mw1180 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on db1071 is CRITICAL: CRITICAL: puppet fail [13:31:10] PROBLEM - puppet last run on mw1004 is CRITICAL: CRITICAL: puppet fail [13:31:19] PROBLEM - puppet last run on db1020 is CRITICAL: CRITICAL: puppet fail [13:31:19] PROBLEM - puppet last run on thallium is CRITICAL: CRITICAL: puppet fail [13:31:19] PROBLEM - puppet last run on analytics1022 is CRITICAL: CRITICAL: puppet fail [13:31:19] PROBLEM - puppet last run on mw1258 is CRITICAL: CRITICAL: puppet fail [13:31:19] PROBLEM - puppet last run on db2037 is CRITICAL: CRITICAL: puppet fail [13:31:20] PROBLEM - puppet last run on ms-be2005 is CRITICAL: CRITICAL: puppet fail [13:31:20] PROBLEM - puppet last run on cp1050 is CRITICAL: CRITICAL: puppet fail [13:31:20] PROBLEM - puppet last run on mw1057 is CRITICAL: CRITICAL: puppet fail [13:31:28] PROBLEM - puppet last run on cp1048 is CRITICAL: CRITICAL: puppet fail [13:31:29] PROBLEM - puppet last run on mw1168 is CRITICAL: CRITICAL: puppet fail [13:31:29] PROBLEM - puppet last run on rhenium is CRITICAL: CRITICAL: puppet fail [13:31:29] PROBLEM - puppet last run on mw1183 is CRITICAL: CRITICAL: puppet fail [13:31:39] PROBLEM - puppet last run on snapshot1004 is CRITICAL: CRITICAL: puppet fail [13:31:39] PROBLEM - puppet last run on labsdb1006 is CRITICAL: CRITICAL: puppet fail [13:31:39] PROBLEM - puppet last run on mw1056 is CRITICAL: CRITICAL: puppet fail [13:31:39] PROBLEM - puppet last run on db2001 is CRITICAL: CRITICAL: puppet fail [13:31:39] PROBLEM - puppet last run on elastic1006 is CRITICAL: CRITICAL: puppet fail [13:31:40] PROBLEM - puppet last run on oxygen is CRITICAL: CRITICAL: puppet fail [13:31:40] PROBLEM - puppet last run on mc1001 is CRITICAL: CRITICAL: puppet fail [13:31:48] PROBLEM - puppet last run on rdb1001 is CRITICAL: CRITICAL: puppet fail [13:31:49] PROBLEM - puppet last run on hafnium is CRITICAL: CRITICAL: puppet fail [13:31:49] PROBLEM - puppet last run 
on db1069 is CRITICAL: CRITICAL: puppet fail [13:31:49] PROBLEM - puppet last run on ms-be1008 is CRITICAL: CRITICAL: puppet fail [13:31:49] PROBLEM - puppet last run on mw1034 is CRITICAL: CRITICAL: puppet fail [13:31:49] PROBLEM - puppet last run on tungsten is CRITICAL: CRITICAL: puppet fail [13:31:58] PROBLEM - puppet last run on virt1010 is CRITICAL: CRITICAL: puppet fail [13:31:58] PROBLEM - puppet last run on amssq42 is CRITICAL: CRITICAL: puppet fail [13:31:58] PROBLEM - puppet last run on cp1063 is CRITICAL: CRITICAL: puppet fail [13:31:58] PROBLEM - puppet last run on analytics1023 is CRITICAL: CRITICAL: puppet fail [13:31:59] PROBLEM - puppet last run on mw1146 is CRITICAL: CRITICAL: puppet fail [13:31:59] PROBLEM - puppet last run on ms-be2012 is CRITICAL: CRITICAL: puppet fail [13:32:08] PROBLEM - puppet last run on acamar is CRITICAL: CRITICAL: puppet fail [13:32:08] PROBLEM - puppet last run on ms-be2008 is CRITICAL: CRITICAL: puppet fail [13:32:09] PROBLEM - puppet last run on osmium is CRITICAL: CRITICAL: puppet fail [13:32:09] PROBLEM - puppet last run on mw1159 is CRITICAL: CRITICAL: puppet fail [13:32:09] PROBLEM - puppet last run on mw1181 is CRITICAL: CRITICAL: puppet fail [13:32:09] PROBLEM - puppet last run on mw1087 is CRITICAL: CRITICAL: puppet fail [13:32:10] PROBLEM - puppet last run on elastic1015 is CRITICAL: CRITICAL: puppet fail [13:32:11] PROBLEM - puppet last run on lvs1001 is CRITICAL: CRITICAL: puppet fail [13:32:11] PROBLEM - puppet last run on mw1023 is CRITICAL: CRITICAL: puppet fail [13:32:16] PROBLEM - puppet last run on amssq41 is CRITICAL: CRITICAL: puppet fail [13:32:19] PROBLEM - puppet last run on mw1156 is CRITICAL: CRITICAL: puppet fail [13:32:19] PROBLEM - puppet last run on hooft is CRITICAL: CRITICAL: puppet fail [13:32:19] PROBLEM - puppet last run on amssq62 is CRITICAL: CRITICAL: puppet fail [13:32:21] PROBLEM - puppet last run on db2004 is CRITICAL: CRITICAL: puppet fail [13:32:21] PROBLEM - puppet last run on 
mw1081 is CRITICAL: CRITICAL: puppet fail [13:32:21] PROBLEM - puppet last run on argon is CRITICAL: CRITICAL: puppet fail [13:32:21] PROBLEM - puppet last run on search1023 is CRITICAL: CRITICAL: puppet fail [13:32:21] PROBLEM - puppet last run on mw1050 is CRITICAL: CRITICAL: puppet fail [13:32:28] PROBLEM - puppet last run on wtp1002 is CRITICAL: CRITICAL: puppet fail [13:32:28] PROBLEM - puppet last run on db1004 is CRITICAL: CRITICAL: puppet fail [13:32:28] PROBLEM - puppet last run on cp1062 is CRITICAL: CRITICAL: puppet fail [13:32:28] PROBLEM - puppet last run on mw1116 is CRITICAL: CRITICAL: puppet fail [13:32:29] PROBLEM - puppet last run on rubidium is CRITICAL: CRITICAL: puppet fail [13:32:29] PROBLEM - puppet last run on search1024 is CRITICAL: CRITICAL: puppet fail [13:32:29] PROBLEM - puppet last run on mw1030 is CRITICAL: CRITICAL: puppet fail [13:32:30] PROBLEM - puppet last run on mw1188 is CRITICAL: CRITICAL: puppet fail [13:32:30] PROBLEM - puppet last run on wtp1018 is CRITICAL: CRITICAL: puppet fail [13:32:31] PROBLEM - puppet last run on mw1198 is CRITICAL: CRITICAL: puppet fail [13:32:38] PROBLEM - puppet last run on ms-be2001 is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on mw1171 is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on wtp1023 is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on bast4001 is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on ms-be1007 is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on haedus is CRITICAL: CRITICAL: puppet fail [13:32:39] PROBLEM - puppet last run on db1055 is CRITICAL: CRITICAL: puppet fail [13:32:48] PROBLEM - puppet last run on analytics1032 is CRITICAL: CRITICAL: puppet fail [13:32:48] PROBLEM - puppet last run on ms-be1012 is CRITICAL: CRITICAL: puppet fail [13:32:48] PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: puppet fail [13:32:58] PROBLEM - puppet last run on 
mw1248 is CRITICAL: CRITICAL: puppet fail [13:32:58] PROBLEM - puppet last run on mw1165 is CRITICAL: CRITICAL: puppet fail [13:32:58] PROBLEM - puppet last run on mw1163 is CRITICAL: CRITICAL: puppet fail [13:32:59] PROBLEM - puppet last run on elastic1011 is CRITICAL: CRITICAL: puppet fail [13:32:59] PROBLEM - puppet last run on cp1038 is CRITICAL: CRITICAL: puppet fail [13:32:59] PROBLEM - puppet last run on elastic1014 is CRITICAL: CRITICAL: puppet fail [13:33:08] PROBLEM - puppet last run on mw1053 is CRITICAL: CRITICAL: puppet fail [13:33:08] PROBLEM - puppet last run on analytics1037 is CRITICAL: CRITICAL: puppet fail [13:33:08] PROBLEM - puppet last run on wtp1011 is CRITICAL: CRITICAL: puppet fail [13:33:09] PROBLEM - puppet last run on wtp1022 is CRITICAL: CRITICAL: puppet fail [13:33:09] PROBLEM - puppet last run on amssq38 is CRITICAL: CRITICAL: puppet fail [13:33:09] PROBLEM - puppet last run on db1057 is CRITICAL: CRITICAL: puppet fail [13:33:09] PROBLEM - puppet last run on mw1212 is CRITICAL: CRITICAL: puppet fail [13:33:19] PROBLEM - puppet last run on es2007 is CRITICAL: CRITICAL: puppet fail [13:33:19] PROBLEM - puppet last run on mw1097 is CRITICAL: CRITICAL: puppet fail [13:33:19] PROBLEM - puppet last run on ms-be2007 is CRITICAL: CRITICAL: puppet fail [13:33:19] PROBLEM - puppet last run on ms-be1009 is CRITICAL: CRITICAL: puppet fail [13:33:19] PROBLEM - puppet last run on analytics1011 is CRITICAL: CRITICAL: puppet fail [13:33:20] PROBLEM - puppet last run on wtp1015 is CRITICAL: CRITICAL: puppet fail [13:33:20] PROBLEM - puppet last run on amslvs3 is CRITICAL: CRITICAL: puppet fail [13:33:28] PROBLEM - puppet last run on db1062 is CRITICAL: CRITICAL: puppet fail [13:33:29] PROBLEM - puppet last run on cp3012 is CRITICAL: CRITICAL: puppet fail [13:33:29] PROBLEM - puppet last run on search1015 is CRITICAL: CRITICAL: puppet fail [13:33:29] PROBLEM - puppet last run on mw1074 is CRITICAL: CRITICAL: puppet fail [13:33:39] PROBLEM - puppet last 
run on mw1029 is CRITICAL: CRITICAL: puppet fail [13:33:39] PROBLEM - puppet last run on mw1210 is CRITICAL: CRITICAL: puppet fail [13:33:39] PROBLEM - puppet last run on mw1148 is CRITICAL: CRITICAL: puppet fail [13:33:49] PROBLEM - puppet last run on es1002 is CRITICAL: CRITICAL: puppet fail [13:33:49] PROBLEM - puppet last run on wtp1004 is CRITICAL: CRITICAL: puppet fail [13:33:50] PROBLEM - puppet last run on wtp1013 is CRITICAL: CRITICAL: puppet fail [13:33:50] PROBLEM - puppet last run on rcs1002 is CRITICAL: CRITICAL: puppet fail [13:33:50] PROBLEM - puppet last run on mw1239 is CRITICAL: CRITICAL: puppet fail [13:33:58] PROBLEM - puppet last run on titanium is CRITICAL: CRITICAL: puppet fail [13:33:58] PROBLEM - puppet last run on mw1185 is CRITICAL: CRITICAL: puppet fail [13:33:59] PROBLEM - puppet last run on db1054 is CRITICAL: CRITICAL: puppet fail [13:33:59] PROBLEM - puppet last run on db1001 is CRITICAL: CRITICAL: puppet fail [13:33:59] PROBLEM - puppet last run on labsdb1004 is CRITICAL: CRITICAL: puppet fail [13:33:59] PROBLEM - puppet last run on ocg1003 is CRITICAL: CRITICAL: puppet fail [13:33:59] PROBLEM - puppet last run on elastic1005 is CRITICAL: CRITICAL: puppet fail [13:34:08] PROBLEM - puppet last run on elastic1002 is CRITICAL: CRITICAL: puppet fail [13:34:09] PROBLEM - puppet last run on mc1013 is CRITICAL: CRITICAL: puppet fail [13:34:09] PROBLEM - puppet last run on analytics1014 is CRITICAL: CRITICAL: puppet fail [13:34:09] PROBLEM - puppet last run on wtp1007 is CRITICAL: CRITICAL: puppet fail [13:34:09] PROBLEM - puppet last run on mw1077 is CRITICAL: CRITICAL: puppet fail [13:34:19] PROBLEM - puppet last run on db1063 is CRITICAL: CRITICAL: puppet fail [13:34:19] PROBLEM - puppet last run on protactinium is CRITICAL: CRITICAL: puppet fail [13:34:19] PROBLEM - puppet last run on rdb1002 is CRITICAL: CRITICAL: puppet fail [13:34:19] PROBLEM - puppet last run on mw1243 is CRITICAL: CRITICAL: puppet fail [13:34:19] PROBLEM - puppet 
last run on db2011 is CRITICAL: CRITICAL: puppet fail [13:34:20] PROBLEM - puppet last run on rcs1001 is CRITICAL: CRITICAL: puppet fail [13:34:20] PROBLEM - puppet last run on mw1032 is CRITICAL: CRITICAL: puppet fail [13:34:22] akosiaris: is the storm from the admin.yaml typo? [13:34:29] PROBLEM - puppet last run on mw1186 is CRITICAL: CRITICAL: puppet fail [13:34:29] PROBLEM - puppet last run on lvs2003 is CRITICAL: CRITICAL: puppet fail [13:34:30] PROBLEM - puppet last run on nitrogen is CRITICAL: CRITICAL: puppet fail [13:34:30] PROBLEM - puppet last run on db2003 is CRITICAL: CRITICAL: puppet fail [13:34:30] PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on mc1007 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on mw1167 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on db1064 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on mw1108 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on pc1003 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on mw1225 is CRITICAL: CRITICAL: puppet fail [13:34:40] PROBLEM - puppet last run on rdb1003 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on mw1122 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on cp3004 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on mw1105 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on search1013 is CRITICAL: CRITICAL: puppet fail [13:34:50] PROBLEM - puppet last run on virt1000 is CRITICAL: CRITICAL: puppet fail [13:34:52] PROBLEM - puppet last run on db1030 is CRITICAL: CRITICAL: puppet fail [13:34:53] yup [13:35:01] (03CR) 10Yuvipanda: [C: 032] Typo fix 
for d7fc542 [puppet] - 10https://gerrit.wikimedia.org/r/183844 (owner: 10Alexandros Kosiaris) [13:35:01] PROBLEM - puppet last run on analytics1001 is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on mw1022 is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on db1038 is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on mw1033 is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on mw1043 is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on fluorine is CRITICAL: CRITICAL: puppet fail [13:35:01] PROBLEM - puppet last run on mw1201 is CRITICAL: CRITICAL: puppet fail [13:35:10] PROBLEM - puppet last run on mw1121 is CRITICAL: CRITICAL: puppet fail [13:35:10] PROBLEM - puppet last run on wtp1003 is CRITICAL: CRITICAL: puppet fail [13:35:10] PROBLEM - puppet last run on db1070 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on db1047 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on elastic1023 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on ms-fe2002 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on search1006 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on mw1139 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on db1044 is CRITICAL: CRITICAL: puppet fail [13:35:19] PROBLEM - puppet last run on elastic1017 is CRITICAL: CRITICAL: puppet fail [13:35:20] PROBLEM - puppet last run on search1022 is CRITICAL: CRITICAL: puppet fail [13:35:21] PROBLEM - puppet last run on iodine is CRITICAL: CRITICAL: puppet fail [13:35:21] PROBLEM - puppet last run on mw1024 is CRITICAL: CRITICAL: puppet fail [13:35:33] PROBLEM - puppet last run on db1061 is CRITICAL: CRITICAL: puppet fail [13:35:34] PROBLEM - puppet last run on analytics1004 is CRITICAL: CRITICAL: puppet fail [13:35:34] PROBLEM - puppet last run on mw1229 is CRITICAL: 
CRITICAL: puppet fail [13:35:39] PROBLEM - puppet last run on mw1001 is CRITICAL: CRITICAL: puppet fail [13:35:40] PROBLEM - puppet last run on db2028 is CRITICAL: CRITICAL: puppet fail [13:35:40] PROBLEM - puppet last run on mw1219 is CRITICAL: CRITICAL: puppet fail [13:35:40] PROBLEM - puppet last run on mw1142 is CRITICAL: CRITICAL: puppet fail [13:35:49] PROBLEM - puppet last run on cp4020 is CRITICAL: CRITICAL: puppet fail [13:35:49] PROBLEM - puppet last run on dbstore1002 is CRITICAL: CRITICAL: puppet fail [13:35:49] PROBLEM - puppet last run on mercury is CRITICAL: CRITICAL: puppet fail [13:35:49] PROBLEM - puppet last run on db1033 is CRITICAL: CRITICAL: puppet fail [13:35:49] PROBLEM - puppet last run on cp1052 is CRITICAL: CRITICAL: puppet fail [13:35:50] PROBLEM - puppet last run on mw1223 is CRITICAL: CRITICAL: puppet fail [13:35:50] PROBLEM - puppet last run on mw1091 is CRITICAL: CRITICAL: puppet fail [13:35:50] PROBLEM - puppet last run on cp1037 is CRITICAL: CRITICAL: puppet fail [13:35:51] PROBLEM - puppet last run on mw1016 is CRITICAL: CRITICAL: puppet fail [13:35:51] PROBLEM - puppet last run on amssq31 is CRITICAL: CRITICAL: puppet fail [13:35:51] PROBLEM - puppet last run on amssq39 is CRITICAL: CRITICAL: puppet fail [13:35:52] PROBLEM - puppet last run on mw1152 is CRITICAL: CRITICAL: puppet fail [13:35:52] PROBLEM - puppet last run on search1011 is CRITICAL: CRITICAL: puppet fail [13:35:59] PROBLEM - puppet last run on mw1236 is CRITICAL: CRITICAL: puppet fail [13:35:59] PROBLEM - puppet last run on analytics1028 is CRITICAL: CRITICAL: puppet fail [13:35:59] PROBLEM - puppet last run on mw1193 is CRITICAL: CRITICAL: puppet fail [13:36:01] PROBLEM - puppet last run on mw1093 is CRITICAL: CRITICAL: puppet fail [13:36:01] PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: puppet fail [13:36:07] thanks YuviPanda [13:36:09] PROBLEM - puppet last run on db1006 is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run 
on mw1209 is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run on analytics1027 is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run on mw1231 is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run on db1011 is CRITICAL: CRITICAL: puppet fail [13:36:09] PROBLEM - puppet last run on mw1010 is CRITICAL: CRITICAL: puppet fail [13:36:10] PROBLEM - puppet last run on db2012 is CRITICAL: CRITICAL: puppet fail [13:36:20] PROBLEM - puppet last run on iridium is CRITICAL: CRITICAL: puppet fail [13:36:20] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: puppet fail [13:36:29] PROBLEM - puppet last run on ms-be1011 is CRITICAL: CRITICAL: puppet fail [13:36:29] PROBLEM - puppet last run on cp1044 is CRITICAL: CRITICAL: puppet fail [13:36:29] PROBLEM - puppet last run on strontium is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on d-i-test is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on mw1143 is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on es1010 is CRITICAL: CRITICAL: puppet fail [13:36:40] PROBLEM - puppet last run on dbstore2002 is CRITICAL: CRITICAL: puppet fail [13:36:48] PROBLEM - puppet last run on calcium is CRITICAL: CRITICAL: puppet fail [13:36:49] PROBLEM - puppet last run on cp1067 is CRITICAL: CRITICAL: puppet fail [13:36:49] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: puppet fail [13:36:49] RECOVERY - puppet last run on mw1032 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:36:50] PROBLEM - puppet last run on mw1086 is CRITICAL: CRITICAL: puppet fail [13:37:00] PROBLEM - puppet last run on achernar is CRITICAL: CRITICAL: puppet 
fail [13:37:00] PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: puppet fail [13:37:00] PROBLEM - puppet last run on dysprosium is CRITICAL: CRITICAL: puppet fail [13:37:00] PROBLEM - puppet last run on cp4002 is CRITICAL: CRITICAL: puppet fail [13:37:00] PROBLEM - puppet last run on cp4009 is CRITICAL: CRITICAL: puppet fail [13:37:01] PROBLEM - puppet last run on analytics1018 is CRITICAL: CRITICAL: puppet fail [13:37:01] PROBLEM - puppet last run on cp1040 is CRITICAL: CRITICAL: puppet fail [13:37:01] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: puppet fail [13:37:01] PROBLEM - puppet last run on radium is CRITICAL: CRITICAL: puppet fail [13:37:04] PROBLEM - puppet last run on chromium is CRITICAL: CRITICAL: puppet fail [13:37:04] PROBLEM - puppet last run on elastic1020 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on mw1107 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on mw1064 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on amssq33 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on osm-cp1001 is CRITICAL: CRITICAL: puppet fail [13:37:09] PROBLEM - puppet last run on mw1071 is CRITICAL: CRITICAL: puppet fail [13:37:11] PROBLEM - puppet last run on amssq37 is CRITICAL: CRITICAL: puppet fail [13:37:11] PROBLEM - puppet last run on mw1027 is CRITICAL: CRITICAL: puppet fail [13:37:11] PROBLEM - puppet last run on mw1158 is CRITICAL: CRITICAL: puppet fail [13:37:11] PROBLEM - puppet last run on virt1008 is CRITICAL: CRITICAL: puppet fail [13:37:12] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: puppet fail [13:37:20] PROBLEM - puppet last run on mw1113 is CRITICAL: CRITICAL: puppet fail [13:37:20] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: 
puppet fail [13:37:20] PROBLEM - puppet last run on wtp1010 is CRITICAL: CRITICAL: puppet fail [13:37:20] PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on elastic1003 is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on search1004 is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on baham is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on cp4012 is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on rbf1001 is CRITICAL: CRITICAL: puppet fail [13:37:21] PROBLEM - puppet last run on virt1011 is CRITICAL: CRITICAL: puppet fail [13:37:28] PROBLEM - puppet last run on mw1037 is CRITICAL: CRITICAL: puppet fail [13:37:28] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: puppet fail [13:37:29] PROBLEM - puppet last run on mw1110 is CRITICAL: CRITICAL: puppet fail [13:37:29] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: puppet fail [13:37:29] PROBLEM - puppet last run on mw1241 is CRITICAL: CRITICAL: puppet fail [13:37:29] PROBLEM - puppet last run on mw1066 is CRITICAL: CRITICAL: puppet fail [13:37:30] PROBLEM - puppet last run on search1012 is CRITICAL: CRITICAL: puppet fail [13:37:30] PROBLEM - puppet last run on mw1203 is CRITICAL: CRITICAL: puppet fail [13:37:30] PROBLEM - puppet last run on mw1207 is CRITICAL: CRITICAL: puppet fail [13:37:30] PROBLEM - puppet last run on db1072 is CRITICAL: CRITICAL: puppet fail [13:37:39] PROBLEM - puppet last run on wtp1001 is CRITICAL: CRITICAL: puppet fail [13:37:41] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: puppet fail [13:37:41] PROBLEM - puppet last run on db1035 is CRITICAL: CRITICAL: puppet fail [13:37:41] PROBLEM - puppet last run on db1056 is CRITICAL: CRITICAL: puppet fail [13:37:41] PROBLEM - puppet last run on cp1053 is CRITICAL: CRITICAL: puppet fail [13:37:49] PROBLEM - puppet last run on wtp1008 is CRITICAL: CRITICAL: puppet 
fail [13:37:49] PROBLEM - puppet last run on mw1112 is CRITICAL: CRITICAL: puppet fail [13:37:49] PROBLEM - puppet last run on cp1045 is CRITICAL: CRITICAL: puppet fail [13:37:59] PROBLEM - puppet last run on lvs4004 is CRITICAL: CRITICAL: puppet fail [13:37:59] PROBLEM - puppet last run on mw1255 is CRITICAL: CRITICAL: puppet fail [13:37:59] PROBLEM - puppet last run on cp3011 is CRITICAL: CRITICAL: puppet fail [13:38:10] PROBLEM - puppet last run on mw1253 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on elastic1029 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on ms-fe1002 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on db1037 is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on eeden is CRITICAL: CRITICAL: puppet fail [13:38:20] PROBLEM - puppet last run on mw1154 is CRITICAL: CRITICAL: puppet fail [13:38:30] PROBLEM - puppet last run on cp3005 is CRITICAL: CRITICAL: puppet fail [13:38:30] PROBLEM - puppet last run on mw1104 is CRITICAL: CRITICAL: puppet fail [13:38:38] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: puppet fail [13:38:39] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: puppet fail [13:38:39] PROBLEM - puppet last run on tmh1001 is CRITICAL: CRITICAL: puppet fail [13:38:40] PROBLEM - puppet last run on mw1155 is CRITICAL: CRITICAL: puppet fail [13:38:48] PROBLEM - puppet last run on radon is CRITICAL: CRITICAL: puppet fail [13:38:49] PROBLEM - puppet last run on es2010 is CRITICAL: CRITICAL: puppet fail [13:38:59] PROBLEM - puppet last run on mw1021 is CRITICAL: CRITICAL: puppet fail [13:41:59] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [13:42:19] RECOVERY - puppet last 
run on analytics1033 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:42:19] RECOVERY - puppet last run on neptunium is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [13:42:28] RECOVERY - puppet last run on carbon is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:42:39] RECOVERY - puppet last run on mw1200 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:42:59] RECOVERY - puppet last run on cp1055 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [13:43:09] RECOVERY - puppet last run on db1031 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:43:09] RECOVERY - puppet last run on wtp1020 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:43:09] RECOVERY - puppet last run on es2008 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [13:43:09] RECOVERY - puppet last run on snapshot1003 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:43:09] RECOVERY - puppet last run on mw1187 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:43:10] RECOVERY - puppet last run on elastic1001 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:43:18] RECOVERY - puppet last run on analytics1025 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [13:43:18] RECOVERY - puppet last run on search1016 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [13:43:19] RECOVERY - puppet last run on mw1012 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:43:20] RECOVERY - puppet last run on potassium is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:43:20] RECOVERY - puppet last run on mw1250 is OK: OK: Puppet is currently 
enabled, last run 9 seconds ago with 0 failures [13:43:29] RECOVERY - puppet last run on analytics1040 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:43:29] RECOVERY - puppet last run on es2002 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:43:29] RECOVERY - puppet last run on mw1224 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:43:38] RECOVERY - puppet last run on rbf1002 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [13:43:39] RECOVERY - puppet last run on analytics1041 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:43:39] RECOVERY - puppet last run on gold is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:43:40] RECOVERY - puppet last run on es1008 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:43:40] RECOVERY - puppet last run on ms-fe2004 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [13:43:40] RECOVERY - puppet last run on cp1039 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [13:43:40] RECOVERY - puppet last run on analytics1020 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [13:43:40] RECOVERY - puppet last run on db1073 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:43:48] RECOVERY - puppet last run on elastic1004 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:43:49] RECOVERY - puppet last run on netmon1001 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:43:49] RECOVERY - puppet last run on mc1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:43:49] RECOVERY - puppet last run on mw1160 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures 
[13:43:59] RECOVERY - puppet last run on ms-be1003 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:43:59] RECOVERY - puppet last run on db1022 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:44:00] RECOVERY - puppet last run on db1050 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [13:44:00] RECOVERY - puppet last run on lead is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:44:09] RECOVERY - puppet last run on ms1004 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:44:09] RECOVERY - puppet last run on mw1254 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:44:09] RECOVERY - puppet last run on mw1041 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:09] RECOVERY - puppet last run on db2005 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:44:09] RECOVERY - puppet last run on db2009 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:44:09] RECOVERY - puppet last run on elastic1018 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:44:10] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [13:44:10] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:10] RECOVERY - puppet last run on xenon is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [13:44:18] RECOVERY - puppet last run on helium is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:44:18] RECOVERY - puppet last run on mw1164 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:44:18] RECOVERY - puppet last run on lvs1002 is OK: OK: Puppet is currently 
enabled, last run 5 seconds ago with 0 failures [13:44:18] RECOVERY - puppet last run on lvs1005 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:44:19] RECOVERY - puppet last run on ms-be1006 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:44:19] RECOVERY - puppet last run on heze is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:44:19] RECOVERY - puppet last run on db2019 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:44:20] RECOVERY - puppet last run on es2009 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:44:20] RECOVERY - puppet last run on db2039 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:21] RECOVERY - puppet last run on platinum is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [13:44:28] RECOVERY - puppet last run on elastic1012 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:44:28] RECOVERY - puppet last run on ms-be2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:28] RECOVERY - puppet last run on ms-be2006 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:44:28] RECOVERY - puppet last run on mw1082 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [13:44:29] RECOVERY - puppet last run on mw1226 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [13:44:29] RECOVERY - puppet last run on mc1003 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:44:29] RECOVERY - puppet last run on analytics1035 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:44:38] RECOVERY - puppet last run on mw1026 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:39] 
RECOVERY - puppet last run on wtp1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:39] RECOVERY - puppet last run on amssq53 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:44:39] RECOVERY - puppet last run on amssq61 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:44:39] RECOVERY - puppet last run on cp1047 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:40] RECOVERY - puppet last run on dbstore2001 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [13:44:40] RECOVERY - puppet last run on mw1153 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [13:44:48] RECOVERY - puppet last run on elastic1007 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:44:48] RECOVERY - puppet last run on search1010 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [13:44:48] RECOVERY - puppet last run on mc1002 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:44:48] RECOVERY - puppet last run on db2034 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:49] RECOVERY - puppet last run on mw1222 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:44:49] RECOVERY - puppet last run on elastic1008 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:44:49] RECOVERY - puppet last run on mw1174 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:44:50] RECOVERY - puppet last run on mw1068 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:44:50] RECOVERY - puppet last run on mw1173 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:44:58] RECOVERY - puppet last run on ms-fe1001 is OK: OK: Puppet is 
currently enabled, last run 1 minute ago with 0 failures [13:44:58] RECOVERY - puppet last run on es2001 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:44:59] RECOVERY - puppet last run on wtp1016 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:44:59] RECOVERY - puppet last run on mw1176 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:44:59] RECOVERY - puppet last run on mw1046 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:45:08] RECOVERY - puppet last run on db1066 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:08] RECOVERY - puppet last run on search1001 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:45:08] RECOVERY - puppet last run on iron is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:45:08] RECOVERY - puppet last run on mw1117 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:45:09] RECOVERY - puppet last run on db2002 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:45:18] RECOVERY - puppet last run on virt1006 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:45:18] RECOVERY - puppet last run on lvs3001 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:45:18] RECOVERY - puppet last run on mw1189 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:45:18] RECOVERY - puppet last run on mw1088 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:19] RECOVERY - puppet last run on mw1150 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:45:19] RECOVERY - puppet last run on capella is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:45:19] RECOVERY - 
puppet last run on mw1228 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:45:20] RECOVERY - puppet last run on mw1235 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:45:20] RECOVERY - puppet last run on mw1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:21] RECOVERY - puppet last run on mw1242 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:45:28] RECOVERY - puppet last run on ms-be2004 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:45:28] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:45:28] RECOVERY - puppet last run on amssq32 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [13:45:28] RECOVERY - puppet last run on mw1099 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:45:29] RECOVERY - puppet last run on db1018 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:45:29] RECOVERY - puppet last run on labsdb1003 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:45:29] RECOVERY - puppet last run on elastic1021 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:30] RECOVERY - puppet last run on db1046 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:45:38] RECOVERY - puppet last run on ms-fe2001 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:45:38] RECOVERY - puppet last run on mw1217 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:45:39] RECOVERY - puppet last run on mw1100 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:45:39] RECOVERY - puppet last run on mw1008 is OK: OK: Puppet is currently enabled, 
last run 1 minute ago with 0 failures [13:45:39] RECOVERY - puppet last run on mw1205 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:39] RECOVERY - puppet last run on db1015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:39] RECOVERY - puppet last run on db1040 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:45:48] RECOVERY - puppet last run on cp1061 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:45:48] RECOVERY - puppet last run on amssq35 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:45:48] RECOVERY - puppet last run on ruthenium is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:45:49] RECOVERY - puppet last run on search1018 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:45:58] RECOVERY - puppet last run on db1051 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:45:58] RECOVERY - puppet last run on mw1123 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:45:58] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:45:59] RECOVERY - puppet last run on db2036 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:45:59] RECOVERY - puppet last run on analytics1030 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:59] RECOVERY - puppet last run on mw1060 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:08] RECOVERY - puppet last run on mw1120 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:08] RECOVERY - puppet last run on mw1052 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [13:46:09] RECOVERY - puppet last 
run on cp1056 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:46:09] RECOVERY - puppet last run on db1023 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:46:09] RECOVERY - puppet last run on mw1009 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:09] RECOVERY - puppet last run on mw1144 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:46:09] RECOVERY - puppet last run on mw1069 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:09] RECOVERY - puppet last run on lvs2004 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:46:10] RECOVERY - puppet last run on mw1061 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:46:10] RECOVERY - puppet last run on dbproxy1001 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:46:18] RECOVERY - puppet last run on db1042 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [13:46:18] RECOVERY - puppet last run on elastic1030 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:46:18] RECOVERY - puppet last run on ms-fe1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:19] RECOVERY - puppet last run on lvs2001 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:46:19] RECOVERY - puppet last run on db1067 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:46:19] RECOVERY - puppet last run on db1059 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:46:19] RECOVERY - puppet last run on logstash1002 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:46:29] RECOVERY - puppet last run on db1002 is OK: OK: Puppet is currently enabled, last run 1 
minute ago with 0 failures [13:46:38] RECOVERY - puppet last run on mw1025 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:46:39] RECOVERY - puppet last run on db2018 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:46:39] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:46:39] RECOVERY - puppet last run on elastic1027 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:39] RECOVERY - puppet last run on cp3016 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [13:46:39] RECOVERY - puppet last run on cp1058 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [13:46:48] RECOVERY - puppet last run on db2040 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:46:49] RECOVERY - puppet last run on search1007 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:46:49] RECOVERY - puppet last run on mw1114 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:46:49] RECOVERY - puppet last run on mw1092 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:46:58] RECOVERY - puppet last run on mw1065 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:46:58] RECOVERY - puppet last run on db1028 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [13:46:59] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:46:59] RECOVERY - puppet last run on mw1177 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:46:59] RECOVERY - puppet last run on mw1172 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:46:59] RECOVERY - puppet last run on db1052 is 
OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:46:59] RECOVERY - puppet last run on analytics1016 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures [13:47:08] RECOVERY - puppet last run on cp3014 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:47:08] RECOVERY - puppet last run on mw1213 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:47:08] RECOVERY - puppet last run on mw1166 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:08] RECOVERY - puppet last run on elastic1022 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [13:47:09] RECOVERY - puppet last run on ms-fe2003 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:47:09] RECOVERY - puppet last run on wtp1005 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:47:09] RECOVERY - puppet last run on db1021 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:10] RECOVERY - puppet last run on db1034 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:10] RECOVERY - puppet last run on analytics1010 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [13:47:18] RECOVERY - puppet last run on cp4003 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:47:18] RECOVERY - puppet last run on mw1175 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:47:18] RECOVERY - puppet last run on mw1039 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:47:19] RECOVERY - puppet last run on mw1118 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:19] RECOVERY - puppet last run on mw1251 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 
failures [13:47:19] RECOVERY - puppet last run on mw1042 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:19] RECOVERY - puppet last run on es1007 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:47:20] RECOVERY - puppet last run on pc1002 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:47:28] RECOVERY - puppet last run on labcontrol2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:28] RECOVERY - puppet last run on wtp1012 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:47:29] RECOVERY - puppet last run on mw1054 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:47:29] RECOVERY - puppet last run on amssq55 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:47:29] RECOVERY - puppet last run on amslvs1 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:47:29] RECOVERY - puppet last run on stat1003 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:47:29] RECOVERY - puppet last run on mw1129 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:47:39] RECOVERY - puppet last run on db1043 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:47:39] RECOVERY - puppet last run on dataset1001 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:47:40] RECOVERY - puppet last run on mw1195 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:47:40] RECOVERY - puppet last run on mw1011 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:47:40] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:47:48] RECOVERY - puppet last run on snapshot1001 is 
OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:47:48] RECOVERY - puppet last run on lithium is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:47:49] RECOVERY - puppet last run on db2038 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:47:49] RECOVERY - puppet last run on db2042 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:47:49] RECOVERY - puppet last run on cp4014 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:47:58] RECOVERY - puppet last run on mc1012 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:47:58] RECOVERY - puppet last run on mw1211 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:47:58] RECOVERY - puppet last run on virt1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:59] RECOVERY - puppet last run on ms-be2011 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:47:59] RECOVERY - puppet last run on virt1001 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:47:59] RECOVERY - puppet last run on db2007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:47:59] RECOVERY - puppet last run on labnet1001 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:48:00] RECOVERY - puppet last run on install2001 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:48:08] RECOVERY - puppet last run on analytics1038 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:48:08] RECOVERY - puppet last run on mw1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:48:08] RECOVERY - puppet last run on mw1126 is OK: OK: Puppet is currently enabled, last run 1 minute ago 
with 0 failures [13:48:18] RECOVERY - puppet last run on polonium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:48:18] RECOVERY - puppet last run on antimony is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:48:18] RECOVERY - puppet last run on virt1003 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:48:18] RECOVERY - puppet last run on db1036 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:48:18] RECOVERY - puppet last run on amssq46 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:48:18] RECOVERY - puppet last run on amssq47 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:48:19] RECOVERY - puppet last run on amssq48 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:48:19] RECOVERY - puppet last run on mw1055 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:48:20] RECOVERY - puppet last run on mw1249 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:48:20] RECOVERY - puppet last run on mc1005 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:48:28] RECOVERY - puppet last run on search1002 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:48:28] RECOVERY - puppet last run on nescio is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:48:29] RECOVERY - puppet last run on db1048 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:48:29] RECOVERY - puppet last run on analytics1022 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:48:29] RECOVERY - puppet last run on mw1206 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [13:48:29] RECOVERY - puppet last run on cp1050 is 
OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:48:29] RECOVERY - puppet last run on mw1208 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:48:30] RECOVERY - puppet last run on mw1162 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:48:38] RECOVERY - puppet last run on db1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:48:38] RECOVERY - puppet last run on db1026 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [13:48:38] RECOVERY - puppet last run on cp4004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:48:39] RECOVERY - puppet last run on mw1051 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:48:39] RECOVERY - puppet last run on lvs3004 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:48:39] RECOVERY - puppet last run on mw1149 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:48:39] RECOVERY - puppet last run on snapshot1002 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:48:48] RECOVERY - puppet last run on mw1044 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:48:48] RECOVERY - puppet last run on mw1247 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:48:49] RECOVERY - puppet last run on es2004 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:48:49] RECOVERY - puppet last run on elastic1006 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:48:58] RECOVERY - puppet last run on db1069 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:48:58] RECOVERY - puppet last run on cp4001 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures 
[13:48:58] RECOVERY - puppet last run on ms-be1008 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:48:58] RECOVERY - puppet last run on mw1076 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:48:58] RECOVERY - puppet last run on lvs2006 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:48:59] RECOVERY - puppet last run on db1060 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:48:59] PROBLEM - Varnishkafka Delivery Errors per minute on cp3020 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [20000.0] [13:49:08] RECOVERY - puppet last run on elastic1019 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:08] RECOVERY - puppet last run on ms-be3002 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:49:08] RECOVERY - puppet last run on mc1014 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:49:08] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:49:09] RECOVERY - puppet last run on analytics1023 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:49:09] RECOVERY - puppet last run on gadolinium is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [13:49:09] RECOVERY - puppet last run on analytics1002 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:49:10] RECOVERY - puppet last run on plutonium is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:49:10] RECOVERY - puppet last run on mw1049 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:49:11] RECOVERY - puppet last run on db1039 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:49:18] RECOVERY - puppet last 
run on db2029 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:49:18] RECOVERY - puppet last run on db2016 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:49:18] RECOVERY - puppet last run on osmium is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:49:18] RECOVERY - puppet last run on search1005 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [13:49:19] RECOVERY - puppet last run on nembus is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:49:19] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [13:49:19] RECOVERY - puppet last run on mw1181 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:49:20] RECOVERY - puppet last run on mw1237 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:49:20] RECOVERY - puppet last run on mw1079 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [13:49:21] RECOVERY - puppet last run on elastic1015 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:49:21] RECOVERY - puppet last run on mw1084 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:49:22] RECOVERY - puppet last run on mw1133 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [13:49:22] RECOVERY - puppet last run on analytics1026 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:28] RECOVERY - puppet last run on analytics1013 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:49:28] RECOVERY - puppet last run on labmon1001 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:49:28] RECOVERY - puppet last run on amssq56 is OK: OK: Puppet is currently enabled, last 
run 15 seconds ago with 0 failures [13:49:28] RECOVERY - puppet last run on elastic1024 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:49:29] RECOVERY - puppet last run on sodium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:49:29] RECOVERY - puppet last run on mw1238 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [13:49:29] RECOVERY - puppet last run on search1017 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [13:49:30] RECOVERY - puppet last run on mw1014 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:49:30] RECOVERY - puppet last run on ms-be3001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:49:31] RECOVERY - puppet last run on amssq34 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:49:31] RECOVERY - puppet last run on amssq60 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:32] RECOVERY - puppet last run on ms-fe3002 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:49:32] RECOVERY - puppet last run on cp3010 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:49:33] RECOVERY - puppet last run on amssq51 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:49:33] RECOVERY - puppet last run on db2004 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:49:34] RECOVERY - puppet last run on argon is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:49:34] RECOVERY - puppet last run on virt1007 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:49:35] RECOVERY - puppet last run on db1004 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:49:35] RECOVERY - puppet 
last run on lvs4003 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:49:38] RECOVERY - puppet last run on mw1180 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:49:38] RECOVERY - puppet last run on cp1062 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:49:38] RECOVERY - puppet last run on db1071 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:49:38] RECOVERY - puppet last run on rubidium is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:49:39] RECOVERY - puppet last run on db1020 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:49:39] RECOVERY - puppet last run on thallium is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:49:48] RECOVERY - puppet last run on wtp1018 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:49:48] RECOVERY - puppet last run on db2037 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:49:48] RECOVERY - puppet last run on cp1046 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:49:49] RECOVERY - puppet last run on cp1048 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:49:49] RECOVERY - puppet last run on mw1057 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:49:49] RECOVERY - puppet last run on ms-be2005 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:49:49] RECOVERY - puppet last run on bast4001 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:49:50] RECOVERY - puppet last run on mw1168 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:49:50] RECOVERY - puppet last run on rhenium is OK: OK: Puppet is currently enabled, last run 37 
seconds ago with 0 failures [13:49:58] RECOVERY - puppet last run on ms-be1012 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:59] RECOVERY - puppet last run on mw1098 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:59] RECOVERY - puppet last run on mw1151 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:49:59] RECOVERY - puppet last run on snapshot1004 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:49:59] RECOVERY - puppet last run on labsdb1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:49:59] RECOVERY - puppet last run on stat1002 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [13:49:59] RECOVERY - puppet last run on mw1190 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:00] RECOVERY - puppet last run on db2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:00] RECOVERY - puppet last run on cp4018 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on mw1227 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on oxygen is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on mw1165 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on mc1001 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on rdb1001 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on cp1038 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:50:09] RECOVERY - puppet last run on hafnium 
is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:50:10] RECOVERY - puppet last run on tungsten is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [13:50:10] RECOVERY - puppet last run on mw1034 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:50:18] RECOVERY - puppet last run on wtp1022 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:50:18] RECOVERY - puppet last run on mw1202 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:18] RECOVERY - puppet last run on virt1010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:18] RECOVERY - puppet last run on cp1063 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:19] RECOVERY - puppet last run on cp4005 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:50:19] RECOVERY - puppet last run on cp4019 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:19] RECOVERY - puppet last run on mw1146 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:50:20] RECOVERY - puppet last run on mw1125 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:50:28] RECOVERY - puppet last run on ms-be1009 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [13:50:28] RECOVERY - puppet last run on db2023 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:50:28] RECOVERY - puppet last run on ms-be2012 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:50:29] RECOVERY - puppet last run on ms-be2008 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:50:29] RECOVERY - puppet last run on mw1159 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 
failures [13:50:29] RECOVERY - puppet last run on mw1111 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:50:33] (CR) Mark Bergsma: WIP: Reuse parsoid varnish for cxserver in beta (1 comment) [puppet] - https://gerrit.wikimedia.org/r/181613 (https://phabricator.wikimedia.org/T76200) (owner: Alexandros Kosiaris) [13:50:38] RECOVERY - puppet last run on db1062 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:50:38] RECOVERY - puppet last run on lvs1001 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:50:38] RECOVERY - puppet last run on mw1023 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:50:38] RECOVERY - puppet last run on mw1156 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:38] RECOVERY - puppet last run on amssq41 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:50:39] RECOVERY - puppet last run on mw1081 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures [13:50:48] RECOVERY - puppet last run on amssq40 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:48] RECOVERY - puppet last run on amssq36 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:50:48] RECOVERY - puppet last run on mw1029 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:50:48] RECOVERY - puppet last run on mw1210 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:50:48] RECOVERY - puppet last run on search1023 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:50:49] RECOVERY - puppet last run on wtp1002 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:50:49] RECOVERY - puppet last run on mw1050 is OK: OK: Puppet is currently enabled, last run 50
seconds ago with 0 failures [13:50:50] RECOVERY - puppet last run on mw1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:50] RECOVERY - puppet last run on mw1116 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:50:51] RECOVERY - puppet last run on search1024 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:50:51] RECOVERY - puppet last run on mw1030 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:50:58] RECOVERY - puppet last run on mw1188 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:50:58] RECOVERY - puppet last run on wtp1004 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:50:58] RECOVERY - puppet last run on mw1258 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:58] RECOVERY - puppet last run on mw1198 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:50:58] RECOVERY - puppet last run on wtp1023 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:50:59] RECOVERY - puppet last run on wtp1013 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:50:59] RECOVERY - puppet last run on mw1171 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:51:00] PROBLEM - Varnishkafka Delivery Errors per minute on cp3005 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [13:51:00] RECOVERY - puppet last run on ms-be1007 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:51:01] RECOVERY - puppet last run on ms-be2001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:01] RECOVERY - puppet last run on titanium is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:51:08] RECOVERY - 
puppet last run on db1001 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:51:08] RECOVERY - puppet last run on db1055 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:51:08] RECOVERY - puppet last run on haedus is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:51:08] RECOVERY - puppet last run on analytics1032 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:51:08] RECOVERY - puppet last run on mw1183 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:09] RECOVERY - puppet last run on elastic1005 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:51:09] RECOVERY - puppet last run on mw1056 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:10] RECOVERY - puppet last run on elastic1002 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:51:18] RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:51:18] RECOVERY - puppet last run on mw1163 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:51:19] RECOVERY - puppet last run on elastic1011 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:51:19] RECOVERY - puppet last run on elastic1014 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:51:19] RECOVERY - puppet last run on protactinium is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:51:28] RECOVERY - puppet last run on rdb1002 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [13:51:29] RECOVERY - puppet last run on mw1053 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:29] RECOVERY - puppet last run on mw1243 is OK: OK: Puppet is 
currently enabled, last run 17 seconds ago with 0 failures [13:51:29] RECOVERY - puppet last run on analytics1037 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:29] RECOVERY - puppet last run on wtp1011 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [13:51:29] RECOVERY - puppet last run on db1057 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:51:29] RECOVERY - puppet last run on rcs1001 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:51:30] RECOVERY - puppet last run on mw1212 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:39] RECOVERY - puppet last run on amssq42 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on mw1097 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on analytics1011 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on db2003 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on es2007 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on cp1060 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:51:39] RECOVERY - puppet last run on wtp1015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:40] RECOVERY - puppet last run on ms-be2007 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:51:40] RECOVERY - puppet last run on acamar is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:51:41] RECOVERY - puppet last run on mc1007 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures 
[13:51:48] RECOVERY - puppet last run on mw1087 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:51:48] RECOVERY - puppet last run on amslvs3 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:51:48] RECOVERY - puppet last run on search1015 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:51:49] RECOVERY - puppet last run on mw1074 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:51:58] RECOVERY - puppet last run on amssq62 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:51:58] RECOVERY - puppet last run on hooft is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:51:59] RECOVERY - puppet last run on db1030 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:51:59] RECOVERY - puppet last run on analytics1001 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:51:59] RECOVERY - puppet last run on mw1148 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:52:09] RECOVERY - puppet last run on es1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:52:09] RECOVERY - puppet last run on rcs1002 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:52:09] RECOVERY - puppet last run on mw1239 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:52:09] RECOVERY - puppet last run on wtp1003 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:52:18] RECOVERY - puppet last run on mw1185 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:52:18] RECOVERY - puppet last run on db1054 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:52:18] RECOVERY - puppet last run on labsdb1004 is OK: OK: Puppet 
is currently enabled, last run 1 minute ago with 0 failures [13:52:18] RECOVERY - puppet last run on ocg1003 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:52:19] RECOVERY - puppet last run on db1070 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:52:29] RECOVERY - puppet last run on db1044 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [13:52:29] RECOVERY - puppet last run on mc1013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:52:29] RECOVERY - puppet last run on elastic1017 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:52:29] RECOVERY - puppet last run on analytics1014 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:52:29] RECOVERY - puppet last run on wtp1007 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:52:30] RECOVERY - puppet last run on mw1077 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:52:38] RECOVERY - puppet last run on db1063 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:52:39] RECOVERY - puppet last run on mw1229 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:52:39] RECOVERY - puppet last run on db2011 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:52:39] RECOVERY - puppet last run on mw1001 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [13:52:48] RECOVERY - puppet last run on amssq38 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:52:49] RECOVERY - puppet last run on nitrogen is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:52:49] RECOVERY - puppet last run on lvs2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures 
[13:52:58] RECOVERY - puppet last run on mw1167 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [13:52:58] RECOVERY - puppet last run on mercury is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:52:58] RECOVERY - puppet last run on db1064 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [13:52:59] RECOVERY - puppet last run on pc1003 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:52:59] RECOVERY - puppet last run on mw1108 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:52:59] RECOVERY - puppet last run on mw1225 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:53:08] RECOVERY - puppet last run on cp3012 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [13:53:08] RECOVERY - puppet last run on rdb1003 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:53:08] RECOVERY - puppet last run on mw1122 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [13:53:08] RECOVERY - puppet last run on mw1105 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:53:09] RECOVERY - puppet last run on cp3009 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:53:09] RECOVERY - puppet last run on search1013 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:53:09] RECOVERY - puppet last run on cp3004 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:53:10] RECOVERY - puppet last run on virt1000 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:53:10] RECOVERY - puppet last run on db1006 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on db1011 is OK: OK: Puppet is 
currently enabled, last run 20 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on mw1022 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on db1038 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on mw1043 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on mw1033 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:53:19] RECOVERY - puppet last run on iridium is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:53:28] RECOVERY - puppet last run on mw1201 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:53:28] RECOVERY - puppet last run on mw1121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:53:29] RECOVERY - puppet last run on db2012 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:53:38] RECOVERY - puppet last run on db1047 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:53:38] RECOVERY - puppet last run on elastic1023 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:53:38] RECOVERY - puppet last run on search1006 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:53:38] RECOVERY - puppet last run on ms-be1011 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:53:39] RECOVERY - puppet last run on ms-fe2002 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [13:53:39] RECOVERY - puppet last run on search1022 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [13:53:39] RECOVERY - puppet last run on d-i-test is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures 
[13:53:48] RECOVERY - puppet last run on iodine is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:53:48] RECOVERY - puppet last run on mw1024 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:53:48] RECOVERY - puppet last run on es1010 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:53:49] RECOVERY - puppet last run on db1061 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:53:58] RECOVERY - puppet last run on calcium is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:53:58] RECOVERY - puppet last run on analytics1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:53:59] RECOVERY - puppet last run on mw1186 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:54:08] RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [13:54:08] RECOVERY - puppet last run on mw1219 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:54:08] RECOVERY - puppet last run on mw1142 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:54:08] RECOVERY - puppet last run on db2028 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:54:09] RECOVERY - puppet last run on dbstore1002 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:54:09] RECOVERY - puppet last run on db1033 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:54:09] RECOVERY - puppet last run on elastic1020 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:54:09] RECOVERY - puppet last run on cp1052 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:54:10] RECOVERY - puppet last run on mw1223 is OK: OK: 
Puppet is currently enabled, last run 16 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on cp1037 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on mw1091 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on mw1016 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on search1011 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on ms-be1002 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:54:18] RECOVERY - puppet last run on mw1152 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [13:54:19] RECOVERY - puppet last run on analytics1028 is OK: OK: Puppet is currently enabled, last run 54 seconds ago with 0 failures [13:54:19] RECOVERY - puppet last run on copper is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:54:28] RECOVERY - puppet last run on ms-fe3001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:54:28] RECOVERY - puppet last run on mw1093 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:54:28] RECOVERY - puppet last run on wtp1010 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:54:29] RECOVERY - puppet last run on analytics1003 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:54:29] RECOVERY - puppet last run on analytics1027 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:54:29] RECOVERY - puppet last run on mw1209 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:54:29] RECOVERY - puppet last run on terbium is OK: OK: Puppet is currently enabled, last run 17 seconds 
ago with 0 failures [13:54:30] RECOVERY - puppet last run on elastic1003 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [13:54:30] RECOVERY - puppet last run on search1004 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [13:54:31] RECOVERY - puppet last run on mw1231 is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [13:54:31] RECOVERY - puppet last run on rbf1001 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:54:32] RECOVERY - puppet last run on virt1011 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:54:32] RECOVERY - puppet last run on mw1220 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:54:38] RECOVERY - puppet last run on mw1010 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [13:54:38] RECOVERY - puppet last run on fluorine is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:54:38] RECOVERY - puppet last run on search1012 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [13:54:39] RECOVERY - puppet last run on mw1203 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [13:54:39] RECOVERY - puppet last run on db1072 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [13:54:49] RECOVERY - puppet last run on mw1139 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:54:58] RECOVERY - puppet last run on strontium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:54:58] RECOVERY - puppet last run on cp1044 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:54:58] RECOVERY - puppet last run on mw1112 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:54:59] RECOVERY - puppet last run on 
mw1215 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [13:54:59] RECOVERY - puppet last run on mw1090 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:54:59] RECOVERY - puppet last run on cp1045 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [13:55:07] https://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&m=cpu_report&r=hour&s=descending&hc=4&mc=2 :( [13:55:09] RECOVERY - puppet last run on dbstore2002 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:55:09] RECOVERY - puppet last run on cp1067 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [13:55:10] RECOVERY - puppet last run on mw1086 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [13:55:18] RECOVERY - puppet last run on achernar is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [13:55:19] RECOVERY - puppet last run on cp1040 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:55:19] RECOVERY - puppet last run on db1045 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [13:55:19] RECOVERY - puppet last run on analytics1018 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:55:19] RECOVERY - puppet last run on cp4020 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:55:19] RECOVERY - puppet last run on cp4009 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [13:55:19] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [13:55:28] RECOVERY - puppet last run on chromium is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [13:55:28] RECOVERY - puppet last run on radium is OK: OK: Puppet is currently enabled, last run 2 
seconds ago with 0 failures [13:55:28] RECOVERY - puppet last run on ms-fe1002 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:55:28] RECOVERY - puppet last run on mw1107 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [13:55:28] RECOVERY - puppet last run on mw1064 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:55:28] RECOVERY - puppet last run on mw1204 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:55:29] RECOVERY - puppet last run on db1037 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [13:55:29] RECOVERY - puppet last run on osm-cp1001 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [13:55:29] RECOVERY - puppet last run on amssq31 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [13:55:30] RECOVERY - puppet last run on mw1071 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [13:55:30] RECOVERY - puppet last run on mw1236 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:55:38] RECOVERY - puppet last run on virt1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:55:38] RECOVERY - puppet last run on mw1027 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:55:38] RECOVERY - puppet last run on mw1193 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [13:55:38] RECOVERY - puppet last run on bast1001 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [13:55:39] RECOVERY - puppet last run on cp1068 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [13:55:39] RECOVERY - puppet last run on ms-be1015 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:55:48] RECOVERY - puppet last run 
on mw1037 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [13:55:48] RECOVERY - puppet last run on mw1110 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [13:55:48] RECOVERY - puppet last run on mw1155 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [13:55:48] RECOVERY - puppet last run on baham is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [13:55:48] RECOVERY - puppet last run on mw1241 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:55:49] RECOVERY - puppet last run on cp4012 is OK: OK: Puppet is currently enabled, last run 9 seconds ago with 0 failures [13:55:49] RECOVERY - puppet last run on mw1066 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:55:58] RECOVERY - puppet last run on es2010 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [13:55:58] RECOVERY - puppet last run on wtp1001 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [13:55:59] RECOVERY - puppet last run on lvs3003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:55:59] RECOVERY - puppet last run on lvs1003 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:55:59] RECOVERY - puppet last run on db1035 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:55:59] RECOVERY - puppet last run on mw1021 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [13:55:59] RECOVERY - puppet last run on db1056 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:08] RECOVERY - puppet last run on cp1053 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures [13:56:08] RECOVERY - puppet last run on wtp1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 
failures [13:56:09] RECOVERY - puppet last run on mw1143 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:18] RECOVERY - Varnishkafka Delivery Errors per minute on cp3020 is OK: OK: Less than 1.00% above the threshold [0.0] [13:56:19] RECOVERY - puppet last run on mw1255 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:56:19] RECOVERY - puppet last run on lvs4004 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [13:56:28] RECOVERY - puppet last run on mw1253 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [13:56:28] RECOVERY - puppet last run on cp3018 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:29] RECOVERY - puppet last run on dysprosium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:29] RECOVERY - puppet last run on elastic1029 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures [13:56:29] RECOVERY - puppet last run on lvs1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:38] RECOVERY - puppet last run on cp4002 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [13:56:39] RECOVERY - puppet last run on mw1154 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures [13:56:39] RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:56:39] RECOVERY - puppet last run on amssq33 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [13:56:39] RECOVERY - puppet last run on eeden is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [13:56:48] RECOVERY - puppet last run on amssq39 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [13:56:49] RECOVERY - puppet last run on mw1158 is OK: OK: Puppet is 
currently enabled, last run 1 minute ago with 0 failures [13:56:49] RECOVERY - puppet last run on mw1104 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:56:49] RECOVERY - puppet last run on mw1113 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:49] RECOVERY - puppet last run on amssq37 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [13:56:49] RECOVERY - puppet last run on cp3005 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [13:56:58] RECOVERY - puppet last run on mw1131 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:56:58] RECOVERY - puppet last run on tmh1001 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [13:56:59] RECOVERY - puppet last run on cp1054 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:57:08] RECOVERY - puppet last run on mw1207 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [13:57:08] RECOVERY - puppet last run on radon is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:57:35] (CR) Nikerabbit: Content Translation configuration for Production (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/181546 (owner: KartikMistry) [13:57:39] RECOVERY - puppet last run on cp3011 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:58:18] RECOVERY - Varnishkafka Delivery Errors per minute on cp3005 is OK: OK: Less than 1.00% above the threshold [0.0] [14:02:25] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [14:02:55] PROBLEM - cxserver on sca1001 is CRITICAL: Connection refused [14:03:45] PROBLEM - cxserver on sca1002 is CRITICAL: Connection refused [14:09:08] (CR) coren: [C: 1] "Yep.
Unnecessary to keep this; though it helped find the issue." [puppet] - https://gerrit.wikimedia.org/r/183833 (owner: Yuvipanda) [14:09:51] (PS4) Yuvipanda: Remove custom fact ec2id, replaced by facter's ec2 [puppet] - https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) (owner: Faidon Liambotis) [14:09:55] (PS2) Yuvipanda: Revert "Labs: Make dynamic proxies use local resolver" [puppet] - https://gerrit.wikimedia.org/r/183833 [14:10:09] (CR) Yuvipanda: [C: 2] Revert "Labs: Make dynamic proxies use local resolver" [puppet] - https://gerrit.wikimedia.org/r/183833 (owner: Yuvipanda) [14:12:31] (PS1) Alexandros Kosiaris: Add a cxserver suffix to the package declaration [puppet] - https://gerrit.wikimedia.org/r/183846 [14:19:28] (CR) Chmarkine: "Are there any new discussions on this topic (removing RC4)? The IESG unanimously passed the draft "Prohibiting RC4 Cipher Suites" yesterda" [puppet] - https://gerrit.wikimedia.org/r/178555 (owner: BBlack) [14:20:04] RECOVERY - puppet last run on sca1001 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [14:25:05] RECOVERY - puppet last run on sca1002 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [14:43:39] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965697 (Reedy) >>! In T86081#965517, @mark wrote: > Other than the old PHP 5.3 version that we're running in production, why is it crucial that we get away from Zend everywhere? It's not rea...
[14:45:19] (PS1) Alexandros Kosiaris: checkout cxserver/deploy as well [puppet] - https://gerrit.wikimedia.org/r/183856 [14:45:50] (PS5) Yuvipanda: Remove custom fact ec2id, replaced by facter's ec2 [puppet] - https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) (owner: Faidon Liambotis) [14:45:52] (CR) Alexandros Kosiaris: [C: 2] checkout cxserver/deploy as well [puppet] - https://gerrit.wikimedia.org/r/183856 (owner: Alexandros Kosiaris) [14:46:01] heh, so much rebasing [14:46:09] (CR) Alexandros Kosiaris: [C: 2] Add a cxserver suffix to the package declaration [puppet] - https://gerrit.wikimedia.org/r/183846 (owner: Alexandros Kosiaris) [14:47:48] (PS6) Yuvipanda: Remove custom fact ec2id, replaced by facter's ec2 [puppet] - https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) (owner: Faidon Liambotis) [14:48:02] (CR) Yuvipanda: [C: 2] "Tested on betalabs, works fine \o/" [puppet] - https://gerrit.wikimedia.org/r/183209 (https://phabricator.wikimedia.org/T86297) (owner: Faidon Liambotis) [14:55:13] (CR) Reedy: "https://github.com/wikimedia/operations-puppet/commit/40c1ad76bb13c4f8a19a46ab2e56f489f25ab353 / Ibaa28d8cb93c440f909e075e767f47a30076b687" [puppet] - https://gerrit.wikimedia.org/r/183568 (https://phabricator.wikimedia.org/T1387) (owner: Reedy) [14:57:55] PROBLEM - puppet last run on sca1001 is CRITICAL: CRITICAL: Puppet has 1 failures [14:59:15] RECOVERY - puppet last run on sca1001 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [15:06:46] (PS1) Alexandros Kosiaris: Followup commit for a49baa9 [puppet] - https://gerrit.wikimedia.org/r/183857 [15:07:14] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: 7.14% of data above the critical threshold [500.0] [15:08:15] PROBLEM - Varnishkafka Delivery Errors per minute on cp3021 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold 
[20000.0] [15:09:09] (CR) Alexandros Kosiaris: [C: 2] Followup commit for a49baa9 [puppet] - https://gerrit.wikimedia.org/r/183857 (owner: Alexandros Kosiaris) [15:15:35] RECOVERY - Varnishkafka Delivery Errors per minute on cp3021 is OK: OK: Less than 1.00% above the threshold [0.0] [15:17:53] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965759 (hashar) CI still rely on PHP 5.3 though since that is what MediaWiki supports. Phabricator is not advertised as supported under HHVM, but it runs on our setup with PHP 5.5 as provide... [15:19:34] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: Less than 1.00% above the threshold [250.0] [15:20:35] PROBLEM - Unmerged changes on repository puppet on strontium is CRITICAL: There is one unmerged change in puppet (dir /var/lib/git/operations/puppet). [15:20:59] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965763 (mark) Just restricting the conversation to MediaWiki: Some things can remain on PHP Zend, as long as that version is newer than 5.3? [15:21:54] RECOVERY - Unmerged changes on repository puppet on strontium is OK: No changes to merge. [15:24:02] (CR) Jgreen: [C: 2 V: 1] "also see https://phabricator.wikimedia.org/T86208" [dns] - https://gerrit.wikimedia.org/r/183262 (owner: Jgreen) [15:24:54] !log deployed DNS dmarc record for wikipedia.* [15:24:59] Logged the message, Master [15:25:47] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965768 (Reedy) >>! In T86081#965763, @mark wrote: > Just restricting the conversation to MediaWiki: > > Some things can remain on PHP Zend, as long as that version is newer than 5.3? Yeah,... 
[15:26:13] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965769 (chasemp) >>! In T86081#965759, @hashar wrote: > Phabricator is not advertised as supported under HHVM, but it runs on our setup with PHP 5.5 as provided by Ubuntu Trusty. I have spo... [15:28:24] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [15:28:25] chasemp: that's just lol [15:30:09] why for lol? [15:30:34] Phabricator is/was a FB project. But they won't support another facebook project (ie hhvm) [15:31:58] ops-eqiad: Add blanking panels to full and semi-full racks in row D - https://phabricator.wikimedia.org/T86306#965783 (Cmjohnson) NEW [15:32:38] Reedy: ah well HHVM was not even HHVM really back then I think [15:32:48] and the experience was far less thrilling is my impression [15:33:55] You'd think with the size and complexity of something like phabricator, there would be a decent benefit from being able to run it on hhvm [15:34:17] <_joe_> no one wants to run phabricator on HHVM I hope [15:34:48] <_joe_> the substance of that ticket is "upgrade tin and some other hosts to trusty" [15:35:00] <_joe_> the real problem will be - try to guess.... [15:35:09] <_joe_> ...wikitech! [15:35:21] :) [15:35:22] <_joe_> I repeatedly advised AGAINST including it in the deploy train [15:35:31] <_joe_> for a good reason [15:36:01] <_joe_> it's a piece of infrastructure as well (given it runs OSM), which should be able to move separatedly from the mw stack [15:36:33] <_joe_> but well, I guess the people who turned me down then will address this now. 
[15:38:05] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [15:38:56] (CR) Rush: "fwiw yes I think this bug was since resolved upstream :)" [puppet] - https://gerrit.wikimedia.org/r/183785 (owner: Faidon Liambotis) [15:44:04] PROBLEM - Varnishkafka Delivery Errors per minute on cp3015 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [15:48:50] hashar: the unit tests for the math extension seems to be broken [15:48:58] did you change something in the configuration [15:48:59] https://integration.wikimedia.org/ci/job/mwext-Math-testextension-zend/4/console [15:49:56] _joe_: I’m trying to get rid of wikitech :) [15:49:56] well [15:50:01] the special parts of it, at least [15:50:05] RECOVERY - Varnishkafka Delivery Errors per minute on cp3015 is OK: OK: Less than 1.00% above the threshold [0.0] [15:50:12] and have it be just another wiki. just with LDAP instead of SUL [15:50:18] hopefully in the… next 6 months [15:50:35] <_joe_> well, if we remove OSM we can use SUL as well [15:51:09] yeah, but unification would be a problem [15:51:20] <_joe_> the problem for me is OSM - or to be more precise, the fact that AFAIK wikitech must be hosted on virt1000 [15:51:34] <_joe_> if that's not the case, we have much less of a problem [15:53:48] operations: Switch HAT appservers to trusty's ICU - https://phabricator.wikimedia.org/T86096#965824 (Joe) p:High>Normal [15:54:25] operations: Switch HAT appservers to trusty's ICU - https://phabricator.wikimedia.org/T86096#961386 (Joe) This will be done with our next train of package upgrades hopefully. 
[15:54:49] operations: Switch HAT appservers to trusty's ICU - https://phabricator.wikimedia.org/T86096#965827 (Joe) [15:55:03] operations: Switch HAT appservers to trusty's ICU - https://phabricator.wikimedia.org/T86096#961386 (Joe) [15:56:49] operations: Switch HAT appservers to trusty's ICU - https://phabricator.wikimedia.org/T86096#965842 (Reedy) Yup, updateCollation.php does need running when we've upgraded :) I think we need to run it everywhere... Can you confirm @tstarling ? Also, we will probably need to get @springle involved for the lar... [15:57:09] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965846 (mark) So, the move to HHVM has already happened for everywhere where it really matters. We can convert some more things, but it's not critical. It seems to me that this ticket is real... [15:58:54] (PS1) Rush: Puppet add default license [puppet] - https://gerrit.wikimedia.org/r/183862 [16:00:51] WMF-Legal, operations, Wikimedia-General-or-Unknown: Default license for operations/puppet - https://phabricator.wikimedia.org/T67270#965855 (chasemp) As a draft: https://gerrit.wikimedia.org/r/#/c/183862/ [16:01:14] RECOVERY - cxserver on sca1002 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.022 second response time [16:02:52] operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#965862 (hashar) Dropping PHP 5.3 supports from MediaWiki is already tracked as {T75901} which was pending the migration of Wikimedia. So I would just close this task (which is phasing out Ze... 
[16:03:28] chasemp: don't merge that yet [16:03:34] oh no plans too [16:03:57] but someone had to make it if it was ever going to get done :) [16:06:38] (PS1) Alexandros Kosiaris: Adjust the configuration file path for cxserver [puppet] - https://gerrit.wikimedia.org/r/183865 [16:09:01] <^d> chasemp: I know upstream wasn't interested in hhvm for phab but we could probably do it. Maybe convert like phab-01 or something. [16:09:19] <^d> For testing [16:09:33] (CR) Alexandros Kosiaris: [C: 2] Adjust the configuration file path for cxserver [puppet] - https://gerrit.wikimedia.org/r/183865 (owner: Alexandros Kosiaris) [16:09:38] I'm not interested in being teh solo hhvm phab install [16:09:53] if that's the majority desire then I guess we try it but [16:10:05] seems like an endless pit of little reward for too much trouble [16:10:24] <^d> I honestly doubt it's going to be as bad on the phab side as you and evan think. [16:10:36] no probably not, or at least out of the box [16:10:38] WMF-Legal, operations, Wikimedia-General-or-Unknown: Default license for operations/puppet - https://phabricator.wikimedia.org/T67270#965873 (hashar) Some of my puppet stuff are borrowed from OpenStack with an Apache License iirc. I don't think I am willing to use CC0. [16:10:44] but when they do something they love that doesn't sit well with hhvm [16:10:54] <^d> Like what? [16:10:54] we are going to be coordinating with two upstreams, neither of which care about each other [16:11:14] I have no idea that's part of the worry :) [16:11:25] RECOVERY - cxserver on sca1001 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.028 second response time [16:11:40] <^d> I would rather base our worry in fact. 
[16:11:55] php sucks [16:11:57] that's a fact [16:12:05] <_joe_> chasemp: there is _no_ good reason to run phab itself on HHVM [16:12:15] <_joe_> I don't see a good reason to do that either [16:12:17] <^d> {{cn}} [16:12:19] Reedy: opinioned fact [16:12:20] yes I agree, but ^d seems to think maybe it would be ok [16:12:35] opinionated fact* [16:12:38] <_joe_> ^d: http://www.reddit.com/r/lolphp/ [16:12:42] personally I see a road of endless troubles [16:13:03] I really don't care in the absolute sense, but if I'm making the call no thanks [16:14:33] (03CR) 10Alexandros Kosiaris: [C: 032] LVS for cxserver [puppet] - 10https://gerrit.wikimedia.org/r/183243 (owner: 10Alexandros Kosiaris) [16:16:06] PROBLEM - Varnishkafka Delivery Errors per minute on cp3006 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [16:16:48] 3WMF-Legal, operations, Wikimedia-General-or-Unknown: Default license for operations/puppet - https://phabricator.wikimedia.org/T67270#965885 (10faidon) @LuisV_WMF, I think puppet manifests are very much "code", especially some of our quite complicated manifests. Think of it as a large collection of bash scripts... [16:17:12] <^d> _joe_: I find no references to Phabricator in lolphp. [16:17:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3008 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [16:18:22] seems like maybe facebook runs their install on hhvm https://secure.phabricator.com/T1261 [16:18:29] but it doesn't seem like a very nice time [16:18:32] ^d: someone linked to an even priestly post in there a while back [16:18:49] <^d> Yes, I know. [16:18:51] <^d> Lots of FUD [16:19:06] <^d> "I don't want to support it", etc [16:19:24] ^d, can I ask why you are in favor of phab on hhvm? [16:19:56] @hashar: thanks for fixing the dependency [16:21:29] aude: hoo can one of you update the deploy calendar (in the upcoming section) what the wikidata team's plan is for next week? 
I still haven't seen the plan, but I keep hearing there is a plan. [16:21:43] <_joe_> ^d: sorry I thought that was about "php sucks" [16:21:53] <^d> Heh, no. [16:22:02] <^d> For "no benefit at all to hhvm" [16:22:08] <^d> chasemp: I like hhvm. I also like consistency and an upstream that's actually responsive to our needs. [16:22:11] greg-g: Plan is to make a branch and follow the normal deploy, I think [16:22:21] aude is afk [16:22:49] <_joe_> ^d: but wait for phpng! in my lame fibonacci benchmark on a wearable it is 2x faster than HHVM!!!1! [16:23:01] <^d> yay benchmarks!! [16:23:15] <^d> chasemp: And I don't think it's nearly as scary as you think. Moving a single-server application to HHVM is trivial. We did it in MW ages ago. [16:23:31] I don't know if it will be really hard or really easy [16:23:36] but I know the cost is non-zero [16:23:44] and maybe really non-zero into the future [16:23:50] <_joe_> hey we could ditch php 5.3 support in mediawiki and use https://github.com/endel/php-code-downgrade/ [16:23:51] and what problem would we be solving? [16:24:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3008 is OK: OK: Less than 1.00% above the threshold [0.0] [16:24:04] physikerwelt: that is a bit of a mess sorry :( [16:24:16] physikerwelt: there are some other tests failing, not sure what happens on that front though [16:24:46] RECOVERY - Varnishkafka Delivery Errors per minute on cp3006 is OK: OK: Less than 1.00% above the threshold [0.0] [16:24:54] <^d> chasemp: It's not a problem yet, but as load on Phab increases (and it will, as we assume more git duties) it'll be nice to find ways to reduce that load. [16:24:54] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [16:25:03] <^d> Since Phab doesn't scale horizontally nearly as nicely. 
[16:25:48] That's gonna suck [16:25:59] hopefully the scaling stuff will come just in time for our needs, and I'm not like dead set against hhvm in this case, although joe saying it's a bad idea would tend to sway me [16:26:22] just curious as you seem really for it, but I wasn't sure of the why [16:26:26] <^d> I would rather us test it out in like vagrant and labs and at least make an informed decision. [16:26:32] * ^d adds to his todo list [16:26:55] PROBLEM - Varnishkafka Delivery Errors per minute on cp3015 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [16:27:02] let me say we have dedicated far less resources to phab than people seem to think [16:27:13] upstream has basically carried us a large part of the way, it's really like...me...and mukunda [16:27:18] and so doing hhvm off the cuff [16:27:26] * ^d is clearly volunteering [16:27:27] We had less assigned when we went to gerrit ;D [16:27:28] seems somewhat odd to me considering a comment like https://secure.phabricator.com/T1261#89894 [16:27:36] sure but forever and ever volunteering :) [16:28:12] <^d> You seem to think I have a life outside of this ;-) [16:28:14] chasemp: would volunteers be able to help you with it? [16:28:42] valhallasw`cloud: no, go away [16:28:43] we are trying to sort things out so it's not on us with some CI and process so I hope so [16:28:48] hoo: what's the short description of what's new? [16:29:20] <^d> Evan responds best to patches. I like writing patches. [16:29:24] but volunteering is good, however does that mean that every merge 2 times a month you are going to have tiem to sort out the newest HHVM issue :) [16:29:28] <^d> It's better than writing bugs and just bitching at people :) [16:29:30] I really don't know I guess [16:29:32] yes that is true [16:29:50] greg-g: Nothing big... 
we had a couple more performance things merged [16:30:09] If we're switching away from gerrit, to all phab, allocating more (people) resouces to phab seems likely [16:30:10] hoo: kk, ty [16:30:14] They might hit "hard" or not at all... hard to tell from local testing according to my experience [16:30:28] Until resources happen I'm not holding my breath :) [16:30:36] heh [16:30:38] not out of doubt, but more a healthy skepticism [16:30:50] @hashar: maybe I can reproduce it on the math-preview labs instance... it would be good to know if the produced file looks different [16:30:56] I don't want to charge the hill without looking behind me that's for sure [16:30:58] hoo: k [16:31:15] PROBLEM - Varnishkafka Delivery Errors per minute on cp3007 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [16:31:26] (no resources explicitly planned for phab/gerrit stuff coming "soon") [16:33:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3015 is OK: OK: Less than 1.00% above the threshold [0.0] [16:33:05] PROBLEM - Varnishkafka Delivery Errors per minute on cp3020 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [16:33:19] ES is finished now anyway. 
^d must have lots of free time [16:33:21] * Reedy looks shifty [16:33:36] (03PS2) 10BryanDavis: Change fatalmonitor script to read hhvm.log [puppet] - 10https://gerrit.wikimedia.org/r/183541 [16:34:20] greg-g: https://phabricator.wikimedia.org/T85971#965918 [16:34:25] you probably want to undo that [16:34:34] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [16:35:31] hoo: :) [16:35:35] (03CR) 10Alexandros Kosiaris: [C: 032] LVS IP assignment for cxserver [dns] - 10https://gerrit.wikimedia.org/r/183232 (owner: 10Alexandros Kosiaris) [16:37:24] RECOVERY - Varnishkafka Delivery Errors per minute on cp3007 is OK: OK: Less than 1.00% above the threshold [0.0] [16:40:08] hoo: btw, wanna due it during the "morning" swat window? (other projects sidebar) [16:40:15] RECOVERY - Varnishkafka Delivery Errors per minute on cp3020 is OK: OK: Less than 1.00% above the threshold [0.0] [16:40:25] hoo: on monday, that is [16:41:04] I might not be around during that one, so that is Katie's call [16:41:19] If I am around, probably yes [16:41:24] PROBLEM - Host cxserver.svc.eqiad.wmnet is DOWN: CRITICAL - Network Unreachable (10.2.2.18) [16:41:36] <^d> Reedy: 800 more things to do! [16:41:36] <^d> It's why I keep a todo list. [16:41:58] hoo: i'll put it then, and we can figure it out day of [16:42:43] hmm. network unreachable? hopefully that just means closed port or response timeout.. [16:42:46] ACKNOWLEDGEMENT - Host cxserver.svc.eqiad.wmnet is DOWN: CRITICAL - Network Unreachable (10.2.2.18) alexandros kosiaris bringing the service up, no worries [16:42:52] yay [16:43:24] PROBLEM - Varnishkafka Delivery Errors per minute on cp3016 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [20000.0] [16:45:27] Reedy: https://phabricator.wikimedia.org/T86310 [16:45:32] Will that cause problems? [16:45:42] the currently largest tiff is only about 1GiB [16:46:03] Hm.. 
logo in gerrit.wikimedia.org is broken [16:46:04] GET https://gerrit.wikimedia.org/r/static/wikimedia-codereview-logo.png net::ERR_CONTENT_DECODING_FAILED [16:46:15] PROBLEM - Varnishkafka Delivery Errors per minute on cp3010 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [20000.0] [16:47:33] (03PS1) 10Yuvipanda: Remove ambigus, unused 'hashs' view [software] - 10https://gerrit.wikimedia.org/r/183871 (https://phabricator.wikimedia.org/T85867) [16:48:07] hoo: At worst, it won't thumbnail [16:48:26] hoo: Though, it's probably about time I upped some of the thumbnailing limits anyway [16:48:34] Ok, but not like that one pdf(?) that hammered down stuff once [16:49:24] Will tackle that upload request, then [16:50:35] RECOVERY - Varnishkafka Delivery Errors per minute on cp3016 is OK: OK: Less than 1.00% above the threshold [0.0] [16:50:56] (03CR) 10coren: [C: 031] "Yuvi word is Law." [software] - 10https://gerrit.wikimedia.org/r/183871 (https://phabricator.wikimedia.org/T85867) (owner: 10Yuvipanda) [16:51:13] (03CR) 10Yuvipanda: [C: 032] Remove ambigus, unused 'hashs' view [software] - 10https://gerrit.wikimedia.org/r/183871 (https://phabricator.wikimedia.org/T85867) (owner: 10Yuvipanda) [16:51:37] Did someone deploy a gerrit upgrade or config change? 
[16:52:26] (03PS1) 10Yuvipanda: Remove hashs view from whitelist [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/183872 (https://phabricator.wikimedia.org/T85867) [16:53:12] Krinkle: Don't think so [16:53:28] https://gerrit.wikimedia.org/r/static/wikimedia-codereview-logo.png WFM in incognito [16:53:40] interesting [16:55:27] must've gotten stuck in a cache somewhere [16:55:38] It was serving as 304 Not Modified with empty contents [16:55:40] 3operations: graphite clustering plan - https://phabricator.wikimedia.org/T86316#965981 (10fgiunchedi) 3NEW a:3fgiunchedi [16:55:55] RECOVERY - Varnishkafka Delivery Errors per minute on cp3010 is OK: OK: Less than 1.00% above the threshold [0.0] [16:58:05] RECOVERY - Host cxserver.svc.eqiad.wmnet is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms [16:58:06] (03PS1) 10Yuvipanda: Drop pif_edits, povwatch_* [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/183874 (https://phabricator.wikimedia.org/T85867) [16:58:33] <_joe_> akosiaris: I assume this is you? [16:58:43] !log CREATE INDEX /*i*/br_timestamp ON /*_*/bounce_records(br_timestamp); for bounce_records on wikishared on extension1 [16:58:44] _joe_: you assume correctly [16:58:47] Logged the message, Master [16:58:53] the good news is I am mostly done... [16:59:16] (03PS1) 10Yuvipanda: Remove povwatch_* and pif_edits views [software] - 10https://gerrit.wikimedia.org/r/183875 (https://phabricator.wikimedia.org/T85867) [16:59:22] Coren: ^^ [16:59:30] <_joe_> akosiaris: the good news is it's friday and amply beer'o clock [16:59:51] hehe [17:00:15] YuviPanda: Shouldn't you combine those a bit and make a cleanup patch rather than umptimillion tiny patches? [17:00:32] (03CR) 10coren: [C: 031] "Old cruft is old." 
[software] - 10https://gerrit.wikimedia.org/r/183875 (https://phabricator.wikimedia.org/T85867) (owner: 10Yuvipanda) [17:00:33] Coren: yeah, I realized that later but I had previously merged the patches [17:00:44] Coren: these are the only ones legoktm has identified at this point [17:01:04] 3operations: graphite clustering plan - https://phabricator.wikimedia.org/T86316#966005 (10chasemp) p:5Triage>3Normal [17:01:08] maybe I can convince Reedy to take a look at the audit stuff as well :D [17:01:10] greg-g: Think we can/should deploy https://gerrit.wikimedia.org/r/183873 quick? [17:01:19] (03CR) 10Yuvipanda: [C: 032] Remove povwatch_* and pif_edits views [software] - 10https://gerrit.wikimedia.org/r/183875 (https://phabricator.wikimedia.org/T85867) (owner: 10Yuvipanda) [17:01:20] It's flooding logstash right now [17:01:23] greg-g: see also spaaaaaaaaaaaaaaaaaaaaaaam [17:01:25] Not too bad, but still. [17:01:35] Reedy: marktraceur doit [17:01:39] Got it. [17:01:48] * Reedy waits for jenkins to fuck the shit up first [17:02:11] Reedy: Would you mind doing branch patches? I can deploy them, but it'd be a pain to update my local mediawiki branches [17:02:28] marktraceur: just use cherry pick in gerrit interface? 
[17:02:32] (03PS2) 10Yuvipanda: Remove hashs, pif_edits & povwatch* view from whitelist [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/183872 (https://phabricator.wikimedia.org/T85867) [17:02:35] Oh, right, because it's not an extension [17:02:38] :) [17:02:40] Excuse me while I rejoice [17:02:48] (03Abandoned) 10Yuvipanda: Drop pif_edits, povwatch_* [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/183874 (https://phabricator.wikimedia.org/T85867) (owner: 10Yuvipanda) [17:03:09] (I need to make an irssi alias for /doit $nick that does $nick: https://www.youtube.com/watch?v=JoqDYcCDOTg [17:04:43] greg-g: don't forget to create /postmortem https://www.youtube.com/watch?v=_GP5_NQ_LEs [17:05:31] hoo: :) :) I was almost expecting the picard "why the ef?!" meme in video form [17:06:45] Patches getting merged [17:11:22] (03PS1) 10Giuseppe Lavagetto: puppet: move hiera lookups for the cluster to the actual classes [puppet] - 10https://gerrit.wikimedia.org/r/183879 [17:11:24] (03PS1) 10Giuseppe Lavagetto: mediawiki: move cluster definitions to hiera [puppet] - 10https://gerrit.wikimedia.org/r/183880 [17:11:26] (03PS1) 10Giuseppe Lavagetto: puppet: use the role keyword for all varnishes [puppet] - 10https://gerrit.wikimedia.org/r/183881 [17:11:28] (03PS1) 10Giuseppe Lavagetto: puppet: include admin in role classes for mediawiki and cache [puppet] - 10https://gerrit.wikimedia.org/r/183882 [17:12:08] <_joe_> if someone wants to have fun over the weekend.... 
[17:14:23] cool [17:16:52] OK, I'm updating wmf13 and wmf14 and syncing now that Jenkins stopped being lazy [17:18:55] !log marktraceur Synchronized php-1.25wmf13/includes/filerepo/file/File.php: Remove silly debug line from File class (duration: 00m 08s) [17:18:59] Logged the message, Master [17:19:13] !log marktraceur Synchronized php-1.25wmf14/includes/filerepo/file/File.php: Remove silly debug line from File class (duration: 00m 07s) [17:19:14] RIP [17:19:16] Logged the message, Master [17:19:18] greg-g: Thanks for that! [17:19:52] (03CR) 10Gage: [C: 031] puppet: include admin in role classes for mediawiki and cache [puppet] - 10https://gerrit.wikimedia.org/r/183882 (owner: 10Giuseppe Lavagetto) [17:20:16] marktraceur: np and ty [17:36:37] (03CR) 10Gage: [C: 031] puppet: use the role keyword for all varnishes [puppet] - 10https://gerrit.wikimedia.org/r/183881 (owner: 10Giuseppe Lavagetto) [17:39:09] (03CR) 10Merlijn van Deen: [C: 04-1] Add report that verifies view definitions (039 comments) [software/labsdb-auditor] - 10https://gerrit.wikimedia.org/r/182848 (https://phabricator.wikimedia.org/T85473) (owner: 10Yuvipanda) [17:39:31] 3ops-eqiad, ops-codfw: ship blanking panels from eqiad to codfw - https://phabricator.wikimedia.org/T86082#966074 (10Cmjohnson) I did an inventory check on the blanking panels I have. There are a total of 65 3u panels.. Good for 195u. That just over 4 cabinets. We should definitely buy matching panels but at... [17:39:48] (03CR) 10Gage: [C: 031] mediawiki: move cluster definitions to hiera [puppet] - 10https://gerrit.wikimedia.org/r/183880 (owner: 10Giuseppe Lavagetto) [17:40:56] 3ops-eqiad: Add blanking panels to full and semi-full racks in row D - https://phabricator.wikimedia.org/T86306#966077 (10Cmjohnson) 5Open>3Resolved a:3Cmjohnson Blanking Panels have been installed in Row D. The only exception is the first 35u in racks D1-D3 are open for future growth. We do not have any... 
[17:43:32] paravoid, _joe_: so, the localhost trick appeared to have not made the problem go away entirely. it was applied to mw123*, but i still see connection errors from mw1233 in the log. i think i will make a patch for configurable user perms to nutcracker, but until that is completed, what should we do? sysctl to reap connections, or (preferable, imo) amend the init script to add a DAEMON_UMASK option? [17:44:07] preferable because the unix domain socket solution also comes with a performance benefit [17:44:42] (03CR) 10Rush: [C: 04-1] "no...I don't think we want to do this as it will conflict with any two roles that include admin. admin has to be included at the node lev" [puppet] - 10https://gerrit.wikimedia.org/r/183882 (owner: 10Giuseppe Lavagetto) [17:45:28] (03CR) 10Rush: "https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/admin/README;ab9182aedc7cf8dd0d53d47bfda02e6982221840$132" [puppet] - 10https://gerrit.wikimedia.org/r/183882 (owner: 10Giuseppe Lavagetto) [17:48:24] (03PS1) 10Gilles: Disable thumbnail prerendering in production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183885 (https://phabricator.wikimedia.org/T76035) [17:57:40] (03PS2) 10Rush: Puppet add default license [puppet] - 10https://gerrit.wikimedia.org/r/183862 [17:58:00] 3Continuous-Integration, Ops-Access-Requests: Make sure relevant RelEng people have access to gallium (Chris M, Dan, Mukunda, Zeljko) - https://phabricator.wikimedia.org/T85936#966189 (10chasemp) [17:58:54] 3Continuous-Integration, Ops-Access-Requests: Make sure relevant RelEng people have access to gallium (Chris M, Dan, Mukunda, Zeljko) - https://phabricator.wikimedia.org/T85936#957717 (10chasemp) [17:59:54] PROBLEM - Varnishkafka Delivery Errors per minute on cp3005 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [18:00:25] PROBLEM - Varnishkafka Delivery Errors per minute on cp3010 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold 
[20000.0] [18:00:35] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [18:01:35] PROBLEM - Varnishkafka Delivery Errors per minute on cp3020 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [20000.0] [18:01:47] ottomata: ^ [18:02:09] (03PS3) 10BryanDavis: Change fatalmonitor script to read hhvm.log [puppet] - 10https://gerrit.wikimedia.org/r/183541 [18:02:30] (03PS4) 10Ori.livneh: Change fatalmonitor script to read hhvm.log [puppet] - 10https://gerrit.wikimedia.org/r/183541 (owner: 10BryanDavis) [18:02:37] (03CR) 10Ori.livneh: [C: 032 V: 032] Change fatalmonitor script to read hhvm.log [puppet] - 10https://gerrit.wikimedia.org/r/183541 (owner: 10BryanDavis) [18:05:25] PROBLEM - Varnishkafka Delivery Errors per minute on cp3006 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [18:05:35] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [18:06:24] RECOVERY - Varnishkafka Delivery Errors per minute on cp3010 is OK: OK: Less than 1.00% above the threshold [0.0] [18:08:14] RECOVERY - Varnishkafka Delivery Errors per minute on cp3005 is OK: OK: Less than 1.00% above the threshold [0.0] [18:09:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [18:09:29] 3operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#966222 (10chasemp) 5Open>3Resolved a:3chasemp >>! In T86081#965862, @hashar wrote: > Dropping PHP 5.3 supports from MediaWiki is already tracked as {T75901} which was pending the migratio... 
[18:10:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3020 is OK: OK: Less than 1.00% above the threshold [0.0] [18:11:35] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [18:12:04] 3operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#966232 (10Legoktm) >>! In T86081#965846, @mark wrote: > So, the move to HHVM has already happened for everywhere where it really matters. We can convert some more things, but it's not critical.... [18:12:35] RECOVERY - Varnishkafka Delivery Errors per minute on cp3006 is OK: OK: Less than 1.00% above the threshold [0.0] [18:13:10] 3operations: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081#966235 (10mark) >>! In T86081#966232, @Legoktm wrote: >>>! In T86081#965846, @mark wrote: >> So, the move to HHVM has already happened for everywhere where it really matters. We can convert som... [18:14:22] 3Release-Engineering, operations: Determine Trebuchet/git-deploy maintenance plan - https://phabricator.wikimedia.org/T85008#966251 (10chasemp) a:3greg Dear Greg, You marked this as high which I think is reasonable :) But it seems like anything really high priority has to have an assignee or it should be bust... [18:14:35] (03PS3) 10Alexandros Kosiaris: Reuse parsoid varnish for cxserver [puppet] - 10https://gerrit.wikimedia.org/r/181613 (https://phabricator.wikimedia.org/T76200) [18:17:04] (03PS2) 10Ottomata: Add job to crunch Language team data [puppet] - 10https://gerrit.wikimedia.org/r/183734 (owner: 10Milimetric) [18:17:25] 3operations, Beta-Cluster: Renumber apache user/group to uid=48 - https://phabricator.wikimedia.org/T78076#966266 (10ori) > This package was removed by @ori in https://gerrit.wikimedia.org/r/#/c/136151/ on 2014-05-29 which left the uid of the apache user unspecified for all subsequent hosts provisioned by Puppet... 
[18:20:43] (03CR) 10BBlack: [C: 031] "+1 for the idea being a sane path forward for now, to use parsoid varnish just as a pass-proxy until a better idea evolves. It does avoid" [puppet] - 10https://gerrit.wikimedia.org/r/181613 (https://phabricator.wikimedia.org/T76200) (owner: 10Alexandros Kosiaris) [18:23:38] (03CR) 10Ori.livneh: [C: 04-2] "We should not care about the exact UID; this is carryover from having to migrate from the old wikimedia-appserver package. Merging this wo" [puppet] - 10https://gerrit.wikimedia.org/r/178690 (https://phabricator.wikimedia.org/T78076) (owner: 10BryanDavis) [18:25:00] (03PS4) 10Alexandros Kosiaris: Reuse parsoid varnish for cxserver [puppet] - 10https://gerrit.wikimedia.org/r/181613 (https://phabricator.wikimedia.org/T76200) [18:25:52] (03PS8) 10Ori.livneh: logstash: Parse apache syslog messages [puppet] - 10https://gerrit.wikimedia.org/r/179480 (owner: 10BryanDavis) [18:26:00] (03CR) 10Ori.livneh: [C: 032 V: 032] logstash: Parse apache syslog messages [puppet] - 10https://gerrit.wikimedia.org/r/179480 (owner: 10BryanDavis) [18:29:54] PROBLEM - Varnishkafka Delivery Errors per minute on cp3004 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [18:30:35] PROBLEM - Varnishkafka Delivery Errors per minute on cp3020 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [18:35:04] PROBLEM - Varnishkafka Delivery Errors per minute on cp3008 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [20000.0] [18:35:23] ottomata: ack? [18:35:30] are the varnishkafka alerts ignorable? 
[18:35:44] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [18:35:45] RECOVERY - Varnishkafka Delivery Errors per minute on cp3004 is OK: OK: Less than 1.00% above the threshold [0.0] [18:36:00] ori: ignorable, yeah, i keep getting side tracked, they are kinda under development a bit too :/ [18:39:05] RECOVERY - Varnishkafka Delivery Errors per minute on cp3020 is OK: OK: Less than 1.00% above the threshold [0.0] [18:42:24] RECOVERY - Varnishkafka Delivery Errors per minute on cp3008 is OK: OK: Less than 1.00% above the threshold [0.0] [18:44:15] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [18:46:57] (03CR) 10Dzahn: [C: 032] planets: remove SSL stanzas [puppet] - 10https://gerrit.wikimedia.org/r/181984 (https://phabricator.wikimedia.org/T60048) (owner: 10John F. Lewis) [18:50:38] (03CR) 10Dzahn: [C: 032] planet: change dns to misc-web [dns] - 10https://gerrit.wikimedia.org/r/181985 (https://phabricator.wikimedia.org/T60048) (owner: 10John F. Lewis) [18:54:05] mutante: what happened to that udp2log/ferm patch? [18:54:17] 3operations: Put all zirconium vhosts behind misc varnish cluster - https://phabricator.wikimedia.org/T60048#966337 (10Dzahn) [18:55:49] paravoid: you mean this one? https://gerrit.wikimedia.org/r/#/c/169691/ [18:55:55] not merged yet [18:56:27] ah, one of the last comments was that "udp2log is going away soon" [18:56:39] I've been hearing this for 3 years now [18:59:12] paravoid: did you see my question above re: nutcracker? [19:02:11] (03CR) 10John F. Lewis: [C: 031] bugzilla: add Apache site for static BZ version [puppet] - 10https://gerrit.wikimedia.org/r/183758 (https://phabricator.wikimedia.org/T85140) (owner: 10Dzahn) [19:02:58] (03CR) 10John F.
Lewis: [C: 031] bugzilla: add varnish config for static-bugzilla [puppet] - 10https://gerrit.wikimedia.org/r/183759 (https://phabricator.wikimedia.org/T85140) (owner: 10Dzahn) [19:03:17] ori: Apache uid is not 48/48 [19:03:25] it's 996/48 [19:03:25] it's not [19:03:28] it varies [19:03:34] "> so it is still consistently 48 / 48, even on new installs." [19:03:41] yes, this was incorrect on my part [19:03:55] but my review/comment on is right [19:04:03] 3operations, Beta-Cluster: Renumber apache user/group to uid=48 - https://phabricator.wikimedia.org/T78076#966359 (10Dzahn) >>! In T78076#966266, @ori wrote:> > so it is still consistently 48 / 48, even on new installs. this is not the case. mw1033: id apache uid=996(apache) gid=48(apache) groups=48(apache) [19:04:15] mutante: thanks for correcting [19:04:38] if that's the case then we should also change or delete https://wikitech.wikimedia.org/wiki/UID [19:05:35] yeah [19:09:04] (03PS1) 10Alexandros Kosiaris: Introduce cxserver.eqiad.wikimedia.org [dns] - 10https://gerrit.wikimedia.org/r/183888 (https://phabricator.wikimedia.org/T76200) [19:09:42] (03CR) 10Alexandros Kosiaris: "Thanks, ran through the puppet compiler and cherry picked on Beta. Seems to work fine. Will merge Monday UTC morning along with https://ge" [puppet] - 10https://gerrit.wikimedia.org/r/181613 (https://phabricator.wikimedia.org/T76200) (owner: 10Alexandros Kosiaris) [19:20:57] (03CR) 10Dzahn: [C: 031] "thanks for using find. there might be other things to improve as mentioned above but i don't want to slow this down more" [puppet] - 10https://gerrit.wikimedia.org/r/182173 (owner: 10Hoo man) [19:26:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3005 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [20000.0] [19:26:46] anyone feels like doing Apache deploy? [19:26:58] f.e. 
181892 [19:27:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [20000.0] [19:28:34] PROBLEM - Varnishkafka Delivery Errors per minute on cp3020 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [19:28:44] PROBLEM - Varnishkafka Delivery Errors per minute on cp3019 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [19:28:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3015 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [20000.0] [19:34:25] RECOVERY - Varnishkafka Delivery Errors per minute on cp3020 is OK: OK: Less than 1.00% above the threshold [0.0] [19:36:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3004 is OK: OK: Less than 1.00% above the threshold [0.0] [19:36:14] RECOVERY - Varnishkafka Delivery Errors per minute on cp3005 is OK: OK: Less than 1.00% above the threshold [0.0] [19:37:14] PROBLEM - Varnishkafka Delivery Errors per minute on cp3022 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [20000.0] [19:38:06] <^d> mutante: Those hit counter fields/tables should all be null or 0 on WMF sites. [19:38:14] <^d> We've had hitcounters disabled since the beginning of time. 
[19:38:14] RECOVERY - Varnishkafka Delivery Errors per minute on cp3019 is OK: OK: Less than 1.00% above the threshold [0.0] [19:38:14] RECOVERY - Varnishkafka Delivery Errors per minute on cp3015 is OK: OK: Less than 1.00% above the threshold [0.0] [19:39:18] PROBLEM - Varnishkafka Delivery Errors per minute on cp3021 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [19:40:05] PROBLEM - Varnishkafka Delivery Errors per minute on cp3016 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [20000.0] [19:40:25] PROBLEM - Varnishkafka Delivery Errors per minute on cp3003 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [20000.0] [19:43:30] ^d: ok, well, it existed a couple years ago and i just wanted to know if "completely remove or globally add the "views" column in stats tables" [19:43:36] is still current [19:43:42] it's another task [19:43:56] 3Engineering-Community, WMF-Legal, operations: Implement the Volunteer NDA process in Phabricator - https://phabricator.wikimedia.org/T655#966489 (10JohnLewis) >>! In T655#964250, @chasemp wrote: > I think there is also a place for address? Yeah, the NDA documents (paper version) have a place to put your addres... [19:44:18] gah this is why I hate using 'JohnLewis' as my username in things [19:45:50] <^d> mutante: Yeah, some of it's existed since the dawn of time. [19:46:10] JohnLewis: name collision? [19:46:23] mutante: no, just annoying IRC pings :) [19:46:25] RECOVERY - Varnishkafka Delivery Errors per minute on cp3003 is OK: OK: Less than 1.00% above the threshold [0.0] [19:46:33] by bots. [19:48:14] JohnLewis: /ignore wikibugs ? 
:p [19:48:34] I want a /dontpingme option :p [19:50:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3021 is OK: OK: Less than 1.00% above the threshold [0.0] [19:50:24] RECOVERY - Varnishkafka Delivery Errors per minute on cp3022 is OK: OK: Less than 1.00% above the threshold [0.0] [19:52:04] RECOVERY - Varnishkafka Delivery Errors per minute on cp3016 is OK: OK: Less than 1.00% above the threshold [0.0] [19:52:22] (03PS3) 10BBlack: SSL: Remove RC4, enable 3DES [puppet] - 10https://gerrit.wikimedia.org/r/178555 [19:55:44] 3ops-eqiad: eqiad housekeeping task -storage room - https://phabricator.wikimedia.org/T86344#966545 (10Cmjohnson) [19:58:24] (03PS3) 10Ottomata: Add job to crunch Language team data [puppet] - 10https://gerrit.wikimedia.org/r/183734 (owner: 10Milimetric) [19:58:30] (03CR) 10Ottomata: [C: 032 V: 032] Add job to crunch Language team data [puppet] - 10https://gerrit.wikimedia.org/r/183734 (owner: 10Milimetric) [19:59:15] (03CR) 10BBlack: [C: 031] "This version doesn't mess with the DH stuff, it just swaps RC4 for 3DES, which seems to be what we need to get done in the near term on ou" [puppet] - 10https://gerrit.wikimedia.org/r/178555 (owner: 10BBlack) [20:00:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3010 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [20000.0] [20:02:15] PROBLEM - Varnishkafka Delivery Errors per minute on cp3015 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [20000.0] [20:02:55] PROBLEM - Varnishkafka Delivery Errors per minute on cp3007 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [20000.0] [20:04:14] PROBLEM - puppet last run on cp3020 is CRITICAL: CRITICAL: puppet fail [20:06:32] (03PS1) 10Ottomata: Temporarly disable esams bits varnishkafka [puppet] - 10https://gerrit.wikimedia.org/r/183896 [20:07:45] PROBLEM - Varnishkafka Delivery Errors per minute on cp3008 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [20000.0] 
[20:07:55] RECOVERY - Varnishkafka Delivery Errors per minute on cp3010 is OK: OK: Less than 1.00% above the threshold [0.0] [20:08:33] (03CR) 10Chmarkine: [C: 031] SSL: Remove RC4, enable 3DES [puppet] - 10https://gerrit.wikimedia.org/r/178555 (owner: 10BBlack) [20:08:37] (03PS2) 10Ottomata: Temporarly disable esams bits varnishkafka [puppet] - 10https://gerrit.wikimedia.org/r/183896 [20:10:44] RECOVERY - Varnishkafka Delivery Errors per minute on cp3015 is OK: OK: Less than 1.00% above the threshold [0.0] [20:11:15] RECOVERY - Varnishkafka Delivery Errors per minute on cp3007 is OK: OK: Less than 1.00% above the threshold [0.0] [20:15:44] (03CR) 10Ottomata: [C: 032] Temporarly disable esams bits varnishkafka [puppet] - 10https://gerrit.wikimedia.org/r/183896 (owner: 10Ottomata) [20:16:15] !log stopping esams bits varnishkafka instances [20:16:24] Logged the message, Master [20:17:14] RECOVERY - Varnishkafka Delivery Errors per minute on cp3008 is OK: OK: Less than 1.00% above the threshold [0.0] [20:20:44] (03PS1) 10Chad: Remove $wgDisableCounters, defunct and true by default now [mediawiki-config] - 10https://gerrit.wikimedia.org/r/183902 [20:22:30] RECOVERY - puppet last run on cp3020 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [20:54:07] 3operations, Phabricator: Create #site-incident tag and use it for incident reports - https://phabricator.wikimedia.org/T85889#966759 (10greg) meta-comment: I want to make sure #Ops is on board with any change given the incident report process came out of their group explicitly. So, let's bring this up in one of... 
[21:00:37] PROBLEM - nutcracker port on mw1226 is CRITICAL: Cannot assign requested address [21:01:47] RECOVERY - nutcracker port on mw1226 is OK: TCP OK - 0.000 second response time on port 11212 [21:04:47] 3Wikimedia-Mailing-lists, ops-requests: Delete tools-wmt-staff mailing list - https://phabricator.wikimedia.org/T85038#966844 (10Dzahn) a:3Dzahn [21:06:03] 3Wikimedia-Mailing-lists, ops-requests: Delete tools-wmt-staff mailing list - https://phabricator.wikimedia.org/T85038#937353 (10Dzahn) done. JohnLewis was list admin. there was practically no content in the archives (kept them anyways). the list is gone: /var/lib/mailman/bin# ./rmlist tools-wmt-staff Not remov... [21:06:42] 3Wikimedia-Mailing-lists, ops-requests: Delete tools-wmt-staff mailing list - https://phabricator.wikimedia.org/T85038#966868 (10Dzahn) 5Open>3Resolved [21:22:36] Seriously? https://phabricator.wikimedia.org/T74514#966771 [21:23:27] 3operations, Wikimedia-Mailing-lists: disable the defunct mk-edu list - https://phabricator.wikimedia.org/T74289#966971 (10Dzahn) a:3Dzahn [21:24:16] PROBLEM - Varnishkafka Delivery Errors per minute on cp3007 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [21:26:39] 3operations, Wikimedia-Mailing-lists: disable the defunct mk-edu list - https://phabricator.wikimedia.org/T74289#966991 (10Dzahn) Hello, i deleted the list on the server. So now it won't show up anymore and the listinfo page is gone. The archives still exist on the server if they are desired. sodium:/var/lib/... 
[21:26:55] 3operations, Wikimedia-Mailing-lists: disable the defunct mk-edu list - https://phabricator.wikimedia.org/T74289#966992 (10Dzahn) 5Open>3Resolved [21:27:55] 3operations, Wikimedia-Mailing-lists: disable the defunct mk-edu list - https://phabricator.wikimedia.org/T74289#761627 (10Dzahn) gone: https://lists.wikimedia.org/mailman/listinfo/mk-edu still here: https://lists.wikimedia.org/pipermail/mk-edu/ [21:30:16] RECOVERY - Varnishkafka Delivery Errors per minute on cp3007 is OK: OK: Less than 1.00% above the threshold [0.0] [21:32:07] (03CR) 10Ottomata: [C: 032] "Lemme know if/when you want me to merge. I"d rather do this while we are both available." [puppet] - 10https://gerrit.wikimedia.org/r/172201 (https://bugzilla.wikimedia.org/72740) (owner: 10QChris) [21:54:03] (03PS3) 10Dzahn: admin: add krenair to deployers [puppet] - 10https://gerrit.wikimedia.org/r/181421 (owner: 10Giuseppe Lavagetto) [21:55:01] (03PS4) 10Alex Monk: admin: add krenair to deployers [puppet] - 10https://gerrit.wikimedia.org/r/181421 (owner: 10Giuseppe Lavagetto) [21:55:17] (just changed commit message) [21:58:28] (03PS5) 10Dzahn: admin: add krenair to deployers [puppet] - 10https://gerrit.wikimedia.org/r/181421 (owner: 10Giuseppe Lavagetto) [21:58:35] (needed rebase again) [21:59:59] (03CR) 10Dzahn: [C: 032] "waiting period over, has manager approval, has NDA approval" [puppet] - 10https://gerrit.wikimedia.org/r/181421 (owner: 10Giuseppe Lavagetto) [22:01:12] 3Engineering-Community, WMF-Legal, operations: Implement the Volunteer NDA process in Phabricator - https://phabricator.wikimedia.org/T655#967112 (10Qgil) @chasemp, we have been discussing and testing that process for a while now, and basically it is just waiting the official blessing of Legal (they ok'ed it ver... 
[22:01:42] Krenair: Notice: /Stage[main]/Admin/Admin::Hashuser[krenair]/Admin::User[krenair]/File[/home/krenair/.ssh/authorized_keys]/ensure: created [22:01:50] Krenair: that was on terbium [22:02:07] on now also on bast1001 to get there [22:02:21] Okay, guess I should actually set up my ssh to test. [22:03:33] Host tin [22:03:49] ProxyCommand ssh -W %h:%p krenair@bast1001.wikimedia.org [22:03:54] User krenair [22:04:07] ^ that in .ssh/config, then try just "ssh tin" [22:04:24] I need IdentityFile as well I think [22:04:39] (03CR) 10Bartosz Dziewoński: "Can't we split this into two rules, one to associate the Phabricator task, and one to generate the link without the stupid $1 restriction?" [puppet] - 10https://gerrit.wikimedia.org/r/177128 (https://phabricator.wikimedia.org/T75997) (owner: 10Krinkle) [22:04:54] Krenair: no, i dont need that either [22:05:03] but i should have said "tin" right away not terbium [22:05:08] it just created your account on tin [22:05:13] that is the deployment server [22:05:23] (03PS5) 10Bartosz Dziewoński: gerrit: Don't match Phabricator identifiers within urls [puppet] - 10https://gerrit.wikimedia.org/r/177128 (https://phabricator.wikimedia.org/T75997) (owner: 10Krinkle) [22:08:04] mutante, works [22:08:37] Krenair: :) [22:08:42] resolved ticket [22:09:16] welcome to deployers [22:11:03] thanks mutante [22:11:07] PROBLEM - Varnishkafka Delivery Errors per minute on cp3008 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [20000.0] [22:12:07] PROBLEM - Varnishkafka Delivery Errors per minute on cp3007 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [20000.0] [22:16:45] greg-g, ^ [22:17:10] Krenair: congrats :) [22:17:15] Krenair: now don't break anything :P [22:17:26] hah [22:18:06] RECOVERY - Varnishkafka Delivery Errors per minute on cp3007 is OK: OK: Less than 1.00% above the threshold [0.0] [22:18:17] RECOVERY - Varnishkafka Delivery Errors per minute on cp3008 is OK: OK: Less than 1.00% above 
the threshold [0.0] [22:18:44] 3Phabricator, operations: Create #site-incident tag and use it for incident reports - https://phabricator.wikimedia.org/T85889#967149 (10GWicke) @greg: Sounds good to me. [22:18:48] 3Phabricator, operations: Migrate procurement@ from RT to Phabricator - https://phabricator.wikimedia.org/T84862#967150 (10Qgil) fwiw https://phabricator.wikimedia.org/tag/ops-access-requests/ says: > procurement@ queues is currently still handled in RT. > That project will be activated once that has been migr... [22:21:02] 3Engineering-Community, WMF-Legal, operations: Implement the Volunteer NDA process in Phabricator - https://phabricator.wikimedia.org/T655#967154 (10chasemp) Sounds good to me if that is cool with legal. I wasn't trying to redefine the process, what is there now is basically the old idea in the new system. Bef... [22:22:50] (03PS3) 10Gergő Tisza: Deploy Sentry on the beta cluster [mediawiki-config] - 10https://gerrit.wikimedia.org/r/181439 (https://phabricator.wikimedia.org/T78807) [22:23:37] (03CR) 10Gergő Tisza: "UnWIPping. Looks like a proper puppet setup for Sentry will take a while, no sense in blocking on that." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/181439 (https://phabricator.wikimedia.org/T78807) (owner: 10Gergő Tisza) [22:25:24] 3Engineering-Community, WMF-Legal, operations: Implement the Volunteer NDA process in Phabricator - https://phabricator.wikimedia.org/T655#967161 (10Qgil) Ok, but let's agree that the first NDA request we receive needs to go through the new Legalpad-based process. Luis will need to vet it anyway, so it will be o... [22:25:50] Hi [22:25:55] Are WMF employees online? [22:26:07] I'm attempting to do a Google hangouts but I am in mainland China and Freegate is not working well [22:26:10] a ton of us [22:26:33] I have been asked to do Google Hangouts today (around 4 PM San francisco time, I think 8 AM Beijing time) [22:26:45] But I got a phone call coming from California... 
when I answered no voice [22:26:53] I think my Chinese phone provider doesn't take international calls [22:27:11] I had my father call the number but he got a switchboard [22:27:14] So he left a message [22:29:12] WhisperToTheWorl: do you want somebody to call you back ? [22:29:39] To send a message over IRC or email [22:29:47] Or if they have WeChat (Chinese chat service) [22:29:48] Or Skype [22:29:49] who is your contact at the office? [22:29:57] (Being handled on #wikimedia) [22:30:01] I don't know... the names are in my email [22:30:05] And thats Gmail [22:30:46] WhisperToTheWorl, what kind of call is it going to be, 1:1 or group? [22:31:30] i'm not certain exactly.. I don't have access to the related emails [22:33:31] WhisperToTheWorl, do you have a link to the hangout or something? [22:34:05] That would be in my Gmail account and unfortunately I'm having trouble accessing that [22:34:19] It's resolved [22:34:20] The Chinese interfere with Google services intentionally [22:34:31] Nemo: Thank you [22:35:13] I wish these problems didn't happen... but they are happening :( [22:39:21] I have the interview link now [22:39:44] Emily is my contact [22:45:50] WhisperToTheWorl: ok, if you need us to relay a message to her , can do [22:49:39] I did get her e-mail so I sent her a reply [22:50:03] I also made a post at https://meta.wikimedia.org/wiki/Talk:Google_Hangouts [22:50:08] discussing Mainland China [22:50:32] ugh. i can't clone the operations/puppet repo, getting "fatal: early EOF" when it's like 96% done. anyone else has this, or just me? [22:50:51] MatmaRex: try cloning from github, maybe? [22:54:26] meh [22:55:05] valhallasw`cloud: in unrelated news, http://tools.wmflabs.org/gerrit-patch-uploader/ is not working for me. after submitting, the output ends with "git commit --author="Bartosz Dziewoński " -F - < message" and the patch isn't uploaded. [22:55:21] MatmaRex: :( what repo is that? 
[22:55:26] PROBLEM - Slow CirrusSearch query rate on fluorine is CRITICAL: CirrusSearch-slow.log_line_rate CRITICAL: 0.00333333333333 [22:55:31] operations/puppet [22:55:50] might be related, then, not sure. [22:55:59] (03PS6) 10Gerrit Patch Uploader: gerrit: Don't match Phabricator identifiers within urls [puppet] - 10https://gerrit.wikimedia.org/r/177128 (https://phabricator.wikimedia.org/T75997) (owner: 10Krinkle) [22:56:00] although the commit suggests it cloned correctly [22:56:01] (03CR) 10Gerrit Patch Uploader: "This commit was uploaded using the Gerrit Patch Uploader [1]." [puppet] - 10https://gerrit.wikimedia.org/r/177128 (https://phabricator.wikimedia.org/T75997) (owner: 10Krinkle) [22:56:19] valhallasw`cloud: bah, worked now, when i got rid of the "ń". [22:56:26] MatmaRex: :| [22:56:44] sooo yeah, fix your unicode :D [22:57:02] MatmaRex: I'm 90% certain it's slightly more subtle than that ;-) [22:57:18] because I'm pretty sure you reported something like that before [22:57:22] (03CR) 10Bartosz Dziewoński: "I seem to be unable to clone this repo, so tried this way. Chris, is this going to work?" [puppet] - 10https://gerrit.wikimedia.org/r/177128 (https://phabricator.wikimedia.org/T75997) (owner: 10Krinkle) [22:57:58] MatmaRex: it didn't show errors of any kind, just stopped after git commit? [22:58:15] "--author=" + committer.encode('utf-8') looks OK to me :/ [22:58:20] valhallasw`cloud: yep. did not show me the output of `git commit` itself, only that line [23:00:37] RECOVERY - Slow CirrusSearch query rate on fluorine is OK: CirrusSearch-slow.log_line_rate OKAY: 0.0 [23:01:57] MatmaRex: yep, can reproduce. teh fuck. 
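(Note on the `git commit --author=...` line quoted above: a minimal Python 3 sketch of one way to pass a non-ASCII author name to git. This is not the Gerrit Patch Uploader's actual code; the function names `build_commit_command` and `run_commit` and the example address are hypothetical. It assumes the arguments are passed as a list of `str`, so no manual `.encode()` or shell quoting is needed, and that git's own output is captured so that a failure is visible instead of the log simply stopping.)

```python
import subprocess


def build_commit_command(author_name, author_email):
    """Build the argv list for `git commit`, reading the message from stdin.

    Passing a list of str avoids shell quoting and lets Python encode
    the non-ASCII author name for the OS.
    """
    author = "{} <{}>".format(author_name, author_email)
    return ["git", "commit", "--author=" + author, "-F", "-"]


def run_commit(author_name, author_email, message, cwd):
    """Run the commit and return (exit code, combined stdout/stderr text)."""
    result = subprocess.run(
        build_commit_command(author_name, author_email),
        input=message.encode("utf-8"),   # commit message goes in via stdin
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,        # keep git's error text with its output
        check=False,
    )
    return result.returncode, result.stdout.decode("utf-8", errors="replace")
```

(Returning the exit code together with git's output means a caller can print both, which is the piece that was missing from the uploader's log in the report above.)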
[23:06:16] MatmaRex: and yep, I had to fix my unicode [23:06:21] .encode instead of .decode [23:06:24] >_< [23:06:30] * valhallasw`cloud sits in a corner, ashamed [23:06:45] except that's not it [23:07:31] valhallasw`cloud: i feel bad that your toolchain of choice doesn't Just Work when faced with utf-8 strings :( [23:07:39] :> [23:08:40] MatmaRex: there's no such thing as 'Just Working' :p [23:09:20] valhallasw`cloud: #python3masterrace [23:09:37] legoktm: try cron and python 3, then come back to me. [23:09:41] (that rhymes) [23:10:20] what about cron? I've run python3 scripts under it just fine [23:10:53] legoktm: cron = C locale = ascii default encoding [23:11:04] so don't try to dump anything non-ascii to stdout [23:11:25] sounds like a bug in cron to me ;) [23:14:56] legoktm: why? providing C as default locale is perfectly valid [23:15:23] what other locale should it use? cron doesn't do .profile, I think :-p [23:15:38] and then there's stuff like 'what's python 3's default file encoding?' [23:16:01] python 3 helps immensely in the inside, but the boundaries are still problematic [23:17:15] anyway, bed time [23:17:31] MatmaRex: I'll deploy tomorrow, I think, but I have to merge some stuff from YuviPanda so it might take a bit [23:17:58] sure. thanks [23:18:55] MatmaRex: also, I probably spammed it before, https://www.mediawiki.org/wiki/Phabricator/Code has some notes on how to get started with phab development [23:24:37] (03PS2) 10Dzahn: bugzilla: add Apache site for static BZ version [puppet] - 10https://gerrit.wikimedia.org/r/183758 (https://phabricator.wikimedia.org/T85140) [23:25:49] Thanks to everyone! 
Be back later [23:26:50] (03CR) 10Dzahn: [C: 032] bugzilla: add Apache site for static BZ version [puppet] - 10https://gerrit.wikimedia.org/r/183758 (https://phabricator.wikimedia.org/T85140) (owner: 10Dzahn) [23:35:19] (03PS2) 10Dzahn: bugzilla: add varnish config for static-bugzilla [puppet] - 10https://gerrit.wikimedia.org/r/183759 (https://phabricator.wikimedia.org/T85140) [23:38:17] (03CR) 10Dzahn: [C: 032] bugzilla: add varnish config for static-bugzilla [puppet] - 10https://gerrit.wikimedia.org/r/183759 (https://phabricator.wikimedia.org/T85140) (owner: 10Dzahn) [23:44:20] (03CR) 10Dzahn: [C: 032] add static-bugzilla name, point to misc-web [dns] - 10https://gerrit.wikimedia.org/r/183760 (owner: 10Dzahn)
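(Note on the cron / C locale exchange at 23:10-23:16: a minimal Python 3 sketch of writing non-ASCII text to stdout without relying on the locale's default encoding. The helper name `utf8_writer` is hypothetical. It assumes the consumer of the output expects UTF-8; setting the `PYTHONIOENCODING=utf-8` environment variable in the crontab is an alternative that needs no code change.)

```python
import io
import sys


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is always UTF-8,
    whatever encoding the process locale (for example C under cron) implies."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


def main():
    out = utf8_writer(sys.stdout.buffer)
    out.write("Bartosz Dziewoński\n")   # non-ASCII, would fail with an ASCII stdout
    out.flush()
    out.detach()                         # leave sys.stdout.buffer open for other users


if __name__ == "__main__":
    main()
```

(This only covers the stdout boundary. Files opened with `open()` still use the locale's default unless `encoding=` is given explicitly, which is the "default file encoding" point raised in the conversation.)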