[05:31:32] <jayme>	 ottomata: Agreed. As I wrote that down later yesterday I came to that conclusion as well. In a re-deploy scenario we also lack the dynamic configmaps that flink creates so it would not even know where to pick up the latest checkpoint...bummer
[06:13:39] <wikibugs>	 serviceops, Maps, SRE-swift-storage: Swift account to store pre-rendered vector-tiles - https://phabricator.wikimedia.org/T283049 (jijiki)
[06:14:01] <wikibugs>	 serviceops, Maps, SRE-swift-storage: Swift account to store pre-rendered vector-tiles - https://phabricator.wikimedia.org/T283049 (jijiki)
[07:41:32] <wikibugs>	 serviceops, MW-on-K8s, SRE: Create a basic helm chart to test MediaWiki on kubernetes - https://phabricator.wikimedia.org/T265327 (Joe) Open→Resolved
[07:46:12] <wikibugs>	 serviceops, MW-on-K8s, SRE: Create a mwdebug deployment for mediawiki on kubernetes - https://phabricator.wikimedia.org/T283056 (Joe) p:Triage→High
[08:21:29] <_joe_>	 question: do we want to volunteer moving deployment-charts to gitlab early in the process?
[08:47:52] <jayme>	 depends on the definition of early I would say. We'll be down one a.lex and up one for onboarding in the coming weeks, so we might be a little short on time
[09:49:50] <_joe_>	 jayme, akosiaris any chance we can switch to ipvs for kube-proxy?
[09:53:52] <jayme>	 _joe_: technically yes, but we've not tested it and puppet is not prepared for it. What do you need that for?
[09:54:10] <_joe_>	 jayme: mediawiki :P
[09:54:22] <_joe_>	 somehow I trust ipvs' load balancing more than iptables
[09:54:27] <jayme>	 hrhr. Mediawiki works with iptables as well :)
[09:54:30] <jayme>	 ah, okay
[09:55:01] <_joe_>	 but it's also true we're getting good results with stuff like sessionstore or eventgate getting 10k rps already
[09:55:11] <_joe_>	 so not immediately necessary, sure
[09:56:28] <jayme>	 indeed. We can maybe revisit that as part of the "announcing service-ips via bgp" / "better load balancing" project
[10:12:35] <_joe_>	 yeah
[11:57:51] <akosiaris>	 _joe_: we got like 30krps on iptables, all of mediawiki is like 6k-7k?
[11:58:09] <akosiaris>	 I have doubts we are going to see some benefit performance wise
[11:58:47] <_joe_>	 more like 20k, but yes
[12:00:51] <akosiaris>	 https://grafana.wikimedia.org/d/000000519/kubernetes-overview?viewPanel=26&orgId=1 has peaks at 33k 
[12:01:03] <akosiaris>	 so well over 30k total across multiple services
[14:26:13] <_joe_>	 I'm merging https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/685721 so we'll have the ability to get diffs for all changes in deployment-charts
[14:47:17] <_joe_>	 ottomata: sorry I +2'd that change by mistake (fat fingers), but I wanted to show you https://integration.wikimedia.org/ci/job/helm-lint/4134/console
[14:47:36] <_joe_>	 (there is the diff from the change at the end)
[14:48:24] <ottomata>	 nice! :)
[14:49:43] <_joe_>	 ottomata: btw, it's time we bring you back to the mainline - we added canary support to the common helpers 0.3
[14:50:05] <_joe_>	 we just used a slightly different naming than you did, so it should be just a few lines to change
[14:51:03] <ottomata>	 great, saw that was in progress too, sounds good
[14:52:41] <mutante>	 morning, was checking on the meeting. next Tuesday is cool, get well soon, joe
[15:04:06] <urandom>	 _joe_: where is your preso happening?
[15:06:38] <_joe_>	 urandom: next week, I sent a message to y'all
[15:06:51] <_joe_>	 and I moved the calendar invite
[15:07:07] <_joe_>	 I got the vaccine shot on saturday and I felt very sick until this morning
[15:07:14] <_joe_>	 so had no time to prepare it :/
[15:07:41] <_joe_>	 I was a bit too optimistic about my reaction to the shot
[15:07:59] <urandom>	 that sucks, but I'm glad you're feeling better now
[15:08:10] <_joe_>	 sorry, I wrote a message on slack, and moved the calendar invite
[15:08:25] <_joe_>	 I forgot you were doing the code jam
[15:43:10] <wikibugs>	 serviceops, Prod-Kubernetes, Pybal, SRE, Traffic: Proposal: simplify set up of a new load-balanced service on kubernetes - https://phabricator.wikimedia.org/T238909 (akosiaris) PR is at https://github.com/projectcalico/confd/pull/515, waiting for review now. It's been tested locally in a coup...
[16:16:00] <wikibugs>	 serviceops, Maps, Product-Infrastructure-Team-Backlog, SRE, and 3 others: New Service Request tegola - https://phabricator.wikimedia.org/T274390 (jijiki)
[16:17:16] <wikibugs>	 serviceops, Maps, Product-Infrastructure-Team-Backlog, SRE, and 2 others: New Service Request tegola - https://phabricator.wikimedia.org/T274390 (jijiki)
[17:21:30] <wikibugs>	 serviceops, MW-on-K8s, SRE: Create a mwdebug deployment for mediawiki on kubernetes - https://phabricator.wikimedia.org/T283056 (jijiki)
[17:32:38] <wikibugs>	 serviceops, Maps, Product-Infrastructure-Team-Backlog, SRE, and 2 others: New Service Request maps-vector-server - https://phabricator.wikimedia.org/T274390 (jijiki)
[17:34:51] <wikibugs>	 serviceops, Performance-Team, Release-Engineering-Team (Radar): Create warmup procedure for MediaWiki app servers - https://phabricator.wikimedia.org/T230037 (thcipriani)
[17:50:32] <wikibugs>	 serviceops, SRE, Developer Productivity, Performance-Team (Radar), and 2 others: All debug hosts give (likely spurious) message: PHP Fatal error:  The UdpSocket to 127.0.0.1:10514 has been closed (from Monolog/SyslogUdp) - https://phabricator.wikimedia.org/T214734 (thcipriani)
[20:19:23] <wikibugs>	 serviceops, Platform Engineering, Release Pipeline, Release-Engineering-Team, and 5 others: Kask functional testing with Cassandra via the Deployment Pipeline - https://phabricator.wikimedia.org/T224041 (thcipriani)
[21:11:26] <urandom>	 shdubsh: is your ECS-fu strong enough that you could map the following log attributes off the top of your head? - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/kask/+/refs/heads/master/logging.go#46
[21:12:06] <urandom>	 time -> @timestamp and msg -> message are straightforward enough
[21:12:49] <urandom>	 but skimming through the docs, it's not obvious to me how to map the others
[21:13:09] <shdubsh>	 I'd guess: level => log.level, RequestID => event.id, appname => service.type
[21:13:09] <shdubsh>	 assuming appname reasonably correlates with the service.type definition
[21:17:42] <urandom>	 "name of thing that is logging" ?
[21:18:36] <urandom>	 oh damn, log.level seems obvious now
[21:18:41] <urandom>	 but event.id?
[21:19:14] <urandom>	 my guess was trace.id
[21:20:01] <shdubsh>	 hmm, good point
[21:20:17] <urandom>	 shdubsh: RequestID is passed in by the caller, that's definitely one that needs to be consistent everywhere
[21:20:58] <shdubsh>	 ah, yes then that would make sense. if it's consistent between services, trace.id makes the most sense.
[21:22:19] <urandom>	 cool, and service.type is meant to be the name of the service/application?
[21:22:25] <urandom>	 "the thing that logs"?
[21:22:30] <shdubsh>	 yeah, that's how I read it
[21:22:42] <urandom>	 oh, yeah
[21:23:11] <urandom>	 oooh.
[21:23:31] <urandom>	 I see the distinction between service.type and service.name, I think
[21:23:44] <shdubsh>	 different than service.name in the sense that service.name indicates purpose.  like jobrunner-mediawiki and api-mediawiki.  both are mediawiki, but have different purposes
[21:24:07] <shdubsh>	 another example, logging-elasticsearch vs. search-elasticsearch
[21:24:59] <shdubsh>	 probably have those backwards though :/ <service>-<purpose/cluster>
[21:26:10] <urandom>	 yeah, or in this case, kask is the service.type and sessionstore is the service.name?
[21:26:16] <urandom>	 (as an example)
[21:29:30] <shdubsh>	 That sounds about right.
[21:51:14] <urandom>	 shdubsh: thanks!
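The field mapping agreed on in the exchange above (time -> @timestamp, msg -> message, level -> log.level, RequestID -> trace.id, appname -> service.type, plus a separate service.name such as sessionstore) could be sketched roughly as below. This is only an illustration: the function name `to_ecs`, the flat dotted-key output, and the pass-through of unmapped keys are assumptions, not how kask or the logging pipeline actually does it, and the input key names are taken from the chat rather than from logging.go.

```python
# Hypothetical sketch: rename kask-style log keys to the ECS field names
# settled on in the conversation. The source key names come from the chat,
# not from kask's logging.go.
KASK_TO_ECS = {
    "time": "@timestamp",
    "msg": "message",
    "level": "log.level",
    "RequestID": "trace.id",     # caller-supplied, consistent across services
    "appname": "service.type",   # "the thing that logs", e.g. kask
}


def to_ecs(record, service_name=None):
    """Return a new dict with known keys renamed to ECS field names.

    Keys without a mapping are passed through unchanged (an assumption).
    service_name, if given, is stored as service.name (e.g. "sessionstore").
    """
    ecs = {}
    for key, value in record.items():
        ecs[KASK_TO_ECS.get(key, key)] = value
    if service_name is not None:
        ecs["service.name"] = service_name
    return ecs


if __name__ == "__main__":
    # Made-up sample record for illustration only.
    sample = {
        "time": "2000-01-01T00:00:00Z",
        "msg": "example message",
        "level": "info",
        "RequestID": "example-request-id",
        "appname": "kask",
    }
    print(to_ecs(sample, service_name="sessionstore"))
```

Keeping service.name as a separate argument reflects the distinction drawn above: the same service.type (kask) can run under different names depending on its purpose.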
[22:41:22] <wikibugs>	 serviceops, SRE, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (Jdforrester-WMF)
[22:57:04] <wikibugs>	 serviceops, SRE, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (Iniquity)
[22:58:17] <wikibugs>	 serviceops, SRE, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (Jdforrester-WMF)