[00:32:11] serviceops, Analytics, ChangeProp, Community-Tech, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Krinkle)
[00:36:04] serviceops, Analytics, ChangeProp, Community-Tech, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Krinkle) Tracking on the RFC board. As Daniel mentioned, it's not yet in the stage where it's seeking input o...
[06:50:23] serviceops, MediaWiki-General-or-Unknown, Operations, Core Platform Team (PHP7 (TEC4)), and 2 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Joe) So for the record, in terms of impact: ` anomie> _joe_: A...
[06:52:35] serviceops, MediaWiki-General-or-Unknown, Operations, Core Platform Team (PHP7 (TEC4)), and 2 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Joe) >>! In T219279#5095261, @kchapman wrote: > @Joe did you ge...
[06:53:04] serviceops, Core Platform Team Backlog, MediaWiki-General-or-Unknown, Operations, and 2 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Joe)
[10:57:28] serviceops, Operations, Wikidata, Wikidata-Termbox-Hike, and 4 others: New Service Request: Wikidata Termbox SSR - https://phabricator.wikimedia.org/T212189 (WMDE-leszek) Sounds good, thanks @akosiaris ! We monitor T220402 then.
[12:22:42] serviceops, Operations, Thumbor: Export useful metrics from haproxy logs for Thumbor - https://phabricator.wikimedia.org/T220499 (jijiki)
[12:23:01] serviceops, Operations, Thumbor: Export useful metrics from haproxy logs for Thumbor - https://phabricator.wikimedia.org/T220499 (jijiki)
[12:23:39] serviceops, Operations, Thumbor, User-jijiki: Upgrade Thumbor to Buster - https://phabricator.wikimedia.org/T216815 (jijiki)
[12:59:46] godog: addressed several nitpicks, how do you feel about merging?
[13:00:55] we should force a puppet-run on swift nodes but dunno if it requires a restart
[13:03:42] fsero: sorry I can't right now :( debugging what went wrong after logstash1012's crash, tomorrow morning perhaps?
[13:08:03] yep i'll make an invite so we dont forget
[13:13:08] in other regards https://github.com/cststack/k8comp
[13:13:13] this might be interesting for us
[13:37:45] that looks pretty nice fsero
[13:38:26] yup is also a helm plugin so it works out of the box from helm
[13:45:04] <_joe_> oh god
[13:45:09] <_joe_> hiera for kubernetes?
[13:45:15] <_joe_> y'all want to KILL ME
[13:45:43] <_joe_> seriously though, it sounds wicked but might be not-bad
[13:46:35] i would prefer to kill hiera, and in the meantime puppet
[13:46:40] but i want to keep it real
[13:46:49] <_joe_> yeah you know that's not gonna happen
[13:46:54] you need to work with ehat you have
[13:47:00] <_joe_> you can partially kill it from our everyday job
[13:47:05] <_joe_> but seriously though
[13:47:21] <_joe_> what is the advantage of using hiera vs helm and values.yaml files?
[13:47:44] <_joe_> I'm trying to understand
[13:47:50] not much except easing the migration from current services
[13:48:23] i think the use case was described by otto and eventgate, they have the list of kafka nodes on hiera and now they are duplicating it on values.yaml
[13:48:43] <_joe_> EHHHH
[13:48:44] which may or may not be an inconvenience
[13:49:21] <_joe_> we might never be able to use the same hiera hierarchy we use for production for this
[13:49:54] <_joe_> I'm not trying to shoot this down btw, I'm trying to understand if it could be valuable
[13:50:16] my intent here is to portrait the option, so we have it on our radar
[13:50:27] <_joe_> on one hand, I would prefer us to manage (via puppet or whatever) a global "production-values.yaml" file
[13:50:45] <_joe_> so that otto could refer to values in it
[13:51:02] <_joe_> and make it part of the release process, as helm can merge values from multiple files
[13:51:28] <_joe_> and well, ofc having helm-file to guarantee we can track what has been sent to k8s is probably a good idea
[15:24:46] any idea who/how I should ask about my Go app deployment question from yesterday?
[15:25:34] XioNoX: we should have replied to you frankly
[15:25:41] so that is our bad
[15:27:15] I take IRC as best effort, so a "send us an email" works for me too
[15:30:42] why don't you open a task and add the serviceops tag?
[15:30:49] this will work too!
[15:34:53] cool, will do
[15:35:31] I asked on IRC because I'm mostly curious about the work required. So I don't want someone to spend too much time on it.
[15:46:50] we can direct your questions to our previous customers
[15:47:02] I am sure they will be happy to share their experience with you
[16:37:08] <_joe_> XioNoX: is this an external software?
[16:37:19] <_joe_> or something you're writing?
[16:37:30] <_joe_> and, where would that need to run?
[16:38:26] _joe_: XioNoX refers to https://github.com/cloudflare/gortr
[16:38:37] <_joe_> oh ok
[16:38:47] is an external software, and it serves and HTTP endpoint if im not mistaken
[16:39:00] i think he dont care about where but is a good candidate for k8s
[16:39:12] <_joe_> snap
[16:39:19] <_joe_> I hoped I could get you to package it
[16:40:59] I don't know much about k8s, but I'd say either that, or netmon hosts, or their own ganeti, they would need ~1G ram, ~1 vcpu
[16:41:25] XioNoX: is up to you, in any case first step is to package it
[16:41:40] <_joe_> I was trying to understand how important this is to the infrastructure
[16:41:53] <_joe_> I'd rather not put infrastructural applications on top of k8s
[16:42:26] <_joe_> but I can see arguments in the other direction for things like e.g. netbox
[16:42:32] _joe_: would ideally need to be redundant (one in each DC), can go down for a little bit of time, but ideally nothing too long
[16:42:43] <_joe_> k8s guarantees a better HA in general
[16:42:48] <_joe_> but may fail badly
[16:42:55] <_joe_> this should be reachable from where?
[16:43:12] _joe_: all the routers, so only internal stuff (private IP)
[16:43:26] the 2 options so far are a java app, or this Go app :)
[16:43:28] <_joe_> ok, so it should be ok to be on kube
[16:44:17] <_joe_> also, do we want to enforce the use of the pipeline for such a project?
[16:44:27] _joe_: the pipeline?
[16:44:40] that sounds scary
[16:44:42] <_joe_> the deployment pipeline :)
[16:44:54] <_joe_> fsero: do you remember what do we do for zotero?
I think we run it through the pipeline
[16:45:10] <_joe_> XioNoX: it's really not, it automates a few of the steps for you :)
[16:45:19] oh I like that
[16:46:00] is through the pipeline https://github.com/wikimedia/mediawiki-services-zotero/tree/master/.pipeline
[16:46:03] <_joe_> but yeah, open a task, slap "serviceops" on it, and we should discuss there
[16:47:01] <_joe_> fsero: so yeah, blubber supports go applications too, we could exploit that
[16:47:10] _joe_: on top of you mind, if the option is this go app, or a java app, would you say the Go app is better?
[16:47:22] go is always better
[16:47:29] :)
[16:47:32] smaller memory footprint
[16:47:38] also it would need to be able to rsync data from outside, would k8s still be an option?
[16:47:38] and more RPS
[16:47:51] i guess you mean that json right?
[16:47:56] <_joe_> uhm
[16:48:18] <_joe_> does this application support using a proxy to connect to the outside?
[16:48:35] <_joe_> XioNoX: define "outside" :)
[16:48:40] I don't know, I will look
[16:48:50] and put everything on the task
[16:48:57] <_joe_> yes, please
[16:49:06] outside means the wild internet
[16:49:31] <_joe_> but if it needs to import data from ripe, we can sync the data ourselves to an internal server and then sync from there to the application
[16:49:42] <_joe_> maybe after validating the data is legit?
[16:50:06] <_joe_> anyways, I'm just letting my imagination fly
[16:50:13] the tool checks if the data is legit using RPKI
[16:50:31] anyway, thanks, that's already a lot of info
[16:56:50] _joe_: I'm curious of your reasoning for putting netbox in k8s
[16:58:42] <_joe_> volans: as long as netbox is not an integral part of our automation
[16:58:48] <_joe_> if it becomes such, then no.
[16:59:22] ah ok, because the plans are in that direction ;)
[16:59:29] <_joe_> but if not, k8s guarantees in general a better availability than a single ganeti VM
[16:59:53] the only blocker for put anything on k8s is state
[16:59:57] I don't see really the advantage, given the not easily available postgres behind ;)
[17:00:01] as long state is managed elsewhere is finde
[17:00:05] *fine
[17:00:31] need to run, happy to continue the conversation later :)
[17:00:36] right now it's on two physical hosts btw, active/passive
[17:00:38] :)
[17:07:37] there is a Rust option too, https://github.com/NLnetLabs/routinator
[17:07:56] but I wouldn't put RPKI in the deployment pipeline tbh
[17:08:08] each DC should function independently in this regard
[17:08:18] this sounds like a good candidate for the per-PoP ganeti clusters
[17:08:28] <_joe_> indeed
[17:08:33] paravoid: Yeah I looked at routinator, it requires Rust 1.30, while Debian Stretch has 1.24
[17:08:37] <_joe_> that's why I was asking for more context
[17:09:04] buster has 1.32 ;)
[17:09:20] yeah, but can we use it in prod?
[17:09:22] I've tried building it in the past though, if we go down the debianization route it's going to be hell
[17:09:29] XioNoX: it's already in prod
[17:09:43] ah, didn't know
[17:09:48] <_joe_> also you don't need rust to run it, just to build it
[17:09:53] it was a quartery goal in Q3, we talked about it every week :P
[17:09:56] <_joe_> then it's a normal executable IIRC?
[17:10:00] serviceops, Operations, Thumbor: Export useful metrics from haproxy logs for Thumbor - https://phabricator.wikimedia.org/T220499 (Gilles) For reference: https://medium.com/@tom.fawcett/extracting-useful-duration-metrics-from-haproxy-prometheus-fluentd-2be9832ff702 We can do the same with mtail.
[17:10:10] _joe_: correct
[17:10:15] for the feedback I got, making a package seems to be mandatory
[17:10:38] last i looked there were a bunch of cargo dependencies not in Debian
[17:11:15] (and I wouldn't be surprised if gortr has similar needs)
[17:17:37] <_joe_> oh for sure
[18:13:15] serviceops, Core Platform Team Backlog, MediaWiki-General-or-Unknown, Operations, and 3 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Esanders) > @Esanders is it a huge deal if, in light of the blast rad...
[19:22:51] serviceops, Operations, RESTBase, RESTBase-API, and 3 others: Make RESTBase spec standard compliant and switch to OpenAPI 3.0 - https://phabricator.wikimedia.org/T218218 (mobrovac) >>! In T218218#5081897, @mobrovac wrote: > One other thing left to do here: replace optional parameters in the `/sys...
[19:23:00] serviceops, Operations, RESTBase, RESTBase-API, and 3 others: Make RESTBase spec standard compliant and switch to OpenAPI 3.0 - https://phabricator.wikimedia.org/T218218 (mobrovac)
[19:51:57] serviceops, Operations, RESTBase, RESTBase-API, and 3 others: Make RESTBase spec standard compliant and switch to OpenAPI 3.0 - https://phabricator.wikimedia.org/T218218 (Pchelolo)