[08:41:20] hi folks - I have something suspicious on Parsoid-PHP logstash dashboard, it's Very Empty since a couple of days. I believe there's a filter somewhere whose value should change (because i've seen some stuff in mediawiki-errors that i think should be on that board), but I don't know what or how :/ could someone have a look?
[08:41:53] (i need to be out for a doc appointment for the next 2+ hours, so i won't be behind a keyboard, but giving a heads-up in case it's useful. thanks!)
[09:29:54] <_joe_> ihurbain: without an example of something that should be in parsoid-php and is in mediawiki-errors, it's hard to understand what filter might be wrong
[10:28:47] topranks: _joe_: I'm re-trying the wikikube-codfw depool test from yesterday now
[10:29:36] <_joe_> jayme: ack
[11:14:31] _joe_: good point, sorry, that was an oversight. https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2025.06.19?id=x45Qh5cBONYSsV1OGyRO and https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2025.06.19?id=xdndh5cBc9IpmxE5V8Cy feel like they would belong in there.
[11:14:57] in any case the fact that we have on that board 0 oom and 0 timeout for more than 48 hours is probably too good to be true
[11:20:06] <_joe_> ihurbain: https://grafana.wikimedia.org/d/35WSHOjVk/application-servers-red-k8s?orgId=1&from=now-7d&to=now&timezone=utc&var-site=$__all&var-deployment=mw-parsoid&var-method=GET&var-code=200&var-handler=php&var-service=mediawiki&refresh=1m it seems to coincide with when we moved away the last speck of traffic from restbase
[11:20:20] <_joe_> now requests for parsoid urls go to mw-api-int
[11:21:35] <_joe_> so if as i think that's using a specialized filter to get only alerts from the parsoid cluster, that explains it
[11:22:15] that sounds plausible (i have no idea how this board is defined, that was Before My Time™)
[11:22:33] I think i'm actually the one that fixed it after we moved to mw-parsoid
[11:23:05] let me see if I can fix that
[11:23:20] thank you :)
[11:24:44] hmm the problem is nothing identifies these messages as being from parsoid
[11:24:59] before they were emitted by the servergroup mw-parsoid
[11:25:08] ah
[11:25:34] I *could* try and match on exception.file containing "parsoid"
[11:26:11] I don't know how structured logging works in mediawiki though, maybe the log message type could be set to parsoid instead of mediawiki?
[11:26:46] i can have a look
[11:27:01] lmk what you find :)
[11:27:26] the match exception.file containing parsoid may be a reasonable stopgap before we can fix and deploy a better fix
[11:27:48] claime: I think we have kube-mw-parsoid servergroup there right
[11:28:02] effie: we don't anymore
[11:28:15] hmm
[11:28:17] traffic to that was from restbase, and now it's going directly to mw-api-int
[11:28:28] ihurbain: yeah i'm looking at that in //
[11:28:33] ah! right!
[11:32:40] ihurbain: timeouts can't work like that though since they're all thrown by the excimer thingie
[11:33:16] yeah and OOMs are thrown by PHP fatal error, i don't see that happening in that way either
[11:33:30] hmm ooms look like I can catch them though
[11:33:40] ah
[11:33:43] ah yes
[11:33:54] on file.whatever, yes
[11:34:03] i was looking from the logger point of view
[11:34:19] (exception.file, not file.whatever)
[11:34:25] ok I've fixed what I can
[11:34:31] Should be at least a little useful now
[11:34:40] brilliant, thank you :)
[11:35:04] i'll follow up with the team when they wake up and they're not on a holiday :P
[11:35:13] you're welcome
[12:49:52] <_joe_> heads up to all roots: I have switched the requestctl cli tool from talking directly to etcd to talking to the web api. My tests were all successful but please let me know if you still use it and it doesn't work.
[12:50:36] <_joe_> there's been a few small changes (like, I removed the "pretty" view from the client as if you want human-readable stuff you should use the web interface
[12:52:18] <_joe_> oh, one big advantage is - you can run it without sudo!
[13:19:07] tappof: I think I beat you to puppet merge, ok to go ?
[13:19:30] yes effie thanks
[13:19:34] cheers
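
Note on the stopgap discussed between 11:25 and 11:34: since the servergroup no longer identifies Parsoid messages, the idea was to match on exception.file containing "parsoid". A minimal sketch of that matching logic follows, in Python rather than in the dashboard's own query language. The record layout (a nested "exception" object with a "file" key), the sample paths, and the function names are all assumptions made for illustration; the actual filter applied to the dashboard is not shown in the log.

```python
from collections.abc import Iterable, Iterator, Mapping
from typing import Any


def looks_like_parsoid(record: Mapping[str, Any]) -> bool:
    """Return True if the record's exception.file mentions "parsoid".

    Assumes (hypothetically) that the exception is a nested mapping with a
    string "file" key. Records without one never match, which is why events
    raised outside Parsoid's own files would be missed by this stopgap.
    """
    exception = record.get("exception")
    if not isinstance(exception, Mapping):
        return False
    path = exception.get("file")
    return isinstance(path, str) and "parsoid" in path.lower()


def parsoid_records(records: Iterable[Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Yield only the records that look like they come from Parsoid code."""
    return (record for record in records if looks_like_parsoid(record))


# Hypothetical sample records: every path below is invented for illustration.
SAMPLE = [
    {"servergroup": "mw-api-int",
     "exception": {"file": "/srv/example/vendor/wikimedia/parsoid/src/Example.php"}},
    {"servergroup": "mw-api-int",
     "exception": {"file": "/srv/example/includes/Example.php"}},
    {"servergroup": "mw-api-int", "message": "no exception attached"},
]

if __name__ == "__main__":
    for record in parsoid_records(SAMPLE):
        print(record["exception"]["file"])
```

The match is case-insensitive and keyed on the file path alone, so it only catches errors whose exception originates in a file with "parsoid" in its path. That is consistent with the limitation raised at 11:32:40, and a dedicated log type or channel would remain the more robust fix.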