[10:58:58] i'll merge "scap: add codfw canary appservers to dsh group" https://gerrit.wikimedia.org/r/c/operations/puppet/+/574902
[11:10:38] <_joe_> mutante: please no
[11:10:49] <_joe_> we need a different approach
[11:11:02] <_joe_> I'll explain later, I'm headed to lunch
[11:11:20] <_joe_> but that risks breaking the scap canary checks
[11:11:39] <_joe_> let me revert
[11:11:41] ok, reverting. i tried to get reviews though
[11:11:52] already doing it. no worries
[11:11:53] <_joe_> I'm not blaming you
[11:11:55] <_joe_> :)
[11:12:33] assumption was that site.pp and dsh.yaml need to match
[11:12:35] <_joe_> so the problem is that scap uses that list to deploy to the canaries, and then to check logstash for increased errors on a % of them
[11:12:57] <_joe_> so if we add the ones in codfw, those see no traffic so they have no surge in log errors
[11:13:26] <_joe_> we need to do something along the lines of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/465411/
[11:13:33] <_joe_> I'll work on it more
[11:13:40] ah. and that could lead to thinking something is fine when it really isn't..
[11:13:47] <_joe_> yes
[11:13:47] gotcha. ok. thanks
[11:14:02] <_joe_> yeah it's a problem we need to solve before a switchover anyways
[11:14:14] <_joe_> I'll pick it up again when I come back after lunch
[11:14:17] ok, revert is merged on the master
[11:14:26] also lunch here
[11:14:28] <_joe_> thanks and sorry for dropping the ball on the review
[11:14:35] no problem
[11:14:43] talk to you later
[11:55:58] serviceops, Operations, Patch-For-Review: miscweb1001/2001 - upgrade to buster or decom - https://phabricator.wikimedia.org/T247648 (Dzahn)
[11:56:10] serviceops, Operations, Patch-For-Review: miscweb1001/2001 - upgrade to buster or decom - https://phabricator.wikimedia.org/T247648 (Dzahn)
[13:22:35] there's now a buster-based builder host; deneb.codfw.wmnet.
I tested a few package builds and all seems to be working fine, could someone test if the base Docker image build is working as expected?
[13:29:36] <_joe_> moritzm: will do!
[13:29:48] i know hashar wants this new image: https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/580128
[13:29:59] for contint on buster
[13:30:02] <_joe_> mutante: well he's not online, is he?
[13:30:28] haven't seen him yet today
[13:33:36] ack, thx
[13:41:41] serviceops, Operations, Patch-For-Review: miscweb1001/2001 - upgrade to buster or decom - https://phabricator.wikimedia.org/T247648 (Dzahn)
[13:45:35] <_joe_> moritzm: are we running the timers that run the docker reports there too?
[13:49:45] currently only the image pruning is toggleable via Hiera, but we can add the same for the docker-reporter timer (if you mean that one)
[13:49:57] a while ago on boron those had failed for some reason and icinga alerted about systemd state. systemctl start docker-reporter-base-images (and -k8s and -releng) fixed it
[13:50:09] that is that, right
[13:50:40] <_joe_> moritzm: it's ok if they run
[13:50:54] <_joe_> but yeah we should probably remove them from boron
[13:51:07] <_joe_> actually if they're running on deneb already that's a good sign
[13:51:49] I'll disable them on boron manually, boron will probably be shut down/removed in 1-2 weeks, if the docker build stuff works fine, I'll send a mail to ops list
[13:52:08] <_joe_> ok, I'll test that part later in the day
[13:52:24] <_joe_> lemme tackle the "scap canaries in codfw" issue once and for all
[13:52:33] ack
[13:52:57] i see all 3 timers in deneb. and nice about canaries
[14:34:39] serviceops, Operations, Patch-For-Review: upgrade planet.wikimedia.org backends to buster - https://phabricator.wikimedia.org/T247651 (Dzahn)
[14:38:36] _joe_: akosiaris: are we doing the meeting?
current status: 3 yes, 2 no, 2 awaiting but i don't think i can see who it is because we use a group
[14:39:12] mutante: as far as I know, yes
[14:39:22] mutante: if you click on the group name, it expands to show the list
[14:39:23] alright
[14:39:24] <_joe_> the three yes it's us three
[14:39:36] (I hadn't clicked the thing but I am a yes)
[14:40:00] rlazarus: ok, thanks, i think it's because i'm doing it on mobile
[14:40:15] ok, will be back by then
[14:40:20] oh wow you're right, it doesn't show on mobile
[14:40:27] that's pants-on-head
[14:55:46] serviceops, MediaWiki-General, Operations, Core Platform Team Workboards (Clinic Duty Team), and 2 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (AMooney) a:holger.knust
[15:09:39] serviceops, MediaWiki-Parser, Operations, Core Platform Team Workboards (Clinic Duty Team), Wikimedia-Incident: API action=parse should be poolcounter-limited if a re-parse is necessary - https://phabricator.wikimedia.org/T243803 (AMooney) a:nnikkhoui
[15:35:28] serviceops, Operations, Traffic, Puppet: Puppet systemd::mask is an anti pattern that has unwanted side effect - https://phabricator.wikimedia.org/T233839 (hashar) Stalled→Invalid
[18:44:36] serviceops, MediaWiki-General, Operations, Core Platform Team Workboards (Clinic Duty Team), and 3 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (holger.knust)
[19:00:32] serviceops, MediaWiki-General, Operations, Core Platform Team Workboards (Clinic Duty Team), and 3 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (holger.knust) Suggestion for #user-notice Mediawiki is up...
[20:17:46] serviceops, MediaWiki-Page-derived-data, Performance-Team: Watchlist missing revisions with pages near the size limit - https://phabricator.wikimedia.org/T248564 (Krinkle) >>! In T248564#6002428, @Anomie wrote: > > I managed to find [[https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstas...
[20:17:58] serviceops, MediaWiki-Page-derived-data, Performance-Team: Watchlist missing revisions with pages near the size limit - https://phabricator.wikimedia.org/T248564 (Krinkle) p:Triage→Medium
[20:25:31] serviceops, Analytics: Clarify multi-service instance concepts in helm charts and enable canary releases - https://phabricator.wikimedia.org/T242861 (Ottomata)
[23:12:53] serviceops: buster-nodejs10-devel seems to have an npm/node version mismatch - https://phabricator.wikimedia.org/T248928 (JoeWalsh)
[23:12:55] serviceops: buster-nodejs10-devel seems to have an npm/node version mismatch - https://phabricator.wikimedia.org/T248928 (JoeWalsh)
[23:12:57] serviceops: buster-nodejs10-devel has an npm/node version mismatch - https://phabricator.wikimedia.org/T248928 (JoeWalsh)
[23:13:21] serviceops: buster-nodejs10-devel has an npm/node version mismatch - https://phabricator.wikimedia.org/T248928 (Jdforrester-WMF) The CI node10 images use 10.15.2 and 6.5.0, for reference.
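
Note on the scap canary discussion at 11:12 in this log: a minimal sketch, in Python, of why canaries that receive no traffic could mask a bad deploy. Everything in it is an assumption made for illustration and not scap's actual logic: the `Canary` type, the `surged` rule, the 25% threshold, and the host names are all hypothetical. It only models the idea _joe_ describes, that the check looks for increased log errors on a percentage of the hosts in the canary list.

```python
from dataclasses import dataclass


@dataclass
class Canary:
    host: str            # hypothetical host name
    errors_before: int   # log errors in a window before the deploy
    errors_after: int    # log errors in the same-sized window after it


def surged(c: Canary, factor: float = 2.0) -> bool:
    """Assumed rule: errors after the deploy exceed `factor` times the baseline."""
    return c.errors_after > factor * max(c.errors_before, 1)


def canary_check_passes(canaries, max_surged_fraction=0.25):
    """Assumed rule: pass when fewer than `max_surged_fraction` of hosts surged."""
    surged_count = sum(surged(c) for c in canaries)
    return surged_count / len(canaries) < max_surged_fraction


# Ten canaries in the active datacenter; a bad deploy makes three of them surge.
active = [
    Canary(f"active-canary-{i}", errors_before=10, errors_after=100 if i < 3 else 10)
    for i in range(10)
]

# Ten canaries in the idle datacenter: no traffic, so no errors before or after.
idle = [Canary(f"idle-canary-{i}", errors_before=0, errors_after=0) for i in range(10)]

print(canary_check_passes(active))         # 3 of 10 surged: check fails
print(canary_check_passes(active + idle))  # 3 of 20 surged: check passes
```

Under these assumptions the same bad deploy fails the check with active-only canaries (3 of 10 hosts surged) but passes once the idle hosts are added to the list (3 of 20), which is the "thinking something is fine when it really isn't" outcome raised at 11:13:40.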