[10:08:38] o/
[10:08:56] tarrow: o/
[10:09:10] tarrow: we are trying to solve
[10:09:50] sorry about that
[10:09:54] lol
[10:09:56] lucky you didn't flood out
[10:09:59] akosiaris: rotfl you managed to ping everybody
[10:09:59] and there he goes
[10:10:09] nice L8 copy&paste issue
[10:10:32] Alex the spammer :P
[10:10:38] <_joe_> tarrow: can you test if termbox works in staging as soon as I finished deploying?
[10:10:39] welcome back flooder
[10:10:44] _joe_: sure!
[10:10:55] now we know all the secrit channels
[10:10:59] jakob_WMDE: o/
[10:11:02] <_joe_> tarrow: deployment done
[10:11:15] <_joe_> akosiaris: yeah we need to expunge the public logs I guess
[10:11:39] sorry, I screwed up
[10:11:40] wrong paste
[10:11:41] what did I even paste...
[10:11:43] who dared to summon me?
[10:11:48] <_joe_> akosiaris: various stuff
[10:11:55] <_joe_> including the nicks of everyone in the channel
[10:11:57] Elitre: accidental ping
[10:12:15] _joe_: just need to remind myself how to test it...
[10:12:17] <_joe_> so, tarrow, long story short, a change in mediawiki broke termbox in production
[10:12:32] <_joe_> I will also deploy to codfw as it shouldn't be called currently
[10:12:36] Elitre: yours truly, my mistake.
[10:12:53] I know, I wanted to make Alex even more uncomfortable. all's good! have a lovely day y'all!
[10:13:03] <_joe_> Elitre: ahahah :*
[10:13:12] Elitre: achievement unlocked :-)
[10:13:45] _joe_: sounds good
[10:14:08] (and congrats for yet another success with https://phabricator.wikimedia.org/T257649 )
[10:15:29] So this is the problem? https://phabricator.wikimedia.org/T257887#6303393
[10:16:43] RECOVERY - termbox codfw on termbox.svc.codfw.wmnet is OK: All endpoints are healthy
[10:16:45] nice!
[10:16:56] <_joe_> :)
[10:17:07] <_joe_> Amir1: yes
[10:17:13] <_joe_> but for termbox it was a previous change
[10:17:18] <_joe_> ok, going with eqiad
[10:17:21] akosiaris: you're forbidden to use ctrl+v for the next 12h :-P
[10:18:33] <_joe_> so wikifeeds is going to be thornier
[10:18:38] <_joe_> well not really
[10:18:50] <_joe_> but we can't just move forward with the original patch
[10:18:57] <_joe_> we need to fix the pybal conf too
[10:19:09] volans: it was middle-click actually :-(
[10:19:11] <_joe_> and I don't think the patch applied last night is the best choice
[10:19:31] _joe_: so "right now" codfw seems to be doing what I expect
[10:19:36] <_joe_> yes
[10:19:41] <_joe_> because I released already
[10:19:47] <_joe_> now I'm doing so in eqiad
[10:19:51] cool
[10:20:02] <_joe_> to be clear, nothing was wrong with termbox per-se
[10:22:14] want me to check anything else?
[10:22:42] not sure if people are considering the current status "a working workaround or a solved issue" but latency is still high
[10:23:01] see: https://grafana.wikimedia.org/d/35vIuGpZk/wikifeeds?panelId=20&fullscreen&orgId=1&from=1594635775018&to=1594722175019&var-dc=eqiad%20prometheus%2Fk8s&var-service=wikifeeds
[10:23:15] it went down to pre-phase4 latencies
[10:23:21] but not to pre-phase3 latencies
[10:23:55] I am going to replace the incident doc graph as it is misleading
[10:26:08] <_joe_> jynus: it will be soon fixed
[10:26:21] <_joe_> as soon as we roll out the new wikifeeds config
[10:26:23] ok, sorry
[10:26:32] I realize now
[10:26:36] <_joe_> no, you were correct in pointing it out
[10:26:42] <_joe_> we only fixed group2
[10:26:45] <_joe_> with that revert
[10:26:55] I was worried the graph would make it look like we are already green
[10:27:16] as it was not "wide enough" before
[10:28:05] jynus: thanks for updating. I had just thrown some graph in there so I don't forget about it
[10:29:19] you weren't the first one to make that mistake, I think we all looked at the 4 minute latencies
[10:29:47] before realizing the existing 30 second increase one that was previous
[10:30:18] <_joe_> ok, thanks for the help Amir1 & tarrow, termbox is unbroken :)
[10:35:12] I've purged IRC logs as well from the bot. My gaffe is now hopefully only in the histories of the participants.
[10:43:21] _joe_: no problemo! Mind if I drop you a PM?
[10:43:35] <_joe_> no ofc :)
[10:50:32] akosiaris: the terminal in OSX has a feature to avoid accidental copy-pastes, you should give it a go!
[10:59:43] ema: iterm ?
[10:59:59] which setting? I would like to check if I have it
[11:02:15] effie: Prefs > Advanced > When pasting more than this many characters, require confirmation
[11:02:19] https://gitlab.com/gnachman/iterm2/-/issues/7424
[11:02:30] ema: nice try, not taking the bait :P
[11:02:41] ema: btw, how do you even cope with the touch bar?
[11:03:26] :)
[11:03:48] thanks ema!
[11:04:49] effie: <3
[11:18:35] serviceops, Operations: Move testreduce away from scandium to a separate Buster Ganeti VM - https://phabricator.wikimedia.org/T257906 (MoritzMuehlenhoff)
[11:18:45] <_joe_> jayme: https://grafana.wikimedia.org/d/35vIuGpZk/wikifeeds?panelId=20&fullscreen&orgId=1&from=now-15m&to=now seems to have had effect
[11:18:58] <_joe_> we just unbroke the feeds for group1 wikis
[11:19:10] yeah \o/
[11:19:22] <_joe_> ok, good
[12:32:19] Hmm i was pinged
[12:38:05] many people were pinged earlier... sort of a woops
[12:54:18] * addshore just read scrollback looking for a ping (lol) :P
[13:12:29] addshore: IRCCloud.js: showRedBanner() { if(rand(0,100)==7){ return true;}
[14:13:00] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a freezed snapshot of the production one - https://phabricator.wikimedia.org/T257928 (jcrespo)
[14:13:07] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a freezed snapshot of the production one - https://phabricator.wikimedia.org/T257928 (jcrespo) p:Triage→Medium
[14:15:45] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a freezed snapshot of the production one - https://phabricator.wikimedia.org/T257928 (jcrespo) I was planning on doing this slowly with @akosiaris so at the same time he learned about streamlined db provisioning system, but I...
[14:17:06] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a frozen snapshot of the production one - https://phabricator.wikimedia.org/T257928 (Reedy)
[14:20:46] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a frozen snapshot of the production one - https://phabricator.wikimedia.org/T257928 (jcrespo)
[14:28:08] serviceops, DBA, OTRS, Operations: Create a parallel OTRS database with a frozen snapshot of the production one - https://phabricator.wikimedia.org/T257928 (jcrespo) https://en.wiktionary.org/wiki/freezed {icon hand-peace-o spin}
[14:36:05] addshore: we all did :)
[14:51:48] not me, I ignore all my highlights
[14:53:36] serviceops, Release-Engineering-Team-TODO, Scap, Release-Engineering-Team (Deployment services), User-jijiki: Allow scap sync to deploy gradually - https://phabricator.wikimedia.org/T212147 (LarsWirzenius)
[14:53:40] serviceops, Operations, Scap: Make canary wait time configurable - https://phabricator.wikimedia.org/T217924 (LarsWirzenius) Open→Resolved --canary-wait-time has been included in a release and announced to the public and used on multiple trains now. Closing task.
[15:00:51] <_joe_> rzl: ack!
[16:23:19] serviceops, Operations, Patch-For-Review: Move testreduce away from scandium to a separate Buster Ganeti VM - https://phabricator.wikimedia.org/T257906 (Dzahn)
[16:39:58] serviceops, observability, Developer Productivity: Logstash entries from php7-fatal-error.php use level "ERR" instead of "ERROR" - https://phabricator.wikimedia.org/T248181 (Krinkle) p:Triage→Medium a:Krinkle
[16:40:34] can i get someone to stop puppet on scandium for a bit? I need to stop rt testing service temporarily to verify something and i don't want puppet restarting it behind my back.
[16:44:04] mutante ?
[16:44:35] serviceops, Operations, Patch-For-Review: Move testreduce away from scandium to a separate Buster Ganeti VM - https://phabricator.wikimedia.org/T257906 (ssastry)
[16:48:50] rzl or effie or akosiaris maybe? :)
[16:50:25] subbu: done
[16:50:30] thanks!
[17:02:20] cdanis, done with my tests. you can restart puppet on scandium.
[17:02:47] done
[17:05:48] ty
[17:12:00] subbu: just missed that because i was in meeting. wanted to let you know i created a ticket to create a buster VM for testreduce though
[17:12:12] subbu: after i heard about your chat with Moritz etc
[17:12:18] yes, i am right now responding on that ticket. :)
[17:12:57] cool, we just need to figure out needed RAM/disk etc
[17:16:34] https://phabricator.wikimedia.org/T257940#6305339
[17:17:51] serviceops, Operations, Parsoid, Parsoid-Tests, Patch-For-Review: Move testreduce away from scandium to a separate Buster Ganeti VM - https://phabricator.wikimedia.org/T257906 (ssastry)
[17:19:20] thanks, ack
[17:20:31] subbu: we can start with a relatively small VM and add CPU/RAM when needed
[17:20:43] k