[05:56:37] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Joe) I think I know what happened here - and it's possibly in relation with T223180 . PHP7's APC memory was perfectly ok when I l...
[06:42:09] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (kostajh) > @kostajh @Catrope Could you rule out whether it's probably that the SpecialHomepage may have caused an increase in unca...
[06:45:21] serviceops, Operations, Traffic, HHVM, PHP 7.2 support: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Joe)
[06:45:45] serviceops, Operations, Traffic, HHVM, PHP 7.2 support: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Joe) p:Triage→High a:Joe
[06:47:52] serviceops, Operations, Traffic, HHVM, PHP 7.2 support: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Joe) While there is no evidence that the increase in traffic sent to php7 is the cause of this increase in errors, the...
[06:52:41] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Joe) @kostajh for now I'm switching off php7 for other investigations, so we will know immediately if the additional traffic is du...
[07:17:11] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Joe) So, after turning off php7 this morning we saw no modification in the rate of requests to mc1033. It seems extremely probabl...
[07:20:31] serviceops, Operations, User-jijiki: Investigate increase in GET ops registered by mcrouter for the mediawiki appserver cluster - https://phabricator.wikimedia.org/T223647 (Joe) Switching off php7 confirmed it's the cause of the increased number of GETs. I would like to first repackage and deploy th...
[07:26:03] serviceops, Operations, Services, Core Platform Team (RESTBase Split (CDP2)), and 3 others: Deploy the RESTBase front-end service to Kubernetes - https://phabricator.wikimedia.org/T223953 (mobrovac)
[07:26:53] serviceops, RESTBase, Core Platform Team (RESTBase Split (CDP2)), Core Platform Team Kanban (Doing), and 3 others: Split RESTBase in two services: storage service and API router/proxy - https://phabricator.wikimedia.org/T220449 (mobrovac)
[07:27:43] serviceops, RESTBase, Core Platform Team (RESTBase Split (CDP2)), Core Platform Team Kanban (Doing), and 3 others: Split RESTBase in two services: storage service and API router/proxy - https://phabricator.wikimedia.org/T220449 (mobrovac)
[07:27:50] serviceops, RESTBase, Core Platform Team (RESTBase Split (CDP2)), Core Platform Team Kanban (Doing), and 3 others: Split RESTBase in two services: storage service and API router/proxy - https://phabricator.wikimedia.org/T220449 (mobrovac)
[07:27:53] serviceops, Operations, Services, Core Platform Team (RESTBase Split (CDP2)), and 3 others: Deploy the RESTBase front-end service to Kubernetes - https://phabricator.wikimedia.org/T223953 (mobrovac)
[07:28:25] serviceops, Operations, Services, Core Platform Team (RESTBase Split (CDP2)), and 3 others: Deploy the RESTBase front-end service (RESTRouter) to Kubernetes - https://phabricator.wikimedia.org/T223953 (mobrovac)
[07:51:18] serviceops, MediaWiki-Cache, Operations, Core Platform Team (Security, stability, performance and scalability (TEC1)), and 5 others: Use a multi-dc aware store for ObjectCache's MainStash if needed. - https://phabricator.wikimedia.org/T212129 (daniel) >>! In T212129#5199079, @Krinkle wrote: > Exp...
[08:16:29] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (elukey) Would it be possible to deploy https://gerrit.wikimedia.org/r/511612 before the weekly train?
[09:00:29] serviceops, Operations, Traffic, HHVM, and 2 others: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Joe) While we have to wait and see if the absence of php7 traffic improves the situation (and in that case, why is that the c...
[09:01:46] serviceops, DBA, Operations, HHVM, PHP 7.2 support: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Joe) Having confirmed pybal is not an issue here, while databases might have a part in the issue, I'll change the tags acc...
[09:23:59] serviceops, Operations, Release Pipeline, Release-Engineering-Team, and 5 others: Introduce wikidata termbox SSR to kubernetes - https://phabricator.wikimedia.org/T220402 (Tarrow) @mobrovac and @akosiaris Sorry to drop in late to the conversation but I've been on holiday and then at the hackathon...
[09:25:51] serviceops, Operations, User-jijiki: Investigate increase in GET ops registered by mcrouter for the mediawiki appserver cluster - https://phabricator.wikimedia.org/T223647 (elukey) Other 20s of traffic from mw1238: ` tcpdump -r test_after.pcap -A | grep -v gets| egrep -o "get\ .*" | sort | cut -d ":...
[09:29:16] serviceops, Operations, User-jijiki: Investigate increase in GET ops registered by mcrouter for the mediawiki appserver cluster - https://phabricator.wikimedia.org/T223647 (elukey) 20s of traffic from mw1239: ` tcpdump -r test_after.pcap -A | grep -v gets| egrep -o "get\ .*" | sort | cut -d ":" -f 1...
[09:50:07] serviceops, DBA, Operations, Performance-Team, and 2 others: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Marostegui) From the DB land this is what I have seen regarding s1: - Small increase on query response time starting...
[09:57:22] serviceops, Operations, User-jijiki: Investigate increase in GET ops registered by mcrouter for the mediawiki appserver cluster - https://phabricator.wikimedia.org/T223647 (elukey) Grabbed a sample of localhost traffic to port 11213 on mw1238 from 11:45:39 to 11:48:05 (146s). tcpdump filters: * tc...
[14:56:55] serviceops, DBA, Operations, Performance-Team, and 2 others: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Marostegui)
[15:01:57] serviceops, DBA, Operations, Performance-Team, and 2 others: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (CDanis) We saw one of these events at 14:48 today and pybal reported fetch failures for -- and wanted to depool -- ba...
[15:02:52] serviceops, DBA, Operations, Performance-Team, and 2 others: Increased instability in MediaWiki backends (according to load balancers) - https://phabricator.wikimedia.org/T223952 (Marostegui) >>! In T223952#5201169, @CDanis wrote: > We saw one of these events at 14:48 today and pybal reported fet...
[16:48:11] serviceops, Operations, Release Pipeline, Release-Engineering-Team, and 4 others: Kask integration testing with Cassandra via the Deployment Pipeline - https://phabricator.wikimedia.org/T224041 (thcipriani)
[16:49:35] serviceops, Operations, Release Pipeline, Release-Engineering-Team, and 4 others: Kask integration testing with Cassandra via the Deployment Pipeline - https://phabricator.wikimedia.org/T224041 (thcipriani) p:Triage→Normal It seems that the cassandra subchart already exists for cask (via...
[17:08:20] serviceops, Operations, Wikimedia-Site-requests, Patch-For-Review, and 2 others: Increase Memory Limit for Scribunto - https://phabricator.wikimedia.org/T223737 (Volans)
[17:08:56] FYI T223391#5196854 Wikidiff2 looking for a deployment
[17:09:12] https://phabricator.wikimedia.org/T223391#5196854 (the bot didn't liked it :) )
[20:54:42] serviceops, MediaWiki-Cache, Operations, Core Platform Team (Security, stability, performance and scalability (TEC1)), and 5 others: Use a multi-dc aware store for ObjectCache's MainStash if needed. - https://phabricator.wikimedia.org/T212129 (aaron) They preferably would happen in the main DC vi...
[23:09:08] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Krinkle)
[23:09:15] serviceops, Growth-Team, MediaWiki-Cache, Operations, and 5 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Krinkle)
[23:09:27] serviceops, Growth-Team, Operations, Performance-Team, and 4 others: Investigate increase in tx bandwidth usage for mc1033 - https://phabricator.wikimedia.org/T223310 (Krinkle)