[00:08:09] serviceops, Operations, Performance-Team, Patch-For-Review, User-jijiki: MediaWiki to route specific keys to /*/mw-with-onhost-tier/ - https://phabricator.wikimedia.org/T264604 (aaron) >>! In T264604#6601576, @jijiki wrote: > @aaron is there a timeline as to when those patches will be merged?...
[02:20:55] serviceops, Operations, observability: alert on too many close-to-saturated appservers / apiservers - https://phabricator.wikimedia.org/T267176 (CDanis)
[07:01:35] serviceops, DC-Ops, Operations, ops-eqiad: eqiad: Physical moves for MediaWiki servers - https://phabricator.wikimedia.org/T266164 (elukey) @Cmjohnson icinga reports that mw1267's mgmt is down, can you check? It also reports that PS redundancy is not good :(
[09:06:12] serviceops, Prod-Kubernetes, Kubernetes: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (JMeybohm)
[09:23:48] serviceops, Prod-Kubernetes, Kubernetes: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (JMeybohm)
[10:14:39] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020): CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (Elitre) @Trizek-WMF FYI?
[10:47:43] serviceops, Prod-Kubernetes, Kubernetes: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (JMeybohm) >>! In T244335#6600895, @akosiaris wrote: >> We are not able to go 1.19 because of calico only supporting 1.18 > > Looks like this isn't t...
[10:47:48] serviceops, Prod-Kubernetes, Kubernetes: Decide if we want to stick with etcd datastore - https://phabricator.wikimedia.org/T266895 (JMeybohm) >>! In T244335#6600895, @akosiaris wrote: >> Decide decide if we are going to be staying with direct access to etcd (version 3?) or try and switch to the kube...
[10:58:40] serviceops, Operations, Performance-Team, Patch-For-Review, User-jijiki: MediaWiki to route specific keys to /*/mw-with-onhost-tier/ - https://phabricator.wikimedia.org/T264604 (jijiki) @aaron mwdebug1001 has the mcrouter configuration we want to roll out when we merge the mediawiki patches,...
[10:59:39] serviceops, Prod-Kubernetes, Kubernetes: Upgrade kubernetes clusters to a security supported (LTS) version - https://phabricator.wikimedia.org/T244335 (akosiaris) >>! In T244335#6602317, @JMeybohm wrote: >>>! In T244335#6600895, @akosiaris wrote: >>> We are not able to go 1.19 because of calico only...
[11:31:02] serviceops, Prod-Kubernetes, Kubernetes: Build calico 3.16 - https://phabricator.wikimedia.org/T266893 (JMeybohm) [WIP] will continue shortly What's in the box (release tarball): Docker images * calico/node ** Looks like a scratch container, but actually copies / of a `registry.access.redhat.com/ub...
[11:57:19] serviceops, Prod-Kubernetes, Kubernetes: Build calico 3.16 - https://phabricator.wikimedia.org/T266893 (Joe) calico/node is the only more-than-slightly-worrisome thing here. for everything else we're probably ok using their builds for the time being. It's also true that as far as external images go,...
[13:36:26] serviceops, MediaWiki-General, Operations, MW-1.34-notes (1.34.0-wmf.16; 2019-07-30), and 4 others: Some pages will become completely unreachable after PHP7 update due to Unicode changes - https://phabricator.wikimedia.org/T219279 (Aklapper) @holger.knust : Could you please answer the last commen...
[14:21:32] serviceops, Operations, Traffic, HTTPS: puppetmaster[12]001: add TLS termination - https://phabricator.wikimedia.org/T263831 (Nintendofan885)
[16:15:20] serviceops, Operations: Upgrade the MediaWiki servers to ICU 63 - https://phabricator.wikimedia.org/T264991 (jijiki)
[16:17:28] serviceops, Operations: Upgrade the MediaWiki servers to ICU 63 - https://phabricator.wikimedia.org/T264991 (jijiki) @thcipriani we will be upgrading to ICU 63 on the 16th Nov 2020. Since we will be restarting php-fpm across the cluster that day, can we put a note about this on the deployment calendar?
[16:41:58] serviceops, Desktop Improvements, Operations, Product-Infrastructure-Team-Backlog, and 3 others: Connection closed while downloading PDF of articles - https://phabricator.wikimedia.org/T266373 (LGoto) p:Triage→Low
[16:42:14] serviceops, Operations, Product-Infrastructure-Team-Backlog, Proton, and 2 others: PDF download generates invalid PDF files - https://phabricator.wikimedia.org/T266559 (LGoto) p:Triage→Low
[16:51:44] serviceops, Operations: Upgrade the MediaWiki servers to ICU 63 - https://phabricator.wikimedia.org/T264991 (jijiki)
[16:57:39] serviceops, Operations, TechCom, Performance Issue, and 3 others: Strategy for storing parser output for "old revision" (Popular diffs and permalinks) - https://phabricator.wikimedia.org/T244058 (Reedy)
[17:06:33] serviceops, Operations, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Reimage one memcached shard per DC to Buster - https://phabricator.wikimedia.org/T252391 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` mc1036.eqiad.wmnet...
[17:13:43] serviceops, Operations, observability, User-jijiki: alert on too many close-to-saturated appservers / apiservers - https://phabricator.wikimedia.org/T267176 (jijiki)
[17:15:39] serviceops, Operations, observability, User-jijiki: alert on too many close-to-saturated appservers / apiservers - https://phabricator.wikimedia.org/T267176 (jijiki) a:CDanis→jijiki
[17:15:44] serviceops, Operations, observability, User-jijiki: alert on too many close-to-saturated appservers / apiservers - https://phabricator.wikimedia.org/T267176 (jijiki) I agree that is a good idea!
[17:20:18] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020), User-notice: CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (Trizek-WMF) p:Triage→High a:Trizek-WMF I can handle it. :) > We expect to start the upgrade no earlier than Monday Nov 16,...
[17:31:35] serviceops, Operations, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Reimage one memcached shard per DC to Buster - https://phabricator.wikimedia.org/T252391 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mc1036.eqiad.wmnet'] ` and were **ALL** successful.
[17:38:24] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020), User-notice: CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (RLazarus) We just met this morning to sort out our timeline -- current plan is to do the do the upgrade on Nov 16. That means the distur...
[17:40:03] serviceops, Operations: Upgrade the MediaWiki servers to ICU 63 - https://phabricator.wikimedia.org/T264991 (RLazarus)
[17:54:10] serviceops: create mwdebug1003 - ganeti VM with buster and appserver role - https://phabricator.wikimedia.org/T267248 (Dzahn)
[17:56:43] serviceops, Release-Engineering-Team-TODO, Patch-For-Review, Release-Engineering-Team (Deployment services): Upgrade MediaWiki appservers to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn)
[17:59:12] serviceops: create mwdebug1003 - ganeti VM with buster and appserver role - https://phabricator.wikimedia.org/T267248 (Dzahn)
[18:14:00] serviceops, Operations, Product-Infrastructure-Team-Backlog, Proton, and 2 others: PDF download generates invalid PDF files - https://phabricator.wikimedia.org/T266559 (Emdosis) The [[ https://en.wikipedia.org/wiki/Physics | Physics]] page I downloaded yesterday (and today) was corrupted as well....
[18:34:32] serviceops: create mwdebug1003 - ganeti VM with buster and appserver role - https://phabricator.wikimedia.org/T267248 (Dzahn) a:Dzahn
[19:05:15] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020), User-notice: CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (Trizek-WMF) I'm going to add something to Tech News, and we could refine it next week. Thank you for finding the old task and the mess...
[19:08:59] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020), User-notice: CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (Trizek-WMF) * https://meta.wikimedia.org/wiki/Tech/News/2020/46 * https://meta.wikimedia.org/wiki/Tech/News/2020/47
[19:10:49] serviceops, Operations, CommRel-Specialists-Support (Oct-Dec-2020), User-notice: CommRel support for ICU 63 upgrade - https://phabricator.wikimedia.org/T267145 (Trizek-WMF)
[20:48:18] serviceops, Operations, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Reimage one memcached shard per DC to Buster - https://phabricator.wikimedia.org/T252391 (jijiki)
[20:55:52] serviceops, Operations, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Reimage one memcached shard per DC to Buster - https://phabricator.wikimedia.org/T252391 (jijiki) `mc1036` is happily running buster, after the initial tuning (tx to @elukey), things look ok. We will keep monito...
[21:03:59] serviceops, Operations, observability, User-jijiki: alert on too many close-to-saturated appservers / apiservers - https://phabricator.wikimedia.org/T267176 (CDanis) As discussed, here's a start on the query: https://w.wiki/k6F Both thresholds in there need some tuning, but it's a start. This sh...
[21:19:29] serviceops, Performance-Team, Platform Engineering Roadmap Decision Making, Code-Health-Objective, and 4 others: Determine multi-dc strategy for CentralAuth - https://phabricator.wikimedia.org/T267270 (BPirkle)
[21:21:36] serviceops, Performance-Team, Platform Engineering Roadmap Decision Making, Code-Health-Objective, and 4 others: Determine multi-dc strategy for CentralAuth - https://phabricator.wikimedia.org/T267270 (BPirkle)