[00:27:59] SRE, ops-eqiad, cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? - https://phabricator.wikimedia.org/T281045 (wiki_willy) Hi @Jclark-ctr - let me know how you want to proceed
[00:30:17] SRE, ops-eqiad, cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? - https://phabricator.wikimedia.org/T281045 (Jclark-ctr) @wiki_willy we need to order drive >>! In T281045#7109977, @wiki_willy wrote: > Hi @Jclark-ctr - let me know how you want to p...
[00:33:52] (CR) Ottomata: [C: +1] superset: remove caching (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693981 (https://phabricator.wikimedia.org/T273850) (owner: Razzi)
[01:29:30] (PS1) Razzi: Add dbstore1006 to analytics vlan [homer/public] - https://gerrit.wikimedia.org/r/694002 (https://phabricator.wikimedia.org/T283125)
[01:32:09] (CR) Razzi: "This should do the trick for dbstore1006 access by the analytics vlan."
[homer/public] - https://gerrit.wikimedia.org/r/694002 (https://phabricator.wikimedia.org/T283125) (owner: Razzi)
[02:03:55] SRE, ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (Papaul) a:Papaul→jcrespo @jcrespo first SSD is set to bootable
[02:07:56] (PS1) TrainBranchBot: Branch commit for wmf/1.37.0-wmf.7 [core] (wmf/1.37.0-wmf.7) - https://gerrit.wikimedia.org/r/694015
[02:07:58] (CR) TrainBranchBot: [C: +2] Branch commit for wmf/1.37.0-wmf.7 [core] (wmf/1.37.0-wmf.7) - https://gerrit.wikimedia.org/r/694015 (owner: TrainBranchBot)
[02:26:23] (Merged) jenkins-bot: Branch commit for wmf/1.37.0-wmf.7 [core] (wmf/1.37.0-wmf.7) - https://gerrit.wikimedia.org/r/694015 (owner: TrainBranchBot)
[04:05:23] SRE, Wikimedia-Mailing-lists: "X-BeenThere" header not included anymore in daily-article-l mailing list emails (superseded by "List-Id"?) - https://phabricator.wikimedia.org/T283520 (Krd) Makes sense, communicated to the user this way, thank you!
[04:06:05] SRE, Wikimedia-Mailing-lists: "X-BeenThere" header not included anymore in daily-article-l mailing list emails (superseded by "List-Id"?)
- https://phabricator.wikimedia.org/T283520 (Krd) Open→Resolved
[04:25:22] (PS1) Marostegui: dbstore1004: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/694024 (https://phabricator.wikimedia.org/T283125)
[04:25:59] (CR) Marostegui: [C: +2] dbstore1004: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/694024 (https://phabricator.wikimedia.org/T283125) (owner: Marostegui)
[04:29:53] (PS2) Marostegui: data.yaml: Replace kzeta's key [puppet] - https://gerrit.wikimedia.org/r/693636 (https://phabricator.wikimedia.org/T283383)
[04:33:37] (CR) Marostegui: [C: +2] data.yaml: Replace kzeta's key [puppet] - https://gerrit.wikimedia.org/r/693636 (https://phabricator.wikimedia.org/T283383) (owner: Marostegui)
[04:35:51] SRE, SRE-Access-Requests, Patch-For-Review: Replace SSH key for kzimmerman - https://phabricator.wikimedia.org/T283383 (Marostegui) Open→Resolved This has been done, should start getting spread as soon as puppet kicks in. Please re-open if you find issues
[04:36:16] SRE, SRE-Access-Requests: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (Marostegui)
[04:36:41] SRE, SRE-Access-Requests: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (Marostegui)
[04:39:34] SRE, SRE-Access-Requests: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (Marostegui) p:Triage→Medium a:Marostegui @egardner can you please coordinate your manager's approval on this task?
Thanks
[04:39:49] SRE, SRE-Access-Requests: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (Marostegui)
[04:58:05] (PS1) Marostegui: control-mariadb-10.4: Upgrade version to 10.4.19 [software] - https://gerrit.wikimedia.org/r/694029
[04:58:49] (CR) Marostegui: [C: +2] control-mariadb-10.4: Upgrade version to 10.4.19 [software] - https://gerrit.wikimedia.org/r/694029 (owner: Marostegui)
[04:59:18] (Merged) jenkins-bot: control-mariadb-10.4: Upgrade version to 10.4.19 [software] - https://gerrit.wikimedia.org/r/694029 (owner: Marostegui)
[05:02:28] (CR) Marostegui: "This grants needs to be deployed everywhere too" [puppet] - https://gerrit.wikimedia.org/r/692621 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[05:03:00] SRE, Kubernetes, discovery-system: Document what #discovery-system is - https://phabricator.wikimedia.org/T282948 (Marostegui) p:Triage→Medium
[05:06:20] (PS1) Marostegui: db1160: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/694039
[05:07:05] (CR) Marostegui: [C: +2] db1160: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/694039 (owner: Marostegui)
[05:07:45] (PS5) Ryan Kemper: cloudelastic: bump inactive shard alert threshold [puppet] - https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269)
[05:10:11] (CR) Ryan Kemper: [C: +2] cloudelastic: bump inactive shard alert threshold [puppet] - https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269) (owner: Ryan Kemper)
[05:45:21] (CR) KartikMistry: [C: +2] Update cxserver to 2021-05-15-034540-production [deployment-charts] - https://gerrit.wikimedia.org/r/693842 (https://phabricator.wikimedia.org/T276214) (owner: KartikMistry)
[05:47:39] (Merged) jenkins-bot: Update cxserver to 2021-05-15-034540-production [deployment-charts] -
https://gerrit.wikimedia.org/r/693842 (https://phabricator.wikimedia.org/T276214) (owner: KartikMistry)
[06:07:45] SRE, serviceops, Performance-Team (Radar): Increased POST latency for MW app servers (Oct 2019) - https://phabricator.wikimedia.org/T235755 (Marostegui) It's been almost 2 years, any point on keeping this task open?
[06:13:12] (CR) Ayounsi: [C: -1] "IPv6 missing." [homer/public] - https://gerrit.wikimedia.org/r/694002 (https://phabricator.wikimedia.org/T283125) (owner: Razzi)
[06:15:56] SRE, serviceops, Performance-Team (Radar): Increased POST latency for MW app servers (Oct 2019) - https://phabricator.wikimedia.org/T235755 (jijiki) Open→Resolved a:jijiki No!
[06:30:57] (CR) Elukey: [C: +1] "LGTM!" [puppet] - https://gerrit.wikimedia.org/r/686766 (https://phabricator.wikimedia.org/T282185) (owner: Razzi)
[06:42:30] (PS2) ArielGlenn: add new mirror for xml/sql dumps: wikimedia.bringyour.com [puppet] - https://gerrit.wikimedia.org/r/693369
[06:43:28] (CR) ArielGlenn: [C: +2] add new mirror for xml/sql dumps: wikimedia.bringyour.com [puppet] - https://gerrit.wikimedia.org/r/693369 (owner: ArielGlenn)
[06:49:20] (CR) Elukey: "One nit - please remember to clean up after running puppet (python packages, memcached packages, etc..)" [puppet] - https://gerrit.wikimedia.org/r/693981 (https://phabricator.wikimedia.org/T273850) (owner: Razzi)
[06:52:36] (CR) Legoktm: [C: +1] "Will deploy tomorrow."
[puppet] - https://gerrit.wikimedia.org/r/693595 (owner: Ladsgroup)
[06:53:23] (CR) Legoktm: [C: +1] lists: Stop routing mail to mailman2 [puppet] - https://gerrit.wikimedia.org/r/693599 (https://phabricator.wikimedia.org/T52864) (owner: Ladsgroup)
[06:55:34] (CR) Legoktm: [C: -1] lists: Stop mailman2 service (3 comments) [puppet] - https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) (owner: Ladsgroup)
[06:58:18] (CR) Filippo Giunchedi: [C: +1] package_builder: add logstash-plugins build hooks [puppet] - https://gerrit.wikimedia.org/r/693958 (owner: Cwhite)
[06:58:47] (CR) Filippo Giunchedi: [C: +1] update puppet defaults and docs to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693906 (https://phabricator.wikimedia.org/T283213) (owner: Herron)
[07:03:21] (CR) Filippo Giunchedi: [C: +1] Set up build on production builder host [software/logstash/plugins] - https://gerrit.wikimedia.org/r/688418 (owner: Cwhite)
[07:09:17] (PS1) Jcrespo: mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303)
[07:10:45] (CR) jerkins-bot: [V: -1] mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[07:13:45] (PS2) Jcrespo: mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303)
[07:14:44] SRE, wikimedia-irc-libera: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (Marostegui)
[07:14:47] SRE, netops, observability: Create RIPE Atlas measurements against our authoritative DNS servers; alert on them - https://phabricator.wikimedia.org/T283359 (ayounsi) Indeed the nerd-sniping gravitational pull is strong on
that one and deserves its own task. In the short term, understanding and contro...
[07:19:02] (CR) Jcrespo: "Please read carefully what this will do, and if it is what you want: perform a new backup of only the configured mailman2 fileset (with it" [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[07:20:48] (CR) Jcrespo: "This patch will be reverted when the backup completes, so it is a one time backup- it will be very difficult to manipulate after that (but" [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[07:22:38] SRE, ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (jcrespo) Thank you very much! This will help me speed up the reimages.
[07:25:14] (CR) Giuseppe Lavagetto: [C: +1] wmflib: add role/public_endpoint to wmflib::service [puppet] - https://gerrit.wikimedia.org/r/685734 (owner: Filippo Giunchedi)
[07:27:57] SRE, ops-eqiad, DC-Ops: (Need By: 2021-04-30) rack/setup/install backup100[4-7] - https://phabricator.wikimedia.org/T277327 (jcrespo)
[07:30:09] SRE, ops-eqiad, DC-Ops: (Need By: 2021-04-30) rack/setup/install backup100[4-7] - https://phabricator.wikimedia.org/T277327 (jcrespo) I have added a link to https://wikitech.wikimedia.org/wiki/Raid_setup#Dell_R740xd2 on the setup. One issue we found with the RAID is that only 1 disk device can be set...
[07:32:15] (CR) Elukey: [C: +1] "Really nice work, LGTM" [puppet] - https://gerrit.wikimedia.org/r/693918 (https://phabricator.wikimedia.org/T281984) (owner: Giuseppe Lavagetto)
[07:36:44] (CR) ZPapierski: [C: +1] "LGTM, but with limited understanding of k8s" [deployment-charts] - https://gerrit.wikimedia.org/r/671204 (https://phabricator.wikimedia.org/T264006) (owner: Mstyles)
[07:44:46] (CR) Filippo Giunchedi: [C: +2] wmflib: add role/public_endpoint to wmflib::service [puppet] - https://gerrit.wikimedia.org/r/685734 (owner: Filippo Giunchedi)
[07:44:53] (PS3) Filippo Giunchedi: wmflib: add role/public_endpoint to wmflib::service [puppet] - https://gerrit.wikimedia.org/r/685734
[07:45:33] (PS5) DCausse: helm_diff: supports new and renamed charts [deployment-charts] - https://gerrit.wikimedia.org/r/693359
[07:45:35] (PS20) DCausse: rdf-streaming-updater: switch to H/A session-cluster [deployment-charts] - https://gerrit.wikimedia.org/r/671204 (https://phabricator.wikimedia.org/T264006) (owner: Mstyles)
[07:45:37] (PS5) DCausse: Rename chart rdf-streaming-updater as flink-session-cluster [deployment-charts] - https://gerrit.wikimedia.org/r/693411
[07:45:39] (PS4) DCausse: rdf-streaming-updater: use the flink-session-cluster chart [deployment-charts] - https://gerrit.wikimedia.org/r/693416
[07:46:10] (Abandoned) DCausse: rdf-streaming-updater: use session mode [deployment-charts] - https://gerrit.wikimedia.org/r/681497 (https://phabricator.wikimedia.org/T280166) (owner: Mstyles)
[07:46:18] (Abandoned) DCausse: rdf-streaming-updater: enable HA capability [deployment-charts] - https://gerrit.wikimedia.org/r/679519 (https://phabricator.wikimedia.org/T273098) (owner: Mstyles)
[07:56:08] (CR) Ema: [C: +2] Revert "cache: downgrade Varnish on cp3052 to 5.1.3-1wm15" [puppet] - https://gerrit.wikimedia.org/r/693440 (https://phabricator.wikimedia.org/T264398) (owner: Ema)
[07:59:23] SRE,
ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jynus on cumin2001.codfw.wmnet for hosts: ` ['backup2005.codfw.wmnet', 'backup2006.codfw.wmnet...
[07:59:28] (CR) JMeybohm: [C: +1] profile::envoy: Add argument for building the envoy-future package [puppet] - https://gerrit.wikimedia.org/r/693870 (owner: Hnowlan)
[08:09:58] SRE, CFSSL-PKI, Patch-For-Review: Investigate Check for expired certificates debmonitor - https://phabricator.wikimedia.org/T283185 (jcrespo) I am seeing minor things with "something related to certs" & debmonitor. They could be offtopic here, but reporting in case they are relevant: * On reimage:...
[08:18:01] (CR) Volans: [C: +2] wmf-auto-reimage: update CA validation [puppet] - https://gerrit.wikimedia.org/r/693926 (owner: Volans)
[08:18:30] SRE, CFSSL-PKI, Patch-For-Review: Investigate Check for expired certificates debmonitor - https://phabricator.wikimedia.org/T283185 (Volans) >>! In T283185#7110292, @jcrespo wrote: > I am seeing minor things with "something related to certs" & debmonitor. They could be offtopic here, but reporting in...
[08:22:41] (CR) Vgutierrez: [C: +1] varnish: add .sh suffix to shell scripts [puppet] - https://gerrit.wikimedia.org/r/693377 (https://phabricator.wikimedia.org/T148494) (owner: Ema)
[08:29:01] SRE, ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['backup2005.codfw.wmnet', 'backup2006.codfw.wmnet', 'backup2007.codfw.wmnet'] ` and were **ALL**...
[08:35:24] SRE, ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (jcrespo)
[08:36:42] SRE, ops-codfw, DC-Ops, Patch-For-Review: (Need By: 2021-04-30) rack/setup/install backup200[4-7] - https://phabricator.wikimedia.org/T277323 (jcrespo) Open→Resolved Thank you very much, @papaul and @Volans your help managing these servers saved me hours of work getting those prepared! CC...
[08:49:54] SRE, Data-Persistence-Backup, Goal, Patch-For-Review: Puppetize media backups infrastructure - https://phabricator.wikimedia.org/T276442 (jcrespo) All codfw hardware is fully setup now.
[08:53:43] (CR) JMeybohm: [C: +2] docker_registry_ha: Ensure Vary header is send [puppet] - https://gerrit.wikimedia.org/r/693430 (https://phabricator.wikimedia.org/T256762) (owner: JMeybohm)
[08:54:52] SRE, Thumbor, Wikimedia-SVG-rendering, Upstream: Incorrect text positioning in SVG rasterization (scale/transform; font-size; kerning) - https://phabricator.wikimedia.org/T36947 (Aklapper) "Incorrect text spacing when transform is not 1:1" will be fixed in librsvg 2.51.2 (and librsvg 2.50.6, once...
[08:56:20] (PS1) Jbond: P:pki::get_cert: add calling module to fail message [puppet] - https://gerrit.wikimedia.org/r/694304
[08:57:26] (CR) Kormat: [C: +2] "It's go-time." [mediawiki-config] - https://gerrit.wikimedia.org/r/693413 (https://phabricator.wikimedia.org/T282761) (owner: Kormat)
[08:58:38] (Merged) jenkins-bot: db-eqiad.php: Set pc1010 as pc1 primary.
[mediawiki-config] - https://gerrit.wikimedia.org/r/693413 (https://phabricator.wikimedia.org/T282761) (owner: Kormat)
[09:04:45] (CR) Jbond: [C: +2] P:pki::get_cert: add calling module to fail message [puppet] - https://gerrit.wikimedia.org/r/694304 (owner: Jbond)
[09:05:40] SRE, Wikimedia-Mailing-lists: "X-BeenThere" header not included anymore in daily-article-l mailing list emails (superseded by "List-Id"?) - https://phabricator.wikimedia.org/T283520 (Aklapper) Resolved→Declined Changing task status as no code was changed
[09:05:46] SRE, netops, observability: Create RIPE Atlas measurements against our authoritative DNS servers; alert on them - https://phabricator.wikimedia.org/T283359 (cmooney) Indeed some fascinating results there alright thanks for sharing Chris. While I agree with Arzhel, these results do serve to remind me...
[09:06:55] (PS1) Cathal Mooney: Added Wikidough VMs to BGP Anycast codfw [homer/public] - https://gerrit.wikimedia.org/r/694305 (https://phabricator.wikimedia.org/T283503)
[09:11:37] (PS1) Jbond: Revert "P:pki::get_cert: add calling module to fail message" [puppet] - https://gerrit.wikimedia.org/r/694307
[09:11:42] (PS1) Kormat: Revert "pc1010: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/694308
[09:12:32] (CR) Kormat: [C: +2] Revert "pc1010: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/694308 (owner: Kormat)
[09:13:19] (PS4) Matthias Mullie: Change HTTP to HTTPS for concept URIs on Commons [mediawiki-config] - https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: Seddon)
[09:13:50] (PS1) Jbond: P:pupetdb::microsite: only request certs if pki is available [puppet] - https://gerrit.wikimedia.org/r/694328
[09:18:16] (PS1) JMeybohm: docker-registry: Add caching config for nginx [puppet] - https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209)
[09:18:41]
(CR) Jbond: [C: +2] P:pupetdb::microsite: only request certs if pki is available [puppet] - https://gerrit.wikimedia.org/r/694328 (owner: Jbond)
[09:18:43] (CR) jerkins-bot: [V: -1] docker-registry: Add caching config for nginx [puppet] - https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) (owner: JMeybohm)
[09:19:19] (PS1) Kormat: pc1010: Set mysql_role as 'master' [puppet] - https://gerrit.wikimedia.org/r/694331 (https://phabricator.wikimedia.org/T282761)
[09:19:50] (CR) Filippo Giunchedi: [C: +2] role: add pontoon::frontend role/profile [puppet] - https://gerrit.wikimedia.org/r/685739 (owner: Filippo Giunchedi)
[09:19:55] (PS3) Filippo Giunchedi: role: add pontoon::frontend role/profile [puppet] - https://gerrit.wikimedia.org/r/685739
[09:20:17] (CR) Kormat: [C: +2] pc1010: Set mysql_role as 'master' [puppet] - https://gerrit.wikimedia.org/r/694331 (https://phabricator.wikimedia.org/T282761) (owner: Kormat)
[09:21:03] (PS1) Jcrespo: mediabackup: Install minio on the storage hosts and open port 9000 [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442)
[09:21:27] (PS2) Jcrespo: mediabackup: Install minio on the storage hosts and open port 9000 [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442)
[09:21:42] (PS21) Elukey: Add istio base images build support [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192)
[09:21:44] (PS5) Elukey: Add knative serving and net-istio images [docker-images/production-images] - https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194)
[09:21:46] (PS3) Elukey: Add base kubeflow kfserving images and kube-rbac-proxy [docker-images/production-images] - https://gerrit.wikimedia.org/r/693644 (https://phabricator.wikimedia.org/T272919)
[09:21:48] (PS2) Elukey:
Add Jetstack's cert-manager base go images. [docker-images/production-images] - https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661)
[09:22:19] (CR) Elukey: "Just updated the upstream revision from 1.6.2 to .14 (latest released, contains a lot of bugfixes for the 1.6 serie)" [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: Elukey)
[09:24:56] (PS1) Jbond: P:puppetdb::microsite: only pupulate site content if enabled [puppet] - https://gerrit.wikimedia.org/r/694333
[09:25:50] (CR) Kormat: "> Patch Set 3:" [puppet] - https://gerrit.wikimedia.org/r/692621 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[09:27:15] (CR) Gehel: [C: -1] "See minor comments inline." (3 comments) [software/gerrit] (deploy/wmf/stable-3.2) - https://gerrit.wikimedia.org/r/693450 (owner: Hashar)
[09:27:43] (CR) Jbond: [C: +2] P:puppetdb::microsite: only pupulate site content if enabled [puppet] - https://gerrit.wikimedia.org/r/694333 (owner: Jbond)
[09:27:56] (Abandoned) Cathal Mooney: Added Wikidough VMs to BGP Anycast codfw [homer/public] - https://gerrit.wikimedia.org/r/693923 (owner: Cathal Mooney)
[09:29:00] (PS2) JMeybohm: docker-registry: Add caching config for nginx [puppet] - https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209)
[09:32:27] (PS1) Itamar Givon: Test Wikidata: Enable empty list to object serialization [mediawiki-config] - https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422)
[09:33:10] (CR) Ladsgroup: [C: +1] mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[09:35:51] (Abandoned) Matthias Mullie: [WikibaseMediaInfo] MediaSearch: remove old, unused set of heuristics [mediawiki-config] -
https://gerrit.wikimedia.org/r/658590 (https://phabricator.wikimedia.org/T271532) (owner: Matthias Mullie)
[09:37:15] (PS5) Matthias Mullie: Change HTTP to HTTPS for concept URIs on Commons [mediawiki-config] - https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: Seddon)
[09:37:27] (PS3) Jcrespo: mediabackup: Install minio on the storage hosts and open port 9000 [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442)
[09:38:01] (CR) Matthias Mullie: [C: +1] "Ready for deployment" [mediawiki-config] - https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: Seddon)
[09:38:03] (PS3) Jcrespo: mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303)
[09:39:57] SRE, Cloud-VPS, LDAP, cloud-services-team (Kanban): investigate slapd memory leak - https://phabricator.wikimedia.org/T130593 (Marostegui) @MoritzMuehlenhoff do you happen to know if this is still happening or we can close this?
[09:40:15] (CR) Jcrespo: [C: +2] mailman2: Generate a 5-year retention Archive backups of mailman [puppet] - https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[09:40:34] (CR) JMeybohm: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29664/console" [puppet] - https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) (owner: JMeybohm)
[09:41:45] SRE, DC-Ops: determine/process/document bios firmware tracking/updating policies - https://phabricator.wikimedia.org/T141128 (Marostegui)
[09:43:52] (PS1) Jcrespo: mailman2: Disable temporarily production mailman2 backups [puppet] - https://gerrit.wikimedia.org/r/694354 (https://phabricator.wikimedia.org/T282303)
[09:44:03] (PS2) Jcrespo: mailman2: Disable temporarily production mailman2 backups [puppet] - https://gerrit.wikimedia.org/r/694354 (https://phabricator.wikimedia.org/T282303)
[09:45:39] (CR) Ema: [C: +2] Revert "cache: downgrade Varnish on cp3054 to 6.0.0-1wm1" [puppet] - https://gerrit.wikimedia.org/r/693441 (https://phabricator.wikimedia.org/T264398) (owner: Ema)
[09:45:44] (CR) Jcrespo: [C: +2] mailman2: Disable temporarily production mailman2 backups [puppet] - https://gerrit.wikimedia.org/r/694354 (https://phabricator.wikimedia.org/T282303) (owner: Jcrespo)
[09:48:16] (CR) Tonina Zhelyazkova: [C: +1] Test Wikidata: Enable empty list to object serialization [mediawiki-config] - https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) (owner: Itamar Givon)
[09:48:21] (PS1) Jcrespo: Revert "mailman2: Generate a 5-year retention Archive backups of mailman" [puppet] - https://gerrit.wikimedia.org/r/694309
[09:48:30] (PS1) Jbond: hiera - cloud: disable the puppetdb microservice by default [puppet] - https://gerrit.wikimedia.org/r/694355
[09:48:35] (PS2) Jcrespo: Revert "mailman2:
Generate a 5-year retention Archive backups of mailman" [puppet] - https://gerrit.wikimedia.org/r/694309
[09:49:02] (PS3) Jcrespo: Revert "mailman2: Generate a 5-year retention Archive backups of mailman" [puppet] - https://gerrit.wikimedia.org/r/694309
[09:49:46] (PS4) Jcrespo: Revert "mailman2: Generate a 5-year retention Archive backups of mailman" [puppet] - https://gerrit.wikimedia.org/r/694309
[09:50:09] (CR) Jcrespo: [C: -2] "Blocked on backup to finish." [puppet] - https://gerrit.wikimedia.org/r/694309 (owner: Jcrespo)
[09:50:11] (CR) Jbond: [C: +2] hiera - cloud: disable the puppetdb microservice by default [puppet] - https://gerrit.wikimedia.org/r/694355 (owner: Jbond)
[09:53:34] SRE, Data-Persistence-Backup, Wikimedia-Mailing-lists, Patch-For-Review: The Great Clean Up of Mailman2 - https://phabricator.wikimedia.org/T282303 (jcrespo) Ready when you are: ` Run Backup job JobName: lists1001.wikimedia.org-Weekly-Mon-Archive-var-lib-mailman Level: Full Client: lists100...
[09:55:37] SRE, Data-Persistence-Backup, Wikimedia-Mailing-lists, Patch-For-Review: The Great Clean Up of Mailman2 - https://phabricator.wikimedia.org/T282303 (Ladsgroup) LGTM
[09:57:45] SRE, Data-Persistence-Backup, Wikimedia-Mailing-lists, Patch-For-Review: The Great Clean Up of Mailman2 - https://phabricator.wikimedia.org/T282303 (jcrespo) This is now scheduled, I will monitor and give a heads up when it finishes.
[10:03:31] (CR) Giuseppe Lavagetto: [C: +1] "Looks ok, but please do run some tests pulling and pushing images to the restricted namespace 😊" [puppet] - https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) (owner: JMeybohm)
[10:04:15] (CR) Giuseppe Lavagetto: (WIP) mwdebug: add helfile configuration (1 comment) [deployment-charts] - https://gerrit.wikimedia.org/r/693875 (owner: Effie Mouzeli)
[10:04:55] (CR) Ema: [C: +2] varnish: add .sh suffix to shell scripts [puppet] - https://gerrit.wikimedia.org/r/693377 (https://phabricator.wikimedia.org/T148494) (owner: Ema)
[10:05:52] (PS4) Effie Mouzeli: (WIP) mwdebug: add helmfile configuration [deployment-charts] - https://gerrit.wikimedia.org/r/693875
[10:08:11] (PS4) Jcrespo: mediabackup: Install minio on the storage hosts and open port 9000 [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442)
[10:13:19] (CR) Klausman: Add Jetstack's cert-manager base go images.
(1 comment) [docker-images/production-images] - https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661) (owner: Elukey)
[10:13:41] (CR) Jcrespo: "A few questions in the form of a patch:" [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442) (owner: Jcrespo)
[10:13:58] (CR) Jcrespo: "Puppet compiler: https://puppet-compiler.wmflabs.org/compiler1003/29666/backup2004.codfw.wmnet/index.html" [puppet] - https://gerrit.wikimedia.org/r/694332 (https://phabricator.wikimedia.org/T276442) (owner: Jcrespo)
[10:21:19] (CR) Klausman: Add knative serving and net-istio images (2 comments) [docker-images/production-images] - https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) (owner: Elukey)
[10:24:50] (CR) ZPapierski: [C: +1] "LGTM, but my knowledge of k8s is limited" [deployment-charts] - https://gerrit.wikimedia.org/r/671204 (https://phabricator.wikimedia.org/T264006) (owner: Mstyles)
[10:24:55] (PS1) Volans: wmf-auto-reimage: check the debian installer env [puppet] - https://gerrit.wikimedia.org/r/694366
[10:25:11] (CR) ZPapierski: [C: +1] Rename chart rdf-streaming-updater as flink-session-cluster [deployment-charts] - https://gerrit.wikimedia.org/r/693411 (owner: DCausse)
[10:27:15] (CR) Klausman: Add istio base images build support (1 comment) [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: Elukey)
[10:28:57] (CR) Volans: "I plan to test it on sretest once deployed."
[puppet] - https://gerrit.wikimedia.org/r/694366 (owner: Volans)
[10:32:30] (PS2) Giuseppe Lavagetto: mediawiki: add egress policies to databases [deployment-charts] - https://gerrit.wikimedia.org/r/693871
[10:32:32] (PS1) Giuseppe Lavagetto: rake: always run repo update before deployment diff [deployment-charts] - https://gerrit.wikimedia.org/r/694373
[10:37:41] (CR) Giuseppe Lavagetto: [C: +2] rake: always run repo update before deployment diff [deployment-charts] - https://gerrit.wikimedia.org/r/694373 (owner: Giuseppe Lavagetto)
[10:39:14] (PS1) Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - https://gerrit.wikimedia.org/r/694379
[10:40:17] (CR) Jbond: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29667/console" [puppet] - https://gerrit.wikimedia.org/r/694379 (owner: Jbond)
[10:40:29] (Merged) jenkins-bot: rake: always run repo update before deployment diff [deployment-charts] - https://gerrit.wikimedia.org/r/694373 (owner: Giuseppe Lavagetto)
[10:41:01] (PS2) Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - https://gerrit.wikimedia.org/r/694379
[10:41:24] (PS6) Giuseppe Lavagetto: helm_diff: supports new and renamed charts [deployment-charts] - https://gerrit.wikimedia.org/r/693359 (owner: DCausse)
[10:41:33] (CR) Giuseppe Lavagetto: [C: +2] helm_diff: supports new and renamed charts [deployment-charts] - https://gerrit.wikimedia.org/r/693359 (owner: DCausse)
[10:43:23] (CR) Hashar: Download upstream war with Maven (3 comments) [software/gerrit] (deploy/wmf/stable-3.2) - https://gerrit.wikimedia.org/r/693450 (owner: Hashar)
[10:43:34] (PS2) Hashar: Download upstream war with Maven [software/gerrit] (deploy/wmf/stable-3.2) - https://gerrit.wikimedia.org/r/693450
[10:43:43] (Merged) jenkins-bot: helm_diff: supports new and renamed charts [deployment-charts] -
https://gerrit.wikimedia.org/r/693359 (owner: DCausse)
[10:45:00] (CR) Ayounsi: [C: +1] "LGTM!" [homer/public] - https://gerrit.wikimedia.org/r/694305 (https://phabricator.wikimedia.org/T283503) (owner: Cathal Mooney)
[10:46:34] (PS1) Jbond: P:pki::multirootca: drop nrpe check for expired certs [puppet] - https://gerrit.wikimedia.org/r/694388
[10:47:10] (PS3) Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - https://gerrit.wikimedia.org/r/694379
[10:47:16] (CR) Jbond: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29669/console" [puppet] - https://gerrit.wikimedia.org/r/694388 (owner: Jbond)
[10:48:10] (PS4) Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - https://gerrit.wikimedia.org/r/694379
[10:48:28] (CR) Jbond: [V: +1 C: +2] P:pki::multirootca: drop nrpe check for expired certs [puppet] - https://gerrit.wikimedia.org/r/694388 (owner: Jbond)
[10:55:09] (CR) Jcrespo: "Thanks for working on this! <3" [puppet] - https://gerrit.wikimedia.org/r/694366 (owner: Volans)
[10:57:51] (CR) Jbond: [V: +1] "PCC SUCCESS (DIFF 40): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29671/console" [puppet] - https://gerrit.wikimedia.org/r/694379 (owner: Jbond)
[11:00:04] Amir1, Lucas_WMDE, awight, and Urbanecm: (Dis)respected human, time to deploy European mid-day backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T1100). Please do the needful.
[11:00:04] matthiasmullie: A patch you scheduled for European mid-day backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[11:03:46] (03PS3) 10Effie Mouzeli: WIP: add notls support for external addresses to memcached [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) [11:07:56] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+1] Change HTTP to HTTPS for concept URIs on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: 10Seddon) [11:09:28] (03PS4) 10Effie Mouzeli: WIP: add notls support for external addresses to memcached [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) [11:11:46] 10SRE, 10observability, 10Patch-For-Review: Migrate mwlog/udp2log servers to Buster - https://phabricator.wikimedia.org/T224565 (10Lucas_Werkmeister_WMDE) Since deployers are expected to SSH onto this host, can someone add its SSH fingerprint to Wikitech? [11:15:27] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+2] Change HTTP to HTTPS for concept URIs on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: 10Seddon) [11:16:47] 10SRE, 10observability, 10Patch-For-Review: Migrate mwlog/udp2log servers to Buster - https://phabricator.wikimedia.org/T224565 (10Lucas_Werkmeister_WMDE) Here’s the fingerprint if anyone else needs it: ` $ curl -sL https://config-master.wikimedia.org/known_hosts.ecdsa | grep mwlog1002 | ssh-keygen -lf- 256... [11:17:04] (03Merged) 10jenkins-bot: Change HTTP to HTTPS for concept URIs on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: 10Seddon) [11:19:31] (03CR) 10David Caro: [C: 03+1] "Just some nits, feel free to ignore." 
(036 comments) [debs/nfsd-ldap] - 10https://gerrit.wikimedia.org/r/693500 (https://phabricator.wikimedia.org/T283385) (owner: 10Bstorm) [11:28:55] (03CR) 10Elukey: Add istio base images build support (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey) [11:34:57] (03CR) 10Klausman: Add base kubeflow kfserving images and kube-rbac-proxy (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693644 (https://phabricator.wikimedia.org/T272919) (owner: 10Elukey) [11:35:02] (03PS1) 10Jgiannelos: Bump chromium-render to version 2021-05-24-102519-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/694413 [11:35:48] (03CR) 10Klausman: Add knative serving and net-istio images (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) (owner: 10Elukey) [11:36:04] (03CR) 10Klausman: Add Jetstack's cert-manager base go images. (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661) (owner: 10Elukey) [11:39:10] (03CR) 10Jgiannelos: "recheck" [deployment-charts] - 10https://gerrit.wikimedia.org/r/693917 (https://phabricator.wikimedia.org/T283159) (owner: 10Effie Mouzeli) [11:39:28] (03PS1) 10Marostegui: db1124: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/694414 [11:40:15] (03CR) 10Marostegui: [C: 03+2] db1124: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/694414 (owner: 10Marostegui) [11:41:27] 10SRE, 10Data-Persistence-Backup, 10Wikimedia-Mailing-lists: The Great Clean Up of Mailman2 - https://phabricator.wikimedia.org/T282303 (10jcrespo) >20 Gigabytes backed up so far (1/6th), it is normal a full backup takes a lot of time there due to many small files. ` 20.83 G lists1001.wikimedia.org-Weekly-Mo... 
[11:41:51] (03PS1) 10Marostegui: instances.yaml: Remove db1124 from dbctl [puppet] - 10https://gerrit.wikimedia.org/r/694415 [11:43:33] (03PS6) 10Elukey: Add knative serving and net-istio images [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) [11:43:35] (03PS4) 10Elukey: Add base kubeflow kfserving images and kube-rbac-proxy [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693644 (https://phabricator.wikimedia.org/T272919) [11:43:37] (03PS3) 10Elukey: Add Jetstack's cert-manager base go images. [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661) [11:44:22] (03CR) 10Elukey: Add knative serving and net-istio images (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) (owner: 10Elukey) [11:44:24] (03CR) 10Jgiannelos: [C: 03+1] Rename maps-vector-server to tegola-vector-tiles [deployment-charts] - 10https://gerrit.wikimedia.org/r/693917 (https://phabricator.wikimedia.org/T283159) (owner: 10Effie Mouzeli) [11:44:33] 10Puppet, 10Patch-For-Review: puppetdb seems to be slow on host reimage - https://phabricator.wikimedia.org/T263578 (10jbond) I have created a draft [[ https://grafana-rw.wikimedia.org/d/C0lCOf3Mz/puppetdb-postgres?orgId=1&from=now-6h&to=now | puppetdb postgress dash board ]] to help investigate further [11:50:17] (03CR) 10Marostegui: [C: 03+2] instances.yaml: Remove db1124 from dbctl [puppet] - 10https://gerrit.wikimedia.org/r/694415 (owner: 10Marostegui) [11:52:27] (03PS1) 10Marostegui: db1124: Move db1124 to the testing section [puppet] - 10https://gerrit.wikimedia.org/r/694422 [11:53:59] (03CR) 10Marostegui: [C: 03+2] db1124: Move db1124 to the testing section [puppet] - 10https://gerrit.wikimedia.org/r/694422 (owner: 10Marostegui) [11:57:43] 10Puppet, 10Patch-For-Review: puppetdb seems to be slow on host reimage - 
https://phabricator.wikimedia.org/T263578 (10jbond) I have noticed we get very regular spikes of [[ https://grafana-rw.wikimedia.org/d/000000477/puppetdb?viewPanel=7&orgId=1 | command processing ]] which also corresponds to high [[ https... [11:59:12] (03PS1) 10Marostegui: mariadb: Make db1124 master of test-s4 [puppet] - 10https://gerrit.wikimedia.org/r/694424 [11:59:55] (03CR) 10Marostegui: [C: 03+2] mariadb: Make db1124 master of test-s4 [puppet] - 10https://gerrit.wikimedia.org/r/694424 (owner: 10Marostegui) [12:00:57] (03CR) 10Southparkfan: "> Patch Set 2:" [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) (owner: 10Southparkfan) [12:01:05] (03CR) 10Kormat: prometheus-mysql-exporter: Abort early and capture db integrity issue (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693414 (owner: 10Jcrespo) [12:08:50] (03CR) 10Elukey: [C: 03+1] "Left a non-blocking nit." (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694366 (owner: 10Volans) [12:12:04] 10SRE, 10Continuous-Integration-Infrastructure: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL ) - https://phabricator.wikimedia.org/T283582 (10hashar) [12:17:18] (03PS1) 10Ema: cache: print exp_policy_base and exp_policy_rate [puppet] - 10https://gerrit.wikimedia.org/r/694438 (https://phabricator.wikimedia.org/T275809) [12:18:36] (03CR) 10Jcrespo: ":-)" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693414 (owner: 10Jcrespo) [12:21:00] 10Puppet, 10Patch-For-Review: puppetdb seems to be slow on host reimage - https://phabricator.wikimedia.org/T263578 (10Volans) The spikes don't seem to follow the 30m usual puppet frequency but more a 20 minutes one: `00`, `20`, `40`... For reference the current puppet runs for the alert hosts are at `18,48` f... 
[12:21:38] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/694366 (owner: 10Volans) [12:27:48] (03PS1) 10Ema: cache: add command line parameters to exp_policy.py [puppet] - 10https://gerrit.wikimedia.org/r/694440 (https://phabricator.wikimedia.org/T275809) [12:28:16] (03CR) 10Kormat: [C: 03+1] prometheus-mysql-exporter: Abort early and capture db integrity issue (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693414 (owner: 10Jcrespo) [12:28:22] (03CR) 10jerkins-bot: [V: 04-1] cache: add command line parameters to exp_policy.py [puppet] - 10https://gerrit.wikimedia.org/r/694440 (https://phabricator.wikimedia.org/T275809) (owner: 10Ema) [12:34:54] (03PS2) 10Ema: cache: add command line parameters to exp_policy.py [puppet] - 10https://gerrit.wikimedia.org/r/694440 (https://phabricator.wikimedia.org/T275809) [12:36:50] (03CR) 10Ema: [C: 03+2] cache: print exp_policy_base and exp_policy_rate [puppet] - 10https://gerrit.wikimedia.org/r/694438 (https://phabricator.wikimedia.org/T275809) (owner: 10Ema) [12:36:59] (03CR) 10Ema: [C: 03+2] cache: add command line parameters to exp_policy.py [puppet] - 10https://gerrit.wikimedia.org/r/694440 (https://phabricator.wikimedia.org/T275809) (owner: 10Ema) [12:39:27] 10Puppet, 10SRE, 10observability, 10User-jbond: Add additional promethous metricts to puppet runs - https://phabricator.wikimedia.org/T283585 (10jbond) p:05Triage→03Medium [12:40:13] (03PS2) 10Volans: wmf-auto-reimage: check the debian installer env [puppet] - 10https://gerrit.wikimedia.org/r/694366 [12:40:23] (03CR) 10Volans: "Addressed comment" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694366 (owner: 10Volans) [12:40:30] (03CR) 10Arturo Borrero Gonzalez: [C: 03+1] "We need to review the versioning string for future releases. In the debian world the `debuXuY` syntax is only used for security updates." 
[debs/nfsd-ldap] - 10https://gerrit.wikimedia.org/r/693500 (https://phabricator.wikimedia.org/T283385) (owner: 10Bstorm) [12:47:20] (03CR) 10Volans: "This is a great effort, thanks for starting it." [puppet] - 10https://gerrit.wikimedia.org/r/692869 (https://phabricator.wikimedia.org/T282787) (owner: 10MMandere) [12:48:51] (03CR) 10Urbanecm: [C: 03+2] Revert "Use svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693662 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:50:05] (03CR) 10Urbanecm: [C: 03+2] Revert "Add svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693663 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:50:08] 10Puppet, 10SRE, 10observability, 10User-jbond: Add additional prometheus metrics to puppet runs - https://phabricator.wikimedia.org/T283585 (10Volans) [12:50:08] (03Merged) 10jenkins-bot: Revert "Use svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693662 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:50:16] (03CR) 10jerkins-bot: [V: 04-1] Revert "Add svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693663 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:51:20] (03PS4) 10Urbanecm: Revert "Add svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693663 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:51:25] (03CR) 10Urbanecm: [C: 03+2] Revert "Add svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693663 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:53:36] (03Merged) 10jenkins-bot: Revert "Add svwiki 20th anniversary logos" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693663 (https://phabricator.wikimedia.org/T282389) (owner: 10Zabe) [12:53:44] (03PS1) 10Marostegui: install_server: Do not reimage db1176, db1178. 
[puppet] - 10https://gerrit.wikimedia.org/r/694457 [12:54:47] (03CR) 10Marostegui: [C: 03+2] install_server: Do not reimage db1176, db1178. [puppet] - 10https://gerrit.wikimedia.org/r/694457 (owner: 10Marostegui) [12:58:26] (03PS1) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [13:00:00] (03CR) 10jerkins-bot: [V: 04-1] (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [13:03:41] (03CR) 10Gehel: "Sorry, I missed a few minor things." (034 comments) [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [13:05:33] (03CR) 10Thcipriani: [C: 03+1] "I like the concept: this will greatly simplify!" [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [13:12:24] (03PS1) 10Marostegui: data.yaml: Add Eric Gardner to gerrit-root [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [13:12:48] (03CR) 10Marostegui: [C: 04-2] "DO NOT submit. Manager approval pending. Key verification pending." 
[puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [13:12:54] (03CR) 10jerkins-bot: [V: 04-1] data.yaml: Add Eric Gardner to gerrit-root [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [13:13:42] (03PS2) 10Itamar Givon: Test Wikidata: Enable empty list to object serialization [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) [13:17:50] (03PS2) 10Marostegui: data.yaml: Add Eric Gardner to gerrit-root [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [13:20:36] (03PS1) 10David Caro: prometheus: Use high retention time when using size [puppet] - 10https://gerrit.wikimedia.org/r/694472 [13:22:04] (03PS2) 10Nikki Nikkhoui: Initial image-suggestion-api helm chart [deployment-charts] - 10https://gerrit.wikimedia.org/r/688358 (https://phabricator.wikimedia.org/T281257) [13:22:46] (03CR) 10David Caro: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/694472 (owner: 10David Caro) [13:27:17] (03CR) 10Hashar: Download upstream war with Maven (033 comments) [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [13:27:24] (03PS3) 10Hashar: Download upstream war with Maven [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 [13:30:44] (03PS1) 10Hashar: Add .editorconfig for pom.xml [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/694474 [13:31:42] (03PS5) 10Effie Mouzeli: WIP: add notls support for external addresses to memcached [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) [13:32:11] (03CR) 10Gehel: [C: 03+1] "LGTM" [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [13:32:19] (03PS2) 10Effie Mouzeli: (WIP) 
profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [13:33:28] (03CR) 10Hashar: "Will let Tyler / Ahmon +2 it." [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [13:35:04] (03CR) 10Klausman: Add knative serving and net-istio images (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) (owner: 10Elukey) [13:35:11] (03CR) 10Klausman: [C: 03+1] Add knative serving and net-istio images [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) (owner: 10Elukey) [13:35:45] (03CR) 10Klausman: [C: 03+1] Add base kubeflow kfserving images and kube-rbac-proxy [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693644 (https://phabricator.wikimedia.org/T272919) (owner: 10Elukey) [13:35:57] (03CR) 10Klausman: [C: 03+1] Add Jetstack's cert-manager base go images. 
[docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661) (owner: 10Elukey) [13:36:13] (03CR) 10Effie Mouzeli: "PCC is noop https://puppet-compiler.wmflabs.org/compiler1003/29677/" [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [13:37:07] (03CR) 10Klausman: [C: 03+1] Add istio base images build support (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey) [13:37:16] (03CR) 10Effie Mouzeli: "PCC is as expected and happy: https://puppet-compiler.wmflabs.org/compiler1001/29678/" [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [13:46:25] (03PS3) 10Mforns: reportupdater: Rsync logs to HDFS [puppet] - 10https://gerrit.wikimedia.org/r/692909 (https://phabricator.wikimedia.org/T274880) [13:48:44] (03PS1) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [13:49:03] (03PS4) 10Mforns: reportupdater: Rsync logs to HDFS [puppet] - 10https://gerrit.wikimedia.org/r/692909 (https://phabricator.wikimedia.org/T274880) [13:50:35] (03PS3) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [13:51:12] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10thcipriani) >>! In T283541#7111124, @gerritbot wrote: > Change 694466 had a related patch set uploaded (by Marostegui; author: Marostegui): > %%%[operat... 
[13:51:19] (03PS4) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [13:51:26] (03CR) 10Thcipriani: [C: 04-1] data.yaml: Add Eric Gardner to gerrit-root [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [13:51:38] (03PS2) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [13:51:54] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10thcipriani) [13:53:05] (03PS3) 10Marostegui: data.yaml: Add Eric Gardner to gerrit-root [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [13:53:18] (03CR) 10jerkins-bot: [V: 04-1] (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [13:53:34] (03CR) 10Marostegui: [C: 04-2] "Fixing the group per your comment at https://phabricator.wikimedia.org/T283541#7111219" [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [13:53:40] (03PS4) 10Marostegui: data.yaml: Add Eric Gardner to deployment group [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [13:58:55] (03CR) 10David Caro: "The changes are the expected ones, and the failures are also expected" [puppet] - 10https://gerrit.wikimedia.org/r/694472 (owner: 10David Caro) [14:07:57] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "LGTM, but don't amend the stdlib module." 
(031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693906 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron) [14:08:49] (03CR) 10Jcrespo: [C: 03+2] "I will merge as is, and depending how frequently this fails again and how easy is to debug, we can add additional patches on top of it." [puppet] - 10https://gerrit.wikimedia.org/r/693414 (owner: 10Jcrespo) [14:12:23] (03PS1) 10Jbond: (WIP) Implement json logging [puppet] - 10https://gerrit.wikimedia.org/r/694493 [14:23:12] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+1] Test Wikidata: Enable empty list to object serialization (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) (owner: 10Itamar Givon) [14:32:34] (03CR) 10Giuseppe Lavagetto: [C: 04-1] (WIP) mwdebug: add helmfile configuration (037 comments) [deployment-charts] - 10https://gerrit.wikimedia.org/r/693875 (owner: 10Effie Mouzeli) [14:33:23] (03PS1) 10Jbond: O:cfssl::cert: also regenrate chained file when generating re-signing [puppet] - 10https://gerrit.wikimedia.org/r/694500 [14:34:00] (03PS2) 10Jbond: O:cfssl::cert: also regenrate chained file when generating re-signing [puppet] - 10https://gerrit.wikimedia.org/r/694500 [14:39:38] 10SRE, 10serviceops, 10MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), 10Patch-For-Review, 10Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (10Seddon) [14:41:16] (03PS3) 10Itamar Givon: Test Wikidata: Enable empty list to object serialization [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) [14:42:31] (03CR) 10Itamar Givon: Test Wikidata: Enable empty list to object serialization (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) (owner: 10Itamar Givon) [14:42:46] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+1] Test Wikidata: Enable empty list to object serialization [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/694339 (https://phabricator.wikimedia.org/T241422) (owner: 10Itamar Givon) [14:43:40] (03CR) 10Filippo Giunchedi: [C: 03+1] prometheus: Use high retention time when using size [puppet] - 10https://gerrit.wikimedia.org/r/694472 (owner: 10David Caro) [14:44:19] (03PS2) 10Herron: update puppet defaults and docs to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693906 (https://phabricator.wikimedia.org/T283213) [14:48:46] (03CR) 10Hnowlan: [C: 03+2] profile::envoy: Add argument for building the envoy-future package [puppet] - 10https://gerrit.wikimedia.org/r/693870 (owner: 10Hnowlan) [14:48:53] 10SRE, 10netbox, 10netops: Stage drmrs in Netbox - https://phabricator.wikimedia.org/T283594 (10ayounsi) p:05Triage→03Medium [14:49:45] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) You may be upon something. UDP transmission speed seems equivalent in both ways: (there was the... [14:50:04] (03CR) 10Herron: [C: 03+2] update puppet defaults and docs to libera.chat (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693906 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron) [14:53:00] 10Puppet, 10Patch-For-Review: puppetdb seems to be slow on host reimage - https://phabricator.wikimedia.org/T263578 (10jbond) > For reference the current puppet runs for the alert hosts are at 18,48 for alert1001 and 23,53 for alert2001. Thanks Fyi i have also updated the [[ https://grafana-rw.wikimedia.org/d... 
[14:53:12] 10SRE, 10serviceops, 10MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), 10Patch-For-Review, 10Platform Engineering (Icebox): Undeploy graphoid - https://phabricator.wikimedia.org/T242855 (10Seddon) a:05Seddon→03None [14:56:18] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) Compare if I run the above with TCP (minus the -u), where the difference can be appreciated in o... [14:56:25] (03PS1) 10Ottomata: [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) [14:57:03] (03CR) 10Ottomata: "Still very WIP , no need for review yet! Just adding yall for reference." [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) (owner: 10Ottomata) [14:57:50] (03CR) 10jerkins-bot: [V: 04-1] [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) (owner: 10Ottomata) [15:00:44] (03PS2) 10Ottomata: [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) [15:02:14] (03CR) 10jerkins-bot: [V: 04-1] [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) (owner: 10Ottomata) [15:03:45] (03CR) 10CDanis: [C: 03+1] icinga: switch to LibreNMS AlertManager paging (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/685779 (https://phabricator.wikimedia.org/T281095) (owner: 10Filippo Giunchedi) [15:04:27] (03CR) 10Volans: "LGTM, couple of questions/nits inline" (034 comments) [puppet] - 10https://gerrit.wikimedia.org/r/694379 (owner: 10Jbond) [15:04:38] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) > 
@jcrespo, if you are ok with the proposed tests can you advise a newbie on what the best way t... [15:05:12] (03PS5) 10Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - 10https://gerrit.wikimedia.org/r/694379 [15:19:46] (03PS1) 10Hashar: gerrit: add Java 11 packages [puppet] - 10https://gerrit.wikimedia.org/r/694523 (https://phabricator.wikimedia.org/T268225) [15:19:48] (03PS1) 10Hashar: gerrit: switch to Java 11 [puppet] - 10https://gerrit.wikimedia.org/r/694524 (https://phabricator.wikimedia.org/T268225) [15:20:15] (03CR) 10jerkins-bot: [V: 04-1] gerrit: add Java 11 packages [puppet] - 10https://gerrit.wikimedia.org/r/694523 (https://phabricator.wikimedia.org/T268225) (owner: 10Hashar) [15:20:35] (03CR) 10jerkins-bot: [V: 04-1] gerrit: switch to Java 11 [puppet] - 10https://gerrit.wikimedia.org/r/694524 (https://phabricator.wikimedia.org/T268225) (owner: 10Hashar) [15:21:14] (03PS2) 10Hashar: gerrit: add Java 11 packages [puppet] - 10https://gerrit.wikimedia.org/r/694523 (https://phabricator.wikimedia.org/T268225) [15:21:16] (03PS2) 10Hashar: gerrit: switch to Java 11 [puppet] - 10https://gerrit.wikimedia.org/r/694524 (https://phabricator.wikimedia.org/T268225) [15:23:51] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) backups hosts happen to have a generous scratching area, I have left on backup1003:/srv and back... 
[15:24:02] (03PS5) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [15:25:04] (03CR) 10Legoktm: [C: 03+1] "Post-merge +1, thanks :)" [puppet] - 10https://gerrit.wikimedia.org/r/694210 (https://phabricator.wikimedia.org/T282303) (owner: 10Jcrespo) [15:26:49] (03PS3) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [15:29:17] (03CR) 10Razzi: [C: 03+2] yarn: temporarily stop allowing jobs to be submitted to yarn [puppet] - 10https://gerrit.wikimedia.org/r/692465 (https://phabricator.wikimedia.org/T278423) (owner: 10Razzi) [15:31:43] (03CR) 10Elukey: Add istio base images build support (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey) [15:32:02] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10cmooney) Ok thanks for the update, and confirmation that it's ok to to add those temp iptables rules if n... [15:32:06] 10SRE, 10Data-Persistence-Backup, 10Wikimedia-Mailing-lists: The Great Clean Up of Mailman2 - https://phabricator.wikimedia.org/T282303 (10jcrespo) You can track the progress at: https://grafana.wikimedia.org/d/413r2vbWk/bacula?orgId=1&var-site=eqiad&var-job=lists1001.wikimedia.org-Weekly-Mon-Archive-var-lib... 
[15:34:47] (03PS22) 10Elukey: Add istio base images build support [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) [15:34:49] (03PS7) 10Elukey: Add knative serving and net-istio images [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194) [15:34:52] (03PS5) 10Elukey: Add base kubeflow kfserving images and kube-rbac-proxy [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693644 (https://phabricator.wikimedia.org/T272919) [15:34:54] (03PS4) 10Elukey: Add Jetstack's cert-manager base go images. [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693826 (https://phabricator.wikimedia.org/T280661) [15:36:08] (03PS2) 10Ladsgroup: lists: Stop mailman2 service [puppet] - 10https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) [15:37:13] (03CR) 10Ladsgroup: lists: Stop mailman2 service (033 comments) [puppet] - 10https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) (owner: 10Ladsgroup) [15:37:27] (03CR) 10Ladsgroup: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) (owner: 10Ladsgroup) [15:39:40] (03CR) 10David Caro: [C: 03+2] prometheus: Use high retention time when using size [puppet] - 10https://gerrit.wikimedia.org/r/694472 (owner: 10David Caro) [15:39:42] (03PS1) 10Ayounsi: Reports: ignore drmrs [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694539 (https://phabricator.wikimedia.org/T283594) [15:45:13] (03PS3) 10Ladsgroup: lists: Stop mailman2 service [puppet] - 10https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) [15:45:15] (03CR) 10Volans: [C: 04-1] "As agreed on IRC I did a pass and commented only where I'm confident I know the side effects." 
(0319 comments) [puppet] - 10https://gerrit.wikimedia.org/r/692869 (https://phabricator.wikimedia.org/T282787) (owner: 10MMandere) [15:45:46] (03CR) 10Ladsgroup: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/693600 (https://phabricator.wikimedia.org/T52864) (owner: 10Ladsgroup) [15:49:24] 10SRE, 10Gerrit, 10Release-Engineering-Team (Seen): Create Gerrit Administrator right policy - https://phabricator.wikimedia.org/T218686 (10hashar) [15:50:08] (03CR) 10Ahmon Dancy: [C: 04-1] "Minor nits." (036 comments) [puppet] - 10https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) (owner: 10JMeybohm) [15:50:28] (03CR) 10Volans: [C: 03+1] "LGTM, I think accounting.py might need the exclusion too" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694539 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [15:50:33] (03PS4) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [15:50:41] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) > I might run one or two tests between these hosts also if that is ok. Absolutely no problem, o... [15:51:14] 10SRE, 10Gerrit, 10TechCom, 10Release-Engineering-Team (Seen): Expand Gerrit Manager permissions - https://phabricator.wikimedia.org/T234474 (10hashar) [15:52:25] 10SRE, 10netbox, 10netops, 10Patch-For-Review: Stage drmrs in Netbox - https://phabricator.wikimedia.org/T283594 (10Volans) I would suggest to start adding things with `Status=Planned` and adapt our automation and tooling to exclude them if they're not doing that already. 
[15:52:54] (03CR) 10Ayounsi: [C: 03+2] "> Patch Set 1: Code-Review+1" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694539 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [15:54:27] (03Merged) 10jenkins-bot: Reports: ignore drmrs [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694539 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [16:00:05] jbond42 and cdanis: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Puppet request window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T1600). [16:02:23] (03PS5) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 (3) [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [16:04:17] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10egardner) @thcipriani that's correct – I just need access to do the occasional backport and config deploy, I wasn't sure what the exact name for that gr... [16:06:07] (03CR) 10Ahmon Dancy: "Testing in progress." 
[software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [16:07:01] (03PS6) 10Effie Mouzeli: WIP: add notls support for external addresses to memcached (1) [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) [16:07:32] (03PS6) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [16:07:39] (03PS7) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [16:07:47] (03PS6) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 (3) [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [16:10:06] (03PS6) 10Jbond: O:envoyproxy: add a way to restart envoy proxy [puppet] - 10https://gerrit.wikimedia.org/r/694379 [16:10:36] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10thcipriani) >>! In T283541#7111749, @egardner wrote: > @thcipriani that's correct – I just need access to do the occasional backport and config deploy,... 
[16:12:24] (03CR) 10Jbond: [C: 03+1] "thanks updated" (034 comments) [puppet] - 10https://gerrit.wikimedia.org/r/694379 (owner: 10Jbond) [16:13:19] (03PS3) 10JMeybohm: docker-registry: Add caching config for nginx [puppet] - 10https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) [16:13:23] (03PS1) 10JMeybohm: httpbb: Add tests for docker-registry [puppet] - 10https://gerrit.wikimedia.org/r/694552 (https://phabricator.wikimedia.org/T273521) [16:14:01] (03CR) 10Jbond: [C: 03+1] "updated thanks" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694379 (owner: 10Jbond) [16:16:21] (03CR) 10Dzahn: [C: 04-1] "change looks mostly ok and has approval, but the UID need to be adjusted" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [16:23:17] (03CR) 10Ahmon Dancy: [C: 03+2] "Works as advertised." [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [16:23:20] (03PS3) 10Ottomata: [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) [16:24:39] (03CR) 10Jbond: "echo Ricardo's comments thanks for the effort in this PS. 
I have tried to follow Ricardo's lead and mark places that are OK or places whi" (034 comments) [puppet] - 10https://gerrit.wikimedia.org/r/692869 (https://phabricator.wikimedia.org/T282787) (owner: 10MMandere) [16:24:53] (03CR) 10jerkins-bot: [V: 04-1] [WIP] airflow 2 [puppet] - 10https://gerrit.wikimedia.org/r/694514 (https://phabricator.wikimedia.org/T272973) (owner: 10Ottomata) [16:25:42] (03PS1) 10Ottomata: Set krb: present for kzeta [puppet] - 10https://gerrit.wikimedia.org/r/694553 (https://phabricator.wikimedia.org/T283386) [16:26:42] (03CR) 10Ottomata: [C: 03+2] Set krb: present for kzeta [puppet] - 10https://gerrit.wikimedia.org/r/694553 (https://phabricator.wikimedia.org/T283386) (owner: 10Ottomata) [16:33:37] (03PS5) 10Marostegui: data.yaml: Add Eric Gardner to deployment group [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [16:34:22] (03CR) 10Marostegui: [C: 04-2] "Thanks Daniel, fixed it!" [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [16:34:59] (03CR) 10Marostegui: "Key has been verified too, this is ready to go whenever Tyler approves it here and on the phab task!" [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [16:39:42] (03CR) 10Ahmon Dancy: [C: 04-1] "There are some unresolved comments from patchset 2." 
[puppet] - 10https://gerrit.wikimedia.org/r/694330 (https://phabricator.wikimedia.org/T264209) (owner: 10JMeybohm) [16:41:19] (03PS7) 10Effie Mouzeli: WIP: add notls support for external addresses to memcached (1) [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) [16:41:32] (03PS8) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [16:44:33] (03CR) 10Dzahn: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [16:45:46] (03CR) 10RLazarus: "The tests look good -- I haven't run them but I assume they pass currently." [puppet] - 10https://gerrit.wikimedia.org/r/694552 (https://phabricator.wikimedia.org/T273521) (owner: 10JMeybohm) [16:46:49] (03CR) 10Hashar: [C: 03+2] "+2 again after https://gerrit.wikimedia.org/r/c/integration/config/+/694555/" [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [16:46:57] (03Merged) 10jenkins-bot: Download upstream war with Maven [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693450 (owner: 10Hashar) [16:47:09] 10SRE, 10netops, 10observability: Create RIPE Atlas measurements against our authoritative DNS servers; alert on them - https://phabricator.wikimedia.org/T283359 (10CDanis) Totally agree with all of the above! 
Filed a placeholder task: {T283614} [16:47:39] (03CR) 10Bstorm: buster: generate a release for buster and add some files (033 comments) [debs/nfsd-ldap] - 10https://gerrit.wikimedia.org/r/693500 (https://phabricator.wikimedia.org/T283385) (owner: 10Bstorm) [16:48:21] (03CR) 10Bstorm: "> Patch Set 1: Code-Review+1" [debs/nfsd-ldap] - 10https://gerrit.wikimedia.org/r/693500 (https://phabricator.wikimedia.org/T283385) (owner: 10Bstorm) [16:49:15] (03PS1) 10JMeybohm: Add per test timeouts [software/httpbb] - 10https://gerrit.wikimedia.org/r/694556 [16:52:49] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10Marostegui) a:03Marostegui [16:53:32] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10Marostegui) ssh key has been verified, this is only waiting for manager approval the +1 from Tyler on the gerrit patch [16:56:35] (03PS9) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [16:56:53] (03PS7) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 (3) [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [16:58:08] (03CR) 10jerkins-bot: [V: 04-1] (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [17:00:05] chrisalbon and accraze: How many deployers does it take to do Services – Graphoid / ORES deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T1700). 
[17:00:34] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10MarkTraceur) Hereby approved by manager :) thanks, all! [17:00:38] (03PS10) 10Effie Mouzeli: (WIP) profile::memcached::instance: Add TLS support (2) [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) [17:01:18] (03PS8) 10Effie Mouzeli: (WIP) hieradata: enable tls on mc2019 (3) [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) [17:04:44] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10Marostegui) [17:05:21] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10Marostegui) Thanks @MarkTraceur @thcipriani +1 from your side to add this to the deployment group? [17:06:06] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10thcipriani) >>! In T283541#7111931, @Marostegui wrote: > Thanks @MarkTraceur > @thcipriani +1 from your side to add this to the deployment group? Yes, +1 [17:08:16] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10Marostegui) [17:08:45] 10SRE, 10wikimedia-irc-libera: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10bd808) [17:09:46] (03PS6) 10Marostegui: data.yaml: Add Eric Gardner to deployment group [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) [17:12:11] (03CR) 10Thcipriani: [C: 03+1] "thank you!" 
[puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [17:12:23] (03CR) 10Marostegui: [C: 03+2] data.yaml: Add Eric Gardner to deployment group [puppet] - 10https://gerrit.wikimedia.org/r/694466 (https://phabricator.wikimedia.org/T283541) (owner: 10Marostegui) [17:14:29] 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to production deployment for Eric Gardner - https://phabricator.wikimedia.org/T283541 (10Marostegui) 05Open→03Resolved Access patch has been merged, you should try in like 1h or so to make sure puppet has run everywhere. Please make sure... [17:21:29] (03PS1) 10Razzi: yarn: enable submitting jobs to queue [puppet] - 10https://gerrit.wikimedia.org/r/694588 (https://phabricator.wikimedia.org/T278423) [17:23:37] (03CR) 10Razzi: [C: 03+2] yarn: enable submitting jobs to queue [puppet] - 10https://gerrit.wikimedia.org/r/694588 (https://phabricator.wikimedia.org/T278423) (owner: 10Razzi) [17:31:09] (03CR) 10Cwhite: [C: 03+2] rsyslog: add ecs_170 template [puppet] - 10https://gerrit.wikimedia.org/r/688502 (https://phabricator.wikimedia.org/T234565) (owner: 10Cwhite) [17:33:08] 10SRE, 10ops-eqiad, 10Analytics-Radar: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (10Cmjohnson) @elukey I have this on my plan for tomorrow morning. i'll update the task once the move is complete. 
[17:34:18] (03PS1) 10Ayounsi: Reports: fix typo [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694596 (https://phabricator.wikimedia.org/T283594) [17:34:57] (03CR) 10Cwhite: [C: 03+2] logstash: bump ecs to 1.7.0-3 [puppet] - 10https://gerrit.wikimedia.org/r/693964 (owner: 10Cwhite) [17:35:08] (03CR) 10Volans: [C: 03+1] "LGTM" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694596 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [17:35:53] (03PS1) 10Razzi: yarn: set queue state to RUNNING [puppet] - 10https://gerrit.wikimedia.org/r/694597 (https://phabricator.wikimedia.org/T278423) [17:37:37] (03CR) 10Razzi: [C: 03+2] yarn: set queue state to RUNNING [puppet] - 10https://gerrit.wikimedia.org/r/694597 (https://phabricator.wikimedia.org/T278423) (owner: 10Razzi) [17:40:40] (03PS3) 10Krinkle: Temporarily shorten $wgParserCacheExpireTime from 30 to 22 days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) [17:42:11] (03CR) 10Krinkle: "The background cron job, re-configured in https://gerrit.wikimedia.org/r/c/operations/puppet/+/685222/ , has finally finished running the " [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) (owner: 10Krinkle) [17:44:32] (03PS4) 10Krinkle: Temporarily shorten $wgParserCacheExpireTime from 30 to 21 days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) [17:44:50] (03PS5) 10Krinkle: Temporarily shorten $wgParserCacheExpireTime from 30 to 21 days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) [17:44:59] (03CR) 10Ayounsi: [C: 03+2] Reports: fix typo [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/694596 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [17:45:44] (03Merged) 10jenkins-bot: Reports: fix typo [software/netbox-extras] - 
10https://gerrit.wikimedia.org/r/694596 (https://phabricator.wikimedia.org/T283594) (owner: 10Ayounsi) [17:48:14] (03CR) 10Krinkle: [C: 03+2] Temporarily shorten $wgParserCacheExpireTime from 30 to 21 days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) (owner: 10Krinkle) [17:48:16] (03CR) 10Hashar: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/694523 (https://phabricator.wikimedia.org/T268225) (owner: 10Hashar) [17:48:19] (03CR) 10Hashar: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/694524 (https://phabricator.wikimedia.org/T268225) (owner: 10Hashar) [17:49:25] (03Merged) 10jenkins-bot: Temporarily shorten $wgParserCacheExpireTime from 30 to 21 days [mediawiki-config] - 10https://gerrit.wikimedia.org/r/685181 (https://phabricator.wikimedia.org/T280605) (owner: 10Krinkle) [17:50:49] 10SRE, 10ops-eqiad, 10cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? - https://phabricator.wikimedia.org/T281045 (10wiki_willy) [17:51:52] 10SRE, 10ops-eqiad, 10cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? - https://phabricator.wikimedia.org/T281045 (10wiki_willy) T283618 created to order the replacement part. @Bstorm or @aborrero - one of you might need to confirm the drive size though, t... [17:53:52] (03CR) 10Hashar: [C: 03+2] Add .editorconfig for pom.xml [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/694474 (owner: 10Hashar) [17:54:01] (03Merged) 10jenkins-bot: Add .editorconfig for pom.xml [software/gerrit] (deploy/wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/694474 (owner: 10Hashar) [17:56:26] 10SRE, 10ops-eqiad, 10cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? 
- https://phabricator.wikimedia.org/T281045 (10Bstorm) It's this drive ` physicaldrive 1E:1:5 (port 1E:box 1:bay 5, 6001.1 GB)` Is that enough? [17:57:19] 10SRE, 10ops-eqiad, 10cloud-services-team (Hardware): labstore1007 crashed after storage controller errors--replace disk? - https://phabricator.wikimedia.org/T281045 (10ArielGlenn) This may be a silly question, but might it not be a good idea to have a few spare drives on hand for labstore1006/7 until they ar... [17:59:20] 10SRE, 10ops-eqiad: Degraded RAID on ms-be1053 - https://phabricator.wikimedia.org/T282839 (10Cmjohnson) The case was submitted with HPE, Successfully Submitted Case Number: 5355909720 [18:00:04] Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T1800) [18:04:01] 10SRE, 10ops-codfw, 10DC-Ops, 10fundraising-tech-ops: (Need By: TBD) rack/setup/install fran2001.frack.codfw.wmnet - https://phabricator.wikimedia.org/T282056 (10Papaul) @Jgreen this server supposed to use for IP address 10.195.0.36 but this IP address is used by mintaka which was part of recycle assets s... [18:09:12] 10SRE, 10Traffic, 10Patch-For-Review: Offer Wikidough as an anycasted service - https://phabricator.wikimedia.org/T283027 (10cmooney) [18:09:15] 10SRE, 10Traffic: RIPE Atlas monitoring of reachability & latency towards anycasted Wikidough IP - https://phabricator.wikimedia.org/T283614 (10cmooney) [18:09:56] 10SRE, 10netops, 10observability: Create RIPE Atlas measurements against our authoritative DNS servers; alert on them - https://phabricator.wikimedia.org/T283359 (10cmooney) Great, I'm working with Sukhbir to get that online in next day or two so we should be able to progress it any time after that. 
[18:11:05] (03CR) 10Effie Mouzeli: "PCC looks ok https://puppet-compiler.wmflabs.org/compiler1003/29687/" [puppet] - 10https://gerrit.wikimedia.org/r/694484 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [18:13:50] (03CR) 10Effie Mouzeli: "PCC ok https://puppet-compiler.wmflabs.org/compiler1002/29686/" [puppet] - 10https://gerrit.wikimedia.org/r/694465 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [18:14:34] (03CR) 10Effie Mouzeli: "NOOP in PCC https://puppet-compiler.wmflabs.org/compiler1001/29685/" [puppet] - 10https://gerrit.wikimedia.org/r/693474 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [18:23:05] 10SRE, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar), 10User-jijiki: Enable TLS on memcached for cross-dc replication - https://phabricator.wikimedia.org/T271967 (10jijiki) [18:25:33] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10DLynch) @Urbanecm Yeah, I'd only need to run maintenance scripts, so I guess `restricted` would be fine. I'm only asking for this because on the review task we were told that... [18:26:52] 10SRE, 10ops-eqiad, 10DC-Ops, 10cloud-services-team (Hardware): cloudvirt1038: PCIe error - https://phabricator.wikimedia.org/T276922 (10Cmjohnson) Dell is asking for more logs, this will not be a quick process [18:28:16] (03PS1) 10Dzahn: add miscweb to LVS [puppet] - 10https://gerrit.wikimedia.org/r/694625 (https://phabricator.wikimedia.org/T281538) [18:28:30] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10Marostegui) "self service" means that you don't really need to get blocked on the DBA once the table is approved and filters are in place (if any), you can coordinate with any... 
[18:31:37] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10Urbanecm) >>! In T283607#7112219, @DLynch wrote: > [...] > I'm only asking for this because on the review task we were told that "deployment is self-service" and nobody on the... [18:32:22] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10DLynch) @Urbanecm Yeah, that's 100% fine. [18:33:28] (03PS1) 10Dzahn: service/miscweb: switch state from service_setup to lvs_setup [puppet] - 10https://gerrit.wikimedia.org/r/694628 (https://phabricator.wikimedia.org/T281538) [18:33:32] (03PS1) 10Dzahn: service/miscweb: switch state from lvs_setup to monitoring_setup [puppet] - 10https://gerrit.wikimedia.org/r/694629 (https://phabricator.wikimedia.org/T281538) [18:33:34] (03PS1) 10Dzahn: service/miscweb: switch state to production [puppet] - 10https://gerrit.wikimedia.org/r/694630 (https://phabricator.wikimedia.org/T281538) [18:33:50] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10Marostegui) @DLynch ok to close this for now and we can reopen if needed in the future? [18:35:24] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10DLynch) @Marostegui Sure thing! [18:37:28] 10SRE, 10SRE-Access-Requests: Requesting access to production deployment for David Lynch - https://phabricator.wikimedia.org/T283607 (10Marostegui) 05Open→03Invalid Thank you both! 
[18:37:46] (03PS1) 1020after4: testwikis wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694631 [18:37:48] (03CR) 1020after4: [C: 03+2] testwikis wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694631 (owner: 1020after4) [18:38:53] (03Merged) 10jenkins-bot: testwikis wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694631 (owner: 1020after4) [18:40:31] 10SRE, 10Advanced Mobile Contributions, 10Traffic, 10User-Joe: AMC – Opt-in for logged out users - https://phabricator.wikimedia.org/T215624 (10Jdlrobson) @ovasileva @phuedx is it useful to still have this ticket open? If so, should it be in #readers-web-backlog or tracking? [18:56:49] (03PS1) 10Krinkle: Allow talk pages to have a different ParserCache expiry [extensions/DiscussionTools] (wmf/1.37.0-wmf.7) - 10https://gerrit.wikimedia.org/r/694314 (https://phabricator.wikimedia.org/T280605) [19:00:04] twentyafterfour and hashar: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for MediaWiki train - American+European Version. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T1900). [19:01:36] :o [19:25:33] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar: (Need By: TBD) rack/setup/install mw14[14-56] - https://phabricator.wikimedia.org/T273915 (10Cmjohnson) [19:26:19] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar: (Need By: TBD) rack/setup/install mw14[14-56] - https://phabricator.wikimedia.org/T273915 (10Cmjohnson) mw1414-,mw1422 racked, dns updated and homer ran. BIOS/Idrac is not setup yet [19:43:53] 10SRE, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar), 10User-jijiki: Enable TLS on memcached for cross-dc replication - https://phabricator.wikimedia.org/T271967 (10jbond) i wonder if we have considered just having the TLS port everywhere except localhost? 
[19:45:56] 10SRE, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar), 10User-jijiki: Enable TLS on memcached for cross-dc replication - https://phabricator.wikimedia.org/T271967 (10jbond) > i wonder if we have considered just having the TLS port everywhere except localhost? regardless i guess we need a t... [19:47:52] 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10cmooney) Thanks Jamie I've been digging into this. Looking at the PCAPs, and even the iperf cli output,... [19:58:12] (03PS1) 1020after4: group0 wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694647 [19:58:14] (03CR) 1020after4: [C: 03+2] group0 wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694647 (owner: 1020after4) [19:58:57] (03Merged) 10jenkins-bot: group0 wikis to 1.37.0-wmf.7 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694647 (owner: 1020after4) [19:59:09] (03CR) 10RLazarus: O:envoyproxy: add a way to restart envoy proxy (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694379 (owner: 10Jbond) [20:08:07] 10SRE, 10netops, 10observability: Create RIPE Atlas measurements against our authoritative DNS servers; alert on them - https://phabricator.wikimedia.org/T283359 (10jbond) > ! And from a cursory glance we can spot a number of interesting things that I am holding myself back from getting nerdsniped on: this i... [20:17:33] (03CR) 10Superyetkin: "This change is ready for review." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/694315 (owner: 10Superyetkin) [20:33:25] (03CR) 10Jbond: [C: 04-1] "thanks for the input" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/694379 (owner: 10Jbond) [20:36:48] (03PS5) 10Aklapper: Adjust CSP header for pdfs & videos & set enforce on testwiki [puppet] - 10https://gerrit.wikimedia.org/r/547929 (https://phabricator.wikimedia.org/T117618) (owner: 10Brian Wolff) [20:38:46] (03PS5) 10Eric Gardner: Enable MediaSearch Assessment filter [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693951 (https://phabricator.wikimedia.org/T276257) [20:49:43] 10SRE, 10Wikimedia-Logstash, 10observability, 10service-runner: Move service-runner to new logging infrastructure - https://phabricator.wikimedia.org/T211125 (10Aklapper) @Pchelolo: Hi, all related patches in Gerrit have been merged or abandoned. Is there more to do in this task? Asking as you are set as t... [20:50:25] 10SRE, 10Wikimedia-Logstash, 10observability, 10service-runner: Move service-runner to new logging infrastructure - https://phabricator.wikimedia.org/T211125 (10Pchelolo) 05Open→03Resolved [20:50:29] 10SRE, 10Wikimedia-Logstash, 10observability, 10Patch-For-Review: Deprecate all non-Kafka logstash inputs - https://phabricator.wikimedia.org/T227080 (10Pchelolo) [20:50:32] 10SRE, 10Discovery-Search, 10Elasticsearch, 10Wikimedia-Logstash, 10observability: Migrate Elasticsearch from deprecated Gelf logstash input to rsyslog Kafka logging pipeline - https://phabricator.wikimedia.org/T225125 (10Pchelolo) [20:50:35] 10SRE, 10Wikimedia-Logstash, 10observability: Migrate services using deprecated Gelf logstash input to Kafka enabled logging pipeline - https://phabricator.wikimedia.org/T225122 (10Pchelolo) [20:52:22] 10SRE, 10Performance-Team (Radar): Automated service restarts for common low-level system services - https://phabricator.wikimedia.org/T135991 (10Aklapper) [21:00:48] 10SRE, 10FR-MW-Vagrant, 10Fundraising-Backlog, 
10MediaWiki-Vagrant: Package XDebug 2.9 for apt.wikimedia.org - https://phabricator.wikimedia.org/T220406 (10Aklapper) [21:03:25] (03PS1) 10Zabe: Restrict changetags to 'autoconfirmed' users on meta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/694686 (https://phabricator.wikimedia.org/T283625) [21:04:55] (03CR) 10MarcoAurelio: "This change is ready for review." [software/klaxon] - 10https://gerrit.wikimedia.org/r/694316 (owner: 10MarcoAurelio) [21:07:20] 10SRE, 10Analytics-Radar, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10jmixter) yeah sorry about that. I think this was a symptom of me being new and not having any idea what I was doing. I think things are resolved now. [21:10:42] (03PS3) 10Aklapper: Set $wgUploadNavigationUrl for few wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/364121 (https://phabricator.wikimedia.org/T170083) (owner: 10Framawiki) [21:18:09] jouncebot: now [21:18:09] No deployments scheduled for the next 1 hour(s) and 41 minute(s) [21:18:12] jouncebot: next [21:18:12] In 1 hour(s) and 41 minute(s): Evening backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T2300) [21:21:02] 10SRE, 10Analytics, 10Discovery, 10Event-Platform, and 2 others: Avoid accepting Kafka messages with whacky timestamps - https://phabricator.wikimedia.org/T282887 (10BPirkle) [21:34:10] (03CR) 10CDanis: [C: 03+2] index: Remove optional `?#` on webchat link to reduce potential encoding errors [software/klaxon] - 10https://gerrit.wikimedia.org/r/694316 (owner: 10MarcoAurelio) [21:36:29] (03Merged) 10jenkins-bot: index: Remove optional `?#` on webchat link to reduce potential encoding errors [software/klaxon] - 10https://gerrit.wikimedia.org/r/694316 (owner: 10MarcoAurelio) [21:48:47] (03PS1) 10Razzi: sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - 10https://gerrit.wikimedia.org/r/694710 [21:54:36] 
(03CR) 10Razzi: [C: 03+2] sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - 10https://gerrit.wikimedia.org/r/694710 (owner: 10Razzi) [21:58:05] (03Merged) 10jenkins-bot: sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - 10https://gerrit.wikimedia.org/r/694710 (owner: 10Razzi) [22:05:13] 10SRE, 10ops-eqiad, 10cloud-services-team (Hardware): cloudvirt1040 primary NIC disconnected - https://phabricator.wikimedia.org/T281399 (10Jclark-ctr) opened dell support ticket Service Request Detail: 1060698910 even though it shows connected now will follow up with dell [22:14:29] (03PS1) 1020after4: Increase apache URL length limit for Phabricator [puppet] - 10https://gerrit.wikimedia.org/r/694731 (https://phabricator.wikimedia.org/T281390) [22:26:25] (03PS1) 10Razzi: sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - 10https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067) [22:26:36] (03CR) 10Razzi: [C: 03+2] kerberos: require --email_address for create and reset-password [puppet] - 10https://gerrit.wikimedia.org/r/686766 (https://phabricator.wikimedia.org/T282185) (owner: 10Razzi) [22:33:50] (03CR) 10Razzi: [C: 03+2] sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - 10https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067) (owner: 10Razzi) [22:37:12] (03Merged) 10jenkins-bot: sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - 10https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067) (owner: 10Razzi) [22:54:19] (03PS1) 10Cwhite: rsyslog: try to parse the msg field as json before shipping [puppet] - 10https://gerrit.wikimedia.org/r/694758 [22:56:42] (03CR) 10Cwhite: [C: 03+2] package_builder: add logstash-plugins build hooks [puppet] - 10https://gerrit.wikimedia.org/r/693958 (owner: 10Cwhite) [22:57:08] (03CR) 10Cwhite: [C: 03+2] "PCC 👍 
https://puppet-compiler.wmflabs.org/compiler1001/29689/" [puppet] - 10https://gerrit.wikimedia.org/r/693958 (owner: 10Cwhite) [22:58:29] (03PS4) 10Cwhite: logstash: add openstack ECS transition config and tests [puppet] - 10https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565) [22:59:57] (03CR) 10jerkins-bot: [V: 04-1] logstash: add openstack ECS transition config and tests [puppet] - 10https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565) (owner: 10Cwhite) [23:00:05] RoanKattouw, Niharika, and Urbanecm: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Evening backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T2300). [23:00:05] No GERRIT patches in the queue for this window AFAICS. [23:09:17] (03PS5) 10Cwhite: logstash: add openstack ECS transition config and tests [puppet] - 10https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565) [23:10:21] (03PS1) 10Razzi: sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - 10https://gerrit.wikimedia.org/r/694768 [23:18:51] (03CR) 10Razzi: [C: 03+2] sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - 10https://gerrit.wikimedia.org/r/694768 (owner: 10Razzi) [23:21:35] (03Merged) 10jenkins-bot: sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - 10https://gerrit.wikimedia.org/r/694768 (owner: 10Razzi) [23:24:34] 10SRE, 10wikimedia-irc-libera: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10razzi) [23:31:28] (03CR) 10Razzi: [C: 03+2] reportupdater: Rsync logs to HDFS [puppet] - 10https://gerrit.wikimedia.org/r/692909 (https://phabricator.wikimedia.org/T274880) (owner: 10Mforns) [23:53:50] 10SRE, 10Analytics-Radar, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10Dzahn) 
05Open→03Resolved a:03Dzahn @jmixter Cool, great to hear that things work for you now and thanks for confirming. I think the wiki editin...