[00:16:53] (PS2) Mstyles: kibana: add kibana to relforge [puppet] - https://gerrit.wikimedia.org/r/581111 (https://phabricator.wikimedia.org/T246961)
[00:23:58] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:28:10] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[01:03:50] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[01:05:52] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:00:50] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:02:58] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:30:18] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:32:26] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds.
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[03:29:28] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[03:31:36] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[03:32:51] (PS1) VolkerE: Remove unnecessary, overqualified element parts of id selectors [mediawiki-config] - https://gerrit.wikimedia.org/r/581879 (https://phabricator.wikimedia.org/T248137)
[04:27:02] (PS1) BPirkle: Add configuration variable $wgEnableRestAPIDevelopmentEndpoints [mediawiki-config] - https://gerrit.wikimedia.org/r/581886 (https://phabricator.wikimedia.org/T247997)
[05:08:40] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[05:10:46] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[05:32:02] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[05:34:10] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds.
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[05:57:38] (PS1) DannyS712: wgCopyUploadsDomains: Add supremecourt.gov [mediawiki-config] - https://gerrit.wikimedia.org/r/581891
[05:59:30] (PS2) DannyS712: wgCopyUploadsDomains: Add supremecourt.gov [mediawiki-config] - https://gerrit.wikimedia.org/r/581891 (https://phabricator.wikimedia.org/T248146)
[06:06:02] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:08:08] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:12:04] (CR) Marostegui: [C: +1] "Is this a bug somewhere upstream or is it expected?" [puppet] - https://gerrit.wikimedia.org/r/581617 (https://phabricator.wikimedia.org/T246997) (owner: Filippo Giunchedi)
[06:16:32] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:20:48] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds.
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:21:43] (PS1) KartikMistry: apertium-es-ast: Fix FTBFS with apertium 3.6 [debs/contenttranslation/apertium-es-ast] - https://gerrit.wikimedia.org/r/581893 (https://phabricator.wikimedia.org/T247585)
[06:29:12] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:33:22] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:42:19] (PS1) Marostegui: install_server: Do not reimage pc1008 [puppet] - https://gerrit.wikimedia.org/r/581895 (https://phabricator.wikimedia.org/T247787)
[06:44:10] (CR) Marostegui: [C: +2] install_server: Do not reimage pc1008 [puppet] - https://gerrit.wikimedia.org/r/581895 (https://phabricator.wikimedia.org/T247787) (owner: Marostegui)
[06:49:59] (PS1) KartikMistry: apertium-es-ro: Fix FTBFS with apertium 3.6 [debs/contenttranslation/apertium-es-ro] - https://gerrit.wikimedia.org/r/581896 (https://phabricator.wikimedia.org/T247585)
[07:04:18] PROBLEM - BGP status on cr2-eqord is CRITICAL: BGP CRITICAL - AS6939/IPv4: Connect - HE, AS6939/IPv6: Active - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:09:23] !log marostegui@cumin1001 dbctl commit (dc=all): 'Promote es1014 to es3 master, this is a NOOP T239791', diff saved to https://phabricator.wikimedia.org/P10734 and previous config saved to /var/cache/conftool/dbconfig/20200320-070922-marostegui.json
[07:09:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:09:30] T239791: DB: perform rolling restart of
mariadb daemons to pick up CA changes - https://phabricator.wikimedia.org/T239791
[07:09:46] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool es1017 for update T239791', diff saved to https://phabricator.wikimedia.org/P10735 and previous config saved to /var/cache/conftool/dbconfig/20200320-070945-marostegui.json
[07:09:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:10:36] RECOVERY - BGP status on cr2-eqord is OK: BGP OK - up: 91, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:23:34] PROBLEM - IPv6 ping to codfw on ripe-atlas-codfw IPv6 is CRITICAL: CRITICAL - failed 87 probes of 543 (alerts on 50) - https://atlas.ripe.net/measurements/1791212/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas
[07:26:06] !log Restart mysql on es1017 for upgrade - T239791
[07:26:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:26:12] T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes - https://phabricator.wikimedia.org/T239791
[07:30:22] Operations, Puppet, DBA, User-jbond: DB: perform rolling restart of mariadb daemons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[07:30:55] Operations, Puppet, DBA, User-jbond: DB: perform rolling restart of mariadb daemons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui) Only s1-s8 and x1 masters pending.
[07:32:00] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[07:32:05] !log marostegui@cumin1001 dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10736 and previous config saved to /var/cache/conftool/dbconfig/20200320-073205-marostegui.json
[07:32:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:36:16] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[07:39:39] (PS1) KartikMistry: apertium-eu-en: Fix FTBFS with apertium 3.6 [debs/contenttranslation/apertium-eu-en] - https://gerrit.wikimedia.org/r/581967 (https://phabricator.wikimedia.org/T247585)
[07:46:28] PROBLEM - BGP status on cr2-eqord is CRITICAL: BGP CRITICAL - AS6939/IPv6: Active - HE, AS6939/IPv4: Connect - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:46:41] !log upload hadoop_2.8.5-2 (and related debs) to thirdparty/bigtop14 on wikimedia-stretch (manually rebuilt via docker after patch backports from upstream)
[07:46:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:56] RECOVERY - IPv6 ping to codfw on ripe-atlas-codfw IPv6 is OK: OK - failed 36 probes of 543 (alerts on 50) - https://atlas.ripe.net/measurements/1791212/#!map https://wikitech.wikimedia.org/wiki/Network_monitoring%23Atlas_alerts https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas
[07:48:17] !log marostegui@cumin1001 dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10737 and previous config saved to /var/cache/conftool/dbconfig/20200320-074816-marostegui.json
[07:48:20] Logged
the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:51:02] (CR) Jcrespo: [C: +1] smart: stop smartd on Buster + hpsa [puppet] - https://gerrit.wikimedia.org/r/581617 (https://phabricator.wikimedia.org/T246997) (owner: Filippo Giunchedi)
[07:52:50] RECOVERY - BGP status on cr2-eqord is OK: BGP OK - up: 91, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[07:53:20] mmmmm -^
[07:54:17] from the logs seems a temporary glitch
[07:59:40] !log reorder LVS BGP neighbors and add descriptions - https://gerrit.wikimedia.org/r/576320
[07:59:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:01:12] PROBLEM - BGP status on cr2-eqord is CRITICAL: BGP CRITICAL - AS6939/IPv6: Active - HE, AS6939/IPv4: Connect - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[08:03:35] XioNoX: hello :)
[08:03:44] elukey: yo!
[08:03:57] HE is having issues it seems?
[08:04:20] yeah, I don't see maint announces among the emails
[08:04:49] we peer with them in Equinix ORD right?
[08:05:16] (I was trying to see where HE was in the network graphs)
[08:05:30] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3054 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[08:06:14] elukey: yeah, we have them as transit for v6 and peering for v4
[08:07:22] looks like v6 is back up and v4 still down
[08:08:17] priorities :D
[08:08:38] and we could probably move their v6 back to peering as we only get a partial view
[08:11:53] (CR) Ayounsi: [C: +2] "> Patch Set 3: Code-Review+1" [homer/public] - https://gerrit.wikimedia.org/r/576320 (owner: Ayounsi)
[08:14:08] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:15:52] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3054 is OK: HTTP OK: HTTP/1.0 200 OK - 22348 bytes in 0.257 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[08:20:24] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:33:34] !log marostegui@cumin1001 dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10738 and previous config saved to /var/cache/conftool/dbconfig/20200320-083334-marostegui.json
[08:33:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:38:48] Operations, LDAP-Access-Requests: Request for a ldap account and be added to nda ldap group for PHPCC - https://phabricator.wikimedia.org/T247731 (ArielGlenn) I assume this is for a contract for a specific period of time.
Is there an expiry date and a staff person that should be notified when that date i...
[08:47:31] !log marostegui@cumin1001 dbctl commit (dc=all): 'Fully repool es1017', diff saved to https://phabricator.wikimedia.org/P10739 and previous config saved to /var/cache/conftool/dbconfig/20200320-084730-marostegui.json
[08:47:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:52:30] (PS2) Muehlenhoff: profile::java::analytics: Switch to apt::package_from_component [puppet] - https://gerrit.wikimedia.org/r/565567
[08:53:07] (PS1) ArielGlenn: add thephp.cc contractors as ldap-only users [puppet] - https://gerrit.wikimedia.org/r/581977 (https://phabricator.wikimedia.org/T247731)
[08:54:36] (CR) jerkins-bot: [V: -1] add thephp.cc contractors as ldap-only users [puppet] - https://gerrit.wikimedia.org/r/581977 (https://phabricator.wikimedia.org/T247731) (owner: ArielGlenn)
[08:55:56] (CR) Muehlenhoff: [C: +2] profile::java::analytics: Switch to apt::package_from_component [puppet] - https://gerrit.wikimedia.org/r/565567 (owner: Muehlenhoff)
[08:58:24] (CR) ArielGlenn: [C: -2] "Do not merge, waiting for additional info" [puppet] - https://gerrit.wikimedia.org/r/581977 (https://phabricator.wikimedia.org/T247731) (owner: ArielGlenn)
[08:59:36] !log installing freetype bugfix updates from stretch point release
[08:59:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:09:01] Operations, Commons, SRE-swift-storage: Big number of uploads from DPLA bot - https://phabricator.wikimedia.org/T248151 (fgiunchedi)
[09:11:22] ACKNOWLEDGEMENT - mediawiki originals uploads -hourly- for codfw on icinga1001 is CRITICAL: account=mw-media class=originals cluster=swift instance=ms-fe2005:9112 job=statsd_exporter site=codfw Filippo Giunchedi https://phabricator.wikimedia.org/T248151 https://wikitech.wikimedia.org/wiki/Swift/How_To
https://grafana.wikimedia.org/d/OPgmB1Eiz/swift?panelId=26&fullscreen&orgId=1&var-DC=codfw
[09:11:22] ACKNOWLEDGEMENT - mediawiki originals uploads -hourly- for eqiad on icinga1001 is CRITICAL: account=mw-media class=originals cluster=swift instance=ms-fe1005:9112 job=statsd_exporter site=eqiad Filippo Giunchedi https://phabricator.wikimedia.org/T248151 https://wikitech.wikimedia.org/wiki/Swift/How_To https://grafana.wikimedia.org/d/OPgmB1Eiz/swift?panelId=26&fullscreen&orgId=1&var-DC=eqiad
[09:11:26] (CR) Alexandros Kosiaris: [C: +1] calico: add changeprop access to varnish multicast address [deployment-charts] - https://gerrit.wikimedia.org/r/581667 (https://phabricator.wikimedia.org/T213193) (owner: Hnowlan)
[09:11:43] Operations, Datasets-General-or-Unknown: Provide a good download service of dumps from Wikimedia - https://phabricator.wikimedia.org/T122917 (ArielGlenn) In the meantime labstore hosts handle this now and there is a superceding ticket about bandwidth and access for those, see T191491. Closing this as dup.
[09:12:22] Operations, Cloud-Services, Datasets-General-or-Unknown, User-ArielGlenn, cloud-services-team (Kanban): Adjust bandwidth/connection limits, memory settings on labstore1006,7 as appropriate - https://phabricator.wikimedia.org/T191491 (ArielGlenn)
[09:12:24] Operations, Datasets-General-or-Unknown: Provide a good download service of dumps from Wikimedia - https://phabricator.wikimedia.org/T122917 (ArielGlenn)
[09:13:36] (PS1) Filippo Giunchedi: grafana: add upload bytes for originals to swift [puppet] - https://gerrit.wikimedia.org/r/581979
[09:14:12] (CR) Filippo Giunchedi: [C: +2] grafana: add upload bytes for originals to swift [puppet] - https://gerrit.wikimedia.org/r/581979 (owner: Filippo Giunchedi)
[09:14:23] Operations, Datasets-General-or-Unknown: investigate rsync between dcs with encryption - https://phabricator.wikimedia.org/T123560 (ArielGlenn) Open→Declined Now no private data is rsynced but only public files, even within the same dc, so this can be declined.
[09:14:25] Operations, Epic: Encrypt all the things - https://phabricator.wikimedia.org/T111653 (ArielGlenn)
[09:24:11] (CR) Filippo Giunchedi: "LGTM modulo the 'require' change" [puppet] - https://gerrit.wikimedia.org/r/579422 (https://phabricator.wikimedia.org/T247376) (owner: Herron)
[09:25:12] (CR) Filippo Giunchedi: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/579340 (https://phabricator.wikimedia.org/T247376) (owner: Herron)
[09:25:26] RECOVERY - BGP status on cr2-eqord is OK: BGP OK - up: 91, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[09:26:14] (CR) Muehlenhoff: "Is there still Toolforge on jessie using Docker at this point?"
[puppet] - https://gerrit.wikimedia.org/r/524731 (owner: Muehlenhoff)
[09:26:55] (Abandoned) Filippo Giunchedi: puppetmaster: lock commits on /srv/private on non-master hosts [puppet] - https://gerrit.wikimedia.org/r/420705 (https://phabricator.wikimedia.org/T189891) (owner: Filippo Giunchedi)
[09:27:11] Operations, LDAP-Access-Requests, Patch-For-Review: Request for a ldap account and be added to nda ldap group for PHPCC - https://phabricator.wikimedia.org/T247731 (darthmon_wmde) >>! In T247731#5986224, @ArielGlenn wrote: > I assume this is for a contract for a specific period of time. Is there an e...
[09:28:03] (PS1) Elukey: admin: refactor analytics-related groups and add documentation [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578)
[09:28:40] !log rolling restart of FPM on mw1261-mw1265 for freetype update
[09:28:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:29:12] (CR) Filippo Giunchedi: "> Patch Set 1: Code-Review+1" [puppet] - https://gerrit.wikimedia.org/r/581617 (https://phabricator.wikimedia.org/T246997) (owner: Filippo Giunchedi)
[09:30:05] Operations, LDAP-Access-Requests, Patch-For-Review: Request for a ldap account and be added to nda ldap group for PHPCC - https://phabricator.wikimedia.org/T247731 (ArielGlenn) >>! In T247731#5986344, @darthmon_wmde wrote: ... > > Is is possible to leave it open for now and I will notify you as soon...
[09:33:47] Operations, LDAP-Access-Requests, Patch-For-Review: Request for a ldap account and be added to nda ldap group for PHPCC - https://phabricator.wikimedia.org/T247731 (MoritzMuehlenhoff) > Is is possible to leave it open for now and I will notify you as soon as I know? because of the current situation w...
[09:53:17] (PS1) Muehlenhoff: postgres: Remove support for jessie [puppet] - https://gerrit.wikimedia.org/r/581985
[09:54:18] (CR) jerkins-bot: [V: -1] postgres: Remove support for jessie [puppet] - https://gerrit.wikimedia.org/r/581985 (owner: Muehlenhoff)
[10:04:36] (CR) 20after4: [C: +1] Add an image for python2 app based on Buster [docker-images/production-images] - https://gerrit.wikimedia.org/r/580128 (https://phabricator.wikimedia.org/T215458) (owner: Hashar)
[10:05:38] (CR) DCausse: [C: +1] cirrus: Increase commonswiki near match weight [mediawiki-config] - https://gerrit.wikimedia.org/r/580394 (https://phabricator.wikimedia.org/T245642) (owner: EBernhardson)
[10:08:45] (CR) 20after4: [C: +1] zuul: provision the scap repository [puppet] - https://gerrit.wikimedia.org/r/579587 (https://phabricator.wikimedia.org/T215458) (owner: Hashar)
[10:09:33] twentyafterfour: hi, the train conducting kind of delayed my effort on that zuul/scap project unfortunately. But yeah that puppet patch is a good base to start with ;]
[10:12:15] hashar: I created this early today, which looks related to yesterday's deployment: https://phabricator.wikimedia.org/T248147
[10:12:39] 2 million warnings in 24 hours!!!
[10:12:48] :-(
[10:12:52] I haven't seen them in logstash last night! :-\
[10:13:07] !log repooling wdqs1006
[10:13:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:14:04] maybe because they are not in the mediawiki log bucket
[10:18:47] marostegui: I have hinted about dropping the logging warning.
But I can't take care of it, I am doing school with kids :\
[10:20:06] hashar: Sure, I don't think it is super urgent, it is just annoying and something to take care of soon, but it can wait for AaronSchulz for sure
[10:22:24] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job={icinga,swagger_check_citoid_cluster_eqiad} site={codfw,eqiad} https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[10:24:26] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[10:26:12] (CR) Hnowlan: [C: +2] calico: add changeprop access to varnish multicast address [deployment-charts] - https://gerrit.wikimedia.org/r/581667 (https://phabricator.wikimedia.org/T213193) (owner: Hnowlan)
[10:26:29] (Merged) jenkins-bot: calico: add changeprop access to varnish multicast address [deployment-charts] - https://gerrit.wikimedia.org/r/581667 (https://phabricator.wikimedia.org/T213193) (owner: Hnowlan)
[10:26:45] marostegui: or I can give it a try this afternoon yes. Thx for the task!
[10:29:05] !log hnowlan@deploy1001 helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
[10:29:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:29:37] hashar: thank you!
[10:29:48] hashar: If the tags need changing, please feel free
[10:31:07] Operations, Wikimedia-Mailing-lists: Please decom reading-wmf mailing list - https://phabricator.wikimedia.org/T248126 (dr0ptp4kt) @Quiddity noted that archival would be appreciated. I'm not familiar with how that works. Can anyone speak to it? This is an internal mailing list and while it's fine if WMF...
[10:34:11] !log hnowlan@deploy1001 helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
[10:34:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:45:17] <_joe_> hnowlan: is it working? :)
[10:45:28] <_joe_> an in deploying
[10:45:43] <_joe_> not asking about actually functioning :P
[10:48:16] (CR) Jbond: "looks good however some of theses groups still appear in hiera." [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[10:49:58] (CR) Elukey: "> looks good however some of theses groups still appear in hiera." [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[10:53:33] Operations, Puppet, User-jbond: Add a deprecated flag to admin groups - https://phabricator.wikimedia.org/T248161 (jbond)
[10:53:43] Operations, Puppet, User-jbond: Add a deprecated flag to admin groups - https://phabricator.wikimedia.org/T248161 (jbond) p:Triage→Low
[10:54:00] _joe_: it's deploying for sure, working not so much :)
[10:54:11] (PS2) Elukey: admin: refactor analytics-related groups and add documentation [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578)
[10:54:13] <_joe_> that's a first step though :D
[10:54:45] (CR) Jbond: "> For the flag if it is a easy thing yes, but I wouldn't spend to much time on it (I hope that people seeing "Deprecated" in data.yaml wil" [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[10:56:08] !log hnowlan@deploy1001 helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
[10:56:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:56:44] !log hnowlan@deploy1001 helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
[10:56:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:58:20] (CR) Jbond: "see inline" (2 comments) [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[10:59:41] (CR) Arturo Borrero Gonzalez: [C: +1] "This can be safely merged from the Toolforge PoV. Thanks!" [puppet] - https://gerrit.wikimedia.org/r/524731 (owner: Muehlenhoff)
[11:00:53] (CR) Elukey: admin: refactor analytics-related groups and add documentation (2 comments) [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[11:02:02] (CR) Alexandros Kosiaris: [C: -1] "Thanks!" [puppet] - https://gerrit.wikimedia.org/r/524731 (owner: Muehlenhoff)
[11:02:06] (CR) Alexandros Kosiaris: [C: +2] profile::docker::engine: Remove support for jessie [puppet] - https://gerrit.wikimedia.org/r/524731 (owner: Muehlenhoff)
[11:03:00] (CR) Elukey: [C: -1] ">" (2 comments) [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[11:07:06] (CR) Jbond: [C: +2] profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/574020 (https://phabricator.wikimedia.org/T240941) (owner: Jbond)
[11:10:41] !log upload oozie 4.3.0-2 packages to thirdparty/bigtop14 on wikimedia-stretch
[11:10:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:12:50] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job={icinga,idp} site={codfw,eqiad} https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable
https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[11:14:16] (PS2) Muehlenhoff: postgres: Remove support for jessie [puppet] - https://gerrit.wikimedia.org/r/581985
[11:15:19] (CR) jerkins-bot: [V: -1] postgres: Remove support for jessie [puppet] - https://gerrit.wikimedia.org/r/581985 (owner: Muehlenhoff)
[11:17:39] (PS1) Andrew-WMDE: TwoColConflict: Limited default deployment [mediawiki-config] - https://gerrit.wikimedia.org/r/581991 (https://phabricator.wikimedia.org/T244863)
[11:22:46] (PS1) Jbond: Revert "profile::idp: update profile to use tlsproxy::envoy" [puppet] - https://gerrit.wikimedia.org/r/581992
[11:26:05] (CR) Muehlenhoff: [C: +1] Revert "profile::idp: update profile to use tlsproxy::envoy" [puppet] - https://gerrit.wikimedia.org/r/581992 (owner: Jbond)
[11:27:25] (CR) Jbond: [C: +2] Revert "profile::idp: update profile to use tlsproxy::envoy" [puppet] - https://gerrit.wikimedia.org/r/581992 (owner: Jbond)
[11:30:55] (PS1) Volans: Add compiled Python files to gitignore [software] - https://gerrit.wikimedia.org/r/581993
[11:30:57] (PS1) Volans: Relax max-line-length for flake8 [software] - https://gerrit.wikimedia.org/r/581994
[11:30:59] (PS1) Volans: Add tool to validate RPKI invalid prefixes [software] - https://gerrit.wikimedia.org/r/581995
[11:31:01] (PS1) Jbond: profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941)
[11:31:44] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds.
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[11:32:04] (Abandoned) Elukey: admin: refactor analytics-related groups and add documentation [puppet] - https://gerrit.wikimedia.org/r/581983 (https://phabricator.wikimedia.org/T246578) (owner: Elukey)
[11:32:23] (PS2) Jbond: profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941)
[11:35:02] (CR) jerkins-bot: [V: -1] profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941) (owner: Jbond)
[11:37:31] (PS3) Jbond: profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941)
[11:38:24] (PS2) Cparle: Enable WikibaseQualityConstraints on commons [mediawiki-config] - https://gerrit.wikimedia.org/r/581678 (https://phabricator.wikimedia.org/T248117)
[11:39:21] (CR) Jbond: [C: +2] pick_nodes: add ability to pick nodes based on a puppet class (1 comment) [software/puppet-compiler] - https://gerrit.wikimedia.org/r/579579 (https://phabricator.wikimedia.org/T245288) (owner: Jbond)
[11:39:24] (CR) jerkins-bot: [V: -1] Enable WikibaseQualityConstraints on commons [mediawiki-config] - https://gerrit.wikimedia.org/r/581678 (https://phabricator.wikimedia.org/T248117) (owner: Cparle)
[11:39:48] (PS1) Hnowlan: changeprop: Release new version [deployment-charts] - https://gerrit.wikimedia.org/r/581997 (https://phabricator.wikimedia.org/T213193)
[11:42:45] (PS1) Jbond: 0.7.0: prepare release [software/puppet-compiler] - https://gerrit.wikimedia.org/r/581998
[11:46:08] PROBLEM - Check no envoy runtime configuration is left persistent on idp1001 is CRITICAL: connect to address 127.0.0.1 and port 9631: Connection refused
https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23Envoy
[11:46:18] PROBLEM - Check that envoy is running on idp1001 is CRITICAL: CRITICAL - Expecting active but unit envoyproxy.service is inactive https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23Envoy
[11:46:25] ^fixing
[11:48:19] (CR) Muehlenhoff: [C: +1] "Looks good to me" [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941) (owner: Jbond)
[11:50:21] (CR) Jbond: [C: +2] 0.7.0: prepare release [software/puppet-compiler] - https://gerrit.wikimedia.org/r/581998 (owner: Jbond)
[11:51:33] (CR) Hnowlan: [C: +2] changeprop: Release new version [deployment-charts] - https://gerrit.wikimedia.org/r/581997 (https://phabricator.wikimedia.org/T213193) (owner: Hnowlan)
[11:51:51] (Merged) jenkins-bot: changeprop: Release new version [deployment-charts] - https://gerrit.wikimedia.org/r/581997 (https://phabricator.wikimedia.org/T213193) (owner: Hnowlan)
[11:52:57] !log hnowlan@deploy1001 helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
[11:53:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:15] (PS4) Jbond: profile::idp: update profile to use tlsproxy::envoy [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941)
[11:57:09] (CR) Jbond: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941) (owner: Jbond)
[12:08:05] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[12:11:37] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds.
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [12:13:45] 10Operations, 10vm-requests: codfw: 1 VM for builder - https://phabricator.wikimedia.org/T248165 (10MoritzMuehlenhoff) [12:16:28] !log marostegui@cumin1001 dbctl commit (dc=all): 'Decrease db1087, vslow host weight in main, given that the CPU across s8 is now doing a lot better', diff saved to https://phabricator.wikimedia.org/P10741 and previous config saved to /var/cache/conftool/dbconfig/20200320-121628-marostegui.json [12:16:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:21:14] 10Operations, 10LDAP-Access-Requests, 10Patch-For-Review: Request for a ldap account and be added to nda ldap group for PHPCC - https://phabricator.wikimedia.org/T247731 (10darthmon_wmde) >>! In T247731#5986356, @MoritzMuehlenhoff wrote: >> Is is possible to leave it open for now and I will notify you as soo... [12:23:44] (03PS1) 10Giuseppe Lavagetto: profile::services_proxy: refactor to simplify cluster definitions [puppet] - 10https://gerrit.wikimedia.org/r/582008 [12:27:57] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [12:29:32] (03PS2) 10Giuseppe Lavagetto: profile::services_proxy: refactor to simplify cluster definitions [puppet] - 10https://gerrit.wikimedia.org/r/582008 [12:30:05] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [12:32:25] (03PS1) 10Cparle: Constraints fix for beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/582010 (https://phabricator.wikimedia.org/T239939) [12:36:27] (03PS3) 10Giuseppe Lavagetto: profile::services_proxy: refactor to simplify cluster definitions [puppet] - 10https://gerrit.wikimedia.org/r/582008 [12:39:18] (03CR) 10Giuseppe Lavagetto: [C: 03+2] "https://puppet-compiler.wmflabs.org/compiler1003/21511/mw1261.eqiad.wmnet/ this is a noop. I am merging it on the basis it's a large reduc" [puppet] - 10https://gerrit.wikimedia.org/r/582008 (owner: 10Giuseppe Lavagetto) [12:41:30] (03Abandoned) 10Cparle: Constraints fix for beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/582010 (https://phabricator.wikimedia.org/T239939) (owner: 10Cparle) [12:47:31] 10Operations, 10cloud-services-team (Kanban): Migrate remaining self-hosted puppet masters to Puppet 5 / facter 3 - https://phabricator.wikimedia.org/T241719 (10Paladox) [12:56:17] 10Operations, 10Commons, 10SRE-swift-storage: Big number of uploads from DPLA bot - https://phabricator.wikimedia.org/T248151 (10Dominicbm) Hi, this is me! 😳If it's easier, I can get on Telegram or IRC to chat with you about my project. Obviously, I've been going at a high rate, but I don't really want to br... 
[12:58:11] 10Operations: Upgrade Puppet to 5.5.19 - https://phabricator.wikimedia.org/T248168 (10MoritzMuehlenhoff) [13:05:54] 10Puppet, 10puppet-compiler, 10User-jbond: puppet populate failing on some nodes - https://phabricator.wikimedia.org/T248169 (10jbond) [13:10:48] (03PS1) 10Hnowlan: calico: Fix rdb1009 IP address [deployment-charts] - 10https://gerrit.wikimedia.org/r/582018 (https://phabricator.wikimedia.org/T213193) [13:12:58] (03PS1) 10Muehlenhoff: New ferm sub profile for public mysql proxies [puppet] - 10https://gerrit.wikimedia.org/r/582020 [13:13:51] (03CR) 10Vgutierrez: [C: 03+1] profile::idp: update profile to use tlsproxy::envoy [puppet] - 10https://gerrit.wikimedia.org/r/581996 (https://phabricator.wikimedia.org/T240941) (owner: 10Jbond) [13:14:02] (03CR) 10jerkins-bot: [V: 04-1] New ferm sub profile for public mysql proxies [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [13:18:32] (03PS2) 10Muehlenhoff: New ferm sub profile for public mysql proxies [puppet] - 10https://gerrit.wikimedia.org/r/582020 [13:19:19] (03CR) 10jerkins-bot: [V: 04-1] New ferm sub profile for public mysql proxies [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [13:21:19] 10Operations, 10observability: Upgrade Grafana to 6.7 - https://phabricator.wikimedia.org/T244208 (10fgiunchedi) [13:23:49] (03PS3) 10Muehlenhoff: New ferm sub profile for public mysql proxies [puppet] - 10https://gerrit.wikimedia.org/r/582020 [13:25:34] 10Operations, 10fundraising-tech-ops, 10observability, 10Patch-For-Review, 10User-fgiunchedi: Icinga latency is skyrocketing and commands ignored - https://phabricator.wikimedia.org/T247538 (10fgiunchedi) [13:28:41] 10Operations, 10Commons, 10SRE-swift-storage: Big number of uploads from DPLA bot - https://phabricator.wikimedia.org/T248151 (10fgiunchedi) >>! In T248151#5986783, @Dominicbm wrote: > Hi, this is me! 😳If it's easier, I can get on Telegram or IRC to chat with you about my project. 
Obviously, I've been going... [13:31:44] 10Operations, 10Patch-For-Review, 10User-fgiunchedi: smartd not starting properly on gen9 + buster - https://phabricator.wikimedia.org/T246997 (10fgiunchedi) [13:31:56] (03CR) 10Alexandros Kosiaris: [C: 03+2] calico: Fix rdb1009 IP address [deployment-charts] - 10https://gerrit.wikimedia.org/r/582018 (https://phabricator.wikimedia.org/T213193) (owner: 10Hnowlan) [13:32:14] (03Merged) 10jenkins-bot: calico: Fix rdb1009 IP address [deployment-charts] - 10https://gerrit.wikimedia.org/r/582018 (https://phabricator.wikimedia.org/T213193) (owner: 10Hnowlan) [13:33:18] !log hnowlan@deploy1001 helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [13:33:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:33:50] !log hnowlan@deploy1001 helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [13:33:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:34:15] !log hnowlan@deploy1001 helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . 
[13:34:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:34:21] (03CR) 10Muehlenhoff: "I'd like a comment/clarification/confirmation from someone more familiar with the setup on the designated 3306 endpoints on these proxies:" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [13:46:37] (03CR) 10Marostegui: "So yeah, we need dbproxies to be able to accept traffic to 3306 from anywhere (anywhere meaning internal wmnet and wmflabs and quarry, whi" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [13:49:15] (03PS11) 10Arturo Borrero Gonzalez: toolforge: support canonical redirects in urlproxy [puppet] - 10https://gerrit.wikimedia.org/r/579952 (https://phabricator.wikimedia.org/T234617) [13:49:28] 10Operations, 10cloud-services-team (Kanban): Migrate remaining self-hosted puppet masters to Puppet 5 / facter 3 - https://phabricator.wikimedia.org/T241719 (10JHedden) [13:59:28] 10Operations, 10observability, 10User-CDanis: Upgrade Grafana to 6.7 - https://phabricator.wikimedia.org/T244208 (10CDanis) [13:59:44] 10Puppet, 10puppet-compiler, 10User-jbond: puppet populate failing on some nodes - https://phabricator.wikimedia.org/T248169 (10jbond) I think the reason that it works on some nodes and not others is because the puppetdb contains trusted facts. My working theory is that if you successfully compile a host th... 
[14:06:01] 10Operations, 10cloud-services-team (Kanban): Migrate remaining self-hosted puppet masters to Puppet 5 / facter 3 - https://phabricator.wikimedia.org/T241719 (10JHedden) [14:12:12] 10Operations, 10cloud-services-team (Kanban): Migrate remaining self-hosted puppet masters to Puppet 5 / facter 3 - https://phabricator.wikimedia.org/T241719 (10JHedden) [14:15:58] (03PS4) 10Alexandros Kosiaris: admin: Deduplicate defaults.yaml [deployment-charts] - 10https://gerrit.wikimedia.org/r/581507 [14:16:00] (03PS2) 10Alexandros Kosiaris: admin: deduplicate main helmfile.yaml [deployment-charts] - 10https://gerrit.wikimedia.org/r/581656 [14:16:02] (03PS2) 10Alexandros Kosiaris: admin/namespace: Deduplicate all helmfile templates [deployment-charts] - 10https://gerrit.wikimedia.org/r/581657 [14:16:04] (03PS2) 10Alexandros Kosiaris: admin: Default to sensible values for deploUser, namespaceName [deployment-charts] - 10https://gerrit.wikimedia.org/r/581658 [14:16:06] (03PS2) 10Alexandros Kosiaris: admin: Remove all override files [deployment-charts] - 10https://gerrit.wikimedia.org/r/581748 [14:18:31] (03PS1) 10Reedy: Enforce upload ratelimits on commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/582046 [14:21:22] (03PS1) 10Jbond: realm.pp: fact the trusted certname fact if performing a lookup or pcc [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) [14:23:06] (03PS2) 10Jbond: realm.pp: fact the trusted certname fact if performing a lookup or pcc [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) [14:23:20] (03CR) 10Andrew Bogott: [C: 03+1] toolserver: refactor into profile and move off "toollabs" name [puppet] - 10https://gerrit.wikimedia.org/r/581654 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [14:27:10] 10Puppet, 10puppet-compiler, 10Patch-For-Review, 10User-jbond: puppet populate failing on some nodes - https://phabricator.wikimedia.org/T248169 
(10jbond) looks like `puppet lookup` has [[ https://github.com/puppetlabs/puppet/blob/master/lib/puppet/application/lookup.rb#L48-L49 | plans ]] to support this a... [14:35:26] (03CR) 10Jbond: "labs pcc: https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/21512" [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) (owner: 10Jbond) [14:35:54] 10Operations, 10Commons: Enforce upload rate limits for bots on commons - https://phabricator.wikimedia.org/T248177 (10fgiunchedi) [14:40:44] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] TwoColConflict: Limited default deployment [mediawiki-config] - 10https://gerrit.wikimedia.org/r/581991 (https://phabricator.wikimedia.org/T244863) (owner: 10Andrew-WMDE) [14:42:09] (03CR) 10Jcrespo: "Idea is right, implementation is improvable- this needs to be a define that can be used multiple times with different port number, like th" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [14:42:26] (03CR) 10Andrew Bogott: "What will $_trusted be used for? If it's going to block access to the private repo maybe that change should be in the same patch?" [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) (owner: 10Jbond) [14:47:23] (03CR) 10Andrew Bogott: [C: 03+1] realm.pp: fact the trusted certname fact if performing a lookup or pcc [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) (owner: 10Jbond) [14:50:55] 10Operations, 10Analytics, 10Traffic: Create replacement for Varnishkafka - https://phabricator.wikimedia.org/T237993 (10ema) [14:50:57] 10Operations, 10Analytics, 10Traffic, 10Patch-For-Review: Test atskafka deployment - https://phabricator.wikimedia.org/T247497 (10ema) 05Open→03Resolved a:03ema We do have an atskafka instance currently running in production on cp3050. This task can be considered now done, further improvements to at... 
[14:51:14] (03PS1) 10Ema: Reopen Unix socket after upon errors [software/atskafka] - 10https://gerrit.wikimedia.org/r/582060 (https://phabricator.wikimedia.org/T237993) [14:51:16] (03PS3) 10Jbond: realm.pp: trusted facts unavailable when performing a lookup or pcc [puppet] - 10https://gerrit.wikimedia.org/r/582048 (https://phabricator.wikimedia.org/T248169) [14:51:18] (03CR) 10Bstorm: [C: 03+2] toolserver: refactor into profile and move off "toollabs" name [puppet] - 10https://gerrit.wikimedia.org/r/581654 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [14:52:16] (03PS2) 10Ema: Reopen Unix socket upon read errors [software/atskafka] - 10https://gerrit.wikimedia.org/r/582060 (https://phabricator.wikimedia.org/T237993) [14:52:54] (03CR) 10Vgutierrez: [C: 03+1] Reopen Unix socket upon read errors [software/atskafka] - 10https://gerrit.wikimedia.org/r/582060 (https://phabricator.wikimedia.org/T237993) (owner: 10Ema) [14:53:10] (03CR) 10Jforrester: "Is this for a particular issue, or just clean-up?" 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/582046 (owner: 10Reedy) [14:54:30] (03CR) 10Reedy: "From a discussion in #wikimedia-sre about https://phabricator.wikimedia.org/T248151 and then Filippo filed https://phabricator.wikimedia.o" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/582046 (owner: 10Reedy) [14:58:58] 10Operations, 10cloud-services-team (Kanban): Migrate remaining self-hosted puppet masters to Puppet 5 / facter 3 - https://phabricator.wikimedia.org/T241719 (10JHedden) [15:00:25] (03PS1) 10CDanis: Emergency documentation & rm possibly-confusing esams mentions [dns] - 10https://gerrit.wikimedia.org/r/582063 [15:00:51] (03PS2) 10Reedy: Enforce upload ratelimits on commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/582046 (https://phabricator.wikimedia.org/T248177) [15:00:56] 10Operations, 10Commons, 10SRE-swift-storage: Big number of uploads from DPLA bot - https://phabricator.wikimedia.org/T248151 (10fgiunchedi) Summary of the IRC chat: the current batch of uploads is about halfway finished and will likely be done by early next week, although no byte size estimates are availabl... [15:02:12] 10Operations, 10Commons, 10Patch-For-Review: Enforce upload rate limits for bots on commons - https://phabricator.wikimedia.org/T248177 (10Umherirrender) Bots have `noratelimit` rights. With this right there is no way to have the upload action ratelimited, but edit actions not. https://commons.wikimedia.org... [15:04:42] 10Operations, 10Commons, 10Patch-For-Review: Enforce upload rate limits for bots on commons - https://phabricator.wikimedia.org/T248177 (10Reedy) >>! In T248177#5987180, @Umherirrender wrote: > Bots have `noratelimit` rights. With this right there is no way to have the upload action ratelimited, but edit act... [15:04:52] 10Operations, 10Commons, 10Patch-For-Review: Enforce upload rate limits for bots on commons - https://phabricator.wikimedia.org/T248177 (10Umherirrender) >>! 
In T248177#5987180, @Umherirrender wrote: > Bots have `noratelimit` rights. With this right there is no way to have the upload action ratelimited, but... [15:05:29] (03PS1) 10Elukey: admin: add more documentation to analytics posix groups [puppet] - 10https://gerrit.wikimedia.org/r/582064 [15:06:22] (03PS2) 10Elukey: admin: add more documentation to analytics posix groups [puppet] - 10https://gerrit.wikimedia.org/r/582064 (https://phabricator.wikimedia.org/T246578) [15:09:56] 10Operations, 10ops-esams, 10netops: 2*10G optics down on cr2-esams - https://phabricator.wikimedia.org/T245520 (10ayounsi) JTAC's next step is to reboot the FPC... I asked them if there was any other less intrusive ways to test first. [15:10:03] (03PS2) 10CDanis: Emergency documentation & rm possibly-confusing esams mentions [dns] - 10https://gerrit.wikimedia.org/r/582063 [15:11:13] (03PS1) 10Bstorm: toolforge: remove the last toollabs role [puppet] - 10https://gerrit.wikimedia.org/r/582065 (https://phabricator.wikimedia.org/T246689) [15:11:33] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [15:11:40] (03PS1) 10Elukey: admin: deprecate piwik-roots [puppet] - 10https://gerrit.wikimedia.org/r/582066 (https://phabricator.wikimedia.org/T246578) [15:12:10] (03CR) 10Ayounsi: [C: 03+1] "overall +1 but not a fan of having the data in several places, as one will eventually be more outdated than the other." [dns] - 10https://gerrit.wikimedia.org/r/582063 (owner: 10CDanis) [15:13:37] AaronSchulz: anomie: I am going to cherry pick / deploy the ActorMigration log hotfix [15:13:41] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. 
https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [15:13:48] it is rather straightforward [15:14:24] (03PS1) 10Elukey: admin: flag notebook-roots as deprecated [puppet] - 10https://gerrit.wikimedia.org/r/582068 (https://phabricator.wikimedia.org/T246578) [15:14:56] hashar: I'll stand by for pings in case my assistance is needed. [15:14:58] (03CR) 10Jcrespo: "Let's also call the class something different from ferm_proxy- the misc "main" proxies have a different rule applied to them, let's call t" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [15:16:12] (03CR) 10CDanis: [C: 03+2] "> Patch Set 2: Code-Review+1" [dns] - 10https://gerrit.wikimedia.org/r/582063 (owner: 10CDanis) [15:16:24] anomie: you deserve a medal sir! [15:16:46] (03PS1) 10Elukey: admin: deprecate the eventlogging-roots group [puppet] - 10https://gerrit.wikimedia.org/r/582070 (https://phabricator.wikimedia.org/T246578) [15:18:05] (03PS6) 10Andrew Bogott: designate: move policy.json to yaml [puppet] - 10https://gerrit.wikimedia.org/r/580139 (https://phabricator.wikimedia.org/T247795) [15:19:40] (03CR) 10Filippo Giunchedi: [C: 03+1] profile: set icinga exporter scrape_timeout to 20s [puppet] - 10https://gerrit.wikimedia.org/r/581762 (https://phabricator.wikimedia.org/T248131) (owner: 10Cwhite) [15:20:03] (03CR) 10Andrew Bogott: [C: 03+2] designate: move policy.json to yaml [puppet] - 10https://gerrit.wikimedia.org/r/580139 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [15:21:11] (03PS1) 10Elukey: admin: use the *analytics_admins_members placeholder when possible [puppet] - 10https://gerrit.wikimedia.org/r/582071 (https://phabricator.wikimedia.org/T246578) [15:22:03] PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=icinga site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable 
https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [15:22:28] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission, 10Patch-For-Review: Decommission dbproxy1006.eqiad.wmnet - https://phabricator.wikimedia.org/T233207 (10Papaul) [15:22:42] (03PS1) 10Elukey: admin: deprecate aqs-users [puppet] - 10https://gerrit.wikimedia.org/r/582072 (https://phabricator.wikimedia.org/T246578) [15:22:43] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission, 10Patch-For-Review: Decommission dbproxy1006.eqiad.wmnet - https://phabricator.wikimedia.org/T233207 (10Papaul) 05Open→03Resolved Complete [15:24:07] RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets [15:24:38] (03CR) 10Cwhite: [C: 03+2] profile: set icinga exporter scrape_timeout to 20s [puppet] - 10https://gerrit.wikimedia.org/r/581762 (https://phabricator.wikimedia.org/T248131) (owner: 10Cwhite) [15:25:24] (03PS1) 10Papaul: DNS: remove mgmt asset tag for dbproxy1007 [dns] - 10https://gerrit.wikimedia.org/r/582073 [15:26:00] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1061.eqiad.wmnet - https://phabricator.wikimedia.org/T238624 (10Papaul) [15:26:15] 10Operations, 10DBA: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (10Papaul) [15:26:17] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1061.eqiad.wmnet - https://phabricator.wikimedia.org/T238624 (10Papaul) 05Open→03Resolved complete [15:26:20] (03PS2) 10Elukey: admin: refactor eventlogging-related groups [puppet] - 10https://gerrit.wikimedia.org/r/582070 (https://phabricator.wikimedia.org/T246578) [15:26:22] (03PS2) 10Elukey: admin: use the *analytics_admins_members placeholder when possible [puppet] - 10https://gerrit.wikimedia.org/r/582071 (https://phabricator.wikimedia.org/T246578) [15:26:24] (03PS2) 10Elukey: admin: 
deprecate aqs-users [puppet] - 10https://gerrit.wikimedia.org/r/582072 (https://phabricator.wikimedia.org/T246578) [15:26:57] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1062.eqiad.wmnet - https://phabricator.wikimedia.org/T239188 (10Papaul) [15:27:12] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1062.eqiad.wmnet - https://phabricator.wikimedia.org/T239188 (10Papaul) 05Open→03Resolved complete [15:27:35] (03CR) 10Bstorm: [C: 03+2] toolforge: remove the last toollabs role [puppet] - 10https://gerrit.wikimedia.org/r/582065 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [15:27:47] (03CR) 10Bstorm: [C: 03+2] "this is already removed from the instance it related to." [puppet] - 10https://gerrit.wikimedia.org/r/582065 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [15:27:58] (03CR) 10Daniel Kinzler: Add configuration variable $wgEnableRestAPIDevelopmentEndpoints (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/581886 (https://phabricator.wikimedia.org/T247997) (owner: 10BPirkle) [15:28:39] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1066.eqiad.wmnet - https://phabricator.wikimedia.org/T233071 (10Papaul) [15:28:52] (03CR) 10Marostegui: "Agreed with Jaime's proposals. We might need different ports in the future indeed." 
[puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [15:28:54] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: Decommission db1066.eqiad.wmnet - https://phabricator.wikimedia.org/T233071 (10Papaul) 05Open→03Resolved complete [15:28:56] 10Operations, 10DBA: Decommission db1061-db1073 - https://phabricator.wikimedia.org/T217396 (10Papaul) [15:29:41] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decommission db1067.eqiad.wmnet - https://phabricator.wikimedia.org/T238297 (10Papaul) [15:29:59] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decommission db1067.eqiad.wmnet - https://phabricator.wikimedia.org/T238297 (10Papaul) 05Open→03Resolved complete [15:31:10] (03CR) 10Alexandros Kosiaris: [C: 03+1] prometheus: add mediawiki recording rules [puppet] - 10https://gerrit.wikimedia.org/r/580888 (owner: 10Filippo Giunchedi) [15:32:50] (03PS4) 10Herron: ELk7: add curator job to require disktype hdd after 7 days [puppet] - 10https://gerrit.wikimedia.org/r/579422 (https://phabricator.wikimedia.org/T247376) [15:37:15] (03CR) 10Papaul: [C: 03+2] DNS: remove mgmt asset tag for dbproxy1007 [dns] - 10https://gerrit.wikimedia.org/r/582073 (owner: 10Papaul) [15:37:22] (03PS2) 10Papaul: DNS: remove mgmt asset tag for dbproxy1007 [dns] - 10https://gerrit.wikimedia.org/r/582073 [15:37:25] (03CR) 10Papaul: [V: 03+2 C: 03+2] DNS: remove mgmt asset tag for dbproxy1007 [dns] - 10https://gerrit.wikimedia.org/r/582073 (owner: 10Papaul) [15:37:54] hotfix deployment [15:42:13] (03PS1) 10Bstorm: k8s: purge flannel from the environment [puppet] - 10https://gerrit.wikimedia.org/r/582090 (https://phabricator.wikimedia.org/T246689) [15:44:06] tested it on mw1298 [15:44:50] !log hashar@deploy1001 Synchronized php-1.35.0-wmf.24/includes/ActorMigration.php: Avoid upsert() log warning spam in ActorMigration due to unique key array format - T248147 (duration: 01m 01s) [15:44:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:44:58] 
T248147: Wikimedia\Rdbms\Database::normalizeUpsertKeys called with deprecated parameter style: the unique key array should be a string or array of string arrays generating 2 million warnings in 24 hours - https://phabricator.wikimedia.org/T248147 [15:45:49] 10Operations, 10Analytics, 10Research, 10Traffic, and 2 others: Enable layered data-access and sharing for a new form of collaboration - https://phabricator.wikimedia.org/T245833 (10elukey) I had a chat with Miriam about this: - The pageview granularity request from Sukhbir should be handled as separate t... [15:46:59] (03CR) 10Bstorm: "PCC on a physical host looks good https://puppet-compiler.wmflabs.org/compiler1001/21513/" [puppet] - 10https://gerrit.wikimedia.org/r/582090 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [15:48:34] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission, 10Patch-For-Review: decommission dbproxy1007.eqiad.wmnet - https://phabricator.wikimedia.org/T245385 (10Papaul) [15:52:50] (03CR) 10Bstorm: "Some cloud k8s hosts. I wonder if there's anything for CI I need to check. 
https://puppet-compiler.wmflabs.org/compiler1001/21514/" [puppet] - 10https://gerrit.wikimedia.org/r/582090 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [15:53:38] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decommission auth1001 - https://phabricator.wikimedia.org/T234909 (10Papaul) [15:54:52] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decommission auth1001 - https://phabricator.wikimedia.org/T234909 (10Papaul) 05Open→03Resolved complete [15:58:20] 10Operations, 10ops-eqiad, 10decommission: Decommission neodymium - https://phabricator.wikimedia.org/T220503 (10Papaul) [15:58:34] 10Operations, 10ops-eqiad, 10decommission: Decommission neodymium - https://phabricator.wikimedia.org/T220503 (10Papaul) 05Open→03Resolved Complete [16:08:30] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission, 10fundraising-tech-ops: decommission frav1001.frack.eqiad.wmnet - https://phabricator.wikimedia.org/T222109 (10Papaul) [16:09:34] (03CR) 10Ottomata: [C: 03+1] admin: add more documentation to analytics posix groups [puppet] - 10https://gerrit.wikimedia.org/r/582064 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:09:58] (03CR) 10Ottomata: [C: 03+1] admin: deprecate piwik-roots [puppet] - 10https://gerrit.wikimedia.org/r/582066 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:10:20] (03CR) 10Ottomata: [C: 03+1] admin: flag notebook-roots as deprecated [puppet] - 10https://gerrit.wikimedia.org/r/582068 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:10:27] 10Operations, 10ops-eqiad, 10DC-Ops, 10decommission: decommission auth1001 - https://phabricator.wikimedia.org/T234909 (10Dzahn) This is in netbox as status"offline" but should be "decommissioning", right? 
[16:10:39] (03CR) 10Ottomata: [C: 03+1] admin: refactor eventlogging-related groups [puppet] - 10https://gerrit.wikimedia.org/r/582070 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:11:03] (03CR) 10Ottomata: [C: 03+1] admin: use the *analytics_admins_members placeholder when possible [puppet] - 10https://gerrit.wikimedia.org/r/582071 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:11:25] (03CR) 10Ottomata: [C: 03+1] admin: deprecate aqs-users [puppet] - 10https://gerrit.wikimedia.org/r/582072 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [16:12:05] 10Operations, 10ops-eqiad, 10decommission: Decommission neodymium - https://phabricator.wikimedia.org/T220503 (10Dzahn) This is in netbox as status "offline" but should be "decommissioning", right? [16:25:24] (03PS6) 10Dzahn: add IPv6 records for cescout1001 [dns] - 10https://gerrit.wikimedia.org/r/581068 (https://phabricator.wikimedia.org/T239250) [16:27:58] (03CR) 10Dzahn: [C: 03+2] add IPv6 records for cescout1001 [dns] - 10https://gerrit.wikimedia.org/r/581068 (https://phabricator.wikimedia.org/T239250) (owner: 10Dzahn) [16:29:49] mutante: it should be offline since it has already been unracked [16:31:04] papaul: offline is a step after decommissioning? [16:31:12] somehow i thought it's before [16:31:54] papaul: i am a bit confused because "offline" isn't even a state in https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Server_transitions [16:32:24] i guess it is the same as "unracked" there then [16:32:30] volans: ^ yea? [16:34:10] mutante: after the decom script completes on a host it is set to decommissioning; once onsite has finished wiping the disk and unracked the server it is set to offline [16:35:11] papaul: ok! got it. i think then we should either rename it in wiki or in netbox. ACK!
[16:39:08] (03PS14) 10Dzahn: gerrit: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/563546 [16:42:24] (03CR) 10Hashar: [C: 03+1] "The repository is obsolete. There might be some jobs here and there still relaying on it though. I guess if something is stall we will fi" [puppet] - 10https://gerrit.wikimedia.org/r/561683 (https://phabricator.wikimedia.org/T218900) (owner: 10Dzahn) [16:43:50] 10Operations, 10Commons, 10Wikimedia-Site-requests, 10Patch-For-Review: Enforce upload rate limits for bots on commons - https://phabricator.wikimedia.org/T248177 (10Reedy) [16:43:58] 10Operations, 10Wikimedia-Mailing-lists: Please decom reading-wmf mailing list - https://phabricator.wikimedia.org/T248126 (10Quiddity) For context: I do not expect that any new addresses will ever need to subscribe to the archived-list. I mainly want to confirm that the archives will be available indefinitely... [16:44:05] (03CR) 10Hashar: [C: 03+1] "That is used to "deploy" composer on the CI agents though I don't think any jobs are still relying on it." [puppet] - 10https://gerrit.wikimedia.org/r/561684 (https://phabricator.wikimedia.org/T218900) (owner: 10Dzahn) [16:52:57] 10Operations, 10Wikimedia-Mailing-lists: Please decom reading-wmf mailing list - https://phabricator.wikimedia.org/T248126 (10dr0ptp4kt) I agree. Institutional memory is good. What I'd like to do additionally is ensure that people's old passwords don't work beyond their end date of employment for an internal a... 
[16:53:03] 10Operations, 10Product-Infrastructure-Team-Backlog, 10Traffic: Increase in 503 responses since 2020-03-15 - https://phabricator.wikimedia.org/T248132 (10Mholloway) [16:53:58] (03PS3) 10Mstyles: kibana: add kibana to relforge [puppet] - 10https://gerrit.wikimedia.org/r/581111 (https://phabricator.wikimedia.org/T246961) [16:55:51] (03PS15) 10Dzahn: gerrit: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/563546 [16:56:25] (03CR) 10Jcrespo: "> Patch Set 3:" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [16:57:07] mutante: see https://wikitech.wikimedia.org/wiki/Server_Lifecycle#States [16:57:18] (03CR) 10Bstorm: [C: 03+1] "No objections from me. Obviously, if you want to refactor, for the concerns Jaime expressed, I'll take a look again later :)" [puppet] - 10https://gerrit.wikimedia.org/r/582020 (owner: 10Muehlenhoff) [16:59:28] (03PS1) 10Jforrester: scap: Make canary logstash dashboard link more like reality [puppet] - 10https://gerrit.wikimedia.org/r/582113 (https://phabricator.wikimedia.org/T247005) [17:01:18] volans: ah, ok. 
yep, thanks [17:02:19] (03CR) 10Krinkle: [C: 03+1] scap: Make canary logstash dashboard link more like reality [puppet] - 10https://gerrit.wikimedia.org/r/582113 (https://phabricator.wikimedia.org/T247005) (owner: 10Jforrester) [17:07:57] 10Operations, 10Product-Infrastructure-Team-Backlog, 10Traffic: Elevated 503 responses between 2020-03-15 and 2020-03-19 - https://phabricator.wikimedia.org/T248132 (10Mholloway) [17:08:32] (03CR) 10Arturo Borrero Gonzalez: [C: 03+1] k8s: purge flannel from the environment [puppet] - 10https://gerrit.wikimedia.org/r/582090 (https://phabricator.wikimedia.org/T246689) (owner: 10Bstorm) [17:08:58] (03PS1) 10Andrew Bogott: designate policy: restrict zone deletion to admins [puppet] - 10https://gerrit.wikimedia.org/r/582115 [17:10:25] (03CR) 10Andrew Bogott: [C: 03+2] designate policy: restrict zone deletion to admins [puppet] - 10https://gerrit.wikimedia.org/r/582115 (owner: 10Andrew Bogott) [17:11:05] (03CR) 10Urbanecm: [C: 03+1] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/581891 (https://phabricator.wikimedia.org/T248146) (owner: 10DannyS712) [17:17:59] (03CR) 10Dzahn: [C: 03+1] "https://puppet-compiler.wmflabs.org/compiler1003/21518/gerrit1001.wikimedia.org/" [puppet] - 10https://gerrit.wikimedia.org/r/563546 (owner: 10Dzahn) [17:18:13] (03CR) 10Dzahn: [C: 03+2] contint: use ensure=>present when cloning slave scripts [puppet] - 10https://gerrit.wikimedia.org/r/561683 (https://phabricator.wikimedia.org/T218900) (owner: 10Dzahn) [17:18:46] (03CR) 10Dzahn: [C: 03+2] contint: use ensure=>present when cloning composer [puppet] - 10https://gerrit.wikimedia.org/r/561684 (https://phabricator.wikimedia.org/T218900) (owner: 10Dzahn) [17:31:56] (03CR) 10Dzahn: [C: 03+2] gerrit: replace hiera() with lookup() [puppet] - 10https://gerrit.wikimedia.org/r/563546 (owner: 10Dzahn) [17:32:53] (03CR) 10Dzahn: "also: turn the $java_version into actual integers, not strings" [puppet] - 10https://gerrit.wikimedia.org/r/563546 (owner: 
10Dzahn) [17:35:42] (03PS12) 10Arturo Borrero Gonzalez: toolforge: support canonical redirects in urlproxy [puppet] - 10https://gerrit.wikimedia.org/r/579952 (https://phabricator.wikimedia.org/T234617) [17:46:51] (03Abandoned) 10Hashar: package_builder: do not set webproxy by default [puppet] - 10https://gerrit.wikimedia.org/r/579231 (https://phabricator.wikimedia.org/T247496) (owner: 10Hashar) [17:47:31] (03CR) 10Dzahn: "noop in prod" [puppet] - 10https://gerrit.wikimedia.org/r/563546 (owner: 10Dzahn) [17:57:22] (03PS1) 10Andrew Bogott: keystone policy: remove redundant rules [puppet] - 10https://gerrit.wikimedia.org/r/582118 (https://phabricator.wikimedia.org/T247795) [17:57:52] 10Operations, 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team-TODO, 10Release-Engineering-Team (CI & Testing services): debian-glue jobs ignored error messages about libeatmydata.so in LD_PRELOAD - https://phabricator.wikimedia.org/T240430 (10Krinkle) p:05Triage→03Low Triaging as lo... [18:00:32] (03PS1) 10Andrew Bogott: nova policy: add some comments [puppet] - 10https://gerrit.wikimedia.org/r/582122 (https://phabricator.wikimedia.org/T247795) [18:16:06] (03PS2) 10Andrew Bogott: keystone policy: remove redundant rules [puppet] - 10https://gerrit.wikimedia.org/r/582118 (https://phabricator.wikimedia.org/T247795) [18:16:08] (03PS2) 10Andrew Bogott: nova policy: add some comments [puppet] - 10https://gerrit.wikimedia.org/r/582122 (https://phabricator.wikimedia.org/T247795) [18:16:10] (03PS1) 10Andrew Bogott: neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) [18:20:02] (03CR) 10Ayounsi: [C: 03+1] Add tool to validate RPKI invalid prefixes (033 comments) [software] - 10https://gerrit.wikimedia.org/r/581995 (owner: 10Volans) [18:28:35] (03CR) 10Volans: "thanks, replies inline, will fix soon" (033 comments) [software] - 10https://gerrit.wikimedia.org/r/581995 (owner: 10Volans) 
[18:31:04] (03CR) 10Hashar: "Daniel pointed we would need a ssh key pair in the keyholder. I could not find how to configure that in puppet though." [puppet] - 10https://gerrit.wikimedia.org/r/579587 (https://phabricator.wikimedia.org/T215458) (owner: 10Hashar) [18:41:03] (03PS2) 10Thcipriani: Integration Cluster: update gitcache nightly [puppet] - 10https://gerrit.wikimedia.org/r/579602 [19:02:08] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/582070 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:05:21] (03CR) 10Jbond: "lgtm but seems there is still a resource using this group" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/582064 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:06:34] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/582066 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:07:38] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/582068 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:09:33] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/582071 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:11:17] (03CR) 10Jbond: [C: 03+1] "lgtm, thanks for all the clean up :)" [puppet] - 10https://gerrit.wikimedia.org/r/582072 (https://phabricator.wikimedia.org/T246578) (owner: 10Elukey) [19:13:51] (03CR) 10Jhedden: [C: 03+1] "nice change!" 
[puppet] - 10https://gerrit.wikimedia.org/r/582118 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:14:51] (03PS1) 10Dzahn: bastionhost: replace auth1001 with auth1002 in pam-sshd config [puppet] - 10https://gerrit.wikimedia.org/r/582147 (https://phabricator.wikimedia.org/T234909) [19:16:33] (03CR) 10Jhedden: "This will need '[oslo_policy]\n policy_file = policy.yaml` in neutron.conf" [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:24:16] (03CR) 10Andrew Bogott: [C: 03+2] keystone policy: remove redundant rules [puppet] - 10https://gerrit.wikimedia.org/r/582118 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:24:27] (03CR) 10Andrew Bogott: [C: 03+2] nova policy: add some comments [puppet] - 10https://gerrit.wikimedia.org/r/582122 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:25:22] (03PS2) 10Andrew Bogott: neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) [19:27:32] (03CR) 10Andrew Bogott: "Mercifully, this is the last chance I had to forget the .conf change" [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:33:55] (03CR) 10Jhedden: [C: 03+1] neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:34:53] (03CR) 10Andrew Bogott: [C: 03+2] neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582127 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [19:44:15] (03PS1) 10Krinkle: scap: Sync logstash_checker.py canary query with current dashboard [puppet] - 10https://gerrit.wikimedia.org/r/582153 (https://phabricator.wikimedia.org/T247113) [19:44:20] (03PS1) 10Andrew Bogott: Revert "neutron: 
replace policy.json with policy.yaml" [puppet] - 10https://gerrit.wikimedia.org/r/582154 [19:46:35] (03CR) 10Andrew Bogott: [C: 03+2] Revert "neutron: replace policy.json with policy.yaml" [puppet] - 10https://gerrit.wikimedia.org/r/582154 (owner: 10Andrew Bogott) [19:46:37] PROBLEM - nova instance creation test on cloudcontrol1003 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [19:50:51] RECOVERY - nova instance creation test on cloudcontrol1003 is OK: PROCS OK: 1 process with command name python, args nova-fullstack https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [19:52:10] (03PS1) 10Dzahn: site/conftool: decom another batch of old appservers [puppet] - 10https://gerrit.wikimedia.org/r/582160 [19:57:14] (03CR) 10Jforrester: [C: 03+1] scap: Sync logstash_checker.py canary query with current dashboard [puppet] - 10https://gerrit.wikimedia.org/r/582153 (https://phabricator.wikimedia.org/T247113) (owner: 10Krinkle) [20:02:52] (03PS1) 10Andrew Bogott: neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582161 (https://phabricator.wikimedia.org/T247795) [20:05:04] (03CR) 10Andrew Bogott: [C: 03+2] neutron: replace policy.json with policy.yaml [puppet] - 10https://gerrit.wikimedia.org/r/582161 (https://phabricator.wikimedia.org/T247795) (owner: 10Andrew Bogott) [20:18:13] (03PS2) 10Dzahn: site/conftool: decom mw1244-mw1249 and mw1227-mw1231 [puppet] - 10https://gerrit.wikimedia.org/r/582160 (https://phabricator.wikimedia.org/T247780) [20:18:23] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw122[7-9].eqiad.wmnet [20:18:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:18:35] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw123[0-1].eqiad.wmnet [20:18:38] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:18:52] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw124[4-9].eqiad.wmnet [20:18:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:32:25] !log dzahn@cumin1001 conftool action : set/pooled=inactive; selector: name=mw122[7-9].eqiad.wmnet [20:32:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:54] !log dzahn@cumin1001 conftool action : set/pooled=inactive; selector: name=mw123[0-1].eqiad.wmnet [20:37:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:39:56] !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime [20:39:58] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [20:40:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:03] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 3 host(s) and their services with reason: decom ` mw[1227-1229].eqiad.wmnet ` [20:40:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:20] !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime [20:40:21] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [20:40:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:26] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 2 host(s) and their services with reason: decom ` mw[1230-1231].eqiad.wmnet ` [20:40:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:48] !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime [20:40:51] !log dzahn@cumin1001 END (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) [20:40:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:40:59] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 6 host(s) and their services with reason: decom ` mw[1244-1249].eqiad.wmnet ` [20:41:20] !log dzahn@cumin1001 conftool action : set/pooled=inactive; selector: name=mw124[4-9].eqiad.wmnet [20:41:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:53:01] !log dzahn@cumin1001 START - Cookbook sre.hosts.decommission [20:53:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:55:38] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [20:55:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:55:44] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw[1227-1229].eqiad.wmnet` - mw1227.eqiad.wmnet (**PASS**) - Downtimed host on Icinga... [20:56:07] !log dzahn@cumin1001 START - Cookbook sre.hosts.decommission [20:56:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:57:16] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [20:57:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:57:21] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw[1230-1231].eqiad.wmnet` - mw1230.eqiad.wmnet (**PASS**) - Downtimed host on Icinga... 
[20:59:03] !log dzahn@cumin1001 START - Cookbook sre.hosts.decommission [20:59:03] !log dzahn@cumin1001 END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [20:59:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:59:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:59:33] !log dzahn@cumin1001 START - Cookbook sre.hosts.decommission [20:59:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:01:29] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [21:01:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:01:37] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw[1244-1247].eqiad.wmnet` - mw1244.eqiad.wmnet (**PASS**) - Downtimed host on Icinga... [21:04:59] !log dzahn@cumin1001 START - Cookbook sre.hosts.decommission [21:05:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:06:01] !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [21:06:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:06:07] 10Operations, 10serviceops, 10Patch-For-Review: decom old appservers in eqiad - https://phabricator.wikimedia.org/T247780 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw[1248-1249].eqiad.wmnet` - mw1248.eqiad.wmnet (**PASS**) - Downtimed host on Icinga... 
[21:11:23] (03CR) 10Bstorm: [C: 03+2] kubernetes: Set php7.3 as the default type [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/496564 (owner: 10BryanDavis) [21:12:44] (03Merged) 10jenkins-bot: kubernetes: Set php7.3 as the default type [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/496564 (owner: 10BryanDavis) [21:19:59] (03PS2) 10BPirkle: Add configuration variable $wgEnableRestAPIDevelopmentEndpoints [mediawiki-config] - 10https://gerrit.wikimedia.org/r/581886 (https://phabricator.wikimedia.org/T247997) [21:21:49] (03CR) 10BPirkle: Add configuration variable $wgEnableRestAPIDevelopmentEndpoints (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/581886 (https://phabricator.wikimedia.org/T247997) (owner: 10BPirkle) [21:23:43] (03CR) 10Dzahn: [C: 03+2] site/conftool: decom mw1244-mw1249 and mw1227-mw1231 [puppet] - 10https://gerrit.wikimedia.org/r/582160 (https://phabricator.wikimedia.org/T247780) (owner: 10Dzahn) [21:23:53] (03PS3) 10Dzahn: site/conftool: decom mw1244-mw1249 and mw1227-mw1231 [puppet] - 10https://gerrit.wikimedia.org/r/582160 (https://phabricator.wikimedia.org/T247780) [21:32:56] 10Operations, 10DNS, 10Technical blog, 10Traffic, and 2 others: Setup DNS to direct techblog.wikimedia.org to new Wordpress VIP hosting - https://phabricator.wikimedia.org/T246507 (10bd808) @JHedden kindly volunteered to be the root to help out with this. 
[21:38:03] (03CR) 10Bstorm: [C: 03+2] Make Kubernetes the default backend and warn when guessing [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/443190 (https://phabricator.wikimedia.org/T154504) (owner: 10Nehajha) [21:38:40] (03Merged) 10jenkins-bot: Make Kubernetes the default backend and warn when guessing [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/443190 (https://phabricator.wikimedia.org/T154504) (owner: 10Nehajha) [21:43:13] 10Operations, 10DNS, 10Technical blog, 10Traffic, and 2 others: Setup DNS to direct techblog.wikimedia.org to new Wordpress VIP hosting - https://phabricator.wikimedia.org/T246507 (10Krinkle) At T226044, it was planned to self-host with Phabricator. The the domain itself is to be rerouted at the DNS layer... [21:45:17] (03CR) 10Jhedden: [C: 04-1] "on hold until 2020-03-23" [dns] - 10https://gerrit.wikimedia.org/r/577370 (https://phabricator.wikimedia.org/T246507) (owner: 10BryanDavis) [21:47:10] I’m getting a mysql error on phab [21:47:27] Woe! This request had its journey cut short by unexpected circumstances (Can Not Connect to MySQL). [21:47:53] 10Operations, 10DNS, 10Technical blog, 10Traffic, and 2 others: Setup DNS to direct techblog.wikimedia.org to new Wordpress VIP hosting - https://phabricator.wikimedia.org/T246507 (10bd808) >>! In T246507#5988291, @Krinkle wrote: > At T226044, it was planned to self-host with Phabricator. The the domain it... 
[21:48:10] Back, last about a minute [22:07:46] (03CR) 10Bstorm: [C: 03+2] Bump manifest version [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578407 (owner: 10BryanDavis) [22:08:28] (03Merged) 10jenkins-bot: Bump manifest version [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578407 (owner: 10BryanDavis) [22:09:47] (03CR) 10Bstorm: [C: 03+2] Remove temporary code from 2020 Kubernetes migration [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578408 (https://phabricator.wikimedia.org/T246689) (owner: 10BryanDavis) [22:10:39] (03Merged) 10jenkins-bot: Remove temporary code from 2020 Kubernetes migration [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578408 (https://phabricator.wikimedia.org/T246689) (owner: 10BryanDavis) [22:16:47] (03CR) 10Bstorm: [C: 03+2] Refactor argparse setup [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578409 (owner: 10BryanDavis) [22:18:05] (03Merged) 10jenkins-bot: Refactor argparse setup [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578409 (owner: 10BryanDavis) [22:26:00] (03CR) 10Bstorm: "I almost want to see self.type changed to self.wstype. However, I'm just a bit on the fence about that. It's uncommon for python to use an" [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578410 (owner: 10BryanDavis) [22:26:35] bstorm_: I don't disagree on that comment [22:26:47] I'd be happy to see the change go all the way down [22:27:03] Perhaps, we should do that then [22:27:40] better to fix now than worry later ;) [22:28:28] bstorm_: want me to amend, or are you already on it? 
[22:28:45] Go ahead :) I'm likely to log out soon [22:34:47] (03PS2) 10BryanDavis: Reuse toolforge.common.tool.PROJECT in KubernetesBackend [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578410 [22:34:49] (03PS2) 10BryanDavis: Introduce command "template" feature [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578411 [22:34:51] (03PS2) 10BryanDavis: Add support for Kubernetes replica scaling [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578412 [22:34:53] (03PS3) 10BryanDavis: Add support for redirecting to toolforge.org [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578413 (https://phabricator.wikimedia.org/T234617) [22:35:30] (03CR) 10BryanDavis: "> I almost want to see self.type changed to self.wstype. However, I'm" [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578410 (owner: 10BryanDavis) [22:42:34] (03CR) 10Bstorm: [C: 03+2] Reuse toolforge.common.tool.PROJECT in KubernetesBackend [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578410 (owner: 10BryanDavis) [22:43:12] (03Merged) 10jenkins-bot: Reuse toolforge.common.tool.PROJECT in KubernetesBackend [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/578410 (owner: 10BryanDavis) [22:58:37] (03PS1) 10Dzahn: add otrs1001.eqiad.wmnet [dns] - 10https://gerrit.wikimedia.org/r/582239 (https://phabricator.wikimedia.org/T248028) [23:00:30] (03CR) 10Dzahn: [C: 03+2] add otrs1001.eqiad.wmnet [dns] - 10https://gerrit.wikimedia.org/r/582239 (https://phabricator.wikimedia.org/T248028) (owner: 10Dzahn) [23:00:36] (03PS2) 10Dzahn: add otrs1001.eqiad.wmnet [dns] - 10https://gerrit.wikimedia.org/r/582239 (https://phabricator.wikimedia.org/T248028) [23:02:10] 10Operations, 10vm-requests, 10Patch-For-Review: eqiad: 1 VM request for OTRS - https://phabricator.wikimedia.org/T248028 (10Dzahn) [23:04:41] !log dzahn@cumin1001 START - Cookbook sre.ganeti.makevm [23:04:46] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:16:46] !log dzahn@cumin1001 END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [23:16:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log