[00:00:05] MaxSem, RoanKattouw, Niharika, and Urbanecm: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Evening SWAT (Max 6 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T0000).
[00:00:05] No GERRIT patches in the queue for this window AFAICS.
[00:07:16] !log created table wikimedia_editor_tasks_edit_streak on x1/wikishared (T234956)
[00:07:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:07:21] T234956: API for streak for SE v3 - https://phabricator.wikimedia.org/T234956
[00:08:03] Operations, cloud-services-team: Failing puppet runs on labtestpuppetmaster2001 - https://phabricator.wikimedia.org/T235819 (Andrew) >>! In T235819#5638642, @Dzahn wrote: > @Andrew Can we remove labtestpuppetmaster / turn it into a spare::system ? Nope, it's still doing useful work. I'll replace it eve...
[00:15:09] Operations, observability, Patch-For-Review, Performance-Team (Radar): Fully migrate >= 30% of producers off statsd - https://phabricator.wikimedia.org/T205870 (colewhite)
[00:18:17] (PS1) Cwhite: hiera: update ores matching rules and drop undefined metrics [puppet] - https://gerrit.wikimedia.org/r/548938 (https://phabricator.wikimedia.org/T233448)
[00:22:25] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[00:34:01] PROBLEM - ElasticSearch shard size check - 9243 on search.svc.codfw.wmnet is CRITICAL: CRITICAL - commonswiki_content_1556151793(71.66666666666667gb) https://wikitech.wikimedia.org/wiki/Search%23If_it_has_been_indexed
[00:44:45] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[00:54:45] Operations, ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (RobH) a:Jclark-ctr→RobH Please note the mgmt dns was already input into the dns repo: ms-be1057 1H IN A 10.65.5.16 wmf5251 1H IN A 10.65.5.17 ms-be1058 1H IN A 10....
[00:56:15] Operations, serviceops: decom cobalt - https://phabricator.wikimedia.org/T236187 (RobH)
[00:56:39] Operations, serviceops: decom cobalt - https://phabricator.wikimedia.org/T236187 (RobH)
[01:01:33] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[01:03:03] Operations, Gerrit, Release-Engineering-Team, Wikimedia Design Style Guide (Wikimedia Design Style Guide v1.1): Automatic pickup of Gerrit clone master doesn't happen due to missing git-lfs – new deployment env - https://phabricator.wikimedia.org/T235677 (Volker_E)
[01:11:09] (PS1) Dzahn: allow different memory limit settings for parsoid-php servers [mediawiki-config] - https://gerrit.wikimedia.org/r/548944 (https://phabricator.wikimedia.org/T236833)
[01:11:58] (CR) jerkins-bot: [V: -1] allow different memory limit settings for parsoid-php servers [mediawiki-config] - https://gerrit.wikimedia.org/r/548944 (https://phabricator.wikimedia.org/T236833) (owner: Dzahn)
[01:13:56] Operations, Traffic: ats-tls-restart failed on cp4027 - https://phabricator.wikimedia.org/T237425 (Vgutierrez) this is a known issue caused by update-ocsp-all
[01:28:20] (PS2) Dzahn: allow different memory limit settings for parsoid-php servers [mediawiki-config] - https://gerrit.wikimedia.org/r/548944 (https://phabricator.wikimedia.org/T236833)
[01:33:21] Operations, Parsoid-PHP, serviceops, Patch-For-Review: wt2html: Out of memory crashers - https://phabricator.wikimedia.org/T236833 (Dzahn) I uploaded 2 changes. One to simply change it on all and the second one is an attempt to allow different settings for parsoid-php servers vs. the other appser...
[01:37:35] (PS4) Jeena Huneidi: Modify Restrouter chart to allow for minikube development [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910)
[01:38:46] (CR) Jeena Huneidi: Modify Restrouter chart to allow for minikube development (4 comments) [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: Jeena Huneidi)
[02:23:11] (PS2) Ammarpad: Rename DPL extension variable to non-ambiguous name [mediawiki-config] - https://gerrit.wikimedia.org/r/548569
[02:24:52] Operations, Parsoid-PHP, serviceops, Patch-For-Review: wt2html: Out of memory crashers - https://phabricator.wikimedia.org/T236833 (Krinkle) >>! @ssastry > How should we handle this asymmetry so that when we eventually switch everything to Parsoid, we don't have previously rendered pages stop ren...
[02:25:25] (PS3) Ammarpad: Rename DPL extension variable to non-ambiguous name [mediawiki-config] - https://gerrit.wikimedia.org/r/548569
[02:36:41] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[02:57:25] (PS1) Vgutierrez: hiera: Set nginx on port 4443 for cache text_ats on eqsin [puppet] - https://gerrit.wikimedia.org/r/548947 (https://phabricator.wikimedia.org/T231627)
[02:57:26] (PS1) Vgutierrez: hiera: Set ats-tls on port 443 for cache text_ats on eqsin [puppet] - https://gerrit.wikimedia.org/r/548948 (https://phabricator.wikimedia.org/T231627)
[02:59:14] !log Switch from nginx to ats-tls on cp5012 - T231627
[02:59:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:59:20] T231627: Move cache text cluster from nginx to ats-tls - https://phabricator.wikimedia.org/T231627
[03:10:13] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[03:14:44] (CR) Vgutierrez: [C: +2] hiera: Set nginx on port 4443 for cache text_ats on eqsin [puppet] - https://gerrit.wikimedia.org/r/548947 (https://phabricator.wikimedia.org/T231627) (owner: Vgutierrez)
[03:17:11] (CR) Vgutierrez: [C: +2] hiera: Set ats-tls on port 443 for cache text_ats on eqsin [puppet] - https://gerrit.wikimedia.org/r/548948 (https://phabricator.wikimedia.org/T231627) (owner: Vgutierrez)
[03:24:01] Operations, Traffic, Patch-For-Review: Move cache text cluster from nginx to ats-tls - https://phabricator.wikimedia.org/T231627 (Vgutierrez)
[03:30:57] (PS1) Vgutierrez: ATS: remap stream.wmo.org requests on ats-tls as well [puppet] - https://gerrit.wikimedia.org/r/548949 (https://phabricator.wikimedia.org/T227432)
[03:32:43] (CR) Vgutierrez: [C: +2] ATS: remap stream.wmo.org requests on ats-tls as well [puppet] - https://gerrit.wikimedia.org/r/548949 (https://phabricator.wikimedia.org/T227432) (owner: Vgutierrez)
[04:04:45] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[04:21:22] Operations, Traffic: Create a second text-lb IP address for test purposes - https://phabricator.wikimedia.org/T237492 (BBlack) p:Triage→Normal
[04:38:19] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[05:09:13] Operations, Traffic, observability: Add ats-tls status and availability graphs to frontend-traffic - https://phabricator.wikimedia.org/T236482 (Vgutierrez) Added ats-tls status panel: https://grafana.wikimedia.org/d/000000479/frontend-traffic?refresh=1m&panelId=12&fullscreen&orgId=1&var-site=All&var-...
[05:11:53] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[05:18:23] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 240, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:18:31] PROBLEM - Check systemd state on stat1007 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[05:19:29] PROBLEM - Router interfaces on cr2-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 54, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:21:37] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 242, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:22:41] RECOVERY - Router interfaces on cr2-eqord is OK: OK: host 208.80.154.198, interfaces up: 56, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:34:39] (PS1) Vgutierrez: prometheus: Provide global aggregation rules for trafficserver requests [puppet] - https://gerrit.wikimedia.org/r/548954 (https://phabricator.wikimedia.org/T236482)
[07:01:10] (CR) Jcrespo: "Could you link to the HEAD file/gerrit CR with the table structure, I couldn't find it referenced on the ticket? Thanks." [puppet] - https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) (owner: Mholloway)
[07:11:14] (CR) Elukey: "Jaime/Alex, just to confirm, can I proceed with the merge anytime?" [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:20:35] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[07:22:40] (CR) Giuseppe Lavagetto: [C: -1] "While I agree with Krinkle's sentiment, I think we should be able to vary the memory limit depending on the called endpoint." [mediawiki-config] - https://gerrit.wikimedia.org/r/548923 (https://phabricator.wikimedia.org/T236833) (owner: Dzahn)
[07:24:59] (CR) Jcrespo: [C: +1] "Ok for me" [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:37:19] !log onimisionipe@deploy1001 Started deploy [wdqs/wdqs@2cb2dde]: T233213
[07:37:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:38:26] Operations, LDAP-Access-Requests, Graphite: Nikerabbit unable to edit/create dashboards in Grafana - https://phabricator.wikimedia.org/T237498 (Nikerabbit)
[07:47:29] (PS4) Alexandros Kosiaris: Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:47:38] (CR) Alexandros Kosiaris: [C: +2] Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:48:56] !log onimisionipe@deploy1001 Finished deploy [wdqs/wdqs@2cb2dde]: T233213 (duration: 11m 38s)
[07:48:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:49:27] Operations, LDAP-Access-Requests, Graphite: Nikerabbit unable to edit/create dashboards in Grafana - https://phabricator.wikimedia.org/T237498 (MoritzMuehlenhoff) You are in a group which should allow you to edit dashboards (cn=wmf), did you login? (Using the arrow to the right in the grey navigation...
[07:49:30] (CR) jerkins-bot: [V: -1] Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:49:42] Operations: stunnel-wrap all rsync::server usage - https://phabricator.wikimedia.org/T237424 (MoritzMuehlenhoff) p:Triage→Normal
[07:50:45] Operations, Wikimedia-Mailing-lists, Mobile: Pipermail on lists.wikimedia.org is not mobile friendly - https://phabricator.wikimedia.org/T190054 (MoritzMuehlenhoff) p:Triage→Normal
[07:51:51] mmm I am a bit confused about the error :D
[07:52:27] (CR) Elukey: "recheck" [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:54:18] (CR) jerkins-bot: [V: -1] Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[07:55:34] /srv/workspace/puppet/rake_modules/taskgen.rb:202:in `block in setup_typos'
[07:56:34] or it is not that?
[07:57:11] I think it's complaining about the default value in $use_kerberos = hiera('profile::analytics::database::meta::bakcup_dest::use_kerberos', false),
[07:57:31] it is "bakcup"
[07:57:36] yeah it must be that, but yesterday it was ok
[07:57:46] because we added the check
[07:57:53] ahhhh
[07:57:54] to detect bakcup
[07:58:03] it is ok if you want to change it on a later patch
[07:58:09] from my side
[07:58:12] Operations, LDAP-Access-Requests, Graphite: Nikerabbit unable to edit/create dashboards in Grafana - https://phabricator.wikimedia.org/T237498 (Nikerabbit) Open→Invalid *facepalm*. I was going through every button through the interface but did not consider that one.
[07:58:18] nono doing it now, makes sense
[08:01:55] (PS1) Elukey: profile::analytics::database::meta::backup_dest: fix typo in hiera [puppet] - https://gerrit.wikimedia.org/r/548958
[08:04:03] (PS5) KartikMistry: Enable CX out of beta for newly created WPs [mediawiki-config] - https://gerrit.wikimedia.org/r/548730 (https://phabricator.wikimedia.org/T234318)
[08:04:24] (CR) Elukey: [C: +2] profile::analytics::database::meta::backup_dest: fix typo in hiera [puppet] - https://gerrit.wikimedia.org/r/548958 (owner: Elukey)
[08:05:19] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[08:07:43] (PS5) Elukey: Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208)
[08:10:24] (CR) Elukey: [C: +2] Add Bacula backups for Analytics Meta's mylvm snapshots [puppet] - https://gerrit.wikimedia.org/r/548711 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[08:10:43] (CR) Mobrovac: "> While I agree with Krinkle's sentiment, I think we should be able" [mediawiki-config] - https://gerrit.wikimedia.org/r/548923 (https://phabricator.wikimedia.org/T236833) (owner: Dzahn)
[08:16:53] (CR) Mobrovac: [C: -1] Modify Restrouter chart to allow for minikube development (2 comments) [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: Jeena Huneidi)
[08:22:07] (CR) Mobrovac: [C: -1] "I'd like Alex to take a look and give his opinion too." [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: Jeena Huneidi)
[08:33:29] !log upgrading db2102 mariadb (test-s1)
[08:33:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:34:09] Operations, observability, serviceops: basic prometheus monitoring for PoolCounter - https://phabricator.wikimedia.org/T237407 (MoritzMuehlenhoff) p:Triage→Normal
[08:37:01] (PS1) Elukey: profile::analytics::database:meta::backup_dest: absent hdfs uploader [puppet] - https://gerrit.wikimedia.org/r/548963 (https://phabricator.wikimedia.org/T231208)
[08:38:54] (CR) jerkins-bot: [V: -1] profile::analytics::database:meta::backup_dest: absent hdfs uploader [puppet] - https://gerrit.wikimedia.org/r/548963 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[08:44:36] (PS3) DCausse: [cirrus] Disable instant indexing on wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/539117
[08:45:20] (PS2) Elukey: profile::analytics::database:meta::backup_dest: absent hdfs uploader [puppet] - https://gerrit.wikimedia.org/r/548963 (https://phabricator.wikimedia.org/T231208)
[08:48:35] (CR) Elukey: [C: +2] profile::analytics::database:meta::backup_dest: absent hdfs uploader [puppet] - https://gerrit.wikimedia.org/r/548963 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[08:51:31] !log upgrading remaining mwdebug* servers to PHP 7.2.24 T237239
[08:51:35] !log upgrading wmf-mariadb101-client on cumin hosts
[08:51:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:51:38] T237239: Upgrade to PHP 7.2.24 - https://phabricator.wikimedia.org/T237239
[08:51:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:52:47] (PS1) Elukey: profile::analytics::database::meta::backup_dest: clean up hdfs configs [puppet] - https://gerrit.wikimedia.org/r/548965 (https://phabricator.wikimedia.org/T231208)
[08:57:52] (CR) Elukey: [C: +2] profile::analytics::database::meta::backup_dest: clean up hdfs configs [puppet] - https://gerrit.wikimedia.org/r/548965 (https://phabricator.wikimedia.org/T231208) (owner: Elukey)
[08:58:54] !log onimisionipe@deploy1001 Started deploy [wdqs/wdqs@2cb2dde]: T233213
[08:58:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:03:43] (CR) Giuseppe Lavagetto: [C: -1] "Overall seems that the change goes into the right direction, but I'd like a more narrowly focused template change - something like a "Valu" (6 comments) [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: Jeena Huneidi)
[09:08:17] (PS4) Jcrespo: Add percona support, and standarize xtrabackup reference [software] - https://gerrit.wikimedia.org/r/546455
[09:08:19] (PS1) Jcrespo: mariadb package: Add 10.1.42 packages [software] - https://gerrit.wikimedia.org/r/548967
[09:10:33] !log onimisionipe@deploy1001 Finished deploy [wdqs/wdqs@2cb2dde]: T233213 (duration: 11m 38s)
[09:10:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:13:41] (CR) Gehel: "I spot checked the elasticsearch.yml template, and LGTM." (1 comment) [puppet] - https://gerrit.wikimedia.org/r/545867 (https://phabricator.wikimedia.org/T234854) (owner: Filippo Giunchedi)
[09:14:50] (PS1) Jcrespo: dbproxy: Depool labsdb1011 for maintenance [puppet] - https://gerrit.wikimedia.org/r/548969 (https://phabricator.wikimedia.org/T236015)
[09:15:21] Operations, netbox: Netbox: tracking of hardware errors / grouping servers in order/batches - https://phabricator.wikimedia.org/T233774 (MoritzMuehlenhoff) Ack, searching by procurement properly addresses the batching aspect, I missed that before.
[09:17:57] (PS2) Jcrespo: dbproxy: Depool labsdb1011 for maintenance [puppet] - https://gerrit.wikimedia.org/r/548969 (https://phabricator.wikimedia.org/T236015)
[09:18:13] Operations, CX-cxserver, Citoid, RESTBase, and 4 others: Decom legacy ex-parsoidcache cxserver, citoid, and restbase service hostnames - https://phabricator.wikimedia.org/T133001 (hashar)
[09:21:47] (CR) Jcrespo: [C: +2] dbproxy: Depool labsdb1011 for maintenance [puppet] - https://gerrit.wikimedia.org/r/548969 (https://phabricator.wikimedia.org/T236015) (owner: Jcrespo)
[09:24:03] (CR) Hashar: "Eventually I had the issue again and remembered about this change, I should not have forgotten about it :\" [puppet] - https://gerrit.wikimedia.org/r/484308 (https://phabricator.wikimedia.org/T137890) (owner: Hashar)
[09:24:10] (CR) Filippo Giunchedi: [C: -1] "See inline" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/548954 (https://phabricator.wikimedia.org/T236482) (owner: Vgutierrez)
[09:25:10] !log depooling labsdb1011 for wikireplica service T236015
[09:25:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:32:20] Operations, ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (fgiunchedi) I see ms-be1059 has been changed from row D to row C @Jclark-ctr and I was wondering why? Note that to keep disk space per row balanced we shouldn't put more hosts in row C if we c...
[09:33:10] !log stop and upgrade labsdb1011 T236015
[09:33:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:38:04] PROBLEM - haproxy failover on dbproxy1010 is CRITICAL: CRITICAL check_failover servers up 1 down 1 https://wikitech.wikimedia.org/wiki/HAProxy
[09:39:00] Operations, serviceops: decom cobalt - https://phabricator.wikimedia.org/T236187 (hashar) Thank you @Dzahn for all the clean up tasks!
[09:39:44] checking
[09:39:55] (CR) Filippo Giunchedi: [C: +2] mtail: add logstash program [puppet] - https://gerrit.wikimedia.org/r/548280 (https://phabricator.wikimedia.org/T236343) (owner: Filippo Giunchedi)
[09:39:58] PROBLEM - haproxy failover on dbproxy1018 is CRITICAL: CRITICAL check_failover servers up 1 down 1 https://wikitech.wikimedia.org/wiki/HAProxy
[09:40:04] (CR) Filippo Giunchedi: [C: +2] profile: add mtail to logstash [puppet] - https://gerrit.wikimedia.org/r/548281 (https://phabricator.wikimedia.org/T236343) (owner: Filippo Giunchedi)
[09:41:49] I don't know why it detects labsdb1009 as down
[09:41:54] maybe wrong ip?
[09:42:38] yeah
[09:42:47] things are ok, no user impact
[09:43:01] I just wrote the wrong ip for the redundancy
[09:43:38] they will come back when my reboot finishes
[09:44:13] (CR) Jcrespo: "Note I wrote the wrong ip for the secondary, and I didn't touch the new proxies." [puppet] - https://gerrit.wikimedia.org/r/548969 (https://phabricator.wikimedia.org/T236015) (owner: Jcrespo)
[09:44:30] (PS1) Jcrespo: Revert "dbproxy: Depool labsdb1011 for maintenance" [puppet] - https://gerrit.wikimedia.org/r/548974
[09:45:52] RECOVERY - haproxy failover on dbproxy1010 is OK: OK check_failover servers up 2 down 0 https://wikitech.wikimedia.org/wiki/HAProxy
[09:46:13] !log upgrading mw1262-mw1265,mw1276 servers to PHP 7.2.24 T237239
[09:46:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:46:17] T237239: Upgrade to PHP 7.2.24 - https://phabricator.wikimedia.org/T237239
[09:46:22] (CR) Volans: "For the unit test failure see my comment inline. For the prospector ones I think they are self-explanatory but ping me offline if somethin" (2 comments) [software/homer] - https://gerrit.wikimedia.org/r/547562 (owner: Ayounsi)
[09:46:59] dbproxy1018 should recover now
[09:47:42] PROBLEM - Check systemd state on logstash1007 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[09:47:48] RECOVERY - haproxy failover on dbproxy1018 is OK: OK check_failover servers up 2 down 0 https://wikitech.wikimedia.org/wiki/HAProxy
[09:47:55] logstash failures are me
[09:48:18] PROBLEM - Check systemd state on logstash1008 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[09:48:24] (PS1) Filippo Giunchedi: hieradata: fix mtail::logs location for logstash role [puppet] - https://gerrit.wikimedia.org/r/548975 (https://phabricator.wikimedia.org/T236343)
[09:49:34] PROBLEM - Check systemd state on logstash2004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[09:49:50] (CR) Filippo Giunchedi: [C: +2] "PCC https://puppet-compiler.wmflabs.org/compiler1003/19270/logstash1007.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/548975 (https://phabricator.wikimedia.org/T236343) (owner: Filippo Giunchedi)
[09:50:09] (CR) Urbanecm: [C: +1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/548717 (https://phabricator.wikimedia.org/T237369) (owner: Jon Harald Søby)
[09:50:38] (PS2) Urbanecm: Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/548738 (https://phabricator.wikimedia.org/T236840) (owner: Majavah)
[09:50:41] ACKNOWLEDGEMENT - Maps - OSM synchronization lag - codfw on icinga1001 is CRITICAL: 8.13e+05 ge 2.592e+05 Mathew.onipe see https://phabricator.wikimedia.org/T237228 - The acknowledgement expires at: 2019-11-09 09:47:45. https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=12&fullscreen&orgId=1
[09:50:41] ACKNOWLEDGEMENT - Maps - OSM synchronization lag - eqiad on icinga1001 is CRITICAL: 8.13e+05 ge 2.592e+05 Mathew.onipe see https://phabricator.wikimedia.org/T237228 - The acknowledgement expires at: 2019-11-09 09:47:45. https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=11&fullscreen&orgId=1
[09:52:30] RECOVERY - Check systemd state on logstash1007 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[09:55:26] Operations, Puppet, User-jbond: Rake tasks: add colours - https://phabricator.wikimedia.org/T237508 (jbond)
[09:56:48] (CR) Arturo Borrero Gonzalez: [C: +2] new *.wmflabs.org certificate [puppet] - https://gerrit.wikimedia.org/r/547680 (https://phabricator.wikimedia.org/T237066) (owner: RobH)
[09:57:16] (PS1) Jbond: rake_modules: add coloured output to spec tests [puppet] - https://gerrit.wikimedia.org/r/548977 (https://phabricator.wikimedia.org/T237508)
[09:59:44] (PS2) Jbond: rake_modules: add coloured output to spec tests [puppet] - https://gerrit.wikimedia.org/r/548977 (https://phabricator.wikimedia.org/T237508)
[10:05:11] (CR) Giuseppe Lavagetto: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/548977 (https://phabricator.wikimedia.org/T237508) (owner: Jbond)
[10:05:46] (CR) Jbond: [C: +2] rake_modules: add coloured output to spec tests [puppet] - https://gerrit.wikimedia.org/r/548977 (https://phabricator.wikimedia.org/T237508) (owner: Jbond)
[10:12:56] Operations, ops-eqiad: (Need by Aug 1) rack/setup/install dumpsdata1003.eqiad.wmnet - https://phabricator.wikimedia.org/T234076 (MoritzMuehlenhoff) @Cmjohnson: According to the DHCP logs on install1002, the server correctly assigned the IP address, but I suspect the error is caused by the OS here; it's c...
[10:16:28] Operations, Puppet, Cloud-Services, Patch-For-Review, cloud-services-team (Kanban): Puppet tab in Horizon unusably slow - https://phabricator.wikimedia.org/T149589 (fgiunchedi) >>! In T149589#5633373, @bd808 wrote: >>>! In T149589#5632880, @Andrew wrote: >> I've added an alternative editor th...
[10:16:48] RECOVERY - Check systemd state on logstash2004 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:17:02] RECOVERY - Check systemd state on logstash1008 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:17:44] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[10:20:01] (CR) Jbond: logstash: add version param and exclude plugins when non 5.x (1 comment) [puppet] - https://gerrit.wikimedia.org/r/548880 (https://phabricator.wikimedia.org/T217340) (owner: Herron)
[10:21:07] Operations, Puppet, User-jbond: Rake tasks: add colours and buffer output - https://phabricator.wikimedia.org/T237508 (jbond)
[10:33:02] (CR) Jcrespo: [C: +2] Revert "dbproxy: Depool labsdb1011 for maintenance" [puppet] - https://gerrit.wikimedia.org/r/548974 (owner: Jcrespo)
[10:35:27] (CR) Volans: [C: -1] "Let's chat offline about the details, see comments inline." (2 comments) [software/homer] - https://gerrit.wikimedia.org/r/547562 (owner: Ayounsi)
[10:43:37] (PS1) DCausse: [beta] deployment-eventgate-1 -> deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549055
[10:43:39] (PS1) DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549056
[10:44:15] (PS1) Filippo Giunchedi: swift: don't clean up swiftrepl user/group [puppet] - https://gerrit.wikimedia.org/r/549057
[10:47:57] (PS1) Arturo Borrero Gonzalez: base: certificates: add new GlobalSign CA file [puppet] - https://gerrit.wikimedia.org/r/549058 (https://phabricator.wikimedia.org/T237066)
[10:49:45] (PS1) Giuseppe Lavagetto: Also check charts generated by helmfile [deployment-charts] - https://gerrit.wikimedia.org/r/549059
[10:55:20] PROBLEM - Check systemd state on netbox1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:55:38] PROBLEM - Check the last execution of netbox_ganeti_codfw_sync on netbox1001 is CRITICAL: CRITICAL: Status of the systemd unit netbox_ganeti_codfw_sync https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[10:58:35] (CR) Jbond: [C: +1] "lgtm although im not sure what the policy is in relation to trusting new root CA's so please also have moritz sign of" [puppet] - https://gerrit.wikimedia.org/r/549058 (https://phabricator.wikimedia.org/T237066) (owner: Arturo Borrero Gonzalez)
[11:01:38] jouncebot: next
[11:01:39] In 0 hour(s) and 58 minute(s): European Mid-day SWAT(Max 6 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1200)
[11:02:31] (PS2) DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549056
[11:04:34] (PS1) Ema: Add architecture diagram [software/fifo-log-demux] - https://gerrit.wikimedia.org/r/549064
[11:06:05] !log jynus@cumin1001 dbctl commit (dc=all): 'Depool db1074', diff saved to https://phabricator.wikimedia.org/P9536 and previous config saved to /var/cache/conftool/dbconfig/20191106-110603-jynus.json
[11:06:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:06:12] RECOVERY - Check the last execution of netbox_ganeti_codfw_sync on netbox1001 is OK: OK: Status of the systemd unit netbox_ganeti_codfw_sync https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:06:26] RECOVERY - Check systemd state on netbox1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:14:21] !log stopping db1074 for maintenance (will create temporary s2 lag on wikireplicas)
[11:14:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:22:31] Operations, Puppet, Cloud-Services, Patch-For-Review, cloud-services-team (Kanban): Puppet tab in Horizon unusably slow - https://phabricator.wikimedia.org/T149589 (Volans) It's great to see some movement in this space! I've tried it and indeed is much better. But the current experience IMHO...
[11:22:35] could the stucks on dewiki and enwiki come from this lag. Now both work fine again
[11:27:22] (CR) Arturo Borrero Gonzalez: [C: +2] toolforge: Reload/restart services as appropriate on certificate change [puppet] - https://gerrit.wikimedia.org/r/548927 (owner: Alex Monk)
[11:35:11] !log jynus@cumin1001 dbctl commit (dc=all): 'Repool db1074 at 10%', diff saved to https://phabricator.wikimedia.org/P9537 and previous config saved to /var/cache/conftool/dbconfig/20191106-113510-jynus.json
[11:35:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:46:51] (PS3) Arturo Borrero Gonzalez: toolforge: proxy: adjust setup for the new k8s cluster [puppet] - https://gerrit.wikimedia.org/r/543135 (https://phabricator.wikimedia.org/T234037)
[11:48:50] (CR) jerkins-bot: [V: -1] toolforge: proxy: adjust setup for the new k8s cluster [puppet] - https://gerrit.wikimedia.org/r/543135 (https://phabricator.wikimedia.org/T234037) (owner: Arturo Borrero Gonzalez)
[11:52:00] (PS1) Elukey: profile::mariadb::misc::eventlogging::sanitization: ease clean up [puppet] - https://gerrit.wikimedia.org/r/549070 (https://phabricator.wikimedia.org/T236818)
[11:52:02] (PS1) Elukey: Remove Eventloggging sanitization automation from log databases [puppet] - https://gerrit.wikimedia.org/r/549071 (https://phabricator.wikimedia.org/T236818)
[11:52:50] jouncebot: next
[11:52:50] In 0 hour(s) and 7 minute(s): European Mid-day SWAT(Max 6 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1200)
[11:57:52] !log upgrade and restart db2048
[11:57:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:59:15] (PS2) Elukey: profile::mariadb::misc::eventlogging::sanitization: ease clean up [puppet] - https://gerrit.wikimedia.org/r/549070 (https://phabricator.wikimedia.org/T236818)
[11:59:17] (PS2) Elukey: Remove Eventloggging sanitization automation from log databases [puppet] - https://gerrit.wikimedia.org/r/549071 (https://phabricator.wikimedia.org/T236818)
[12:00:04] Amir1, Lucas_WMDE, and Urbanecm: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) European Mid-day SWAT(Max 6 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1200).
[12:00:05] dcausse and tassu: A patch you scheduled for European Mid-day SWAT(Max 6 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[12:00:10] (CR) Elukey: [C: -2] "Sanitization still running!" [puppet] - https://gerrit.wikimedia.org/r/549070 (https://phabricator.wikimedia.org/T236818) (owner: Elukey)
[12:00:30] I can SWAT today!
[12:00:38] o/
[12:00:45] tassu: around?
[12:00:46] If we could wait like to 20 minutes for my patch that would be great
[12:00:54] tassu: we can :-)
[12:00:57] ping me once you're ready
[12:01:02] Thanks
[12:01:11] (CR) Urbanecm: [C: +2] [cirrus] Disable instant indexing on wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/539117 (owner: DCausse)
[12:03:08] (Merged) jenkins-bot: [cirrus] Disable instant indexing on wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/539117 (owner: DCausse)
[12:03:44] dcausse: please test your patch at mwdebug1001
[12:04:47] (PS1) Arturo Borrero Gonzalez: toolforge: proxy: make https fully optional [puppet] - https://gerrit.wikimedia.org/r/549072 (https://phabricator.wikimedia.org/T237443)
[12:07:04] Urbanecm: looks good (no errors in logstash) but cannot really test as this affects indexing jobs, please proceed with the deploy
[12:08:00] dcausse: ack
[12:08:29] to be fair, I think the most important thing is that current webrequests don't get affected
[12:08:43] (PS7) Urbanecm: Allow certain users to create account at closed wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/542755 (https://phabricator.wikimedia.org/T222117)
[12:09:05] (CR) Urbanecm: [C: +2] "let's try this" [mediawiki-config] - https://gerrit.wikimedia.org/r/542755 (https://phabricator.wikimedia.org/T222117) (owner: Urbanecm)
[12:09:24] (CR) Mobrovac: [C: +1] [beta] deployment-eventgate-1 -> deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549055 (owner: DCausse)
[12:09:48] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: 5875c45: [cirrus] Disable instant indexing on wikidata (duration: 01m 15s)
[12:09:49] (Merged) jenkins-bot: Allow certain users to create account at closed wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/542755 (https://phabricator.wikimedia.org/T222117) (owner: Urbanecm)
[12:09:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:09:56] dcausse: deployed
[12:10:09] Urbanecm: thanks!
[12:10:15] you're welcome
[12:11:15] hauskater: hi, you around?
[12:11:27] Somewhat Urbanecm
[12:11:30] Tell me
[12:12:04] hauskater: would it be possible to create a temp global group for me that would have `autocreateaccount`? Or perhaps add that priv to glob. rollbackers temporarily?
[12:12:25] Urbanecm: I think we can create a test group for you
[12:12:35] great. Thanks!
[12:12:40] Do we have a ticket at hand?
[12:12:50] hauskater: T222117
[12:12:51] T222117: Create accounts for new stewards in closed wikis - https://phabricator.wikimedia.org/T222117
[12:13:13] awesome, do you need it now or you can wait 10-15' ?
[12:13:49] hauskater: I can wait :)
[12:15:02] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports
[12:15:02] Thanks.
I'm filing a bug and I need to concentrate :) [12:15:13] sure [12:18:38] (CR) Arturo Borrero Gonzalez: [C: +2] toolforge: proxy: make https fully optional [puppet] - https://gerrit.wikimedia.org/r/549072 (https://phabricator.wikimedia.org/T237443) (owner: Arturo Borrero Gonzalez) [12:19:49] (PS1) Phamhi: tools-webservice: Remove user warning on start/restart on access.log [software/tools-webservice] - https://gerrit.wikimedia.org/r/549075 (https://phabricator.wikimedia.org/T233347) [12:21:19] hauskater: once you create the group, please make Martin Urbanec (test) a member of it. Thx! [12:21:34] Sure [12:22:37] Urbanecm: I am ready [12:23:33] (CR) Arturo Borrero Gonzalez: [C: +1] "LGTM" [software/tools-webservice] - https://gerrit.wikimedia.org/r/549075 (https://phabricator.wikimedia.org/T233347) (owner: Phamhi) [12:23:39] tassu: ack [12:23:52] (CR) Phamhi: [C: +2] tools-webservice: Remove user warning on start/restart on access.log [software/tools-webservice] - https://gerrit.wikimedia.org/r/549075 (https://phabricator.wikimedia.org/T233347) (owner: Phamhi) [12:23:54] (PS3) Urbanecm: Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/548738 (https://phabricator.wikimedia.org/T236840) (owner: Majavah) [12:23:57] (CR) Urbanecm: [C: +2] Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/548738 (https://phabricator.wikimedia.org/T236840) (owner: Majavah) [12:24:38] (PS1) Jcrespo: mariadb: Depool es1019 for maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/549076 [12:24:46] (Merged) jenkins-bot: Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/548738 (https://phabricator.wikimedia.org/T236840) (owner: Majavah) [12:24:50] (PS1) Arturo Borrero Gonzalez: toolforge: proxy: prevent variable from reassignation
[puppet] - 10https://gerrit.wikimedia.org/r/549077 [12:26:32] (03PS1) 10Urbanecm: Add logger to passed closedwikiprovider check [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549078 (https://phabricator.wikimedia.org/T222117) [12:26:33] (03PS1) 10Urbanecm: Do not mention user group in ClosedWikiProvider logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549079 (https://phabricator.wikimedia.org/T222117) [12:27:02] tassu: please check at mwdebug1001 and let me know [12:27:15] (03CR) 10Urbanecm: [C: 03+2] Add logger to passed closedwikiprovider check [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549078 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [12:27:19] (03CR) 10Urbanecm: [C: 03+2] Do not mention user group in ClosedWikiProvider logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549079 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [12:28:05] (03Merged) 10jenkins-bot: Add logger to passed closedwikiprovider check [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549078 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [12:28:08] (03Merged) 10jenkins-bot: Do not mention user group in ClosedWikiProvider logging [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549079 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [12:29:29] (03CR) 10Giuseppe Lavagetto: [C: 04-1] "> I'd like Alex to take a look and give his opinion too." 
[deployment-charts] - 10https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: 10Jeena Huneidi) [12:29:56] (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] toolforge: proxy: prevent variable from reassignation [puppet] - 10https://gerrit.wikimedia.org/r/549077 (owner: 10Arturo Borrero Gonzalez) [12:31:01] (03PS1) 10Arturo Borrero Gonzalez: toolforge: toolviews: add default hiera values [puppet] - 10https://gerrit.wikimedia.org/r/549080 (https://phabricator.wikimedia.org/T237443) [12:31:08] Urbanecm: seems to be working [12:32:07] tassu: okay, syncing [12:32:14] Urbanecm: I'll create the global group after lunch in 30/35 minutes [12:32:21] Got to leave and make some calls [12:32:30] okay [12:34:36] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: 3e9ede0: Add 104 (Cookbook) to $wgContentNamespaces for bnwikibooks (T236840) (duration: 01m 00s) [12:34:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:34:40] tassu: done! 
[12:34:41] T236840: Enable VisualEditor Extension on bn.wikibooks at রন্ধনপ্রণালী (cookbook) Namespace - https://phabricator.wikimedia.org/T236840 [12:35:14] (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] toolforge: toolviews: add default hiera values [puppet] - 10https://gerrit.wikimedia.org/r/549080 (https://phabricator.wikimedia.org/T237443) (owner: 10Arturo Borrero Gonzalez) [12:35:22] thanks [12:35:48] you're welcome [12:36:43] !log urbanecm@deploy1001 Synchronized wmf-config/CommonSettings.php: SWAT: a239b14: Allow certain users to create account at closed wikis (T222117; 1/2) (duration: 00m 59s) [12:36:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:36:49] T222117: Create accounts for new stewards in closed wikis - https://phabricator.wikimedia.org/T222117 [12:38:03] !log urbanecm@deploy1001 Synchronized wmf-config/InitialiseSettings.php: SWAT: a239b14: Allow certain users to create account at closed wikis (T222117; 2/2) (duration: 01m 00s) [12:38:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:38:53] !log EU SWAT done [12:38:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:01:35] (03CR) 10Mobrovac: [C: 04-1] "> If you want to be able to use sqlite, what needs to be added to the" [deployment-charts] - 10https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: 10Jeena Huneidi) [13:04:50] (03PS4) 10Arturo Borrero Gonzalez: toolforge: proxy: adjust setup for the new k8s cluster [puppet] - 10https://gerrit.wikimedia.org/r/543135 (https://phabricator.wikimedia.org/T234037) [13:09:59] (03PS23) 10Gehel: query_service: rename wdqs module to query_service [puppet] - 10https://gerrit.wikimedia.org/r/538572 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:10:01] (03PS31) 10Gehel: query_service: prepare query_service for reusbility [puppet] - 10https://gerrit.wikimedia.org/r/537138 
(https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:10:03] (03PS29) 10Gehel: query_service: rename profile/wdqs to profile/query_service [puppet] - 10https://gerrit.wikimedia.org/r/538849 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:10:05] (03PS27) 10Gehel: query_service: separate categories from main blazegraph profile [puppet] - 10https://gerrit.wikimedia.org/r/539285 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:10:07] (03PS28) 10Gehel: query_service: properly adapt query_service profile [puppet] - 10https://gerrit.wikimedia.org/r/539513 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:10:09] (03PS28) 10Gehel: query_service: properly adapt hiera configs [puppet] - 10https://gerrit.wikimedia.org/r/539998 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:11:07] (03PS5) 10Arturo Borrero Gonzalez: toolforge: proxy: adjust setup for the new k8s cluster [puppet] - 10https://gerrit.wikimedia.org/r/543135 (https://phabricator.wikimedia.org/T234037) [13:12:36] (03PS1) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [13:14:26] (03CR) 10jerkins-bot: [V: 04-1] [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [13:15:40] (03CR) 10Gehel: [C: 03+2] query_service: rename wdqs module to query_service [puppet] - 10https://gerrit.wikimedia.org/r/538572 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:16:28] PROBLEM - Check the Netbox report puppetdb for fail status. 
on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [13:17:02] (03CR) 10Gehel: [C: 03+2] query_service: prepare query_service for reusbility [puppet] - 10https://gerrit.wikimedia.org/r/537138 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:17:32] 10Operations, 10ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (10Jclark-ctr) @fgiunchedi The host was put into rack prior to this ticket being made i can move to row D today [13:21:46] (03CR) 10Gehel: query_service: prepare query_service for reusbility (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/537138 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:23:50] (03CR) 10Gehel: [C: 03+2] query_service: rename profile/wdqs to profile/query_service [puppet] - 10https://gerrit.wikimedia.org/r/538849 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:25:07] 10Operations, 10ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (10fgiunchedi) >>! In T237438#5640344, @Jclark-ctr wrote: > @fgiunchedi The host was put into rack prior to this ticket being made i can move to row D today Thanks! Please move to row D [13:25:24] (03CR) 10Gehel: [C: 03+2] query_service: separate categories from main blazegraph profile [puppet] - 10https://gerrit.wikimedia.org/r/539285 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:26:28] (03CR) 10Gehel: [C: 03+2] query_service: properly adapt query_service profile [puppet] - 10https://gerrit.wikimedia.org/r/539513 (https://phabricator.wikimedia.org/T232297) (owner: 10Mathew.onipe) [13:26:33] (03CR) 10Filippo Giunchedi: [C: 03+2] swift: don't clean up swiftrepl user/group [puppet] - 10https://gerrit.wikimedia.org/r/549057 (owner: 10Filippo Giunchedi) [13:26:57] godog: can I merge your patch as well? [13:27:07] gehel: yeah! 
thank you [13:27:09] great timing [13:27:19] :) [13:27:30] done [13:28:27] (CR) Gehel: [C: +2] query_service: properly adapt hiera configs [puppet] - https://gerrit.wikimedia.org/r/539998 (https://phabricator.wikimedia.org/T232297) (owner: Mathew.onipe) [13:31:45] (PS1) Elukey: Include Kerberos profiles in the Analytics infrastructure [puppet] - https://gerrit.wikimedia.org/r/549083 (https://phabricator.wikimedia.org/T237269) [13:37:34] (PS7) RLazarus: httpbb: Create a new Puppet module for httpbb. [puppet] - https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) [13:37:44] (CR) RLazarus: httpbb: Create a new Puppet module for httpbb. (1 comment) [puppet] - https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) (owner: RLazarus) [13:38:11] (CR) jerkins-bot: [V: -1] httpbb: Create a new Puppet module for httpbb. [puppet] - https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) (owner: RLazarus) [13:38:24] (PS2) DCausse: [beta] deployment-eventgate-1 -> deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549055 [13:38:26] (PS3) DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - https://gerrit.wikimedia.org/r/549056 [13:38:27] (PS2) DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [13:39:01] (PS8) RLazarus: httpbb: Create a new Puppet module for httpbb.
[puppet] - 10https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) [13:39:35] (03CR) 10jerkins-bot: [V: 04-1] [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [13:41:46] (03PS3) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [13:43:36] 10Operations, 10Epic, 10Maps (Kartotherian), 10Patch-For-Review: Move Kartotherian and Tilerator to Kubernetes - https://phabricator.wikimedia.org/T216826 (10MSantos) @Mathew.onipe and @Jdforrester-WMF just FYI: I have tested kartotherian with debian buster and upstream mapnik library and it works just fin... [13:46:49] (03CR) 10Giuseppe Lavagetto: [C: 03+1] "LGTM" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) (owner: 10RLazarus) [14:00:54] (03PS9) 10RLazarus: httpbb: Create a new Puppet module for httpbb. [puppet] - 10https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) [14:01:34] (03CR) 10RLazarus: httpbb: Create a new Puppet module for httpbb. (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) (owner: 10RLazarus) [14:01:37] (03CR) 10RLazarus: [C: 03+2] httpbb: Create a new Puppet module for httpbb. 
[puppet] - 10https://gerrit.wikimedia.org/r/548461 (https://phabricator.wikimedia.org/T236699) (owner: 10RLazarus) [14:02:00] (03PS4) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [14:02:02] (03PS4) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [14:07:04] !log jynus@cumin1001 dbctl commit (dc=all): 'Repool db1074 at 50%', diff saved to https://phabricator.wikimedia.org/P9539 and previous config saved to /var/cache/conftool/dbconfig/20191106-140702-jynus.json [14:07:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:10:30] (03PS2) 10Elukey: Include Kerberos profiles in the Analytics infrastructure [puppet] - 10https://gerrit.wikimedia.org/r/549083 (https://phabricator.wikimedia.org/T237269) [14:13:10] (03CR) 10Elukey: [C: 03+2] Include Kerberos profiles in the Analytics infrastructure [puppet] - 10https://gerrit.wikimedia.org/r/549083 (https://phabricator.wikimedia.org/T237269) (owner: 10Elukey) [14:19:22] (03PS2) 10Jcrespo: mariadb: Depool es1019 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549076 [14:20:38] 10Operations, 10Cloud-VPS, 10User-fgiunchedi, 10cloud-services-team (Kanban): CPU scaling governor audit - https://phabricator.wikimedia.org/T225713 (10hashar) @andrew @bd808 @aborrero can you look at updating the bios setting for some of the affected cloudvirt? Based on the benchmark in my previous comm... 
[14:20:45] (03CR) 10Jcrespo: [C: 03+2] mariadb: Depool es1019 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549076 (owner: 10Jcrespo) [14:21:28] (03Merged) 10jenkins-bot: mariadb: Depool es1019 for maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549076 (owner: 10Jcrespo) [14:21:48] (03PS1) 10Jcrespo: Revert "mariadb: Depool es1019 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549089 [14:23:41] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: depool es1019 (duration: 01m 00s) [14:23:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:18] (03CR) 10Gehel: "question in line (open to discussion)" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [14:25:41] (03PS1) 10Jcrespo: mariadb: Repool es1019 with low weight after maintenance [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549090 [14:25:52] (03CR) 10Ottomata: [wdqs] configure eventgate endpoint for sparql/query events (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [14:27:00] !log upgrade and restart es1019 [14:27:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:45] (03CR) 10Ottomata: [C: 03+1] Remove Eventloggging sanitization automation from log databases [puppet] - 10https://gerrit.wikimedia.org/r/549071 (https://phabricator.wikimedia.org/T236818) (owner: 10Elukey) [14:28:07] (03CR) 10Ottomata: [C: 03+2] "Oh! Thank you!" 
[puppet] - 10https://gerrit.wikimedia.org/r/549055 (owner: 10DCausse) [14:28:15] (03PS3) 10Ottomata: [beta] deployment-eventgate-1 -> deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549055 (owner: 10DCausse) [14:28:29] (03CR) 10Ottomata: [V: 03+2 C: 03+2] [beta] deployment-eventgate-1 -> deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549055 (owner: 10DCausse) [14:32:18] elukey: so according to ottomata in fact I should send to eventgate-analytics, and just checked it has a discovery ns entry as well, not sure why I thought it had not [14:34:40] (03PS2) 10Mholloway: MachineVision: Update filtered_tables.txt [puppet] - 10https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) [14:35:53] (03CR) 10Jforrester: [C: 04-2] "This isn't deploy-safe. Deploys are not atomic, they are file-based. On attempted deployment, this would break the canaries and not be dep" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548569 (owner: 10Ammarpad) [14:36:47] (03PS5) 10Mholloway: WikimediaEditorTasks: Enable streaks and revert counts in production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548832 (https://phabricator.wikimedia.org/T234955) [14:36:47] dcausse: ah yes! My bad too, didn't check.. needed more coffee probably! 
[14:37:01] np :) [14:37:09] o/ :) [14:38:18] (CR) Mholloway: [C: +2] WikimediaEditorTasks: Enable streaks and revert counts in production [mediawiki-config] - https://gerrit.wikimedia.org/r/548832 (https://phabricator.wikimedia.org/T234955) (owner: Mholloway) [14:41:18] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings.php: WikimediaEditorTasks: Enable streaks and revert counts (T234955, T234956) (duration: 01m 00s) [14:41:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:41:24] T234955: API for reverts for SE v3 - https://phabricator.wikimedia.org/T234955 [14:41:25] T234956: API for streak for SE v3 - https://phabricator.wikimedia.org/T234956 [14:42:33] (CR) Jcrespo: [C: +2] mariadb: Repool es1019 with low weight after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/549090 (owner: Jcrespo) [14:43:21] (Merged) jenkins-bot: mariadb: Repool es1019 with low weight after maintenance [mediawiki-config] - https://gerrit.wikimedia.org/r/549090 (owner: Jcrespo) [14:43:42] (PS1) Jbond: wmflib - puppet_config: add master section config [puppet] - https://gerrit.wikimedia.org/r/549094 [14:45:07] (CR) jerkins-bot: [V: -1] wmflib - puppet_config: add master section config [puppet] - https://gerrit.wikimedia.org/r/549094 (owner: Jbond) [14:45:33] !log jynus@deploy1001 Synchronized wmf-config/db-eqiad.php: repool es1019 with low weight (duration: 00m 59s) [14:45:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:46:24] (CR) Giuseppe Lavagetto: [C: -1] "If you rebase this patch on top of https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/549059/, you should see how the new c" [deployment-charts] - https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: Jeena Huneidi) [14:48:28] PROBLEM - Check the Netbox report puppetdb for fail status.
on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [14:49:02] (03PS2) 10Jcrespo: Revert "mariadb: Depool es1019 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549089 [14:50:10] (03CR) 10Alexandros Kosiaris: [C: 04-1] Also check charts generated by helmfile (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/549059 (owner: 10Giuseppe Lavagetto) [14:55:14] (03CR) 10Jcrespo: [C: 04-1] "See comment" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) (owner: 10Mholloway) [14:56:24] (03PS1) 10Jbond: taskgen: force colour for rubocop errors [puppet] - 10https://gerrit.wikimedia.org/r/549097 [14:57:34] (03CR) 10Jbond: [C: 03+2] taskgen: force colour for rubocop errors [puppet] - 10https://gerrit.wikimedia.org/r/549097 (owner: 10Jbond) [14:58:40] (03PS2) 10Jbond: wmflib - puppet_config: add master section config [puppet] - 10https://gerrit.wikimedia.org/r/549094 [14:58:59] (03PS3) 10Mholloway: MachineVision: Update filtered_tables.txt [puppet] - 10https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) [14:59:51] (03CR) 10Mholloway: MachineVision: Update filtered_tables.txt (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) (owner: 10Mholloway) [15:00:01] (03CR) 10jerkins-bot: [V: 04-1] wmflib - puppet_config: add master section config [puppet] - 10https://gerrit.wikimedia.org/r/549094 (owner: 10Jbond) [15:00:04] mdholloway: Dear deployers, time to do the Enable MachineVision on testcommonswiki deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1500). [15:00:22] (03CR) 10Volans: "I've commented in the homer's related CR, and I'm looking at the data to see if we can simplify a bit this." 
(031 comment) [homer/public] - 10https://gerrit.wikimedia.org/r/547584 (owner: 10Ayounsi) [15:01:05] (03CR) 10Jcrespo: [C: 03+2] MachineVision: Update filtered_tables.txt [puppet] - 10https://gerrit.wikimedia.org/r/548886 (https://phabricator.wikimedia.org/T227349) (owner: 10Mholloway) [15:02:21] (03PS3) 10Jbond: wmflib - puppet_config: add master section config [puppet] - 10https://gerrit.wikimedia.org/r/549094 [15:06:07] 10Operations, 10SRE-tools, 10User-jbond: Puppet compiler: abort on git rebase conflict - https://phabricator.wikimedia.org/T157001 (10jbond) [15:06:56] (03CR) 10Jbond: [C: 03+2] wmflib - puppet_config: add master section config [puppet] - 10https://gerrit.wikimedia.org/r/549094 (owner: 10Jbond) [15:10:43] (03PS1) 10Phedenskog: Grafana alert for WebPageReplay enwiki tests [puppet] - 10https://gerrit.wikimedia.org/r/549098 (https://phabricator.wikimedia.org/T237308) [15:10:55] (03PS2) 10Mholloway: MachineVision: Do not restrict to testing users in Beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548868 [15:10:57] (03PS4) 10Mholloway: Configure and enable MachineVision on testcommonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548869 (https://phabricator.wikimedia.org/T227349) [15:15:00] (03CR) 10Phedenskog: "Thank you Filippo for noticing that I missed to update in T237515. I've removed the old alerts and prepared for the new structure but forg" [puppet] - 10https://gerrit.wikimedia.org/r/549098 (https://phabricator.wikimedia.org/T237308) (owner: 10Phedenskog) [15:15:02] (03PS5) 10Mholloway: Configure and enable MachineVision on testcommonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548869 (https://phabricator.wikimedia.org/T227349) [15:17:30] 10Operations, 10SRE-tools, 10Patch-For-Review, 10User-jbond: sre.hosts.downtime fails with "No hosts provided" - https://phabricator.wikimedia.org/T236684 (10Volans) 05Open→03Resolved We've migrated to the new puppetdb hosts with the newer version. 
The queue size is under control for now. Resolving. [15:18:48] (03CR) 10Alexandros Kosiaris: [C: 04-2] "It's already part of the scap class (which gets included via scap::target), no need for this (plus it would probably cause a duplicate dec" [puppet] - 10https://gerrit.wikimedia.org/r/547778 (https://phabricator.wikimedia.org/T235013) (owner: 1020after4) [15:20:31] (03CR) 10Filippo Giunchedi: [C: 03+2] "> Patch Set 1:" [puppet] - 10https://gerrit.wikimedia.org/r/549098 (https://phabricator.wikimedia.org/T237308) (owner: 10Phedenskog) [15:20:53] (03Abandoned) 10Jbond: Revert "puppetdb6: update config to use the new puppetdb6 servers" [puppet] - 10https://gerrit.wikimedia.org/r/548256 (owner: 10Jbond) [15:24:32] (03PS1) 10Mholloway: MachineVision: Fix Beta config with updated service name [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549099 [15:25:03] 10Operations, 10SRE-tools, 10puppet-compiler, 10User-jbond: Puppet compiler: re-add the concurrency option NUM_THREADS - https://phabricator.wikimedia.org/T157002 (10jbond) [15:26:37] (03CR) 10Mholloway: [C: 03+2] MachineVision: Fix Beta config with updated service name [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549099 (owner: 10Mholloway) [15:26:52] (03CR) 10Alexandros Kosiaris: [C: 03+1] "Let us know when this will be deployed. There is the blocker of creating namespaces as well (another change in this repo and a deployment)" [deployment-charts] - 10https://gerrit.wikimedia.org/r/547307 (https://phabricator.wikimedia.org/T236386) (owner: 10Ottomata) [15:29:21] 10Operations, 10Puppet, 10puppet-compiler, 10User-jbond: puppet master command will be removed in puppet 6 - https://phabricator.wikimedia.org/T236373 (10jbond) [15:30:04] mdholloway: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Enable MachineVision on testcommonswiki. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1530). 
[15:30:29] (03CR) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [15:30:33] (03CR) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [15:31:02] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision: Fix Beta config with updated service name (duration: 01m 02s) [15:31:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:31:09] (03PS5) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [15:31:11] (03PS5) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [15:31:38] (03CR) 10Ottomata: [wdqs] configure eventgate endpoint for sparql/query events (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [15:33:16] (03CR) 10jerkins-bot: [V: 04-1] [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [15:35:10] (03PS9) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [15:36:21] (03CR) 10jerkins-bot: [V: 04-1] puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) (owner: 10Jbond) [15:38:02] (03PS10) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [15:38:27] (03PS3) 10Mholloway: 
MachineVision: Do not restrict to testing users in Beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548868 [15:38:36] (03CR) 10Ottomata: [beta] configure sparql/query logging to deployment-eventgate-3 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [15:38:44] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [15:39:11] (03CR) 10jerkins-bot: [V: 04-1] puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) (owner: 10Jbond) [15:40:37] (03CR) 10Mholloway: [C: 03+2] MachineVision: Do not restrict to testing users in Beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548868 (owner: 10Mholloway) [15:40:49] (03PS11) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [15:41:54] (03CR) 10jerkins-bot: [V: 04-1] puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) (owner: 10Jbond) [15:42:25] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision: Do not restrict to testing users on Beta (duration: 01m 00s) [15:42:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:43:38] (03PS2) 10Mholloway: MachineVision: Use an HTTP proxy in production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/547741 (https://phabricator.wikimedia.org/T236797) [15:43:58] jbond42: fyi because until about a minute from now I'm the only one who hasn't filtered cron spam away -- there are a hojillion mails about "ERROR puppetlabs.facter - error while resolving custom fact "puppet_config": undefined local variable or method `sections' for #" 
[15:43:59] is that https://gerrit.wikimedia.org/r/c/operations/puppet/+/549094 ? [15:44:49] ah whoops I just missed it in -sre, ignore :) [15:44:50] (03CR) 10Mholloway: [C: 03+2] MachineVision: Use an HTTP proxy in production [mediawiki-config] - 10https://gerrit.wikimedia.org/r/547741 (https://phabricator.wikimedia.org/T236797) (owner: 10Mholloway) [15:45:56] rlazarus: you're not the only one :) [15:47:13] !log mholloway-shell@deploy1001 Synchronized wmf-config/CommonSettings.php: MachineVision: Use an HTTP proxy in production (T236843) (duration: 01m 01s) [15:47:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:47:18] (03PS1) 10Jbond: wmflib: fix typo in puppet_config fact [puppet] - 10https://gerrit.wikimedia.org/r/549101 [15:47:18] T236843: Add HTTP proxy support to the MachineVision extension - https://phabricator.wikimedia.org/T236843 [15:47:47] (03PS6) 10Mholloway: Configure and enable MachineVision on testcommonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548869 (https://phabricator.wikimedia.org/T227349) [15:50:05] (03CR) 10Mholloway: [C: 03+2] Configure and enable MachineVision on testcommonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548869 (https://phabricator.wikimedia.org/T227349) (owner: 10Mholloway) [15:50:29] (03CR) 10Jbond: [C: 03+2] wmflib: fix typo in puppet_config fact [puppet] - 10https://gerrit.wikimedia.org/r/549101 (owner: 10Jbond) [15:51:58] (03PS1) 10Elukey: Deploy Kerberos keytabs to Analytics Hadoop hosts [puppet] - 10https://gerrit.wikimedia.org/r/549103 (https://phabricator.wikimedia.org/T237269) [15:52:21] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Configure MachineVision and enable on testcommonswiki (T227349) (duration: 01m 00s) [15:52:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:52:25] T227349: Deploy the MachineVision extension to production - https://phabricator.wikimedia.org/T227349 
[15:56:44] Urbanecm: the autocreateaccount thing works [15:57:04] though not sure if thanks to your code, the global group or both :) [15:58:05] (03PS12) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [15:58:45] !log created MachineVision tables on testcommonswiki (T227349) [15:58:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:58:50] T227349: Deploy the MachineVision extension to production - https://phabricator.wikimedia.org/T227349 [15:59:30] (03CR) 10jerkins-bot: [V: 04-1] puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) (owner: 10Jbond) [16:00:16] (03CR) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [16:00:24] (03PS6) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [16:00:26] (03PS6) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [16:01:55] (03CR) 10Ottomata: [C: 03+1] [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [16:02:34] (03CR) 10Ottomata: [C: 03+1] [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) (owner: 10DCausse) [16:02:41] (03CR) 10jerkins-bot: [V: 04-1] [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [16:08:47] (03PS13) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 
10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [16:09:46] Urbanecm, hauskater: are you aware that currently all users get accounts automatically on closed wikis? For example I just got an account on closed gotwikibooks (https://got.wikibooks.org/wiki/Special:Log) [16:10:19] tassu: are you in any global groups? [16:10:28] 2fa testers, nothing else [16:10:55] It looks like Urbanecm's patches need to be reverted then [16:11:05] (03CR) 10Jcrespo: "The proposal is ok (splitting labswiki section and wiki name is way less confusing and less prone to error), the s-names are a bit ugly per" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/547596 (owner: 10Jforrester) [16:11:19] !log MachineVision: Imported Freebase to Wikidata ID mappings on testcommonswiki (T227349) [16:11:19] tassu: I assume you cannot edit over there, right? [16:11:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:11:23] T227349: Deploy the MachineVision extension to production - https://phabricator.wikimedia.org/T227349 [16:11:26] tassu: hauskater: I'll check it soon [16:11:29] in a meeting rn [16:11:37] Alright [16:12:27] correct, can't edit anything (at least there, haven't checked others) [16:12:52] Great, thanks for checking [16:14:20] (03CR) 10Jcrespo: "The proposal is ok (splitting labswiki section and wiki name is way less confusing and less prone to error), the s-names are a bit ugly per" (032 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/547596 (owner: 10Jforrester) [16:15:08] (03PS1) 10Ottomata: Add schema.discovery.wmnet [dns] - 10https://gerrit.wikimedia.org/r/549105 (https://phabricator.wikimedia.org/T233630) [16:15:24] (03PS1) 10Ottomata: Set up schema.discovery.wmnet [puppet] - 10https://gerrit.wikimedia.org/r/549106 (https://phabricator.wikimedia.org/T233630) [16:16:10] (03CR) 10Volans: "Some comment inline." 
(036 comments) [puppet] - 10https://gerrit.wikimedia.org/r/544214 (owner: 10Jbond) [16:16:33] Reedy: it sounded like you and Bryan cooked up a scheme for moving those hiera pages to 'obsolete' and marking them read-only with an explanation? I have an OpenStackManager patch that does that which I can just merge, but if you're interested in doing the namespace juggling instead, that will let me prune out a bit more code. Do you have a preference? [16:17:34] (03PS1) 10Bstorm: toolforge: Distribute the roles for toolforge users [puppet] - 10https://gerrit.wikimedia.org/r/549108 (https://phabricator.wikimedia.org/T227290) [16:17:58] 10Operations, 10Analytics, 10Analytics-EventLogging, 10Event-Platform, and 4 others: Public schema.wikimedia.org endpoint for schema.svc - https://phabricator.wikimedia.org/T233630 (10Ottomata) Ok @ema @akosiaris patches for schema.discovery.wmnet ^. Not sure if I need this but I think it would be good fo... [16:18:13] (03CR) 10jerkins-bot: [V: 04-1] toolforge: Distribute the roles for toolforge users [puppet] - 10https://gerrit.wikimedia.org/r/549108 (https://phabricator.wikimedia.org/T227290) (owner: 10Bstorm) [16:20:10] (03PS2) 10Bstorm: toolforge: Distribute the roles for toolforge users [puppet] - 10https://gerrit.wikimedia.org/r/549108 (https://phabricator.wikimedia.org/T227290) [16:22:59] (03CR) 10Arturo Borrero Gonzalez: [C: 03+1] toolforge: Distribute the roles for toolforge users [puppet] - 10https://gerrit.wikimedia.org/r/549108 (https://phabricator.wikimedia.org/T227290) (owner: 10Bstorm) [16:23:34] 10Operations, 10ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (10RobH) @Jclark-ctr: Once all three hosts are racked, and mgmt is online and accessible, please reassign this task to me for the OS installs. Thanks! 
[16:26:49] (03CR) 10Bstorm: [C: 03+2] toolforge: Distribute the roles for toolforge users [puppet] - 10https://gerrit.wikimedia.org/r/549108 (https://phabricator.wikimedia.org/T227290) (owner: 10Bstorm) [16:27:12] (03PS19) 10CRusnov: backends: add Netbox backend [software/cumin] - 10https://gerrit.wikimedia.org/r/514840 (https://phabricator.wikimedia.org/T205900) [16:28:57] (03PS20) 10CRusnov: backends: add Netbox backend [software/cumin] - 10https://gerrit.wikimedia.org/r/514840 (https://phabricator.wikimedia.org/T205900) [16:29:28] (03PS5) 10Herron: Introduce Elastic 7 support [puppet] - 10https://gerrit.wikimedia.org/r/545867 (https://phabricator.wikimedia.org/T234854) (owner: 10Filippo Giunchedi) [16:29:32] andrewbogott: Do you know if this page ever worked for the Hiera NS? https://wikitech.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=666 [16:30:00] (03CR) 10CRusnov: backends: add Netbox backend (035 comments) [software/cumin] - 10https://gerrit.wikimedia.org/r/514840 (https://phabricator.wikimedia.org/T205900) (owner: 10CRusnov) [16:30:12] Reedy: I don't know. I've been using https://wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=&namespace=666&hideredirects=1 [16:30:25] Reedy: complicating this is that there currently are only deleted pages in that namespace. [16:30:31] Ohhh [16:30:37] That explains why they're not showing [16:30:37] Haha [16:30:39] It's vaguely useful to allow people to see the history of those pages though [16:31:10] But what I really want to avoid is folks creating new ones (or undeleting) and expecting them to do things [16:31:14] Oooh. 
I didn't realise they were deleted [16:31:34] https://wikitech.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=667 [16:31:36] Out of laziness I deleted them as I moved them over to horizon [16:31:40] :) [16:31:46] We should probably clean up/delete the few talk pages too [16:31:50] yeah [16:32:02] (I'm about to become distracted by the tech update) [16:32:11] Ditto [16:32:14] Talk after :) [16:32:17] ok! [16:32:31] (03CR) 10Elukey: [C: 03+2] Deploy Kerberos keytabs to Analytics Hadoop hosts [puppet] - 10https://gerrit.wikimedia.org/r/549103 (https://phabricator.wikimedia.org/T237269) (owner: 10Elukey) [16:35:06] (03CR) 10Herron: "Updated the ExecStart and ES_JVM_OPTIONS in the systemd template to include the instance name." [puppet] - 10https://gerrit.wikimedia.org/r/545867 (https://phabricator.wikimedia.org/T234854) (owner: 10Filippo Giunchedi) [16:35:55] 10Operations, 10Traffic, 10netops, 10observability: Network port utilization alerts should be paging - https://phabricator.wikimedia.org/T224888 (10ayounsi) [16:37:35] 10Operations, 10observability, 10Patch-For-Review, 10Performance-Team (Radar): Fully migrate producers off statsd - https://phabricator.wikimedia.org/T205870 (10colewhite) [16:38:49] PROBLEM - Check the Netbox report puppetdb for fail status. 
on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [16:43:06] (03Abandoned) 1020after4: Add git::lfs on design/style-guide targets [puppet] - 10https://gerrit.wikimedia.org/r/547778 (https://phabricator.wikimedia.org/T235013) (owner: 1020after4) [16:45:59] 10Operations, 10Icinga: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Jgreen) p:05Normal→03Triage [16:48:58] (03PS6) 10Herron: Introduce Elastic 7 support [puppet] - 10https://gerrit.wikimedia.org/r/545867 (https://phabricator.wikimedia.org/T234854) (owner: 10Filippo Giunchedi) [16:51:15] 10Operations, 10Icinga: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10MoritzMuehlenhoff) @Dwisehaupt To access Icinga you need to be in cn=wmf, which will also grant you access to a number of other services: https://wikitech.wikimedia.org/wiki/LDAP/Groups [16:51:34] 10Operations, 10Icinga, 10LDAP-Access-Requests: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10MoritzMuehlenhoff) [16:58:39] Reedy: In the meantime Bryan merged my OSM patch, so we can probably ignore the namespace change for now unless you're truly interested :) [16:58:47] heh [16:59:33] andrewbogott: If they're all already deleted... We might need to do stuff in the DB... We should probably deal with the Hiera talk pages before that patch goes live [16:59:51] Reedy: what would need to happen in the DB? [17:00:05] MaxSem, RoanKattouw, Niharika, and Urbanecm: Dear deployers, time to do the Morning SWAT (Max 6 patches) deploy. Don't look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1700). [17:00:05] No GERRIT patches in the queue for this window AFAICS. [17:00:23] Changing NS/page names to "move" the pages [17:00:31] ah, ok [17:00:59] well… let's just skip it then. 
The history there will decrease in value as time goes on… in six months or so we can just delete the namespace entirely and declare victory. [17:01:06] I'll delete those talk pages now [17:01:11] (03PS7) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [17:01:14] (03PS7) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [17:01:18] (03PS1) 10RLazarus: httpbb: Add httpbb module and baseurls test suite cribbed from apache-fast-test. [puppet] - 10https://gerrit.wikimedia.org/r/549116 [17:02:38] (03PS2) 10Jforrester: Split out DB-related concerns for wikitech and test wikitech into s-wikitech, s-wikitech-wmcs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/547596 [17:03:26] (03CR) 10jerkins-bot: [V: 04-1] [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [17:03:32] 10Operations, 10Icinga, 10LDAP-Access-Requests: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Dwisehaupt) @MoritzMuehlenhoff Thanks. Is there something else I need to do to request this or will this ticket suffice? [17:05:18] (03PS4) 10Jforrester: Enable WebAuthn on all beta wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/546975 (https://phabricator.wikimedia.org/T227242) (owner: 10Reedy) [17:06:30] (03CR) 10Jforrester: [C: 03+2] "Vendor is live. Let's roll." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/546975 (https://phabricator.wikimedia.org/T227242) (owner: 10Reedy) [17:07:21] (03Merged) 10jenkins-bot: Enable WebAuthn on all beta wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/546975 (https://phabricator.wikimedia.org/T227242) (owner: 10Reedy) [17:08:53] !log jynus@cumin1001 dbctl commit (dc=all): 'Repool db1074 fully', diff saved to https://phabricator.wikimedia.org/P9541 and previous config saved to /var/cache/conftool/dbconfig/20191106-170852-jynus.json [17:08:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:09:54] !log jforrester@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Set wmgUseWebAuthn false in all of production T227242 (duration: 01m 01s) [17:09:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:09:58] (03PS3) 10Jcrespo: Revert "mariadb: Depool es1019 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549089 [17:09:58] T227242: Deploy WebAuthn to Wikimedia Wikis - https://phabricator.wikimedia.org/T227242 [17:11:30] !log jforrester@deploy1001 Synchronized wmf-config/CommonSettings.php: Enable WebAuthn extension if wmgUseWebAuthn is set (false in all of production) T227242 (duration: 01m 00s) [17:11:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:59] (03CR) 10Jcrespo: [C: 03+2] Revert "mariadb: Depool es1019 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549089 (owner: 10Jcrespo) [17:12:42] (03Merged) 10jenkins-bot: Revert "mariadb: Depool es1019 for maintenance" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549089 (owner: 10Jcrespo) [17:14:12] (03PS10) 10Jbond: puppet-merge: refactor [puppet] - 10https://gerrit.wikimedia.org/r/544214 [17:15:25] (03CR) 10jerkins-bot: [V: 04-1] puppet-merge: refactor [puppet] - 10https://gerrit.wikimedia.org/r/544214 (owner: 10Jbond) [17:15:27] !log jynus@deploy1001 Synchronized 
wmf-config/db-eqiad.php: repool es1019 fully (duration: 00m 59s) [17:15:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:17:17] (03CR) 10Jbond: "thanks" (036 comments) [puppet] - 10https://gerrit.wikimedia.org/r/544214 (owner: 10Jbond) [17:18:51] 10Operations, 10Icinga, 10LDAP-Access-Requests: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10MoritzMuehlenhoff) That's all, I'll add you tomorrow to the group. [17:19:07] (03CR) 10Dzahn: "oh. that was a nice outcome then:)" [puppet] - 10https://gerrit.wikimedia.org/r/547778 (https://phabricator.wikimedia.org/T235013) (owner: 1020after4) [17:19:43] 10Operations, 10Icinga, 10LDAP-Access-Requests: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Dwisehaupt) Wonderful, thanks again. [17:20:55] (03CR) 10Jbond: "not sure why im getting" [puppet] - 10https://gerrit.wikimedia.org/r/544214 (owner: 10Jbond) [17:22:29] (03PS2) 10Giuseppe Lavagetto: Also check charts generated by helmfile [deployment-charts] - 10https://gerrit.wikimedia.org/r/549059 [17:22:36] !log jynus@cumin1001 dbctl commit (dc=all): 'Reduce db1126 weight, too much backlog', diff saved to https://phabricator.wikimedia.org/P9542 and previous config saved to /var/cache/conftool/dbconfig/20191106-172235-jynus.json [17:22:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:22:42] 10Operations, 10Icinga, 10LDAP-Access-Requests: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Dzahn) That only gives read access to the web interface though. To run commands like scheduling a downtime for a server or ACKing alerts and also the notification... 
[17:23:34] (03PS14) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [17:27:44] (03CR) 10Ottomata: [C: 03+2] Add eventgate-logging-external instance [deployment-charts] - 10https://gerrit.wikimedia.org/r/547307 (https://phabricator.wikimedia.org/T236386) (owner: 10Ottomata) [17:27:50] (03PS8) 10Ottomata: Add eventgate-logging-external instance [deployment-charts] - 10https://gerrit.wikimedia.org/r/547307 (https://phabricator.wikimedia.org/T236386) [17:27:52] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Add eventgate-logging-external instance [deployment-charts] - 10https://gerrit.wikimedia.org/r/547307 (https://phabricator.wikimedia.org/T236386) (owner: 10Ottomata) [17:31:21] 10Operations, 10Puppet, 10DBA, 10Patch-For-Review, 10User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (10jbond) @Eevans moritz mentioned there maybe some cassandra consideration to take into account and you could enlighten me as to what they are :) [17:39:38] 10Operations, 10Machine vision, 10serviceops, 10Product-Infrastructure-Team-Backlog (Kanban): Configure Google Cloud Vision credentials in production - https://phabricator.wikimedia.org/T236426 (10Joe) Hi, the credentials file should be stored, as any secret that needs to be accessed by MediaWiki, within i... 
[17:51:20] (03PS1) 10Jbond: wmflib puppet_config fact: correct typo setting vs settings [puppet] - 10https://gerrit.wikimedia.org/r/549128 [17:54:13] (03PS1) 10Dzahn: icinga: add dwisehaupt to fr-tech-ops contact group [puppet] - 10https://gerrit.wikimedia.org/r/549129 (https://phabricator.wikimedia.org/T235676) [17:54:25] (03CR) 10Jbond: [C: 03+2] wmflib puppet_config fact: correct typo setting vs settings [puppet] - 10https://gerrit.wikimedia.org/r/549128 (owner: 10Jbond) [17:55:27] (03PS3) 10Bstorm: cloud: Replace SSHSessions diamond collector with prometheus [puppet] - 10https://gerrit.wikimedia.org/r/543268 (https://phabricator.wikimedia.org/T210993) (owner: 10BryanDavis) [17:58:13] (03CR) 10Dzahn: [C: 03+2] icinga: add dwisehaupt to fr-tech-ops contact group [puppet] - 10https://gerrit.wikimedia.org/r/549129 (https://phabricator.wikimedia.org/T235676) (owner: 10Dzahn) [18:04:32] (03Abandoned) 10Jgreen: Modify secret.rb to accept a file list and use first match, like http://www.puppetcookbook.com/posts/select-a-file-based-on-a-fact.html [puppet] - 10https://gerrit.wikimedia.org/r/294331 (owner: 10Jgreen) [18:05:45] (03CR) 10Dwisehaupt: [C: 03+1] icinga: add dwisehaupt to fr-tech-ops contact group [puppet] - 10https://gerrit.wikimedia.org/r/549129 (https://phabricator.wikimedia.org/T235676) (owner: 10Dzahn) [18:09:47] 10Operations, 10Puppet, 10DBA, 10Patch-For-Review, 10User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (10CDanis) [18:10:06] 10Operations, 10Puppet, 10DBA, 10Patch-For-Review, 10User-jbond: Document all uses of the puppetCA certificate - https://phabricator.wikimedia.org/T237259 (10CDanis) [18:11:04] (03PS1) 10Dzahn: icinga: authorize Dwisehaupt to run commands [puppet] - 10https://gerrit.wikimedia.org/r/549133 (https://phabricator.wikimedia.org/T235676) [18:14:13] (03PS1) 10Elukey: role::analytics_cluster::coordinator: add missing kerberos profile [puppet] - 
10https://gerrit.wikimedia.org/r/549134 [18:14:33] (03CR) 10Elukey: [C: 03+2] role::analytics_cluster::coordinator: add missing kerberos profile [puppet] - 10https://gerrit.wikimedia.org/r/549134 (owner: 10Elukey) [18:14:34] (03PS11) 10Jbond: puppet-merge: refactor [puppet] - 10https://gerrit.wikimedia.org/r/544214 [18:14:36] (03PS2) 10Jbond: puppet-merge: add Repository class [puppet] - 10https://gerrit.wikimedia.org/r/544943 [18:16:01] (03CR) 10jerkins-bot: [V: 04-1] puppet-merge: add Repository class [puppet] - 10https://gerrit.wikimedia.org/r/544943 (owner: 10Jbond) [18:16:29] hauskater: I can't reproduce [18:16:45] tassu: ^^ [18:17:59] Just tried it in wikimania2008wiki and it created an account for me (https://wikimania2008.wikimedia.org/wiki/Special:Log) [18:18:18] I'll check the logs [18:20:10] tassu: could you temporarily disable 2FA at your account to test if it has any effect? [18:20:30] i'll try [18:20:36] that would be a really weird side effect [18:21:43] (03CR) 10Bstorm: [C: 03+2] cloud: Replace SSHSessions diamond collector with prometheus [puppet] - 10https://gerrit.wikimedia.org/r/543268 (https://phabricator.wikimedia.org/T210993) (owner: 10BryanDavis) [18:21:48] Urbanecm: still happening, at least in wikimania2009wiki (https://wikimania2009.wikimedia.org/wiki/Special:Log) [18:21:54] Interesting [18:21:57] let me check... [18:22:30] it's a little funky and might sometimes only create it on second or third page view [18:22:38] tassu: so it doesn't work every time? 
[18:22:47] jouncebot: now [18:22:47] No deployments scheduled for the next 0 hour(s) and 37 minute(s) [18:22:49] 60% of the times it works every time [18:22:50] jouncebot: next [18:22:50] In 0 hour(s) and 37 minute(s): Pre MediaWiki train sanity break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1900) [18:23:49] (03PS8) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [18:23:51] (03PS8) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [18:24:34] (03CR) 10jerkins-bot: [V: 04-1] [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 (owner: 10DCausse) [18:26:10] (03PS1) 10Urbanecm: Instrument logging to ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549138 [18:26:33] (03PS2) 10Urbanecm: Instrument logging to ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549138 (https://phabricator.wikimedia.org/T222117) [18:26:59] (03CR) 10Urbanecm: [C: 03+2] Instrument logging to ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549138 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [18:27:26] RECOVERY - Check systemd state on stat1007 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [18:27:47] (03Merged) 10jenkins-bot: Instrument logging to ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549138 (https://phabricator.wikimedia.org/T222117) (owner: 10Urbanecm) [18:28:33] !log urbanecm@deploy1001 Synchronized wmf-config/CommonSettings.php: SWAT: Instrument logging to ClosedWikiProvider (T222117) (duration: 01m 01s) [18:28:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:28:38] T222117: 
Create accounts for new stewards in closed wikis - https://phabricator.wikimedia.org/T222117 [18:28:46] tassu: could you check once more? [18:28:51] I've added more logging [18:29:06] sure, do you want me to do that on mwdebug1001 or on any server? [18:29:27] tassu: I've synced that to the world [18:30:19] Urbanecm: checked on wikimania2010wiki [18:30:22] still happening [18:30:29] I've just added logging [18:31:28] 10Operations, 10Icinga, 10LDAP-Access-Requests, 10Patch-For-Review: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Dzahn) a:03Dzahn [18:32:06] Urbanecm: you have a typo on your patch ("$loger->debug" instead of "$logger->debug") [18:32:26] (03PS1) 10CDanis: profile::swap: rsync::server::module::hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549142 [18:33:19] lol [18:33:54] (03CR) 10Bstorm: toolforge: proxy: adjust setup for the new k8s cluster (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/543135 (https://phabricator.wikimedia.org/T234037) (owner: 10Arturo Borrero Gonzalez) [18:33:54] tassu: ehh... 
[18:34:12] (03PS1) 10Dzahn: admins: add Dallas Wisehaupt to ldap_only_admins (wmf) [puppet] - 10https://gerrit.wikimedia.org/r/549143 (https://phabricator.wikimedia.org/T235676) [18:34:38] Urbanecm: you will still get a log entry from that line, just not something that you wanted [18:34:49] !log urbanecm@deploy1001 Synchronized wmf-config/CommonSettings.php: Fix typo (T222117) (duration: 01m 00s) [18:34:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:34:54] T222117: Create accounts for new stewards in closed wikis - https://phabricator.wikimedia.org/T222117 [18:35:19] (03CR) 10Elukey: [C: 03+2] profile::swap: rsync::server::module::hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549142 (owner: 10CDanis) [18:36:37] (03PS1) 10Urbanecm: Fix typo in ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549148 [18:36:51] (03CR) 10CDanis: [C: 03+2] "PCC verifies fixed https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/19272/console" [puppet] - 10https://gerrit.wikimedia.org/r/549142 (owner: 10CDanis) [18:37:00] Urbanecm: want me to test again? 
[18:37:19] (03CR) 10Urbanecm: [C: 03+2] Fix typo in ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549148 (owner: 10Urbanecm) [18:37:24] tassu: yes please [18:38:05] (03Merged) 10jenkins-bot: Fix typo in ClosedWikiProvider [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549148 (owner: 10Urbanecm) [18:38:14] Urbanecm: wikimania2011wiki [18:39:05] (03CR) 10Dzahn: [C: 03+2] admins: add Dallas Wisehaupt to ldap_only_admins (wmf) [puppet] - 10https://gerrit.wikimedia.org/r/549143 (https://phabricator.wikimedia.org/T235676) (owner: 10Dzahn) [18:39:17] (03PS2) 10Dzahn: admins: add Dallas Wisehaupt to ldap_only_admins (wmf) [puppet] - 10https://gerrit.wikimedia.org/r/549143 (https://phabricator.wikimedia.org/T235676) [18:39:28] 10Operations: stunnel-wrap all rsync::server usage - https://phabricator.wikimedia.org/T237424 (10CDanis) Another thing that just came up: not all users of `rsync::server::module` are actually passing an array to the `$hosts_allow` argument: https://gerrit.wikimedia.org/r/c/operations/puppet/+/549142 Need to go... [18:40:05] tassu: could you check at mwdebug1001 this time please? [18:41:26] Urbanecm: done in wikimania2013wiki [18:41:52] why is there no log entry for your account... [18:42:02] no idea [18:42:40] you remember that my wiki username is "Majavah" not "tassu", correct? [18:42:49] I'm viewing all logs in real time [18:42:58] I'll note that in the task [18:43:18] !log LDAP - add dwisehaupt to wmf group (T235676) [18:43:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:43:23] T235676: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 [18:44:06] Urbanecm: check wikimania2014wiki also, just autocreated there also [18:45:27] it's strange... 
I'll look into that later [18:45:37] leaving that in place, given it only creates a powerless account [18:47:54] (03PS1) 10CDanis: rsync::server::module: type annotations for hosts_allow/deny [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) [18:49:47] (03CR) 10Muehlenhoff: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/549133 (https://phabricator.wikimedia.org/T235676) (owner: 10Dzahn) [18:49:56] (03CR) 10jerkins-bot: [V: 04-1] rsync::server::module: type annotations for hosts_allow/deny [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [18:50:49] oh Urbanecm I've created that global group too, when you need it, I can add you to it [18:50:52] RECOVERY - Long running screen/tmux on snapshot1008 is OK: OK: No SCREEN or tmux processes detected. https://wikitech.wikimedia.org/wiki/Monitoring/Long_running_screens [18:50:52] RECOVERY - Long running screen/tmux on snapshot1005 is OK: OK: No SCREEN or tmux processes detected. 
https://wikitech.wikimedia.org/wiki/Monitoring/Long_running_screens [18:55:52] (03PS6) 10Cwhite: mtail,profile: add smtp metrics collection with mtail [puppet] - 10https://gerrit.wikimedia.org/r/546290 (https://phabricator.wikimedia.org/T236505) [18:56:59] 10Operations, 10Analytics, 10Analytics-EventLogging, 10Event-Platform, and 4 others: Public schema.wikimedia.org endpoint for schema.svc - https://phabricator.wikimedia.org/T233630 (10Ottomata) a:03Ottomata [18:57:01] (03CR) 10Cwhite: [C: 03+2] "PCC looks good https://puppet-compiler.wmflabs.org/compiler1001/19273/" [puppet] - 10https://gerrit.wikimedia.org/r/546290 (https://phabricator.wikimedia.org/T236505) (owner: 10Cwhite) [18:57:10] 10Operations, 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, and 5 others: Public schema.wikimedia.org endpoint for schema.svc - https://phabricator.wikimedia.org/T233630 (10Ottomata) [18:59:35] hauskater: great, could you add Martin Urbanec (test) to it please? [18:59:51] yup [19:00:04] Deploy window Pre MediaWiki train sanity break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T1900) [19:01:17] done [19:02:04] 10Operations, 10observability: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10ayounsi) p:05Triage→03Normal [19:03:37] thank you [19:05:14] !log mw1225 - re-enabling puppet (no reason given, nothing in SAL or Phab but disabled) [19:05:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:47] (03PS4) 10Cwhite: profile: get exim metrics from lists [puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) [19:07:07] (03PS1) 10Ottomata: Add schema.wikimedia.org [dns] - 10https://gerrit.wikimedia.org/r/549173 (https://phabricator.wikimedia.org/T233630) [19:07:45] (03PS5) 10Cwhite: profile: get exim metrics from lists [puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) 
[19:08:10] (03CR) 10Cwhite: "Ready for review." [puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) (owner: 10Cwhite) [19:08:25] 10Operations, 10Machine vision, 10serviceops, 10Product-Infrastructure-Team-Backlog (Kanban): Configure Google Cloud Vision credentials in production - https://phabricator.wikimedia.org/T236426 (10Mholloway) @Joe Thanks, that's helpful. I found the private repo you're referring to, and the credentials can... [19:12:06] (03PS1) 10Ottomata: Set up cache routing for schema.wikimedia.org [puppet] - 10https://gerrit.wikimedia.org/r/549177 (https://phabricator.wikimedia.org/T233630) [19:13:04] (03CR) 10Ottomata: Set up cache routing for schema.wikimedia.org (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/549177 (https://phabricator.wikimedia.org/T233630) (owner: 10Ottomata) [19:15:53] (03PS1) 10Cwhite: prometheus: add lists server mtail scrape to mtail jobs [puppet] - 10https://gerrit.wikimedia.org/r/549179 (https://phabricator.wikimedia.org/T236505) [19:19:13] 10Operations, 10observability: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) going through: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Troubleshooting_Commands - Does IPMI works locally? No, it's "busy". ` [mw1247:~] $ sudo ipmi... [19:21:54] 10Operations, 10observability: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) Can't ssh to mgmt to do racreset. Needs help from onsite then. adding dcops. 
[19:22:11] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) [19:22:45] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) a:03Jclark-ctr [19:22:45] ACKNOWLEDGEMENT - ElasticSearch shard size check - 9243 on search.svc.codfw.wmnet is CRITICAL: CRITICAL - commonswiki_content_1556151793(71.66666666666667gb) Mathew.onipe phabricator.wikimedia.org/T231446 - The acknowledgement expires at: 2019-11-09 19:22:19. https://wikitech.wikimedia.org/wiki/Search%23If_it_has_been_indexed [19:23:54] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) p:05Normal→03Low @Jclark-ctr Please see if you can reset the DRAC. If the server needs to go down please ping me or somebody else in serviceops to depool it. [19:43:11] 10Operations, 10MediaWiki-Shell: Update limit.sh to support systemd-based cgroup management - https://phabricator.wikimedia.org/T136603 (10CCicalese_WMF) This does not appear to be work for #core_platform_team. [19:44:32] Urbanecm: did you see errors at https://logstash.wikimedia.org/goto/3ff3c4ae0a98ca6cc0013f169991279d ? [20:00:04] twentyafterfour and hashar: #bothumor Q:Why did functions stop calling each other? A:They had arguments. Rise for MediaWiki train - American version . (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T2000). 
[20:00:33] in chat with tyler [20:05:09] (03PS2) 10CDanis: rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) [20:07:15] (03CR) 10jerkins-bot: [V: 04-1] rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [20:09:02] (03PS3) 10CDanis: rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) [20:09:45] jynus: seeing that for first time now, why should that be caused by me? Am i missing something? [20:09:59] (03PS2) 10BBlack: base: certificates: add new GlobalSign CA files [puppet] - 10https://gerrit.wikimedia.org/r/549058 (https://phabricator.wikimedia.org/T237066) (owner: 10Arturo Borrero Gonzalez) [20:11:08] (03CR) 10jerkins-bot: [V: 04-1] rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [20:12:29] (03CR) 10BBlack: [C: 03+1] base: certificates: add new GlobalSign CA files [puppet] - 10https://gerrit.wikimedia.org/r/549058 (https://phabricator.wikimedia.org/T237066) (owner: 10Arturo Borrero Gonzalez) [20:14:46] (03PS4) 10CDanis: rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) [20:18:50] (03CR) 10Dzahn: [C: 03+2] icinga: authorize Dwisehaupt to run commands [puppet] - 10https://gerrit.wikimedia.org/r/549133 (https://phabricator.wikimedia.org/T235676) (owner: 10Dzahn) [20:21:10] (03CR) 10CDanis: "Running a large PCC and it looks good so far: https://puppet-compiler.wmflabs.org/compiler1002/19277/" [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [20:23:07] (03CR) 10Herron: [C: 03+1] "LGTM!" 
[puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) (owner: 10Cwhite) [20:23:52] (03CR) 10Dzahn: [C: 03+1] rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [20:24:13] lots of errors from mwdebug1001, specifically: Uncaught LogicException: The UdpSocket to 127.0.0.1:10514 has been closed and can not be written to anymore in /srv/mediawiki/php-1.35.0-wmf.4/vendor/monolog/monolog/src/Monolog/Handler/SyslogUdp/UdpSocket.php:45 [20:25:42] (03PS1) 1020after4: group1 wikis to 1.35.0-wmf.5 refs T233853 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549193 [20:25:44] (03CR) 1020after4: [C: 03+2] group1 wikis to 1.35.0-wmf.5 refs T233853 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549193 (owner: 1020after4) [20:26:44] (03Merged) 10jenkins-bot: group1 wikis to 1.35.0-wmf.5 refs T233853 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549193 (owner: 1020after4) [20:27:24] sorry I had my kibana time range set weirdly, mwdebug1001 errors not happening now [20:32:05] (03PS1) 10DannyS712: Give commonswiki filemovers `suppressredirect` rights [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549194 (https://phabricator.wikimedia.org/T236348) [20:33:30] (03CR) 10CDanis: [C: 03+2] rsync::server::module: ensure hosts_allow is an array [puppet] - 10https://gerrit.wikimedia.org/r/549164 (https://phabricator.wikimedia.org/T237424) (owner: 10CDanis) [20:33:37] !log twentyafterfour@deploy1001 rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.5 refs T233853 [20:33:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:33:42] T233853: 1.35.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T233853 [20:33:52] (03PS2) 10DannyS712: Give commonswiki filemovers `suppressredirect` rights [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549194 
(https://phabricator.wikimedia.org/T236348) [20:34:38] !log twentyafterfour@deploy1001 Synchronized php: group1 wikis to 1.35.0-wmf.5 refs T233853 (duration: 01m 00s) [20:34:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:36:03] bd808: andrewbogott: heads up, syncing your openstackmanager patch now [20:36:37] !log twentyafterfour@deploy1001 Synchronized php-1.35.0-wmf.5/extensions/OpenStackManager/: sync openstackmanager to deploy https://gerrit.wikimedia.org/r/#/q/I5b08f0069941052acdd9f05a62aac5b2cf9ecdd5 (duration: 01m 00s) [20:36:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:39] twentyafterfour: ok [20:37:42] (03PS1) 10Bstorm: bugfix: fix typo in tcl sssd image definition [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549197 [20:40:03] (03CR) 10Andrew Bogott: [C: 03+1] bugfix: fix typo in tcl sssd image definition [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549197 (owner: 10Bstorm) [20:42:44] (03CR) 10Bstorm: [C: 03+2] bugfix: fix typo in tcl sssd image definition [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549197 (owner: 10Bstorm) [20:47:16] (03PS1) 10Bstorm: bugfix: actually fix the typo this time. [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549199 [20:47:31] 10Operations, 10Icinga, 10LDAP-Access-Requests, 10Patch-For-Review: dwisehaupt needs access to iginca for frack hosts - https://phabricator.wikimedia.org/T235676 (10Dzahn) 05Open→03Resolved Dallas confirmed he already got an SMS and could schedule a downtime. So this should be resolved. [20:48:37] (03CR) 10Bstorm: [C: 03+2] bugfix: actually fix the typo this time. 
[docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549199 (owner: 10Bstorm) [20:54:29] 10Operations, 10MediaWiki-Documentation, 10User-Dereckson, 10patch-welcome: Repair "svn.wikimedia.org/doc/" redirect for doc.wikimedia.org - https://phabricator.wikimedia.org/T109950 (10CCicalese_WMF) Untagging #core_platform_team, since there is currently no work for the team to do on this. [20:59:48] PROBLEM - Check the Netbox report puppetdb for fail status. on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [20:59:58] 10Operations, 10observability, 10serviceops: basic prometheus monitoring for PoolCounter - https://phabricator.wikimedia.org/T237407 (10RLazarus) a:03RLazarus [21:00:04] cscott, arlolra, subbu, bearND, halfak, accraze, and mdholloway: (Dis)respected human, time to deploy Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T2100). Please do the needful. 
[21:01:17] !log milimetric@deploy1001 Started deploy [analytics/refinery@dc85f9d]: Hdfs Cleaner and TLS columns [21:01:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:03:31] (03PS1) 10Bstorm: jessie fixes: port the fix from the base image to the jessie-sssd one [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549201 (https://phabricator.wikimedia.org/T215531) [21:06:57] (03CR) 10BryanDavis: [C: 03+2] jessie fixes: port the fix from the base image to the jessie-sssd one [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549201 (https://phabricator.wikimedia.org/T215531) (owner: 10Bstorm) [21:07:23] (03Merged) 10jenkins-bot: jessie fixes: port the fix from the base image to the jessie-sssd one [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/549201 (https://phabricator.wikimedia.org/T215531) (owner: 10Bstorm) [21:08:22] (03CR) 10Bstorm: [C: 03+2] new k8s: adjust things to be compatible with migration to the new cluster [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/547676 (https://phabricator.wikimedia.org/T236202) (owner: 10Bstorm) [21:09:46] (03PS9) 10DCausse: [beta] configure sparql/query logging to deployment-eventgate-3 [puppet] - 10https://gerrit.wikimedia.org/r/549056 [21:09:48] (03PS9) 10DCausse: [wdqs] configure eventgate endpoint for sparql/query events [puppet] - 10https://gerrit.wikimedia.org/r/549081 (https://phabricator.wikimedia.org/T101013) [21:11:41] 10Operations, 10Wikimedia-Mailing-lists: Close Knowledge Integrity mailing list - https://phabricator.wikimedia.org/T237427 (10Dzahn) 05Open→03Resolved a:03Dzahn Done. ` [fermium:~] $ sudo rmlist knowledgeintegrity Not removing archives. Reinvoke with -a to remove them. 
Removing list info ` [21:12:10] !log milimetric@deploy1001 Finished deploy [analytics/refinery@dc85f9d]: Hdfs Cleaner and TLS columns (duration: 10m 52s) [21:12:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:12:51] 10Operations: SRE quarterly goal: Ability to serve a fraction of the production traffic from PHP7 - https://phabricator.wikimedia.org/T206336 (10Dzahn) @Joe Resolved? [21:14:11] !log push standard forwarding-options to cr3-esams [21:14:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:19:27] (03PS15) 10Jbond: puppetmnasters: use localcacert setting for CA file in apache [puppet] - 10https://gerrit.wikimedia.org/r/545575 (https://phabricator.wikimedia.org/T234332) [21:19:29] (03PS1) 10Bstorm: new k8s: bump changelog [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/549204 [21:19:44] (03CR) 10jerkins-bot: [V: 04-1] new k8s: bump changelog [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/549204 (owner: 10Bstorm) [21:21:21] 10Operations: stunnel-wrap all rsync::server usage - https://phabricator.wikimedia.org/T237424 (10CDanis) a:03CDanis Generated a list of all `$hosts_allow` arguments from `rsync::server::module` invocations across all of Puppet: P9544 They're mostly arrays of FQDNs, which is nice. [21:21:53] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1247.eqiad.wmnet [21:21:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:23:41] (03PS3) 10Herron: logstash: add version param and exclude plugins when non 5.x [puppet] - 10https://gerrit.wikimedia.org/r/548880 (https://phabricator.wikimedia.org/T217340) [21:24:27] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) server is depooled now. 
[21:24:49] !log arlolra@deploy1001 Started deploy [parsoid/deploy@7e86f83]: Updating Parsoid to 1d283ed [21:24:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:25:08] (03PS2) 10Bstorm: new k8s: bump changelog [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/549204 [21:30:23] (03CR) 10Dzahn: "> Patch Set 2:" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/548923 (https://phabricator.wikimedia.org/T236833) (owner: 10Dzahn) [21:31:09] (03PS1) 10Ayounsi: Analytics bacula v6 term missing them accept [homer/public] - 10https://gerrit.wikimedia.org/r/549206 [21:31:43] (03CR) 10Ayounsi: [V: 03+2 C: 03+2] Analytics bacula v6 term missing them accept [homer/public] - 10https://gerrit.wikimedia.org/r/549206 (owner: 10Ayounsi) [21:32:45] (03CR) 10Bstorm: [C: 03+2] new k8s: bump changelog [software/tools-webservice] - 10https://gerrit.wikimedia.org/r/549204 (owner: 10Bstorm) [21:35:11] !log arlolra@deploy1001 Finished deploy [parsoid/deploy@7e86f83]: Updating Parsoid to 1d283ed (duration: 10m 22s) [21:35:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:36:20] (03PS7) 10Herron: Introduce Elastic 7 support [puppet] - 10https://gerrit.wikimedia.org/r/545867 (https://phabricator.wikimedia.org/T234854) (owner: 10Filippo Giunchedi) [21:36:52] (03Abandoned) 10Ottomata: Use HDFS trash settings as default everywhere [puppet] - 10https://gerrit.wikimedia.org/r/548850 (https://phabricator.wikimedia.org/T235200) (owner: 10Ottomata) [21:37:26] (03PS1) 10Ottomata: HDFSCleaner - change command args [puppet] - 10https://gerrit.wikimedia.org/r/549207 (https://phabricator.wikimedia.org/T235200) [21:38:24] (03PS4) 10Herron: logstash: add version param and exclude plugins when non 5.x [puppet] - 10https://gerrit.wikimedia.org/r/548880 (https://phabricator.wikimedia.org/T217340) [21:40:43] (03CR) 10Ottomata: [C: 03+2] HDFSCleaner - change command args [puppet] - 10https://gerrit.wikimedia.org/r/549207 
(https://phabricator.wikimedia.org/T235200) (owner: 10Ottomata) [21:42:42] 10Operations, 10ops-eqiad: frqueue1001 system battery needs replacement - https://phabricator.wikimedia.org/T237582 (10Jgreen) [21:45:57] (03PS1) 10Krinkle: doc.wikimedia.org: Partially revert "Fix up CSP headers" [puppet] - 10https://gerrit.wikimedia.org/r/549208 (https://phabricator.wikimedia.org/T213223) [21:46:35] (03PS2) 10Krinkle: doc.wikimedia.org: Partially revert "Fix up CSP headers" [puppet] - 10https://gerrit.wikimedia.org/r/549208 (https://phabricator.wikimedia.org/T213223) [21:46:42] (03CR) 10Krinkle: "Which doc sites was this added for?" [puppet] - 10https://gerrit.wikimedia.org/r/549208 (https://phabricator.wikimedia.org/T213223) (owner: 10Krinkle) [21:47:00] (03CR) 10Krinkle: Fix up CSP headers for doc.wikimedia.org (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/546378 (https://phabricator.wikimedia.org/T213223) (owner: 10Brian Wolff) [21:47:30] !log Updated Parsoid to 1d283ed (T237104, T227209, T236865) [21:47:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:47:38] T237104: PHP Notice: Undefined index: duration - https://phabricator.wikimedia.org/T237104 [21:47:38] T227209: Security Review For Parsoid-PHP - https://phabricator.wikimedia.org/T227209 [21:47:39] T236865: Validation of `domain` failed: mwparsoid-invalid-domain - https://phabricator.wikimedia.org/T236865 [21:50:10] PROBLEM - Host mw1247 is DOWN: PING CRITICAL - Packet loss = 100% [21:50:21] ^ maintenance [21:50:50] (03CR) 10Brian Wolff: "www.wikimedia.org was added for https://www.wikimedia.org/static/favicon/wmf.ico which is loaded by the landing page" [puppet] - 10https://gerrit.wikimedia.org/r/549208 (https://phabricator.wikimedia.org/T213223) (owner: 10Krinkle) [21:51:16] PROBLEM - PHP opcache health on wtp1027 is CRITICAL: CRITICAL: opcache cache-hit ratio is below 99.85% https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health 
[21:52:48] yea have a few hours [21:52:50] RECOVERY - PHP opcache health on wtp1027 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health [21:53:14] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Jclark-ctr) preformed flea power drain [21:53:40] !log checkout /srv/mediawiki-staging/php-1.35.0-wmf.5/maintenance/Maintenance.php looks like a local change for debugging left behind [21:53:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:53:43] disregard last message wrong window [21:53:45] Krinkle: I also have https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/547718/ for doc.wikimedia.org. I was travelling this week so didn't have a chance to ask people to get it deployed [21:53:58] RECOVERY - Host mw1247 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms [21:57:33] !log mholloway-shell@deploy1001 Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Allow specifying API credentials as an associative array (T236426) (duration: 01m 01s) [21:57:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:57:38] T236426: Configure Google Cloud Vision credentials in production - https://phabricator.wikimedia.org/T236426 [21:59:41] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) - ssh to mgmt works again - local IPMI works again - Icinga check changed to "ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-mw1247.localhost: internal IPMI error" [22:00:04] mdholloway: Your horoscope predicts another unfortunate Enable MachineVision on commonswiki deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20191106T2200). [22:01:10] PROBLEM - Check the Netbox report puppetdb for fail status. 
on netbox1001 is CRITICAL: puppetdb.PuppetDB CRITICAL https://wikitech.wikimedia.org/wiki/Netbox%23Reports [22:02:04] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) 05Open→03Resolved and now it's fixed and Icinga is green again on next check. Thanks @Jclark-ctr for the quick response! [22:03:29] !log mholloway-shell@deploy1001 Synchronized private/GoogleCloudVision.php: Configure Google Cloud Vision API credentials (1/2) (T236426) (duration: 00m 59s) [22:03:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:03:34] T236426: Configure Google Cloud Vision credentials in production - https://phabricator.wikimedia.org/T236426 [22:04:11] (03CR) 10Alexandros Kosiaris: [C: 03+1] Add schema.wikimedia.org [dns] - 10https://gerrit.wikimedia.org/r/549173 (https://phabricator.wikimedia.org/T233630) (owner: 10Ottomata) [22:04:17] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1247.eqiad.wmnet [22:04:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:04:32] 10Operations, 10ops-eqiad: mw1247: IPMI Sensor Status UNKNOWN internal IPMI error - https://phabricator.wikimedia.org/T237567 (10Dzahn) repooled [22:04:45] !log mholloway-shell@deploy1001 Synchronized private/PrivateSettings.php: Configure Google Cloud Vision API credentials (2/2) (T236426) (duration: 00m 59s) [22:04:48] !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1290.eqiad.wmnet [22:04:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:04:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:05:01] 10Operations, 10ops-eqiad: Can't SSH to mw1290.mgmt - https://phabricator.wikimedia.org/T234153 (10Dzahn) server depooled [22:05:24] 10Operations, 10ops-eqiad, 10serviceops: Can't SSH to mw1290.mgmt - https://phabricator.wikimedia.org/T234153 (10Dzahn) [22:08:34] 
(03PS1) 10Bstorm: toolforge: set a new package builder role [puppet] - 10https://gerrit.wikimedia.org/r/549211 [22:13:22] PROBLEM - Host mw1290 is DOWN: PING CRITICAL - Packet loss = 100% [22:13:31] ^ more IPMI fixes [22:13:35] !log push standard forwarding-options to cr3/4-ulsfo [22:13:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:16:40] RECOVERY - Host mw1290 is UP: PING OK - Packet loss = 0%, RTA = 0.22 ms [22:17:04] !log created MachineVision extension tables on commonswiki [22:17:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:19:19] (03PS1) 10Mholloway: Enable MachineVision on commonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549216 (https://phabricator.wikimedia.org/T227349) [22:19:50] (03CR) 10Bstorm: "This primarily strips out unneeded firewall and rsync services that are intended for the production hosts and allows flexibility for the t" [puppet] - 10https://gerrit.wikimedia.org/r/549211 (owner: 10Bstorm) [22:20:58] (03PS1) 10Mholloway: MachineVision: Delay annotation jobs on commonswiki only [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549218 [22:22:11] (03CR) 10Mholloway: [C: 03+2] MachineVision: Delay annotation jobs on commonswiki only [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549218 (owner: 10Mholloway) [22:23:55] 10Operations, 10ops-eqiad, 10serviceops: Can't SSH to mw1290.mgmt - https://phabricator.wikimedia.org/T234153 (10Dzahn) 05Open→03Resolved flea power was drained by Jclark. I can ssh to mgmt (faster than before). Looking good so far. Works for now. If we see it again we will just reopen this. 
[22:24:40] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings.php: MachineVision: Delay annotation jobs on commonswiki only (duration: 01m 01s) [22:24:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:25:04] (03PS2) 10Mholloway: Enable MachineVision on commonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549216 (https://phabricator.wikimedia.org/T227349) [22:25:57] (03CR) 10Krinkle: Set CSP on doc.wikimedia.org to enforce. (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/547718 (https://phabricator.wikimedia.org/T213223) (owner: 10Brian Wolff) [22:26:30] (03CR) 10Mholloway: [C: 03+2] Enable MachineVision on commonswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549216 (https://phabricator.wikimedia.org/T227349) (owner: 10Mholloway) [22:27:38] (03PS1) 10Tim Starling: Enable REST API by default [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549219 (https://phabricator.wikimedia.org/T237555) [22:28:19] (03CR) 10jerkins-bot: [V: 04-1] Enable REST API by default [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549219 (https://phabricator.wikimedia.org/T237555) (owner: 10Tim Starling) [22:29:05] !log mholloway-shell@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Enable MachineVision on commonswiki (T227349) (duration: 01m 00s) [22:29:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:29:09] T227349: Deploy the MachineVision extension to production - https://phabricator.wikimedia.org/T227349 [22:30:17] !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1290.eqiad.wmnet [22:30:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:31:20] 10Operations, 10ops-eqiad, 10serviceops: Can't SSH to mw1290.mgmt - https://phabricator.wikimedia.org/T234153 (10Dzahn) repooled [22:33:42] (03CR) 10Brian Wolff: "piwik is used by https://doc.wikimedia.org/oojs-ui/master/demos" [puppet] - 
10https://gerrit.wikimedia.org/r/547718 (https://phabricator.wikimedia.org/T213223) (owner: 10Brian Wolff) [22:36:52] !log MachineVision: Imported Freebase to Wikidata ID mappings on commonswiki (T227349) [22:36:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:36:57] T227349: Deploy the MachineVision extension to production - https://phabricator.wikimedia.org/T227349 [22:37:22] 10Operations, 10netops, 10observability: Determine & implement near-term method for escalating network alerts - https://phabricator.wikimedia.org/T237587 (10herron) p:05Triage→03Normal [22:39:14] (03CR) 10Krinkle: [C: 03+1] "OK. That should ideally send data there with beacon without loading executable code from it, but that can be fixed later." [puppet] - 10https://gerrit.wikimedia.org/r/547718 (https://phabricator.wikimedia.org/T213223) (owner: 10Brian Wolff) [22:41:11] 10Operations, 10ops-eqiad: rack/setup/install ms-be105[7-9].eqiad.wmnet - https://phabricator.wikimedia.org/T237438 (10Jclark-ctr) [22:43:16] 10Operations, 10netops, 10observability: Determine & implement near-term method for escalating network alerts - https://phabricator.wikimedia.org/T237587 (10herron) In terms of “what” should be escalated, so far we discussed * Fastnetmon “Potential DDOS” * Interface saturation What else is in scope here?... [22:46:48] (03CR) 10Brian Wolff: "This adds it as both script-src & img-src, so it will run executable JS from piwik." 
[puppet] - 10https://gerrit.wikimedia.org/r/547718 (https://phabricator.wikimedia.org/T213223) (owner: 10Brian Wolff) [22:50:21] (03PS2) 10Dzahn: gerrit: remove pre-buster support [puppet] - 10https://gerrit.wikimedia.org/r/548547 [22:50:23] (03PS3) 10Dzahn: gerrit: refactor, move java setup to separate class [puppet] - 10https://gerrit.wikimedia.org/r/548554 [22:51:13] (03CR) 10jerkins-bot: [V: 04-1] gerrit: refactor, move java setup to separate class [puppet] - 10https://gerrit.wikimedia.org/r/548554 (owner: 10Dzahn) [22:54:49] (03PS1) 10Cwhite: puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 [22:56:14] (03CR) 10jerkins-bot: [V: 04-1] puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 (owner: 10Cwhite) [22:57:07] (03PS2) 10Cwhite: puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 [22:57:42] (03CR) 10jerkins-bot: [V: 04-1] puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 (owner: 10Cwhite) [23:00:18] (03PS1) 10Jbond: wmflib - puppet_config: update fact resolution to use interpolate [puppet] - 10https://gerrit.wikimedia.org/r/549223 [23:00:39] RECOVERY - Check systemd state on an-coord1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [23:01:39] (03PS1) 10Andrew Bogott: wmcs puppet: remove the mwyaml hiera backend [puppet] - 10https://gerrit.wikimedia.org/r/549224 (https://phabricator.wikimedia.org/T235708) [23:02:39] (03PS3) 10Cwhite: puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 [23:03:50] (03CR) 10jerkins-bot: [V: 04-1] puppetmaster,icinga: naggen2 cleanup and update to python3 [puppet] - 10https://gerrit.wikimedia.org/r/549222 (owner: 10Cwhite) [23:03:58] (03CR) 10Jbond: [C: 03+2] wmflib - 
puppet_config: update fact resolution to use interpolate [puppet] - 10https://gerrit.wikimedia.org/r/549223 (owner: 10Jbond) [23:04:29] (03CR) 10Cwhite: "It appears jenkins does not like fstrings." [puppet] - 10https://gerrit.wikimedia.org/r/549222 (owner: 10Cwhite) [23:05:17] (03Abandoned) 10Cwhite: naggen2: python3 and remove activerecord support [puppet] - 10https://gerrit.wikimedia.org/r/463133 (https://phabricator.wikimedia.org/T202782) (owner: 10Cwhite) [23:16:08] (03PS6) 10Cwhite: profile: get exim metrics from lists [puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) [23:17:30] (03CR) 10Cwhite: [C: 03+2] "PCC looks good: https://puppet-compiler.wmflabs.org/compiler1002/19283/" [puppet] - 10https://gerrit.wikimedia.org/r/546992 (https://phabricator.wikimedia.org/T236505) (owner: 10Cwhite) [23:21:15] 10Operations, 10ops-eqiad, 10Discovery-Search (Current work), 10Patch-For-Review: (Aug 30th, 2019) rack/setup/install elastic10[53-67].eqiad.wmnet - https://phabricator.wikimedia.org/T230746 (10EBernhardson) a:05Christopher→03Gehel [23:40:51] !log ebernhardson@deploy1001 Started deploy [search/mjolnir/deploy@d2ad2da]: bulk_daemon: support ltr model uploads [23:40:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:55:47] !log ebernhardson@deploy1001 Finished deploy [search/mjolnir/deploy@d2ad2da]: bulk_daemon: support ltr model uploads (duration: 14m 56s) [23:55:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:56:14] !log ebernhardson@deploy1001 Started deploy [search/mjolnir/deploy@d2ad2da]: bulk_daemon: support ltr model uploads [23:56:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:57:11] (03PS1) 10Bartosz Dziewoński: Set wgVisualEditorRestbaseParsoidVariant='php' on Beta enwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/549227 (https://phabricator.wikimedia.org/T229074) [23:58:10] (03CR) 10Jeena Huneidi: 
"Thanks, I did not consider the helmfile and I will look at the fixtures." (032 comments) [deployment-charts] - 10https://gerrit.wikimedia.org/r/545421 (https://phabricator.wikimedia.org/T228910) (owner: 10Jeena Huneidi)