[00:00:04] <jouncebot>	 twentyafterfour: My dear minions, it's time we take the moon! Just kidding. Time for Phabricator update deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T0000).
[00:01:46] <jinxer-wm>	 (Primary outbound port utilisation over 80%  #page) resolved: Primary outbound port utilisation over 80%  #page - https://alerts.wikimedia.org
[00:05:54] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[00:07:00] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[00:07:36] <icinga-wm>	 PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: drop_event.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:10:20] <wikibugs>	 (CR) Razzi: [C: +2] db1125: decommission db1125 [puppet] - https://gerrit.wikimedia.org/r/692984 (https://phabricator.wikimedia.org/T283125) (owner: Razzi)
[00:13:22] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[00:28:26] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[00:31:45] <wikibugs>	 (PS1) Dzahn: install_server: add doh2* to use flat/virtual partman recipe [puppet] - https://gerrit.wikimedia.org/r/693025 (https://phabricator.wikimedia.org/T283192)
[00:36:00] <wikibugs>	 (CR) Dzahn: [C: +2] install_server: add doh2* to use flat/virtual partman recipe [puppet] - https://gerrit.wikimedia.org/r/693025 (https://phabricator.wikimedia.org/T283192) (owner: Dzahn)
[00:39:00] <wikibugs>	 (PS1) Razzi: site: add dbstore1006 to replace db1004 [puppet] - https://gerrit.wikimedia.org/r/693046
[00:39:28] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[00:41:52] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[00:44:22] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[00:44:30] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[00:48:08] <wikibugs>	 (CR) Razzi: "Let me know how the config looks; hardware address taken from its former name as db1125." [puppet] - https://gerrit.wikimedia.org/r/693046 (owner: Razzi)
[00:51:52] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[00:56:54] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[01:01:44] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:01:45] <mutante>	 !log signing puppet certs for doh2001 and doh2002.wikimedia.org (T283192)
[01:01:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:01:51] <stashbot>	 T283192: Please create two Ganeti VMs for Wikidough - https://phabricator.wikimedia.org/T283192
[01:04:10] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:11:47] <wikibugs>	 SRE, Traffic, vm-requests: Please create two Ganeti VMs for Wikidough - https://phabricator.wikimedia.org/T283192 (Dzahn) Open→Resolved VMs have been created, added to site.pp with "insetup", added to DHCP and partman.  OS has been installed (buster) and puppet certs signed. You can now SSH to...
[01:31:46] <wikibugs>	 SRE, SRE-Access-Requests: Superset Access for Cooltey Feng - https://phabricator.wikimedia.org/T283189 (Ottomata) Approved.    Verifying that this is a case of https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Dashboards_in_Superset_/_Hive_interfaces_(like_Hue)_that_do_access_private_data, and th...
[01:34:32] <wikibugs>	 (PS1) Jforrester: PageProps: be prepared that PageIdentity is not proper title [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693028 (https://phabricator.wikimedia.org/T283170)
[01:34:51] <wikibugs>	 (PS1) Jforrester: ActorStore: avoid throwing in case of invalid usernames [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693029 (https://phabricator.wikimedia.org/T283167)
[01:35:20] <wikibugs>	 SRE, ops-eqiad, DC-Ops: (Need By: 2021-04-30) rack/setup/install backup100[4-7] - https://phabricator.wikimedia.org/T277327 (wiki_willy) Hi @Jclark-ctr - are there specific racks that you need the space in?  We also have some high priority 740xd2 servers coming in Q1, that we should make room for at...
[01:35:35] <wikibugs>	 (PS1) Jforrester: UploadFromStash: convert default user from false to null [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693030 (https://phabricator.wikimedia.org/T283196)
[01:44:30] <wikibugs>	 (CR) Ottomata: [C: +1] db1125: decommission db1125 [puppet] - https://gerrit.wikimedia.org/r/692984 (https://phabricator.wikimedia.org/T283125) (owner: Razzi)
[01:48:04] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[01:50:14] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:54:24] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[01:54:28] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:54:32] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[01:56:42] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:03:08] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[02:09:44] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[02:25:02] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:27:12] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:27:32] <wikibugs>	 (CR) Ottomata: site: add dbstore1006 to replace db1004 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693046 (owner: Razzi)
[02:33:42] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:36:02] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:40:28] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[02:42:38] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[02:42:40] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:44:32] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[02:49:10] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:51:00] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[02:53:34] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[02:55:32] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[02:55:54] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[03:00:32] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[03:02:20] <icinga-wm>	 PROBLEM - SSH on ms-be2035 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[03:05:06] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[03:42:04] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[03:44:34] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[03:46:52] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[03:49:02] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[04:01:00] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[04:02:18] <icinga-wm>	 RECOVERY - Check systemd state on mwmaint1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[04:10:36] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[04:43:26] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[04:48:10] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[04:55:34] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[04:58:33] <wikibugs>	 (PS1) Marostegui: Revert "db1141: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/693032
[04:58:52] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1141 (re)pooling @ 25%: Repool db1141', diff saved to https://phabricator.wikimedia.org/P16107 and previous config saved to /var/cache/conftool/dbconfig/20210520-045852-root.json
[04:58:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:59:20] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P16108 and previous config saved to /var/cache/conftool/dbconfig/20210520-045919-marostegui.json
[04:59:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:59:49] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "db1141: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/693032 (owner: Marostegui)
[05:00:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P16109 and previous config saved to /var/cache/conftool/dbconfig/20210520-050025-marostegui.json
[05:00:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:02:52] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:11:32] <wikibugs>	 (PS2) Jcrespo: Revert "bacula: Reenable read-write ES database backups, disable read-only" [puppet] - https://gerrit.wikimedia.org/r/692650
[05:12:29] <wikibugs>	 (PS1) Marostegui: mariadb: Decommission labsdb1011 [puppet] - https://gerrit.wikimedia.org/r/693057 (https://phabricator.wikimedia.org/T282524)
[05:13:13] <wikibugs>	 (CR) Jcrespo: [C: +2] Revert "bacula: Reenable read-write ES database backups, disable read-only" [puppet] - https://gerrit.wikimedia.org/r/692650 (owner: Jcrespo)
[05:13:20] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.decommission for hosts labsdb1011.eqiad.wmnet
[05:13:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:13:56] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1141 (re)pooling @ 50%: Repool db1141', diff saved to https://phabricator.wikimedia.org/P16110 and previous config saved to /var/cache/conftool/dbconfig/20210520-051355-root.json
[05:13:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:18:14] <wikibugs>	 (PS1) Jcrespo: Revert "Revert "bacula: Reenable read-write ES database backups, disable read-only"" [puppet] - https://gerrit.wikimedia.org/r/693033
[05:18:36] <wikibugs>	 (PS2) Jcrespo: Revert "Revert "bacula: Reenable read-write ES database backups, disable read-only"" [puppet] - https://gerrit.wikimedia.org/r/693033
[05:22:53] <wikibugs>	 (CR) Marostegui: [C: +2] mariadb: Decommission labsdb1011 [puppet] - https://gerrit.wikimedia.org/r/693057 (https://phabricator.wikimedia.org/T282524) (owner: Marostegui)
[05:22:59] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts labsdb1011.eqiad.wmnet
[05:23:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:23:51] <wikibugs>	 ops-eqiad, DBA, decommission-hardware: decommission labsdb1011.eqiad.wmnet - https://phabricator.wikimedia.org/T282524 (Marostegui) This is ready for #dc-ops
[05:23:57] <wikibugs>	 ops-eqiad, DBA, decommission-hardware: decommission labsdb1011.eqiad.wmnet - https://phabricator.wikimedia.org/T282524 (Marostegui) a:Marostegui→wiki_willy
[05:24:07] <wikibugs>	 ops-eqiad, decommission-hardware: decommission labsdb1011.eqiad.wmnet - https://phabricator.wikimedia.org/T282524 (Marostegui)
[05:24:42] <logmsgbot>	 !log ryankemper@cumin1001 START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223
[05:24:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:24:46] <stashbot>	 T283223: Reboot cloudelastic* to apply security updates - https://phabricator.wikimedia.org/T283223
[05:25:09] <wikibugs>	 (PS1) Marostegui: maintain_dbusers.pp: Remove reference to labsdb1011 [puppet] - https://gerrit.wikimedia.org/r/693058 (https://phabricator.wikimedia.org/T282662)
[05:27:12] <logmsgbot>	 !log ryankemper@cumin1001 END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223
[05:27:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:27:43] <ryankemper>	 (ctrl+c'd, need to set a lower # of nodes at a time)
[05:29:00] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1141 (re)pooling @ 75%: Repool db1141', diff saved to https://phabricator.wikimedia.org/P16111 and previous config saved to /var/cache/conftool/dbconfig/20210520-052859-root.json
[05:29:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:29:03] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 245 threshold =0.15 breach: active_shards: 1276, status: yellow, relocating_shards: 0, timed_out: False, active_primary_shards: 759, delayed_unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, initializing_shards: 20, number_of_in_flight_fetch: 0, active_shards_percent_as_number: 83.892176199868
[05:29:03] <icinga-wm>	 rds: 225, number_of_data_nodes: 5, cluster_name: cloudelastic-chi-eqiad, number_of_nodes: 5, number_of_pending_tasks: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:30:27] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: active_shards: 1490, cluster_name: cloudelastic-chi-eqiad, delayed_unassigned_shards: 0, timed_out: False, active_shards_percent_as_number: 97.96186719263642, number_of_data_nodes: 6, unassigned_shards: 27, relocating_shards: 0, number_of_pending_tasks: 2, status: yellow, task_max_waiting_in_queue_
[05:30:27] <icinga-wm>	 ializing_shards: 4, active_primary_shards: 759, number_of_nodes: 6, number_of_in_flight_fetch: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:33:05] <logmsgbot>	 !log ryankemper@cumin1001 START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223
[05:33:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:33:09] <stashbot>	 T283223: Reboot cloudelastic* to apply security updates - https://phabricator.wikimedia.org/T283223
[05:33:21] <ryankemper>	 !log T283223 `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic reboot" --reboot --nodes-per-run 1 --start-datetime 2021-05-20T05:16:40 --task-id T283223` on `ryankemper@cumin1001` tmux session `restart_cloudelastic`
[05:33:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:36:13] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:37:49] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:38:15] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch inactive shards 241 threshold =0.15 breach: unassigned_shards: 241, task_max_waiting_in_queue_millis: 49, active_shards: 1207, cluster_name: cloudelastic-psi-eqiad, timed_out: False, number_of_in_flight_fetch: 1002, active_primary_shards: 723, number_of_nodes: 6, status: yellow, delayed_unassigned_shards: 0, number_of
[05:38:15] <icinga-wm>	 umber_of_pending_tasks: 2, initializing_shards: 0, relocating_shards: 0, active_shards_percent_as_number: 83.35635359116023 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:38:15] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1001 is CRITICAL: CRITICAL - elasticsearch inactive shards 241 threshold =0.15 breach: number_of_data_nodes: 6, active_shards_percent_as_number: 83.35635359116023, task_max_waiting_in_queue_millis: 195, active_shards: 1207, number_of_in_flight_fetch: 588, delayed_unassigned_shards: 0, cluster_name: cloudelastic-psi-eqiad, unassigned_shards: 241, number_of_node
[05:38:15] <icinga-wm>	 g_shards: 0, active_primary_shards: 723, timed_out: False, status: yellow, number_of_pending_tasks: 2, relocating_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:40:01] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: timed_out: False, unassigned_shards: 0, cluster_name: cloudelastic-psi-eqiad, initializing_shards: 0, relocating_shards: 0, active_shards: 1448, number_of_nodes: 6, delayed_unassigned_shards: 0, active_primary_shards: 723, number_of_in_flight_fetch: 0, active_shards_percent_as_number: 100.0, number
[05:40:01] <icinga-wm>	 : 0, status: green, task_max_waiting_in_queue_millis: 0, number_of_data_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:40:01] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1001 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: relocating_shards: 0, timed_out: False, number_of_pending_tasks: 0, task_max_waiting_in_queue_millis: 0, active_shards_percent_as_number: 100.0, status: green, active_primary_shards: 723, delayed_unassigned_shards: 0, number_of_data_nodes: 6, number_of_nodes: 6, active_shards: 1448, initializing_sh
[05:40:01] <icinga-wm>	 ed_shards: 0, number_of_in_flight_fetch: 0, cluster_name: cloudelastic-psi-eqiad https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:41:19] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[05:41:51] <icinga-wm>	 PROBLEM - Check systemd state on cloudelastic1004 is CRITICAL: CRITICAL - degraded: The following units failed: prometheus-wmf-elasticsearch-exporter-9200.service,prometheus-wmf-elasticsearch-exporter-9400.service,prometheus-wmf-elasticsearch-exporter-9600.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[05:42:05] <wikibugs>	 (CR) Marostegui: [C: +2] maintain_dbusers.pp: Remove reference to labsdb1011 [puppet] - https://gerrit.wikimedia.org/r/693058 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[05:44:03] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1141 (re)pooling @ 100%: Repool db1141', diff saved to https://phabricator.wikimedia.org/P16112 and previous config saved to /var/cache/conftool/dbconfig/20210520-054402-root.json
[05:44:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:47:01] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9400 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 250 threshold =0.15 breach: relocating_shards: 0, status: yellow, number_of_in_flight_fetch: 0, number_of_pending_tasks: 0, active_shards: 1248, number_of_nodes: 6, task_max_waiting_in_queue_millis: 0, timed_out: False, initializing_shards: 0, unassigned_shards: 250, active_primary_shards: 748, number_
[05:47:01] <icinga-wm>	  active_shards_percent_as_number: 83.31108144192257, delayed_unassigned_shards: 0, cluster_name: cloudelastic-omega-eqiad https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:47:17] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[05:47:35] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 253 threshold =0.15 breach: active_shards_percent_as_number: 83.36620644312951, timed_out: False, initializing_shards: 0, task_max_waiting_in_queue_millis: 0, delayed_unassigned_shards: 0, number_of_pending_tasks: 0, active_shards: 1268, cluster_name: cloudelastic-chi-eqiad, status: yellow, unassigned_
[05:47:35] <icinga-wm>	 er_of_nodes: 6, relocating_shards: 0, number_of_in_flight_fetch: 0, number_of_data_nodes: 6, active_primary_shards: 759 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:47:49] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1004 is CRITICAL: CRITICAL - elasticsearch inactive shards 246 threshold =0.15 breach: delayed_unassigned_shards: 0, timed_out: False, number_of_nodes: 6, status: yellow, task_max_waiting_in_queue_millis: 6485, number_of_in_flight_fetch: 0, active_primary_shards: 759, unassigned_shards: 242, cluster_name: cloudelastic-chi-eqiad, active_shards_percent_as_number
[05:47:49] <icinga-wm>	 13, relocating_shards: 0, initializing_shards: 4, number_of_data_nodes: 6, active_shards: 1275, number_of_pending_tasks: 4 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:49:11] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9400 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-omega-eqiad: number_of_pending_tasks: 0, active_primary_shards: 748, initializing_shards: 0, active_shards: 1498, active_shards_percent_as_number: 100.0, number_of_data_nodes: 6, unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, status: green, timed_out: False, delayed_unassigned_shards: 0, cluster_n
[05:49:11] <icinga-wm>	 -omega-eqiad, number_of_in_flight_fetch: 0, number_of_nodes: 6, relocating_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:49:35] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:49:42] <ryankemper>	 (Sorry for the shard health check noise - the cookbook downtimes each host so not sure why those alerts are coming through...will take a look tomorrow)
[05:49:45] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: task_max_waiting_in_queue_millis: 0, relocating_shards: 0, number_of_pending_tasks: 0, initializing_shards: 0, number_of_nodes: 6, number_of_data_nodes: 6, active_shards_percent_as_number: 100.0, active_primary_shards: 759, delayed_unassigned_shards: 0, timed_out: False, status: green, cluster_name
[05:49:45] <icinga-wm>	 i-eqiad, number_of_in_flight_fetch: 0, active_shards: 1521, unassigned_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:49:57] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1004 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: task_max_waiting_in_queue_millis: 0, active_primary_shards: 759, delayed_unassigned_shards: 0, number_of_in_flight_fetch: 0, timed_out: False, number_of_data_nodes: 6, status: green, relocating_shards: 0, number_of_nodes: 6, unassigned_shards: 0, active_shards: 1521, initializing_shards: 0, cluster
[05:49:57] <icinga-wm>	 ic-chi-eqiad, active_shards_percent_as_number: 100.0, number_of_pending_tasks: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:53:42] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch inactive shards 241 threshold =0.15 breach: task_max_waiting_in_queue_millis: 0, active_primary_shards: 723, number_of_in_flight_fetch: 0, number_of_nodes: 5, cluster_name: cloudelastic-psi-eqiad, delayed_unassigned_shards: 0, timed_out: False, number_of_data_nodes: 5, unassigned_shards: 241, relocating_shards: 0, ini
[05:53:42] <icinga-wm>	  0, active_shards_percent_as_number: 83.35635359116023, number_of_pending_tasks: 1, status: yellow, active_shards: 1207 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:54:44] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: timed_out: False, cluster_name: cloudelastic-psi-eqiad, active_primary_shards: 723, number_of_pending_tasks: 1, unassigned_shards: 35, task_max_waiting_in_queue_millis: 0, relocating_shards: 0, delayed_unassigned_shards: 0, number_of_data_nodes: 6, status: yellow, initializing_shards: 2, number_of_
[05:54:44] <icinga-wm>	 of_in_flight_fetch: 0, active_shards_percent_as_number: 97.44475138121547, active_shards: 1411 https://wikitech.wikimedia.org/wiki/Search%23Administration
[05:54:58] <Amir1>	 I deploy this UBN! https://gerrit.wikimedia.org/r/c/mediawiki/core/+/693028
[05:56:17] <wikibugs>	 (CR) Marostegui: [C: -1] site: add dbstore1006 to replace db1004 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693046 (owner: Razzi)
[05:56:29] <brennen>	 thanks Amir1 
[05:56:30] <wikibugs>	 (CR) Ladsgroup: [C: +2] PageProps: be prepared that PageIdentity is not proper title [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693028 (https://phabricator.wikimedia.org/T283170) (owner: Jforrester)
[05:56:38] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[05:58:02] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:58:33] <wikibugs>	 (PS1) Marostegui: cumin: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662)
[06:00:28] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[06:01:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9400 on cloudelastic1003 is CRITICAL: CRITICAL - elasticsearch inactive shards 250 threshold =0.15 breach: number_of_pending_tasks: 0, initializing_shards: 0, number_of_nodes: 5, status: yellow, number_of_data_nodes: 5, active_shards_percent_as_number: 83.31108144192257, delayed_unassigned_shards: 0, timed_out: False, active_shards: 1248, unassigned_shards: 250, relocating_shards
[06:01:36] <icinga-wm>	 _flight_fetch: 0, active_primary_shards: 748, cluster_name: cloudelastic-omega-eqiad, task_max_waiting_in_queue_millis: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1004 is CRITICAL: CRITICAL - elasticsearch inactive shards 254 threshold =0.15 breach: status: yellow, number_of_pending_tasks: 0, unassigned_shards: 254, active_shards: 1267, active_shards_percent_as_number: 83.30046022353714, task_max_waiting_in_queue_millis: 0, delayed_unassigned_shards: 0, number_of_nodes: 5, relocating_shards: 0, active_primary_shards: 75
[06:01:36] <icinga-wm>	 se, cluster_name: cloudelastic-chi-eqiad, number_of_in_flight_fetch: 0, initializing_shards: 0, number_of_data_nodes: 5 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 242 threshold =0.15 breach: active_primary_shards: 723, cluster_name: cloudelastic-psi-eqiad, active_shards_percent_as_number: 83.28729281767956, timed_out: False, number_of_in_flight_fetch: 0, unassigned_shards: 242, status: yellow, initializing_shards: 0, task_max_waiting_in_queue_millis: 0, number_o
[06:01:36] <icinga-wm>	 number_of_pending_tasks: 0, relocating_shards: 0, number_of_nodes: 5, active_shards: 1206, delayed_unassigned_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 254 threshold =0.15 breach: timed_out: False, active_shards_percent_as_number: 83.30046022353714, number_of_in_flight_fetch: 0, relocating_shards: 0, task_max_waiting_in_queue_millis: 0, delayed_unassigned_shards: 0, number_of_nodes: 5, initializing_shards: 0, unassigned_shards: 254, cluster_name: clou
[06:01:37] <icinga-wm>	 d, active_shards: 1267, number_of_data_nodes: 5, number_of_pending_tasks: 0, active_primary_shards: 759, status: yellow https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:38] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1001 is CRITICAL: CRITICAL - elasticsearch inactive shards 254 threshold =0.15 breach: active_shards: 1267, timed_out: False, status: yellow, task_max_waiting_in_queue_millis: 0, number_of_nodes: 5, initializing_shards: 0, cluster_name: cloudelastic-chi-eqiad, unassigned_shards: 254, number_of_data_nodes: 5, active_primary_shards: 759, relocating_shards: 0, nu
[06:01:38] <icinga-wm>	 asks: 0, number_of_in_flight_fetch: 0, active_shards_percent_as_number: 83.30046022353714, delayed_unassigned_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:42] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9400 on cloudelastic1002 is CRITICAL: CRITICAL - elasticsearch inactive shards 250 threshold =0.15 breach: status: yellow, active_shards_percent_as_number: 83.31108144192257, active_shards: 1248, timed_out: False, cluster_name: cloudelastic-omega-eqiad, task_max_waiting_in_queue_millis: 0, active_primary_shards: 748, number_of_data_nodes: 5, number_of_pending_tasks: 0, relocating
[06:01:42] <icinga-wm>	 r_of_in_flight_fetch: 0, unassigned_shards: 250, initializing_shards: 0, number_of_nodes: 5, delayed_unassigned_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:42] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9400 on cloudelastic1001 is CRITICAL: CRITICAL - elasticsearch inactive shards 250 threshold =0.15 breach: number_of_data_nodes: 5, active_shards: 1248, status: yellow, relocating_shards: 0, task_max_waiting_in_queue_millis: 0, active_primary_shards: 748, timed_out: False, number_of_in_flight_fetch: 0, cluster_name: cloudelastic-omega-eqiad, initializing_shards: 0, number_of_node
[06:01:42] <icinga-wm>	 shards: 250, number_of_pending_tasks: 0, active_shards_percent_as_number: 83.31108144192257, delayed_unassigned_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:01:42] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1003 is CRITICAL: CRITICAL - elasticsearch inactive shards 254 threshold =0.15 breach: task_max_waiting_in_queue_millis: 0, number_of_data_nodes: 5, number_of_pending_tasks: 0, active_primary_shards: 759, status: yellow, cluster_name: cloudelastic-chi-eqiad, delayed_unassigned_shards: 0, initializing_shards: 0, relocating_shards: 0, active_shards_percent_as_nu
[06:01:42] <icinga-wm>	 353714, number_of_in_flight_fetch: 0, unassigned_shards: 254, timed_out: False, active_shards: 1267, number_of_nodes: 5 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:02:14] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch inactive shards 242 threshold =0.15 breach: active_shards: 1206, relocating_shards: 0, task_max_waiting_in_queue_millis: 0, cluster_name: cloudelastic-psi-eqiad, active_primary_shards: 723, unassigned_shards: 242, delayed_unassigned_shards: 0, number_of_pending_tasks: 0, initializing_shards: 0, number_of_in_flight_fet
[06:02:14] <icinga-wm>	 data_nodes: 5, number_of_nodes: 5, active_shards_percent_as_number: 83.28729281767956, status: yellow, timed_out: False https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:02:14] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1001 is CRITICAL: CRITICAL - elasticsearch inactive shards 242 threshold =0.15 breach: unassigned_shards: 242, number_of_data_nodes: 5, active_shards_percent_as_number: 83.28729281767956, number_of_pending_tasks: 0, task_max_waiting_in_queue_millis: 0, number_of_nodes: 5, number_of_in_flight_fetch: 0, status: yellow, cluster_name: cloudelastic-psi-eqiad, reloc
[06:02:14] <icinga-wm>	 delayed_unassigned_shards: 0, initializing_shards: 0, active_shards: 1206, timed_out: False, active_primary_shards: 723 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:02:35] <ryankemper>	 ^ Working on downtiming those manually... guessing the master just restarted (expected), which is why each host is complaining
[06:02:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9400 on cloudelastic1004 is CRITICAL: CRITICAL - elasticsearch inactive shards 250 threshold =0.15 breach: number_of_data_nodes: 5, number_of_nodes: 5, number_of_pending_tasks: 0, delayed_unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, relocating_shards: 0, status: yellow, cluster_name: cloudelastic-omega-eqiad, unassigned_shards: 250, number_of_in_flight_fetch: 0, tim
[06:02:36] <icinga-wm>	 tive_shards_percent_as_number: 83.31108144192257, initializing_shards: 0, active_shards: 1248, active_primary_shards: 748 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:02:36] <icinga-wm>	 PROBLEM - ElasticSearch health check for shards on 9600 on cloudelastic1003 is CRITICAL: CRITICAL - elasticsearch inactive shards 242 threshold =0.15 breach: number_of_data_nodes: 5, delayed_unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, relocating_shards: 0, cluster_name: cloudelastic-psi-eqiad, number_of_pending_tasks: 0, timed_out: False, active_shards: 1206, status: yellow, number_of_nodes: 5, active_shards_percen
[06:02:36] <icinga-wm>	 8729281767956, number_of_in_flight_fetch: 0, initializing_shards: 0, unassigned_shards: 242, active_primary_shards: 723 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:03:32] <ryankemper>	 (Downtime set for two hours on `cloudelastic100[1-6]`)
[06:03:46] <icinga-wm>	 RECOVERY - Check systemd state on cloudelastic1004 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:04:06] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1003 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: number_of_nodes: 6, cluster_name: cloudelastic-psi-eqiad, task_max_waiting_in_queue_millis: 0, initializing_shards: 2, status: yellow, relocating_shards: 0, active_shards: 1378, active_shards_percent_as_number: 95.1657458563536, number_of_pending_tasks: 0, timed_out: False, number_of_data_nodes: 6,
[06:04:06] <icinga-wm>	 ght_fetch: 0, delayed_unassigned_shards: 0, active_primary_shards: 723, unassigned_shards: 68 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:34] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: active_shards_percent_as_number: 100.0, unassigned_shards: 0, number_of_data_nodes: 6, relocating_shards: 0, status: green, active_primary_shards: 723, delayed_unassigned_shards: 0, number_of_in_flight_fetch: 0, task_max_waiting_in_queue_millis: 0, cluster_name: cloudelastic-psi-eqiad, number_of_no
[06:04:34] <icinga-wm>	 ards: 1448, timed_out: False, number_of_pending_tasks: 0, initializing_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:34] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1004 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: delayed_unassigned_shards: 0, relocating_shards: 0, number_of_pending_tasks: 1, initializing_shards: 4, timed_out: False, task_max_waiting_in_queue_millis: 0, unassigned_shards: 63, status: yellow, active_primary_shards: 759, number_of_data_nodes: 6, cluster_name: cloudelastic-chi-eqiad, active_sha
[06:04:34] <icinga-wm>	 _shards_percent_as_number: 95.59500328731097, number_of_nodes: 6, number_of_in_flight_fetch: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:34] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9400 on cloudelastic1003 is OK: OK - elasticsearch status cloudelastic-omega-eqiad: active_shards_percent_as_number: 100.0, active_shards: 1498, task_max_waiting_in_queue_millis: 0, initializing_shards: 0, delayed_unassigned_shards: 0, cluster_name: cloudelastic-omega-eqiad, active_primary_shards: 748, number_of_in_flight_fetch: 0, unassigned_shards: 0, number_of_nodes: 6, reloc
[06:04:34] <icinga-wm>	 number_of_pending_tasks: 0, status: green, number_of_data_nodes: 6, timed_out: False https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:36] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: status: yellow, cluster_name: cloudelastic-chi-eqiad, timed_out: False, number_of_data_nodes: 6, active_primary_shards: 759, number_of_in_flight_fetch: 0, active_shards: 1472, unassigned_shards: 45, active_shards_percent_as_number: 96.7784352399737, relocating_shards: 0, initializing_shards: 4, del
[06:04:36] <icinga-wm>	 hards: 0, number_of_nodes: 6, task_max_waiting_in_queue_millis: 205, number_of_pending_tasks: 2 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:36] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1001 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: number_of_in_flight_fetch: 0, unassigned_shards: 37, cluster_name: cloudelastic-chi-eqiad, delayed_unassigned_shards: 0, task_max_waiting_in_queue_millis: 250, timed_out: False, relocating_shards: 0, number_of_data_nodes: 6, number_of_pending_tasks: 3, active_shards: 1480, active_primary_shards: 75
[06:04:36] <icinga-wm>	 percent_as_number: 97.30440499671269, initializing_shards: 4, status: yellow, number_of_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:42] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9400 on cloudelastic1001 is OK: OK - elasticsearch status cloudelastic-omega-eqiad: cluster_name: cloudelastic-omega-eqiad, status: green, active_primary_shards: 748, timed_out: False, number_of_nodes: 6, relocating_shards: 0, number_of_pending_tasks: 0, unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, number_of_data_nodes: 6, active_shards_percent_as_number: 100.0, de
[06:04:42] <icinga-wm>	 shards: 0, active_shards: 1498, number_of_in_flight_fetch: 0, initializing_shards: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:42] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9400 on cloudelastic1002 is OK: OK - elasticsearch status cloudelastic-omega-eqiad: number_of_data_nodes: 6, active_primary_shards: 748, active_shards: 1498, relocating_shards: 0, number_of_nodes: 6, timed_out: False, initializing_shards: 0, number_of_in_flight_fetch: 0, task_max_waiting_in_queue_millis: 0, number_of_pending_tasks: 0, status: green, delayed_unassigned_shards: 0,
[06:04:42] <icinga-wm>	 s: 0, cluster_name: cloudelastic-omega-eqiad, active_shards_percent_as_number: 100.0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:42] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1003 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: initializing_shards: 4, delayed_unassigned_shards: 0, number_of_pending_tasks: 6, unassigned_shards: 19, relocating_shards: 0, active_primary_shards: 759, number_of_nodes: 6, number_of_data_nodes: 6, task_max_waiting_in_queue_millis: 1501, cluster_name: cloudelastic-chi-eqiad, number_of_in_flight_f
[06:04:42] <icinga-wm>	 hards_percent_as_number: 98.4878369493754, active_shards: 1498, timed_out: False, status: yellow https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:48] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[06:05:14] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: delayed_unassigned_shards: 0, active_shards_percent_as_number: 100.0, relocating_shards: 0, number_of_nodes: 6, active_primary_shards: 723, timed_out: False, number_of_data_nodes: 6, cluster_name: cloudelastic-psi-eqiad, task_max_waiting_in_queue_millis: 0, number_of_in_flight_fetch: 0, initializin
[06:05:14] <icinga-wm>	 ve_shards: 1448, unassigned_shards: 0, number_of_pending_tasks: 0, status: green https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:05:14] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9600 on cloudelastic1001 is OK: OK - elasticsearch status cloudelastic-psi-eqiad: status: green, delayed_unassigned_shards: 0, initializing_shards: 0, active_shards: 1448, relocating_shards: 0, number_of_nodes: 6, active_primary_shards: 723, unassigned_shards: 0, cluster_name: cloudelastic-psi-eqiad, number_of_data_nodes: 6, task_max_waiting_in_queue_millis: 0, timed_out: False,
[06:05:14] <icinga-wm>	 ght_fetch: 0, active_shards_percent_as_number: 100.0, number_of_pending_tasks: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:05:36] <icinga-wm>	 RECOVERY - ElasticSearch health check for shards on 9400 on cloudelastic1004 is OK: OK - elasticsearch status cloudelastic-omega-eqiad: number_of_in_flight_fetch: 0, timed_out: False, active_primary_shards: 748, relocating_shards: 0, unassigned_shards: 0, active_shards_percent_as_number: 100.0, delayed_unassigned_shards: 0, active_shards: 1498, task_max_waiting_in_queue_millis: 0, initializing_shards: 0, status: green, number_of_
[06:05:36] <icinga-wm>	 uster_name: cloudelastic-omega-eqiad, number_of_pending_tasks: 0, number_of_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:06:18] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[06:06:40] <icinga-wm>	 PROBLEM - Check systemd state on cloudelastic1006 is CRITICAL: CRITICAL - degraded: The following units failed: prometheus-wmf-elasticsearch-exporter-9200.service,prometheus-wmf-elasticsearch-exporter-9400.service,prometheus-wmf-elasticsearch-exporter-9600.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:08:44] <elukey>	 !log powercycle ms-be2035 - no ssh available, no metrics for hours, I/O errors registered in the main tty on serial console
[06:08:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:10:40] <icinga-wm>	 PROBLEM - Host ms-be2035 is DOWN: PING CRITICAL - Packet loss = 100%
[06:12:16] <icinga-wm>	 RECOVERY - Host ms-be2035 is UP: PING OK - Packet loss = 0%, RTA = 33.03 ms
[06:13:02] <icinga-wm>	 PROBLEM - puppet last run on ms-be2035 is CRITICAL: CRITICAL: Puppet last ran 10 hours ago https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[06:13:35] <elukey>	 Cc: godog: --^
[06:13:50] <_joe_>	 oh well good morning
[06:13:51] <elukey>	 (not sure if there is anything to follow up on)
[06:16:43] <wikibugs>	 (Merged) jenkins-bot: PageProps: be prepared that PageIdentity is not proper title [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693028 (https://phabricator.wikimedia.org/T283170) (owner: Jforrester)
[06:17:24] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[06:19:04] <icinga-wm>	 RECOVERY - puppet last run on ms-be2035 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[06:20:18] <icinga-wm>	 RECOVERY - SSH on ms-be2035 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[06:23:46] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[06:25:10] <logmsgbot>	 !log ladsgroup@deploy1002 Synchronized php-1.37.0-wmf.6/includes/PageProps.php: Backport: [[gerrit:693028|PageProps: be prepared that PageIdentity is not proper title (T283170)]] (duration: 01m 06s)
[06:25:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:25:14] <stashbot>	 T283170: Special:RecentChanges in it.wikiversity dies with an internal error - https://phabricator.wikimedia.org/T283170
[06:29:22] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16113 and previous config saved to /var/cache/conftool/dbconfig/20210520-062921-root.json
[06:29:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:31:47] <wikibugs>	 (PS1) Marostegui: check_private_data_report: Remove references to labsdb [puppet] - https://gerrit.wikimedia.org/r/693060 (https://phabricator.wikimedia.org/T282662)
[06:32:19] <wikibugs>	 (CR) Marostegui: [C: +2] check_private_data_report: Remove references to labsdb [puppet] - https://gerrit.wikimedia.org/r/693060 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[06:33:00] <icinga-wm>	 RECOVERY - Check systemd state on cloudelastic1006 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:39:20] <icinga-wm>	 PROBLEM - Check systemd state on cumin1001 is CRITICAL: CRITICAL - degraded: The following units failed: database-backups-snapshots.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:44:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16114 and previous config saved to /var/cache/conftool/dbconfig/20210520-064425-root.json
[06:44:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:48:04] <icinga-wm>	 RECOVERY - Check systemd state on cumin1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:49:47] <wikibugs>	 (CR) Giuseppe Lavagetto: [C: +1] Add tokens and users for mwdebug service [puppet] - https://gerrit.wikimedia.org/r/692667 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[06:50:08] <ryankemper>	 !log T283223 Write queue not draining fast enough for the next node to reboot, will finish reboot tomorrow
[06:50:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:50:11] <wikibugs>	 (CR) Giuseppe Lavagetto: [C: +1] Add tokens for mwdebug service [labs/private] - https://gerrit.wikimedia.org/r/692672 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[06:50:12] <stashbot>	 T283223: Reboot cloudelastic* to apply security updates - https://phabricator.wikimedia.org/T283223
[06:50:12] <logmsgbot>	 !log ryankemper@cumin1001 END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - T283223
[06:50:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:55:21] <wikibugs>	 (CR) Effie Mouzeli: [C: +2] Add tokens for mwdebug service [labs/private] - https://gerrit.wikimedia.org/r/692672 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[06:56:36] <wikibugs>	 (CR) Effie Mouzeli: [V: +2 C: +2] Add tokens for mwdebug service [labs/private] - https://gerrit.wikimedia.org/r/692672 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[06:57:09] <wikibugs>	 (CR) Effie Mouzeli: [C: +2] Add tokens and users for mwdebug service [puppet] - https://gerrit.wikimedia.org/r/692667 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[06:57:42] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=routinator site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:58:22] <wikibugs>	 (PS1) Marostegui: db1143: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/693061
[06:59:29] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16115 and previous config saved to /var/cache/conftool/dbconfig/20210520-065928-root.json
[06:59:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:00:02] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[07:01:51] <wikibugs>	 (CR) Marostegui: [C: +2] db1143: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/693061 (owner: Marostegui)
[07:11:08] <godog>	 elukey: thanks for the reboot! I'll take a look shortly
[07:14:33] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16116 and previous config saved to /var/cache/conftool/dbconfig/20210520-071432-root.json
[07:14:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:24] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P16117 and previous config saved to /var/cache/conftool/dbconfig/20210520-071723-marostegui.json
[07:17:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:28] <godog>	 yeah looks like the host is back no problem
[07:18:08] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:20:49] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:21:29] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[07:23:37] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[07:23:53] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:24:57] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:25:01] <wikibugs>	 (PS1) Marostegui: orchestrator.conf.json: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693123 (https://phabricator.wikimedia.org/T282662)
[07:25:57] <wikibugs>	 (PS2) Marostegui: orchestrator.conf.json: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693123 (https://phabricator.wikimedia.org/T282662)
[07:33:16] <wikibugs>	 (PS1) Effie Mouzeli: Add a namespace for mwdebug service [deployment-charts] - https://gerrit.wikimedia.org/r/693124 (https://phabricator.wikimedia.org/T283056)
[07:34:26] <wikibugs>	 (CR) JMeybohm: [C: +1] Add a namespace for mwdebug service [deployment-charts] - https://gerrit.wikimedia.org/r/693124 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[07:34:29] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:36:27] <wikibugs>	 (CR) Effie Mouzeli: [C: +2] Add a namespace for mwdebug service [deployment-charts] - https://gerrit.wikimedia.org/r/693124 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[07:37:17] <icinga-wm>	 PROBLEM - Check whether ferm is active by checking the default input chain on labstore1006 is CRITICAL: ERROR ferm input drop default policy not set, ferm might not have been started correctly https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[07:38:36] <wikibugs>	 (Merged) jenkins-bot: Add a namespace for mwdebug service [deployment-charts] - https://gerrit.wikimedia.org/r/693124 (https://phabricator.wikimedia.org/T283056) (owner: Effie Mouzeli)
[07:39:47] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:41:45] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] START helmfile.d/admin 'apply'.
[07:41:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:44:32] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
[07:44:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:53] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:50:07] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:56:09] <icinga-wm>	 PROBLEM - Check systemd state on sodium is CRITICAL: CRITICAL - degraded: The following units failed: update-ubuntu-mirror.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[08:01:05] <icinga-wm>	 PROBLEM - OSPF status on cr2-esams is CRITICAL: OSPFv2: 3/4 UP : OSPFv3: 3/3 UP : 4 v2 P2P interfaces vs. 3 v3 P2P interfaces https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[08:03:31] <icinga-wm>	 PROBLEM - Check systemd state on cloudelastic1003 is CRITICAL: CRITICAL - degraded: The following units failed: prometheus-wmf-elasticsearch-exporter-9200.service,prometheus-wmf-elasticsearch-exporter-9400.service,prometheus-wmf-elasticsearch-exporter-9600.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[08:05:59] <icinga-wm>	 RECOVERY - OSPF status on cr2-esams is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[08:06:31] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-me
[08:07:07] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on appserver in codfw on alert1001 is CRITICAL: cluster=appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=appserver&var-method=GET
[08:08:33] <icinga-wm>	 RECOVERY - Check whether ferm is active by checking the default input chain on labstore1006 is OK: OK ferm input default policy is set https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[08:09:25] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=appserver&var-method=GET
[08:11:09] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[08:15:51] <wikibugs>	 (PS1) Muehlenhoff: Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589)
[08:22:51] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-me
[08:23:27] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on appserver in codfw on alert1001 is CRITICAL: cluster=appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=appserver&var-method=GET
[08:24:15] <elukey>	 mmmm
[08:24:49] <elukey>	 the only thing that I can think of is restbase-async running on codfw appservers
[08:24:52] <elukey>	 app/api
[08:25:07] <elukey>	 but in theory I'd expect only api-appservers
[08:25:45] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=appserver&var-method=GET
[08:27:18] <wikibugs>	 (PS1) Andrew-WMDE: [beta] Enable back button in the VisualEditor transclusion dialog [mediawiki-config] - https://gerrit.wikimedia.org/r/693131 (https://phabricator.wikimedia.org/T272354)
[08:27:23] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[08:30:14] <_joe_>	 elukey: wait what?
[08:30:25] <_joe_>	 restbase-async doesn't run on codfw appservers
[08:30:33] <_joe_>	 if it's doing so, it's a huge incident
[08:30:38] <_joe_>	 also, please move over
[08:30:53] <_joe_>	 (to a chat network not controlled by a piece of shit)
[08:31:40] <elukey>	 _joe_ no no it doesn't seem so, it seems general monitoring being a little slow
[08:31:47] <elukey>	 I checked on one api appserver in codfw
[08:31:53] <elukey>	 didn't see anything weird so far
[08:31:59] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-me
[08:32:10] <elukey>	 move over where? Operations on libera? I am there, but icinga-wm is not :D
[08:32:38] <_joe_>	 yeah I just asked about it
[08:32:45] <_joe_>	 but I'd prefer not to chat here anymore
[08:34:17] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[08:37:03] <icinga-wm>	 PROBLEM - Router interfaces on cr2-esams is CRITICAL: CRITICAL: host 91.198.174.244, interfaces up: 68, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[08:41:49] <icinga-wm>	 RECOVERY - Router interfaces on cr2-esams is OK: OK: host 91.198.174.244, interfaces up: 69, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[08:47:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16118 and previous config saved to /var/cache/conftool/dbconfig/20210520-084746-root.json
[08:47:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:52:03] <wikibugs>	 (CR) Thiemo Kreuz (WMDE): [C: +1] [beta] Enable back button in the VisualEditor transclusion dialog [mediawiki-config] - https://gerrit.wikimedia.org/r/693131 (https://phabricator.wikimedia.org/T272354) (owner: Andrew-WMDE)
[08:54:23] <wikibugs>	 (PS1) Filippo Giunchedi: icinga: move icinga-wm to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693132 (https://phabricator.wikimedia.org/T283213)
[08:54:56] <godog>	 seeking reviewers for ^
[08:55:07] <wikibugs>	 (CR) Kormat: cumin: Remove labsdb* (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[08:55:09] <wikibugs>	 (CR) RhinosF1: [C: +1] icinga: move icinga-wm to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693132 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[08:55:46] <godog>	 thanks RhinosF1 !
[08:55:55] <RhinosF1>	 godog: no
[08:55:59] <RhinosF1>	 Np*
[08:56:03] <wikibugs>	 (CR) Filippo Giunchedi: [C: +2] icinga: move icinga-wm to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693132 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[08:56:13] <wikibugs>	 (CR) Kormat: [C: +1] orchestrator.conf.json: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693123 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[08:56:50] <godog>	 !log move icinga-wm to libera.chat
[08:56:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:57:08] <godog>	 ok I'll stop writing here
[09:00:40] <wikibugs>	 (CR) Marostegui: [C: +2] orchestrator.conf.json: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693123 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[09:02:51] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16119 and previous config saved to /var/cache/conftool/dbconfig/20210520-090250-root.json
[09:02:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:06:41] <wikibugs>	 (PS1) Filippo Giunchedi: alertmanager: move to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693133 (https://phabricator.wikimedia.org/T283213)
[09:09:25] <wikibugs>	 (CR) Filippo Giunchedi: [C: +2] alertmanager: move to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693133 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[09:11:44] <_joe_>	 we still need to move stashbot before we can fully migrate
[09:12:41] <wikibugs>	 (PS2) Marostegui: cumin: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662)
[09:12:58] <wikibugs>	 (CR) Marostegui: cumin: Remove labsdb* (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[09:17:55] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16120 and previous config saved to /var/cache/conftool/dbconfig/20210520-091754-root.json
[09:17:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:20:32] <wikibugs>	 (CR) Kormat: [C: +1] cumin: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[09:20:44] <wikibugs>	 (CR) Marostegui: [C: +2] cumin: Remove labsdb* [puppet] - https://gerrit.wikimedia.org/r/693059 (https://phabricator.wikimedia.org/T282662) (owner: Marostegui)
[09:28:29] <wikibugs>	 (CR) Ayounsi: [C: +1] Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[09:32:58] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16121 and previous config saved to /var/cache/conftool/dbconfig/20210520-093257-root.json
[09:33:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:35:11] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P16122 and previous config saved to /var/cache/conftool/dbconfig/20210520-093510-marostegui.json
[09:35:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:37:49] <wikibugs>	 (CR) Andrew-WMDE: [C: +2] "Deploy" [mediawiki-config] - https://gerrit.wikimedia.org/r/693131 (https://phabricator.wikimedia.org/T272354) (owner: Andrew-WMDE)
[09:38:39] <wikibugs>	 (Merged) jenkins-bot: [beta] Enable back button in the VisualEditor transclusion dialog [mediawiki-config] - https://gerrit.wikimedia.org/r/693131 (https://phabricator.wikimedia.org/T272354) (owner: Andrew-WMDE)
[09:45:54] <wikibugs>	 (PS10) Kormat: mariadb: Convert pt-heartbeat to a systemd service. [puppet] - https://gerrit.wikimedia.org/r/665324 (https://phabricator.wikimedia.org/T252528)
[09:58:05] <wikibugs>	 (CR) Elukey: "I am trying to import a more recent version of Istio (1.9.x series) since it seems supported by kubeflow (and there are also some CVEs fix" [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: Elukey)
[10:00:04] <jouncebot>	 mvolz: I, the Bot under the Fountain, allow thee, The Deployer, to do Services – Citoid /  Zotero deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1000).
[10:10:51] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] START helmfile.d/admin 'apply'.
[10:10:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:10:54] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
[10:10:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:15:00] <marostegui>	 !log Deploy schema change on s1 codfw, lag will appear in codfw T266486 T268392 T273360
[10:15:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:15:06] <stashbot>	 T268392: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392
[10:15:07] <stashbot>	 T273360: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360
[10:15:07] <stashbot>	 T266486: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486
[10:20:34] <wikibugs>	 (CR) Hnowlan: maps: DB performance improvements (1 comment) [puppet] - https://gerrit.wikimedia.org/r/685743 (owner: MSantos)
[10:27:51] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (Joe)
[10:28:53] <wikibugs>	 SRE, CAS-SSO, Patch-For-Review: Kryo memcached transcoder broken in CAS 6.3/6.4 - https://phabricator.wikimedia.org/T273867 (MoritzMuehlenhoff) p:Medium→Low
[10:33:38] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (Joe)
[10:33:42] <wikibugs>	 SRE, Security-Team, CAS-SSO, User-jbond: CAS Single Logout Flow - https://phabricator.wikimedia.org/T233941 (MoritzMuehlenhoff)
[10:38:01] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (Marostegui)
[10:39:38] <wikibugs>	 (PS1) Filippo Giunchedi: icinga: move logmsgbot to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213)
[10:39:53] <wikibugs>	 (CR) jerkins-bot: [V: -1] icinga: move logmsgbot to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[10:40:39] <wikibugs>	 (PS2) Filippo Giunchedi: icinga: move logmsgbot to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213)
[10:42:02] <wikibugs>	 (CR) Filippo Giunchedi: [V: +1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29624/console" [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[10:43:00] <wikibugs>	 (CR) Filippo Giunchedi: [V: +1] "As per task, this will need to be coordinated once stashbot is on libera.chat too" [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[10:50:19] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16123 and previous config saved to /var/cache/conftool/dbconfig/20210520-105018-root.json
[10:50:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:59:38] <wikibugs>	 SRE, Security-Team, CAS-SSO, User-jbond: CAS Single Logout Flow - https://phabricator.wikimedia.org/T233941 (MoritzMuehlenhoff) Status update:   The Single Logout has been implemented across all applications using mod_cas and tested for all affected applications. It works fine for all application...
[11:00:04] <jouncebot>	 Amir1, Lucas_WMDE, apergos, and duesen: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) EU Backport and Config training deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1100).
[11:00:16] <Lucas_WMDE>	 o/
[11:00:24] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (jbond)
[11:02:57] <Lucas_WMDE>	 anything to train or deploy?
[11:03:06] <Lucas_WMDE>	 (train as in training, not deployment train :D)
[11:03:30] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (MoritzMuehlenhoff)
[11:04:32] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related channels to Libera - https://phabricator.wikimedia.org/T283230 (MoritzMuehlenhoff)
[11:05:23] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16124 and previous config saved to /var/cache/conftool/dbconfig/20210520-110522-root.json
[11:05:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:14:29] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (Aklapper)
[11:20:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16125 and previous config saved to /var/cache/conftool/dbconfig/20210520-112026-root.json
[11:20:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:20:55] <apergos>	 I am in a colab session for the cpt code jam, but there was no one scheduled for training during this backport window.
[11:21:00] <apergos>	 my apologies for not being around.
[11:21:11] <wikibugs>	 SRE, netops: routinator: create gabage collection job - https://phabricator.wikimedia.org/T282469 (jbond) >>! In T282469#7099149, @ayounsi wrote: > @jbond would the upcoming changes in https://github.com/NLnetLabs/routinator/releases/tag/v0.9.0-rc1 solve that issue by using a database instead of the file...
[11:22:27] <wikibugs>	 (PS1) Jcrespo: dbbackups: Switchover codfw s5 backups from db2099 to db2101 (buster) [puppet] - https://gerrit.wikimedia.org/r/693142 (https://phabricator.wikimedia.org/T283235)
[11:31:06] <wikibugs>	 SRE, Traffic, vm-requests: Please create two Ganeti VMs for Wikidough - https://phabricator.wikimedia.org/T283192 (ssingh) >>! In T283192#7099749, @Dzahn wrote: > VMs have been created, added to site.pp with "insetup", added to DHCP and partma. >  > OS has been installed (buster) and puppet certs sig...
[11:31:46] <wikibugs>	 (PS7) Jbond: C:admin: add ability to manage home [puppet] - https://gerrit.wikimedia.org/r/691131 (https://phabricator.wikimedia.org/T280989)
[11:33:38] <wikibugs>	 (CR) Ssingh: "As per the discussion in https://phabricator.wikimedia.org/T252132#7098776, we decided that this should be done manually through operation" [dns] - https://gerrit.wikimedia.org/r/692625 (https://phabricator.wikimedia.org/T252132) (owner: Ssingh)
[11:34:26] <wikibugs>	 (CR) Ssingh: Add zone for wikimedia-dns.org (Wikidough) (2 comments) [dns] - https://gerrit.wikimedia.org/r/692625 (https://phabricator.wikimedia.org/T252132) (owner: Ssingh)
[11:35:30] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16126 and previous config saved to /var/cache/conftool/dbconfig/20210520-113529-root.json
[11:35:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:52:41] <wikibugs>	 (CR) Muehlenhoff: [C: +1] "Looks good!" [puppet] - https://gerrit.wikimedia.org/r/692635 (owner: Jbond)
[11:58:06] <wikibugs>	 (CR) Muehlenhoff: [C: +1] "Looks good" [puppet] - https://gerrit.wikimedia.org/r/692632 (owner: Jbond)
[12:00:04] <jouncebot>	 Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1200)
[12:00:34] <wikibugs>	 (CR) Muehlenhoff: (test) migrate sretest to new role_data  profile (1 comment) [puppet] - https://gerrit.wikimedia.org/r/692636 (owner: Jbond)
[12:02:29] <wikibugs>	 (PS3) Jbond: (test) migrate sretest to new role_data  profile [puppet] - https://gerrit.wikimedia.org/r/692636
[12:02:48] <wikibugs>	 (CR) Volans: [C: +1] "LGTM, I would probably add something to the MOTD too if you don't mind." [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[12:03:27] <wikibugs>	 (CR) jerkins-bot: [V: -1] (test) migrate sretest to new role_data  profile [puppet] - https://gerrit.wikimedia.org/r/692636 (owner: Jbond)
[12:03:29] <wikibugs>	 (CR) Jbond: (test) migrate sretest to new role_data  profile (1 comment) [puppet] - https://gerrit.wikimedia.org/r/692636 (owner: Jbond)
[12:11:55] <wikibugs>	 (PS1) Marostegui: site.pp: s3 is no longer default for new wikis. [puppet] - https://gerrit.wikimedia.org/r/693148 (https://phabricator.wikimedia.org/T259438)
[12:12:15] <wikibugs>	 (PS1) Jbond: (WIP) create a logout.d proffile for managing logout scripts [puppet] - https://gerrit.wikimedia.org/r/693149
[12:12:30] <wikibugs>	 (CR) Marostegui: [C: +2] site.pp: s3 is no longer default for new wikis. [puppet] - https://gerrit.wikimedia.org/r/693148 (https://phabricator.wikimedia.org/T259438) (owner: Marostegui)
[12:14:11] <wikibugs>	 (CR) Muehlenhoff: (WIP) create a logout.d proffile for managing logout scripts (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693149 (owner: Jbond)
[12:25:23] <wikibugs>	 (CR) Marostegui: "What about dbprov? Does it need to be removed from there too?" [puppet] - https://gerrit.wikimedia.org/r/692341 (https://phabricator.wikimedia.org/T280751) (owner: Jcrespo)
[12:27:12] <wikibugs>	 (CR) Jcrespo: "> Patch Set 2:" [puppet] - https://gerrit.wikimedia.org/r/692341 (https://phabricator.wikimedia.org/T280751) (owner: Jcrespo)
[12:28:08] <wikibugs>	 (CR) Marostegui: "> Patch Set 2:" [puppet] - https://gerrit.wikimedia.org/r/692341 (https://phabricator.wikimedia.org/T280751) (owner: Jcrespo)
[12:30:49] <kormat>	 !log Deploying wmfmariadbpy 0.7 T283228
[12:30:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:30:55] <stashbot>	 T283228: Deploy wmfmariadbpy 0.7 - https://phabricator.wikimedia.org/T283228
[12:31:49] <wikibugs>	 (CR) Kormat: [C: +2] "It's Go time :)" [puppet] - https://gerrit.wikimedia.org/r/665324 (https://phabricator.wikimedia.org/T252528) (owner: Kormat)
[12:32:51] <wikibugs>	 (CR) Jcrespo: "> Patch Set 2:" [puppet] - https://gerrit.wikimedia.org/r/692341 (https://phabricator.wikimedia.org/T280751) (owner: Jcrespo)
[12:33:23] <wikibugs>	 (CR) Marostegui: [C: +1] "Thanks for double checking :-)" [puppet] - https://gerrit.wikimedia.org/r/692341 (https://phabricator.wikimedia.org/T280751) (owner: Jcrespo)
[12:35:44] <wikibugs>	 (CR) Jcrespo: "@marostegui To clarify, this is an example of when it is removed from the dbbackups configuration FYI. Here is when it becomes passive, bu" [puppet] - https://gerrit.wikimedia.org/r/693142 (https://phabricator.wikimedia.org/T283235) (owner: Jcrespo)
[12:36:36] <wikibugs>	 (CR) Marostegui: "thank you - I didn't remember the s6 one" [puppet] - https://gerrit.wikimedia.org/r/693142 (https://phabricator.wikimedia.org/T283235) (owner: Jcrespo)
[12:37:01] <wikibugs>	 (CR) Jcrespo: [C: -1] "Waiting for green light from DBAs (ticket stalled)." [puppet] - https://gerrit.wikimedia.org/r/693142 (https://phabricator.wikimedia.org/T283235) (owner: Jcrespo)
[12:37:26] <wikibugs>	 (CR) Marostegui: "It will take around 1 month if all goes well" [puppet] - https://gerrit.wikimedia.org/r/693142 (https://phabricator.wikimedia.org/T283235) (owner: Jcrespo)
[12:39:00] <wikibugs>	 SRE, wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (Majavah)
[12:41:59] <wikibugs>	 SRE, MW-on-K8s, serviceops: Create a mwdebug deployment for mediawiki on kubernetes - https://phabricator.wikimedia.org/T283056 (Joe)
[12:42:09] <wikibugs>	 SRE, Traffic: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain - https://phabricator.wikimedia.org/T283165 (BBlack) bump for testing purposes
[12:44:42] <wikibugs>	 (PS2) Jbond: (WIP) create a logout.d profile for managing logout scripts [puppet] - https://gerrit.wikimedia.org/r/693149
[12:46:13] <wikibugs>	 (CR) jerkins-bot: [V: -1] (WIP) create a logout.d profile for managing logout scripts [puppet] - https://gerrit.wikimedia.org/r/693149 (owner: Jbond)
[12:50:04] <wikibugs>	 (PS1) Andrew Bogott: Nova vendordata.txt: fix up new VMs that have chrony installed [puppet] - https://gerrit.wikimedia.org/r/693152 (https://phabricator.wikimedia.org/T280801)
[12:55:15] <wikibugs>	 (CR) Andrew Bogott: [C: +2] Nova vendordata.txt: fix up new VMs that have chrony installed [puppet] - https://gerrit.wikimedia.org/r/693152 (https://phabricator.wikimedia.org/T280801) (owner: Andrew Bogott)
[13:00:07] <jouncebot>	 hashar and dancy: I, the Bot under the Fountain, allow thee, The Deployer, to do MediaWiki train - European+American Version deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1300).
[13:01:59] <wmf-insecte2>	 ^ it is still blocked
[13:02:03] <wmf-insecte2>	 bah
[13:03:07] <wikibugs>	 (CR) Hashar: [C: +2] ActorStore: avoid throwing in case of invalid usernames [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693029 (https://phabricator.wikimedia.org/T283167) (owner: Jforrester)
[13:03:17] <wikibugs>	 (CR) Hashar: [C: +2] UploadFromStash: convert default user from false to null [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693030 (https://phabricator.wikimedia.org/T283196) (owner: Jforrester)
[13:05:27] <wikibugs>	 (Abandoned) Andrew Bogott: Install systemd-timesyncd on Bullseye and later [puppet] - https://gerrit.wikimedia.org/r/691960 (https://phabricator.wikimedia.org/T280801) (owner: Andrew Bogott)
[13:15:38] <wikibugs>	 (CR) Muehlenhoff: "> Patch Set 1: Code-Review+1" [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[13:16:06] <wikibugs>	 (CR) Hnowlan: [C: +1] "lgtm!" [deployment-charts] - https://gerrit.wikimedia.org/r/693008 (https://phabricator.wikimedia.org/T261367) (owner: Clarakosi)
[13:17:02] <wikibugs>	 (PS1) Jbond: P::puppetdb::microservice: add pki to acl for puppetdb microservice [puppet] - https://gerrit.wikimedia.org/r/693158
[13:22:44] <wikibugs>	 (Merged) jenkins-bot: ActorStore: avoid throwing in case of invalid usernames [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693029 (https://phabricator.wikimedia.org/T283167) (owner: Jforrester)
[13:24:22] <wikibugs>	 (Merged) jenkins-bot: UploadFromStash: convert default user from false to null [core] (wmf/1.37.0-wmf.6) - https://gerrit.wikimedia.org/r/693030 (https://phabricator.wikimedia.org/T283196) (owner: Jforrester)
[13:26:19] <wikibugs>	 (PS1) Jbond: puppetdb: add site specific cnames for puppetdb [dns] - https://gerrit.wikimedia.org/r/693159 (https://phabricator.wikimedia.org/T283185)
[13:26:30] <wikibugs>	 (PS2) Jbond: P::puppetdb::microservice: add pki to acl for puppetdb microservice [puppet] - https://gerrit.wikimedia.org/r/693158 (https://phabricator.wikimedia.org/T283185)
[13:26:34] <wikibugs>	 SRE, Data-Persistence-Backup, SRE-swift-storage, Epic, Goal: WMF media storage must be adequately backed up - https://phabricator.wikimedia.org/T262668 (LSobanski)
[13:26:41] <wikibugs>	 (CR) Hnowlan: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29625/console" [puppet] - https://gerrit.wikimedia.org/r/685743 (owner: MSantos)
[13:27:00] <wikibugs>	 (CR) jerkins-bot: [V: -1] puppetdb: add site specific cnames for puppetdb [dns] - https://gerrit.wikimedia.org/r/693159 (https://phabricator.wikimedia.org/T283185) (owner: Jbond)
[13:35:42] <wikibugs>	 (CR) Volans: "Should we maybe use a discovery name instead? I'm unsure." [dns] - https://gerrit.wikimedia.org/r/693159 (https://phabricator.wikimedia.org/T283185) (owner: Jbond)
[13:37:20] <wikibugs>	 (CR) Volans: [V: +2 C: +2] Release v0.3.0 [software/debmonitor/deploy] - https://gerrit.wikimedia.org/r/692932 (owner: Volans)
[13:37:26] <wikibugs>	 SRE, CAS-SSO: Cookbook for centralised logouts and session status queries - https://phabricator.wikimedia.org/T283242 (MoritzMuehlenhoff)
[13:39:52] <logmsgbot>	 !log volans@deploy1002 Started deploy [debmonitor/deploy@444b931]: Release v0.3.0
[13:39:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:40:02] <wikibugs>	 (PS2) Jbond: puppetdb: add site specific cnames for puppetdb [dns] - https://gerrit.wikimedia.org/r/693159 (https://phabricator.wikimedia.org/T283185)
[13:41:13] <logmsgbot>	 !log volans@deploy1002 Finished deploy [debmonitor/deploy@444b931]: Release v0.3.0 (duration: 01m 20s)
[13:41:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:50:00] <logmsgbot>	 !log hashar@deploy1002 Synchronized php-1.37.0-wmf.6/includes/user/ActorStore.php: ActorStore: avoid throwing in case of invalid usernames T283167 (duration: 01m 05s)
[13:50:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:50:08] <stashbot>	 T283167: InvalidArgumentException: Unable to normalize the provided actor name x.y.z.v/16 - https://phabricator.wikimedia.org/T283167
[13:52:11] <logmsgbot>	 !log hashar@deploy1002 Synchronized php-1.37.0-wmf.6/includes/upload/UploadFromStash.php: UploadFromStash: convert default user from false to null - T283196 (duration: 01m 05s)
[13:52:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:52:15] <stashbot>	 T283196: TypeError: Argument 2 passed to UploadStash::__construct() must implement interface MediaWiki\User\UserIdentity or be null, boolean given, called in /srv/mediawiki/php-1.37.0-wmf.6/includes/upload/UploadFromStash.php on line 66 - https://phabricator.wikimedia.org/T283196
[13:54:03] <hashar>	 group 1 promotion!
[13:55:13] <wikibugs>	 (PS1) Hashar: group1 wikis to 1.37.0-wmf.6 [mediawiki-config] - https://gerrit.wikimedia.org/r/693161
[13:55:15] <wikibugs>	 (CR) Hashar: [C: +2] group1 wikis to 1.37.0-wmf.6 [mediawiki-config] - https://gerrit.wikimedia.org/r/693161 (owner: Hashar)
[13:55:57] <wikibugs>	 (Merged) jenkins-bot: group1 wikis to 1.37.0-wmf.6 [mediawiki-config] - https://gerrit.wikimedia.org/r/693161 (owner: Hashar)
[13:56:00] <wikibugs>	 (PS2) Muehlenhoff: Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589)
[13:56:20] <wikibugs>	 (CR) Herron: [C: +1] icinga: move logmsgbot to libera.chat [puppet] - https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: Filippo Giunchedi)
[13:56:30] <wikibugs>	 (CR) jerkins-bot: [V: -1] Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[13:57:13] <logmsgbot>	 !log hashar@deploy1002 rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.6
[13:57:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:58:19] <logmsgbot>	 !log hashar@deploy1002 Synchronized php: group1 wikis to 1.37.0-wmf.6 (duration: 01m 05s)
[13:58:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:58:36] <wikibugs>	 (CR) Muehlenhoff: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[13:58:53] <wikibugs>	 (PS1) Kormat: mariadb: Use ROW binlog format for heartbeat on dbinventory. [puppet] - https://gerrit.wikimedia.org/r/693162
[13:59:36] <wikibugs>	 (PS2) Kormat: mariadb: Use ROW binlog format for heartbeat on dbinventory. [puppet] - https://gerrit.wikimedia.org/r/693162
[14:00:51] <wikibugs>	 (CR) Kormat: [V: +1] "PCC SUCCESS (DIFF 2 NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29626/console" [puppet] - https://gerrit.wikimedia.org/r/693162 (owner: Kormat)
[14:05:16] <wikibugs>	 (PS3) Muehlenhoff: Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589)
[14:05:47] <wikibugs>	 (CR) jerkins-bot: [V: -1] Skip Cumin/Homer/Spicerack on cumin2001 [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[14:07:57] <wikibugs>	 (CR) Marostegui: [C: +1] mariadb: Use ROW binlog format for heartbeat on dbinventory. [puppet] - https://gerrit.wikimedia.org/r/693162 (owner: Kormat)
[14:08:20] <wikibugs>	 (CR) Kormat: [V: +1 C: +2] mariadb: Use ROW binlog format for heartbeat on dbinventory. [puppet] - https://gerrit.wikimedia.org/r/693162 (owner: Kormat)
[14:09:17] <wikibugs>	 (CR) Muehlenhoff: "CI ignores the lint::ignore. I can either ignore CI's ignorance and +V2 or remove the MOTD again (after all cumin/spicerack won't be prese" [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[14:13:05] <wikibugs>	 (PS1) Herron: librenms: move librenms-wmf to irc.libera.chat [puppet] - https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213)
[14:14:10] <wikibugs>	 (CR) Volans: Skip Cumin/Homer/Spicerack on cumin2001 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: Muehlenhoff)
[14:17:02] <wikibugs>	 (PS1) Ayounsi: Add jstep to ldap_only_users [puppet] - https://gerrit.wikimedia.org/r/693166 (https://phabricator.wikimedia.org/T282521)
[14:17:43] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add jstep to ldap_only_users [puppet] - https://gerrit.wikimedia.org/r/693166 (https://phabricator.wikimedia.org/T282521) (owner: Ayounsi)
[14:20:06] <wikibugs>	 (PS2) Ayounsi: Add jstep to ldap_only_users [puppet] - https://gerrit.wikimedia.org/r/693166 (https://phabricator.wikimedia.org/T282521)
[14:20:40] <wikibugs>	 (CR) Muehlenhoff: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/693166 (https://phabricator.wikimedia.org/T282521) (owner: Ayounsi)
[14:21:20] <wikibugs>	 (PS1) Jbond: Nova vendordata.txt: delete systemd-coredump user [puppet] - https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801)
[14:22:04] <wikibugs>	 (03CR) 10Ayounsi: [C: 03+1] "LGTM, but I'm not sure it's still in use." [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron)
[14:22:23] <wikibugs>	 (03CR) 10Ayounsi: [C: 03+2] Add jstep to ldap_only_users [puppet] - 10https://gerrit.wikimedia.org/r/693166 (https://phabricator.wikimedia.org/T282521) (owner: 10Ayounsi)
[14:27:03] <wikibugs>	 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to ldap/wmf for JStephenson1980 - https://phabricator.wikimedia.org/T282521 (10ayounsi) Apologies for the delay, you should be good to go! Let me know if you're having any issues.
[14:29:37] <wikibugs>	 (03CR) 10Jbond: "> Patch Set 1:" [dns] - 10https://gerrit.wikimedia.org/r/693159 (https://phabricator.wikimedia.org/T283185) (owner: 10Jbond)
[14:29:58] <wikibugs>	 (03CR) 10Volans: [C: 03+1] "LGTM, thanks!" [cookbooks] - 10https://gerrit.wikimedia.org/r/692993 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[14:30:51] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] P::puppetdb::microservice: add pki to acl for puppetdb microservice [puppet] - 10https://gerrit.wikimedia.org/r/693158 (https://phabricator.wikimedia.org/T283185) (owner: 10Jbond)
[14:31:49] <wikibugs>	 (03CR) 10Volans: [C: 04-1] "Pending agreement/discussion on the related task." [cookbooks] - 10https://gerrit.wikimedia.org/r/692992 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[14:33:54] <wikibugs>	 (03CR) 10Elukey: "Importing version 1.9.x might take time, it requires golang 1.15 (so bullseye golang docker images that we don't have yet) plus https://gi" [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey)
[14:35:33] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10Dsharpe)
[14:35:42] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10Dsharpe)
[14:36:29] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10Dsharpe)
[14:37:35] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 03+2] Nova vendordata.txt: delete systemd-coredump user [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:37:52] <wikibugs>	 (03CR) 10Jbond: (WIP) create a logout.d profile for managing logout scripts (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693149 (owner: 10Jbond)
[14:37:54] <wikibugs>	 (03CR) 10Andrew Bogott: "Should this be coupled with chrony?" [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:38:25] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 04-1] "Not in use anymore, we should disable/comment the options though!" [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron)
[14:38:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1118', diff saved to https://phabricator.wikimedia.org/P16128 and previous config saved to /var/cache/conftool/dbconfig/20210520-143825-marostegui.json
[14:38:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:39:15] <wikibugs>	 (03CR) 10Jbond: "> Patch Set 1: -Code-Review" [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:39:42] <wikibugs>	 (03PS1) 10Effie Mouzeli: Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056)
[14:40:13] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10Dsharpe)
[14:40:30] <wikibugs>	 10SRE, 10SRE-Access-Requests: Allow JStephenson to access Superset - https://phabricator.wikimedia.org/T282515 (10ayounsi)
[14:40:42] <wikibugs>	 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to ldap/wmf for JStephenson1980 - https://phabricator.wikimedia.org/T282521 (10ayounsi) 05Open→03Resolved Other two deleted from LDAP.
[14:41:43] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 03+1] wmcs: add cloudvirt drain cookbook [cookbooks] - 10https://gerrit.wikimedia.org/r/683370 (https://phabricator.wikimedia.org/T280641) (owner: 10David Caro)
[14:48:33] <wikibugs>	 (03CR) 10Muehlenhoff: Nova vendordata.txt: delete systemd-coredump user (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:52:51] <wikibugs>	 (03CR) 10Jbond: Nova vendordata.txt: delete systemd-coredump user (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:54:12] <wikibugs>	 (03CR) 10Muehlenhoff: Skip Cumin/Homer/Spicerack on cumin2001 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: 10Muehlenhoff)
[14:57:36] <wikibugs>	 (03CR) 10Muehlenhoff: Nova vendordata.txt: delete systemd-coredump user (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693167 (https://phabricator.wikimedia.org/T280801) (owner: 10Jbond)
[14:57:58] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 03+1] Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056) (owner: 10Effie Mouzeli)
[15:00:14] <wikibugs>	 (03CR) 10JMeybohm: [C: 04-1] "We tried to migrate everything to using nobody instead of special users, but I see this might not make sense here as you need a dedicated " (0310 comments) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey)
[15:05:15] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: docker::baseimages: drop alpine support [puppet] - 10https://gerrit.wikimedia.org/r/693174
[15:05:17] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: docker::baseimages: add script to build debian-slim [puppet] - 10https://gerrit.wikimedia.org/r/693175 (https://phabricator.wikimedia.org/T281596)
[15:06:04] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] docker::baseimages: add script to build debian-slim [puppet] - 10https://gerrit.wikimedia.org/r/693175 (https://phabricator.wikimedia.org/T281596) (owner: 10Giuseppe Lavagetto)
[15:06:38] <wikibugs>	 (03CR) 10Effie Mouzeli: [C: 03+2] Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056) (owner: 10Effie Mouzeli)
[15:08:55] <wikibugs>	 10SRE, 10DBA: wmf-auto-reinstall fails on hosts that run pt-heartbeat - https://phabricator.wikimedia.org/T252528 (10Kormat) 05Open→03Resolved a:03Kormat This is now fixed. Puppet will no longer start/stop heartbeat. That is managed by `db-switchover` when changing masters. This does mean that `pt-heartb...
[15:11:04] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops, 10cloud-services-team (Hardware): cloudvirt1038: PCIe error - https://phabricator.wikimedia.org/T276922 (10Cmjohnson) Received a new PCI card and the error returned immediately.  I disabled the PCI-E slot 1 and the server boots fine. I do not see any need for that riser in the...
[15:11:23] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29627/console" [puppet] - 10https://gerrit.wikimedia.org/r/693174 (owner: 10Giuseppe Lavagetto)
[15:12:08] <wikibugs>	 (03CR) 10Effie Mouzeli: [V: 03+2 C: 03+2] Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056) (owner: 10Effie Mouzeli)
[15:13:30] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10RobH)
[15:13:34] <wikibugs>	 (03CR) 10BryanDavis: [C: 03+1] "a copy of stashbot is now running in #wikimedia-operations@libera.chat" [puppet] - 10https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: 10Filippo Giunchedi)
[15:13:47] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10RobH)
[15:15:12] <wikibugs>	 (03CR) 10Alexandros Kosiaris: [C: 04-1] "minor comment inline, but otherwise yes, let's drop alpine" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693174 (owner: 10Giuseppe Lavagetto)
[15:15:27] <wikibugs>	 (03PS1) 10Muehlenhoff: Add library hints for graphviz [puppet] - 10https://gerrit.wikimedia.org/r/693178
[15:17:03] <wikibugs>	 10SRE, 10ops-eqiad: Degraded RAID on ms-be1053 - https://phabricator.wikimedia.org/T282839 (10Cmjohnson) Thank you, I will get a ticket in with HPE ASAP
[15:18:29] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 03+2] Add library hints for graphviz [puppet] - 10https://gerrit.wikimedia.org/r/693178 (owner: 10Muehlenhoff)
[15:20:06] <wikibugs>	 (03CR) 10Volans: Skip Cumin/Homer/Spicerack on cumin2001 (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693130 (https://phabricator.wikimedia.org/T276589) (owner: 10Muehlenhoff)
[15:21:22] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] START helmfile.d/admin 'apply'.
[15:21:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:21:25] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
[15:21:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:21:36] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] START helmfile.d/admin 'apply'.
[15:21:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:21:39] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
[15:21:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:23:15] <moritzm>	 !log installing graphviz security updates on buster
[15:23:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:23:29] <wikibugs>	 10SRE, 10wikimedia-irc-freenode: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (10Bugreporter)
[15:24:03] <Majavah>	 moritzm: stashbot is now on libera
[15:24:32] <wikibugs>	 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10faidon) p:05Triage→03High Given a) this was linked during budgeting in the context of our cross-DC...
[15:30:28] <bd808>	 jouncebot: now
[15:30:28] <jouncebot>	 No deployments scheduled for the next 0 hour(s) and 29 minute(s)
[15:31:05] <bd808>	 jouncebot is now on libera.chat as well (and also here, but separate instances)
[15:31:48] <ryankemper>	 !log [cloudelastic] `ryankemper@cloudelastic1003:~$ sudo systemctl restart *search*` to clear `Check systemd state` alert on `cloudelastic1003`
[15:31:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:33:06] <moritzm>	 Majavah: ah, missed that. Thanks!
[15:33:28] <wikibugs>	 10SRE, 10Data-Persistence-Backup, 10netops: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (10jcrespo) One of the things I raised to my manager is that this limitation means that, in the event of a c...
[15:33:53] <_joe_>	 ryankemper: you should move to the other network, read ops@ :)
[15:34:07] <apergos>	 time to update the topic in here yet?
[15:34:12] <ryankemper>	 _joe_: ack, catching up on the email and stuff now :)
[15:34:16] <_joe_>	 apergos: in a few :)
[15:34:19] <apergos>	 :-)
[15:34:25] <_joe_>	 ryankemper: yeah I figured, it's early morning for you
[15:35:32] <ryankemper>	 thanks for the heads up
[15:43:28] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
[15:43:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:44:31] <logmsgbot>	 !log jiji@deploy1002 helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
[15:44:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:44:53] <logmsgbot>	 !log jiji@deploy1002 helmfile [codfw] START helmfile.d/admin 'apply'.
[15:44:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:07] <logmsgbot>	 !log jiji@deploy1002 helmfile [codfw] DONE helmfile.d/admin 'apply'.
[15:46:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:23] <logmsgbot>	 !log jiji@deploy1002 helmfile [eqiad] START helmfile.d/admin 'apply'.
[15:46:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:25] <wikibugs>	 (03CR) 10BBlack: [C: 03+1] Add zone for wikimedia-dns.org (Wikidough) [dns] - 10https://gerrit.wikimedia.org/r/692625 (https://phabricator.wikimedia.org/T252132) (owner: 10Ssingh)
[15:47:00] <wikibugs>	 (03CR) 10Giuseppe Lavagetto: [C: 03+2] icinga: move logmsgbot to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693138 (https://phabricator.wikimedia.org/T283213) (owner: 10Filippo Giunchedi)
[15:47:15] <wikibugs>	 (03PS2) 10Herron: librenms: remove librenms-wmf irc config [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213)
[15:47:31] <wikibugs>	 (03CR) 10Ssingh: [C: 03+2] Add zone for wikimedia-dns.org (Wikidough) [dns] - 10https://gerrit.wikimedia.org/r/692625 (https://phabricator.wikimedia.org/T252132) (owner: 10Ssingh)
[15:48:01] <logmsgbot>	 !log jiji@deploy1002 helmfile [eqiad] DONE helmfile.d/admin 'apply'.
[15:48:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:52:24] <wikibugs>	 (03Abandoned) 10Ssingh: aptrepo: add a component for knot-dnsutils [puppet] - 10https://gerrit.wikimedia.org/r/685571 (https://phabricator.wikimedia.org/T252132) (owner: 10Ssingh)
[15:53:05] <wikibugs>	 (03PS2) 10Ssingh: bird: add Wikidough's /24 to vips_filter (accept) [puppet] - 10https://gerrit.wikimedia.org/r/692367 (https://phabricator.wikimedia.org/T283027)
[15:54:06] <wikibugs>	 (03CR) 10Ssingh: "This is ready for review." [puppet] - 10https://gerrit.wikimedia.org/r/692367 (https://phabricator.wikimedia.org/T283027) (owner: 10Ssingh)
[15:56:35] <wikibugs>	 (03PS3) 10Ssingh: WIP: wikidough: update role to work towards anycast support [puppet] - 10https://gerrit.wikimedia.org/r/692368 (https://phabricator.wikimedia.org/T283027)
[15:56:54] <wikibugs>	 (03PS4) 10Ssingh: wikidough: update role to work towards anycast support [puppet] - 10https://gerrit.wikimedia.org/r/692368 (https://phabricator.wikimedia.org/T283027)
[15:58:49] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29628/console" [puppet] - 10https://gerrit.wikimedia.org/r/692368 (https://phabricator.wikimedia.org/T283027) (owner: 10Ssingh)
[15:59:28] <wikibugs>	 (03PS2) 10Ppchelko: DNM: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[15:59:31] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "Ready for review as well." [puppet] - 10https://gerrit.wikimedia.org/r/692368 (https://phabricator.wikimedia.org/T283027) (owner: 10Ssingh)
[16:00:04] <jouncebot>	 jbond42 and cdanis: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Puppet request window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1600).
[16:01:02] <Lucas_WMDE>	 :o two IRC networks means we get TWO jouncebot jokes per deployment window! <3
[16:06:58] <_joe_>	 Lucas_WMDE: not for long
[16:07:08] <Lucas_WMDE>	 aww
[16:08:16] <wikibugs>	 (03CR) 10Bstorm: "> Patch Set 1:" [puppet] - 10https://gerrit.wikimedia.org/r/691154 (owner: 10Arturo Borrero Gonzalez)
[16:08:58] <wikibugs>	 (03CR) 10Filippo Giunchedi: "LGTM, see inline" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron)
[16:11:07] <wikibugs>	 (03PS2) 10Effie Mouzeli: Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056)
[16:11:37] <wikibugs>	 (03CR) 10Effie Mouzeli: [V: 03+2 C: 03+2] Add kubernetes mwdebug user [labs/private] - 10https://gerrit.wikimedia.org/r/693169 (https://phabricator.wikimedia.org/T283056) (owner: 10Effie Mouzeli)
[16:12:09] <wikibugs>	 (03PS3) 10Herron: librenms: remove librenms-wmf irc config [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213)
[16:13:01] <wikibugs>	 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review: Create a mwdebug deployment for mediawiki on kubernetes - https://phabricator.wikimedia.org/T283056 (10jijiki)
[16:13:02] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 03+1] "Thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron)
[16:13:39] <wikibugs>	 (03CR) 10Herron: [C: 03+2] librenms: remove librenms-wmf irc config [puppet] - 10https://gerrit.wikimedia.org/r/693164 (https://phabricator.wikimedia.org/T283213) (owner: 10Herron)
[16:16:26] <wikibugs>	 10SRE, 10netops: Lumen 10G Wave (cr2-eqiad to cr2-esams) Down - https://phabricator.wikimedia.org/T283227 (10cmooney) Came back up approx 40 mins ago: ` May 20 15:37:52  re0.cr2-eqiad mib2d[13184]: SNMP_TRAP_LINK_UP: ifIndex 660, ifAdminStatus up(1), ifOperStatus up(1), ifName xe-4/1/3 ` ` cmooney@re0.cr2-eqia...
[16:18:37] <wikibugs>	 (03PS2) 10Razzi: site: configure dbstore1006 as insetup [puppet] - 10https://gerrit.wikimedia.org/r/693046 (https://phabricator.wikimedia.org/T283125)
[16:22:18] <wikibugs>	 (03CR) 10Marostegui: [C: 03+1] "+1 (I haven't checked if the MAC is correct though)" [puppet] - 10https://gerrit.wikimedia.org/r/693046 (https://phabricator.wikimedia.org/T283125) (owner: 10Razzi)
[16:23:42] <wikibugs>	 (03PS1) 10Herron: librenms: remove librenms-ircbot service [puppet] - 10https://gerrit.wikimedia.org/r/693182
[16:29:05] <wikibugs>	 (03CR) 10Filippo Giunchedi: [C: 03+1] "LGTM, though the unit will need to be cleaned up manually on netmon1002" [puppet] - 10https://gerrit.wikimedia.org/r/693182 (owner: 10Herron)
[16:29:23] <wikibugs>	 (03CR) 10Herron: [C: 03+2] librenms: remove librenms-ircbot service [puppet] - 10https://gerrit.wikimedia.org/r/693182 (owner: 10Herron)
[16:54:08] <wikibugs>	 (03PS20) 10Elukey: Add istio base images build support [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192)
[16:54:10] <wikibugs>	 (03PS2) 10Elukey: Add knative serving and net-istio images [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692899 (https://phabricator.wikimedia.org/T278194)
[16:54:46] <wikibugs>	 (03CR) 10Elukey: "Followed also what was suggested about the istio-proxy user!" (0310 comments) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey)
[17:00:04] <jouncebot>	 chrisalbon and accraze: (Dis)respected human, time to deploy Services – Graphoid / ORES (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1700). Please do the needful.
[17:02:04] <wikibugs>	 10SRE, 10ops-eqiad, 10decommission-hardware: decommission labsdb1011.eqiad.wmnet - https://phabricator.wikimedia.org/T282524 (10Cmjohnson)
[17:02:16] <wikibugs>	 10SRE, 10Analytics, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10elukey) Had a chat with Jeff on Slack (together with @JAllemandou), and the account `JMixter (WMF)` seems not accessible on meta (maybe the password reset +...
[17:03:23] <wikibugs>	 10SRE, 10ops-eqiad, 10decommission-hardware: decommission labsdb1011.eqiad.wmnet - https://phabricator.wikimedia.org/T282524 (10Cmjohnson) 05Open→03Resolved a:05wiki_willy→03Cmjohnson
[17:03:56] <wikibugs>	 10SRE, 10ops-eqiad, 10Data-Services, 10decommission-hardware: decommission labsdb1010.eqiad.wmnet - https://phabricator.wikimedia.org/T282523 (10Cmjohnson) 05Open→03Resolved All Decom tasks are complete
[17:04:11] <wikibugs>	 10SRE, 10DBA, 10Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (10Cmjohnson)
[17:04:15] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops, 10decommission-hardware: decommission db1074.eqiad.wmnet - https://phabricator.wikimedia.org/T281959 (10Cmjohnson) 05Open→03Resolved All decom tasks are complete.
[17:04:35] <wikibugs>	 10SRE, 10DBA, 10Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (10Cmjohnson)
[17:04:59] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops, 10decommission-hardware: decommission db1079.eqiad.wmnet - https://phabricator.wikimedia.org/T282079 (10Cmjohnson) 05Open→03Resolved All decom tasks are complete.
[17:06:44] <wikibugs>	 10SRE, 10DBA, 10Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (10Cmjohnson)
[17:06:51] <wikibugs>	 (03CR) 10Hnowlan: [V: 03+2 C: 03+2] ratelimiter: update to new upstream version [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/692941 (https://phabricator.wikimedia.org/T246278) (owner: 10Ppchelko)
[17:07:09] <wikibugs>	 10SRE, 10ops-eqiad, 10decommission-hardware: decommission db1085.eqiad.wmnet - https://phabricator.wikimedia.org/T282096 (10Cmjohnson) 05Open→03Resolved All decom tasks are complete.
[17:07:53] <wikibugs>	 10SRE, 10DBA, 10Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (10Cmjohnson)
[17:08:16] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops, 10decommission-hardware: decommission db1083.eqiad.wmnet - https://phabricator.wikimedia.org/T281445 (10Cmjohnson) 05Open→03Resolved All decom tasks are complete
[17:08:26] <wikibugs>	 10SRE, 10ops-eqiad, 10decommission-hardware: decommission labsdb1009.eqiad.wmnet - https://phabricator.wikimedia.org/T282522 (10Cmjohnson) 05Open→03Resolved All decom tasks are completed
[17:18:06] <wikibugs>	 10SRE, 10Analytics-Radar, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10odimitrijevic)
[17:26:09] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: (Need By: TBD) rack/setup/install moss-be100[12] - https://phabricator.wikimedia.org/T276637 (10Cmjohnson)
[17:32:30] <wikibugs>	 (03PS5) 10Cwhite: logstash: update ES template to patch 2 [puppet] - 10https://gerrit.wikimedia.org/r/690538
[17:37:07] <wikibugs>	 (03CR) 10Cwhite: [C: 03+2] logstash: update ES template to patch 2 [puppet] - 10https://gerrit.wikimedia.org/r/690538 (owner: 10Cwhite)
[17:38:00] <wikibugs>	 (03PS3) 10Razzi: site: configure dbstore1006 as insetup [puppet] - 10https://gerrit.wikimedia.org/r/693046 (https://phabricator.wikimedia.org/T283125)
[17:40:31] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: (Need By: TBD) rack/setup/install moss-be100[12] - https://phabricator.wikimedia.org/T276637 (10Cmjohnson)
[17:41:29] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: (Need By: TBD) rack/setup/install moss-be100[12] - https://phabricator.wikimedia.org/T276637 (10Cmjohnson) a:05Jclark-ctr→03RobH @robh if you have time to do the installs that would be great, assign back to me if you're busy.
[17:47:33] <wikibugs>	 (03CR) 10Razzi: [C: 03+2] site: configure dbstore1006 as insetup [puppet] - 10https://gerrit.wikimedia.org/r/693046 (https://phabricator.wikimedia.org/T283125) (owner: 10Razzi)
[17:53:05] <wikibugs>	 (03PS1) 10Kosta Harlan: Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.6) - 10https://gerrit.wikimedia.org/r/693040 (https://phabricator.wikimedia.org/T282826)
[17:53:36] <wikibugs>	 (03PS1) 10Kosta Harlan: Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.5) - 10https://gerrit.wikimedia.org/r/693041 (https://phabricator.wikimedia.org/T282826)
[17:54:01] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: (Need By: TBD) rack/setup/install phab1004 (was: phab1002) - https://phabricator.wikimedia.org/T280540 (10Cmjohnson) a:05Cmjohnson→03RobH @robh same thing, this server is ready for install if you have the time.
[17:54:07] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: (Need By: TBD) rack/setup/install phab1004 (was: phab1002) - https://phabricator.wikimedia.org/T280540 (10Cmjohnson)
[18:00:04] <jouncebot>	 RoanKattouw, Niharika, and Urbanecm: Dear deployers, time to do the Morning backport window deploy. Don't look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1800).
[18:00:04] <jouncebot>	 kostajh: A patch you scheduled for Morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[18:01:07] <wikibugs>	 (03CR) 10Urbanecm: [C: 03+2] Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.6) - 10https://gerrit.wikimedia.org/r/693040 (https://phabricator.wikimedia.org/T282826) (owner: 10Kosta Harlan)
[18:01:09] <wikibugs>	 (03CR) 10Urbanecm: [C: 03+2] Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.5) - 10https://gerrit.wikimedia.org/r/693041 (https://phabricator.wikimedia.org/T282826) (owner: 10Kosta Harlan)
[18:09:17] <wikibugs>	 (03PS1) 10Cwhite: logstash: allocate ecs shards to hdd nodes after one month [puppet] - 10https://gerrit.wikimedia.org/r/693198
[18:10:04] <wikibugs>	 (03PS1) 10Cwhite: logstash: allocate w3creportingapi shards older than 1 month to hdd nodes [puppet] - 10https://gerrit.wikimedia.org/r/693199
[18:15:19] <wikibugs>	 (03PS1) 10Hnowlan: New upstream envoy-future version 1.18.3 [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693200
[18:17:30] <wikibugs>	 (03CR) 10Clarakosi: [C: 03+1] New upstream envoy-future version 1.18.3 [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693200 (owner: 10Hnowlan)
[18:17:36] <wikibugs>	 (03PS1) 10Hnowlan: api-gateway: use envoy-future 1.18.3 in staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/693201
[18:19:44] <wikibugs>	 (03CR) 10Hnowlan: [V: 03+2 C: 03+2] New upstream envoy-future version 1.18.3 [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/693200 (owner: 10Hnowlan)
[18:20:14] <wikibugs>	 (03CR) 10Clarakosi: [C: 03+1] api-gateway: use envoy-future 1.18.3 in staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/693201 (owner: 10Hnowlan)
[18:21:04] <wikibugs>	 (03CR) 10Hnowlan: [C: 03+2] api-gateway: use envoy-future 1.18.3 in staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/693201 (owner: 10Hnowlan)
[18:23:59] <wikibugs>	 (03Merged) 10jenkins-bot: api-gateway: use envoy-future 1.18.3 in staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/693201 (owner: 10Hnowlan)
[18:24:51] <wikibugs>	 (03Merged) 10jenkins-bot: Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.6) - 10https://gerrit.wikimedia.org/r/693040 (https://phabricator.wikimedia.org/T282826) (owner: 10Kosta Harlan)
[18:25:06] <wikibugs>	 (03Merged) 10jenkins-bot: Check if task is link-recommendation type before showing onboarding [extensions/GrowthExperiments] (wmf/1.37.0-wmf.5) - 10https://gerrit.wikimedia.org/r/693041 (https://phabricator.wikimedia.org/T282826) (owner: 10Kosta Harlan)
[18:38:17] <wikibugs>	 (03PS4) 10Clarakosi: api-gateway: Implement new ratelimit configurations from envoy 1.16 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692404 (https://phabricator.wikimedia.org/T260591)
[18:42:12] <wikibugs>	 (03PS2) 10Clarakosi: api-gateway: Add default_value to dynamic_metadata if JWT is not set [deployment-charts] - 10https://gerrit.wikimedia.org/r/692714 (https://phabricator.wikimedia.org/T261350)
[18:43:23] <wikibugs>	 (03PS1) 10Ryan Kemper: cloudelastic: bump inactive shard alert threshold [puppet] - 10https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269)
[18:44:14] <wikibugs>	 (03CR) 10Gehel: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269) (owner: 10Ryan Kemper)
[18:45:14] <wikibugs>	 (03PS2) 10Ryan Kemper: cloudelastic: bump inactive shard alert threshold [puppet] - 10https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269)
[18:45:32] <wikibugs>	 (03PS3) 10Ryan Kemper: cloudelastic: bump inactive shard alert threshold [puppet] - 10https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269)
[18:45:59] <wikibugs>	 (03PS2) 10Clarakosi: api-gateway: Replace echoapi with http-https-echo [deployment-charts] - 10https://gerrit.wikimedia.org/r/693008 (https://phabricator.wikimedia.org/T261367)
[18:46:07] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] cloudelastic: bump inactive shard alert threshold [puppet] - 10https://gerrit.wikimedia.org/r/693204 (https://phabricator.wikimedia.org/T283269) (owner: 10Ryan Kemper)
[18:47:20] <wikibugs>	 (03PS1) 10Ebernhardson: mjolnir bulk daemon: Add topic for hourly updates [puppet] - 10https://gerrit.wikimedia.org/r/693205 (https://phabricator.wikimedia.org/T261407)
[18:55:22] <wikibugs>	 (03Abandoned) 10Clarakosi: DNM: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[18:55:44] <wikibugs>	 (03CR) 10Herron: [C: 03+2] sre.hosts.decommission: clarify "wipe bootloader" step [cookbooks] - 10https://gerrit.wikimedia.org/r/692993 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[18:55:47] <wikibugs>	 (03PS2) 10Clarakosi: Use envoy 1.16 nested json feature for access logging [deployment-charts] - 10https://gerrit.wikimedia.org/r/692709 (https://phabricator.wikimedia.org/T260820) (owner: 10Ppchelko)
[18:57:04] <wikibugs>	 10SRE, 10Analytics-Radar, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10Aklapper) On-wiki SUL account on meta, mentioned by elukey: https://meta.wikimedia.org/wiki/Special:Log?page=User:JMixter_(WMF) looks correct to me. W...
[18:57:35] <wikibugs>	 (03PS2) 10Cwhite: logstash: allocate ecs shards to hdd nodes after one month [puppet] - 10https://gerrit.wikimedia.org/r/693198
[18:58:27] <wikibugs>	 (03PS2) 10Cwhite: logstash: allocate w3creportingapi shards older than 1 month to hdd nodes [puppet] - 10https://gerrit.wikimedia.org/r/693199
[18:59:10] <wikibugs>	 (03CR) 10Herron: "> Patch Set 1: Code-Review-1" [cookbooks] - 10https://gerrit.wikimedia.org/r/692992 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[18:59:41] <wikibugs>	 (03Merged) 10jenkins-bot: sre.hosts.decommission: clarify "wipe bootloader" step [cookbooks] - 10https://gerrit.wikimedia.org/r/692993 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[19:00:04] <jouncebot>	 hashar and dancy: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for MediaWiki train - European+American Version (secondary timeslot). (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T1900).
[19:03:07] <wikibugs>	 (03PS1) 10Hashar: all wikis to 1.37.0-wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693207
[19:03:09] <wikibugs>	 (03CR) 10Hashar: [C: 03+2] all wikis to 1.37.0-wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693207 (owner: 10Hashar)
[19:04:16] <wikibugs>	 (03Merged) 10jenkins-bot: all wikis to 1.37.0-wmf.6 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693207 (owner: 10Hashar)
[19:07:52] <wikibugs>	 (03PS1) 10Tks4Fish: ptwiki: Add 'flow-delete' to 'eliminator' user group [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693208 (https://phabricator.wikimedia.org/T283266)
[19:08:10] <wikibugs>	 (03CR) 10Ppchelko: "yeah, but I've added my stuff that is needed for 1.18.3 into this commit" [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:09:32] <wikibugs>	 (03CR) 10Urbanecm: [C: 04-2] "not now, see task. pending community confirmation." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/693208 (https://phabricator.wikimedia.org/T283266) (owner: 10Tks4Fish)
[19:09:34] <wikibugs>	 (03CR) 10Clarakosi: "> Patch Set 2:" [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:09:36] <wikibugs>	 (03Restored) 10Clarakosi: DNM: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:23:54] <wikibugs>	 10SRE, 10Analytics-Radar, 10LDAP-Access-Requests, 10SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (10jmixter) I was able to create the Developer Account following the instructions. I guess I am confused about all of the various accounts I needed to se...
[19:27:00] <wikibugs>	 (03PS3) 10Ppchelko: DNM: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:27:33] <wikibugs>	 (03PS4) 10Ppchelko: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:31:10] <wikibugs>	 (03PS1) 10Ssingh: acme_chief: add certificates for wikimedia-dns.org [puppet] - 10https://gerrit.wikimedia.org/r/693210 (https://phabricator.wikimedia.org/T252132)
[19:31:59] <wikibugs>	 (CR) Clarakosi: Changes to use envoyproxy's image of envoy 1.18.3 (2 comments) [deployment-charts] - https://gerrit.wikimedia.org/r/692695 (owner: Clarakosi)
[19:33:32] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29629/console" [puppet] - 10https://gerrit.wikimedia.org/r/693210 (https://phabricator.wikimedia.org/T252132) (owner: 10Ssingh)
[19:35:09] <hashar>	 Lucas_WMDE: hi could use a hand if you are still around :]
[19:35:22] <hashar>	 it is about https://phabricator.wikimedia.org/T283240  which is a follow up to a train blocker
[19:35:37] <hashar>	 but it does not seem to be a blocker, just wanted to confirm it is indeed just a followup action for later :]
[19:35:52] <wikibugs>	 (CR) Ppchelko: Changes to use envoyproxy's image of envoy 1.18.3 (2 comments) [deployment-charts] - https://gerrit.wikimedia.org/r/692695 (owner: Clarakosi)
[19:36:48] <wikibugs>	 (03CR) 10Clarakosi: [C: 03+1] Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:38:57] <wikibugs>	 (03CR) 10Ppchelko: [C: 03+2] Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:41:02] <wikibugs>	 (03Merged) 10jenkins-bot: Changes to use envoyproxy's image of envoy 1.18.3 [deployment-charts] - 10https://gerrit.wikimedia.org/r/692695 (owner: 10Clarakosi)
[19:42:29] <wikibugs>	 10SRE, 10Traffic, 10Patch-For-Review: Offer Wikidough as an anycasted service - https://phabricator.wikimedia.org/T283027 (10ssingh)
[19:45:18] <hashar>	 Lucas_WMDE: I just assumed it is a follow up and commented on the train blocker task ;] don't worry!
[19:45:19] <wikibugs>	 (03PS1) 10Herron: remove rescue boot dhcp entry for mwlog1001 [puppet] - 10https://gerrit.wikimedia.org/r/693211
[19:46:23] <wikibugs>	 (03CR) 10Herron: [C: 03+2] remove rescue boot dhcp entry for mwlog1001 [puppet] - 10https://gerrit.wikimedia.org/r/693211 (owner: 10Herron)
[19:49:13] <Lucas_WMDE>	 hashar: yes that’s just a followup, shouldn’t block anything
[19:49:27] <Lucas_WMDE>	 wasn’t sure how to attach it to which other tasks ^^
[19:49:42] <wikibugs>	 (03PS10) 10DCausse: rdf-streaming-updater: use session mode [deployment-charts] - 10https://gerrit.wikimedia.org/r/681497 (https://phabricator.wikimedia.org/T280166) (owner: 10Mstyles)
[19:51:07] <wikibugs>	 (03Abandoned) 10Herron: sre.hosts.decommssion: use dd to zero the bootloader [cookbooks] - 10https://gerrit.wikimedia.org/r/692992 (https://phabricator.wikimedia.org/T283204) (owner: 10Herron)
[19:51:31] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] rdf-streaming-updater: use session mode [deployment-charts] - 10https://gerrit.wikimedia.org/r/681497 (https://phabricator.wikimedia.org/T280166) (owner: 10Mstyles)
[19:52:42] <hashar>	 Lucas_WMDE: it is fine :]
[19:52:53] <hashar>	 Lucas_WMDE: thank you for confirming it at this late hour of the day!!!
[19:53:17] <Lucas_WMDE>	 I’m just leaving my laptop running because the new IRC channels don’t have logging yet and I don’t want to miss stuff :'D
[19:53:25] <Lucas_WMDE>	 hope the rest of the train goes well!
[19:54:40] <wikibugs>	 10SRE, 10decommission-hardware, 10observability, 10Patch-For-Review: decommission mwlog1001 - https://phabricator.wikimedia.org/T282575 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by herron@cumin1001 for hosts: `mwlog1001.eqiad.wmnet` - mwlog1001.eqiad.wmnet (**FAIL**)   - **Failed dow...
[19:56:08] <hashar>	 Lucas_WMDE: yeah it is all fine I am marking it solved right now! \o/
[20:13:27] <Lucas_WMDE>	 \o/
[20:27:02] <wikibugs>	 (03PS5) 10Cwhite: logstash: replace ECS allow list with filter_on_template [puppet] - 10https://gerrit.wikimedia.org/r/674718 (https://phabricator.wikimedia.org/T234565)
[20:43:48] <wikibugs>	 (03PS1) 10Cwhite: logstash: bugfix filter to exclude hdd-allocated indexes [puppet] - 10https://gerrit.wikimedia.org/r/693213
[21:02:38] <wikibugs>	 10ops-eqiad, 10DC-Ops, 10Dumps-Generation: (Need By: TBD) rack/setup/install dumpsdata100[45] - https://phabricator.wikimedia.org/T283290 (10RobH)
[21:03:08] <wikibugs>	 10ops-eqiad, 10DC-Ops, 10Dumps-Generation: (Need By: TBD) rack/setup/install dumpsdata100[45] - https://phabricator.wikimedia.org/T283290 (10RobH)
[21:07:50] <wikibugs>	 10ops-eqiad, 10DC-Ops, 10Dumps-Generation: (Need By: TBD) rack/setup/install dumpsdata100[45] - https://phabricator.wikimedia.org/T283290 (10RobH)
[21:07:55] <wikibugs>	 10ops-eqiad, 10DC-Ops, 10Dumps-Generation: (Need By: TBD) rack/setup/install dumpsdata100[45] - https://phabricator.wikimedia.org/T283290 (10RobH) Please note the original ask for networking was:  **Networking/Subnet/VLAN/IP:** Internal vlan, 10G for one host and 1G (for now) for the other. If the 10G connec...
[21:09:05] <wikibugs>	 (03PS1) 10Ottomata: Initial commit [debs/airflow] - 10https://gerrit.wikimedia.org/r/693216
[21:11:29] <wikibugs>	 (03CR) 10Ottomata: [V: 03+2 C: 03+2] Initial commit [debs/airflow] - 10https://gerrit.wikimedia.org/r/693216 (owner: 10Ottomata)
[21:15:07] <wikibugs>	 (03PS1) 10Ottomata: Add .gitreview [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693221
[21:15:50] <wikibugs>	 (03PS2) 10Ottomata: Add .gitreview [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693221
[21:16:29] <wikibugs>	 (03CR) 10Ottomata: [V: 03+2 C: 03+2] Add .gitreview [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693221 (owner: 10Ottomata)
[21:17:30] <wikibugs>	 (03PS1) 10Ottomata: Initial debianization and 2.0.2-1~py3.7 release [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693222 (https://phabricator.wikimedia.org/T277012)
[21:25:31] <wikibugs>	 (03PS2) 10Krinkle: trafficserver: Remove X-Request-Id from response headers unless debug [puppet] - 10https://gerrit.wikimedia.org/r/676682 (https://phabricator.wikimedia.org/T210484)
[21:27:21] <wikibugs>	 10SRE, 10Traffic, 10Performance-Team (Radar): Strip new X-Request-Id header from non-debug responses - https://phabricator.wikimedia.org/T283291 (10Krinkle)
[21:27:24] <wikibugs>	 (03PS3) 10Krinkle: trafficserver: Remove X-Request-Id from response headers unless debug [puppet] - 10https://gerrit.wikimedia.org/r/676682 (https://phabricator.wikimedia.org/T283291)
[21:28:31] <wikibugs>	 (03PS2) 10Ottomata: Initial debianization and 2.0.2-1~py3.7 release [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693222 (https://phabricator.wikimedia.org/T277012)
[21:42:05] <wikibugs>	 (03PS3) 10Ottomata: Initial debianization and 2.0.2-1~py3.7 release [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693222 (https://phabricator.wikimedia.org/T277012)
[21:43:36] <wikibugs>	 (03PS1) 10Zabe: [doc] switching from freenode to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693223
[21:46:13] <wikibugs>	 (03PS4) 10Ottomata: Initial debianization and 2.0.2-1~py3.7 release [debs/airflow] (debian) - 10https://gerrit.wikimedia.org/r/693222 (https://phabricator.wikimedia.org/T277012)
[22:11:26] <wikibugs>	 (03PS2) 10Zabe: [doc] switching from freenode to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247)
[22:12:58] <wikibugs>	 (CR) Legoktm: [C: -1] [doc] switching from freenode to libera.chat (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: Zabe)
[22:15:56] <wikibugs>	 (03PS3) 10Zabe: [doc] switching from freenode to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247)
[22:16:09] <wikibugs>	 (CR) Zabe: [doc] switching from freenode to libera.chat (1 comment) [puppet] - https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: Zabe)
[22:28:37] <wikibugs>	 (03PS1) 10Razzi: netboot: Change dbstore1006 netboot.cfg to partman/custom/db.cfg [puppet] - 10https://gerrit.wikimedia.org/r/693224 (https://phabricator.wikimedia.org/T283125)
[22:30:48] <wikibugs>	 (03PS2) 10Razzi: netboot: Change dbstore1006 netboot.cfg to partman/custom/db.cfg [puppet] - 10https://gerrit.wikimedia.org/r/693224 (https://phabricator.wikimedia.org/T283125)
[22:35:43] <wikibugs>	 (03CR) 10Razzi: [C: 03+2] netboot: Change dbstore1006 netboot.cfg to partman/custom/db.cfg [puppet] - 10https://gerrit.wikimedia.org/r/693224 (https://phabricator.wikimedia.org/T283125) (owner: 10Razzi)
[22:59:26] <wikibugs>	 (03CR) 10Krinkle: [C: 03+1] [doc] switching from freenode to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: 10Zabe)
[23:00:05] <jouncebot>	 brennen: #bothumor My software never has bugs. It just develops random features. Rise for US Backport and Config training. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210520T2300).
[23:01:06] <wikibugs>	 (03CR) 10Bstorm: [C: 03+2] "First sync is done. Now I'm going to try this until cut over https://puppet-compiler.wmflabs.org/compiler1001/29630/cloudstore1009.wikimed" [puppet] - 10https://gerrit.wikimedia.org/r/690783 (https://phabricator.wikimedia.org/T224747) (owner: 10Bstorm)
[23:07:23] <wikibugs>	 (03CR) 10Legoktm: [C: 03+2] "Thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: 10Zabe)
[23:17:57] <wikibugs>	 (03PS1) 10Razzi: site: add role for dbstore1006 [puppet] - 10https://gerrit.wikimedia.org/r/693230 (https://phabricator.wikimedia.org/T283125)
[23:21:28] <wikibugs>	 (03CR) 10Razzi: [C: 03+2] site: add role for dbstore1006 [puppet] - 10https://gerrit.wikimedia.org/r/693230 (https://phabricator.wikimedia.org/T283125) (owner: 10Razzi)
[23:23:21] <wikibugs>	 (03CR) 10Dzahn: [C: 03+2] [doc] switching from freenode to libera.chat [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: 10Zabe)
[23:24:11] <wikibugs>	 (03CR) 10Dzahn: "tried to merge without noticing it was already done :)" [puppet] - 10https://gerrit.wikimedia.org/r/693223 (https://phabricator.wikimedia.org/T283247) (owner: 10Zabe)
[23:33:54] <wikibugs>	 SRE, ops-codfw, DC-Ops, Discovery-Search (Current work): hw troubleshooting: ssh unreachable for wdqs2007.codfw.wmnet - https://phabricator.wikimedia.org/T281437 (RobH) Open→Resolved wdqs2007 raid fully rebuilt and system is online.  I set to staged in netbox, when its added back into ful...
[23:41:26] <wikibugs>	 (03PS1) 10Legoktm: codesearch: Use our own hound image [puppet] - 10https://gerrit.wikimedia.org/r/693233 (https://phabricator.wikimedia.org/T243380)
[23:42:12] <wikibugs>	 (03CR) 10Legoktm: [C: 03+2] codesearch: Use our own hound image [puppet] - 10https://gerrit.wikimedia.org/r/693233 (https://phabricator.wikimedia.org/T243380) (owner: 10Legoktm)