[00:10:52] serviceops, WMF-JobQueue, Platform Team Workboards (Clinic Duty Team), User-brennen, Wikimedia-production-error: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Krinkle) Still seen. Looks like we're consistently dro...
[09:26:46] serviceops, cloud-services-team (Kanban): Reprepro: Refresh the kubernetes repo key once they refresh upstream - https://phabricator.wikimedia.org/T279042 (dcaro)
[10:51:44] serviceops: Envoy (admin) logs are not rotated/expired - https://phabricator.wikimedia.org/T279049 (fgiunchedi)
[12:50:30] serviceops, WMF-JobQueue, Platform Team Workboards (Clinic Duty Team), User-brennen, Wikimedia-production-error: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Ottomata) Ok, so the pod restarts were not the problem...
[13:46:39] serviceops, User-jijiki: Remove parsoidJS leftovers from production - https://phabricator.wikimedia.org/T279059 (jijiki) p:Triage→Medium
[13:47:47] serviceops, User-jijiki: Remove parsoidJS leftovers from production - https://phabricator.wikimedia.org/T279059 (jijiki)
[13:47:51] serviceops, SRE, Parsoid (Tracking), Patch-For-Review: Upgrade Parsoid servers to buster - https://phabricator.wikimedia.org/T268524 (jijiki)
[14:09:14] serviceops, User-jijiki: Remove parsoidJS leftovers from production - https://phabricator.wikimedia.org/T279059 (jijiki)
[15:14:17] serviceops, Maps, Packaging, Patch-For-Review: Packaging PostGIS 3.1 for the new Maps stack - https://phabricator.wikimedia.org/T277064 (MoritzMuehlenhoff) @MSantos You can find a backport for buster at https://people.wikimedia.org/~jmm/postgis/ Can you run some tests whether that's what you nee...
[15:34:11] serviceops, Parsoid, Patch-For-Review, User-jijiki: Remove parsoidJS leftovers from production - https://phabricator.wikimedia.org/T279059 (jijiki)
[15:36:47] serviceops, Product-Infrastructure-Team-Backlog, SRE, Services, and 3 others: New Service Request geoshapes - https://phabricator.wikimedia.org/T274388 (Aklapper)
[18:34:27] serviceops, SRE: bring 26 new mediawiki appserver in codfw into production, rack A3 (mw2377 - mw2402) - https://phabricator.wikimedia.org/T278396 (Dzahn)
[20:41:32] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (Dzahn) Stalled→Open 4 new jobrunners have been created. This can now continue.
[20:50:00] serviceops, SRE, WMF-JobQueue, Sustainability (Incident Followup): Have some dedicated jobrunners that aren't active videoscalers - https://phabricator.wikimedia.org/T279100 (Legoktm)
[20:50:58] serviceops, SRE, observability, Sustainability (Incident Followup): Add alerting for Memcached timeout errors - https://phabricator.wikimedia.org/T278946 (Legoktm)
[20:53:48] serviceops, MediaWiki-Page-derived-data, Sustainability (Incident Followup): Add rate limiting to the jobqueue vidoscalers to prevent overloads - https://phabricator.wikimedia.org/T278945 (Legoktm)
[20:53:56] serviceops, TimedMediaHandler-Transcode, WMF-JobQueue, Sustainability (Incident Followup): Add rate limiting to the jobqueue vidoscalers to prevent overloads - https://phabricator.wikimedia.org/T278945 (Legoktm)
[20:53:58] serviceops, SRE, WMF-JobQueue, Sustainability (Incident Followup): Have some dedicated jobrunners that aren't active videoscalers - https://phabricator.wikimedia.org/T279100 (Pchelolo) On the change-prop side we already route all video scaling jobs to videoscaler.discovery.wmnet and all other job...
[21:58:13] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw2243.codfw.wmnet` - mw2243.codfw.wmnet (**PASS**) - Downtime...
[22:12:29] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw2246.codfw.wmnet` - mw2246.codfw.wmnet (**PASS**) - Downtime...
[22:33:14] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw2247.codfw.wmnet` - mw2247.codfw.wmnet (**FAIL**) - Downtime...
[22:35:17] serviceops, Parsoid (Tracking), Patch-For-Review, User-jijiki: Remove parsoidJS leftovers from production - https://phabricator.wikimedia.org/T279059 (Arlolra)
[22:53:23] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `mw2248.codfw.wmnet` - mw2248.codfw.wmnet (**PASS**) - Downtime...
[23:29:47] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (Dzahn) Open→Resolved
[23:30:37] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (Dzahn) Resolved→Open @Papaul These were old servers in rack A4. They are also ready to go now.
[23:31:11] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (Dzahn) p:High→Medium a:Dzahn→Papaul
[23:31:15] serviceops, SRE, ops-codfw, Patch-For-Review: decom 8 codfw appservers purchased on 2016-06-02 - https://phabricator.wikimedia.org/T277780 (Dzahn) @Papaul This is about these: https://netbox.wikimedia.org/dcim/devices/?q=mw2&mac_address=&has_primary_ip=&local_context_data=&virtual_chassis_mem...
[23:34:06] serviceops, SRE, Patch-For-Review: bring 26 new mediawiki appserver in codfw into production, rack A3 (mw2377 - mw2402) - https://phabricator.wikimedia.org/T278396 (Dzahn)