[11:31:20] serviceops, Operations, Patch-For-Review: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (ops-monitoring-bot) Icinga downtime for 1 day, 0:00:00 set by dzahn@cumin1001 on 1 host(s) and their services with reason: decom ` bromine.eqiad.wmnet `
[11:31:30] serviceops, Operations, Patch-For-Review: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (ops-monitoring-bot) Icinga downtime for 1 day, 0:00:00 set by dzahn@cumin1001 on 1 host(s) and their services with reason: decom ` vega.codfw.wmnet `
[11:51:01] serviceops, Operations, Patch-For-Review: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `vega.codfw.wmnet` - vega.codfw.wmnet (**PASS**) - Downtimed host on Icinga...
[11:52:13] serviceops, Operations, Patch-For-Review: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `bromine.eqiad.wmnet` - bromine.eqiad.wmnet (**PASS**) - Downtimed host on Ici...
[11:55:45] serviceops, Operations, Patch-For-Review: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (Dzahn) Open→Resolved Done! bromine and vega are gone and the services on it have been fully merged into miscweb1002/2002 which are on buster.
[11:55:56] serviceops, Operations: replace bromine and vega with buster VMs - https://phabricator.wikimedia.org/T247650 (Dzahn)
[12:14:07] serviceops, Operations, netops: Investigate D1 appservers<->memcache TKOs - https://phabricator.wikimedia.org/T251601 (ayounsi) p:Triage→High
[12:16:48] serviceops, Operations, netops: Investigate D1 appservers<->memcache TKOs - https://phabricator.wikimedia.org/T251601 (ayounsi)
[12:56:03] serviceops, Operations, netops: Investigate D1 appservers<->memcache TKOs - https://phabricator.wikimedia.org/T251601 (ayounsi)
[18:30:59] serviceops, Operations, ops-eqiad: mw1280 correctable memory errors logged in getsel - https://phabricator.wikimedia.org/T251077 (Cmjohnson) @wiki_willy This server is out of warranty, If the DIMM has already been replaced on this server than most likely the system board or CPU is failing. My recom...
[18:32:19] serviceops, DC-Ops, Operations, ops-eqiad: scb1001: Memory correctable errors -EDAC- - https://phabricator.wikimedia.org/T250482 (Cmjohnson) Open→Stalled I am stalling this task until it's ready for decom
[18:36:29] serviceops, Operations, ops-eqiad: mw1280 correctable memory errors logged in getsel - https://phabricator.wikimedia.org/T251077 (wiki_willy) @elukey - looks like we have another year left before the end of the 5yr life cycle mark. Let us know if have enough in production to able to decom this host,...
[18:39:41] serviceops, Operations, ops-eqiad: mw1280 correctable memory errors logged in getsel - https://phabricator.wikimedia.org/T251077 (Dzahn) This server has already had CPU and RAM replaced and crashed multiple times. Also see: T195734, T240187, T218006