[04:24:04] serviceops, WMF-JobQueue, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Krinkle) @Pchelolo I'm not sure. **Failure is OK**. I think in a pr...
[04:45:03] serviceops, WMF-JobQueue, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Pchelolo) Thanks for the detailed reply. I agree - something is stil...
[05:09:14] serviceops, WMF-JobQueue, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Joe) >>! In T249745#7040025, @Pchelolo wrote: > ok, a bit more time...
[10:47:25] serviceops, Phabricator, User-brennen: Phabricator intermittently slow; db connection failures to m3-master.eqiad.wmnet with "Temporary failure in name resolution" - https://phabricator.wikimedia.org/T279013 (LSobanski) Removing the #DBA tag and subscribing myself instead. Once there are specific act...
[12:16:02] serviceops, WMF-JobQueue, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745 (Ottomata) > Increase the number of pods we use for it (EventGate) F...
[12:30:58] hi all any objections to sending pybal logs to ELK: https://gerrit.wikimedia.org/r/c/operations/puppet/+/682566 ?
[13:15:36] <_joe_> no, but also not 100% sure it's that useful
[13:16:02] <_joe_> you should probably ask the traffic folks though
[13:19:54] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=987544
[13:42:16] <_joe_> moritzm: I admire jelmer's courage
[13:42:53] <_joe_> but I don't think it's feasible to build a "proper" debian package for envoy
[13:43:04] <_joe_> in any way that won't make you insane
[13:49:32] he forked bazaar to port it to Python 3, so there's definitely precedent in taking on seemingly infeasible projects :-)
[13:52:26] serviceops, SRE, Patch-For-Review: upgrade conf2* servers to stretch - https://phabricator.wikimedia.org/T271573 (JMeybohm) DNS SRV records, pybal's and confd instances in codfw, eqsin, ulsfo moved to the new cluster. navtiming.service on webperf needed a restart as well.
[14:02:57] _joe_: thanks joe will check with traffic, the reason i thought it would be useful is to just have a central place to check if pybal is depooling services without knowing which lvs server the service is on. perhaps there is already a better way to check this?
[14:03:15] <_joe_> I'm not convinced that helps
[14:03:46] during the last incident you asked me to "check if pybal is depooling services" that's where this came from
[14:03:47] <_joe_> it would for instance mix logs from primary and backup, but yeah I do have a 1:1 mental map of where each thing is located on lvs
[14:03:55] <_joe_> yeah I figured
[14:04:06] <_joe_> so please go on :)
[14:04:19] ack :) will check with traffic first thanks
[14:06:35] _joe_: unrelated, when you get some time can you give https://gerrit.wikimedia.org/r/c/operations/puppet/+/670917 (netbase/services) another review
[14:17:35] <_joe_> jbond42: oh sure
[14:18:25] cheers
[14:41:40] serviceops, decommission-hardware: decommission conf200[1-3].codfw.wmnet - https://phabricator.wikimedia.org/T281374 (JMeybohm)
[14:42:03] serviceops, decommission-hardware: decommission conf200[1-3].codfw.wmnet - https://phabricator.wikimedia.org/T281374 (JMeybohm)
[14:42:09] serviceops, SRE, Patch-For-Review: upgrade conf2* servers to stretch - https://phabricator.wikimedia.org/T271573 (JMeybohm)
[15:19:19] serviceops, decommission-hardware: decommission conf200[1-3].codfw.wmnet - https://phabricator.wikimedia.org/T281374 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by jayme@cumin1001 for hosts: `conf[2001-2003].codfw.wmnet` - conf2001.codfw.wmnet (**PASS**) - Downtimed host on Icinga...
[15:42:00] serviceops, decommission-hardware: decommission conf200[1-3].codfw.wmnet - https://phabricator.wikimedia.org/T281374 (JMeybohm)
[17:05:50] <_joe_> ottomata: your feedback on https://gerrit.wikimedia.org/r/c/operations/puppet/+/682971 and https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/683379 would be appreciated :)
[19:19:01] _joe_: ya that is super cool
[19:19:08] i left another comment on the puppet one
[19:19:20] thanks for doing that!
[22:44:33] serviceops, MW-on-K8s, SRE, Shellbox, and 4 others: RFC: PHP microservice for containerized shell execution - https://phabricator.wikimedia.org/T260330 (Dzahn)