[08:29:35] serviceops, Operations, WMF-Legal, Patch-For-Review: Move old transparency report pages to historical URLs and setup redirect - https://phabricator.wikimedia.org/T230638 (Prtksxna) After a discussion with @LMixter, @JbuattiWMF and @Varnent we'd like to propose: 1. `transparency.wikimedia.org` (an...
[08:51:10] serviceops, Phabricator, Patch-For-Review: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (ema) p:Triage→Normal
[08:51:19] serviceops, Operations, Phabricator, Traffic, Patch-For-Review: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (ema)
[10:00:05] serviceops, Operations, Phabricator, Traffic, Patch-For-Review: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (ema) By going through SAL and the irc logs on #wikimedia-operations I've reconstructed the events as follo...
[10:25:02] with more of us merging each other's patches, the security value of the merge 'yes or no?' question goes down a bit, since we can't really look at the contents of the other person's patch and know that it's not been tampered with
[10:33:49] apergos: iirc puppet-merge alerts you when you are about to merge more than one patch
[10:33:55] unless you are talking about something else
[10:34:11] I mean that when it's your patch you can look at it and go 'yep that's what I wrote'
[10:34:28] when it's someone else's, you ask them 'hey can I merge' but you don't actually vet it as 'yeah that's what they wrote'
[10:34:48] it's a small thing but still
[10:40:30] oh I see your point
[10:48:12] serviceops, Operations: envoyproxy does not automatically reload certificates - https://phabricator.wikimedia.org/T238597 (ema)
[10:52:39] serviceops, Operations: envoyproxy does not automatically reload certificates - https://phabricator.wikimedia.org/T238597 (ema) @Joe: is there any potential risk in making `profile::tlsproxy::envoy::use_hot_restarter` default to `true` as @cdanis suggested?
[12:03:50] serviceops, Operations, WMF-Legal, Patch-For-Review: Move old transparency report pages to historical URLs and setup redirect - https://phabricator.wikimedia.org/T230638 (Aklapper) I wonder why #1 should be performed before #2 is performed, and not the other way round (copy/move historical data o...
[12:47:59] serviceops, Operations, Phabricator, Traffic, Patch-For-Review: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (Joe) The problem is most apache workers ended up being stuck talking to aphlict via `proxy_wstunnel` which...
[13:57:33] serviceops, Operations, WMF-Legal, Patch-For-Review: Move old transparency report pages to historical URLs and setup redirect - https://phabricator.wikimedia.org/T230638 (Prtksxna) @Aklapper, the legal team doesn't have the resources to do #2 at the moment, and not having #1 is already causing so...
[14:44:36] serviceops, Operations, Phabricator, Traffic, Patch-For-Review: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (akosiaris) >>! In T238593#5674571, @ema wrote: > By going through SAL and the irc logs on #wikimedia-opera...
[15:37:48] hello
[15:38:05] any tips on which versions of k8s, minikube , helm, docker, etc. to run locally?
[15:38:19] having trouble getting it all to line up and work with our helm stuff
[15:38:27] latest k8s looks no longer compatible with our helm charts
[15:58:25] akosiaris: yt?
[15:58:30] should I do this?
[15:58:30] https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/
[15:58:32] in my charts?
[16:14:11] serviceops, Operations, Phabricator, Traffic: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (Dzahn) >>! In T238593#5674571, @ema wrote: > - 2019-11-15 17:30 SAL: `mutante: phabricator - -started phd service`. @Dzahn It's...
[16:15:27] thcipriani: yt?
[16:15:38] i'm having trouble using blubber (i guess with v4?)
[16:15:58] the apt-get update step fails because deb repos don't have a Release file (?)
[16:17:24] or maybe 404s?
[16:17:29] E: Failed to fetch http://apt.wikimedia.org/wikimedia/dists/stretch-wikimedia/main/binary-amd64/Packages 404 Not Found
[16:20:28] serviceops, Operations, Phabricator, Traffic: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (ema) >>! In T238593#5675662, @Dzahn wrote: > > You can entirely disregard that, i was on phab1001 and not phab1003 by accident....
[16:25:54] serviceops, Operations, Phabricator, Traffic: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (Dzahn) >>! In T238593#5675686, @ema wrote: > Was there any other admin action between the page and when @joe disabled proxy_wstu...
[16:42:33] serviceops, Operations, Phabricator, Traffic: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (Dzahn) Regarding the puppetization: There is `hiera('phabricator_aphlict_enabled'.` which is now set to false. What this does:...
[18:16:14] ottomata: no, not yet. I know about it, but we probably have to do it in a single go, not per chart as we target 1.16
[18:16:45] yeah, seems tricky
[18:16:56] also helm 3?
[18:34:34] serviceops, MediaWiki-REST-API, Operations, wikidiff2, and 2 others: Deploy version 1.10.0 of wikidiff2 to production - https://phabricator.wikimedia.org/T236963 (eprodromou)
[18:35:25] serviceops, MediaWiki-REST-API, Operations, wikidiff2, and 2 others: Deploy version 1.10.0 of wikidiff2 to production - https://phabricator.wikimedia.org/T236963 (eprodromou) >>! In T236963#5662843, @jijiki wrote: > Currently we have 1.9.0 on releases.w.o and on the servers. Please upload the new...
[18:44:38] serviceops: deploy CoreDNS as a in-cluster DNS service - https://phabricator.wikimedia.org/T226516 (akosiaris) Open→Resolved a:akosiaris This has been done, resolving
[18:56:22] serviceops, Editing-team, Beta-Cluster-reproducible, Core Platform Team Legacy (Watching / External), Services (next): Zotero container: Production is running candidate version, last production version is broken due to lack of ca-certificates package - https://phabricator.wikimedia.org/T223345 (...
[18:56:53] serviceops, Operations, User-Joe: SRE FY2019 Q3 goal: Ramp-up serving traffic to PHP 7 - https://phabricator.wikimedia.org/T212828 (akosiaris) Should we resolve this?
[18:57:34] ottomata: don't get me started on that...
[18:57:58] we still need to figure out how to integrate that without making a mess out of everything. It's a huge architectural change
[19:04:24] yeah
[19:04:55] sounds like another huge project and this one (deployment pipeline) is still chugging along
[19:13:38] serviceops, Parsoid-PHP: php-fpm isn't restarted when deploys are rolled back - https://phabricator.wikimedia.org/T238685 (ssastry) p:Triage→High
[19:14:20] serviceops, Parsoid-PHP: php-fpm isn't restarted when deploys are rolled back - https://phabricator.wikimedia.org/T238685 (ssastry)
[19:14:34] serviceops, Parsoid-PHP: php-fpm isn't restarted when deploys are rolled back - https://phabricator.wikimedia.org/T238685 (ssastry)
[21:03:30] phabricator (phab1003) - aphlict service is also stopped and wstunnel module is unloaded from httpd (in addition to the config and ferm part). all that now happening by flipping the Hiera switch
[21:22:59] serviceops, Operations, Phabricator, Traffic: Phabricator downtime due to aphlict and websockets (aphlict current disabled) - https://phabricator.wikimedia.org/T238593 (Dzahn) The Hiera key now does all the things, also stopped the service and unloaded the httpd module wstunnel. After that Phabr...
[21:46:50] serviceops, MediaWiki-REST-API, Operations, wikidiff2, and 2 others: Deploy version 1.10.0 of wikidiff2 to production - https://phabricator.wikimedia.org/T236963 (eprodromou) So, I don't know if there's anyone on our team except Tim who has access to the releases server. @tstarling would you mind...
[21:52:49] serviceops, MediaWiki-REST-API, Operations, wikidiff2, and 2 others: Deploy version 1.10.0 of wikidiff2 to production - https://phabricator.wikimedia.org/T236963 (Dzahn) There are different releasers group for different software. The people who can upload releases of wikidiff2 are: members: [dem...
[22:34:29] serviceops, Parsoid-PHP: php-fpm isn't restarted when deploys are rolled back - https://phabricator.wikimedia.org/T238685 (jijiki) @ssastr php-fpm will be restarted during scap deployments only when a server's opcache free is below 100MB, I can check the code if there is an exception to that
[23:02:50] serviceops, Parsoid-PHP: php-fpm isn't restarted when deploys are rolled back - https://phabricator.wikimedia.org/T238685 (ssastry) >>! In T238685#5676681, @jijiki wrote: > @ssastr php-fpm will be restarted during scap deployments only when a server's opcache free is below 100MB, I can check the code to...
[23:06:35] serviceops, Phabricator, Documentation, Patch-For-Review, and 3 others: Make PHD run on the backup phabricator server (phab2001, currently) - https://phabricator.wikimedia.org/T232883 (Dzahn) - It's now possible to start the phd service with a [[ https://gerrit.wikimedia.org/r/c/operations/puppet...
[23:13:05] serviceops, Phabricator, Documentation, Release-Engineering-Team (Development services), and 2 others: Make PHD run on the backup phabricator server (phab2001, currently) - https://phabricator.wikimedia.org/T232883 (Dzahn)
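
Note on T238685: the restart rule quoted at 22:34:29 (php-fpm is restarted during a scap deployment only when a server's free opcache is below 100MB) can be illustrated with a minimal sketch. This is not scap's actual code; the function name, the constant name and the sample values are hypothetical, and only the 100MB threshold comes from the log.

```python
# Hypothetical illustration of the rule quoted in T238685; not taken from scap.
OPCACHE_FREE_THRESHOLD_MB = 100  # threshold mentioned in the log


def should_restart_php_fpm(
    opcache_free_mb: float,
    threshold_mb: float = OPCACHE_FREE_THRESHOLD_MB,
) -> bool:
    """Return True only when free opcache memory is strictly below the threshold."""
    return opcache_free_mb < threshold_mb


if __name__ == "__main__":
    # Sample values are made up for demonstration.
    for free_mb in (250.0, 100.0, 42.0):
        action = "restart" if should_restart_php_fpm(free_mb) else "skip restart"
        print(f"opcache free {free_mb:.0f}MB: {action}")
```

Under this reading, a deploy or rollback on a server with ample free opcache would skip the restart, which would be consistent with the behaviour reported in the task title. Whether scap has an exception for rollbacks is left open in the log.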