[00:37:31] <icinga-wm>	 PROBLEM - varnish-http-requests grafana alert on alert1001 is CRITICAL: CRITICAL: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is alerting: 70% GET drop in 30min alert. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[00:44:31] <icinga-wm>	 RECOVERY - varnish-http-requests grafana alert on alert1001 is OK: OK: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is not alerting. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[00:47:20] <wikibugs>	 SRE: try planet/people on bullseye / upgrade people.wikimedia.org backends to bullseye - https://phabricator.wikimedia.org/T280989 (Dzahn)
[00:48:39] <wikibugs>	 SRE: try planet/people on bullseye / upgrade people.wikimedia.org backends to bullseye - https://phabricator.wikimedia.org/T280989 (Dzahn) Open→Resolved p:Low→Medium This is done.  people1003 and people2002 on bullseye have completely replaced people1002 and people2001 on buster   the buster...
[00:51:27] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:53:47] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[01:22:19] <icinga-wm>	 PROBLEM - varnish-http-requests grafana alert on alert1001 is CRITICAL: CRITICAL: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is alerting: 70% GET drop in 30min alert. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[01:24:39] <icinga-wm>	 RECOVERY - varnish-http-requests grafana alert on alert1001 is OK: OK: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is not alerting. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[02:02:25] <icinga-wm>	 PROBLEM - varnish-http-requests grafana alert on alert1001 is CRITICAL: CRITICAL: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is alerting: 70% GET drop in 30min alert. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[02:04:47] <icinga-wm>	 RECOVERY - varnish-http-requests grafana alert on alert1001 is OK: OK: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is not alerting. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[03:22:47] <icinga-wm>	 PROBLEM - varnish-http-requests grafana alert on alert1001 is CRITICAL: CRITICAL: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is alerting: 70% GET drop in 30min alert. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[03:25:05] <icinga-wm>	 RECOVERY - varnish-http-requests grafana alert on alert1001 is OK: OK: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is not alerting. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[03:45:32] <wikibugs>	 (PS1) PipelineBot: citoid: pipeline bot promote [deployment-charts] - https://gerrit.wikimedia.org/r/691466
[04:01:37] <icinga-wm>	 PROBLEM - varnish-http-requests grafana alert on alert1001 is CRITICAL: CRITICAL: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is alerting: 70% GET drop in 30min alert. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[04:22:23] <icinga-wm>	 RECOVERY - varnish-http-requests grafana alert on alert1001 is OK: OK: Varnish HTTP Requests ( https://grafana.wikimedia.org/d/000000180/varnish-http-requests ) is not alerting. https://phabricator.wikimedia.org/project/view/1201/ https://grafana.wikimedia.org/d/000000180/
[05:02:37] <wikibugs>	 SRE, Wikimedia-Mailing-lists: Mailman3 bounce runner is running very slowly - https://phabricator.wikimedia.org/T282348 (Legoktm) @Platonides that's being discussed upstream at https://gitlab.com/warsaw/flufl.bounce/-/issues/7 if you want to chime in there :)
[05:50:55] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[05:53:19] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[06:20:28] <wikibugs>	 (PS1) Majavah: P::kubernetes::deployment_server: Do not use ipv6 on beta [puppet] - https://gerrit.wikimedia.org/r/691494 (https://phabricator.wikimedia.org/T281986)
[06:30:34] <wikibugs>	 (CR) Jcrespo: [C: +2] bacula: Reenable read-write ES database backups, disable read-only [puppet] - https://gerrit.wikimedia.org/r/690338 (https://phabricator.wikimedia.org/T282249) (owner: Jcrespo)
[06:35:29] <icinga-wm>	 RECOVERY - Disk space on backup2002 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=backup2002&var-datasource=codfw+prometheus/ops
[06:54:52] <Amir1>	 !log migrating most of last mailing lists of T280322
[06:54:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:54:56] <stashbot>	 T280322: Upgrade mailing lists from mailman2 to 3 in batches - https://phabricator.wikimedia.org/T280322
[07:18:17] <wikibugs>	 (PS1) Legoktm: mailman3: Optionally enable memcached caching [puppet] - https://gerrit.wikimedia.org/r/691513 (https://phabricator.wikimedia.org/T282931)
[07:20:59] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[07:23:17] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:38:37] <icinga-wm>	 PROBLEM - SSH on logstash2020.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[08:39:46] <wikibugs>	 SRE, Wikimedia-Mailing-lists, User-Ladsgroup: Upgrade mailing lists from mailman2 to 3 in batches - https://phabricator.wikimedia.org/T280322 (Nemo_bis) > Group H is basically done, hyperkitty import failed on wikitech-l and unblock-en-l  Let me guess: the HTML archives were meddled with (to remove s...
[08:50:55] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:55:53] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:02:55] <wikibugs>	 (CR) Addshore: "If this hadn't already happened for the WCQS I would have said lets not do this yet, and change all the URIs at once (wikidata and commons" [mediawiki-config] - https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: Seddon)
[09:03:36] <wikibugs>	 (CR) Alexandros Kosiaris: [C: -2] P::kubernetes::deployment_server: Do not use ipv6 on beta (1 comment) [puppet] - https://gerrit.wikimedia.org/r/691494 (https://phabricator.wikimedia.org/T281986) (owner: Majavah)
[09:07:18] <wikibugs>	 (PS18) Elukey: Add istio base images build support [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192)
[09:15:41] <Majavah>	 akosiaris: hi, what would you suggest for https://gerrit.wikimedia.org/r/691494? my understanding is that beta specific roles should not be used at all, so the only other option I see would be to move the if $::realm != 'labs' clause to role::deployment_server
[09:25:12] <wikibugs>	 (CR) Majavah: [C: +1] "works fine on beta. Ideally beta would just use the service proxy like production but that's rather difficult due to service::catalog data" [puppet] - https://gerrit.wikimedia.org/r/688315 (https://phabricator.wikimedia.org/T277990) (owner: Giuseppe Lavagetto)
[09:27:07] <wikibugs>	 SRE, Wikimedia-Mailing-lists, User-Ladsgroup: Upgrade mailing lists from mailman2 to 3 in batches - https://phabricator.wikimedia.org/T280322 (Ladsgroup) that and encoding mess.
[09:39:45] <icinga-wm>	 RECOVERY - SSH on logstash2020.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[09:49:49] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:52:25] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:57:14] <wikibugs>	 (CR) Multichill: [C: +1] "Hi Adam," [mediawiki-config] - https://gerrit.wikimedia.org/r/679327 (https://phabricator.wikimedia.org/T258590) (owner: Seddon)
[10:09:17] <icinga-wm>	 PROBLEM - WDQS high update lag on wdqs1006 is CRITICAL: 4.829e+04 ge 4.32e+04 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[10:48:21] <wikibugs>	 (PS19) Elukey: Add istio base images build support [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192)
[10:52:36] <wikibugs>	 (CR) Elukey: "I was finally able to run a test on minikube like the following:" [docker-images/production-images] - https://gerrit.wikimedia.org/r/688211 (https://phabricator.wikimedia.org/T278192) (owner: Elukey)
[11:44:17] <wikibugs>	 SRE, Wikidata, Wikidata-Query-Service, User-Addshore: Wikidata produces a lot of failed requests for recentchanges API - https://phabricator.wikimedia.org/T202764 (Ladsgroup)
[12:02:25] <wikibugs>	 (PS1) Mvolz: [wip] Updated outdated commands [deployment-charts] - https://gerrit.wikimedia.org/r/691599
[12:12:05] <mvolz>	 Heads up I accidentally forgot to deploy on one of the two servers on Thursday :/
[12:12:08] <mvolz>	 gonna do eqiad now
[12:12:11] <mvolz>	 whoops
[12:18:41] <mvolz>	 ugh nvm something is messed up.
[12:19:09] <mvolz>	 when I do diff it just wants to revert the chart?? 
[12:19:12] <mvolz>	 :/
[12:31:01] <mvolz>	 hmm, potentially a rebasing issue... :/ looks like codfw is on the new version but the old chart, and eqiad is on the old version but the new chart... 
[12:33:21] <Amir1>	 !log set fr_quality to 0 for all revisions on several wikis (T279761)
[12:33:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:33:28] <stashbot>	 T279761: When reviewing pending changes, raw message ID "⧼revreview-hist-quality⧽" shown instead of human readable string - https://phabricator.wikimedia.org/T279761
[12:37:14] <mvolz>	 XioNoX: any chance you could help with helmfile stuff? 
[12:37:56] <mvolz>	 I have one server on an old version of citoid but on the new chart, and another server with the old chart but on the new version of citoid
[12:38:05] <mvolz>	 So - inconsistent results on the user level.
[12:38:21] <mvolz>	 If I try to update eqiad, it wants to downgrade the chart. 
[12:40:28] <mvolz>	 master deployment_charts repository looks correct - everything is current on master. 
[13:50:36] <wikibugs>	 (PS1) Ladsgroup: acme_chief: Migrate cron to systemd timer [puppet] - https://gerrit.wikimedia.org/r/691634 (https://phabricator.wikimedia.org/T273673)
[13:51:14] <wikibugs>	 (CR) jerkins-bot: [V: -1] acme_chief: Migrate cron to systemd timer [puppet] - https://gerrit.wikimedia.org/r/691634 (https://phabricator.wikimedia.org/T273673) (owner: Ladsgroup)
[13:55:05] <wikibugs>	 (PS1) Ladsgroup: Revert "Revert "prometheus: Migrate node_file_count cron to systemd timer"" [puppet] - https://gerrit.wikimedia.org/r/691317
[13:59:35] <wikibugs>	 (PS2) Ladsgroup: Revert "Revert "prometheus: Migrate node_file_count cron to systemd timer"" [puppet] - https://gerrit.wikimedia.org/r/691317
[14:01:36] <wikibugs>	 (PS2) Ladsgroup: acme_chief: Migrate cron to systemd timer [puppet] - https://gerrit.wikimedia.org/r/691634 (https://phabricator.wikimedia.org/T273673)
[14:01:38] <wikibugs>	 (PS3) Ladsgroup: Revert "Revert "prometheus: Migrate node_file_count cron to systemd timer"" [puppet] - https://gerrit.wikimedia.org/r/691317
[14:03:41] <wikibugs>	 (PS3) Ladsgroup: acme_chief: Migrate cron to systemd timer [puppet] - https://gerrit.wikimedia.org/r/691634 (https://phabricator.wikimedia.org/T273673)
[14:03:58] <wikibugs>	 (PS4) Ladsgroup: Revert "Revert "prometheus: Migrate node_file_count cron to systemd timer"" [puppet] - https://gerrit.wikimedia.org/r/691317
[14:09:29] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[14:12:01] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[14:22:27] <wikibugs>	 SRE, Wikimedia-Mailing-lists, User-Ladsgroup: Upgrade mailing lists from mailman2 to 3 in batches - https://phabricator.wikimedia.org/T280322 (Ladsgroup)
[14:25:00] <wikibugs>	 SRE, Wikimedia-Mailing-lists: Wikimedia-RU mailing list page has wrong encoding - https://phabricator.wikimedia.org/T135226 (Ladsgroup) Open→Resolved It's on mailman3 now: https://lists.wikimedia.org/postorius/lists/wikimedia-ru.lists.wikimedia.org/   Almost all mailing lists are now on mm3
[14:25:48] <wikibugs>	 SRE, Wikimedia-Mailing-lists: Mailman sends bounce notification messages to list-admins with a subject line in Chinese language - https://phabricator.wikimedia.org/T278574 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:26:09] <wikibugs>	 SRE, Wikimedia-Mailing-lists, Mobile: List archives on lists.wikimedia.org is not mobile friendly - https://phabricator.wikimedia.org/T190054 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:27:16] <wikibugs>	 SRE, Wikimedia-Mailing-lists, I18n: Mailman password reminder mail (and other texts) has broken encoding in Czech - https://phabricator.wikimedia.org/T271123 (Ladsgroup) Open→Resolved This mailing list is on mm3 now https://lists.wikimedia.org/postorius/lists/wikics-l.lists.wikimedia.org/  E...
[14:27:25] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[14:28:09] <wikibugs>	 SRE, Wikimedia-Mailing-lists, Upstream: https://lists.wikimedia.org/mailman/options/ doesn't set charset header - https://phabricator.wikimedia.org/T172929 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:29:53] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[14:30:08] <wikibugs>	 SRE, Wikimedia-Mailing-lists, Upstream: Mailing lists need search function readded - https://phabricator.wikimedia.org/T19390 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:31:07] <wikibugs>	 SRE, Wikimedia-Mailing-lists, Upstream: "From" at start of line becomes ">From" in pipermail - https://phabricator.wikimedia.org/T115329 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:32:07] <wikibugs>	 SRE, Wikimedia-Mailing-lists, Mobile: Mailman on lists.wikimedia.org is not mobile friendly - https://phabricator.wikimedia.org/T190055 (Ladsgroup) Open→Resolved Except a very few mailing lists, all are now migrated to mailman3, I call this done.
[14:57:41] <wikibugs>	 (CR) Giuseppe Lavagetto: [C: +1] New envoy upstream version 1.15.5 [docker-images/production-images] - https://gerrit.wikimedia.org/r/689950 (owner: Hnowlan)
[15:17:27] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:19:49] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:51:51] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:54:17] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:06:43] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:09:09] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:16:10] <wikibugs>	 SRE, Analytics, LDAP-Access-Requests, CommRel-Specialists-Support (Apr-Jun-2021): Superset/Turnilo access - https://phabricator.wikimedia.org/T282947 (STei-WMF)
[16:17:16] <wikibugs>	 SRE, discovery-system: Document what #discovery-system is - https://phabricator.wikimedia.org/T282948 (Aklapper)
[16:17:59] <wikibugs>	 SRE, discovery-system: Document what #discovery-system is - https://phabricator.wikimedia.org/T282948 (Aklapper)
[16:21:17] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:23:45] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:51:17] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:53:35] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[16:58:26] <wikibugs>	 SRE, Analytics, LDAP-Access-Requests, CommRel-Specialists-Support (Apr-Jun-2021): Superset/Turnilo access - https://phabricator.wikimedia.org/T282947 (Aklapper) Hi @Stei-WMF, please see the bullet points at https://phabricator.wikimedia.org/tag/ldap-access-requests/ - thanks!
[19:21:17] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:23:45] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:47:47] <wikibugs>	 (PS1) Andrew Bogott: Add an OpenStack package class for Bullseye VMs [puppet] - https://gerrit.wikimedia.org/r/691741 (https://phabricator.wikimedia.org/T280801)
[19:49:41] <wikibugs>	 (CR) Andrew Bogott: [C: +2] Add an OpenStack package class for Bullseye VMs [puppet] - https://gerrit.wikimedia.org/r/691741 (https://phabricator.wikimedia.org/T280801) (owner: Andrew Bogott)
[19:57:07] <wikibugs>	 (PS1) Andrew Bogott: cloud-vps VMs: use ssd for Bullseye [puppet] - https://gerrit.wikimedia.org/r/691744 (https://phabricator.wikimedia.org/T280801)
[20:00:29] <icinga-wm>	 PROBLEM - ats-tls HTTPS wikiworkshop.org RSA on cp5016 is CRITICAL: SSL CRITICAL - OCSP staple validity for wikiworkshop.org has 86371 seconds left https://wikitech.wikimedia.org/wiki/HTTPS
[20:01:20] <wikibugs>	 (CR) Andrew Bogott: [C: +2] cloud-vps VMs: use ssd for Bullseye [puppet] - https://gerrit.wikimedia.org/r/691744 (https://phabricator.wikimedia.org/T280801) (owner: Andrew Bogott)
[20:02:15] <icinga-wm>	 PROBLEM - ats-tls HTTPS wikiworkshop.org ECDSA on cp5016 is CRITICAL: SSL CRITICAL - OCSP staple validity for wikiworkshop.org has 86267 seconds left https://wikitech.wikimedia.org/wiki/HTTPS
[20:32:18] <wikibugs>	 (PS1) Andrew Bogott: cloud-vps openstack packages: don't install python2 packages on bullseye [puppet] - https://gerrit.wikimedia.org/r/691746 (https://phabricator.wikimedia.org/T280801)
[20:33:39] <wikibugs>	 (CR) Andrew Bogott: [C: +2] cloud-vps openstack packages: don't install python2 packages on bullseye [puppet] - https://gerrit.wikimedia.org/r/691746 (https://phabricator.wikimedia.org/T280801) (owner: Andrew Bogott)
[20:51:09] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[20:53:37] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[22:30:41] <wikibugs>	 SRE, Analytics, LDAP-Access-Requests, CommRel-Specialists-Support (Apr-Jun-2021): Superset/Turnilo access for User:STei - https://phabricator.wikimedia.org/T282947 (Urbanecm)
[22:33:14] <wikibugs>	 (PS1) Urbanecm: Enable SandboxLink at azwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/691754 (https://phabricator.wikimedia.org/T282954)
[23:21:23] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[23:23:53] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[23:45:01] <icinga-wm>	 PROBLEM - mailman archives on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[23:46:31] <icinga-wm>	 PROBLEM - mailman list info on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring