[00:03:48] <icinga-wm_>	 PROBLEM - Check systemd state on netflow3001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:03:54] <icinga-wm_>	 PROBLEM - Check systemd state on netflow4001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:03:58] <icinga-wm_>	 PROBLEM - Check systemd state on netflow2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:04:50] <icinga-wm_>	 PROBLEM - Check systemd state on netflow5001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:05:22] <icinga-wm_>	 PROBLEM - Check systemd state on netflow1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:57:32] <wikibugs>	 Operations, Security-Team, Wikimedia-Mailing-lists: Have a conversation about migrating from GNU Mailman 2.1 to GNU Mailman 3.0 - https://phabricator.wikimedia.org/T52864 (Ladsgroup) >>! In T52864#6242301, @Tgr wrote: >>>! In T52864#6242222, @Ladsgroup wrote: >> The only thing is that with disabling...
[01:03:54] <wikibugs>	 (CR) Ladsgroup: "> Patch Set 1:" [puppet] - https://gerrit.wikimedia.org/r/606464 (owner: Ladsgroup)
[02:30:24] <wikibugs>	 Operations, DBA: db1088 crashed - https://phabricator.wikimedia.org/T255927 (Peachey88)
[02:30:52] <wikibugs>	 Operations, ops-eqiad, DBA: Degraded RAID on db1088 - https://phabricator.wikimedia.org/T255928 (Peachey88)
[03:00:04] <icinga-wm_>	 RECOVERY - Check systemd state on an-launcher1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:18:22] <icinga-wm_>	 PROBLEM - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is CRITICAL: 148.5 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[03:49:30] <icinga-wm_>	 PROBLEM - Check systemd state on netbox1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[04:15:04] <icinga-wm_>	 RECOVERY - Check systemd state on netbox1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[04:47:48] <icinga-wm_>	 RECOVERY - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is OK: (C)100 gt (W)80 gt 69.15 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[07:00:04] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200621T0700)
[08:57:46] <icinga-wm_>	 PROBLEM - MediaWiki exceptions and fatals per minute on icinga1001 is CRITICAL: cluster=logstash job=statsd_exporter level=ERROR site=eqiad https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[08:59:36] <icinga-wm_>	 RECOVERY - MediaWiki exceptions and fatals per minute on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=2&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[10:27:24] <icinga-wm_>	 PROBLEM - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is CRITICAL: 138.3 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[11:55:52] <icinga-wm_>	 RECOVERY - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is OK: (C)100 gt (W)80 gt 75.25 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[12:14:52] <icinga-wm_>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[12:16:40] <icinga-wm_>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[12:18:15] <icinga-wm_>	 PROBLEM - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is CRITICAL: /v2/suggest/sections/{title}/{from}/{to} (Suggest source sections to translate) timed out before a response was received https://wikitech.wikimedia.org/wiki/CX
[12:19:54] <icinga-wm_>	 RECOVERY - Cxserver LVS eqiad on cxserver.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[15:41:44] <wikibugs>	 Operations, ops-eqiad, DBA: Degraded RAID on db1088 - https://phabricator.wikimedia.org/T255928 (Zoranzoki21)
[15:41:47] <wikibugs>	 Operations, DBA: db1088 crashed - https://phabricator.wikimedia.org/T255927 (Zoranzoki21)
[16:07:08] <icinga-wm_>	 PROBLEM - Maps tiles generation on icinga1001 is CRITICAL: CRITICAL: 100.00% of data under the critical threshold [5.0] https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=8&fullscreen&orgId=1
[16:36:27] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Password reset for wikimediaindia-l mailing list - https://phabricator.wikimedia.org/T255910 (anirudhsbh) I have reached out to hpnadig and he asked me to follow up with the WMF technical team to have the password reset. The other two admins have not responded.
[16:36:40] <wikibugs>	 (PS1) Ladsgroup: meet: Add /etc/meet-auth to store the configs and data [puppet] - https://gerrit.wikimedia.org/r/606824
[16:37:28] <wikibugs>	 (Abandoned) Ladsgroup: meet: Change owner of account manager code to www-data [puppet] - https://gerrit.wikimedia.org/r/606464 (owner: Ladsgroup)
[16:37:48] <wikibugs>	 (CR) jerkins-bot: [V: -1] meet: Add /etc/meet-auth to store the configs and data [puppet] - https://gerrit.wikimedia.org/r/606824 (owner: Ladsgroup)
[16:38:26] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Password reset for wikimediaindia-l mailing list - https://phabricator.wikimedia.org/T255910 (anirudhsbh) Hello, I have an update... one of the old passwords started working. Sorry about the confusion!
[16:38:55] <wikibugs>	 (PS2) Ladsgroup: meet: Add /etc/meet-auth to store the configs and data [puppet] - https://gerrit.wikimedia.org/r/606824
[16:39:26] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Password reset for wikimediaindia-l mailing list - https://phabricator.wikimedia.org/T255910 (RhinosF1) Stalled→Invalid >>! In T255910#6242788, @anirudhsbh wrote: > Hello, I have an update... one of the old passwords started working. Sorry about the confusion!...
[17:13:10] <icinga-wm_>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[17:14:56] <icinga-wm_>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[17:44:31] <wikibugs>	 (CR) Andrew Bogott: [C: +1] cloud nfs: only run nfs-exportd on the current active node (1 comment) [puppet] - https://gerrit.wikimedia.org/r/606543 (https://phabricator.wikimedia.org/T253353) (owner: Bstorm)
[17:45:28] <wikibugs>	 (CR) Andrew Bogott: [C: +1] unattendedupgrades: allow configurable kernel cleanup [puppet] - https://gerrit.wikimedia.org/r/606234 (https://phabricator.wikimedia.org/T127374) (owner: Bstorm)
[17:57:20] <icinga-wm_>	 PROBLEM - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is CRITICAL: 155.6 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[18:23:28] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Creation of mailinglist for Board of WUG Esperanto and Free Knowledge - https://phabricator.wikimedia.org/T255951 (KuboF)
[18:57:08] <icinga-wm_>	 RECOVERY - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is OK: (C)100 gt (W)80 gt 62.03 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[19:09:56] <icinga-wm_>	 PROBLEM - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is CRITICAL: 103.7 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[19:45:50] <icinga-wm_>	 PROBLEM - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is CRITICAL: 106.8 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[21:03:28] <icinga-wm_>	 RECOVERY - Rate of JVM GC Old generation-s runs - elastic1052-production-search-psi-eqiad on elastic1052 is OK: (C)100 gt (W)80 gt 26.44 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=production-search-psi-eqiad&var-instance=elastic1052&panelId=37
[21:54:34] <wikibugs>	 (PS2) QChris: gerrit: Add option to mark gerrit servers as upgraded [puppet] - https://gerrit.wikimedia.org/r/606530 (https://phabricator.wikimedia.org/T254158)
[21:54:36] <wikibugs>	 (PS4) QChris: gerrit: Mark gerrit1002 (gerrit-test) as upgraded [puppet] - https://gerrit.wikimedia.org/r/606531 (https://phabricator.wikimedia.org/T254158)
[21:54:38] <wikibugs>	 (PS3) QChris: gerrit: Add dedicated home dir for new Gerrit version [puppet] - https://gerrit.wikimedia.org/r/606532 (https://phabricator.wikimedia.org/T254158)
[21:54:40] <wikibugs>	 (PS3) QChris: gerrit: Stop setting up a database for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606536 (https://phabricator.wikimedia.org/T254158)
[21:54:42] <wikibugs>	 (PS4) QChris: gerrit: Drop its configuration for draft changes for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606533 (https://phabricator.wikimedia.org/T254158)
[21:54:44] <wikibugs>	 (PS2) QChris: gerrit: Update its-phabricator templates for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606781 (https://phabricator.wikimedia.org/T254158)
[21:54:46] <wikibugs>	 (PS2) QChris: gerrit: Update email templates for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606782 (https://phabricator.wikimedia.org/T254158)
[21:54:48] <wikibugs>	 (PS2) QChris: gerrit: Drop empty unused Git config file [puppet] - https://gerrit.wikimedia.org/r/606783 (https://phabricator.wikimedia.org/T254158)
[21:54:50] <wikibugs>	 (PS3) QChris: gerrit: Enable git protocol v2 on new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606784 (https://phabricator.wikimedia.org/T254158)
[21:54:52] <wikibugs>	 (PS2) QChris: gerrit: Allow to use request tracing for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606795 (https://phabricator.wikimedia.org/T254158)
[21:54:54] <wikibugs>	 (PS2) QChris: gerrit: Do not enable the ability to move changes for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606796 (https://phabricator.wikimedia.org/T254158)
[21:54:56] <wikibugs>	 (PS15) QChris: Gerrit: Convert CoC and Privacy links to use the new PolyGerrit extension point [puppet] - https://gerrit.wikimedia.org/r/520295 (https://phabricator.wikimedia.org/T254648) (owner: Paladox)
[21:54:58] <wikibugs>	 (PS8) QChris: Gerrit: Migrate theme to support Polymer 2 [puppet] - https://gerrit.wikimedia.org/r/539180 (https://phabricator.wikimedia.org/T227509) (owner: Paladox)
[21:55:00] <wikibugs>	 (PS1) QChris: gerrit: Use `replica` instead of `slave` for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606839 (https://phabricator.wikimedia.org/T254158)
[21:55:02] <wikibugs>	 (PS1) QChris: gerrit: Remove old Polymer <2 styles for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606840 (https://phabricator.wikimedia.org/T227509)
[21:55:04] <wikibugs>	 (PS1) QChris: gerrit: Switch header styling for new Gerrits from component to style [puppet] - https://gerrit.wikimedia.org/r/606841 (https://phabricator.wikimedia.org/T227509)
[21:55:06] <wikibugs>	 (PS1) QChris: gerrit: Use colored header bar also in dark theme for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606842 (https://phabricator.wikimedia.org/T227509)
[21:55:08] <wikibugs>	 (PS1) QChris: gerrit: Have a proper light and dark style for new Gerrits [puppet] - https://gerrit.wikimedia.org/r/606843 (https://phabricator.wikimedia.org/T227509)
[21:56:55] <wikibugs>	 (CR) QChris: [C: +1] Gerrit: Convert CoC and Privacy links to use the new PolyGerrit extension point [puppet] - https://gerrit.wikimedia.org/r/520295 (https://phabricator.wikimedia.org/T254648) (owner: Paladox)
[21:57:01] <wikibugs>	 (CR) QChris: [C: +1] Gerrit: Migrate theme to support Polymer 2 [puppet] - https://gerrit.wikimedia.org/r/539180 (https://phabricator.wikimedia.org/T227509) (owner: Paladox)
[21:57:28] <wikibugs>	 (CR) QChris: "This theme is now online on https://gerrit-test.wikimedia.org/" [puppet] - https://gerrit.wikimedia.org/r/606843 (https://phabricator.wikimedia.org/T227509) (owner: QChris)
[22:09:34] <icinga-wm_>	 PROBLEM - Work requests waiting in Zuul Gearman server on contint2001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[22:31:04] <icinga-wm_>	 RECOVERY - Work requests waiting in Zuul Gearman server on contint2001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[23:01:48] <icinga-wm_>	 PROBLEM - Maps - OSM synchronization lag - codfw on icinga1001 is CRITICAL: 2.593e+05 ge 2.592e+05 https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=12&fullscreen&orgId=1