[00:33:18] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1006 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[00:35:36] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1006 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: active_shards: 1793, delayed_unassigned_shards: 0, status: green, cluster_name: cloudelastic-chi-eqiad, active_shards_percent_as_number: 100.0, initializing_shards: 0, number_of_pending_tasks: 0, number_of_nodes: 6, task_max_waiting_in_queue_millis: 0, unassigned_shards: 0, number_of_data_nodes: 6,
[00:35:36] s: 0, timed_out: False, active_primary_shards: 895, number_of_in_flight_fetch: 0 https://wikitech.wikimedia.org/wiki/Search%23Administration
[03:15:22] PROBLEM - WDQS SPARQL on wdqs1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[03:27:18] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 75.25 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[03:51:20] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: monitor_refine_netflow_failure_flags.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[04:01:14] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[04:03:34] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: unassigned_shards: 0, number_of_pending_tasks: 0, relocating_shards: 0, initializing_shards: 0, task_max_waiting_in_queue_millis: 0, number_of_nodes: 6, active_primary_shards: 895, active_shards: 1793, status: green, active_shards_percent_as_number: 100.0, number_of_in_flight_fetch: 0, cluster_name
[04:03:34] i-eqiad, timed_out: False, delayed_unassigned_shards: 0, number_of_data_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[04:28:14] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1006 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[04:32:48] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1006 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: number_of_in_flight_fetch: 0, number_of_data_nodes: 6, relocating_shards: 0, delayed_unassigned_shards: 0, cluster_name: cloudelastic-chi-eqiad, unassigned_shards: 0, timed_out: False, number_of_nodes: 6, number_of_pending_tasks: 0, active_primary_shards: 895, task_max_waiting_in_queue_millis: 0, i
[04:32:48] s: 0, active_shards_percent_as_number: 100.0, status: green, active_shards: 1793 https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:04:00] PROBLEM - WMF Cloud -Chi Cluster- - Public Internet Port - HTTPS on cloudelastic.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:06:16] RECOVERY - WMF Cloud -Chi Cluster- - Public Internet Port - HTTPS on cloudelastic.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 673 bytes in 0.026 second response time https://wikitech.wikimedia.org/wiki/Search%23Administration
[06:40:43] !log cleaning watchlist of User:Mr._Ibrahem in wikidatawiki (in main ns only)
[06:40:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:00:04] Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210418T0700)
[07:29:02] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[08:54:42] Hi, Can someone look at https://gerrit.wikimedia.org/r/c/integration/config/+/680697/ ?
[08:59:58] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[09:02:20] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: timed_out: False, delayed_unassigned_shards: 0, number_of_in_flight_fetch: 0, unassigned_shards: 0, task_max_waiting_in_queue_millis: 0, initializing_shards: 0, number_of_data_nodes: 6, relocating_shards: 0, number_of_nodes: 6, number_of_pending_tasks: 0, status: green, active_primary_shards: 895,
[09:02:20] 93, active_shards_percent_as_number: 100.0, cluster_name: cloudelastic-chi-eqiad https://wikitech.wikimedia.org/wiki/Search%23Administration
[09:38:32] PROBLEM - Host wdqs1013 is DOWN: PING CRITICAL - Packet loss = 100%
[09:39:46] RECOVERY - Host wdqs1013 is UP: PING WARNING - Packet loss = 33%, RTA = 0.28 ms
[10:04:41] (CR) Daimona Eaytoy: "> Patch Set 1: Code-Review+1" [mediawiki-config] - https://gerrit.wikimedia.org/r/680753 (https://phabricator.wikimedia.org/T239990) (owner: Daimona Eaytoy)
[10:11:52] RECOVERY - WDQS SPARQL on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 688 bytes in 1.059 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[10:36:01] SRE, MediaWiki-General, Browser-Support-Apple-Safari: File:Chessboard480.svg not visible on safari when size is fixed at 208px - https://phabricator.wikimedia.org/T280439 (Daimona) Open→Resolved a:Daimona Regenerating the thumbnail (via [[https://commons.wikimedia.org/w/thumb.php?f=Chessb...
[10:44:06] PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on alert1001 is CRITICAL: 51.77 le 60 https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[10:46:32] RECOVERY - Varnish traffic drop between 30min ago and now at eqiad on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Varnish%23Diagnosing_Varnish_alerts https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[12:39:22] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is CRITICAL: 142.4 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[13:40:12] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 65.08 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[14:01:44] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[14:04:08] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: status: green, number_of_pending_tasks: 0, number_of_data_nodes: 6, initializing_shards: 0, active_primary_shards: 895, number_of_in_flight_fetch: 0, task_max_waiting_in_queue_millis: 0, timed_out: False, cluster_name: cloudelastic-chi-eqiad, active_shards: 1793, delayed_unassigned_shards: 0, activ
[14:04:08] as_number: 100.0, relocating_shards: 0, unassigned_shards: 0, number_of_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[15:30:22] PROBLEM - Check for expired certificates on pki2001 is CRITICAL: CRITICAL https://wikitech.wikimedia.org/wiki/PKI/Debugging
[15:31:02] PROBLEM - Check for expired certificates on pki1001 is CRITICAL: CRITICAL https://wikitech.wikimedia.org/wiki/PKI/Debugging
[15:46:48] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is CRITICAL: 141.4 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[16:40:22] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 73.22 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[16:42:40] PROBLEM - ElasticSearch health check for shards on 9200 on cloudelastic1005 is CRITICAL: CRITICAL - elasticsearch http://localhost:9200/_cluster/health error while fetching: HTTPConnectionPool(host=localhost, port=9200): Read timed out. (read timeout=4) https://wikitech.wikimedia.org/wiki/Search%23Administration
[16:45:04] RECOVERY - ElasticSearch health check for shards on 9200 on cloudelastic1005 is OK: OK - elasticsearch status cloudelastic-chi-eqiad: relocating_shards: 0, active_shards_percent_as_number: 100.0, timed_out: False, number_of_pending_tasks: 0, status: green, unassigned_shards: 0, cluster_name: cloudelastic-chi-eqiad, active_primary_shards: 895, number_of_nodes: 6, task_max_waiting_in_queue_millis: 0, delayed_unassigned_shards: 0, i
[16:45:04] s: 0, active_shards: 1793, number_of_in_flight_fetch: 0, number_of_data_nodes: 6 https://wikitech.wikimedia.org/wiki/Search%23Administration
[18:29:00] PROBLEM - WMF Cloud -Chi Cluster- - Public Internet Port - HTTPS on cloudelastic.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Search%23Administration
[18:31:20] RECOVERY - WMF Cloud -Chi Cluster- - Public Internet Port - HTTPS on cloudelastic.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 673 bytes in 0.329 second response time https://wikitech.wikimedia.org/wiki/Search%23Administration
[18:37:34] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is CRITICAL: 130.2 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[19:22:32] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 71.19 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[21:35:25] SRE, Wikimedia-SVG-rendering: Adding new font for CJK media display - https://phabricator.wikimedia.org/T280432 (kaldari) Note that Source Han Sans is licensed under the SIL Open Font License.