[09:27:41] PROBLEM - High lag on wdqs2006 is CRITICAL: 4373 ge 3600 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[09:28:26] ACKNOWLEDGEMENT - High lag on wdqs2006 is CRITICAL: 4373 ge 3600 Gehel catching up after data reload - https://phabricator.wikimedia.org/T228122 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[10:02:17] RECOVERY - High lag on wdqs2006 is OK: (C)3600 ge (W)1200 ge 893.9 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[10:52:18] PROBLEM - High lag on wdqs1010 is CRITICAL: 4756 ge 3600 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[10:52:58] PROBLEM - High lag on wdqs1007 is CRITICAL: 4796 ge 3600 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[11:23:22] RECOVERY - High lag on wdqs1010 is OK: (C)3600 ge (W)1200 ge 941 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[11:30:48] RECOVERY - High lag on wdqs1007 is OK: (C)3600 ge (W)1200 ge 1092 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[15:19:30] PROBLEM - High lag on wdqs2004 is CRITICAL: 4019 ge 3600 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[15:20:44] ACKNOWLEDGEMENT - High lag on wdqs2004 is CRITICAL: 4082 ge 3600 Gehel catching up on lag after data reload - https://phabricator.wikimedia.org/T228122 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[15:38:44] RECOVERY - High lag on wdqs2004 is OK: (C)3600 ge (W)1200 ge 999.7 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[18:54:22] PROBLEM - Blazegraph process -wdqs-blazegraph- on wdqs1008 is CRITICAL: PROCS CRITICAL: 0 processes with UID = 499 (blazegraph), regex args ^java .* --port 9999 .* blazegraph-service-.*war https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[18:54:26] PROBLEM - Check systemd state on wdqs1008 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[18:54:32] PROBLEM - Check systemd state on wdqs1010 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[18:54:34] PROBLEM - Blazegraph Port for wdqs-blazegraph on wdqs1010 is CRITICAL: connect to address 127.0.0.1 and port 9999: Connection refused https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[18:54:52] PROBLEM - WDQS HTTP Port on wdqs1010 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 380 bytes in 0.000 second response time https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[18:54:54] PROBLEM - Blazegraph Port for wdqs-blazegraph on wdqs1008 is CRITICAL: connect to address 127.0.0.1 and port 9999: Connection refused https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[18:54:56] PROBLEM - WDQS HTTP Port on wdqs1008 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 380 bytes in 0.001 second response time https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[18:55:16] PROBLEM - Blazegraph process -wdqs-blazegraph- on wdqs1010 is CRITICAL: PROCS CRITICAL: 0 processes with UID = 499 (blazegraph), regex args ^java .* --port 9999 .* blazegraph-service-.*war https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[22:15:15] RECOVERY - Blazegraph Port for wdqs-blazegraph on wdqs1010 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 9999 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[22:15:35] PROBLEM - High lag on wdqs1010 is CRITICAL: 3.383e+04 ge 3600 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[22:16:03] RECOVERY - WDQS HTTP Port on wdqs1010 is OK: HTTP OK: HTTP/1.1 200 OK - 448 bytes in 0.040 second response time https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[22:16:27] RECOVERY - Blazegraph process -wdqs-blazegraph- on wdqs1010 is OK: PROCS OK: 1 process with UID = 499 (blazegraph), regex args ^java .* --port 9999 .* blazegraph-service-.*war https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[22:17:07] RECOVERY - Check systemd state on wdqs1010 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[22:22:03] RECOVERY - Blazegraph Port for wdqs-blazegraph on wdqs1008 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 9999 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[22:22:31] RECOVERY - WDQS HTTP Port on wdqs1008 is OK: HTTP OK: HTTP/1.1 200 OK - 448 bytes in 0.035 second response time https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link
[22:23:15] RECOVERY - Blazegraph process -wdqs-blazegraph- on wdqs1008 is OK: PROCS OK: 1 process with UID = 499 (blazegraph), regex args ^java .* --port 9999 .* blazegraph-service-.*war https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[22:41:28] ACKNOWLEDGEMENT - High lag on wdqs1010 is CRITICAL: 2.92e+04 ge 3600 Gehel catching up on lag after data transfer - https://phabricator.wikimedia.org/T228122 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[22:47:37] RECOVERY - Check systemd state on wdqs1008 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link