[00:20:47] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=pdu_sentry4 site=eqsin https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:25:37] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[02:00:03] RECOVERY - Check systemd state on deneb is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:16:55] PROBLEM - WDQS SPARQL on wdqs1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[03:21:43] RECOVERY - WDQS SPARQL on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 688 bytes in 1.051 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[03:24:42] (PS1) Andrew Bogott: Nova policy: limit os-rescue to projectadmins [puppet] - https://gerrit.wikimedia.org/r/678379 (https://phabricator.wikimedia.org/T279712)
[03:25:34] (CR) Andrew Bogott: [C: +2] Nova policy: limit os-rescue to projectadmins [puppet] - https://gerrit.wikimedia.org/r/678379 (https://phabricator.wikimedia.org/T279712) (owner: Andrew Bogott)
[04:04:30] (CR) Ori.livneh: "This change is ready for review." [puppet] - https://gerrit.wikimedia.org/r/678380 (owner: Ori.livneh)
[04:16:16] (PS1) Andrew Bogott: observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845)
[04:25:53] (PS2) Andrew Bogott: observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845)
[04:26:58] (CR) jerkins-bot: [V: -1] observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[04:29:37] (PS3) Andrew Bogott: observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845)
[07:00:05] Deploy window No deploys all day! See [[Deployments/Emergencies]] if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210411T0700)
[08:49:47] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=routinator site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:52:13] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:22:17] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1001-cloudelastic-chi-eqiad on cloudelastic1001 is CRITICAL: 105.8 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1001&panelId=37
[09:29:39] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1001-cloudelastic-chi-eqiad on cloudelastic1001 is CRITICAL: 100.7 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1001&panelId=37
[09:33:38] (PS3) Zabe: Add extendedconfirmed on svwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/678366 (https://phabricator.wikimedia.org/T279836)
[09:36:57] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1001-cloudelastic-chi-eqiad on cloudelastic1001 is CRITICAL: 103.7 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1001&panelId=37
[09:59:07] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1001-cloudelastic-chi-eqiad on cloudelastic1001 is OK: (C)100 gt (W)80 gt 77.29 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1001&panelId=37
[12:20:23] (PS1) Urbanecm: Explicitly set wgGEMentorshipMigrationStage: WRITE_OLD/READ_OLD [mediawiki-config] - https://gerrit.wikimedia.org/r/678386 (https://phabricator.wikimedia.org/T279853)
[13:15:31] (PS1) Andrew Bogott: OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:16:41] (CR) jerkins-bot: [V: -1] OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[13:19:59] (PS2) Andrew Bogott: OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:21:17] (CR) jerkins-bot: [V: -1] OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[13:23:32] (PS3) Andrew Bogott: OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:24:41] (CR) jerkins-bot: [V: -1] OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[13:27:06] (PS4) Andrew Bogott: OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:28:14] (CR) jerkins-bot: [V: -1] OpenStack novaobserver.sh: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[13:29:20] (PS1) Urbanecm: labs: Set GEMentorshipMigrationStage to SCHEMA_COMPAT_WRITE_BOTH | SCHEMA_COMPAT_READ_OLD [mediawiki-config] - https://gerrit.wikimedia.org/r/678389 (https://phabricator.wikimedia.org/T279853)
[13:32:29] (CR) Urbanecm: [C: +1] "SGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/678366 (https://phabricator.wikimedia.org/T279836) (owner: Zabe)
[13:33:09] (CR) Urbanecm: [C: +1] [wikitech] Update logo to mirror the new MediaWiki logo [mediawiki-config] - https://gerrit.wikimedia.org/r/678342 (https://phabricator.wikimedia.org/T279087) (owner: Jforrester)
[13:33:52] (CR) Urbanecm: [C: +1] Enable on bswiki [mediawiki-config] - https://gerrit.wikimedia.org/r/678237 (https://phabricator.wikimedia.org/T279635) (owner: Zabe)
[13:49:21] (PS4) Andrew Bogott: observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845)
[13:49:23] (PS5) Andrew Bogott: OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:49:25] (PS1) Andrew Bogott: OpenStack control nodes: remove region-migrate.conf [puppet] - https://gerrit.wikimedia.org/r/678390
[13:54:36] (PS6) Andrew Bogott: OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[13:55:53] (CR) jerkins-bot: [V: -1] OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[14:06:33] (PS7) Andrew Bogott: OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[14:14:38] (PS8) Andrew Bogott: OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845)
[14:17:53] (CR) Andrew Bogott: [C: +2] OpenStack control nodes: remove region-migrate.conf [puppet] - https://gerrit.wikimedia.org/r/678390 (owner: Andrew Bogott)
[14:18:19] (CR) Andrew Bogott: [C: +2] observerenv.sh: make observer_project into a hiera setting [puppet] - https://gerrit.wikimedia.org/r/678381 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[14:22:50] (CR) Andrew Bogott: [C: +2] OpenStack env scripts: refactor to use a more general resource [puppet] - https://gerrit.wikimedia.org/r/678388 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[14:31:06] (PS1) Andrew Bogott: profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845)
[14:32:09] (CR) jerkins-bot: [V: -1] profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[14:52:52] (PS2) Andrew Bogott: profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845)
[14:54:21] (CR) jerkins-bot: [V: -1] profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[14:55:54] (PS3) Andrew Bogott: profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845)
[15:01:07] (PS4) Andrew Bogott: profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845)
[15:03:27] (PS5) Andrew Bogott: profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845)
[15:23:24] (CR) Andrew Bogott: [C: +2] profile::openstack::base::envscripts: add an env file for the osstackcanary user [puppet] - https://gerrit.wikimedia.org/r/678391 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[17:49:45] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=pdu_sentry4 site=eqsin https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[17:52:15] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[20:12:44] (PS1) Andrew Bogott: mwopenstackclients3.py: add support for neutron and cinder clients [puppet] - https://gerrit.wikimedia.org/r/678428 (https://phabricator.wikimedia.org/T279845)
[20:13:31] (CR) Andrew Bogott: [C: +2] mwopenstackclients3.py: add support for neutron and cinder clients [puppet] - https://gerrit.wikimedia.org/r/678428 (https://phabricator.wikimedia.org/T279845) (owner: Andrew Bogott)
[20:37:56] (CR) DannyS712: Explicitly set wgGEMentorshipMigrationStage: WRITE_OLD/READ_OLD (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/678386 (https://phabricator.wikimedia.org/T279853) (owner: Urbanecm)
[21:28:29] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=swagger_check_citoid_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[21:30:57] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[22:17:31] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[22:19:57] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[23:21:23] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=netbox_device_statistics site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[23:23:51] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets