[00:03:26] <wikibugs>	 Operations, Analytics, Analytics-Kanban, Event-Platform, Wikimedia-production-error: Could not enqueue jobs from stream mediawiki.job.cirrusSearchIncomingLinkCount - https://phabricator.wikimedia.org/T263132 (thcipriani) p:Unbreak!→High After a few rounds of spikes, messages seem to h...
[00:11:12] <icinga-wm>	 PROBLEM - Too many messages in kafka logging-eqiad #o11y on icinga1001 is CRITICAL: cluster=misc exported_cluster=logging-eqiad group=logstash-codfw instance=kafkamon1002 job=burrow partition={1,2,3,4} prometheus=ops site=eqiad topic={rsyslog-notice,udp_localhost-info,udp_localhost-warning} https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to
[00:11:12] <icinga-wm>	 datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[00:13:17] <wikibugs>	 Operations, Wikimedia-Mailing-lists: Mailing list for local development discussion - https://phabricator.wikimedia.org/T263216 (jeena) Thanks!
[00:19:40] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_proton_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:21:36] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:40:18] <icinga-wm>	 RECOVERY - Too many messages in kafka logging-eqiad #o11y on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=now&orgId=1&var-datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[01:01:38] <icinga-wm>	 PROBLEM - Too many messages in kafka logging-eqiad #o11y on icinga1001 is CRITICAL: cluster=misc exported_cluster=logging-eqiad group=logstash-codfw instance=kafkamon1002 job=burrow partition={1,5} prometheus=ops site=eqiad topic=rsyslog-notice https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=now&orgId=1&var-datasource=thanos&var-cluster=
[01:01:38] <icinga-wm>	 -topic=All&var-consumer_group=All
[01:03:34] <icinga-wm>	 RECOVERY - Too many messages in kafka logging-eqiad #o11y on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=now&orgId=1&var-datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[01:23:02] <icinga-wm>	 PROBLEM - Too many messages in kafka logging-eqiad #o11y on icinga1001 is CRITICAL: cluster=misc exported_cluster=logging-eqiad group=logstash-codfw instance=kafkamon1002 job=burrow partition={0,1,5} prometheus=ops site=eqiad topic={rsyslog-notice,udp_localhost-info,udp_localhost-warning} https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=n
[01:23:02] <icinga-wm>	 tasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[01:28:52] <icinga-wm>	 RECOVERY - Too many messages in kafka logging-eqiad #o11y on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=now&orgId=1&var-datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[01:50:26] <icinga-wm>	 PROBLEM - Too many messages in kafka logging-eqiad #o11y on icinga1001 is CRITICAL: cluster=misc exported_cluster=logging-eqiad group=logstash-codfw instance=kafkamon1002 job=burrow partition={0,1,2,3,4,5} prometheus=ops site=eqiad topic={rsyslog-notice,udp_localhost-info,udp_localhost-warning} https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3
[01:50:26] <icinga-wm>	 var-datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[03:54:05] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1006 is CRITICAL: 13.19 ge 10 https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Labstore https://grafana.wikimedia.org/dashboard/db/labs-monitoring
[04:19:32] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1006 is OK: (C)10 ge (W)5 ge 3.106 https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Labstore https://grafana.wikimedia.org/dashboard/db/labs-monitoring
[06:46:50] <icinga-wm>	 PROBLEM - Persistent high iowait on labstore1006 is CRITICAL: 12.17 ge 10 https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Labstore https://grafana.wikimedia.org/dashboard/db/labs-monitoring
[07:00:04] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20200919T0700)
[07:19:32] <icinga-wm>	 RECOVERY - Persistent high iowait on labstore1006 is OK: (C)10 ge (W)5 ge 4.019 https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Labstore https://grafana.wikimedia.org/dashboard/db/labs-monitoring
[07:24:56] <icinga-wm>	 RECOVERY - Too many messages in kafka logging-eqiad #o11y on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Logstash%23Kafka_consumer_lag https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?from=now-3h&to=now&orgId=1&var-datasource=thanos&var-cluster=logging-eqiad&var-topic=All&var-consumer_group=All
[09:35:05] <wikibugs>	 Operations, Wikimedia-Mailing-lists, I18n: Mojibake on Mailman - https://phabricator.wikimedia.org/T263248 (Ladsgroup) >>! In T263248#6474168, @jhsoby wrote: > @Ladsgroup I see in T52864 that you're involved in upgrading the lists. Do you have any idea what's causing this?  Yes, as Dzahn put it, it's...
[11:41:06] <wikibugs>	 (CR) Elukey: "> Patch Set 2:" [software/pywmflib] - https://gerrit.wikimedia.org/r/626380 (https://phabricator.wikimedia.org/T257905) (owner: Elukey)
[11:45:14] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2019 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[12:29:28] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2019 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[16:42:08] <wikibugs>	 Operations, ops-eqiad, cloud-services-team (Kanban): cloudvirt1033 ipmi alert - https://phabricator.wikimedia.org/T263145 (Bstorm) This just alerted again. I'll downtime it if I can make my laptop work right.
[16:44:53] <wikibugs>	 Operations, ops-eqiad, cloud-services-team (Kanban): cloudvirt1033 ipmi alert - https://phabricator.wikimedia.org/T263145 (Bstorm) Wait the alert may have been the old acked alert re-alerting in VictorOps.  I will resolve it in victorops. The alert is still red in icinga, but it is acked so should no...
[16:49:06] <wikibugs>	 Operations, ops-eqiad, cloud-services-team (Kanban): cloudvirt1033 ipmi alert - https://phabricator.wikimedia.org/T263145 (Bstorm) >>! In T263145#6476367, @Bstorm wrote: > Wait the alert may have been the old acked alert re-alerting in VictorOps.  I will resolve it in victorops. The alert is still re...
[17:52:13] <wikibugs>	 (PS3) Elukey: Add basic debian packaging [software/pywmflib] - https://gerrit.wikimedia.org/r/626380 (https://phabricator.wikimedia.org/T257905)
[18:38:12] <wikibugs>	 (PS1) Evrifaessa: Merge branch 'master' of https://gerrit.wikimedia.org/r/operations/mediawiki-config Change-Id: I9e99b766da20824391fc5111586be998c46c4331 [mediawiki-config] - https://gerrit.wikimedia.org/r/628513
[18:38:14] <wikibugs>	 (PS1) Evrifaessa: Merge branch 'master' of https://gerrit.wikimedia.org/r/operations/mediawiki-config Change-Id: I11ce4d27374aacb96a8b03b7d777406a63d5d5e1 [mediawiki-config] - https://gerrit.wikimedia.org/r/628514
[18:38:16] <wikibugs>	 (PS1) Evrifaessa: Set timezone for wikis of the CWIRP to Europe/Rome [mediawiki-config] - https://gerrit.wikimedia.org/r/628515 (https://phabricator.wikimedia.org/T263123)
[18:39:58] <wikibugs>	 (Abandoned) Evrifaessa: Merge branch 'master' of https://gerrit.wikimedia.org/r/operations/mediawiki-config Change-Id: I11ce4d27374aacb96a8b03b7d777406a63d5d5e1 [mediawiki-config] - https://gerrit.wikimedia.org/r/628514 (owner: Evrifaessa)
[18:40:03] <wikibugs>	 (Abandoned) Evrifaessa: Merge branch 'master' of https://gerrit.wikimedia.org/r/operations/mediawiki-config Change-Id: I9e99b766da20824391fc5111586be998c46c4331 [mediawiki-config] - https://gerrit.wikimedia.org/r/628513 (owner: Evrifaessa)
[18:49:10] <wikibugs>	 (PS1) ArielGlenn: don't get db creds unless needed for a query [dumps] - https://gerrit.wikimedia.org/r/628519 (https://phabricator.wikimedia.org/T263323)
[18:57:50] <wikibugs>	 (PS2) Jforrester: Set timezone for wikis of the CWIRP to Europe/Rome [mediawiki-config] - https://gerrit.wikimedia.org/r/628515 (https://phabricator.wikimedia.org/T263123) (owner: Evrifaessa)
[18:57:57] <wikibugs>	 (CR) ArielGlenn: [C: +2] don't get db creds unless needed for a query [dumps] - https://gerrit.wikimedia.org/r/628519 (https://phabricator.wikimedia.org/T263323) (owner: ArielGlenn)
[18:58:28] <wikibugs>	 (CR) Jforrester: "Please make sure that your patches are based on the remote master branch before pushing, to avoid implicit merge commits. :-)" [mediawiki-config] - https://gerrit.wikimedia.org/r/628515 (https://phabricator.wikimedia.org/T263123) (owner: Evrifaessa)
[19:02:59] <logmsgbot>	 !log ariel@deploy1001 Started deploy [dumps/dumps@14ba6e9]: defer getting db creds until really needed
[19:03:00] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on icinga1001 is CRITICAL: job=swagger_check_cxserver_cluster_eqiad site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:03:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:03:03] <logmsgbot>	 !log ariel@deploy1001 Finished deploy [dumps/dumps@14ba6e9]: defer getting db creds until really needed (duration: 00m 04s)
[19:03:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:04:56] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[19:06:36] <wikibugs>	 (PS1) Evrifaessa: Removing Wikipedia store link from enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/628521 (https://phabricator.wikimedia.org/T262329)
[19:27:19] <wikibugs>	 (PS1) Urbanecm: Allow local steward group members to bigdelete [mediawiki-config] - https://gerrit.wikimedia.org/r/628522
[20:15:48] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2019 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:29:10] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2019 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:39:18] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2030 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:43:32] <icinga-wm>	 PROBLEM - Check whether ferm is active by checking the default input chain on ms-be2030 is CRITICAL: ERROR ferm input drop default policy not set, ferm might not have been started correctly https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[21:04:12] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2030 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:14:26] <icinga-wm>	 RECOVERY - Check whether ferm is active by checking the default input chain on ms-be2030 is OK: OK ferm input default policy is set https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[21:15:10] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2019 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:30:38] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2019 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:44:06] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2019 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:50:12] <DannyS712>	 I know T261133 was declined. What about a project wanting to ban all ip edits from a content namespace? Please see https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/General_semi_protection_for_all_property_pages
[21:50:13] <stashbot>	 T261133: Ban IP editions on pt.wiki - https://phabricator.wikimedia.org/T261133
[22:30:06] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2019 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[22:43:32] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2019 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[23:23:16] <icinga-wm>	 PROBLEM - Query Service HTTP Port on wdqs2002 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 298 bytes in 0.002 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service
[23:29:34] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2019 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state