[00:05:54] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp1087 is OK: HTTP OK: HTTP/1.0 200 OK - 22331 bytes in 0.006 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[03:05:20] PROBLEM - Check systemd state on an-launcher1001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[05:31:20] PROBLEM - Router interfaces on cr3-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 74, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:32:10] PROBLEM - Router interfaces on cr2-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 62, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[06:44:08] RECOVERY - Check systemd state on an-launcher1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[07:01:58] PROBLEM - PHP opcache health on mw2334 is CRITICAL: CRITICAL: opcache cache-hit ratio is below 99.85% https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[07:20:30] RECOVERY - PHP opcache health on mw2334 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[09:13:34] looks like wikibugs needs a poke /Reedy legoktm
[09:13:48] Sorry I'm asleep
[10:29:14] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is CRITICAL: 105.8 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[11:13:30] (CR) Gergő Tisza: "Sorry, forgot about this existing." [mediawiki-config] - https://gerrit.wikimedia.org/r/565438 (https://phabricator.wikimedia.org/T238295) (owner: Catrope)
[11:16:52] (PS9) Gergő Tisza: Enable GrowthExperiments welcome survey on Ukrainian, Hungarian, Armenian Wikipedias [mediawiki-config] - https://gerrit.wikimedia.org/r/584135 (https://phabricator.wikimedia.org/T238295)
[11:34:10] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 78.31 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[15:05:35] (CR) Reedy: [C: +1] "+1 the extent the tables are alright to be dumped" [puppet] - https://gerrit.wikimedia.org/r/573351 (https://phabricator.wikimedia.org/T236431) (owner: ArielGlenn)
[15:05:52] (PS1) Andrew Bogott: neutron-api: create /var/run/neutron [puppet] - https://gerrit.wikimedia.org/r/585895 (https://phabricator.wikimedia.org/T248635)
[15:12:27] (CR) Andrew Bogott: [C: +2] neutron-api: create /var/run/neutron [puppet] - https://gerrit.wikimedia.org/r/585895 (https://phabricator.wikimedia.org/T248635) (owner: Andrew Bogott)
[15:23:43] (PS1) Umherirrender: Use MediaWikiServices::getAuthManager [mediawiki-config] - https://gerrit.wikimedia.org/r/585910
[15:53:00] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 2 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (CDanis)
[16:28:49] (PS1) Andrew Bogott: Neutron: replace l3 hacks for Rocky [puppet] - https://gerrit.wikimedia.org/r/585917 (https://phabricator.wikimedia.org/T248635)
[16:30:00] (CR) Andrew Bogott: [C: +2] Neutron: replace l3 hacks for Rocky [puppet] - https://gerrit.wikimedia.org/r/585917 (https://phabricator.wikimedia.org/T248635) (owner: Andrew Bogott)
[16:35:24] PROBLEM - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is CRITICAL: 100.7 gt 100 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[16:56:58] (PS1) Andrew Bogott: Keystone hacks: update keystone api handling [puppet] - https://gerrit.wikimedia.org/r/585934 (https://phabricator.wikimedia.org/T248635)
[16:57:47] (CR) jerkins-bot: [V: -1] Keystone hacks: update keystone api handling [puppet] - https://gerrit.wikimedia.org/r/585934 (https://phabricator.wikimedia.org/T248635) (owner: Andrew Bogott)
[17:01:01] (PS2) Andrew Bogott: Keystone hacks: update keystone api handling [puppet] - https://gerrit.wikimedia.org/r/585934 (https://phabricator.wikimedia.org/T248635)
[17:02:06] (CR) Andrew Bogott: [C: +2] Keystone hacks: update keystone api handling [puppet] - https://gerrit.wikimedia.org/r/585934 (https://phabricator.wikimedia.org/T248635) (owner: Andrew Bogott)
[17:30:34] RECOVERY - Rate of JVM GC Old generation-s runs - cloudelastic1003-cloudelastic-chi-eqiad on cloudelastic1003 is OK: (C)100 gt (W)80 gt 79.32 https://wikitech.wikimedia.org/wiki/Search%23Using_jstack_or_jmap_or_other_similar_tools_to_view_logs https://grafana.wikimedia.org/d/000000462/elasticsearch-memory?orgId=1&var-exported_cluster=cloudelastic-chi-eqiad&var-instance=cloudelastic1003&panelId=37
[18:24:13] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (DB111) > Was it was 22:xx GMT when you made that request? Because the date response header says the response was generated at...
[18:27:02] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (Count_Count) >> If it happens again, it might be a good idea to check the timestamps and if the same cache machine is involve...
[18:29:12] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (DB111) https://de.wikipedia.org/wiki/Wikipedia:Technik/Werkstatt I don't see the new section after our discussion.
[18:29:49] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (Count_Count) > date: Sat, 04 Apr 2020 10:27:00 GMT > last-modified: Sat, 04 Apr 2020 09:46:05 GMT > server: mw1369.eqiad.wmne...
[18:33:25] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (DB111) Now (18:31 UTC); age: 29053 date: Sat, 04 Apr 2020 10:27:00 GMT last-modified: Sat, 04 Apr 2020 09:46:05 GMT server:...
[18:39:29] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (Count_Count) Interesting, this is what I get right now: ` $ while [ 1 ]; do curl --verbose https://de.wikipedia.org/wiki/Wiki...
[18:46:30] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3062 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[18:51:54] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3062 is OK: HTTP OK: HTTP/1.0 200 OK - 22408 bytes in 0.261 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[18:53:49] Operations, Traffic, Wikimedia-General-or-Unknown: Different age of history logged in and out when from the EU (but not SF office) - https://phabricator.wikimedia.org/T246185 (CDanis)
[18:53:54] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 2 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (CDanis)
[18:54:38] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 2 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (CDanis) p:Triage→High
[19:29:19] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 3 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (Krinkle)
[20:04:44] PROBLEM - PHP opcache health on mw2308 is CRITICAL: CRITICAL: opcache cache-hit ratio is below 99.85% https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[20:19:36] RECOVERY - PHP opcache health on mw2308 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[20:20:09] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 3 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (Urbanecm) @CDanis @BBlack T169894 is likely the same (or similar) issue. https://en...
[20:21:33] Urbanecm: mind a PN
[20:21:35] PM?
[20:21:44] RhinosF1: sure, PM me
[20:57:55] Operations, MediaWiki-Cache, Page Content Service, Product-Infrastructure-Team-Backlog, and 3 others: esams cache_text cluster consistently backlogged on purge requests - https://phabricator.wikimedia.org/T249325 (CDanis) Thanks. The reports from 2017 are cannot be due to the same cause; however...
[21:45:36] (PS4) BryanDavis: Fix partial rename of "type" parameter to "wstype" [software/tools-webservice] - https://gerrit.wikimedia.org/r/585846 (https://phabricator.wikimedia.org/T249390)
[21:45:38] (PS1) BryanDavis: Yet another package rename mega patch [software/tools-webservice] - https://gerrit.wikimedia.org/r/585963 (https://phabricator.wikimedia.org/T249079)
[22:42:40] PROBLEM - Debian mirror in sync with upstream on sodium is CRITICAL: /srv/mirrors/debian is over 14 hours old. https://wikitech.wikimedia.org/wiki/Mirrors
[22:58:36] Operations, MediaWiki-General, Traffic: Requests with utf-8 in the URL return a outdated page revision - https://phabricator.wikimedia.org/T23027 (Krinkle) Open→Resolved a:tstarling Since 2013, we have VCL code in place that normalises these characters in Wikimedia's caching infrastructur...
[22:59:00] Operations, MediaWiki-General, Traffic: Requests with utf-8 in the URL return a outdated page revision - https://phabricator.wikimedia.org/T23027 (Krinkle)
[23:01:54] Operations, MediaWiki-ResourceLoader, Performance-Team, Traffic: Support ESI for ResourceLoader - https://phabricator.wikimedia.org/T78963 (Krinkle) Open→Declined Obsolete per 96fc60533fb5 / .
[23:07:56] Operations, Domains, Traffic, WMF-Legal: wikipedia.lol - https://phabricator.wikimedia.org/T88861 (Krinkle) Open→Resolved a:Dzahn The domain is registered to WMF and parked in DNS. There is no need for a redirect at this time as it is new and highly unlikely to be accidentally used by...
[23:10:13] Operations, RESTBase, Traffic, wikitech.wikimedia.org, Services (later): Fix RESTBase support for wikitech.wikimedia.org - https://phabricator.wikimedia.org/T102178 (Krinkle) Open→Declined Closing in favour of T237773. There is no particular need for it on its own, but the inconsisten...
[23:31:06] Operations, Traffic: Configure varnish to use "Unconfigured domain" page for 404 Not Served (instead of generic error) - https://phabricator.wikimedia.org/T112316 (Krinkle) @ema Would it be possible (and acceptable) to add a condition somewhere in VCL that for the built-in error of `Unconfigured domain`...
[23:31:22] Operations, Traffic: Configure varnish to use "Unconfigured domain" page for 404 Not Served (instead of generic error) - https://phabricator.wikimedia.org/T112316 (Krinkle) a:Krinkle
[23:44:51] (PS1) Krinkle: apache: restore redirect from stats.wikipedia.org [puppet] - https://gerrit.wikimedia.org/r/585968 (https://phabricator.wikimedia.org/T126281)