[00:00:34] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Script wmf-auto-reimage was launched by dzahn on cumin1001.eqia...
[00:06:08] <wikibugs>	 SRE: upgrade ping offload servers to bullseye (was: ping servers running out of disk) - https://phabricator.wikimedia.org/T273509 (Dzahn) Open→Stalled p:Triage→Low
[00:13:52] <logmsgbot>	 !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on mw1366.eqiad.wmnet with reason: REIMAGE
[00:13:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:15:50] <logmsgbot>	 !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1366.eqiad.wmnet with reason: REIMAGE
[00:15:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:17:29] <logmsgbot>	 !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on mw2265.codfw.wmnet with reason: REIMAGE
[00:17:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:17:37] <wikibugs>	 (PS1) Dzahn: wmcs::monitoring: replace hiera inside hiera with lookup [puppet] - https://gerrit.wikimedia.org/r/662026 (https://phabricator.wikimedia.org/T209953)
[00:19:28] <logmsgbot>	 !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2265.codfw.wmnet with reason: REIMAGE
[00:19:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:23:38] <logmsgbot>	 !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on mw1319.eqiad.wmnet with reason: REIMAGE
[00:23:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:25:48] <logmsgbot>	 !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1319.eqiad.wmnet with reason: REIMAGE
[00:25:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:28:56] <logmsgbot>	 !log dzahn@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on mw1313.eqiad.wmnet with reason: REIMAGE
[00:28:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:30:56] <logmsgbot>	 !log dzahn@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1313.eqiad.wmnet with reason: REIMAGE
[00:30:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:31:41] <wikibugs>	 (CR) Dzahn: [C: +1] ldap: Migrate hiera() to lookup() [puppet] - https://gerrit.wikimedia.org/r/661916 (https://phabricator.wikimedia.org/T209953) (owner: Ladsgroup)
[00:33:55] <wikibugs>	 (CR) Dzahn: mailman3: Start apache2 for web (1 comment) [puppet] - https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: Ladsgroup)
[00:40:34] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1366.eqiad.wmnet'] `  an...
[00:43:12] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw2265.codfw.wmnet'] `  an...
[00:46:28] <wikibugs>	 (PS1) Dzahn: profile::rsyslog::udp_json_logback_compat: hiera -> lookup [puppet] - https://gerrit.wikimedia.org/r/662033 (https://phabricator.wikimedia.org/T209953)
[00:46:28] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw2265.codfw.wmnet
[00:46:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:46:51] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1366.eqiad.wmnet
[00:46:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:55:58] <wikibugs>	 (CR) Bstorm: [C: +1] "Looks safe enough to me" [puppet] - https://gerrit.wikimedia.org/r/661916 (https://phabricator.wikimedia.org/T209953) (owner: Ladsgroup)
[00:57:20] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1366.eqiad.wmnet
[00:57:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:57:36] <icinga-wm>	 PROBLEM - Postgres Replication Lag on maps2010 is CRITICAL: POSTGRES_HOT_STANDBY_DELAY CRITICAL: DB template1 (host:localhost) 2559898896 and 190 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[01:00:36] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw2265.codfw.wmnet
[01:00:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:04:40] <icinga-wm>	 RECOVERY - Postgres Replication Lag on maps2010 is OK: POSTGRES_HOT_STANDBY_DELAY OK: DB template1 (host:localhost) 1152 and 98 seconds https://wikitech.wikimedia.org/wiki/Postgres%23Monitoring
[01:05:29] <wikibugs>	 (PS1) Dzahn: gerrit: replace certbot cron for cloud with systemd timer [puppet] - https://gerrit.wikimedia.org/r/662035 (https://phabricator.wikimedia.org/T273673)
[01:05:31] <wikibugs>	 (PS1) Dzahn: gerrit: remove code that absented cron [puppet] - https://gerrit.wikimedia.org/r/662036 (https://phabricator.wikimedia.org/T273673)
[01:07:04] <wikibugs>	 (CR) jerkins-bot: [V: -1] gerrit: replace certbot cron for cloud with systemd timer [puppet] - https://gerrit.wikimedia.org/r/662035 (https://phabricator.wikimedia.org/T273673) (owner: Dzahn)
[01:07:18] <wikibugs>	 (CR) jerkins-bot: [V: -1] gerrit: remove code that absented cron [puppet] - https://gerrit.wikimedia.org/r/662036 (https://phabricator.wikimedia.org/T273673) (owner: Dzahn)
[01:13:02] <icinga-wm>	 RECOVERY - Kafka Broker Replica Max Lag on kafka-jumbo1008 is OK: (C)5e+06 ge (W)1e+06 ge 6.799e+05 https://wikitech.wikimedia.org/wiki/Kafka/Administration https://grafana.wikimedia.org/dashboard/db/kafka?panelId=16&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1008
[01:13:24] <icinga-wm>	 RECOVERY - Kafka Broker Replica Max Lag on kafka-jumbo1007 is OK: (C)5e+06 ge (W)1e+06 ge 5.748e+05 https://wikitech.wikimedia.org/wiki/Kafka/Administration https://grafana.wikimedia.org/dashboard/db/kafka?panelId=16&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=kafka-jumbo1007
[01:16:24] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1319.eqiad.wmnet'] `  an...
[01:21:20] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['mw1313.eqiad.wmnet'] `  an...
[01:21:34] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn)
[01:22:16] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn)
[01:23:24] <wikibugs>	 SRE, serviceops, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, User-jijiki: Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) p:Medium→High
[01:25:35] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1313.eqiad.wmnet
[01:25:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:25:53] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=no; selector: name=mw1319.eqiad.wmnet
[01:25:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:29:10] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1313.eqiad.wmnet
[01:29:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:30:05] <logmsgbot>	 !log dzahn@cumin1001 conftool action : set/pooled=yes; selector: name=mw1319.eqiad.wmnet
[01:30:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:44:29] <wikibugs>	 Puppet, SRE, puppet-compiler, Patch-For-Review, User-jbond: replace all puppet crons with systemd timers - https://phabricator.wikimedia.org/T273673 (Dzahn) a:jbond→None
[02:15:59] <icinga-wm>	 PROBLEM - Check systemd state on ms-be2055 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:18:49] <icinga-wm>	 PROBLEM - OSPF status on cr2-eqord is CRITICAL: OSPFv2: 2/3 UP : OSPFv3: 2/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[02:20:37] <icinga-wm>	 PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 133, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:20:47] <icinga-wm>	 RECOVERY - Check systemd state on ms-be2055 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:39:04] <wikibugs>	 (PS1) Dzahn: site: add mwdebug servers on buster [puppet] - https://gerrit.wikimedia.org/r/662037 (https://phabricator.wikimedia.org/T274023)
[02:53:08] <wikibugs>	 (PS1) Dzahn: trafficserver: add new debug servers to debug routing [puppet] - https://gerrit.wikimedia.org/r/662038 (https://phabricator.wikimedia.org/T274023)
[02:53:41] <wikibugs>	 (CR) Dzahn: [C: -2] "stalled but upcoming" [puppet] - https://gerrit.wikimedia.org/r/662038 (https://phabricator.wikimedia.org/T274023) (owner: Dzahn)
[02:55:01] <wikibugs>	 (CR) Dzahn: [C: -2] "3 and 4 will be buster, 1 and 2 are stretch, until we want to delete them" [puppet] - https://gerrit.wikimedia.org/r/662038 (https://phabricator.wikimedia.org/T274023) (owner: Dzahn)
[02:56:21] <wikibugs>	 (PS2) Dzahn: site: add mwdebug servers on buster [puppet] - https://gerrit.wikimedia.org/r/662037 (https://phabricator.wikimedia.org/T274023)
[02:58:41] <wikibugs>	 (PS3) Dzahn: site: add mwdebug servers on buster [puppet] - https://gerrit.wikimedia.org/r/662037 (https://phabricator.wikimedia.org/T274023)
[02:59:56] <wikibugs>	 (PS4) Dzahn: site: add mwdebug servers on buster [puppet] - https://gerrit.wikimedia.org/r/662037 (https://phabricator.wikimedia.org/T274023)
[03:22:47] <icinga-wm>	 PROBLEM - WDQS SPARQL on wdqs1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[03:29:07] <wikibugs>	 (PS5) Dzahn: site: add mwdebug servers on buster [puppet] - https://gerrit.wikimedia.org/r/662037 (https://phabricator.wikimedia.org/T274023)
[03:32:51] <icinga-wm>	 RECOVERY - WDQS SPARQL on wdqs1013 is OK: HTTP OK: HTTP/1.1 200 OK - 688 bytes in 1.065 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[03:34:43] <icinga-wm>	 PROBLEM - Disk space on wdqs1009 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=98%): /tmp 0 MB (0% inode=98%): /var/tmp 0 MB (0% inode=98%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=wdqs1009&var-datasource=eqiad+prometheus/ops
[03:40:40] <ryankemper>	 !log Deleted dump taking up diskspace on `wdqs1009`, disk space warning will resolve now
[03:40:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:43:53] <icinga-wm>	 PROBLEM - Check systemd state on wdqs1009 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:51:21] <icinga-wm>	 RECOVERY - Check systemd state on wdqs1009 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:56:11] <icinga-wm>	 RECOVERY - Disk space on wdqs1009 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=wdqs1009&var-datasource=eqiad+prometheus/ops
[03:58:15] <icinga-wm>	 PROBLEM - Host wdqs1013 is DOWN: PING CRITICAL - Packet loss = 100%
[03:58:37] <icinga-wm>	 RECOVERY - Host wdqs1013 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms
[05:23:09] <icinga-wm>	 RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 135, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[05:23:49] <icinga-wm>	 RECOVERY - OSPF status on cr2-eqord is OK: OSPFv2: 3/3 UP : OSPFv3: 3/3 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[08:00:04] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210206T0800)
[08:47:02] <wikibugs>	 (PS4) Elukey: Set Apache Bigtop 1.5 as default hadoop distro [puppet] - https://gerrit.wikimedia.org/r/661974 (https://phabricator.wikimedia.org/T273711)
[08:48:55] <wikibugs>	 (PS1) Elukey: Set Bigtop 1.5 for Druid and Hue test hosts [puppet] - https://gerrit.wikimedia.org/r/662040
[08:49:37] <wikibugs>	 (CR) Elukey: [C: +2] Set Bigtop 1.5 for Druid and Hue test hosts [puppet] - https://gerrit.wikimedia.org/r/662040 (owner: Elukey)
[08:52:14] <logmsgbot>	 !log elukey@cumin1001 START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001
[08:52:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:52:57] <logmsgbot>	 !log elukey@cumin1001 END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001
[08:52:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:58:57] <logmsgbot>	 !log elukey@cumin1001 START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001
[08:58:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:59:11] <logmsgbot>	 !log elukey@cumin1001 END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001
[08:59:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:03:28] <wikibugs>	 (PS5) Elukey: Set Apache Bigtop 1.5 as default hadoop distro [puppet] - https://gerrit.wikimedia.org/r/661974 (https://phabricator.wikimedia.org/T273711)
[09:08:55] <wikibugs>	 (CR) Elukey: [V: +1] "PCC SUCCESS (DIFF 17): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/27891/console" [puppet] - https://gerrit.wikimedia.org/r/661974 (https://phabricator.wikimedia.org/T273711) (owner: Elukey)
[09:17:43] <wikibugs>	 (PS6) Elukey: Set Apache Bigtop 1.5 as default hadoop distro [puppet] - https://gerrit.wikimedia.org/r/661974 (https://phabricator.wikimedia.org/T273711)
[09:22:37] <wikibugs>	 (CR) Elukey: [V: +1] "PCC SUCCESS (DIFF 17): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/27892/console" [puppet] - https://gerrit.wikimedia.org/r/661974 (https://phabricator.wikimedia.org/T273711) (owner: Elukey)
[12:46:24] <icinga-wm>	 PROBLEM - WDQS SPARQL on wdqs1005 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook
[13:21:58] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp5004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:22:37] <icinga-wm>	 PROBLEM - ATS TLS has reduced HTTP availability #page on alert1001 is CRITICAL: cluster=cache_upload layer=tls https://wikitech.wikimedia.org/wiki/Cache_TLS_termination https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=13&fullscreen&refresh=1m&orgId=1
[13:24:20] <godog>	 here, checking
[13:24:46] <godog>	 moving to _security 
[13:25:10] <icinga-wm>	 PROBLEM - Maps edge eqsin on upload-lb.eqsin.wikimedia.org is CRITICAL: /osm-intl/info.json (tile service info for osm-intl) timed out before a response was received: /private-info/info.json (private tile service info for osm-intl) timed out before a response was received: /v4/marker/pin-m-fuel+ffffff.png (Untitled test) timed out before a response was received: /v4/marker/pin-m+ffffff.png (Untitled test) timed out before a respo
[13:25:10] <icinga-wm>	  /v4/marker/pin-m+ffffff@2x.png (Untitled test) timed out before a response was received https://wikitech.wikimedia.org/wiki/Maps/RunBook
[13:27:10] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp5006 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:27:26] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:28:46] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp5003 is CRITICAL: HTTP CRITICAL - No data received from host https://wikitech.wikimedia.org/wiki/Varnish
[13:29:54] <icinga-wm>	 RECOVERY - Maps edge eqsin on upload-lb.eqsin.wikimedia.org is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Maps/RunBook
[13:31:22] <icinga-wm>	 RECOVERY - Router interfaces on cr1-codfw is OK: OK: host 208.80.153.192, interfaces up: 134, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[13:31:26] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3120 on cp5006 is OK: HTTP OK: HTTP/1.1 200 OK - 413 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:33:04] <icinga-wm>	 RECOVERY - Router interfaces on cr1-eqiad is OK: OK: host 208.80.154.196, interfaces up: 241, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[13:39:22] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3121 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:39:46] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 80 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:40:08] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3126 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:40:17] <vgutierrez>	 cp5002 noise is expected :)
[13:41:12] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3123 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:41:16] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3122 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:41:42] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3125 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:41:58] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3124 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:43:12] <icinga-wm>	 PROBLEM - Varnish HTTP upload-frontend - port 3127 on cp5002 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Varnish
[13:45:04] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3120 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.451 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:45:28] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3123 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:45:34] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3122 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.451 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:45:58] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3125 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.451 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:46:18] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3124 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.601 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:47:32] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3127 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:48:12] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3121 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.589 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:48:34] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 80 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 410 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:48:56] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3126 on cp5002 is OK: HTTP OK: HTTP/1.1 200 OK - 409 bytes in 0.484 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:55:12] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3120 on cp5003 is OK: HTTP OK: HTTP/1.1 200 OK - 411 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Varnish
[13:59:17] <icinga-wm>	 RECOVERY - ATS TLS has reduced HTTP availability #page on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Cache_TLS_termination https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=13&fullscreen&refresh=1m&orgId=1
[14:01:48] <icinga-wm>	 RECOVERY - Varnish HTTP upload-frontend - port 3120 on cp5004 is OK: HTTP OK: HTTP/1.1 200 OK - 411 bytes in 0.580 second response time https://wikitech.wikimedia.org/wiki/Varnish
[14:27:35] <wikibugs>	 SRE, Traffic: Investigate unusual media traffic pattern for AsterNovi-belgii-flower-1mb.jpg on Commons - https://phabricator.wikimedia.org/T273741 (jcrespo)
[14:41:18] <icinga-wm>	 PROBLEM - SSH on mw2217.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[15:42:34] <icinga-wm>	 RECOVERY - SSH on mw2217.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[15:47:50] <tabbycat>	 Amir1: normal renaming still okay?
[15:49:07] <Amir1>	 tabbycat: it should be slowed down but not a big deal
[15:49:34] <tabbycat>	 ack
[16:54:39] <wikibugs>	 (CR) Ladsgroup: mailman3: Start apache2 for web (1 comment) [puppet] - https://gerrit.wikimedia.org/r/657950 (https://phabricator.wikimedia.org/T256542) (owner: Ladsgroup)
[19:21:12] <wikibugs>	 (CR) Gergő Tisza: [C: +1] Set wgGEHelpPanelAskMentor to true for several wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/661448 (https://phabricator.wikimedia.org/T272753) (owner: Urbanecm)
[19:30:09] <legoktm>	 Daimona: very well done
[19:31:52] <legoktm>	 I'm debugging on mwdebug1003 right now
[19:39:06] <Daimona>	 legoktm: what exactly? :D
[19:42:35] <Reedy>	 Daimona: I think you're being nerd sniped ;pP
[19:42:53] <Reedy>	 I'm presuming lego is benchmarking your patch
[19:42:54] <Daimona>	 Oh, I'm an easy target
[19:43:10] <Daimona>	 Ahhh that patch, right :D It was fun
[19:44:20] <legoktm>	 Daimona: I posted my results on your patch, it seems slower to me
[19:44:31] <Daimona>	 Hmmmm I see
[19:45:25] <Daimona>	 I need to investigate why
[19:47:12] <legoktm>	 Daimona: in general, function calls are slow, so I think just calling ord() has enough overhead that it might not be able to beat string index access
[19:47:21] <Reedy>	 heh
[19:47:42] <Daimona>	 That's correct, I was wondering if it would compensate the cast and empty check though
[19:50:09] <Daimona>	 Note, I'm using a poor man's script. But I'm still getting the same result locally.
[19:50:51] <legoktm>	 that ord is faster?
[19:51:02] <Daimona>	 Yes
[19:51:04] <legoktm>	 what PHP version are you using btw? I'm on 7.4
[19:51:08] <Daimona>	 7.4.1
[19:51:44] <legoktm>	 I doubt there's a significant difference between 7.4.1 and 7.4.15
[19:51:47] <Daimona>	 Using hrtime on 500000000 iterations, it's 19201ms vs 15283ms. I should try using Benchmarker
[19:53:08] <Daimona>	 It is indeed slower if you remove the string cast and empty check
[19:53:32] <Daimona>	 But consistently faster otherwise
[19:53:38] <legoktm>	 hmm
[19:53:51] <Daimona>	 Wait
[19:54:07] <Daimona>	 I was looking at your script, but it's not using the actual code, right?
[19:54:41] <legoktm>	 no, it just copies in the code
[19:55:14] <Daimona>	 The "new" version uses an if-else instead of a ternary, and the "no safety" obviously doesn't have the empty check
[19:56:17] <Daimona>	 (brb in a while, you've successfully nerd-sniped me \o/)
[19:56:19] <legoktm>	 let me switch "new" to use && - and I'm ignoring "no safety"
[19:56:22] <legoktm>	 haha :D
[19:57:51] <legoktm>	 huh, that's wild. && is slower than if/else for me
[20:00:08] <tabbycat>	 legoktm: huh, do we have mwdebug1003 now?
[20:00:35] <legoktm>	 Yep, it's a buster host
[20:01:52] <legoktm>	 I do worry that we're spending too much time on a microoptimization with very little gain though :p
[20:21:21] <Reedy>	 lol
[20:21:33] <Reedy>	 you started it ;)
[22:12:19] <Daimona>	 legoktm: so what is the final verdict? :P
[22:13:46] <Daimona>	 I've also been thinking about another possibility, i.e. using a null coalesce with the string offset access
[22:15:42] <Daimona>	 Which seems basically as performant as ord, at least my benchmarking shows no relevant difference.
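Editorial sketch of the kind of micro-benchmark discussed above. This is NOT the code from the patch or from either participant's script, and it is written in Python rather than PHP purely for illustration. All function names are hypothetical, the choice of ":" as the character being tested is an assumption, and timings measured here say nothing about PHP 7.4. It only shows the shape of the comparison: a guarded index access, an ord()-style function call, and a form that tolerates empty input without an explicit check (standing in for the null-coalesce idea), each timed with a monotonic nanosecond clock, Python's nearest analogue of PHP's hrtime().

```python
import time

COLON = ord(":")  # code point 58; the tested character is an assumption


def first_is_colon_index(value):
    """Cast to string, guard against empty input, then compare index 0."""
    s = str(value)
    return s != "" and s[0] == ":"


def first_is_colon_ord(value):
    """Same check, but going through an extra ord() function call."""
    s = str(value)
    # Unlike PHP's ord(), Python's ord() raises on an empty string,
    # so the empty check cannot be dropped in this analogue.
    return s != "" and ord(s[0]) == COLON


def first_is_colon_slice(value):
    """Slicing never raises on an empty string, so no explicit guard is needed."""
    return str(value)[:1] == ":"


def time_ms(func, arg, iterations=200_000):
    """Wall-clock milliseconds spent on `iterations` calls of func(arg)."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        func(arg)
    return (time.perf_counter_ns() - start) / 1e6


if __name__ == "__main__":
    for fn in (first_is_colon_index, first_is_colon_ord, first_is_colon_slice):
        print(f"{fn.__name__}: {time_ms(fn, ':Example'):.1f} ms")
```

As in the conversation, the three variants should be checked for identical results on empty, non-string and ordinary inputs before their timings are compared, and each should be run several times because differences at this scale are easily swamped by noise.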
[22:25:41] <Platonides>	 may I ask for the link to the patch ?
[22:26:15] <tabbycat>	 No, you may not Platonides 
[22:26:41] <Platonides>	 ok then :P
[22:26:46] <tabbycat>	 :)
[22:28:31] <Platonides>	 I guess it's just https://gerrit.wikimedia.org/r/c/mediawiki/core/+/662076
[22:28:57] <Platonides>	 lol https://gerrit.wikimedia.org/r/q/hashtag:"faster-mw-plz"
[22:30:37] <tabbycat>	 lolol
[22:31:01] <tabbycat>	 .o( "no le pidas peras al olmo") [Spanish: "don't ask the elm tree for pears"]
[22:31:02] <Daimona>	 Yes, that one :)
[22:36:27] <Platonides>	 is "no le pidas peras al olmo" in one of those patches? O_o
[22:38:30] <tabbycat>	 Platonides: nope, that'd be for Wikimedia Labs :P
[22:40:18] <Platonides>	 that could suit a logo
[22:40:51] <Platonides>	 a pear over an Elm
[22:45:12] <tabbycat>	 BioHack LLP
[23:34:37] <wikibugs>	 (CR) Dzahn: [C: +1] "compiles on mwdebug1003 - cherry-picking and then running all the httpbb tests seems best" [puppet] - https://gerrit.wikimedia.org/r/657139 (https://phabricator.wikimedia.org/T272305) (owner: Giuseppe Lavagetto)