[00:32:43] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 35.71% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[00:39:04] Operations, Puppet: Stop forcing RUNNER=php for foreachwiki/foreachwikiindblist - https://phabricator.wikimedia.org/T230110 (Dzahn) Here is where i switched that to php7.2 to ensure all crons using foreachwikiindblist are actually using 7.2 and we don't just think it does. (cc: @jijiki ) https://gerrit....
[00:39:16] Operations, MediaWiki-Maintenance-scripts: Stop forcing RUNNER=php for foreachwiki/foreachwikiindblist - https://phabricator.wikimedia.org/T230110 (Dzahn)
[00:40:30] Operations, MediaWiki-Maintenance-scripts: Stop forcing RUNNER=php for foreachwiki/foreachwikiindblist - https://phabricator.wikimedia.org/T230110 (Dzahn) also see https://gerrit.wikimedia.org/r/c/operations/puppet/+/425027 though which might or might not make this obsolete
[00:40:41] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[00:43:25] Operations, MediaWiki-Maintenance-scripts, serviceops: Stop forcing RUNNER=php for foreachwiki/foreachwikiindblist - https://phabricator.wikimedia.org/T230110 (Dzahn)
[00:47:19] !log mwmaint - running cirrus sanitize jobs maintenance cron
[00:47:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:52:51] !log mwmaint - running update_flaggedrevs_stats - updates the flagged revs statistics table on each wiki
[00:52:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:07:55] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[01:14:19] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[01:22:11] (CR) Dzahn: "https://meta.wikimedia.org/wiki/Meta_talk:Babylon/Translation_stats" [puppet] - https://gerrit.wikimedia.org/r/529437 (https://phabricator.wikimedia.org/T195392) (owner: Dzahn)
[01:27:56] (PS1) Dzahn: scap: set mwscriptwikiset to use PHP7.2 [puppet] - https://gerrit.wikimedia.org/r/529459 (https://phabricator.wikimedia.org/T195392)
[01:35:40] (CR) Huji: [C: +1] Rename globals and rights in AbuseFilter config [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[01:42:11] (CR) Dzahn: [C: +2] scap: set mwscriptwikiset to use PHP7.2 [puppet] - https://gerrit.wikimedia.org/r/529459 (https://phabricator.wikimedia.org/T195392) (owner: Dzahn)
[01:49:11] !log mwmaint - running (1 of 8, the one for en) refreshLinks maintenance cron manually to verify it works after switching mwscriptwikiset to PHP7.2 (T195392)
[01:49:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:49:20] T195392: Switch cronjobs on maintenance hosts to PHP7 - https://phabricator.wikimedia.org/T195392
[02:01:25] RECOVERY - Router interfaces on cr2-knams is OK: OK: host 91.198.174.246, interfaces up: 59, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[02:05:42] (CR) Subramanya Sastry: [C: -1] "I can get rid of the exception logging. We can figure out access to crashers some other way." [mediawiki-config] - https://gerrit.wikimedia.org/r/529173 (https://phabricator.wikimedia.org/T228069) (owner: Subramanya Sastry)
[02:11:59] Operations, MediaWiki-extensions-Mailgun, cloud-services-team, serviceops, and 5 others: Switch cronjobs on maintenance hosts to PHP7 - https://phabricator.wikimedia.org/T195392 (Dzahn)
[02:20:36] Operations, MediaWiki-extensions-Mailgun, cloud-services-team, serviceops, and 5 others: Switch cronjobs on maintenance hosts to PHP7 - https://phabricator.wikimedia.org/T195392 (Peachey88)
[02:22:18] Operations, MediaWiki-extensions-Mailgun, cloud-services-team, serviceops, and 5 others: Switch cronjobs on maintenance hosts to PHP7 - https://phabricator.wikimedia.org/T195392 (Dzahn) @jijiki This is good to go now from my point of view. ---- 1:49 mutante: mwmaint - running (1 of 8, the one f...
[02:25:21] (CR) Dzahn: "it's fine with me to go ahead now. i checked basically everything manually https://phabricator.wikimedia.org/T195392#5406586 (at least one" [puppet] - https://gerrit.wikimedia.org/r/425027 (https://phabricator.wikimedia.org/T195392) (owner: Giuseppe Lavagetto)
[02:29:44] (CR) Dzahn: "keep in mind that we did not change that a bunch of jobs are sending everything, including errors, to /dev/null, so keeping an eye on it i" [puppet] - https://gerrit.wikimedia.org/r/425027 (https://phabricator.wikimedia.org/T195392) (owner: Giuseppe Lavagetto)
[02:32:34] (CR) Dzahn: "you can also find the list of changes to revert on my latest ticket comment .. for the cleanup step afterwards" [puppet] - https://gerrit.wikimedia.org/r/425027 (https://phabricator.wikimedia.org/T195392) (owner: Giuseppe Lavagetto)
[02:39:31] (PS2) Dzahn: parsoid::testing: remove parameter use_parsoid_php again [puppet] - https://gerrit.wikimedia.org/r/529400 (https://phabricator.wikimedia.org/T228069)
[02:46:25] (CR) Bartosz Dziewoński: "What needs to happen for this to be deployed?" [puppet] - https://gerrit.wikimedia.org/r/523952 (https://phabricator.wikimedia.org/T228225) (owner: Aklapper)
[02:48:03] (PS3) Dzahn: parsoid::testing: remove parameter use_parsoid_php again [puppet] - https://gerrit.wikimedia.org/r/529400 (https://phabricator.wikimedia.org/T228069)
[02:53:51] (CR) Dzahn: [C: +2] parsoid::testing: remove parameter use_parsoid_php again [puppet] - https://gerrit.wikimedia.org/r/529400 (https://phabricator.wikimedia.org/T228069) (owner: Dzahn)
[02:54:05] (PS4) Dzahn: parsoid::testing: remove parameter use_parsoid_php again [puppet] - https://gerrit.wikimedia.org/r/529400 (https://phabricator.wikimedia.org/T228069)
[03:08:28] (CR) Dzahn: [C: +2] Phab: Allow viewing ogg video files inline (instead of downloading) [puppet] - https://gerrit.wikimedia.org/r/523952 (https://phabricator.wikimedia.org/T228225) (owner: Aklapper)
[03:09:16] (PS2) Dzahn: Phab: Allow viewing ogg video files inline (instead of downloading) [puppet] - https://gerrit.wikimedia.org/r/523952 (https://phabricator.wikimedia.org/T228225) (owner: Aklapper)
[03:14:13] (CR) Dzahn: "> Patch Set 1:" [puppet] - https://gerrit.wikimedia.org/r/523952 (https://phabricator.wikimedia.org/T228225) (owner: Aklapper)
[08:58:16] (PS2) Fomafix: Add 'cmn' as alias for 'zh' [puppet] - https://gerrit.wikimedia.org/r/528835 (https://phabricator.wikimedia.org/T23915)
[09:13:14] (PS1) Jc86035: Add some HIDPI Wikivoyage logos [mediawiki-config] - https://gerrit.wikimedia.org/r/529464 (https://phabricator.wikimedia.org/T230114)
[10:40:06] (CR) Bartosz Dziewoński: "Thanks!" [puppet] - https://gerrit.wikimedia.org/r/523952 (https://phabricator.wikimedia.org/T228225) (owner: Aklapper)
[12:26:23] (PS2) Fomafix: Add redirects for https://nan.wik{tionary,iquote,ibooks,isource}.org [puppet] - https://gerrit.wikimedia.org/r/529418 (https://phabricator.wikimedia.org/T86915)
[13:00:26] Operations, Domains, Traffic, WMF-Legal, Patch-For-Review: Move wikimedia.ee under WM-EE - https://phabricator.wikimedia.org/T204056 (Aklapper) p:High→Normal I assume that Legal is evaluating (and that some folks might be on summer vacation). There is contact information for Legal on...
[14:03:04] (PS8) Daimona Eaytoy: Rename globals and rights in AbuseFilter config [mediawiki-config] - https://gerrit.wikimedia.org/r/480074
[14:03:25] (CR) Daimona Eaytoy: "We need SWAT for this, right?" [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:14:30] (CR) Urbanecm: [C: +1] "LGTM, thank you Daimona." [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:15:13] (CR) Urbanecm: [C: +1] "> Patch Set 8:" [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:16:59] (CR) Daimona Eaytoy: "> > Patch Set 8:" [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:17:21] (CR) Daimona Eaytoy: "> > > Patch Set 8:" [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:20:48] (CR) Urbanecm: [C: +1] "> Patch Set 8:" [mediawiki-config] - https://gerrit.wikimedia.org/r/480074 (owner: Daimona Eaytoy)
[14:41:01] (CR) Arlolra: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[14:44:12] (CR) Subramanya Sastry: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[14:48:13] (CR) Subramanya Sastry: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[14:50:33] (PS1) Arlolra: parsoid::testing: remove move parameter use_parsoid_php [puppet] - https://gerrit.wikimedia.org/r/529470
[14:51:33] (PS2) Arlolra: parsoid::testing: remove more parameter use_parsoid_php [puppet] - https://gerrit.wikimedia.org/r/529470
[14:53:04] (CR) Subramanya Sastry: [C: +1] parsoid::testing: remove more parameter use_parsoid_php [puppet] - https://gerrit.wikimedia.org/r/529470 (owner: Arlolra)
[14:53:59] (CR) Arlolra: "> Patch Set 1:" [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[14:58:59] (CR) Subramanya Sastry: "> Patch Set 1:" [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[15:12:35] (CR) Arlolra: "> Patch Set 1:" [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[17:22:19] PROBLEM - Disk space on ms-be2021 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdc1 is not accessible: Input/output error https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=ms-be2021&var-datasource=codfw+prometheus/ops
[17:36:57] PROBLEM - puppet last run on ms-be2021 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[parted-/dev/sdc] https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[17:39:53] (PS2) Subramanya Sastry: Update parsoid-rt-client.config.js.erb to fetch test ids from a function [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166)
[17:43:13] PROBLEM - HP RAID on ms-be2021 is CRITICAL: CRITICAL: Slot 3: Failed: 1I:1:1 - OK: 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, 2I:4:1, 2I:4:2 - Controller: OK - Battery/Capacitor: OK https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[17:43:15] ACKNOWLEDGEMENT - HP RAID on ms-be2021 is CRITICAL: CRITICAL: Slot 3: Failed: 1I:1:1 - OK: 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4, 2I:4:1, 2I:4:2 - Controller: OK - Battery/Capacitor: OK nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T230275 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[17:43:19] Operations, ops-codfw: Degraded RAID on ms-be2021 - https://phabricator.wikimedia.org/T230275 (ops-monitoring-bot)
[18:02:27] PROBLEM - Device not healthy -SMART- on ms-be2021 is CRITICAL: cluster=swift device=cciss,13 instance=ms-be2021:9100 job=node site=codfw https://wikitech.wikimedia.org/wiki/SMART%23Alerts https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=ms-be2021&var-datasource=codfw+prometheus/ops
[18:25:59] (CR) Arlolra: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[18:28:35] (CR) Subramanya Sastry: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[19:37:55] PROBLEM - HHVM rendering on mw1288 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers
[19:39:21] RECOVERY - HHVM rendering on mw1288 is OK: HTTP OK: HTTP/1.1 200 OK - 75569 bytes in 0.134 second response time https://wikitech.wikimedia.org/wiki/Application_servers
[19:49:00] (CR) Arlolra: Update parsoid-rt-client.config.js.erb to fetch test ids from a function (1 comment) [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166) (owner: Subramanya Sastry)
[20:01:29] PROBLEM - Check systemd state on ms-be2032 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:09:29] RECOVERY - Check systemd state on ms-be2032 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:02:14] (PS3) Subramanya Sastry: Update parsoid-rt-client.config.js.erb to fetch test ids from a function [puppet] - https://gerrit.wikimedia.org/r/529391 (https://phabricator.wikimedia.org/T230166)
[22:57:31] (CR) Urbanecm: "recheck" [mediawiki-config] - https://gerrit.wikimedia.org/r/529175 (owner: Viztor)
[22:58:00] (CR) Urbanecm: Update HD logo for wikisource using default (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/529175 (owner: Viztor)
[23:06:39] (CR) jerkins-bot: [V: -1] Update HD logo for wikisource using default [mediawiki-config] - https://gerrit.wikimedia.org/r/529175 (owner: Viztor)
[23:21:00] (PS3) Viztor: Update HD logo for wikisource using default [mediawiki-config] - https://gerrit.wikimedia.org/r/529175
[23:23:21] (PS4) Viztor: Update HD logo for wikisource using default [mediawiki-config] - https://gerrit.wikimedia.org/r/529175
[23:43:13] PROBLEM - HHVM rendering on mw1230 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers
[23:44:39] RECOVERY - HHVM rendering on mw1230 is OK: HTTP OK: HTTP/1.1 200 OK - 75632 bytes in 0.134 second response time https://wikitech.wikimedia.org/wiki/Application_servers
[23:49:29] (PS5) Viztor: Update HD logo for wikisource using default [mediawiki-config] - https://gerrit.wikimedia.org/r/529175