[00:03:02] Operations, Deployments, Release-Engineering-Team: Enable scap to roll back broken changes to MediaWiki - https://phabricator.wikimedia.org/T225207 (Jdforrester-WMF)
[00:22:25] Operations, Diffusion, Release-Engineering-Team (Kanban), User-zeljkofilipin: Cannot connect to vcs@git-ssh.wikimedia.org (since move from phab1001 to phab1003) - https://phabricator.wikimedia.org/T224677 (Paladox) p:Triage→Unbreak! Cloning over https is also broken git clone https://pha...
[00:56:20] Operations, Diffusion, Release-Engineering-Team (Kanban), User-zeljkofilipin: Cannot connect to vcs@git-ssh.wikimedia.org (since move from phab1001 to phab1003) - https://phabricator.wikimedia.org/T224677 (Paladox) p:Unbreak!→High Turns out it was an issue on my side. I had ` [url "ssh:/...
[02:45:49] (PS1) CDanis: WIP: diff support. [software/conftool] - https://gerrit.wikimedia.org/r/515323
[04:06:16] PROBLEM - puppet last run on mw1232 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[04:33:28] RECOVERY - puppet last run on mw1232 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures
[06:31:20] PROBLEM - puppet last run on mw1283 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[06:32:34] PROBLEM - puppet last run on pollux is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/puppet-enabled]
[06:58:34] RECOVERY - puppet last run on mw1283 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:59:44] RECOVERY - puppet last run on pollux is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[07:29:02] PROBLEM - puppet last run on bast4002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[07:56:18] RECOVERY - puppet last run on bast4002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[08:04:46] (PS1) Elukey: profile::analytics::refinery::job::refine: remove transform function [puppet] - https://gerrit.wikimedia.org/r/515439 (https://phabricator.wikimedia.org/T225342)
[08:05:46] (CR) Joal: [C: +1] "Thanks a lot Luca <3" [puppet] - https://gerrit.wikimedia.org/r/515439 (https://phabricator.wikimedia.org/T225342) (owner: Elukey)
[08:10:49] (CR) Elukey: [C: +2] profile::analytics::refinery::job::refine: remove transform function [puppet] - https://gerrit.wikimedia.org/r/515439 (https://phabricator.wikimedia.org/T225342) (owner: Elukey)
[08:55:30] PROBLEM - BGP status on cr2-esams is CRITICAL: BGP CRITICAL - AS6939/IPv6: Connect https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[09:26:30] PROBLEM - mailman_queue_size on fermium is CRITICAL: CRITICAL: 1 mailman queue(s) above limits (thresholds: bounces: 25 in: 25 virgin: 25) https://wikitech.wikimedia.org/wiki/Mailman
[09:30:52] RECOVERY - mailman_queue_size on fermium is OK: OK: mailman queues are below the limits. https://wikitech.wikimedia.org/wiki/Mailman
[10:42:31] (PS1) Reedy: Last minute throttle exemption addition [mediawiki-config] - https://gerrit.wikimedia.org/r/515533 (https://phabricator.wikimedia.org/T225344)
[10:43:51] (CR) Reedy: [C: +2] Last minute throttle exemption addition [mediawiki-config] - https://gerrit.wikimedia.org/r/515533 (https://phabricator.wikimedia.org/T225344) (owner: Reedy)
[10:44:47] (Merged) jenkins-bot: Last minute throttle exemption addition [mediawiki-config] - https://gerrit.wikimedia.org/r/515533 (https://phabricator.wikimedia.org/T225344) (owner: Reedy)
[10:46:07] !log reedy@deploy1001 Synchronized wmf-config/throttle.php: T225344 (duration: 00m 51s)
[10:46:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:46:12] T225344: Account creation throttling exception for Wikitech (2019-06-08) - https://phabricator.wikimedia.org/T225344
[10:46:17] (CR) jenkins-bot: Last minute throttle exemption addition [mediawiki-config] - https://gerrit.wikimedia.org/r/515533 (https://phabricator.wikimedia.org/T225344) (owner: Reedy)
[11:26:44] Operations, Gerrit, Traffic: When downloading from git using HTTPS: HTTP 500 / GnuTLS recv error (-110) - https://phabricator.wikimedia.org/T225347 (Reedy)
[11:28:44] PROBLEM - MariaDB Slave Lag: s6 on db2097 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 807.01 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_slave
[11:58:20] !log stop swift processes on ms-be1033 - T223518
[11:58:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:58:26] T223518: ms-be1033 not powering up - https://phabricator.wikimedia.org/T223518
[12:32:00] Operations, Gerrit, Traffic: When downloading from git using HTTPS: HTTP 500 / GnuTLS recv error (-110) - https://phabricator.wikimedia.org/T225347 (BBlack) The TLS-level error is just complaining that, at the end of the transaction, the connection was aborted abruptly instead of torn down cleanly....
[12:47:31] Operations, Gerrit, Traffic: When downloading from git using HTTPS: HTTP 500 / GnuTLS recv error (-110) - https://phabricator.wikimedia.org/T225347 (Ciencia_Al_Poder) I'm just downloading a release branch as anonymous (hence https and not ssh), so it shouldn't be a problem with authentication. The on...
[14:29:11] Operations, Gerrit, Traffic: When downloading from git using HTTPS: HTTP 500 / GnuTLS recv error (-110) - https://phabricator.wikimedia.org/T225347 (Paladox) When i looked earlier at https://gerrit.wikimedia.org/r/monitoring earlier, i saw nothing that would have explained this. So i think it may ha...
[14:33:16] PROBLEM - Host lvs4007 is DOWN: PING CRITICAL - Packet loss = 100%
[15:08:28] PROBLEM - Disk space on ms-be2018 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=89%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[15:47:44] Operations, Page-Previews, Readers-Web-Backlog, Wikimedia-production-error: [Bug] TypeError in PopupsContext - https://phabricator.wikimedia.org/T225018 (Krinkle) > was never served to a user. This took me a minute to understand, but it's because the two requests that failed were from the health...
[15:47:55] Operations, Page-Previews, Readers-Web-Backlog, Wikimedia-production-error: [Bug] TypeError in PopupsContext - https://phabricator.wikimedia.org/T225018 (Krinkle)
[15:47:58] Operations, serviceops, PHP 7.2 support, Patch-For-Review, and 3 others: PHP 7 corruption during deployment (was: PHP 7 fatals on mw1262) - https://phabricator.wikimedia.org/T224491 (Krinkle)
[17:11:06] RECOVERY - BGP status on cr2-esams is OK: BGP OK - up: 414, down: 0, shutdown: 1 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[18:51:14] PROBLEM - Disk space on stat1007 is CRITICAL: DISK CRITICAL - free space: /srv 283401 MB (3% inode=98%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[19:40:20] PROBLEM - puppet last run on kafkamon1001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[20:07:34] RECOVERY - puppet last run on kafkamon1001 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures