[00:00:31] RECOVERY - Check systemd state on an-launcher1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:02:31] !help
[03:02:31] want docs? ask for "!wm-bot". all keywords? try "@regsearch .*"
[03:02:47] !wm-bot
[03:02:47] http://meta.wikimedia.org/wiki/WM-Bot
[03:12:05] cmd
[03:14:21] Permission denied
[03:14:21] @add
[03:14:26] Permission denied
[03:14:26] @join
[03:14:42] how do i make a bot join a channel
[03:14:53] @infobot-detail
[03:27:51] !/leave #wikimedia-operations
[03:53:39] RECOVERY - PHP opcache health on mw1407 is OK: OK: opcache is healthy https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[03:59:17] PROBLEM - PHP opcache health on mw1407 is CRITICAL: CRITICAL: opcache full. https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_opcache_health
[05:27:18] !log restart elasticsearch on logstash2022
[05:27:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:52:48] <_joe_> !log depooling mw1407 again, should not be serving traffic
[05:52:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:29:13] PROBLEM - MD RAID on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[06:29:45] PROBLEM - Check systemd state on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:30:01] PROBLEM - Check size of conntrack table on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[06:30:01] PROBLEM - ores uWSGI web app on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/Services/ores
[06:30:31] PROBLEM - Check the NTP synchronisation status of timesyncd on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/NTP
[06:33:59] PROBLEM - puppet last run on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[06:36:25] PROBLEM - dhclient process on ores1002 is CRITICAL: connect to address 10.64.0.52 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_dhclient
[06:39:03] RECOVERY - Check systemd state on ores1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[06:39:21] RECOVERY - Check size of conntrack table on ores1002 is OK: OK: nf_conntrack is 0 % full https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[06:39:53] RECOVERY - puppet last run on ores1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun
[06:40:07] RECOVERY - MD RAID on ores1002 is OK: OK: Active: 4, Working: 4, Failed: 0, Spare: 0 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Hardware_Raid_Information_Gathering
[06:52:33] PROBLEM - MegaRAID on es2004 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded) https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[06:52:34] ACKNOWLEDGEMENT - MegaRAID on es2004 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T251017 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[06:52:38] Operations, ops-codfw: Degraded RAID on es2004 - https://phabricator.wikimedia.org/T251017 (ops-monitoring-bot)
[07:01:23] RECOVERY - Check the NTP synchronisation status of timesyncd on ores1002 is OK: OK: synced at Sat 2020-04-25 07:01:22 UTC. https://wikitech.wikimedia.org/wiki/NTP
[07:07:21] RECOVERY - dhclient process on ores1002 is OK: PROCS OK: 0 processes with command name dhclient https://wikitech.wikimedia.org/wiki/Monitoring/check_dhclient
[09:19:15] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[09:26:45] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3056 is OK: HTTP OK: HTTP/1.0 200 OK - 22748 bytes in 7.497 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[10:23:02] !log going to restart and probably depool for a short time wdqs1005 as it is in a deadlock T242453
[10:23:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:23:10] T242453: Deadlock in blazegraph blocking all queries and updates - https://phabricator.wikimedia.org/T242453
[10:25:17] urgfff
[10:43:13] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[10:46:49] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3056 is OK: HTTP OK: HTTP/1.0 200 OK - 22748 bytes in 0.271 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[12:26:29] Any wikitech admins available? https://wikitech.wikimedia.org/wiki/Incident_documentation/20200425-Satoshi-Nakamoto_fortune looks like spam (and the username of the creator represents someone well known too)
[13:10:12] blocke
[13:10:15] d
[14:50:19] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3052 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[14:57:39] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3052 is OK: HTTP OK: HTTP/1.0 200 OK - 22729 bytes in 0.256 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[15:04:15] PROBLEM - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3062 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[15:13:33] RECOVERY - Ensure traffic_exporter binds on port 9322 and responds to HTTP requests on cp3062 is OK: HTTP OK: HTTP/1.0 200 OK - 22748 bytes in 0.258 second response time https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server
[15:49:01] (PS1) Zoranzoki21: Enable visualeditor on srwiki by default [mediawiki-config] - https://gerrit.wikimedia.org/r/592455 (https://phabricator.wikimedia.org/T250878)
[15:49:51] (PS2) Zoranzoki21: Enable visualeditor on srwiki by default [mediawiki-config] - https://gerrit.wikimedia.org/r/592455 (https://phabricator.wikimedia.org/T250878)
[19:25:35] (CR) Urbanecm: [C: +1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/592455 (https://phabricator.wikimedia.org/T250878) (owner: Zoranzoki21)
[19:31:02] (CR) Urbanecm: [C: +1] "LGTM." [mediawiki-config] - https://gerrit.wikimedia.org/r/592427 (https://phabricator.wikimedia.org/T250419) (owner: QEDK)
[19:36:32] (CR) Urbanecm: [C: -1] "Reapplying my -1. If you think this should be deployed, go ahead – -1 doesn't block a patch from being merged." (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/591416 (owner: Esanders)
[19:41:11] (CR) Urbanecm: [C: +1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/592306 (https://phabricator.wikimedia.org/T250903) (owner: Zoranzoki21)
[19:55:40] Puppet, Wikimedia Meet: Puppetize the account manager - https://phabricator.wikimedia.org/T251034 (Ladsgroup)
[20:19:51] (CR) Esanders: VisualEditor: Allow external link paste on officewiki (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/591416 (owner: Esanders)
[20:56:49] (CR) Busecolak: [C: +1] Refactor the exporter to support metrics specs via config file (8 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[21:24:02] (CR) Busecolak: [C: +1] Refactor the exporter to support metrics specs via config file (3 comments) [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)
[22:07:27] (CR) Busecolak: [C: -1] Refactor the exporter to support metrics specs via config file [software/druid_exporter] - https://gerrit.wikimedia.org/r/592261 (owner: Elukey)