[00:24:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.86, 5.85, 4.28
[00:26:04] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 5.11, 5.49, 4.34
[00:26:37] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fj6Zf
[00:26:39] [miraheze/puppet] paladox bb1cc98 - varnish: Blacklist a bot temporarily
[00:28:05] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.95, 5.82, 4.04
[00:30:06] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.22, 5.95, 4.31
[00:34:00] !log restarted jobrunner/jobchron on mw3
[00:34:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:34:48] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fj6ZY
[00:34:49] [miraheze/puppet] paladox 4e8393b - Update mediawiki.conf
[00:44:27] Hi! Here is the list of currently open high priority tasks on Phabricator
[00:44:34] No updates for 31 days - https://phabricator.miraheze.org/T4438 - Upgrade elasticsearch to version 6 - authored by Southparkfan, assigned to Paladox
[00:44:41] No updates for 8 days - https://phabricator.miraheze.org/T4260 - Migrate all wikis to elasticsearch - authored by Southparkfan, assigned to None
[00:47:06] !log apply Echo/db_patches/patch-drop-notification_bundle_display_hash.sql to all wikis
[00:47:12] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:48:02] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.89, 5.56, 4.48
[00:48:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.05, 5.33, 4.12
[00:48:44] !log aborted, appears update.php is doing it.
[00:48:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:49:31] [miraheze/puppet] paladox pushed 2 commits to master [+0/-0/±2] https://git.io/fj6ZR
[00:49:32] [miraheze/puppet] paladox b84cb57 - Revert "Update mediawiki.conf" This reverts commit 4e8393b4a76a5886906e56d4df4404864aa0f813.
[00:49:34] [miraheze/puppet] paladox 175af5e - Revert "varnish: Blacklist a bot temporarily" This reverts commit bb1cc98b6cfa25598e69f438fef480e956387f64.
[00:50:03] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.81, 4.84, 4.34
[00:50:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 4.62, 4.71, 4.03
[00:50:12] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fj6Zu
[00:50:13] [miraheze/services] MirahezeSSLBot 12ef5ca - BOT: Updating services config for wikis
[02:34:28] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.44, 3.17, 2.76
[02:36:24] RECOVERY - misc2 Current Load on misc2 is OK: OK - load average: 3.16, 3.14, 2.80
[02:37:20] strange
[02:42:15] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.82, 3.40, 2.99
[02:51:55] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.39, 3.89, 3.36
[02:53:51] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.26, 3.67, 3.35
[02:59:50] RECOVERY - misc2 Current Load on misc2 is OK: OK - load average: 3.22, 3.40, 3.33
[03:03:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.44, 3.80, 3.53
[03:11:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.60, 3.88, 3.60
[03:15:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.98, 4.00, 3.72
[03:21:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.35, 4.10, 3.82
[03:27:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.54, 3.99, 3.87
[03:33:50] RECOVERY - misc2 Current Load on misc2 is OK: OK - load average: 0.60, 2.20, 3.15
[05:15:53] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 13.88, 7.80, 4.57
[05:16:08] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 13.83, 8.42, 5.10
[05:16:19] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw2 mw3
[05:16:22] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[05:17:03] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw1
[05:17:04] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[05:17:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[05:17:31] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.90, 5.85, 3.22
[05:17:52] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.90, 7.96, 5.04
[05:18:06] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 6.31, 7.85, 5.31
[05:18:19] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[05:18:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[05:19:03] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[05:19:04] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[05:19:25] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[05:19:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 2.60, 4.48, 3.04
[05:19:52] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.61, 6.36, 4.81
[05:20:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 2.80, 5.98, 4.93
[06:21:51] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 14.49, 8.65, 5.30
[06:27:49] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 5.83, 7.53, 5.99
[06:29:48] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.67, 6.16, 5.67
[06:33:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.22, 3.40, 2.85
[06:35:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.45, 3.36, 2.90
[06:45:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.24, 3.93, 3.39
[06:47:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 2.94, 3.61, 3.33
[06:51:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.10, 3.67, 3.40
[06:53:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.22, 3.53, 3.38
[07:01:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.16, 3.85, 3.54
[07:03:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.20, 3.70, 3.53
[07:11:50] PROBLEM - misc2 Current Load on misc2 is CRITICAL: CRITICAL - load average: 4.16, 3.74, 3.59
[07:13:50] PROBLEM - misc2 Current Load on misc2 is WARNING: WARNING - load average: 3.23, 3.60, 3.56
[07:19:50] RECOVERY - misc2 Current Load on misc2 is OK: OK - load average: 1.77, 3.03, 3.36
[09:37:31] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.85, 5.51, 3.49
[09:39:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 3.33, 4.84, 3.50
[10:02:05] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 8.27, 6.10, 4.48
[10:02:40] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 11.08, 7.95, 5.44
[10:03:32] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 10.55, 7.66, 4.60
[10:05:30] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.80, 7.49, 4.90
[10:07:31] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 5.01, 6.51, 4.86
[10:08:04] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.08, 7.90, 5.87
[10:10:06] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 4.19, 6.27, 5.51
[10:11:13] PROBLEM - mw3 JobQueue on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[10:13:33] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 11.96, 8.07, 5.81
[10:14:08] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 12.67, 8.63, 6.52
[10:15:03] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[10:15:05] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[10:15:18] PROBLEM - mw3 JobQueue on mw3 is UNKNOWN: NRPE: Unable to read output
[10:15:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[10:17:03] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[10:17:25] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[10:17:30] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 4.93, 7.32, 6.13
[10:19:04] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[10:19:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 3.69, 5.96, 5.76
[10:26:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 5.95, 7.86, 7.85
[10:26:45] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 3.44, 5.99, 7.51
[10:30:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 2.21, 4.89, 6.64
[10:30:44] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 2.39, 4.15, 6.41
[11:23:14] PROBLEM - mw3 JobQueue on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[11:24:34] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 16.49, 9.98, 5.72
[11:25:14] PROBLEM - mw3 JobQueue on mw3 is UNKNOWN: NRPE: Unable to read output
[11:25:30] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.40, 5.37, 3.23
[11:26:05] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 8.51, 7.21, 5.05
[11:27:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 4.51, 5.02, 3.35
[11:28:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 3.83, 6.00, 4.87
[11:28:34] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 3.51, 7.17, 5.63
[11:30:33] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 2.61, 5.66, 5.25
[11:36:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.16, 7.21, 5.76
[11:38:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 3.66, 5.87, 5.44
[12:13:33] !log rm undefined.log* (mw1)
[12:13:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[12:45:04] !log root@mw1:/var/log/mediawiki# rm *.log*
[12:45:09] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[12:57:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:59:25] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[13:11:18] PROBLEM - mw3 JobQueue on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:13:14] PROBLEM - mw3 JobQueue on mw3 is UNKNOWN: NRPE: Unable to read output
[13:20:24] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fj6E2
[13:20:26] [miraheze/puppet] paladox dc13457 - mediawiki: add /server-status to the nginx config
[13:41:20] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:41:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:42:45] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw2 mw3
[13:43:03] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:44:43] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[13:45:03] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[13:45:16] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[13:45:25] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[13:54:47] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fj6uJ
[13:54:49] [miraheze/puppet] paladox 3f7f2af - Update mediawiki.conf
[14:21:31] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 9.96, 6.10, 3.69
[14:23:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 4.58, 5.52, 3.78
[14:40:45] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 9.78, 6.53, 4.92
[14:42:43] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 6.55, 6.65, 5.17
[14:56:11] !log setting REL1_33 as default branch in mediawiki
[14:56:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[16:42:41] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±1] https://git.io/fj62o
[16:42:42] [miraheze/mediawiki] paladox 7b171c9 - Update ArticleFeedbackv5
[16:45:25] !log running lc on mw*
[16:45:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[16:49:36] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 12.91, 8.05, 4.63
[16:52:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.28, 6.20, 4.70
[16:52:38] load being high again today
[16:53:12] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 11.54, 7.97, 5.51
[16:54:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 6.00, 5.90, 4.75
[16:54:36] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[16:59:10] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 5.43, 7.68, 6.42
[16:59:30] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 3.44, 6.94, 6.20
[17:01:10] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.27, 6.16, 6.02
[17:01:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 2.49, 5.32, 5.68
[17:04:26] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:20:12] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fj6aU
[17:20:13] [miraheze/services] MirahezeSSLBot 8bcf093 - BOT: Updating services config for wikis
[17:27:16] !log deleting tsbsaltmineswiki (for it to be recreated)
[17:27:20] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[17:34:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 7.69, 6.48, 4.84
[17:36:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 3.47, 5.24, 4.58
[17:42:38] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±2] https://git.io/fj6as
[17:42:39] [miraheze/mediawiki] paladox ab75920 - Update ArticleFeedbackv5 to clone from gerrit Also update to commit 463ea1dd9e5e0eabe358af9604ee6e7cd26bc0d6 to fix MediaWiki 1.33 support.
[17:48:32] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±2] https://git.io/fj6an
[17:48:33] [miraheze/mediawiki] paladox fab11f1 - Update Tabs to clone from gerrit
[18:05:32] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.48, 6.39, 4.50
[18:07:31] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 6.12, 6.15, 4.64
[19:00:03] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±2] https://git.io/fj6VI
[19:00:05] [miraheze/mediawiki] paladox d21e3c7 - Update News to clone from gerrit
[19:03:13] PROBLEM - mw3 JobQueue on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[19:04:01] !log restart nginx & php-fpm on mw1
[19:04:07] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 12.15, 7.55, 4.83
[19:04:46] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 15.62, 9.43, 5.75
[19:05:03] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw2
[19:05:04] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[19:05:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[19:05:34] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 12.75, 8.07, 5.05
[19:05:47] known (not caused by my mw changes, rather someone is hitting mw* hard)
[19:05:55] PROBLEM - mw1 Puppet on mw1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[19:06:19] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[19:06:22] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[19:07:04] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[19:07:21] PROBLEM - mw3 JobQueue on mw3 is UNKNOWN: NRPE: Unable to read output
[19:07:49] RECOVERY - mw1 Puppet on mw1 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[19:08:19] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[19:08:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[19:09:25] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[19:10:08] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 3.94, 7.45, 6.05
[19:11:03] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[19:11:31] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 5.19, 7.34, 5.84
[19:12:07] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 4.68, 6.65, 5.92
[19:13:31] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 5.44, 6.75, 5.81
[19:18:06] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 8.80, 7.82, 6.59
[19:20:06] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 6.39, 7.33, 6.56
[19:20:45] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 2.76, 6.63, 7.72
[19:22:04] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 4.50, 6.19, 6.23
[19:24:44] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 2.40, 4.50, 6.61
[20:22:05] PROBLEM - mw1 Current Load on mw1 is WARNING: WARNING - load average: 6.88, 5.60, 4.32
[20:24:04] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 3.50, 4.81, 4.19
[20:38:47] !log MariaDB [nonciclopediawiki]> truncate objectcache;
[20:38:52] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[20:42:22] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 10.60, 6.18, 4.08
[20:44:05] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CRITICAL - load average: 10.13, 6.88, 5.00
[20:45:29] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±1] https://git.io/fj6wZ
[20:45:31] [miraheze/mediawiki] paladox 441cebc - Downgrade RandomSelection to REL1_32 A change in RandomSelection has caused a performance hit (https://github.com/wikimedia/mediawiki-extensions-RandomSelection/commit/89a8c2d300bdcebc96b9d1887695d55f93aa6f80) for some of our wikis.
[20:46:05] RECOVERY - mw1 Current Load on mw1 is OK: OK - load average: 5.40, 6.29, 5.02
[20:46:22] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.07, 7.32, 5.15
[20:48:21] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.58, 6.00, 4.92
[21:00:34] PROBLEM - test1 Puppet on test1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[21:02:07] PROBLEM - test1 Current Load on test1 is CRITICAL: CRITICAL - load average: 3.61, 2.27, 1.06
[21:02:36] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 9 minutes ago with 0 failures
[21:08:06] PROBLEM - test1 Current Load on test1 is WARNING: WARNING - load average: 0.33, 1.73, 1.30
[21:10:07] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 1.14, 1.50, 1.26
[21:14:08] PROBLEM - test1 Current Load on test1 is CRITICAL: CRITICAL - load average: 3.12, 2.32, 1.63
[21:18:06] PROBLEM - test1 Current Load on test1 is WARNING: WARNING - load average: 0.83, 1.77, 1.58
[21:20:06] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 0.74, 1.39, 1.46
[21:29:20] !log remove PII from a user
[21:29:24] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[21:43:33] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 9.64, 6.45, 3.95
[21:44:09] no abnormal cpu, so the problem is different this time
[21:47:32] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 6.11, 6.50, 4.57
[21:57:09] !log depool mw2
[21:57:14] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[21:57:38] !log killing a D state process (nginx) on mw2
[21:57:43] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:00:01] PROBLEM - mw2 Puppet on mw2 is WARNING: WARNING: Puppet is currently disabled, message: paladox, last run 6 minutes ago with 0 failures
[22:00:15] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fj6re
[22:00:16] [miraheze/services] MirahezeSSLBot 8791a06 - BOT: Updating services config for wikis
[22:00:19] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw2
[22:00:22] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2
[22:01:04] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2
[22:01:15] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: connect to address 185.52.2.113 and port 443: Connection refused HTTP CRITICAL - Unable to open TCP socket
[22:01:39] that's me
[22:04:01] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[22:04:31] !log repooling mw2
[22:04:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:05:04] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[22:05:15] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 442 bytes in 0.010 second response time
[22:06:19] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[22:06:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[22:10:11] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fj6rT
[22:10:12] [miraheze/services] MirahezeSSLBot bdcb102 - BOT: Updating services config for wikis
[22:17:11] !log depool mw2 (and restart as nginx is showing as a D state process)
[22:17:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:19:03] !log upgraded puppet-agent and dbus on mw2
[22:19:07] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:21:39] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:21:55] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:21:55] PROBLEM - mw2 php-fpm on mw2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:22:02] PROBLEM - mw2 SSH on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:22:20] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw2
[22:22:22] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2
[22:22:25] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:22:34] PROBLEM - Host mw2 is DOWN: PING CRITICAL - Packet loss = 100%
[22:23:03] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2
[22:24:33] RECOVERY - Host mw2 is UP: PING OK - Packet loss = 0%, RTA = 0.34 ms
[22:24:37] PROBLEM - mw2 Disk Space on mw2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:33:43] RECOVERY - mw2 SSH on mw2 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u6 (protocol 2.0)
[22:37:30] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 0.15, 0.03, 0.01
[22:37:38] !log repool mw2
[22:37:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:38:01] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[22:38:40] RECOVERY - mw2 Disk Space on mw2 is OK: DISK OK - free space: / 39307 MB (51% inode=99%);
[22:39:03] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[22:39:15] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 442 bytes in 0.010 second response time
[22:39:41] RECOVERY - mw2 php-fpm on mw2 is OK: PROCS OK: 5 processes with command name 'php-fpm7.2'
[22:40:20] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[22:40:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[22:46:05] !log running lc on mw*
[22:46:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:59:34] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fj6rX
[22:59:35] [miraheze/puppet] paladox f9c4a38 - Install lua sandbox under /usr/lib/php/20170718/luasandbox.so.so Fixes: [04-Jul-2019 22:37:51 UTC] PHP Warning: PHP Startup: Unable to load dynamic library 'luasandbox.so' (tried: /usr/lib/php/20170718/luasandbox.so (/usr/lib/php/20170718/luasandbox.so: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/luasandbox.so.so
[22:59:35] (/usr/lib/php/20170718/luasandbox.so.so: cannot open shared object file: No such file or directory)) in Unknown on line 0
[23:00:34] !log depooling mw2 && restarting php-fpm to pickup .so file
[23:00:38] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[23:03:20] !log repool mw2
[23:03:23] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[23:04:08] PROBLEM - test1 Current Load on test1 is CRITICAL: CRITICAL - load average: 3.00, 1.54, 0.76
[23:05:47] !log depool mw1 and restart php-fpm to pick up .so file and repool
[23:05:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[23:06:07] !log depool mw3 and restart php-fpm to pick up .so file and repool
[23:06:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[23:08:06] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 0.90, 1.48, 0.93
[23:44:07] PROBLEM - test1 Current Load on test1 is CRITICAL: CRITICAL - load average: 2.11, 1.31, 0.95
[23:46:06] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 1.38, 1.40, 1.03