[01:45:38] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[01:46:41] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 4 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[01:48:49] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[01:54:14] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[01:57:09] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[01:57:29] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24586 bytes in 0.684 second response time
[02:02:52] PROBLEM - bacula1 Bacula Databases db4 on bacula1 is WARNING: WARNING: Diff, 60047 files, 58.10GB, 2019-09-15 01:59:00 (2.1 weeks ago)
[02:32:37] PROBLEM - bacula1 Bacula Databases db5 on bacula1 is WARNING: WARNING: Diff, 375 files, 24.02GB, 2019-09-15 02:28:00 (2.1 weeks ago)
[06:26:16] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 3039 MB (12% inode=94%);
[07:22:36] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.99, 7.11, 5.77
[07:25:04] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.97, 6.33, 5.68
[08:07:17] PROBLEM - mw3 Puppet on mw3 is WARNING: WARNING: Puppet last ran 1 hour ago
[08:15:46] RECOVERY - mw3 Puppet on mw3 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:16:13] PROBLEM - db4 Puppet on db4 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 4 minutes ago with 12 failures. Failed resources (up to 3 shown)
[09:16:25] PROBLEM - lizardfs5 Puppet on lizardfs5 is CRITICAL: CRITICAL: Puppet has 9 failures. Last run 4 minutes ago with 9 failures. Failed resources (up to 3 shown)
[09:16:44] PROBLEM - bacula1 Puppet on bacula1 is CRITICAL: CRITICAL: Puppet has 10 failures. Last run 4 minutes ago with 10 failures. Failed resources (up to 3 shown)
[09:16:51] PROBLEM - misc3 Puppet on misc3 is CRITICAL: CRITICAL: Puppet has 18 failures. Last run 4 minutes ago with 18 failures. Failed resources (up to 3 shown)
[09:16:52] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CRITICAL: Puppet has 373 failures. Last run 4 minutes ago with 373 failures. Failed resources (up to 3 shown)
[09:16:58] PROBLEM - puppet1 Puppet on puppet1 is CRITICAL: CRITICAL: Puppet has 16 failures. Last run 4 minutes ago with 16 failures. Failed resources (up to 3 shown)
[09:16:59] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 9 failures. Last run 5 minutes ago with 9 failures. Failed resources (up to 3 shown)
[09:17:26] PROBLEM - mw1 Puppet on mw1 is CRITICAL: CRITICAL: Puppet has 391 failures. Last run 5 minutes ago with 391 failures. Failed resources (up to 3 shown)
[09:17:42] PROBLEM - db5 Puppet on db5 is CRITICAL: CRITICAL: Puppet has 10 failures. Last run 5 minutes ago with 10 failures. Failed resources (up to 3 shown)
[09:17:43] PROBLEM - misc2 Puppet on misc2 is CRITICAL: CRITICAL: Puppet has 25 failures. Last run 5 minutes ago with 25 failures. Failed resources (up to 3 shown)
[09:17:55] PROBLEM - test1 Puppet on test1 is CRITICAL: CRITICAL: Puppet has 383 failures. Last run 5 minutes ago with 383 failures. Failed resources (up to 3 shown)
[09:17:55] PROBLEM - misc4 Puppet on misc4 is CRITICAL: CRITICAL: Puppet has 28 failures. Last run 5 minutes ago with 28 failures. Failed resources (up to 3 shown)
[09:18:00] PROBLEM - lizardfs4 Puppet on lizardfs4 is CRITICAL: CRITICAL: Puppet has 9 failures. Last run 6 minutes ago with 9 failures. Failed resources (up to 3 shown)
[09:18:10] PROBLEM - cp2 Puppet on cp2 is CRITICAL: CRITICAL: Puppet has 333 failures. Last run 5 minutes ago with 333 failures. Failed resources (up to 3 shown): File[/usr/local/bin/gen_fingerprints],File[/etc/vim/vimrc.local],File[/etc/default/varnish],File[/etc/systemd/system/varnish.service]
[09:18:15] PROBLEM - misc1 Puppet on misc1 is CRITICAL: CRITICAL: Puppet has 45 failures. Last run 6 minutes ago with 45 failures. Failed resources (up to 3 shown)
[09:18:17] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 198 failures. Last run 4 minutes ago with 198 failures. Failed resources (up to 3 shown): File[/root/ufw-fix],File[/usr/local/bin/gen_fingerprints],File[/etc/vim/vimrc.local],File[/etc/default/varnish]
[09:18:45] PROBLEM - mw3 Puppet on mw3 is CRITICAL: CRITICAL: Puppet has 388 failures. Last run 6 minutes ago with 388 failures. Failed resources (up to 3 shown)
[09:19:35] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CRITICAL: Puppet has 212 failures. Last run 7 minutes ago with 212 failures. Failed resources (up to 3 shown)
[09:23:28] RECOVERY - db5 Puppet on db5 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures
[09:23:43] RECOVERY - misc4 Puppet on misc4 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[09:23:43] RECOVERY - lizardfs4 Puppet on lizardfs4 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[09:24:02] RECOVERY - misc1 Puppet on misc1 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures
[09:24:48] RECOVERY - db4 Puppet on db4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:25:16] RECOVERY - lizardfs5 Puppet on lizardfs5 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:25:27] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:25:30] RECOVERY - misc3 Puppet on misc3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:25:32] RECOVERY - bacula1 Puppet on bacula1 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:25:38] RECOVERY - cp4 Puppet on cp4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:25:43] RECOVERY - puppet1 Puppet on puppet1 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:25:44] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:26:03] RECOVERY - mw1 Puppet on mw1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:26:14] RECOVERY - misc2 Puppet on misc2 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[09:26:22] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:26:35] RECOVERY - cp2 Puppet on cp2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:26:40] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:34:54] RECOVERY - mw3 Puppet on mw3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:04:36] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:04:56] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:05:16] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[12:05:17] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[12:05:18] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw1 mw3
[12:08:03] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[12:08:24] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[12:08:24] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[12:08:25] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[12:13:30] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[12:16:03] [mw-config] paladox deleted branch Southparkfan-patch-1 - https://git.io/vbvb3
[12:16:04] [miraheze/mw-config] paladox deleted branch Southparkfan-patch-1
[12:22:39] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JenOK
[12:22:40] [miraheze/puppet] paladox 322cf29 - php: Set request_slowlog_timeout to 0 We doin't need to have this set right at the moment. This may improve things (see https://serverfault.com/questions/406532/i-o-error-with-php5-fpm-ptracepeekdata-failed, https://stackoverflow.com/questions/29002165/php-fpm-too-many-notice-about-trace)
[12:22:41] [ php - I/O Error with PHP5-FPM, ptrace(PEEKDATA) failed - Server Fault ] - serverfault.com
[12:22:43] [ performance - php-fpm too many notice about trace - Stack Overflow ] - stackoverflow.com
[12:23:33] !log depool and repool mw[123]
[12:23:38] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[12:23:59] serverfault.com is a site full of talkative boys with an opinion on every matter, albeit rarely any knowledge.
[12:24:51] don't consult them for anything beyond homework.
[12:35:25] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:37:35] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:38:38] We seem to be down a lot lately
[12:38:59] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:39:03] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[12:39:07] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[12:39:24] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw3
[12:40:38] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[12:41:26] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[12:41:34] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24586 bytes in 0.775 second response time
[12:41:36] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[12:41:38] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[12:41:55] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[12:42:04] k6ka yeh, i'm trying to figure out how to resolve that.
[12:44:25] i've done a config change
[12:44:33] so hopefully that'll improve things
[13:06:05] [miraheze/puppet] paladox pushed 2 commits to master [+0/-0/±2] https://git.io/Jen30
[13:06:06] [miraheze/puppet] paladox 62b609e - Revert "php: Set request_slowlog_timeout to 0" This reverts commit 322cf29428f14449e07daa1f0bc69d7ea85087fe.
[13:06:08] [miraheze/puppet] paladox f3911ea - Decrease pm.max_requests to 1000
[13:07:04] 503
[13:07:21] I think that's known
[13:18:13] Might want to implement some sort of testing procedure to keep config changes from bringing the entire website down
[13:23:26] yup
[13:28:24] 503
[13:32:45] thanks, looking
[13:32:47] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:33:27] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:33:28] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[13:33:45] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[13:33:54] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:33:54] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:37:14] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 5.385 second response time
[13:39:04] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 309 bytes in 0.294 second response time
[13:40:00] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jensn
[13:40:01] [miraheze/puppet] paladox aea4bb3 - Revert "Decrease pm.max_requests to 1000" This reverts commit f3911eabf15e80c40ec535256130041d853d54b2.
[13:42:03] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 0.391 second response time
[13:42:33] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[13:42:55] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[13:46:00] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[13:46:04] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[13:46:32] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[13:51:28] i dunno why things deteriorated when i changed slowlog & pm.max_requests.
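For context: request_slowlog_timeout and pm.max_requests, the two values changed and then reverted above, are php-fpm pool settings. A minimal sketch of the relevant keys follows; the file path and the [www] pool name are assumptions based on a stock Debian php-fpm 7.2 layout, not taken from this log or from Miraheze's puppet code.

    ; /etc/php/7.2/fpm/pool.d/www.conf  (path assumed; actual pool file may differ)
    [www]
    ; 0 turns the slow-request log off entirely (the 12:22 change, reverted at 13:06)
    request_slowlog_timeout = 0
    ; how many requests a child process serves before it is respawned; 0 means never
    ; (lowered to 1000 at 13:06, reverted at 13:40)
    pm.max_requests = 1000

Either change only takes effect on a php-fpm reload, and running php-fpm7.2 -t (the binary name seen in the 17:48 process check) syntax-checks the pool files first; that is roughly the cheapest version of the testing step suggested at 13:18, though it would not catch a behavioural regression like the one described here.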
[13:56:16] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:56:39] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:56:51] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:57:06] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[13:57:08] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[13:57:37] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw1 mw2
[13:59:41] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24592 bytes in 0.397 second response time
[13:59:52] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[13:59:58] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[14:00:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[14:00:07] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[14:00:37] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[14:12:38] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[14:12:43] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[14:12:54] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw2
[14:12:54] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw2
[14:13:30] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw1
[14:15:15] ffs
[14:16:30] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[14:18:40] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[14:18:54] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[14:18:55] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[14:21:05] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[14:30:42] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:31:37] PROBLEM - cp2 Stunnel Http for mw1 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:32:55] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[14:33:46] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 3 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[14:34:18] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24592 bytes in 5.963 second response time
[14:34:25] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[14:34:26] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 2 backends are down. mw1 mw3
[14:34:58] !log reboot mw2
[14:35:21] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw2
[14:38:15] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[14:38:30] RECOVERY - cp2 Stunnel Http for mw1 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24592 bytes in 0.389 second response time
[14:38:53] !log reboot mw1
[14:39:03] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[14:39:15] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[14:40:19] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[14:45:32] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[14:45:43] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[14:46:15] !log reboot mw3
[14:46:20] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:35:12] PROBLEM - bacula1 Bacula Phabricator Static on bacula1 is WARNING: WARNING: Diff, 11849 files, 16.14MB, 2019-09-15 15:30:00 (2.1 weeks ago)
[15:48:52] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2651 MB (10% inode=94%);
[17:41:05] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:41:21] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:41:47] PROBLEM - mw3 Disk Space on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:42:00] PROBLEM - mw3 JobRunner Service on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:42:27] PROBLEM - mw3 SSH on mw3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:42:28] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw3
[17:42:45] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:43:00] PROBLEM - mw3 HTTPS on mw3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:43:53] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[17:43:54] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[17:44:41] PROBLEM - Host mw3 is DOWN: PING CRITICAL - Packet loss = 100%
[17:45:05] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw3
[17:46:22] !log rebooting mw3
[17:47:26] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 0.654 second response time
[17:48:11] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[17:48:12] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[17:48:30] RECOVERY - Host mw3 is UP: PING OK - Packet loss = 0%, RTA = 0.32 ms
[17:48:33] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[17:48:34] PROBLEM - mw3 JobChron Service on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:48:34] PROBLEM - mw3 php-fpm on mw3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[17:48:37] RECOVERY - mw3 JobChron Service on mw3 is OK: PROCS OK: 1 process with args 'redisJobChronService'
[17:48:45] RECOVERY - mw3 php-fpm on mw3 is OK: PROCS OK: 13 processes with command name 'php-fpm7.2'
[17:49:04] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24586 bytes in 0.389 second response time
[17:49:10] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 0.012 second response time
[17:49:25] RECOVERY - mw3 Disk Space on mw3 is OK: DISK OK - free space: / 33339 MB (43% inode=99%);
[17:49:32] RECOVERY - mw3 JobRunner Service on mw3 is OK: PROCS OK: 1 process with args 'redisJobRunnerService'
[17:49:47] RECOVERY - mw3 SSH on mw3 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0)
[17:49:48] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[17:49:57] RECOVERY - mw3 HTTPS on mw3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 442 bytes in 0.007 second response time
[17:51:35] !log [18:46:22] <+paladox> !log rebooting mw3
[17:51:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[18:00:26] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 4 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[18:00:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 107.191.126.23/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[18:08:19] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[18:08:53] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[18:10:04] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[18:10:12] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[18:11:14] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 0.005 second response time
[18:11:39] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[18:35:30] hi
[18:35:46] Hi
[18:37:41] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[18:38:34] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 mw3
[18:40:42] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[18:40:49] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24592 bytes in 2.327 second response time
[18:43:49] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24570 bytes in 0.390 second response time
[18:44:26] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[19:35:01] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 3 datacenters are down: 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb
[19:37:25] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[20:42:13] * RhinosF1 is currently ill and will be back in a few days
[20:42:54] uh :(
[20:53:18] hey guys
[20:53:35] hey
[20:53:48] hello apap
[20:53:54] I do not think we have met
[20:54:33] i'm from wikipedia, i found Miraheze from testwiki.wiki through lurking around some pages
[20:54:46] ah so am I
[20:55:03] how are you?
[20:55:12] Great. You?
[20:55:18] i'm doing good.
[20:55:36] I am a wiki creator here
[20:55:41] which wiki?
[20:55:48] globally
[20:55:58] I just create wikis that people request
[20:56:12] ohhh
[20:56:14] okay
[20:56:18] see https://meta.miraheze.org/wiki/Meta:Wiki_creators
[20:56:19] [ Meta:Wiki creators - Miraheze Meta ] - meta.miraheze.org
[20:57:10] apap: hello, feel free to ask any questions you may have or stick around to hang out
[20:57:12] Examknow: hello
[20:57:13] i don't plan to request a wiki unfortunately because I probably wouldn't use it for a while.
[20:57:29] i shall Zppix
[20:57:56] Apap: Do you plan to help out around miraheze meta?
[20:58:09] eh, i don't know
[20:58:14] Zppix: hello
[21:00:04] is there anything that I can do?
[21:00:09] just wondering
[21:00:30] Apap: Depends, do you know much about mediawiki?
[21:01:18] I know it's the software that wikipedia runs on. I have an installation on my other drive.
[21:02:06] apap: let me ask you this, what are you looking to do?
[21:02:09] Apap: If you know how to use it you can help other out on community noticeboard
[21:03:28] **others
[21:04:17] Zppix: not sure. As Examknow said, I could probably help out others on the noticeboard. On testwiki.wiki I requested on their noticeboard to be a sysadmin (but i'm not sure if that really relates to Miraheze.)
[21:04:43] nope it isn't
[21:04:52] testwiki.wiki is not affiliated with miraheze
[21:05:09] yeah, i noticed some people on there are on here, so
[21:08:12] apap: not everyone in here is affiliated directly with miraheze, some just are in here cause they have a wiki that miraheze hosts
[21:08:40] apap: everyone that is voiced (or +v) is someone that has access to the servers miraheze is on or a steward, or both
[21:09:50] alright, how do i know who's voiced?
[21:10:02] Depends on what client you're using
[21:10:18] i'm using HexChat
[21:10:18] Most clients display voiced users with a + next to their username
[21:10:29] HexChat displays voiced users with a blue circle by default
[21:10:30] oh, it must be the blue circle
[21:10:33] yup
[21:11:29] you can see this stuff in the HexChat documentation
[21:11:36] Apap: ^
[21:12:42] I don't recall seeing the colours mentioned in the docs, actually. From what I know, green circles are for ops, blue circles are for voiced users. For networks that support them, turquoise is for half-ops, yellow is for admins, and orange is for owners.
[21:12:46] one day... i will check it out, i'm too lazy right now
[21:14:24] k6ka: how are you
[21:14:54] good
[21:15:56] tbh, circles are annoying
[21:16:15] * Voidwalker would use + and @ over them anyday
[21:17:00] * Zppix draws a circle on Voidwalker
[21:17:14] :O
[21:17:40] Voidwalker: so does that mean you don't like yourself now? xD
[21:21:43] i shall be back on another client
[21:21:54] bye
[21:32:46] i am back
[21:33:21] hi
[21:59:09] why is flow so broken?
[21:59:11] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2
[21:59:11] is it me or is meta slow?
[21:59:57] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[22:00:30] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:01:07] paladox, ^ is it the usual culprit?
[22:01:25] i don't think this is lizard
[22:03:38] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[22:03:38] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:03:39] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:04:37] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24586 bytes in 2.682 second response time
[22:06:21] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:07:20] hmm
[22:08:18] !log restarted php-fpm on mw1
[22:08:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:08:53] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24592 bytes in 0.391 second response time
[22:09:47] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[22:09:50] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[22:09:54] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[22:09:55] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[22:10:18] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[22:14:23] Voidwalker oh! that time may be related to lizardfs looking at the logs
[22:14:29] Sep 30 22:02:02 lizardfs4 mfsmount: master: tcp recv error: Connection timed out
[22:14:36] i see high load at that time on misc3
[22:14:39] (the master)
[22:27:55] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2
[22:34:12] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[22:35:09] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JenEw
[22:35:11] [miraheze/services] MirahezeSSLBot 63bb804 - BOT: Updating services config for wikis
[22:42:53] are global userpages a thing on other miraheze wikis
[22:46:52] I don't understand, what do you mean? apap
[22:52:12] [ANNOUNCEMENT] Channel operators please see https://phabricator.wikimedia.org/T234275 !
[22:57:57] like userpages on meta showing on other wikis
[22:59:10] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw3
[22:59:11] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw1 mw3
[22:59:16] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb
[22:59:17] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw3
[22:59:21] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 107.191.126.23/cpweb, 2400:6180:0:d0::403:f001/cpweb
[22:59:31] hum, I think on login.miraheze.org you can create userpages that will be shown on other wikis, like CommonsWiki. But only there you can. apap
[23:01:21] okay, thanks!
[23:02:14] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[23:03:01] 503
[23:03:16] it's fine now.. i think
[23:03:23] nbm
[23:03:24] nvm
[23:03:28] why are file servers so terrible?
[23:04:17] the backends seem to kill themselves every 10 minutes... FeelsBadMan
[23:04:59] yeah it's always the file server too
[23:05:29] we switch to a different file backend to fix it, and that one's terrible too
[23:06:21] It's the bad thing. He can't see or edit..
[23:06:40] you guys still hunting down the right one?
[23:08:54] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[23:09:01] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[23:09:02] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[23:15:18] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[23:17:44] question: can any new user apply for wiki creator? or do you have to have some sort of a reputation on other wiki-based websites like wikipedia?
[23:20:31] Anyone can request a wiki apap
[23:20:52] nah, to be a wiki creator
[23:21:37] would recommend being at least a little involved on meta for a few days before anything else, but nothing would really stop you
[23:22:43] hm okay
[23:28:13] oh no
[23:29:28] i'll be getting cross wiki notifs about my first edit every time i visit a wiki
[23:30:17] or maybe it's just me
[23:33:42] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[23:34:28] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[23:34:45] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw2
[23:34:46] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[23:35:14] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2
[23:37:43] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:41:20] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24586 bytes in 0.728 second response time
[23:41:45] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[23:43:31] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[23:44:21] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[23:44:39] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[23:45:07] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy