[04:59:50] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb
[05:03:20] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[06:26:48] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 3410 MB (14% inode=94%);
[07:35:49] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 1.15, 4.13, 2.69
[08:07:44] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 0.11, 0.58, 1.80
[08:10:50] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.25, 0.39, 1.50
[13:59:49] Yesterday, paladox fixed a "Unknown error, HTTP status 0" on ndg.nenawiki.org by changing timeouts and restarting parsoid. We have the same problem on nenawiki.org.
[14:30:10] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JvXdV
[14:30:12] [miraheze/services] MirahezeSSLBot d9bab5c - BOT: Updating services config for wikis
[14:31:56] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:32:39] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 2 backends are down. mw5 mw7
[14:32:46] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 2607:5300:205:200::17f6/cpweb
[14:32:57] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw4 mw5 mw7
[14:33:01] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 3 backends are down. mw5 mw6 mw7
[14:33:01] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 3 backends are down. mw5 mw6 mw7
[14:33:49] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JvXdD
[14:33:50] [miraheze/puppet] paladox 6c81145 - Update default.vcl
[14:35:09] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JvXdS
[14:35:10] [miraheze/services] MirahezeSSLBot 6e5c918 - BOT: Updating services config for wikis
[14:35:31] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 18711 bytes in 0.643 second response time
[14:36:04] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 11 backends are healthy
[14:36:28] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 11 backends are healthy
[14:36:28] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 11 backends are healthy
[14:39:42] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[14:39:57] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 9 backends are healthy
[14:57:35] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[15:00:40] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 3.842 second response time
[15:05:05] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 1 backends are down. mw5
[15:08:33] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 11 backends are healthy
[15:09:06] PROBLEM - cp7 Stunnel Http for mw4 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[15:12:56] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 2001:41d0:800:1056::2/cpweb
[15:12:58] RECOVERY - cp7 Stunnel Http for mw4 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.005 second response time
[15:19:19] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb
[15:23:55] paladox: is GlobalBlocking up to date?
[15:24:42] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:24:57] It's up to the version from september
[15:25:05] but you can always check
[15:26:16] [miraheze/mediawiki] paladox pushed 1 commit to REL1_34 [+0/-0/±1] https://git.io/JvXbO
[15:26:17] [miraheze/mediawiki] paladox 45cd9b8 - Update GlobalBlocking
[15:26:33] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[15:26:50] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[15:27:20] paladox: no then. Wanted it updating as there was a security fix last week.
[15:27:35] CVE-2020-10534
[15:27:55] Thx for doing
[15:28:00] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 18699 bytes in 2.048 second response time
[15:35:08] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JvXbr
[15:35:10] [miraheze/services] MirahezeSSLBot 3952aa4 - BOT: Updating services config for wikis
[15:39:27] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 51.161.32.127/cpweb
[15:43:56] PROBLEM - cp8 Stunnel Http for mw5 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[15:44:11] PROBLEM - cp7 Stunnel Http for mw5 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[15:47:39] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb
[15:47:43] RECOVERY - cp8 Stunnel Http for mw5 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 4.856 second response time
[15:47:57] RECOVERY - cp7 Stunnel Http for mw5 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.861 second response time
[15:48:48] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 1 backends are down. mw1
[15:55:07] paladox: can you work your magical parsoid cure on nenawiki.org?
[15:55:20] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[15:55:46] I mean, if what i did last night didn't work for that wiki, then what i did, didn't work :(
[15:56:10] different wiki. The fix last night still works.
[15:56:36] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 11 backends are healthy
[15:58:44] MikeV i mean, the fix i did last night should have worked for nenawiki.org, if it didn't then it didn't really fix the problem.
[15:58:47] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[16:06:01] MikeV: did it work on your other private wiki?
[16:07:05] Just re-tested: ndg.nenawiki.org - now, this error when selecting edit: "Error loading data from server: Could not connect to the server. Would you like to retry?" After several attempts to enter edit mode, an edit saved correctly.
[16:07:22] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[16:08:48] On nenawiki.org, I now get same "Error loading data..." problem. Trying to get by that to test edit saving...
[16:12:42] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 128.199.139.216/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[16:13:24] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 1 backends are down. mw5
[16:13:52] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[16:14:51] I still cannot get into Edit mode on nenawiki.org. "Error loading data from server: Could not connect to the server. Would you like to retry?"
[16:16:16] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[16:17:00] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 11 backends are healthy
[16:17:16] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 1.122 second response time
[16:17:36] PROBLEM - cp6 Stunnel Http for mw5 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[16:17:41] paladox: Reception123 any ideas for MikeV ?
[16:18:10] Zppix: I think we might be looking out for some fun tonight. See -en and -stewards
[16:18:34] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.754 second response time
[16:20:57] oh you meant onwiki
[16:23:19] PROBLEM - cp3 Stunnel Http for mw5 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[16:24:47] RECOVERY - cp6 Stunnel Http for mw5 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.243 second response time
[16:26:11] I guess the slowness is causing this instability.
[16:27:01] RECOVERY - cp3 Stunnel Http for mw5 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 7.526 second response time
[16:27:09] paladox: It is certainly slow.
[16:27:32] paladox: what is causing the slowness?
[16:27:37] redis
[16:33:11] Zppix: can you look at that user that just added to their userpage on meta after wiki req?
[16:33:14] * RhinosF1 bst
[16:33:18] * RhinosF1 busy
[16:35:24] RhinosF1: page created in error thats all
[16:35:42] Maybe ensure they know the right place
[16:35:51] You did
[17:05:48] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[17:12:26] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[17:32:47] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb
[17:35:20] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:205:200::17f6/cpweb
[17:36:05] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[17:38:36] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[17:46:48] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 2001:41d0:800:1056::2/cpweb
[17:47:08] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[17:50:35] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[17:53:08] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 1 backends are down. mw4
[17:53:42] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[17:56:25] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 11 backends are healthy
[18:22:13] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[18:23:09] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb
[18:25:57] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 9.668 second response time
[18:26:37] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[18:37:58] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:42:07] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 18668 bytes in 0.845 second response time
[19:03:00] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb
[19:13:08] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[19:21:00] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[19:24:09] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[19:30:08] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/Jv1mu
[19:30:10] [miraheze/services] MirahezeSSLBot 1e9b78d - BOT: Updating services config for wikis
[19:48:18] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[19:52:00] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[19:54:40] HI
[19:55:07] How's it going?
[19:58:08] PROBLEM - test2 MediaWiki Rendering on test2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:58:52] PROBLEM - cp7 Stunnel Http for mw4 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[20:00:30] Hi hispano76
[20:01:50] RECOVERY - test2 MediaWiki Rendering on test2 is OK: HTTP OK: HTTP/1.1 200 OK - 18669 bytes in 0.680 second response time
[20:02:37] RECOVERY - cp7 Stunnel Http for mw4 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.005 second response time
[20:07:19] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:10:47] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 18694 bytes in 0.714 second response time
[20:18:17] hi RhinosF1 :)
[20:19:37] PROBLEM - misc2 Current Load on misc2 is CRITICAL: connect to address 81.4.127.174 port 5666: Connection refusedconnect to host 81.4.127.174 port 5666: Connection refused
[20:19:58] What’s up hispano76
[20:20:14] PROBLEM - misc2 Redis Process on misc2 is CRITICAL: connect to address 81.4.127.174 port 5666: Connection refusedconnect to host 81.4.127.174 port 5666: Connection refused
[20:20:46] PROBLEM - misc2 Disk Space on misc2 is CRITICAL: connect to address 81.4.127.174 port 5666: Connection refusedconnect to host 81.4.127.174 port 5666: Connection refused
[20:21:28] PROBLEM - misc2 Puppet on misc2 is CRITICAL: connect to address 81.4.127.174 port 5666: Connection refusedconnect to host 81.4.127.174 port 5666: Connection refused
[20:45:04] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:48:57] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 18669 bytes in 1.110 second response time
[20:53:30] PROBLEM - cp6 Stunnel Http for mw5 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[20:54:06] PROBLEM - cp3 Stunnel Http for mw5 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[20:54:31] PROBLEM - cp8 Stunnel Http for mw5 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[20:54:47] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:57:01] RECOVERY - cp6 Stunnel Http for mw5 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 4.227 second response time
[20:57:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb
[20:57:46] RECOVERY - cp3 Stunnel Http for mw5 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 7.613 second response time
[20:58:04] RECOVERY - cp8 Stunnel Http for mw5 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 2.806 second response time
[21:01:27] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[21:03:36] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[21:05:55] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 18655 bytes in 0.642 second response time
[21:07:08] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 3.895 second response time
[21:19:43] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2001:41d0:800:1056::2/cpweb
[21:27:22] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[21:32:19] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw4
[21:36:18] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 9 backends are healthy
[21:41:22] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. mw5
[21:43:03] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 51.161.32.127/cpweb
[21:44:40] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 11 backends are healthy
[21:47:24] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv13H
[21:47:26] [miraheze/puppet] paladox 81882c6 - varnish: Increase health checker timeout
[21:50:28] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[21:53:18] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[21:54:26] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. mw4
[21:56:46] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 7.011 second response time
[21:56:56] PROBLEM - cp7 Stunnel Http for mw5 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[21:56:59] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw5
[21:58:02] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 11 backends are healthy
[21:58:16] RhinosF1: nothing :P
[22:00:36] RECOVERY - cp7 Stunnel Http for mw5 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 2.532 second response time
[22:00:38] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 9 backends are healthy
[22:03:12] hispano76: :)
[22:04:04] PROBLEM - cp7 Stunnel Http for mw4 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:07:33] RECOVERY - cp7 Stunnel Http for mw4 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.005 second response time
[22:09:03] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 2607:5300:205:200::17f6/cpweb
[22:09:13] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 51.77.107.210/cpweb
[22:12:49] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[22:15:53] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 2 backends are down. mw5 mw7
[22:16:29] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:16:38] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:16:38] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[22:19:14] paladox:
[22:19:16] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 11 backends are healthy
[22:19:22] ?
[22:19:50] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.418 second response time
[22:19:52] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.749 second response time
[22:20:11] paladox: users are reporting slowness and icinga is going crazy, is everything normal?
[22:20:24] Yeh, slowness is due to redis
[22:21:20] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb
[22:22:00] paladox: what about the alerts?
[22:22:12] the alerts are due to the slowness?
[22:22:25] Ok
[22:24:44] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[22:27:54] RECOVERY - misc2 Current Load on misc2 is OK: OK - load average: 1.23, 0.97, 0.48
[22:29:47] RECOVERY - misc2 Disk Space on misc2 is OK: DISK OK - free space: / 63505 MB (97% inode=98%);
[22:30:13] PROBLEM - misc2 Puppet on misc2 is UNKNOWN: UNKNOWN: Failed to check. Reason is: no_summary_file
[22:30:13] RECOVERY - misc2 Redis Process on misc2 is OK: PROCS OK: 1 process with args 'redis-server'
[22:33:44] RECOVERY - misc2 Puppet on misc2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[22:38:22] [miraheze/puppet] paladox pushed 1 commit to revert-1289-paladox-patch-5 [+0/-0/±1] https://git.io/Jv1sA
[22:38:24] [miraheze/puppet] paladox 7665b33 - Revert "nutcracker: Switch to rdb1 (#1289)" This reverts commit 6f262ccd883028ba4caf880148027f590d290f57.
[22:38:25] [puppet] paladox created branch revert-1289-paladox-patch-5 - https://git.io/vbiAS
[22:38:27] [puppet] paladox opened pull request #1293: Revert "nutcracker: Switch to rdb1" - https://git.io/Jv1sx
[22:38:35] [puppet] paladox closed pull request #1293: Revert "nutcracker: Switch to rdb1" - https://git.io/Jv1sx
[22:38:37] [miraheze/puppet] paladox deleted branch revert-1289-paladox-patch-5
[22:38:38] [puppet] paladox deleted branch revert-1289-paladox-patch-5 - https://git.io/vbiAS
[22:39:15] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv1sp
[22:39:16] [miraheze/puppet] paladox a7ea3f2 - nutcracker: Switch to misc2 temporarily
[22:45:22] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv1GT
[22:45:23] [miraheze/puppet] paladox 904f183 - Update nutcracker.yml.erb
[22:46:09] PROBLEM - cp8 Stunnel Http for mw5 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:46:35] [miraheze/puppet] paladox pushed 2 commits to master [+0/-0/±2] https://git.io/Jv1GI
[22:46:37] [miraheze/puppet] paladox c141436 - Revert "Update nutcracker.yml.erb" This reverts commit 904f183bd539e277d5c24785a4f6d040a2d14426.
[22:46:38] [miraheze/puppet] paladox 9f75a95 - Revert "nutcracker: Switch to misc2 temporarily" This reverts commit a7ea3f2ba24a78f0f0eacbfa36c077d589e8a1b8.
[22:48:20] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:48:43] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv1GL
[22:48:44] [miraheze/puppet] paladox 639e3c8 - Update default.vcl
[22:49:37] PROBLEM - cp7 Stunnel Http for mw4 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:49:58] PROBLEM - cp6 Stunnel Http for mw4 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:50:20] RECOVERY - cp8 Stunnel Http for mw5 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 7.696 second response time
[22:50:55] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:52:21] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 18673 bytes in 0.658 second response time
[22:53:18] RECOVERY - cp7 Stunnel Http for mw4 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 4.577 second response time
[22:53:29] !log depool mw[123]
[22:53:36] RECOVERY - cp6 Stunnel Http for mw4 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 5.602 second response time
[22:54:16] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:54:29] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.316 second response time
[22:56:53] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:57:03] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[22:57:19] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb
[23:01:48] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 4 backends are down. mw1 mw2 mw3 mw5
[23:02:17] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:03:15] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[23:08:59] PROBLEM - cp3 Stunnel Http for mw7 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:09:35] PROBLEM - cp6 Stunnel Http for mw7 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:10:07] PROBLEM - cp8 Stunnel Http for mw7 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:11:07] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:13:34] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:15:05] Hello Zap39! If you have any questions, feel free to ask and someone should answer soon.
[23:15:58] heya
[23:16:16] are the miraheze servers under ddos or something? I've been getting awfully slow response times the past few days
[23:16:20] PROBLEM - cp3 Stunnel Http for mw5 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:17:01] PROBLEM - test2 MediaWiki Rendering on test2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:18:45] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:18:57] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:19:24] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:19:36] RECOVERY - cp8 Stunnel Http for mw7 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.314 second response time
[23:19:41] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:20:06] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw4
[23:21:16] PROBLEM - mw7 MediaWiki Rendering on mw7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:22:38] It’s due to redis
[23:23:18] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 18655 bytes in 4.385 second response time
[23:24:28] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 9 backends are healthy
[23:24:47] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 18656 bytes in 2.076 second response time
[23:24:49] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:26:01] they have a bad update?
[23:26:10] No, latency.
[23:28:18] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 18673 bytes in 8.138 second response time
[23:29:37] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 18674 bytes in 6.691 second response time
[23:34:15] RECOVERY - cp3 Stunnel Http for mw5 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.749 second response time
[23:36:13] RECOVERY - cp3 Stunnel Http for mw7 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 6.714 second response time
[23:37:52] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 18655 bytes in 2.710 second response time
[23:38:21] PROBLEM - cp8 Stunnel Http for mw5 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:38:40] Hi! Here is the list of currently open high priority tasks on Phabricator
[23:38:45] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:38:47] No updates for 9 days - https://phabricator.miraheze.org/T5302 - System generated password emails not being sent - authored by Stevethompson, assigned to None
[23:38:54] No updates for 24 days - https://phabricator.miraheze.org/T5258 - Prevent creating Special: Namespace in ManageWiki - authored by RhinosF1, assigned to John
[23:39:00] No updates for 19 days - https://phabricator.miraheze.org/T5244 - Perform/write postmortem for the RamNode->OVH migration - authored by RobLa, assigned to None
[23:39:07] No updates for 37 days - https://phabricator.miraheze.org/T5222 - MediaWiki response time can fluctuate due to messages - authored by Southparkfan, assigned to None
[23:39:14] No updates for 50 days - https://phabricator.miraheze.org/T5174 - enc.for.uz itermittently using both yandex and miraheze dns - authored by Paladox, assigned to Sf7_uz
[23:39:20] and 1 more (see next pages...)
[23:39:45] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:41:05] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15324 bytes in 0.359 second response time
[23:41:08] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:41:50] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 18655 bytes in 2.622 second response time
[23:42:35] PROBLEM - cp6 Stunnel Http for mw5 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:43:04] PROBLEM - cp7 Stunnel Http for mw5 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:44:51] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 18673 bytes in 5.375 second response time
[23:45:29] PROBLEM - cp7 Stunnel Http for mw7 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:45:50] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 18673 bytes in 0.658 second response time
[23:46:05] PROBLEM - cp3 Stunnel Http for mw5 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[23:46:31] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 11 backends are healthy
[23:46:35] !log repool mw[123]
[23:47:27] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw5 mw7
[23:47:57] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 18673 bytes in 0.688 second response time
[23:47:59] RECOVERY - mw7 MediaWiki Rendering on mw7 is OK: HTTP OK: HTTP/1.1 200 OK - 18668 bytes in 0.547 second response time
[23:48:03] RECOVERY - test2 MediaWiki Rendering on test2 is OK: HTTP OK: HTTP/1.1 200 OK - 18674 bytes in 4.123 second response time
[23:48:46] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[23:48:53] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 11 backends are healthy
[23:48:53] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 18686 bytes in 0.627 second response time
[23:49:02] RECOVERY - cp7 Stunnel Http for mw7 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.535 second response time
[23:49:26] RECOVERY - cp6 Stunnel Http for mw7 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.170 second response time
[23:49:28] RECOVERY - cp3 Stunnel Http for mw5 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.958 second response time
[23:49:47] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[23:50:45] RECOVERY - cp6 Stunnel Http for mw5 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 5.740 second response time
[23:50:49] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 9 backends are healthy
[23:50:58] RECOVERY - cp8 Stunnel Http for mw5 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 2.236 second response time
[23:50:59] RECOVERY - cp7 Stunnel Http for mw5 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15330 bytes in 2.365 second response time