[00:36:38] PROBLEM - db5 Current Load on db5 is CRITICAL: CRITICAL - load average: 8.60, 4.76, 2.13
[00:36:57] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 4 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[00:37:28] PROBLEM - cp4 Stunnel Http for misc2 on cp4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 358 bytes in 0.016 second response time
[00:37:31] PROBLEM - misc2 HTTPS on misc2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 358 bytes in 0.006 second response time
[00:37:39] Having trouble accessing Miraheze?
[00:38:03] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:38:04] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw1
[00:38:05] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:38:43] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw1
[00:39:02] It seems to work already, although a little slow
[00:39:33] RECOVERY - cp4 Stunnel Http for misc2 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 43457 bytes in 0.099 second response time
[00:39:35] RECOVERY - misc2 HTTPS on misc2 is OK: HTTP OK: HTTP/1.1 200 OK - 43465 bytes in 0.059 second response time
[00:39:59] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[00:40:02] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20541 bytes in 0.965 second response time
[00:40:05] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20540 bytes in 0.887 second response time
[00:40:39] PROBLEM - db5 Current Load on db5 is WARNING: WARNING - load average: 1.69, 3.82, 2.45
[00:40:39] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[00:40:54] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[00:42:37] RECOVERY - db5 Current Load on db5 is OK: OK - load average: 0.84, 2.78, 2.23
[02:52:56] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[02:52:58] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb
[02:54:55] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[02:54:56] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[03:55:55] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 1.55, 2.05, 1.24
[03:57:55] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 1.03, 1.76, 1.23
[03:59:55] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.50, 1.30, 1.12
[04:00:23] PROBLEM - cp8 Disk Space on cp8 is CRITICAL: DISK CRITICAL - free space: / 1152 MB (5% inode=93%);
[04:19:02] PROBLEM - cp7 HTTPS on cp7 is CRITICAL: connect to address 51.89.160.142 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[04:19:27] PROBLEM - mw6 HTTPS on mw6 is CRITICAL: connect to address 51.89.160.136 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[04:19:48] PROBLEM - test2 HTTPS on test2 is CRITICAL: connect to address 51.77.107.211 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[04:20:41] PROBLEM - mw7 HTTPS on mw7 is CRITICAL: connect to address 51.89.160.137 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[04:21:22] well that's new
[04:21:37] I guess that's expected due to SSL certificates
[04:21:45] need to copy the ssl-key repo again
[04:35:27] RECOVERY - cp7 HTTPS on cp7 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1532 bytes in 0.122 second response time
[04:41:53] RECOVERY - mw6 HTTPS on mw6 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.005 second response time
[04:42:02] RECOVERY - cp6 HTTPS on cp6 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1532 bytes in 0.047 second response time
[04:42:08] RECOVERY - test2 HTTPS on test2 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 545 bytes in 0.005 second response time
[04:43:37] RECOVERY - mw7 HTTPS on mw7 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.005 second response time
[05:02:25] RECOVERY - cp8 Disk Space on cp8 is OK: DISK OK - free space: / 3164 MB (16% inode=93%);
[06:27:31] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 3282 MB (13% inode=94%);
[06:49:55] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[06:51:54] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[10:11:28] Reception123: could you create a herald rule to auto-respond to blank requests like https://phabricator.miraheze.org/T5245
[10:11:31] [ ⚓ T5245 2020-02-15 data restore request - REPLACE_THIS_WITH_WIKI_URL ] - phabricator.miraheze.org
[14:58:28] RhinosF1: {{Done}}
[14:59:30] Zppix: thanks
[15:04:07] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10765;
[15:04:14] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:04:39] [ssl] Pix1234 opened pull request #269: T5238 - Add wiki.arcolinuxclub.org custom domain - https://git.io/Jv4KP
[15:04:49] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10775;
[15:04:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:06:23] [ssl] paladox closed pull request #269: T5238 - Add wiki.arcolinuxclub.org custom domain - https://git.io/Jv4KP
[15:06:24] [miraheze/ssl] paladox pushed 1 commit to master [+1/-0/±1] https://git.io/Jv4KM
[15:06:26] [miraheze/ssl] Pix1234 7cc7012 - T5238 - Add wiki.arcolinuxclub.org custom domain (#269) * Create wiki.arcolinuxclub.org.crt * + wiki.arcolinuxclub.org config
[15:07:25] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10776;
[15:07:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:07:56] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10777;
[15:08:04] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:08:17] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10778;
[15:08:25] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:08:37] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10780;
[15:08:42] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:09:17] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10781;
[15:09:24] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:09:53] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10782;
[15:10:13] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10783;
[15:10:18] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:10:31] !log MariaDB [metawiki]> update cw_requests set cw_status = 'approved' where cw_id = 10784;
[15:10:39] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:10:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[16:22:52] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4208 bytes in 0.113 second response time
[16:22:58] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[16:23:08] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 2 backends are down. mw2 mw3
[16:23:22] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 97%
[16:23:35] PROBLEM - cp8 HTTPS on cp8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4142 bytes in 3.879 second response time
[16:24:00] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[16:24:22] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw1 lizardfs6
[16:24:25] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 lizardfs6
[16:24:56] [miraheze/puppet] paladox pushed 1 commit to master [+9/-8/±98] https://git.io/Jv4Pm
[16:24:57] [miraheze/puppet] paladox 3539db5 - Update apt to 7.3.0
[16:25:01] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 9.439 second response time
[16:25:10] PROBLEM - test2 MediaWiki Rendering on test2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:25:32] RECOVERY - cp8 HTTPS on cp8 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1532 bytes in 0.463 second response time
[16:27:22] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 28%
[16:27:51] paladox: why do we have both mirahezebots_ and icinga-miraheze
[16:28:26] Because one of them is the new monitoring host
[16:29:18] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:29:42] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 1352 bytes in 0.019 second response time
[16:30:04] paladox: cool
[16:30:41] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:30:41] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:30:45] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 1352 bytes in 0.028 second response time
[16:31:46] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 8.767 second response time
[16:32:02] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 1352 bytes in 0.080 second response time
[16:32:12] PROBLEM - lizardfs6 Puppet on lizardfs6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
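The cw_requests approvals !logged between 15:04 and 15:10 above were applied one cw_id at a time. The same result could be had with one batched statement; a minimal sketch, assuming shell access to a mysql client with rights on metawiki (the cw_id values are exactly the ones from the log):

    mysql metawiki -e "UPDATE cw_requests SET cw_status = 'approved'
        WHERE cw_id IN (10765, 10775, 10776, 10777, 10778,
                        10780, 10781, 10782, 10783, 10784);"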
[16:32:25] PROBLEM - db4 Puppet on db4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:32:28] PROBLEM - puppet1 Puppet on puppet1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:32:51] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 4.170 second response time
[16:33:02] PROBLEM - misc2 Puppet on misc2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:02] PROBLEM - test1 Puppet on test1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:02] PROBLEM - db5 Puppet on db5 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:09] PROBLEM - misc4 Puppet on misc4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:15] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:33:15] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:24] PROBLEM - mw1 Puppet on mw1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:31] PROBLEM - misc1 Puppet on misc1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:33:32] PROBLEM - mw3 Puppet on mw3 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:34:12] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:34:18] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:34:18] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:34:20] PROBLEM - cp8 Puppet on cp8 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[16:35:21] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 1352 bytes in 0.087 second response time
[16:37:43] paladox: ^
[16:37:51] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 3.094 second response time
[16:37:54] JohnLewis yup, puppet failure caused by me.
[16:38:09] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:38:26] [miraheze/puppet] paladox pushed 1 commit to master [+8/-9/±98] https://git.io/Jv4PP
[16:38:27] [miraheze/puppet] paladox 55b3191 - Revert "Update apt to 7.3.0" This reverts commit 3539db58e68165bc6fef2ee465442bd647e1ba28.
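The recovery path for the Puppet catalog failures above is visible in the log: revert the apt change and let the agents pick up the fixed catalog. A sketch of those steps, assuming a git checkout of the puppet repo and shell access to an affected host (the commit hash is the one quoted in the revert message):

    # in the puppet repo: undo the bad commit and publish the revert
    git revert 3539db58e68165bc6fef2ee465442bd647e1ba28
    git push origin master

    # on an affected host: run the agent immediately instead of
    # waiting for the next scheduled run
    sudo puppet agent -t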
[16:38:48] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 1.019 second response time
[16:38:54] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.853 second response time
[16:39:05] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.846 second response time
[16:39:12] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 2.855 second response time
[16:40:05] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.928 second response time
[16:40:17] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.931 second response time
[16:41:22] RECOVERY - misc4 Puppet on misc4 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[16:41:44] RECOVERY - misc1 Puppet on misc1 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[16:42:15] RECOVERY - lizardfs6 Puppet on lizardfs6 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures
[16:42:25] RECOVERY - cp4 Puppet on cp4 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[16:42:30] RECOVERY - db4 Puppet on db4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:42:34] RECOVERY - puppet1 Puppet on puppet1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:42:35] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:42:39] !log upgrade grafana on misc1
[16:42:55] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:42:55] misc1 is still serving dns traffic...
[16:42:56] RECOVERY - test2 MediaWiki Rendering on test2 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.948 second response time
[16:43:02] RECOVERY - db5 Puppet on db5 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:43:11] RECOVERY - misc2 Puppet on misc2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:43:20] RECOVERY - mw1 Puppet on mw1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:43:31] RECOVERY - mw3 Puppet on mw3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:43:35] !log upgrade mariadb-client on misc1
[16:43:43] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[16:44:01] JohnLewis hmm?
[16:44:12] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[16:44:13] misc1 is still serving live DNS requests
[16:44:22] RECOVERY - cp8 Puppet on cp8 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:45:00] JohnLewis yes, but I don't understand why you're telling me that when I already know? :)
[16:45:14] Because I depooled misc1 5 days ago?
[16:45:18] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[16:45:39] oh
[16:45:52] JohnLewis how did you depool misc1?
[16:46:06] changed all DNS entries for ns2
[16:46:55] JohnLewis shall I go do the hack you did or should I just commit this to the dns repo?
[16:47:05] huh?
[16:47:40] weird
[16:47:41] ns2 A 51.89.247.234
[16:47:41] AAAA 2001:41d0:800:1056::11
[16:47:45] JohnLewis I see ^
[16:47:50] yes. I know
[16:47:53] So how can misc1 be serving dns traffic
[16:47:55] because I changed it 5 days ago
[16:48:06] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:49:04] ok, so it takes time for traffic to move over, JohnLewis?
[16:49:17] yes, 72 hours
[16:49:48] ok
[16:50:06] but it's been 5 days
[16:50:44] yup
[16:51:26] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.953 second response time
[17:01:10] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2649 MB (10% inode=94%);
[17:01:36] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[17:02:18] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[17:02:33] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[17:02:50] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[17:02:52] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[17:02:54] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[17:31:09] [puppet] paladox reopened pull request #1247: varnish: Use mw[4567] only on cp7 - https://git.io/Jv4L5
[17:41:17] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 51.161.32.127/cpweb
[17:43:19] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[18:24:38] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[php7.3-redis]
[18:42:50] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[18:51:29] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2
[18:51:36] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 3 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb
[18:51:37] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. mw2
[18:51:49] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. mw2
[18:52:01] paladox: ^
[18:52:33] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:53:03] nothing I can really do, it's old infra. I've looked and I don't see why it's happening.
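On the depool exchange above (16:45-16:50): after the ns2 records are changed, resolvers keep serving cached answers until the old records' TTLs expire, which is why misc1 can still see queries days later. A rough way to check what is actually being handed out, assuming the zone in question is miraheze.org and using the ns2 records quoted in the log:

    # which nameservers is the zone delegated to?
    dig NS miraheze.org +short

    # current A/AAAA answers (and remaining TTLs) for ns2
    dig +noall +answer A ns2.miraheze.org
    dig +noall +answer AAAA ns2.miraheze.org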
[18:53:25] paladox: :(
[18:53:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 51.161.32.127/cpweb
[18:54:29] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 1.640 second response time
[18:55:24] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[18:55:28] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[18:55:30] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[18:55:38] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[18:55:47] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[19:01:14] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-8 [+0/-0/±1] https://git.io/Jv4yj
[19:01:15] PROBLEM - cp8 Disk Space on cp8 is WARNING: DISK WARNING - free space: / 2115 MB (10% inode=93%);
[19:01:16] [miraheze/puppet] paladox 2ef9cd7 - Grafana: Set allow_sign_up to false Only allow admins to create accounts.
[19:01:17] [puppet] paladox created branch paladox-patch-8 - https://git.io/vbiAS
[19:01:19] [puppet] paladox opened pull request #1248: Grafana: Set allow_sign_up to false - https://git.io/Jv4Se
[19:01:51] [puppet] paladox closed pull request #1248: Grafana: Set allow_sign_up to false - https://git.io/Jv4Se
[19:01:52] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4Sf
[19:01:54] [miraheze/puppet] paladox 42a6f6c - Grafana: Set allow_sign_up to false (#1248) Only allow admins to create accounts.
[19:01:55] [miraheze/puppet] paladox deleted branch paladox-patch-8
[19:01:57] [puppet] paladox deleted branch paladox-patch-8 - https://git.io/vbiAS
[19:03:51] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4208 bytes in 0.021 second response time
[19:04:17] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4212 bytes in 0.016 second response time
[19:05:47] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.260 second response time
[19:06:16] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 0.815 second response time
[19:11:53] !log enable puppet on jobrunner1
[19:12:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[19:16:07] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4212 bytes in 0.022 second response time
[19:16:44] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. lizardfs6
[19:16:54] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 3 datacenters are down: 128.199.139.216/cpweb, 81.4.109.133/cpweb, 2607:5300:205:200::17f6/cpweb
[19:17:25] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb
[19:18:08] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.869 second response time
[19:18:44] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[19:18:54] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[19:19:12] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4Su
[19:19:14] [miraheze/mw-config] paladox a96c084 - Unset wgHTTPTimeout and wgHTTPConnectTimeout
[19:19:24] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[19:27:07] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:27:08] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[19:27:28] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:27:36] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[19:28:24] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.024 second response time
[19:28:47] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.034 second response time
[19:28:51] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[19:28:54] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[19:29:07] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[19:29:53] PROBLEM - www.bushcraftpedia.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:29:58] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:30:05] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 1351 bytes in 0.115 second response time
[19:30:25] !log restart php7.3-fpm on mw1
[19:31:03] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 0.816 second response time
[19:31:48] !log restart php7.3-fpm on mw* and lizardfs6
[19:32:03] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20535 bytes in 9.222 second response time
[19:32:25] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 0.868 second response time
[19:32:50] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 1.122 second response time
[19:32:54] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[19:32:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[19:33:02] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[19:33:06] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[19:33:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[19:33:32] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 1.970 second response time
[19:33:41] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 1.443 second response time
[19:33:42] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[19:34:58] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[19:38:23] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.370 second response time
[19:38:39] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.111 second response time
[19:38:51] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[19:38:55] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[19:39:05] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.023 second response time
[19:39:05] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[19:39:09] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
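The two !log entries at 19:30 and 19:31 above record php7.3-fpm restarts across the app servers. A minimal sketch of how that fans out, assuming SSH access and hostnames as they appear in the log (the exact mw* expansion is an assumption):

    # restart the PHP FPM pool on each affected backend
    for host in mw1 mw2 mw3 lizardfs6; do
        ssh "$host" 'sudo systemctl restart php7.3-fpm'
    done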
[19:39:22] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.033 second response time
[19:39:38] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 90%
[19:39:44] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[19:39:44] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.037 second response time
[19:40:42] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.886 second response time
[19:41:04] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 0.862 second response time
[19:41:20] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20531 bytes in 2.040 second response time
[19:41:26] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 93%
[19:41:38] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 11%
[19:42:36] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4213 bytes in 0.085 second response time
[19:42:48] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4215 bytes in 0.164 second response time
[19:43:23] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 7%
[19:43:53] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.308 second response time
[19:44:34] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 4.085 second response time
[19:45:00] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[19:45:05] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[19:45:14] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[19:45:18] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[19:45:45] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[19:46:43] oh, I know why
[19:46:46] mw4 is failing
[19:47:02] anyway, I've filed a task because IPv6 from db4 is not working
[19:47:15] (pinging mw1 from db4 is failing when doing IPv6)
[19:48:07] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4212 bytes in 0.429 second response time
[19:48:10] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4215 bytes in 0.076 second response time
[19:48:12] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4212 bytes in 0.024 second response time
[19:49:30] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.968 second response time
[19:49:51] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.564 second response time
[19:50:09] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 2.184 second response time
[19:50:10] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 4.304 second response time
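On the IPv6 report at 19:47 above (db4 cannot ping mw1 over IPv6): the usual first steps are to reproduce the failure and inspect the interface's addresses and routes. A rough sketch, run on db4, with the interface name taken from the ip commands logged later and the mw1 target assumed to resolve via local DNS:

    # reproduce the failure the task describes
    ping -6 -c 3 mw1

    # does the interface carry a global IPv6 address, and is there a route?
    ip -6 addr show dev ens3
    ip -6 route show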
[19:50:18] !log restart redis to see if it will fix the 503s
[19:50:45] RECOVERY - www.bushcraftpedia.org - LetsEncrypt on sslhost is OK: OK - Certificate 'server.notrace.nl' will expire on Fri 08 May 2020 11:29:19 AM GMT +0000.
[19:50:45] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[19:54:42] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 0.334 second response time
[20:06:54] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[20:06:55] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[20:07:03] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw1 lizardfs6
[20:07:31] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 2 backends are down. mw1 lizardfs6
[20:07:44] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. mw1 lizardfs6
[20:10:30] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4215 bytes in 0.077 second response time
[20:10:32] PROBLEM - mw3 MediaWiki Rendering on mw3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.023 second response time
[20:10:44] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.047 second response time
[20:10:50] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4210 bytes in 0.038 second response time
[20:11:35] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[20:12:06] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[20:12:33] RECOVERY - mw3 MediaWiki Rendering on mw3 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.262 second response time
[20:12:44] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 4.868 second response time
[20:12:47] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.866 second response time
[20:12:56] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[20:12:57] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[20:13:48] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 0.314 second response time
[20:14:52] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[20:17:37] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. lizardfs6
[20:18:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. lizardfs6
[20:18:51] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. lizardfs6
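A side note on the www.bushcraftpedia.org recovery at 19:50:45 above: the check reports a certificate for 'server.notrace.nl', i.e. the custom domain is presenting its CNAME target's certificate rather than its own. A hypothetical manual check that shows what the sslhost monitor sees:

    # fetch the served certificate's subject and expiry (SNI set explicitly)
    echo | openssl s_client -connect www.bushcraftpedia.org:443 \
        -servername www.bushcraftpedia.org 2>/dev/null \
      | openssl x509 -noout -subject -enddate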
[20:19:38] !log fixed perms on /etc/ssl/private on lizardfs6
[20:19:38] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[20:19:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[20:20:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[20:20:48] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[20:25:30] !log added www-data to ssl-cert on mw[23] lizardfs6
[20:25:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[20:53:43] !log "ip -6 address add 2a00:d880:5:d6c::2/64 dev ens3" - db4
[20:54:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[20:54:31] !log "ip addr del fe80::216:3cff:fe25:9a18/64 dev ens3" - db4
[20:54:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[21:58:50] PROBLEM - mw4 SSH on mw4 is CRITICAL: connect to address 51.89.160.128 and port 22: Connection refused
[21:59:56] PROBLEM - db7 SSH on db7 is CRITICAL: connect to address 51.89.160.143 and port 22: Connection refused
[22:00:42] PROBLEM - mw4 HTTPS on mw4 is CRITICAL: connect to address 51.89.160.128 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[22:02:53] RECOVERY - db7 SSH on db7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:10:36] RECOVERY - mw4 SSH on mw4 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:30:02] PROBLEM - mw5 SSH on mw5 is CRITICAL: connect to address 51.89.160.133 and port 22: Connection refused
[22:30:28] paladox: ^ expected?
[22:30:33] yes
[22:30:50] PROBLEM - mw5 HTTPS on mw5 is CRITICAL: connect to address 51.89.160.133 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[22:31:48] PROBLEM - cp7 HTTPS on cp7 is CRITICAL: connect to address 51.89.160.142 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[22:32:00] PROBLEM - mw6 HTTPS on mw6 is CRITICAL: connect to address 51.89.160.136 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[22:32:05] PROBLEM - cp7 SSH on cp7 is CRITICAL: connect to address 51.89.160.142 and port 22: Connection refused
[22:32:10] PROBLEM - mw7 SSH on mw7 is CRITICAL: connect to address 51.89.160.137 and port 22: Connection refused
[22:32:22] PROBLEM - mw6 SSH on mw6 is CRITICAL: connect to address 51.89.160.136 and port 22: Connection refused
[22:33:24] PROBLEM - mw7 HTTPS on mw7 is CRITICAL: connect to address 51.89.160.137 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[22:34:38] RECOVERY - mw4 HTTPS on mw4 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 557 bytes in 0.006 second response time
[22:35:23] RECOVERY - mw7 SSH on mw7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:35:37] RECOVERY - mw6 SSH on mw6 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:36:49] RECOVERY - mw5 SSH on mw5 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:38:48] RECOVERY - cp7 SSH on cp7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0)
[22:50:25] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4Ft
[22:50:27] [miraheze/puppet] paladox 99c594a - Make sure /var/lib/glusterd/ exists /var/lib/glusterd/ is not created by the client package so we have to make it ourselves.
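The two !log entries at 20:19 and 20:25 above record the fix for the recurring lizardfs6/mw backend drops: the web-server user could not read the TLS private keys. A plausible reconstruction of those commands, assuming Debian's standard ssl-cert group convention (the exact modes used are not in the log):

    # let members of the ssl-cert group traverse the private key directory
    chgrp ssl-cert /etc/ssl/private
    chmod 710 /etc/ssl/private

    # grant the web server user access to the keys
    # (per the log: run on mw2, mw3 and lizardfs6)
    usermod -a -G ssl-cert www-data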
[23:01:54] [puppet] Pix1234 opened pull request #1249: MOTD: Remove latest puppet commit as its unused - https://git.io/Jv4Fa
[23:02:16] paladox: ^ minor change but if you want to review
[23:02:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[23:02:39] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 2400:6180:0:d0::403:f001/cpweb, 51.161.32.127/cpweb
[23:03:00] [puppet] paladox closed pull request #1249: MOTD: Remove latest puppet commit as its unused - https://git.io/Jv4Fa
[23:03:01] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4Fw
[23:03:03] [miraheze/puppet] Pix1234 f98786f - Remove latest puppet commit as its unused (#1249)
[23:04:12] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4Fo
[23:04:13] SPF|Cloud ^
[23:04:13] [miraheze/puppet] paladox 0023008 - Update 97-last-puppet-run
[23:04:17] err Zppix ^
[23:04:22] Looking
[23:04:33] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[23:04:36] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[23:05:22] paladox: that's what I get for trying to do stuff on mobile heh
[23:06:52] heh
[23:14:04] RECOVERY - mw5 HTTPS on mw5 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 557 bytes in 0.006 second response time
[23:18:01] RECOVERY - mw6 HTTPS on mw6 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.005 second response time
[23:20:13] RECOVERY - mw7 HTTPS on mw7 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.005 second response time
[23:29:24] [miraheze/mw-config] paladox pushed 1 commit to paladox-patch-6 [+0/-0/±1] https://git.io/Jv4bX
[23:29:26] [miraheze/mw-config] paladox 6f909f0 - Migrate testwiki to db7
[23:29:27] [mw-config] paladox created branch paladox-patch-6 - https://git.io/vbvb3
[23:29:29] [mw-config] paladox opened pull request #2889: Migrate testwiki to db7 - https://git.io/Jv4b1
[23:38:00] [miraheze/dns] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/Jv4Nk
[23:38:01] [miraheze/dns] paladox 2048ec4 - Add cp7
[23:38:03] [dns] paladox created branch paladox-patch-1 - https://git.io/vbQXl
[23:38:04] [dns] paladox opened pull request #129: Add cp7 - https://git.io/Jv4NI
[23:43:03] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:43:08] PROBLEM - test1 MediaWiki Rendering on test1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:43:08] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 1 backends are down. mw1
[23:43:09] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:43:24] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[23:43:42] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:43:50] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb
[23:44:04] PROBLEM - lizardfs6 MediaWiki Rendering on lizardfs6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:44:05] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 2 backends are down. mw2 mw3
[23:44:06] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. mw1 mw2 mw3 lizardfs6
[23:44:36] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:44:37] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4213 bytes in 0.375 second response time
[23:46:42] PROBLEM - mw7 MediaWiki Rendering on mw7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:47:00] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:47:25] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 1.054 second response time
[23:47:38] RECOVERY - test1 MediaWiki Rendering on test1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 0.668 second response time
[23:47:45] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 86%
[23:48:08] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 20532 bytes in 1.223 second response time
[23:50:37] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 1.241 second response time
[23:51:08] RECOVERY - mw1 MediaWiki Rendering on mw1 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 6.105 second response time
[23:51:59] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 22%
[23:54:16] [miraheze/mw-config] paladox pushed 1 commit to paladox-patch-7 [+0/-0/±1] https://git.io/Jv4Nr
[23:54:17] [miraheze/mw-config] paladox 4061b8a - Migrate testwiki to new infra
[23:54:19] [mw-config] paladox created branch paladox-patch-7 - https://git.io/vbvb3
[23:54:20] [mw-config] paladox opened pull request #2890: Migrate testwiki to new infra - https://git.io/Jv4No
[23:54:28] [mw-config] paladox edited pull request #2890: Set read only for testwiki - https://git.io/Jv4No
[23:54:50] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 20534 bytes in 8.835 second response time
[23:55:35] [mw-config] paladox closed pull request #2890: Set read only for testwiki - https://git.io/Jv4No
[23:55:36] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4NP
[23:55:38] [miraheze/mw-config] paladox 59141ae - Migrate testwiki to new infra (#2890)
[23:55:39] [miraheze/mw-config] paladox deleted branch paladox-patch-7
[23:55:39] miraheze/mw-config/paladox-patch-7/4061b8a - paladox The build passed. https://travis-ci.org/miraheze/mw-config/builds/652243302
[23:55:41] [mw-config] paladox deleted branch paladox-patch-7 - https://git.io/vbvb3
[23:56:40] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 6 backends are healthy
[23:56:42] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[23:56:43] RECOVERY - mw2 MediaWiki Rendering on mw2 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 1.003 second response time
[23:56:55] RECOVERY - lizardfs6 MediaWiki Rendering on lizardfs6 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 2.751 second response time
[23:57:49] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 6 backends are healthy
[23:57:55] RECOVERY - mw7 MediaWiki Rendering on mw7 is OK: HTTP OK: HTTP/1.1 200 OK - 20533 bytes in 7.369 second response time
[23:58:15] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[23:58:17] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[23:58:48] [mw-config] paladox closed pull request #2889: Migrate testwiki to db7 - https://git.io/Jv4b1
[23:58:49] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/Jv4NS
[23:58:51] [miraheze/mw-config] paladox ea80f2f - Migrate testwiki to db7 (#2889)
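The testwiki move closing out the day (PRs #2889 and #2890 above) pairs a config change with a database copy: the wiki is set read-only, the data is moved to db7, then mw-config is pointed at the new server. The copy step itself is not shown in the log; a minimal sketch of one common way to do it, with hostnames taken from the log and everything else assumed:

    # dump from the old server while testwiki is read-only (source host assumed)
    mysqldump -h db4 --single-transaction testwiki > testwiki.sql

    # create and load the database on the new server
    mysqladmin -h db7 create testwiki
    mysql -h db7 testwiki < testwiki.sql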