[00:44:27] Hi! Here is the list of currently open high priority tasks on Phabricator
[00:44:34] No updates for 8 days - https://phabricator.miraheze.org/T4647 - Migrate to the "Massive KVM VPS" plans - authored by Paladox, assigned to Paladox
[00:44:41] No updates for 9 days - https://phabricator.miraheze.org/T4638 - Job Queue Fatal Exceptions - authored by RhinosF1, assigned to None
[01:20:53] Hello Guest72706! If you have any questions feel free to ask and someone should answer soon.
[01:21:20] Hello Guest35437! If you have any questions feel free to ask and someone should answer soon.
[02:50:10] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjbpr
[02:50:11] [02miraheze/services] 07MirahezeSSLBot 03fc3753d - BOT: Updating services config for wikis
[06:24:20] Hello Guest68783! If you have any questions feel free to ask and someone should answer soon.
[06:25:54] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 3114 MB (12% inode=94%);
[06:26:34] RhinosF1: heh, freenode has been pretty annoying for the last couple of days
[06:27:55] Reception123: normally mac tricks IRC Cloud into keeping the connection
[06:43:34] Yeah, but for me when it disconnects I lose my cloak, so also access to any private channels
[06:54:06] Reception123: I get logged out because I haven't updated this yet since I switched off a few channels
[06:54:21] I might switch back to the working connection
[07:14:15] PROBLEM - lizardfs3 Puppet on lizardfs3 is CRITICAL: CRITICAL: Puppet has 14 failures. Last run 2 minutes ago with 14 failures. Failed resources (up to 3 shown)
[07:14:19] PROBLEM - db5 Puppet on db5 is CRITICAL: CRITICAL: Puppet has 14 failures. Last run 2 minutes ago with 14 failures. Failed resources (up to 3 shown)
[07:14:19] PROBLEM - db4 Puppet on db4 is CRITICAL: CRITICAL: Puppet has 16 failures. Last run 2 minutes ago with 16 failures. Failed resources (up to 3 shown)
[07:14:19] PROBLEM - lizardfs1 Puppet on lizardfs1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown)
[07:14:24] PROBLEM - puppet1 Puppet on puppet1 is CRITICAL: CRITICAL: Puppet has 20 failures. Last run 2 minutes ago with 20 failures. Failed resources (up to 3 shown)
[07:14:29] PROBLEM - mw3 Puppet on mw3 is CRITICAL: CRITICAL: Puppet has 213 failures. Last run 2 minutes ago with 213 failures. Failed resources (up to 3 shown)
[07:14:34] PROBLEM - mw1 Puppet on mw1 is CRITICAL: CRITICAL: Puppet has 216 failures. Last run 2 minutes ago with 216 failures. Failed resources (up to 3 shown)
[07:15:09] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CRITICAL: Puppet has 201 failures. Last run 3 minutes ago with 201 failures. Failed resources (up to 3 shown)
[07:15:15] PROBLEM - mw2 Puppet on mw2 is CRITICAL: CRITICAL: Puppet has 209 failures. Last run 3 minutes ago with 209 failures. Failed resources (up to 3 shown)
[07:15:17] PROBLEM - misc4 Puppet on misc4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[07:15:19] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 3 minutes ago with 12 failures. Failed resources (up to 3 shown)
[07:15:20] PROBLEM - lizardfs2 Puppet on lizardfs2 is CRITICAL: CRITICAL: Puppet has 13 failures. Last run 3 minutes ago with 13 failures. Failed resources (up to 3 shown)
[07:15:21] PROBLEM - misc2 Puppet on misc2 is CRITICAL: CRITICAL: Puppet has 30 failures. Last run 3 minutes ago with 30 failures. Failed resources (up to 3 shown)
[07:15:32] PROBLEM - cp2 Puppet on cp2 is CRITICAL: CRITICAL: Puppet has 199 failures. Last run 3 minutes ago with 199 failures. Failed resources (up to 3 shown): File[authority certificates],File[/etc/apt/apt.conf.d/50unattended-upgrades],File[/etc/apt/apt.conf.d/20auto-upgrades],File[/root/ufw-fix]
[07:15:35] PROBLEM - misc1 Puppet on misc1 is CRITICAL: CRITICAL: Puppet has 51 failures. Last run 3 minutes ago with 51 failures. Failed resources (up to 3 shown)
[07:15:46] PROBLEM - bacula1 Puppet on bacula1 is CRITICAL: CRITICAL: Puppet has 14 failures. Last run 3 minutes ago with 14 failures. Failed resources (up to 3 shown)
[07:15:47] PROBLEM - misc3 Puppet on misc3 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[07:15:53] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 181 failures. Last run 3 minutes ago with 181 failures. Failed resources (up to 3 shown): File[/etc/rsyslog.conf],File[authority certificates],File[/etc/apt/apt.conf.d/50unattended-upgrades],File[/etc/apt/apt.conf.d/20auto-upgrades]
[07:15:56] PROBLEM - test1 Puppet on test1 is CRITICAL: CRITICAL: Puppet has 210 failures. Last run 3 minutes ago with 210 failures. Failed resources (up to 3 shown)
[07:16:47] Ah great, hopefully just Icinga on drugs
[07:17:37] * RhinosF1 waits for the next run
[07:23:19] RECOVERY - lizardfs2 Puppet on lizardfs2 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[07:23:19] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[07:23:46] RECOVERY - bacula1 Puppet on bacula1 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[07:23:47] RECOVERY - misc3 Puppet on misc3 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[07:23:56] Here it comes
[07:24:15] RECOVERY - lizardfs3 Puppet on lizardfs3 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[07:24:18] RECOVERY - db5 Puppet on db5 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:24:19] RECOVERY - db4 Puppet on db4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:24:19] RECOVERY - lizardfs1 Puppet on lizardfs1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:24:24] RECOVERY - puppet1 Puppet on puppet1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:24:31] RECOVERY - mw3 Puppet on mw3 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[07:24:47] RECOVERY - mw1 Puppet on mw1 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[07:25:08] RECOVERY - cp4 Puppet on cp4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:25:16] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:25:17] RECOVERY - misc4 Puppet on misc4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:25:20] RECOVERY - misc2 Puppet on misc2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:25:31] RECOVERY - cp2 Puppet on cp2 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[07:25:35] RECOVERY - misc1 Puppet on misc1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:25:52] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[07:26:00] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:26:15] All back
[11:12:49] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjNUA
[11:12:51] [02miraheze/services] 07MirahezeSSLBot 03350c00d - BOT: Updating services config for wikis
[13:35:35] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.82, 5.58, 4.33
[13:37:35] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 5.55, 5.42, 4.41
[13:41:34] PROBLEM - misc1 webmail.miraheze.org HTTPS on misc1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:41:36] Looks like we're down
[13:41:43] But I have no access to check for a long time
[13:41:46] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:42:03] PROBLEM - cp2 Stunnel Http for mw1 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:42:07] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:42:15] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:42:20] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 66%
[13:42:29] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:42:33] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:42:46] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:42:50] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:42:59] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is CRITICAL: CRITICAL - NGINX Error Rate is 74%
[13:43:01] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[13:43:02] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:43:13] PROBLEM - cp2 Stunnel Http for mw2 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:43:13] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:44:02] RECOVERY - cp2 Stunnel Http for mw1 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.394 second response time
[13:44:05] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.005 second response time
[13:44:10] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24500 bytes in 0.006 second response time
[13:44:32] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.687 second response time
[13:44:44] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 0.638 second response time
[13:44:58] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 27%
[13:45:08] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.682 second response time
[13:45:09] RECOVERY - cp2 Stunnel Http for mw2 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.393 second response time
[13:45:33] PROBLEM - misc4 phabricator.miraheze.org HTTPS on misc4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 4221 bytes in 0.033 second response time
[13:46:50] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[13:47:02] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[13:47:33] RECOVERY - misc4 phabricator.miraheze.org HTTPS on misc4 is OK: HTTP OK: HTTP/1.1 200 OK - 19073 bytes in 0.140 second response time
[13:49:26] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:49:34] PROBLEM - cp2 Stunnel Http for mw2 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:49:46] PROBLEM - cp4 Stunnel Http for mw2 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:49:48] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:50:24] PROBLEM - cp2 Stunnel Http for mw1 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:50:26] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:50:28] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:50:50] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:50:58] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is CRITICAL: CRITICAL - NGINX Error Rate is 72%
[13:51:02] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[13:52:20] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 52%
[13:52:58] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 57%
[13:54:09] PROBLEM - misc1 icinga.miraheze.org HTTPS on misc1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:54:20] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 87%
[13:55:15] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:55:18] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:56:59] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 37%
[13:57:17] https://www.irccloud.com/pastebin/StlJPuFI
[13:57:18] [ Snippet | IRCCloud ] - www.irccloud.com
[13:57:23] Reception123: ^
[13:58:42] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 6.474 second response time
[13:58:45] Icinga is dead as well
[13:58:51] Well, there's nothing I can do
[13:58:57] Won't have access for a long while
[13:59:22] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 2.817 second response time
[14:00:33] I emailed John, nothing more I can do :(
[14:00:35] Paladox: ^
[14:00:45] Although you're on holiday as well
[14:00:58] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 43%
[14:01:00] SPF|Cloud: ^
[14:02:59] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is CRITICAL: CRITICAL - NGINX Error Rate is 77%
[14:03:03] RECOVERY - cp2 Stunnel Http for mw1 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.393 second response time
[14:03:04] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:03:08] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.010 second response time
[14:03:09] JohnLewis: We're down and there are no reported RamNode issues
[14:03:17] Phab + Icinga also down
[14:03:44] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:04:00] Grafana is up
[14:04:23] yeah, cleared binlogs
[14:04:27] It's the db
[14:04:28] Ran out of space
[14:04:38] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.392 second response time
[14:04:45] RECOVERY - misc1 icinga.miraheze.org HTTPS on misc1 is OK: HTTP OK: HTTP/1.1 302 Found - 341 bytes in 0.010 second response time
[14:04:58] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.012 second response time
[14:04:59] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 51%
[14:05:42] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.646 second response time
[14:05:42] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.686 second response time
[14:05:46] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[14:05:52] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.685 second response time
[14:06:02] PROBLEM - db4 Disk Space on db4 is WARNING: DISK WARNING - free space: / 22840 MB (6% inode=96%);
[14:06:07] RECOVERY - misc1 webmail.miraheze.org HTTPS on misc1 is OK: HTTP OK: Status line output matched "HTTP/1.1 401 Unauthorized" - 5799 bytes in 0.036 second response time
[14:06:07] RECOVERY - cp4 Stunnel Http for mw2 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 0.036 second response time
[14:06:14] RECOVERY - cp2 Stunnel Http for mw2 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.392 second response time
[14:06:20] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 7%
[14:06:28] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[14:06:29] Thanks JohnLewis!
[14:06:50] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy [14:06:58] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 8% [14:07:01] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [14:07:02] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy [14:17:23] JohnLewis: thanks [14:17:33] No idea how it ran out of space it was at 21GB last I checked [14:17:40] Thanks [14:18:17] do we have any imports active? [14:18:43] I think so [14:19:08] imports decrease space quickly :D [14:19:39] Someone needs to check the screens on test1 :) [14:20:47] Definitely not me, I finished all mine when it was 21gb [14:22:12] paladox: https://phabricator.miraheze.org/project/view/36/ - that's you [14:22:13] [ Import · Workboard ] - phabricator.miraheze.org [14:22:36] An image import and one that failed is open - both yours [14:23:47] Yup [16:56:02] PROBLEM - db4 Disk Space on db4 is CRITICAL: DISK CRITICAL - free space: / 21970 MB (5% inode=96%); [17:36:02] PROBLEM - db4 Disk Space on db4 is WARNING: DISK WARNING - free space: / 22804 MB (6% inode=96%); [18:25:47] PROBLEM - misc3 Puppet on misc3 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[lizardfs-cgiserv] [18:33:47] RECOVERY - misc3 Puppet on misc3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:41:54] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2650 MB (10% inode=94%); [20:34:02] PROBLEM - db4 Disk Space on db4 is CRITICAL: DISK CRITICAL - free space: / 21969 MB (5% inode=96%);
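
The fix noted above ("cleared binlogs") refers to purging MariaDB binary logs once the database server's disk filled; db4 is the host whose disk-space alerts appear throughout the log. As a rough illustration of that kind of remediation, a minimal Python sketch follows. It assumes the PyMySQL client library, placeholder credentials, a 10% free-space threshold and a 7-day retention window; none of these values come from the log, and this is not the command JohnLewis actually ran.

import shutil
import pymysql  # assumed client library, not part of the original log


DISK_PATH = "/"            # partition covered by the db4 disk-space check
MIN_FREE_FRACTION = 0.10   # illustrative threshold, not from the log
RETENTION_DAYS = 7         # illustrative retention window, not from the log


def free_fraction(path: str) -> float:
    """Return the fraction of free space on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def purge_old_binlogs(conn, days: int) -> None:
    """Ask MariaDB/MySQL to delete binary logs older than `days` days.

    PURGE BINARY LOGS is standard MySQL/MariaDB syntax. Replicas that still
    need the purged logs would break, so their positions should be checked first.
    """
    with conn.cursor() as cur:
        cur.execute("PURGE BINARY LOGS BEFORE NOW() - INTERVAL %s DAY", (days,))


if __name__ == "__main__":
    if free_fraction(DISK_PATH) < MIN_FREE_FRACTION:
        # Placeholder connection details for illustration only.
        conn = pymysql.connect(host="localhost", user="root", password="", database="mysql")
        try:
            purge_old_binlogs(conn, RETENTION_DAYS)
        finally:
            conn.close()

A longer-term mitigation is usually to cap retention on the server itself (expire_logs_days on MariaDB and older MySQL, binlog_expire_logs_seconds on MySQL 8.0) so old binlogs are rotated out before the disk fills.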