[00:00:05] RECOVERY - Host bacula1 is UP: PING OK - Packet loss = 61%, RTA = 96.71 ms
[00:01:10] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[00:01:48] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 4 datacenters are down: 2604:180:0:33b::2/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[00:03:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[00:03:20] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.26, 7.90, 7.73
[00:03:47] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[00:05:20] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 9.71, 8.55, 7.97
[00:07:33] !log reinstalling bacula1 with debian 10
[00:07:39] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:09:56] !log depool mw2
[00:10:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:13:09] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 10.02, 7.41, 5.29
[00:13:18] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 1.44, 5.57, 7.19
[00:15:16] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 0.56, 3.92, 6.40
[00:17:03] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.98, 7.72, 5.89
[00:18:04] !log repool mw2
[00:18:09] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:19:00] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.81, 6.71, 5.75
[00:20:52] !log depool mw2
[00:21:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:21:16] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.58, 5.89, 6.23
[00:23:14] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 2.47, 4.54, 5.69
[00:23:31] !log repool mw2
[00:23:38] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[00:24:47] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 5.52, 7.06, 6.21
[00:25:24] PROBLEM - mw2 Puppet on mw2 is WARNING: WARNING: Puppet is currently disabled, message: paladox, last run 2 minutes ago with 0 failures
[00:26:44] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.65, 6.30, 6.03
[00:39:14] RECOVERY - mw2 Puppet on mw2 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[00:44:12] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.14, 5.82, 5.66
[00:46:12] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.95, 6.80, 6.02
[00:48:12] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 3.38, 5.67, 5.72
[00:54:35] PROBLEM - Host bacula1 is DOWN: PING CRITICAL - Packet loss = 100%
[00:56:19] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 10.93, 8.44, 6.83
[01:11:29] RECOVERY - Host bacula1 is UP: PING OK - Packet loss = 0%, RTA = 94.79 ms
[01:11:50] RECOVERY - bacula1 SSH on bacula1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10 (protocol 2.0)
[01:16:24] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.07, 7.50, 7.73
[01:22:21] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.18, 7.31, 7.51
[01:26:19] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.03, 7.35, 7.47
[01:32:01] PROBLEM - bacula1 SSH on bacula1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:32:19] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.32, 7.45, 7.39
[01:33:17] PROBLEM - Host bacula1 is DOWN: PING CRITICAL - Packet loss = 100%
[01:35:19] RECOVERY - Host bacula1 is UP: PING OK - Packet loss = 0%, RTA = 94.75 ms
[01:38:13] PROBLEM - bacula1 SSH on bacula1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:38:18] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.33, 7.55, 7.53
[01:40:11] RECOVERY - bacula1 SSH on bacula1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10 (protocol 2.0)
[01:44:15] PROBLEM - bacula1 SSH on bacula1 is CRITICAL: connect to address 172.245.38.205 and port 22: Connection refused
[01:46:13] RECOVERY - bacula1 SSH on bacula1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10 (protocol 2.0)
[01:52:21] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.60, 7.42, 7.22
[01:52:35] PROBLEM - bacula1 SSH on bacula1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:56:22] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.54, 7.06, 7.10
[01:56:31] RECOVERY - bacula1 SSH on bacula1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10 (protocol 2.0)
[01:58:21] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 5.04, 6.19, 6.76
[02:10:17] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 9.16, 7.44, 6.84
[02:10:54] PROBLEM - bacula1 SSH on bacula1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:12:15] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.92, 7.28, 6.84
[02:12:20] PROBLEM - Host bacula1 is DOWN: PING CRITICAL - Packet loss = 100%
[02:16:17] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 9.12, 7.57, 6.97
[02:18:16] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.57, 6.87, 6.77
[02:19:20] RECOVERY - Host bacula1 is UP: PING OK - Packet loss = 0%, RTA = 94.69 ms
[02:20:15] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.63, 7.28, 6.91
[02:22:14] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 7.04, 7.34, 6.99
[02:23:19] RECOVERY - bacula1 SSH on bacula1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10 (protocol 2.0)
[02:24:13] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 10.51, 8.52, 7.46
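The "Current Load" alerts above come from a standard load check: the three numbers are the 1-, 5- and 15-minute load averages, and the state flaps between WARNING and CRITICAL as the short-term average crosses the configured thresholds. A minimal sketch of such a check, run by hand on an mw host; the plugin path is the Debian monitoring-plugins default and the threshold triples are assumptions inferred from where the alerts above flip (warn near 7, crit near 8):

```sh
# Hedged sketch of the "Current Load" check. -w/-c take 1-, 5- and 15-minute
# thresholds; the values shown are assumptions, not the real configuration.
/usr/lib/nagios/plugins/check_load -w 7,7,7 -c 8,8,8
# Example output: CRITICAL - load average: 9.71, 8.55, 7.97
```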
[02:46:20] RECOVERY - bacula1 Current Load on bacula1 is OK: OK - load average: 1.55, 0.75, 0.32
[02:46:49] RECOVERY - bacula1 Disk Space on bacula1 is OK: DISK OK - free space: / 474826 MB (99% inode=99%);
[02:46:54] PROBLEM - bacula1 Puppet on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:46:54] PROBLEM - bacula1 Bacula Static on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:47:11] PROBLEM - bacula1 Bacula Databases db4 on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:47:19] PROBLEM - bacula1 Bacula Databases db5 on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:47:21] PROBLEM - bacula1 Bacula Phabricator Static on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:47:31] PROBLEM - bacula1 Bacula Private Git on bacula1 is UNKNOWN: NRPE: Unable to read output
[02:49:06] RECOVERY - bacula1 Bacula Daemon on bacula1 is OK: PROCS OK: 2 processes with UID = 116 (bacula)
[02:51:04] PROBLEM - bacula1 Bacula Databases db4 on bacula1 is WARNING: WARNING: Full, 764611 files, 70.58GB, 2019-08-04 02:46:00 (4.0 weeks ago)
[02:51:18] PROBLEM - bacula1 Bacula Static on bacula1 is CRITICAL: CRITICAL: Timeout or unknown client: lizardfs1-fd
[02:51:20] PROBLEM - bacula1 Bacula Databases db5 on bacula1 is WARNING: WARNING: Full, 422 files, 19.13GB, 2019-08-10 13:18:00 (3.1 weeks ago)
[02:51:21] PROBLEM - bacula1 Bacula Phabricator Static on bacula1 is WARNING: WARNING: Full, 80402 files, 2.603GB, 2019-08-10 12:53:00 (3.1 weeks ago)
[02:51:32] PROBLEM - bacula1 Bacula Private Git on bacula1 is CRITICAL: CRITICAL: Full, 4137 files, 8.488MB, 2019-08-11 00:05:00 (3.0 weeks ago)
[02:54:49] RECOVERY - bacula1 Puppet on bacula1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[02:59:32] RECOVERY - bacula1 Bacula Private Git on bacula1 is OK: OK: Full, 4176 files, 8.627MB, 2019-09-01 02:57:00 (1.0 hours ago)
[03:07:26] RECOVERY - bacula1 Bacula Phabricator Static on bacula1 is OK: OK: Full, 83402 files, 2.649GB, 2019-09-01 03:05:00 (1.0 hours ago)
[03:30:26] PROBLEM - sahitya.shaunak.in - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'sahitya.shaunak.in' expires in 15 day(s) (Tue 17 Sep 2019 03:26:30 AM GMT +0000).
[03:30:39] [02miraheze/ssl] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjpOw
[03:30:41] [02miraheze/ssl] 07MirahezeSSLBot 0323ac4b3 - Bot: Update SSL cert for sahitya.shaunak.in
[03:32:26] RECOVERY - sahitya.shaunak.in - LetsEncrypt on sslhost is OK: OK - Certificate 'sahitya.shaunak.in' will expire on Sat 30 Nov 2019 02:30:33 AM GMT +0000.
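The sequence above is the automated renewal loop: the sslhost check notices a certificate inside its warning window, the SSL bot reissues and commits the new cert, and the check recovers. A rough sketch of the same expiry test using only stock openssl; the hostname is taken from the alert, and the 15-day warning window is whatever the check is configured with, not something shown in the log:

```sh
# Sketch of a certificate-expiry probe like the sslhost check above:
# fetch the served certificate and print its notAfter date.
echo | openssl s_client -servername sahitya.shaunak.in \
    -connect sahitya.shaunak.in:443 2>/dev/null \
  | openssl x509 -noout -enddate
# notAfter=Nov 30 02:30:33 2019 GMT  (matches the RECOVERY line above)
```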
[03:53:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 107.191.126.23/cpweb
[03:53:09] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb
[03:53:19] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2
[03:54:36] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 3.11, 5.54, 7.97
[03:55:19] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[03:55:59] PROBLEM - misc3 Puppet on misc3 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 1 minute ago with 1 failures. Failed resources (up to 3 shown): File[/opt/vpncloud_1.0.0_amd64.deb]
[03:57:04] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[03:57:06] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[03:58:38] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CRITICAL - load average: 8.52, 6.90, 7.92
[04:00:37] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.40, 6.73, 7.73
[04:03:43] RECOVERY - misc3 Puppet on misc3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:10:32] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 5.36, 5.52, 6.68
[05:50:27] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 9.60, 6.63, 4.85
[05:52:27] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.14, 6.45, 5.02
[06:04:27] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 9.54, 7.79, 6.18
[06:06:27] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.68, 7.43, 6.25
[06:06:46] RECOVERY - bacula1 Bacula Databases db4 on bacula1 is OK: OK: Full, 791788 files, 77.17GB, 2019-09-01 06:06:00 (1.0 hours ago)
[06:14:27] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.08, 7.89, 6.88
[06:16:27] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.56, 7.31, 6.78
[06:20:27] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 5.86, 6.52, 6.58
[06:25:27] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 2689 MB (11% inode=94%);
[06:59:20] RECOVERY - bacula1 Bacula Databases db5 on bacula1 is OK: OK: Full, 1024 files, 44.21GB, 2019-09-01 06:57:00 (1.0 hours ago)
[07:51:17] Reception123:
[07:51:35] Reception123: check incinga
[07:52:36] No access atm
[07:52:49] Reception123: k
[07:53:15] Lfs1 is down but might just not have been removed
[07:53:24] And puppet is dead on test1
[07:54:01] Although it is the happiest icinga-miraheze has ever been
[09:46:01] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjpGJ
[09:46:02] [02miraheze/puppet] 07paladox 03ab1ca0e - Update nrpe.cfg.erb
[09:47:33] paladox: ^^^ see if you can correct icinga-miraheze
[09:47:40] See a few messages up
[09:53:01] PROBLEM - bacula1 Bacula Static on bacula1 is WARNING: WARNING: Full, 4580627 files, 388.1GB, 2019-08-10 08:02:00 (3.2 weeks ago)
[09:53:12] I’m mobile only
[10:24:15] PROBLEM - mw2 Current Load on mw2 is WARNING: WARNING - load average: 6.85, 6.41, 5.54
[10:26:14] RECOVERY - mw2 Current Load on mw2 is OK: OK - load average: 4.56, 6.00, 5.51
[10:33:36] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures
[10:45:24] Well puppet is okay so just lizardfs now
[10:49:33] lizardfs is fine to me?
[12:04:19] PROBLEM - www.reviwiki.info - PositiveSSLDV on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:04:39] PROBLEM - cp4 Stunnel Http for misc3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:04:44] PROBLEM - commons.gyaanipedia.co.in - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:04:53] PROBLEM - cp4 Stunnel Http for mw2 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:04:56] PROBLEM - cp4 SSH on cp4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:04:57] PROBLEM - cp4 Disk Space on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:05:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:05:09] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:05:09] PROBLEM - netazar.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:05:15] PROBLEM - cp4 HTTPS on cp4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:05:15] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[12:05:31] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:05:40] PROBLEM - cp4 Current Load on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:05:41] PROBLEM - guiasdobrasil.com.br - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:05:51] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:05:58] PROBLEM - enc.for.uz - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:06:12] PROBLEM - cp4 Stunnel Http for misc2 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:06:14] RECOVERY - www.reviwiki.info - PositiveSSLDV on sslhost is OK: OK - Certificate 'reviwiki.info' will expire on Wed 03 Feb 2021 11:59:59 PM GMT +0000.
[12:06:15] PROBLEM - cp4 Stunnel Http for test1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:06:19] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:06:20] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[12:06:43] RECOVERY - commons.gyaanipedia.co.in - LetsEncrypt on sslhost is OK: OK - Certificate 'en.gyaanipedia.co.in' will expire on Thu 31 Oct 2019 03:14:12 PM GMT +0000.
[12:06:53] PROBLEM - Host cp4 is DOWN: PING CRITICAL - Packet loss = 100%
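Every "CHECK_NRPE STATE CRITICAL" line above is the Icinga server timing out while asking the NRPE agent on cp4 to run a local check, which is why all of a host's service checks go dark at once before the host itself is marked DOWN. A hand-run equivalent, assuming the Debian plugin path; the command name is an illustrative assumption, since the real NRPE command names on cp4 are not shown in the log:

```sh
# Sketch: ask cp4's NRPE agent to run one check, with the same 10 s timeout
# the alerts above show. The -c command name is a hypothetical placeholder.
/usr/lib/nagios/plugins/check_nrpe -H cp4.miraheze.org -c check_disk -t 10
# Against an unreachable host this prints:
# CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
```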
[12:08:15] paladox: recovered
[12:08:29] Hmm
[12:09:09] RECOVERY - netazar.org - LetsEncrypt on sslhost is OK: OK - Certificate 'www.netazar.org' will expire on Mon 07 Oct 2019 06:46:45 PM GMT +0000.
[12:09:28] paladox: looked like Icinga hadn't removed 1-3 properly
[12:09:52] Or they weren't connected
[12:09:57] It was on https://icinga.miraheze.org/monitoring/service/show?host=bacula1&service=bacula1%20Bacula%20Static
[12:09:58] [ Icinga Web 2 Login ] - icinga.miraheze.org
[12:10:16] And cp4 is fkwn
[12:10:18] Down
[12:11:11] paladox, Reception123, SPF|Cloud: cp4 down
[12:11:55] * SPF|Cloud looks
[12:12:21] Thx
[12:12:36] suspended :/
[12:13:58] :/
[12:15:59] paladox: thanks for opening the ticket
[12:16:20] SpF|Cloud: I’ve opened a ticket
[12:16:21] Not sure if they will give us additional bandwith
[12:16:22] As this is the 2nd time in the last few weeks
[12:17:36] Your welcome :)
[12:18:21] RECOVERY - enc.for.uz - LetsEncrypt on sslhost is OK: OK - Certificate 'enc.for.uz' will expire on Wed 13 Nov 2019 01:50:42 PM GMT +0000.
[12:22:18] paladox: any plans to increase bandwidth?
[12:22:43] PROBLEM - enc.for.uz - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:26:33] Not that I’m aware off
[12:26:34] *of
[12:35:04] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[12:35:05] RECOVERY - enc.for.uz - LetsEncrypt on sslhost is OK: OK - Certificate 'enc.for.uz' will expire on Wed 13 Nov 2019 01:50:42 PM GMT +0000.
[12:35:18] RECOVERY - guiasdobrasil.com.br - LetsEncrypt on sslhost is OK: OK - Certificate 'guiasdobrasil.com.br' will expire on Sat 16 Nov 2019 01:37:57 PM GMT +0000.
[12:35:40] RECOVERY - Host cp4 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms
[12:36:08] RECOVERY - cp4 Stunnel Http for mw2 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.009 second response time
[12:36:10] RECOVERY - cp4 Puppet on cp4 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[12:36:10] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.004 second response time
[12:36:20] RECOVERY - cp4 HTTPS on cp4 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1502 bytes in 0.016 second response time
[12:36:24] SpF|Cloud: ^
[12:36:27] RECOVERY - cp4 SSH on cp4 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u6 (protocol 2.0)
[12:36:52] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[12:56:29] paladox: still around?
[12:57:51] Yes, but mobile
[12:57:52] Reception123: ^
[12:59:41] paladox: what do we do about cp4 then?
[13:03:12] I’m not sure? Nothing really
[13:03:13] Traffic grows
[13:03:24] paladox: so an upgrade?
[13:03:48] Well we are gonna have to upgrade, yeh. To the cloud.
[13:31:22] paladox: so when is that planned?
[13:36:18] I’m not sure
[14:03:02] IRC RELAY BOT NOTICE The irc bot may go down near the end of the month every month this is due to the limitation of using a free hosting provider, the bot should automatically reconnect on the first of the next month. Thank you for understanding, Zppix IRC Relay bot maintainer
[14:04:05] ack
[14:04:51] Ive explained it before but people must of missed it before so i figure i’d just reiterate
[14:05:11] @Zppix yeah, best to remind people
[14:05:51] Reception123, are you able to pin messages to discord? If so maybe you could pin that message to #irc-relay
[14:06:31] good idea, will do
[14:06:48] Thanks
[14:07:19] done
[14:08:11] 👍
[14:38:31] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 10.53, 8.07, 5.58
[14:44:27] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 3.57, 7.40, 6.39
[14:46:27] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 3.94, 6.25, 6.08
[14:54:38] !log sudo service php7.2-fpm restart on mw*
[14:54:43] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:11:51] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2649 MB (10% inode=94%);
[15:12:11] [02miraheze/mediawiki] 07paladox pushed 031 commit to 03REL1_33 [+0/-0/±1] 13https://git.io/fjpCI
[15:12:12] [02miraheze/mediawiki] 07paladox 0320b12f1 - Update PortableInfobox to 6de48189ba3949bd446201f44cb241a705d02fc1
[15:12:38] [02miraheze/mediawiki] 07paladox pushed 031 commit to 03REL1_33 [+0/-0/±1] 13https://git.io/fjpCL
[15:12:39] [02miraheze/mediawiki] 07paladox 03182198d - Update PF
[15:13:47] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 2652 MB (11% inode=94%);
[15:14:56] !log add valkyrienskieswiki to discord webhooks
[15:15:02] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[15:19:42] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2649 MB (10% inode=94%);
[15:20:03] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjpCZ
[15:20:05] [02miraheze/puppet] 07paladox 03967ca28 - Update mediawiki-includes.conf.erb
[15:42:14] PROBLEM - test1 Puppet on test1 is WARNING: WARNING: Puppet is currently disabled, message: paladox, last run 8 minutes ago with 0 failures
[15:45:08] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjpC1
[15:45:10] [02miraheze/services] 07MirahezeSSLBot 030efb05f - BOT: Updating services config for wikis
[15:46:37] Hello Zabelaf! If you have any questions feel free to ask and someone should answer soon.
[15:47:29] Anyone else having an issue with the visual editor? Every time I try to use it a message pops up saying:
[15:47:32] Error loading data from server: apierror-visualeditor-docserver-http: HTTP 404. Would you like to retry?
[15:49:12] Zabelaf: hi. If you just enabled it, please wait 10-15 mins
[15:50:00] OK, thanks a lot!
[16:16:49] paladox, Reception123: why's puppet off on test1?
[16:20:07] i'm testing something.
[16:22:10] Ok
[16:39:06] [02miraheze/puppet] 07Southparkfan pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjpWa
[16:39:08] [02miraheze/puppet] 07Southparkfan 035756aec - Add new SSH key
[17:35:08] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjplB
[17:35:09] [02miraheze/services] 07MirahezeSSLBot 03bbd8b0d - BOT: Updating services config for wikis
[18:16:14] RECOVERY - test1 Puppet on test1 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[18:53:08] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjp8V
[18:53:10] [02miraheze/puppet] 07paladox 0329dcc49 - matomo: Up cron to every 12hrs rather then 6hrs The script takes more then 6 hours, so there is little time for the server to rest before the script is ran again.
[19:09:18] [02puppet] 07JohnFLewis commented on commit 0329dcc49a277ea26c471f9bd3663aed15e4724c38 - 13https://git.io/fjp8H
[19:10:03] [02puppet] 07paladox commented on commit 0329dcc49a277ea26c471f9bd3663aed15e4724c38 - 13https://git.io/fjp87
[19:10:35] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/fjp8d
[19:10:36] [02miraheze/puppet] 07paladox 031c24e1c - matomo: Only run the cron every 24h
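The two commits above move the Matomo archiving job from every 6 hours to every 12 and then to every 24, since a single run takes more than 6 hours and back-to-back runs leave the server no idle time. The puppet diff itself isn't shown in the log; as a sketch, the end state expressed as a plain crontab entry, where the command path, run time and www-data user are assumptions (core:archive is Matomo's standard console archiver):

```sh
# Hypothetical crontab equivalent of commit 1c24e1c: run the Matomo archiver
# once every 24 hours. Paths, schedule and user are assumptions.
0 2 * * * www-data /usr/bin/php /srv/matomo/console core:archive >> /var/log/matomo-archive.log 2>&1
```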
[19:15:10] paladox, JohnLewis: https://phabricator.miraheze.org/T4554
[19:15:11] [ ⚓ T4554 Extension:StopForumSpam ] - phabricator.miraheze.org
[19:25:04] hmm
[20:18:30] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 10.79, 7.45, 5.34
[20:22:27] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 5.46, 7.32, 5.83
[20:24:27] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.13, 6.24, 5.61
[22:02:53] PROBLEM - cp4 Stunnel Http for mw2 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:02:55] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:03:07] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:03:11] PROBLEM - cp2 Stunnel Http for mw1 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:03:13] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:03:17] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[22:03:19] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mw3
[22:03:27] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:03:37] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 81.4.109.133/cpweb, 2a00:d880:5:8ea::ebc7/cpweb
[22:03:52] PROBLEM - cp4 Stunnel Http for mw3 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:03:53] PROBLEM - cp4 Stunnel Http for mw1 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:03:59] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 54%
[22:04:11] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:04:12] PROBLEM - cp2 Stunnel Http for mw2 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:04:18] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 63%
[22:04:18] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:04:19] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:04:49] RECOVERY - cp4 Stunnel Http for mw2 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.109 second response time
[22:04:52] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.401 second response time
[22:05:22] PROBLEM - misc3 Current Load on misc3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:05:24] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 442 bytes in 0.007 second response time
[22:05:58] RECOVERY - cp4 Stunnel Http for mw1 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 9.866 second response time
[22:06:00] PROBLEM - misc3 Puppet on misc3 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 seconds ago with 1 failures. Failed resources (up to 3 shown): File[wildcard.miraheze.org]
[22:06:06] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24522 bytes in 0.677 second response time
[22:06:07] RECOVERY - cp2 Stunnel Http for mw2 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24500 bytes in 0.942 second response time
[22:06:13] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 46%
[22:06:14] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.685 second response time
[22:06:15] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.645 second response time
[22:06:22] paladox, Reception123, SPF|Cloud: Error 503 Backend fetch failed, forwarded for 86.160.232.254, 127.0.0.1
[22:06:22] (Varnish XID 55673791) via cp4 at Sun, 01 Sep 2019 22:05:58 GMT.
[22:06:59] Works for Metacity
[22:07:01] *me
[22:07:02] Also I’m mobile
[22:07:03] RECOVERY - cp4 Varnish Backends on cp4 is OK: All 5 backends are healthy
[22:07:13] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 5 backends are healthy
[22:07:14] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[22:07:14] RECOVERY - cp2 Stunnel Http for mw1 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24516 bytes in 0.394 second response time
[22:07:16] RECOVERY - misc3 Current Load on misc3 is OK: OK - load average: 1.33, 1.94, 1.22
[22:07:19] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 5 backends are healthy
[22:07:23] paladox: ok and same now, looks like one of them short bursts
[22:07:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[22:07:39] someone pls look into them
[22:07:52] icinga-miraheze: we got u, everything is coming back
[22:07:54] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 3%
[22:07:56] RECOVERY - cp4 Stunnel Http for mw3 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24500 bytes in 0.374 second response time
[22:08:09] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 3%
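During a burst like the one above, the "Varnish Backends" check counts backends the cache proxy considers sick, and the error-rate check watches the share of 4xx/5xx responses on the NGINX TLS terminator in front of Varnish. Two stock Varnish commands for inspecting the same state by hand on a cp host, as a sketch; no Miraheze-specific names are assumed beyond the backend names already in the alerts:

```sh
# List backend health as Varnish sees it; "sick" entries here correspond to
# the "3 backends are down. mw1 mw2 mw3" alerts above.
varnishadm backend.list
# Show only transactions where the backend answered 5xx, e.g. the 503
# "Backend fetch failed" responses users reported.
varnishlog -q 'BerespStatus >= 500'
```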
[22:08:43] idk if you guys know, but this 503 backend fetch failed is not just happening today, I get it at arbitrary times pretty often
[22:10:12] @Rubyjunk we know, luckily they only seem to last a few mins for us.
[22:10:30] yes but that's why I wanted to put up a cloudflare
[22:10:45] I think for the time being I'm going to enable cloudflare and disable IP editing
[22:11:29] @Rubyjunk hmm, @NDKilla, paladox: why doesn't things work with cloudflare
[22:12:03] well the DDoS and whatever else from custom domains page it says to disable
[22:13:09] Yea it does, but it works completely fine from what I can tell besides screwing up IP editing
[22:13:44] RECOVERY - misc3 Puppet on misc3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[22:14:02] okay, hmm, That might be why we don't like it because of CC-BY-SA rules and making CU checks etc. harder
[22:14:24] Well, there are instructions on mediawiki.net for fixing that
[22:14:42] Regardless i think we block cloudflare ips from editing as imho its violation of meta.miraheze.org/wiki/NOP
[22:14:45] idk if they are applicable to your systems or not though
[22:15:08] yeah, NOP should then come into play
[22:15:13] @Zppix there are systems where cloudflare forwards the source IP to the wiki
[22:15:19] therefore removing this limitation
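The mechanism referred to here is Cloudflare passing the original client address in a CF-Connecting-IP request header, which the origin can choose to trust for Cloudflare's published ranges. A sketch of the nginx side of that, written as a shell snippet that drops a config fragment in place; the file path is an assumption, and a real setup must list all of Cloudflare's published ranges, not just the one shown:

```sh
# Hypothetical sketch: have nginx restore the real client IP from Cloudflare's
# CF-Connecting-IP header (ngx_http_realip_module). Config path is an
# assumption; only one Cloudflare IPv4 range is shown for brevity.
cat > /etc/nginx/conf.d/cloudflare-realip.conf <<'EOF'
set_real_ip_from 173.245.48.0/20;   # one Cloudflare range; list them all
real_ip_header CF-Connecting-IP;
EOF
nginx -t && systemctl reload nginx
```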
[22:15:33] Regardless they are still open proxies technically
[22:15:57] "open proxies may not be used for anonymous editing of Miraheze sites"
[22:16:07] It's not anonymous if the source IP is forwarded imo
[22:16:30] Anonymous editing is when you edit without an account
[22:16:48] If the ip that is editing is a open proxy it is in violation of the policy and will be. Locmed
[22:16:52] Blocked*
[22:16:52] It feels like a technicality to be honest
[22:17:06] See
[22:17:07] https://meta.miraheze.org/wiki/Custom_domains/Cloudflare
[22:17:08] [ Custom domains/Cloudflare - Miraheze Meta ] - meta.miraheze.org
[22:17:08] The functionality is equivalent
[22:17:13] I dont make the policy I just enforce it
[22:17:34] paladox, I've read it, but it doesn't really explain why it doesn't work
[22:17:57] you guys already setup letsencrypt, I don't even need a certificate request
[22:18:45] LE was done per request in https://phabricator.miraheze.org/T4685
[22:18:46] [ ⚓ T4685 Custom Domain ] - phabricator.miraheze.org
[22:18:56] indeed it was
[22:19:03] I can enable it right now without requesting anything
[22:19:15] LE, however doesnt spoof ips
[22:19:15] like, you guys don't really need to do anything
[22:19:21] We doint setup LE automatically. Needs a request to do it for a domain.
[22:19:34] My point is you don't need to setup cloudflare manually
[22:21:10] I’m not sure of the reason of why you carnt have cloudflare enabled.
[22:21:38] I just disabled anonymous editing and enabled cloudflare, so hopefully that's in compliance
[22:22:10] But then there’s no ssl cert our end, so how is cloudflare connecting securily to our end?
[22:22:29] Cloudflare connects using letsencrypt
[22:22:42] That’s the frontend
[22:22:45] Not backend
[22:22:47] User >> Cloudflare (using cloudflare cert) >> Miraheze (using letsencrypt)
[22:22:57] Nope it dosen’t
[22:23:07] Yea it does
[22:23:29] "Your origin has a valid certificate (not expired and signed by a trusted CA or Cloudflare Origin CA) installed. Cloudflare will connect over HTTPS and verify the cert on each request."
[22:24:28] well nothing has complained about it yet
[22:24:44] cloudflare or 503 backend fetch failed?
[22:25:12] https://support.cloudflare.com/hc/en-us/articles/200170416-End-to-end-HTTPS-with-Cloudflare-Part-3-SSL-options#h_845b3d60-9a03-4db0-8de6-20edc5b11057
[22:25:13] [ End-to-end HTTPS with Cloudflare - Part 3: SSL options – Cloudflare Support ] - support.cloudflare.com
[22:25:43] cloudflare on your wiki, it's still up and I don't see any icinga alert
[22:25:50] Yes, what you linked to explains how cloudflare securely connects to your origin
[22:26:04] "Full(strict) ensures a secure connection between both the visitor and your Cloudflare domain and between Cloudflare and your origin web server"
[22:26:21] @RhinosF1 Cloudflare is up and running, I just connected through it
[22:26:45] I can see that, I just looked at your wiki
[22:26:53] But that carn’t be possible
[22:26:54] What ssl option did you select?
[22:26:55] Flex, full, full (strict)?
[22:27:01] It's on full right now
[22:27:05] Though I tested it with strict earlier
[22:27:23] So it’s not verifying the cert
[22:27:28] Nope
[22:27:32] One second, I'll change it for you
[22:28:05] It's now on full strict
[22:28:12] Flex causes redirect loops
[22:28:35] since you redirect https to http
[22:28:41] *http to https
[22:28:48] What’s your domain?
[22:29:00] https://wiki.valkyrienskies.org/wiki/Special:Contributions/RhinosF1
[22:29:02] [ User contributions for RhinosF1 - Valkyrien Skies ] - wiki.valkyrienskies.org
[22:29:47] going to eat, I'll be back in a few
[22:30:31] https://cdn.discordapp.com/attachments/435711390544560128/617848779802017872/unknown.png
[22:30:32] these are my settings
[23:03:01] Ok
[23:14:41] there's nothing wrong with this..right?
[23:24:20] I’m not sure, appears to work.
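To make the distinction in the exchange above concrete: Full encrypts the Cloudflare-to-origin hop but accepts any certificate, while Full (strict) additionally verifies it, which is why strict only works once the origin serves a valid LetsEncrypt certificate for the custom domain. A sketch of that verification step as a one-off command; the origin address is a placeholder, since the proxied domain itself resolves to Cloudflare:

```sh
# Sketch of what Full (strict) checks on each request: connect straight to the
# origin and require a trusted certificate matching the hostname. ORIGIN_IP is
# a hypothetical placeholder for the Miraheze cache proxy serving the domain.
echo | openssl s_client -connect ORIGIN_IP:443 \
    -servername wiki.valkyrienskies.org -verify_return_error 2>/dev/null \
  | openssl x509 -noout -subject -issuer -enddate
```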