[00:15:58] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [00:22:00] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2400:6180:0:d0::403:f001/cpweb [00:25:18] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.98, 6.97, 6.19 [00:25:57] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [00:27:23] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:29:23] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24661 bytes in 1.971 second response time [00:31:13] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 5.32, 6.13, 6.28 [00:39:59] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.98, 6.96, 6.58 [00:41:59] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.11, 7.42, 6.85 [00:44:01] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.98, 7.77, 7.13 [00:46:58] !log depool mw3 [00:47:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [00:48:23] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.95, 4.92, 6.03 [00:49:21] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[00:51:27] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24662 bytes in 7.466 second response time [00:55:23] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-7 [+0/-0/±1] https://git.io/JeiSq [00:55:24] [miraheze/puppet] paladox e3aa47c - mediawiki: Make nginx::worker_processes use default [00:55:26] [puppet] paladox created branch paladox-patch-7 - https://git.io/vbiAS [00:55:27] [puppet] paladox opened pull request #1154: mediawiki: Make nginx::worker_processes use default - https://git.io/JeiSm [00:55:42] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-7 [+0/-0/±1] https://git.io/JeiSY [00:55:43] [miraheze/puppet] paladox b80dc06 - Update mw2.yaml [00:55:45] [puppet] paladox synchronize pull request #1154: mediawiki: Make nginx::worker_processes use default - https://git.io/JeiSm [00:55:54] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-7 [+0/-0/±1] https://git.io/JeiSO [00:55:55] [miraheze/puppet] paladox eee2125 - Update mw3.yaml [00:55:57] [puppet] paladox synchronize pull request #1154: mediawiki: Make nginx::worker_processes use default - https://git.io/JeiSm [00:56:04] [puppet] paladox closed pull request #1154: mediawiki: Make nginx::worker_processes use default - https://git.io/JeiSm [00:56:05] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±3] https://git.io/JeiS3 [00:56:07] [miraheze/puppet] paladox ff08caa - mediawiki: Make nginx::worker_processes use default (#1154) * mediawiki: Make nginx::worker_processes use default * Update mw2.yaml * Update mw3.yaml [01:06:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [01:11:44] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [01:18:18] RECOVERY - ns1 GDNSD Datacenters on ns1 is 
OK: OK - all datacenters are online [01:25:47] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 3 minutes ago with 12 failures. Failed resources (up to 3 shown): Package[openssh-server],Service[ssh],Package[exim4],Service[postfix] [01:33:51] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [01:49:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [01:53:24] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [02:06:27] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 7 failures. Last run 3 minutes ago with 7 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [02:12:33] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures [02:24:35] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[ops_ensure_members] [02:32:39] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [02:54:42] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. 
Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [03:05:26] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb [03:11:22] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [03:43:29] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [03:45:28] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb [03:47:30] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [03:55:44] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 3 minutes ago with 2 failures. Failed resources (up to 3 shown): Exec[ufw-logging-low],Exec[ufw-allow-tcp-from-any-to-any-port-22] [04:13:51] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [04:33:37] Hello Wolf2048215! If you have any questions, feel free to ask and someone should answer soon. [04:39:32] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 2 backends are down. lizardfs6 mw2 [04:39:33] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [04:39:36] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:39:46] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:40:34] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:40:44] PROBLEM - cp3 Stunnel Http for misc2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[04:40:49] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [04:45:21] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 8.268 second response time [04:45:43] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 7 failures. Last run 3 minutes ago with 7 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [04:46:20] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24656 bytes in 0.914 second response time [04:48:22] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:48:49] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:50:52] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:51:08] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:51:47] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:53:43] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 8.983 second response time [04:53:52] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1515 bytes in 3.046 second response time [04:54:34] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:55:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[04:56:36] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24656 bytes in 1.340 second response time [04:56:46] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 0.693 second response time [04:56:48] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.07, 0.11, 0.10 [04:57:03] RECOVERY - cp3 Stunnel Http for misc2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 43687 bytes in 3.960 second response time [04:57:14] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [04:57:36] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 180 failures. Last run 15 seconds ago with 180 failures. Failed resources (up to 3 shown): File[/etc/stunnel/mediawiki.conf],File[/usr/lib/nagios/plugins/check_varnishbackends],File[/usr/lib/nagios/plugins/check_nginx_errorrate],File[/etc/logrotate.d/salt-common] [04:57:54] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 9% [05:00:17] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1703 MB (7% inode=94%); [05:02:23] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:02:39] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [05:02:51] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:02:58] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:03:03] PROBLEM - cp3 Stunnel Http for misc2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:06:45] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[05:09:25] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1703 MB (7% inode=94%); [05:11:12] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24655 bytes in 6.847 second response time [05:11:49] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:15:16] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:15:26] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:16:13] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:18:00] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:22:36] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1515 bytes in 8.153 second response time [05:24:18] PROBLEM - cp3 SSH on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:26:29] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1703 MB (7% inode=94%); [05:28:03] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.09, 0.03, 0.02 [05:28:17] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:31:39] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:31:46] RECOVERY - cp3 SSH on cp3 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) [05:33:03] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [05:35:05] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 2% [05:37:37] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[05:39:13] RECOVERY - cp3 Stunnel Http for misc2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 43687 bytes in 3.618 second response time [05:39:31] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24655 bytes in 3.887 second response time [05:39:43] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.00, 0.01, 0.00 [05:40:04] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1515 bytes in 6.239 second response time [05:43:32] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1702 MB (7% inode=94%); [05:45:02] PROBLEM - cp3 Stunnel Http for misc2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:45:20] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:45:57] PROBLEM - cp3 SSH on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:46:08] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:47:56] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[05:48:11] RECOVERY - cp3 SSH on cp3 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) [05:50:06] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.04, 0.03, 0.00 [05:55:25] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:57:49] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1515 bytes in 6.247 second response time [06:00:48] PROBLEM - cp3 SSH on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:02:50] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1703 MB (7% inode=94%); [06:06:11] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:06:53] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [06:07:07] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 5 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [06:08:26] RECOVERY - cp3 SSH on cp3 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) [06:09:21] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 6% [06:10:28] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [06:12:48] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1702 MB (7% inode=94%); [06:13:05] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.93, 7.02, 6.23 [06:15:02] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[06:15:37] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24640 bytes in 7.057 second response time [06:15:55] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1515 bytes in 2.115 second response time [06:16:17] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24661 bytes in 0.837 second response time [06:16:35] RECOVERY - cp3 Stunnel Http for misc2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 43687 bytes in 1.173 second response time [06:16:57] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 1702 MB (7% inode=94%); [06:17:05] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 5.70, 6.72, 6.52 [06:17:15] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24655 bytes in 0.781 second response time [06:17:52] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 1.121 second response time [06:18:01] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [06:18:13] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [06:18:41] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:19:30] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [06:20:58] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.31, 7.65, 7.01 [06:22:55] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.45, 7.22, 6.97 [06:26:46] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 2682 MB (11% inode=94%); [06:28:51] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.54, 6.45, 6.75 [06:34:40] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.02, 7.33, 6.96 [06:36:37] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.66, 7.06, 6.93 [06:44:22] RECOVERY - ns1 Puppet on ns1 is OK: 
OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:48:22] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.20, 7.59, 7.27 [06:50:24] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.02, 7.11, 7.14 [07:02:21] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.72, 7.65, 7.41 [07:04:21] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.62, 7.78, 7.52 [07:12:50] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2649 MB (10% inode=94%); [07:14:21] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 4.98, 6.05, 6.64 [07:14:34] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [07:17:57] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [07:18:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [07:19:52] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [07:22:37] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [07:37:13] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:38:50] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:39:06] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:39:49] PROBLEM - cp3 Stunnel Http for mw1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[07:39:52] PROBLEM - cp3 Stunnel Http for misc2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:40:00] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [07:40:18] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:40:57] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. lizardfs6 mw1 mw2 mw3 [07:40:58] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [07:41:46] PROBLEM - cp3 SSH on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:42:07] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2637 MB (10% inode=94%); [07:44:23] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:45:41] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 0.877 second response time [07:45:51] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24662 bytes in 1.164 second response time [07:46:03] RECOVERY - cp3 SSH on cp3 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) [07:46:22] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2637 MB (10% inode=94%); [07:50:42] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:50:51] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:51:10] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:51:12] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[07:53:26] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2637 MB (10% inode=94%); [07:55:54] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:56:34] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24661 bytes in 0.685 second response time [07:57:22] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.00, 0.00, 0.00 [08:00:06] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:01:38] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:02:18] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2636 MB (10% inode=94%); [08:02:20] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1516 bytes in 6.116 second response time [08:06:39] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:08:52] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2636 MB (10% inode=94%); [08:11:05] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:13:25] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:17:29] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24661 bytes in 9.266 second response time [08:17:41] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 0.924 second response time [08:17:51] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.28, 0.17, 0.06 [08:17:52] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2636 MB (10% inode=94%); [08:22:43] PROBLEM - cp3 Stunnel Http for mw3 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[08:22:57] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:24:53] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:26:58] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2635 MB (10% inode=94%); [08:29:14] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:31:12] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2635 MB (10% inode=94%); [08:31:37] paladox, PuppyKun, Reception123, SPF|Cloud: paging, can we get some assistance with these alerts? [08:32:03] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [08:34:59] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:35:55] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:37:17] PROBLEM - cp3 Puppet on cp3 is WARNING: WARNING: Puppet last ran 1 hour ago [08:38:02] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2635 MB (10% inode=94%); [08:38:46] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 8.339 second response time [08:38:54] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24656 bytes in 1.098 second response time [08:40:39] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [08:41:18] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 20 failures. Last run 1 minute ago with 20 failures. Failed resources (up to 3 shown) [08:43:02] RhinosF1: not sure how I could be more useful than you with it... [08:43:22] oh, cp3 [08:43:24] I'll check it out [08:43:29] PROBLEM - cp3 Stunnel Http for mw2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[08:43:30] Reception123: you have access to the server and I don't [08:44:53] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [08:44:59] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 7 failures. Last run 3 minutes ago with 7 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [08:45:35] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:45:36] ^ We are aware of all this [08:46:42] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:47:39] PROBLEM - cp3 Current Load on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:47:45] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 4.252 second response time [08:50:09] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:51:10] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2634 MB (10% inode=94%); [08:52:21] RECOVERY - cp3 Current Load on cp3 is OK: OK - load average: 0.11, 0.09, 0.05 [08:53:01] PROBLEM - cp3 Stunnel Http for test1 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:53:44] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:59:10] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1517 bytes in 2.440 second response time [09:00:14] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 3% [09:02:38] PROBLEM - cp3 Disk Space on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[09:03:17] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [09:06:15] PROBLEM - cp3 Disk Space on cp3 is WARNING: DISK WARNING - free space: / 2634 MB (10% inode=94%); [09:07:33] PROBLEM - cp2 SSH on cp2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:08:14] PROBLEM - Host cp2 is DOWN: PING CRITICAL - Packet loss = 100% [09:08:15] RECOVERY - cp3 Stunnel Http for mw3 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24655 bytes in 0.736 second response time [09:08:21] RECOVERY - cp3 Stunnel Http for mw2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24640 bytes in 0.705 second response time [09:08:30] PROBLEM - wiki.counterculturelabs.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:08:33] RECOVERY - cp3 Stunnel Http for test1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 0.665 second response time [09:08:55] RECOVERY - cp3 Stunnel Http for mw1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 24655 bytes in 0.929 second response time [09:09:06] PROBLEM - thesimswiki.com - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:11:48] RECOVERY - cp3 Stunnel Http for misc2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 43687 bytes in 0.872 second response time [09:13:14] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [09:15:57] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 4 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [09:18:54] RECOVERY - wiki.counterculturelabs.org - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.counterculturelabs.org' will expire on Sat 18 Jan 2020 05:39:44 AM GMT +0000. [09:19:00] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. 
lizardfs6 [09:19:07] RECOVERY - Host cp2 is UP: PING OK - Packet loss = 0%, RTA = 96.68 ms [09:19:09] PROBLEM - cp2 Current Load on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [09:19:11] RECOVERY - cp2 SSH on cp2 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u7 (protocol 2.0) [09:19:16] RECOVERY - cp2 Current Load on cp2 is OK: OK - load average: 0.20, 0.17, 0.07 [09:19:26] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [09:19:27] RECOVERY - thesimswiki.com - LetsEncrypt on sslhost is OK: OK - Certificate 'www.thesimswiki.com' will expire on Fri 14 Feb 2020 08:50:14 AM GMT +0000. [09:26:34] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [09:32:54] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [09:35:53] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [09:42:06] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. lizardfs6 [09:43:55] Reception123: ^ +. Ns1 puppet still hasn't recovered [09:44:04] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [09:45:06] :( [09:46:45] Reception123: all backends back so the new errors are just ns1 puppet not recovering after OOM'ing [09:47:08] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [09:47:46] And down again [09:48:46] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. lizardfs6 [09:49:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [09:50:17] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 2 minutes ago with 2 failures. 
Failed resources (up to 3 shown): File[wiki.opendominion.net_private],File[wiki.x1c7.com] [09:50:46] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [09:53:28] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [09:54:01] Reception123: Testwiki Still slow [09:56:16] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [10:00:16] * RhinosF1 declares incident over [10:10:24] PROBLEM - cp2 Stunnel Http for mw2 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:10:50] PROBLEM - cp4 Stunnel Http for mw2 on cp4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:11:38] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 3 datacenters are down: 107.191.126.23/cpweb, 128.199.139.216/cpweb, 2a00:d880:5:8ea::ebc7/cpweb [10:11:38] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2604:180:0:33b::2/cpweb, 81.4.109.133/cpweb [10:12:19] RECOVERY - cp2 Stunnel Http for mw2 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24640 bytes in 0.392 second response time [10:12:45] RECOVERY - cp4 Stunnel Http for mw2 on cp4 is OK: HTTP OK: HTTP/1.1 200 OK - 24640 bytes in 0.004 second response time [10:13:33] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [10:13:39] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [10:15:51] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 7 failures. Last run 3 minutes ago with 7 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [10:21:57] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [10:34:06] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. 
Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [10:53:27] ** Warning: if there is any bot in #miraheze which should be exempted from Sigyn, contact staffers before it gets caught ** [10:59:52] ^ RhinosF1 see [10:59:59] grumble: around? [11:00:17] I’ve asked in #freenode-sigyn and amdj [11:00:19] RhinosF1: not sure if it keeps the old exemption [11:00:20] RhinosF1: ah ok [11:00:30] But amdj doesn’t remember if they have access [11:00:34] And no one else around [11:04:08] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:14:14] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [11:48:45] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. lizardfs6 [11:54:43] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [12:02:56] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. lizardfs6 [12:03:15] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [12:04:52] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 6 backends are healthy [12:10:07] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JeiAQ [12:10:08] [miraheze/services] MirahezeSSLBot d1b0136 - BOT: Updating services config for wikis [12:15:28] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter] [12:42:56] Is it possible to import pages exported from Wikipedia to Miraheze by XML file? 
If it asks for interwiki prefix, what should I type? [12:52:42] Wolf20482: Wikipedia [12:53:28] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:05:28] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 3 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [13:33:32] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:45:41] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 3 minutes ago with 12 failures. Failed resources (up to 3 shown): Package[openssh-server],Service[ssh],Package[exim4],Service[postfix] [13:50:14] Reception123, RhinosF1: the old exemptions from last time round are still there, icinga-miraheze is correctly exempted [13:50:45] grumble: oh ok, thanks! [13:53:47] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [14:06:01] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 3 minutes ago with 12 failures. Failed resources (up to 3 shown): Package[openssh-server],Service[ssh],Package[exim4],Service[postfix] [14:12:04] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [14:34:18] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [14:52:24] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [15:04:24] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 2 minutes ago with 12 failures. 
Failed resources (up to 3 shown): Package[openssh-server],Service[ssh],Package[exim4],Service[postfix] [15:32:24] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [15:48:10] Reception123: "edit conflict in Phab" :) [15:54:34] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [16:51:47] [mw-config] GustaveLondon776 opened pull request #2820: New protect level - https://git.io/JePvY [16:52:52] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:03:56] !log reset paladox password on phabricator by running `bin/auth recover` [17:04:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [17:04:58] !log upgrade phabricator on misc4 [17:05:12] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [17:23:30] [mw-config] RhinosF1 commented on pull request #2820: New protect level - https://git.io/JePft [17:23:31] [mw-config] RhinosF1 closed pull request #2820: New protect level - https://git.io/JePvY [17:28:24] Hello cmg! If you have any questions, feel free to ask and someone should answer soon. [17:28:32] Hi cmg [17:28:36] hello [17:28:47] How can we help cmg ? [17:29:19] I've created a task in phab [17:29:57] https://phabricator.miraheze.org/T4929 [17:29:58] [ ⚓ T4929 CMGI TYPE of MIME Extensions ] - phabricator.miraheze.org [17:30:04] yes [17:30:27] can someone look there [17:31:02] cmg: I see, I’ve seen. At this stage, I see no reason to approve the executable files. Can you expand on your reason? [17:31:59] yes. 
the executable file is not meant to be executed, it is only a temporary upload [17:32:16] it does not cause a security issue [17:33:45] We don’t allow the uploading of executable files at this point in time [17:34:10] Why do you need that file type? What is the reason it needs to be on temporary? [17:34:11] oh [17:35:07] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 3 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [17:35:39] So, last year I wanted to upload 2 executable files for my friends, and my friends downloaded them, for exam purposes [17:37:37] done [17:37:59] let's look again :-) [17:49:20] It’s a recent policy afaik [17:49:50] Add that to the task and we can have a think - link to previous instance [17:53:55] done [18:01:57] Will look later [18:03:50] okay :-) [18:23:34] [miraheze/mediawiki] Pix1234 pushed 1 commit to REL1_33 [+0/-0/±5] https://git.io/JePJy [18:23:36] [miraheze/mediawiki] Pix1234 18db93f - Update various ext from upstream [18:28:46] !log rebuilding lc on lizardfs6 [18:28:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [18:31:43] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [18:34:58] !log rebuild LC on mw* [18:35:34] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [18:36:31] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.72, 8.15, 7.24 [18:41:19] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 1 datacenter is down: 81.4.109.133/cpweb [18:42:14] PROBLEM - cp2 Stunnel Http for mw3 on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[18:44:12] RECOVERY - cp2 Stunnel Http for mw3 on cp2 is OK: HTTP OK: HTTP/1.1 200 OK - 24639 bytes in 1.426 second response time [18:45:09] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online [18:50:31] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.64, 7.88, 8.00 [18:54:29] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 12 failures. Last run 2 minutes ago with 12 failures. Failed resources (up to 3 shown): Package[openssh-server],Service[ssh],Package[exim4],Service[postfix] [19:06:01] [miraheze/mediawiki] Pix1234 pushed 1 commit to REL1_34 [+0/-0/±5] https://git.io/JePU9 [19:06:02] [miraheze/mediawiki] Pix1234 8f3e736 - Update various ext from upstream [19:06:46] * hispano76 greetings [19:07:32] !log restart php7.3-fpm and nginx on mw1 [19:07:34] hispano76 hi [19:07:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [19:07:49] Hi hispano76 [19:10:26] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.15, 6.26, 6.67 [19:21:57] [miraheze/mediawiki] paladox pushed 1 commit to REL1_33 [+0/-0/±1] https://git.io/JePT4 [19:21:58] [miraheze/mediawiki] paladox 94fbca5 - Update DiscordNotifications [19:22:21] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 8.07, 7.28, 6.91 [19:24:21] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.09, 7.41, 7.06 [19:26:27] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 9.68, 8.31, 7.48 [19:26:48] !log rebuild lc on mw* and lizardfs6 [19:27:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [19:29:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 2604:180:0:33b::2/cpweb, 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb [19:33:52] [miraheze/mediawiki] paladox pushed 1 commit to REL1_34 [+0/-0/±1] https://git.io/JePTM [19:33:53] 
[miraheze/mediawiki] paladox 41058c8 - Update DiscordNotifications [19:34:30] [miraheze/mediawiki] paladox pushed 3 commits to REL1_34 [+0/-0/±6] https://git.io/JePTD [19:34:31] [miraheze/mediawiki] brightbyte ccb2bb2 - WikiExporter: Remove unnecessary check for SCHEMA_COMPAT_WRITE_OLD flag WikiExporter used to require SCHEMA_COMPAT_WRITE_OLD to be enabled, until that requirement was fixed in I5ea972bb07ca1cfb3a2ad8ef120aef7. However, I failed to remove the explicit check for the flag at the time, causing all exports to fail in SCHEMA_COMPAT_NEW mode. This change removes the [19:34:31] obsolete check. Bug: T236735 Change-Id: I809ed4e2f1f30fdc4bd817f815d733d8a62f3d4f (cherry picked from commit d9209707cc62ea2eb0f0fe9d2c79e56a8cc87552) [19:34:33] [miraheze/mediawiki] brightbyte 8c60cd5 - Set MCR migration stage to SCHEMA_COMPAT_NEW. This disables writing to the old schema in DefaultSettings.php. Bug: T231673 Change-Id: I799bfb76c10fd0c0dc791e7380fce0159d81c2d3 (cherry picked from commit 1a917bab4cfa3a957e4cda1959050a2c2058ee4c) [19:34:34] [miraheze/mediawiki] paladox 458ef98 - Merge branch 'REL1_34' of https://github.com/wikimedia/mediawiki into REL1_34 [19:34:36] [ GitHub - wikimedia/mediawiki: 🌻 The collaborative editing software that runs Wikipedia. This is a mirror from gerrit.wikimedia.org. 
See https://www.mediawiki.org/wiki/Developer_access for contributing ] - github.com [19:35:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [19:42:01] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [19:46:21] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.51, 6.86, 7.66 [19:49:21] !log rebuild LC on test1 for https://git.io/JePU9 [19:49:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [19:49:31] [ Comparing ce3245ce3796...8f3e736fa2d6 · miraheze/mediawiki · GitHub ] - git.io [19:54:14] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 2 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [19:56:25] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.22, 6.03, 6.64 [20:05:01] Great [20:05:06] paladox: another OOM [20:22:38] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [20:34:41] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 3 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [20:42:24] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.89, 6.75, 6.56 [20:42:42] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [20:46:22] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.37, 6.35, 6.47 [20:54:40] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 8 failures. Last run 3 minutes ago with 8 failures. 
Failed resources (up to 3 shown): Service[prometheus-node-exporter],Package[openssh-client],Package[openssh-server],Service[ssh] [21:22:50] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:35:04] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 15 failures. Last run 3 minutes ago with 15 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter],Service[nagios-nrpe-server],Package[openssh-client],Package[openssh-server] [22:00:08] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JePtv [22:00:09] [miraheze/services] MirahezeSSLBot 7ed3890 - BOT: Updating services config for wikis [22:03:07] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:25:09] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[prometheus-node-exporter] [22:43:17]