[00:00:48] RECOVERY - cp3 Disk Space on cp3 is OK: DISK OK - free space: / 3422 MB (14% inode=93%); [01:41:27] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 1.51, 1.83, 1.39 [01:44:07] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.54, 1.30, 1.26 [03:10:35] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 1.72, 1.62, 1.23 [03:13:22] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.77, 1.35, 1.19 [03:33:19] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 2.04, 1.84, 1.42 [03:35:56] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 1.01, 1.56, 1.38 [05:55:10] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfT6d [05:55:12] [02miraheze/services] 07MirahezeSSLBot 039a5abce - BOT: Updating services config for wikis [06:24:53] PROBLEM - rdb2 Puppet on rdb2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[nagios-plugins] [06:33:04] RECOVERY - rdb2 Puppet on rdb2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:40:18] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTPL [06:40:20] [02miraheze/services] 07MirahezeSSLBot 03d664d80 - BOT: Updating services config for wikis [06:49:49] hello! are we aware of the 503s? [06:52:39] musikanimal: no, monitoring just quit as well [06:52:41] Thx [06:53:21] paladox, Reception123, PuppyKun, SPF|Cloud: ^ [06:53:43] Everything seems dead [06:55:31] Zppix is on "SRE Duty" [06:58:30] musikanimal: Zppix will be asleep [06:58:32] .t Zppix [06:58:33] 2020-04-21 - 01:58:33CDT [06:58:57] Phab is showing an error on db7, so could it be a database error? [07:04:13] Could it be http://travaux.ovh.net/?do=details&id=44159 [07:04:13] [ OVH Tasks ] - travaux.ovh.net [07:04:36] Times exactly to our outage [07:11:02] PROBLEM - cp7 Stunnel Http for test2 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:11:15] Hmm [07:12:38] There must be something then [07:12:47] But still no service [07:23:23] PROBLEM - mw5 Puppet on mw5 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:23:39] PROBLEM - cp8 Stunnel Http for mw5 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:24:09] PROBLEM - mw7 HTTPS on mw7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:19] PROBLEM - cp8 Stunnel Http for mw6 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:24:19] PROBLEM - cp6 Stunnel Http for mw5 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:24:20] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is CRITICAL: CRITICAL - NGINX Error Rate is 91% [07:24:23] yes well done icinga-miraheze [07:24:29] PROBLEM - db7 SSH on db7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:30] PROBLEM - db6 SSH on db6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:39] PROBLEM - mw7 SSH on mw7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:40] PROBLEM - db7 Puppet on db7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
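Nearly every alert in the flood that starts here is the same failure: an NRPE check against a host that has stopped answering, timing out after ten seconds. As a rough illustration only (not Icinga's actual plugin code), a bare TCP probe against the NRPE port reproduces the symptom; the host names below are placeholders, and the port number matches the "port 5666" seen in later alerts:

```python
# Minimal sketch: probe the NRPE port with the same 10-second timeout the
# alerts above use. Host names are placeholders; the real checks run
# check_nrpe against these hosts rather than this script.
import socket

HOSTS = ["mw5.miraheze.org", "db7.miraheze.org"]  # hypothetical examples
NRPE_PORT = 5666
TIMEOUT = 10  # seconds, matching "Socket timeout after 10 seconds"

for host in HOSTS:
    try:
        with socket.create_connection((host, NRPE_PORT), timeout=TIMEOUT):
            print(f"{host}: NRPE port reachable")
    except OSError as exc:  # covers timeouts, refused connections, DNS failures
        print(f"{host}: CRITICAL - {exc}")
```

When the VM itself is down, as here, every such probe fails the same way, so every NRPE-based service check on that host goes CRITICAL at once, which is why a single host outage produces dozens of alerts.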
[07:24:49] PROBLEM - mw5 HTTPS on mw5 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:50] PROBLEM - bebaskanpengetahuan.id - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:24:59] PROBLEM - mw4 Current Load on mw4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:25:09] PROBLEM - db7 Disk Space on db7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:25:10] PROBLEM - cp6 HTTPS on cp6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4205 bytes in 0.007 second response time [07:25:12] PROBLEM - phab1 phabricator.miraheze.org HTTPS on phab1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 5483 bytes in 2.040 second response time [07:25:20] PROBLEM - Host cp7 is DOWN: PING CRITICAL - Packet loss = 100% [07:25:23] PROBLEM - cp3 Stunnel Http for mw6 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:25:29] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb [07:25:47] PROBLEM - mw4 Puppet on mw4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:25:49] PROBLEM - ping4 on mw6 is CRITICAL: PING CRITICAL - Packet loss = 100% [07:25:59] PROBLEM - mw7 Disk Space on mw7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:26:00] PROBLEM - mw5 Disk Space on mw5 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:26:11] PROBLEM - mw7 php-fpm on mw7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:26:21] PROBLEM - mw4 php-fpm on mw4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:26:21] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [07:26:30] PROBLEM - mw6 SSH on mw6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:26:40] PROBLEM - mon1 grafana.miraheze.org HTTPS on mon1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.004 second response time [07:26:50] PROBLEM - cp6 Stunnel Http for mw4 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:26:51] PROBLEM - Host db6 is DOWN: PING CRITICAL - Packet loss = 100% [07:26:51] PROBLEM - mon1 icinga.miraheze.org HTTPS on mon1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.005 second response time [07:27:04] PROBLEM - ping4 on mw5 is CRITICAL: PING CRITICAL - Packet loss = 100% [07:27:10] PROBLEM - publictestwiki.com - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:27:10] PROBLEM - mw5 MediaWiki Rendering on mw5 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4276 bytes in 0.045 second response time [07:27:40] PROBLEM - Host db7 is DOWN: PING CRITICAL - Packet loss = 100% [07:28:02] PROBLEM - mw7 Current Load on mw7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:28:32] PROBLEM - Host mw5 is DOWN: PING CRITICAL - Packet loss = 100% [07:28:36] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 4 backends are down. 
mw4 mw5 mw6 mw7 [07:28:56] PROBLEM - cp8 Stunnel Http for mon1 on cp8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.232 second response time [07:28:58] PROBLEM - mw6 Disk Space on mw6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:29:08] PROBLEM - mw7 Puppet on mw7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:29:23] PROBLEM - mw6 php-fpm on mw6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:29:26] PROBLEM - Host mw6 is DOWN: PING CRITICAL - Packet loss = 100% [07:29:27] PROBLEM - test2 MediaWiki Rendering on test2 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4276 bytes in 0.044 second response time [07:29:36] PROBLEM - phab1 phab.miraheze.wiki HTTPS on phab1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: HTTP/1.1 500 Internal Server Error [07:29:42] PROBLEM - mw7 MediaWiki Rendering on mw7 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4272 bytes in 0.044 second response time [07:29:50] PROBLEM - mw4 HTTPS on mw4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:29:52] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [07:30:02] PROBLEM - nonciclopedia.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:30:15] PROBLEM - Host mw4 is DOWN: PING CRITICAL - Packet loss = 100% [07:30:26] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is WARNING: WARNING - NGINX Error Rate is 56% [07:30:27] PROBLEM - cp6 Stunnel Http for mon1 on cp6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.002 second response time [07:30:36] PROBLEM - cp3 Stunnel Http for mw4 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:30:36] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4274 bytes in 0.045 second response time [07:30:41] PROBLEM - cp8 Stunnel Http for mw7 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:30:41] PROBLEM - mon1 HTTPS on mon1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.005 second response time [07:30:46] PROBLEM - cp6 Stunnel Http for mw7 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:30:47] PROBLEM - gluster1 GlusterFS port 49152 on gluster1 is CRITICAL: connect to address 51.77.107.209 and port 49152: Connection refused [07:30:57] PROBLEM - ping4 on mw7 is CRITICAL: PING CRITICAL - Packet loss = 100% [07:31:06] PROBLEM - cp3 Stunnel Http for mw7 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:31:17] PROBLEM - cp8 Stunnel Http for mw4 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:31:38] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb [07:31:48] PROBLEM - www.thesimswiki.com - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:32:11] PROBLEM - cp3 Stunnel Http for mw5 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[07:32:12] PROBLEM - cp3 Stunnel Http for mon1 on cp3 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.708 second response time [07:32:49] PROBLEM - cp8 HTTPS on cp8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4222 bytes in 0.325 second response time [07:33:01] PROBLEM - wiki.thesimswiki.com - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:33:11] PROBLEM - cp6 Stunnel Http for mw6 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:33:16] PROBLEM - wiki.conworlds.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:33:26] PROBLEM - thesimswiki.com - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:34:39] PROBLEM - Host mw7 is DOWN: PING CRITICAL - Packet loss = 100% [07:35:05] PROBLEM - bacula2 Bacula Databases db7 on bacula2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [07:38:37] Reception123: now would be a great time to wake up [07:41:21] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [07:42:04] PROBLEM - phab1 Puppet on phab1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 44 seconds ago with 1 failures. Failed resources (up to 3 shown): Service[phd] [07:42:08] RhinosF1: looks like an OVH problem [07:42:19] Can't SSH to db7 [07:42:31] Reception123: worked that out 32 mins ago [07:43:02] RhinosF1: yeah, so I can't do anything about it [07:43:04] PROBLEM - phab1 phd on phab1 is CRITICAL: PROCS CRITICAL: 0 processes with args 'phd' [07:43:10] It's up to them to fix it [07:43:24] Reception123: be here until we recover, send alerts [07:43:35] Yeah that's all I can [07:43:42] Though I hope they do recover soon [07:43:52] * RhinosF1 is watching the status page [07:43:58] PM’d you a lot [07:44:19] Borschts-zhwiki: hi [07:44:44] Hi :) [07:45:04] Borschts-zhwiki: we are aware of the outage, host is down [07:45:12] Reception123: update topic pls [07:46:42] Reception123: I’m here virtually all day [07:47:09] Also just to note afaik SRE Duty is only for Phabricator and doesn't imply that the user is available all week to deal with every issue [07:47:18] they're just in charge of making sure Phab triage is done properly [07:47:24] When I do go for lunch break, hopefully we’ll be back [07:47:34] hopefully before [07:48:01] I’ll be scared if we’re not but paladox and john should be back by then [07:48:09] I would email but mail is likely dead [07:49:14] * RhinosF1 will try anyway [08:12:10] you can send it but it'll sit at some other mailserver until misc1 comes back up [08:12:23] wait no i lied mail looks like its working? [08:12:41] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 85% [08:13:44] PuppyKun: We think there might be just some stuff on the affected racks [08:13:52] Not sure how to confirm though [08:15:30] Linux misc1.miraheze.org Thu Jun 27 15:10:55 MSK 2019 x86_64 [08:15:33] welp, misc1 is def up [08:15:51] PuppyKun: Can you remember which cloud server hosts which? 
[08:16:31] erm [08:16:35] *visible confusion* [08:16:48] ssh ndkilla@mw1.miraheze.org successfully logs me into cp8 [08:16:49] wutface [08:16:53] PuppyKun: It’s stored on phab and wiki somewhere [08:17:22] Unless [08:17:25] I have idea [08:17:30] Let me get laptop up [08:19:27] RECOVERY - bacula2 Bacula Databases db7 on bacula2 is OK: OK: Diff, 3407 files, 84.88GB, 2020-04-19 04:09:00 (2.2 days ago) [08:19:50] RECOVERY - Host cp7 is UP: PING OK - Packet loss = 0%, RTA = 0.15 ms [08:19:53] PROBLEM - cp7 Stunnel Http for mon1 on cp7 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 295 bytes in 0.004 second response time [08:19:53] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [08:19:54] PROBLEM - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:54] PROBLEM - cp7 HTTPS on cp7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:19:54] PROBLEM - cp7 Current Load on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:54] PROBLEM - ping4 on cp7 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:19:54] PROBLEM - cp7 Disk Space on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:55] PROBLEM - cp7 Stunnel Http for mw4 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:55] PROBLEM - cp7 Puppet on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:56] PROBLEM - cp7 Stunnel Http for mw5 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:56] PROBLEM - cp7 SSH on cp7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:19:57] PROBLEM - cp7 Stunnel Http for mw6 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:19:57] RECOVERY - mon1 icinga.miraheze.org HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 302 Found - 335 bytes in 0.007 second response time [08:19:57] OwO [08:20:20] That's good [08:20:24] RECOVERY - publictestwiki.com - LetsEncrypt on sslhost is OK: OK - Certificate 'publictestwiki.com' will expire on Mon 01 Jun 2020 16:16:12 GMT +0000. [08:20:33] RECOVERY - Host db7 is UP: PING OK - Packet loss = 0%, RTA = 0.20 ms [08:20:34] PROBLEM - ping4 on db7 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:20:34] PROBLEM - db7 Current Load on db7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:20:34] PROBLEM - db7 MySQL on db7 is CRITICAL: Can't connect to MySQL server on '51.89.160.143' (115) [08:20:44] !restarted db7/mw* [08:20:54] RhinosF1: for some reason it seems that many servers were marked as "stop" [08:20:55] RECOVERY - Host db6 is UP: PING WARNING - Packet loss = 81%, RTA = 0.46 ms [08:20:58] PROBLEM - db6 Current Load on db6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:20:58] PROBLEM - db7 MySQL on db7 is CRITICAL: Can't connect to MySQL server on '51.89.160.143' (115) [08:20:58] PROBLEM - db6 MySQL on db6 is CRITICAL: Can't connect to MySQL server on '51.89.160.130' (115) [08:20:59] PROBLEM - db6 Disk Space on db6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:20:59] PROBLEM - db6 Puppet on db6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[08:20:59] PROBLEM - ping4 on db6 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:21:00] RECOVERY - Host mw5 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms [08:21:01] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 42% [08:21:03] RECOVERY - test2 MediaWiki Rendering on test2 is OK: HTTP OK: HTTP/1.1 200 OK - 19331 bytes in 0.468 second response time [08:21:04] RECOVERY - cp7 Current Load on cp7 is OK: OK - load average: 0.66, 0.48, 0.19 [08:21:04] RECOVERY - cp7 Disk Space on cp7 is OK: DISK OK - free space: / 11528 MB (43% inode=95%); [08:21:04] PROBLEM - mw5 php-fpm on mw5 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:21:04] PROBLEM - mw5 Current Load on mw5 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:21:05] !log restarted db7/mw* [08:21:11] I guess after whatever issue they had the servers likely shut down and never restarted? [08:21:13] RECOVERY - jobrunner1 MediaWiki Rendering on jobrunner1 is OK: HTTP OK: HTTP/1.1 200 OK - 19319 bytes in 0.122 second response time [08:21:13] RECOVERY - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is OK: OK - NGINX Error Rate is 2% [08:21:15] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 29% [08:21:15] RECOVERY - nonciclopedia.org - LetsEncrypt on sslhost is OK: OK - Certificate 'nonciclopedia.org' will expire on Tue 30 Jun 2020 02:32:05 GMT +0000. [08:21:17] RECOVERY - cp7 Stunnel Http for mw4 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.139 second response time [08:21:21] RECOVERY - Host mw6 is UP: PING OK - Packet loss = 0%, RTA = 0.19 ms [08:21:23] RECOVERY - mw6 Disk Space on mw6 is OK: DISK OK - free space: / 7528 MB (39% inode=72%); [08:21:23] RECOVERY - mw6 php-fpm on mw6 is OK: PROCS OK: 19 processes with command name 'php-fpm7.3' [08:21:24] RECOVERY - cp7 HTTPS on cp7 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1609 bytes in 0.011 second response time [08:21:24] PROBLEM - mw6 MediaWiki Rendering on mw6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4278 bytes in 0.043 second response time [08:21:24] PROBLEM - mw6 Current Load on mw6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:21:24] PROBLEM - mw6 Puppet on mw6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[08:21:25] PuppyKun: yeah [08:21:26] RECOVERY - cp3 Stunnel Http for mw4 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 1.015 second response time [08:21:27] RECOVERY - cp8 Stunnel Http for mw7 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.307 second response time [08:21:28] RECOVERY - cp6 Stunnel Http for mw7 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.003 second response time [08:21:28] that's my guess too [08:21:29] PuppyKun: no update from OVH [08:21:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [08:21:35] RhinosF1: well at least we're back online now [08:21:38] RECOVERY - cp7 SSH on cp7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:21:40] RECOVERY - Host mw4 is UP: PING OK - Packet loss = 0%, RTA = 0.31 ms [08:21:42] Reception123: that's good [08:21:43] RECOVERY - mw4 HTTPS on mw4 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.247 second response time [08:21:43] RECOVERY - cp8 Stunnel Http for mw4 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.306 second response time [08:21:44] PROBLEM - ping4 on mw4 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:21:44] PROBLEM - mw4 Disk Space on mw4 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:21:44] PROBLEM - mw4 SSH on mw4 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:21:44] PROBLEM - mw4 MediaWiki Rendering on mw4 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 4276 bytes in 0.046 second response time [08:21:44] didn't think it'd be as easy as restarting the servers [08:21:45] RECOVERY - cp3 Stunnel Http for mw7 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 1.003 second response time [08:21:45] RECOVERY - db7 Current Load on db7 is OK: OK - load average: 1.28, 0.75, 0.31 [08:21:45] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [08:21:46] RECOVERY - mw5 php-fpm on mw5 is OK: PROCS OK: 19 processes with command name 'php-fpm7.3' [08:21:46] RECOVERY - cp7 Puppet on cp7 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [08:21:50] icinga-miraheze: seems really happy too [08:21:53] PROBLEM - ping4 on cp7 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:21:54] RECOVERY - www.thesimswiki.com - LetsEncrypt on sslhost is OK: OK - Certificate 'www.thesimswiki.com' will expire on Mon 01 Jun 2020 16:23:50 GMT +0000. [08:21:55] RECOVERY - ping4 on cp7 is OK: PING OK - Packet loss = 0%, RTA = 0.23 ms [08:21:58] RECOVERY - mw4 MediaWiki Rendering on mw4 is OK: HTTP OK: HTTP/1.1 200 OK - 19321 bytes in 0.045 second response time [08:22:01] RECOVERY - cp7 Stunnel Http for mw6 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.003 second response time [08:22:02] RECOVERY - ping4 on mw4 is OK: PING OK - Packet loss = 0%, RTA = 0.34 ms [08:22:04] RECOVERY - cp3 Stunnel Http for mw5 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.953 second response time [08:22:04] RECOVERY - cp8 HTTPS on cp8 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1611 bytes in 0.391 second response time [08:22:05] RECOVERY - mw4 Disk Space on mw4 is OK: DISK OK - free space: / 7234 MB (38% inode=73%); [08:22:08] PROBLEM - db6 Puppet on db6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:22:12] Reception123: Which ones? 
[08:22:15] PROBLEM - db6 Puppet on db6 is WARNING: WARNING: Puppet last ran 1 hour ago [08:22:16] RECOVERY - db6 Disk Space on db6 is OK: DISK OK - free space: / 175091 MB (52% inode=99%); [08:22:16] RECOVERY - mw6 Puppet on mw6 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [08:22:17] RECOVERY - mw5 Current Load on mw5 is OK: OK - load average: 3.19, 1.17, 0.43 [08:22:18] RECOVERY - wiki.thesimswiki.com - LetsEncrypt on sslhost is OK: OK - Certificate 'www.thesimswiki.com' will expire on Mon 01 Jun 2020 16:23:50 GMT +0000. [08:22:19] RECOVERY - mw6 MediaWiki Rendering on mw6 is OK: HTTP OK: HTTP/1.1 200 OK - 19321 bytes in 0.046 second response time [08:22:23] RECOVERY - cp6 Stunnel Http for mw6 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.005 second response time [08:22:28] PROBLEM - ping4 on db7 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:22:31] RECOVERY - ping4 on db7 is OK: PING OK - Packet loss = 0%, RTA = 0.18 ms [08:22:36] RECOVERY - wiki.conworlds.org - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.conworlds.org' will expire on Wed 13 May 2020 18:33:36 GMT +0000. [08:22:36] RECOVERY - thesimswiki.com - LetsEncrypt on sslhost is OK: OK - Certificate 'www.thesimswiki.com' will expire on Mon 01 Jun 2020 16:23:50 GMT +0000. [08:22:38] PROBLEM - ping4 on db6 is CRITICAL: PING CRITICAL - Packet loss = 100% [08:22:41] erm [08:22:41] RECOVERY - cp7 Stunnel Http for mw5 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.003 second response time [08:22:41] RECOVERY - mw4 SSH on mw4 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:22:42] RECOVERY - ping4 on db6 is OK: PING OK - Packet loss = 0%, RTA = 0.25 ms [08:22:46] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 59% [08:22:46] PROBLEM - cp7 Stunnel Http for test2 on cp7 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 322 bytes in 0.011 second response time [08:22:46] RECOVERY - mw6 Current Load on mw6 is OK: OK - load average: 1.42, 1.33, 0.58 [08:22:52] PROBLEM - mw5 Puppet on mw5 is WARNING: WARNING: Puppet last ran 1 hour ago [08:22:53] im slightly confused by the fact that ping on several hosts is flapping a lot [08:22:53] RECOVERY - cp8 Stunnel Http for mw5 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.305 second response time [08:22:54] RECOVERY - cp8 Stunnel Http for mw6 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.313 second response time [08:22:54] RECOVERY - db7 SSH on db7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:22:54] RECOVERY - db6 SSH on db6 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:22:55] RECOVERY - cp6 HTTPS on cp6 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 1610 bytes in 0.026 second response time [08:22:55] RECOVERY - cp6 Stunnel Http for mw5 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.005 second response time [08:22:56] RECOVERY - db7 Puppet on db7 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:22:56] RECOVERY - mw4 Current Load on mw4 is OK: OK - load average: 2.70, 1.35, 0.53 [08:22:57] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [08:22:57] RECOVERY - mw5 HTTPS on mw5 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.004 second response time [08:22:58] RECOVERY - db7 Disk Space on db7 is OK: DISK OK - free space: / 337557 MB (54% inode=99%); [08:22:58] RECOVERY - bebaskanpengetahuan.id - LetsEncrypt on sslhost is 
OK: OK - Certificate 'bebaskanpengetahuan.id' will expire on Thu 11 Jun 2020 21:05:17 GMT +0000. [08:22:59] RECOVERY - Host mw7 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [08:23:03] RECOVERY - mw7 HTTPS on mw7 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 541 bytes in 0.009 second response time [08:23:03] RECOVERY - mw7 SSH on mw7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:23:05] RECOVERY - cp3 Stunnel Http for mw6 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.941 second response time [08:23:05] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 7 backends are healthy [08:23:08] RECOVERY - ping4 on mw6 is OK: PING OK - Packet loss = 0%, RTA = 0.20 ms [08:23:09] PuppyKun: It looks live everything dead was cloud1 [08:23:11] PuppyKun: yeah I don't know many they're still having some issues [08:23:13] PROBLEM - mw4 Puppet on mw4 is WARNING: WARNING: Puppet last ran 1 hour ago [08:23:14] RECOVERY - mw5 Disk Space on mw5 is OK: DISK OK - free space: / 7236 MB (38% inode=72%); [08:23:14] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 7 backends are healthy [08:23:18] RECOVERY - mw7 php-fpm on mw7 is OK: PROCS OK: 19 processes with command name 'php-fpm7.3' [08:23:18] RECOVERY - mw7 Disk Space on mw7 is OK: DISK OK - free space: / 7558 MB (39% inode=72%); [08:23:21] RhinosF1: not only, all mw servers seemed to be dead and db7 also [08:23:28] RECOVERY - mw6 SSH on mw6 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [08:23:28] RECOVERY - mw4 php-fpm on mw4 is OK: PROCS OK: 19 processes with command name 'php-fpm7.3' [08:23:29] RECOVERY - mw5 MediaWiki Rendering on mw5 is OK: HTTP OK: HTTP/1.1 200 OK - 19322 bytes in 0.107 second response time [08:23:29] RECOVERY - cp6 Stunnel Http for mw4 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15316 bytes in 0.003 second response time [08:23:42] RECOVERY - ping4 on mw5 is OK: PING OK - Packet loss = 0%, RTA = 0.32 ms [08:23:42] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 7 backends are healthy [08:23:43] RECOVERY - mw7 Current Load on mw7 is OK: OK - load average: 0.94, 1.24, 0.59 [08:23:43] RECOVERY - db6 Current Load on db6 is OK: OK - load average: 1.12, 0.80, 0.34 [08:23:44] Reception123: Which rack does it say they're on? If it does [08:23:45] RECOVERY - db6 MySQL on db6 is OK: Uptime: 171 Threads: 8 Questions: 212 Slow queries: 6 Opens: 97 Flush tables: 1 Open tables: 91 Queries per second avg: 1.239 [08:23:49] !log restarted cp7 (same time as the others) [08:23:52] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [08:23:54] RhinosF1: I can't see where that info would be [08:23:59] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 20% [08:24:00] RECOVERY - mw7 Puppet on mw7 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [08:24:00] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [08:24:03] RECOVERY - mw7 MediaWiki Rendering on mw7 is OK: HTTP OK: HTTP/1.1 200 OK - 19322 bytes in 0.046 second response time [08:24:17] im kind of assuming that each of the 2 cloud servers we have is some sort of rack. 
so likely everything on cloud1 is on the same rack but \o/ [08:24:23] RECOVERY - ping4 on mw7 is OK: PING OK - Packet loss = 0%, RTA = 0.14 ms [08:24:32] I only have access to something that allows me to manage servers that's it [08:25:11] PuppyKun: That's my thought but if both cloud1 and 2 had issues then we need to confirm that they both are affected [08:25:33] Reception123: E101E17 or E101E18 mentioned anywhere [08:26:13] nope, I only have access to the virtual cloud thing [08:26:17] I can't see other info [08:26:23] except directly relating to servers [08:27:03] hmm [08:27:34] Reception123: OVH are still showing as down so I wonder if it was them or that delayed us [08:28:01] RhinosF1: I don't know but it must be something from them our servers wouldn't just shut down like that [08:28:15] Reception123: I get that, we need to know why [08:32:58] PROBLEM - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is CRITICAL: CRITICAL - NGINX Error Rate is 79% [08:33:00] RECOVERY - db6 Puppet on db6 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:33:08] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is CRITICAL: CRITICAL - NGINX Error Rate is 73% [08:33:31] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTMV [08:33:32] [02miraheze/puppet] 07paladox 03343cb7b - db7: Lower mariadb::config::innodb_buffer_pool_instances to 1 [08:33:35] RECOVERY - mw5 Puppet on mw5 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [08:34:02] RECOVERY - mw4 Puppet on mw4 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [08:40:52] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 59% [08:43:10] !log fallover db7 to db6 [08:43:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [08:44:35] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTDv [08:44:37] [02miraheze/puppet] 07paladox 03d9c3285 - Switch db6 to being the db master [08:46:30] [02puppet] 07paladox closed pull request 03#1327: Switch over to db6 from db7 - 13https://git.io/JvhY5 [08:46:31] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±11] 13https://git.io/JfTDL [08:46:33] [02miraheze/puppet] 07paladox 038baf6cc - Switch over to db6 from db7 (#1327) [08:46:34] [02puppet] 07paladox deleted branch 03paladox-patch-2 - 13https://git.io/vbiAS [08:46:36] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-2 [08:46:50] [02mw-config] 07paladox closed pull request 03#2991: Switch to db6 - 13https://git.io/JvhOO [08:46:51] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTDt [08:46:53] [02miraheze/mw-config] 07paladox 03842d27f - Switch to db6 (#2991) [08:46:54] [02mw-config] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbvb3 [08:46:56] [02miraheze/mw-config] 07paladox deleted branch 03paladox-patch-1 [08:49:41] RECOVERY - cp3 Stunnel Http for mon1 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 4.503 second response time [08:49:52] RECOVERY - cp7 Stunnel Http for mon1 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 0.011 second response time [08:49:53] RECOVERY - mon1 grafana.miraheze.org HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 0.016 second response time [08:50:45] RECOVERY - phab1 phabricator.miraheze.org HTTPS on phab1 is OK: HTTP OK: HTTP/1.1 200 OK - 19068 bytes in 0.238 second response time [08:51:05] RECOVERY 
- cp8 Stunnel Http for mon1 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 0.336 second response time [08:51:08] PROBLEM - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is WARNING: WARNING - NGINX Error Rate is 53% [08:51:23] RECOVERY - mon1 HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 0.053 second response time [08:51:37] RECOVERY - gluster1 GlusterFS port 49152 on gluster1 is OK: TCP OK - 0.000 second response time on 51.77.107.209 port 49152 [08:51:37] RECOVERY - cp6 Stunnel Http for mon1 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 30936 bytes in 0.016 second response time [08:51:52] PROBLEM - bacula2 Bacula Private Git on bacula2 is UNKNOWN: NRPE: Unable to read output [08:51:53] RECOVERY - phab1 Puppet on phab1 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [08:52:05] RECOVERY - phab1 phd on phab1 is OK: PROCS OK: 1 process with args 'phd' [08:52:11] RECOVERY - phab1 phab.miraheze.wiki HTTPS on phab1 is OK: HTTP OK: Status line output matched "HTTP/1.1 200" - 17719 bytes in 0.160 second response time [08:53:11] PROBLEM - bacula2 Bacula Databases db6 on bacula2 is UNKNOWN: NRPE: Unable to read output [08:53:22] RECOVERY - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is OK: OK - NGINX Error Rate is 9% [08:54:31] RECOVERY - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is OK: OK - NGINX Error Rate is 5% [08:54:39] PROBLEM - bacula2 Bacula Databases dbt1 on bacula2 is UNKNOWN: NRPE: Unable to read output [08:55:27] PROBLEM - bacula2 Bacula Static on bacula2 is UNKNOWN: NRPE: Unable to read output [08:56:12] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.02, 7.55, 4.31 [08:56:35] PROBLEM - bacula2 Puppet on bacula2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[bacula-director] [08:56:43] PROBLEM - bacula2 Bacula Daemon on bacula2 is CRITICAL: PROCS CRITICAL: 1 process with UID = 112 (bacula) [08:57:32] PROBLEM - bacula2 Bacula Phabricator Static on bacula2 is UNKNOWN: NRPE: Unable to read output [08:58:07] paladox: what's happened? [08:58:38] mysql wouldn't start on db7 due to a mariadb bug that we worked around on db6, so we've fallen over to db6 [08:58:51] paladox: is that what caused earlier? [08:59:02] no [08:59:21] paladox: is that the outstanding ovh issue on E101E17 & E101E18 or? [08:59:37] i have no idea, i only had time to fall over to db6 [08:59:39] RECOVERY - db6 Current Load on db6 is OK: OK - load average: 3.46, 6.16, 4.45 [09:00:21] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTDg [09:00:23] [02miraheze/puppet] 07paladox 032d94491 - Update bacula-dir.conf [09:00:30] paladox: do we know what racks they're on? We could do eventually with an idea why servers randomly shut down. [09:03:21] PROBLEM - bacula2 Bacula Databases db6 on bacula2 is CRITICAL: CRITICAL: no terminated jobs [09:03:28] cloud1 & 2 are in rack E101E17 [09:04:11] RECOVERY - bacula2 Bacula Phabricator Static on bacula2 is OK: OK: Diff, 5376 files, 9.432MB, 2020-04-19 18:36:00 (1.6 days ago) [09:04:31] paladox: so the random shutdown likely was http://travaux.ovh.net/?do=details&id=44159 [09:04:32] [ OVH Tasks ] - travaux.ovh.net [09:04:43] RECOVERY - bacula2 Bacula Databases dbt1 on bacula2 is OK: OK: Diff, 65403 files, 19.89GB, 2020-04-19 05:30:00 (2.1 days ago) [09:04:46] We'll have to hope we stay up as they're still having issues [09:04:49] ok [09:05:01] still having issues? 
[09:05:05] RECOVERY - bacula2 Bacula Static on bacula2 is OK: OK: Diff, 150413 files, 9.520GB, 2020-04-19 18:33:00 (1.6 days ago) [09:05:07] RECOVERY - bacula2 Bacula Private Git on bacula2 is OK: OK: Full, 4097 files, 11.90MB, 2020-04-19 18:37:00 (1.6 days ago) [09:05:35] paladox: read the task page, there's been no update since the incident was created [09:05:43] ok [09:05:49] RECOVERY - bacula2 Puppet on bacula2 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [09:05:52] RECOVERY - bacula2 Bacula Daemon on bacula2 is OK: PROCS OK: 2 processes with UID = 112 (bacula) [09:06:49] paladox: thx for sorting db7 [09:07:08] Reception123, PuppyKun: see what paladox just said. [09:29:35] MH-Discord: test [09:29:50] works [09:59:40] [02miraheze/IncidentReporting] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfT9l [09:59:41] [02miraheze/IncidentReporting] 07paladox 03f5bf4e7 - Fix bug where ->i_responders could be empty [09:59:43] [02IncidentReporting] 07paladox created branch 03paladox-patch-2 - 13https://git.io/fh5YJ [09:59:44] [02IncidentReporting] 07paladox opened pull request 03#12: Fix bug where ->i_responders could be empty - 13https://git.io/JfT98 [10:00:45] * RhinosF1 sees no update from OVH [10:01:55] miraheze/IncidentReporting/paladox-patch-2/f5bf4e7 - paladox The build passed. https://travis-ci.com/miraheze/IncidentReporting/builds/161214308 [11:23:13] paladox, Reception123: 12:16 UK TIME / 11:16 UTC - OVH INCIDENT RESOLVED [11:24:32] Reception123: PM me your updated IR draft [11:54:00] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 1.32, 1.79, 1.10 [11:56:47] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 2.23, 2.09, 1.34 [11:59:31] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.96, 1.55, 1.25 [12:59:48] Hmm... can someone take a look at this please https://meta.miraheze.org/m/4Gw [12:59:50] [ Difference between revisions of "Stewards' noticeboard" - Miraheze Meta ] - meta.miraheze.org [13:00:02] I'm not exactly sure what they are asking [13:04:25] AmandaCath: think they want crat rights so needs a Steward [13:05:25] Reception123: well, as I said on their misplaced Phab task, if they are not the creator of the wiki they should ask the creator/current crat(s) for a promotion [13:17:48] AmandaCath: https://meta.miraheze.org/w/index.php?title=User_talk:RhinosF1&diff=103605&oldid=103603 [13:17:49] [ Difference between revisions of "User talk:RhinosF1" - Miraheze Meta ] - meta.miraheze.org [13:22:25] AmandaCath: also re https://phabricator.miraheze.org/T5445#106213, Zppix was right to remove asignee as you didn’t work on it. Zppix is also on SRE Duty and in charge on phabricator triage for the week. 
[13:22:26] [ ⚓ T5445 Installing a Mandatory Skin ] - phabricator.miraheze.org [13:23:57] Well, I did work on it [13:24:10] I gave the user the help they needed [13:25:25] AmandaCath: For the purpose of assignee, you were one of 3 and no one worked on the issue as it wasn’t one [13:26:21] Meh, whatever [13:28:39] Mainly, don’t revert sre as well without discussion [13:28:43] Reception123: https://phabricator.miraheze.org/T5453 [13:28:44] [ ⚓ T5453 Alert when icinga-miraheze disconnects from IRC ] - phabricator.miraheze.org [13:31:53] And https://phabricator.miraheze.org/T5454 [13:31:54] [ ⚓ T5454 Reduce number of alerts in an outage ] - phabricator.miraheze.org [13:43:21] Reception123, paladox, PuppyKun: you’ve got mail [13:45:04] ok, will look [13:46:41] it’s good mail [13:54:03] hey everyone [13:54:27] hi Examknow [13:55:29] how is Reception123 this morning? [13:55:50] Examknow: well it's the afternoon for me ;) but I'm fine. You? [13:56:19] Reception123: So far very well. I assume you are UK time as well? [13:56:26] .t Reception123 [13:56:26] Could not find a timezone for this nick. Reception123 can set a timezone with `.settz ` [13:56:43] currently GMT + 2 (so one hour after UK time) [13:57:25] .t Reception123 [13:57:25] 2020-04-21 - 15:57:25CEST [13:57:27] there you go :) [13:58:02] JohnLewis I edit conflicted with you on T5455 :P [13:58:13] Oh, he's not even here [13:58:46] Reception123: :) [13:59:10] AmandaCath: I take it you are back from a wikibreak? [13:59:25] Yes, hopefully [13:59:30] nice [13:59:34] Still uncertain though [14:10:13] hi JohnLewis [14:11:58] hi [14:15:11] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfTjx [14:15:13] [02miraheze/services] 07MirahezeSSLBot 03e63917a - BOT: Updating services config for wikis [14:16:25] JohnLewis: how you doing? [14:19:18] JohnLewis: to reply to your last comment, true but this takes me probably 15 mins to do if Examknow doesn’t beat me to it and is free. [14:20:52] But it really doesn’t negate that we need the monitoring in icinga [14:21:48] No, but it tells us that there's an issue rather than waiting for users to notice [14:25:36] JohnLewis: I could add complexity to it to check if things are visibly up. That shouldn’t take much longer. [14:28:39] You can do whatever with your own software components, but the issue you described has identified a shortcoming in icinga. Namely, we don't monitor what you identified [14:34:40] I’ll build that monitoring :) [14:34:50] Apologies, laggy phone [14:57:32] AmandaCath: there's no reason to change priority on a closed task [14:59:36] !log reinstall db7 [14:59:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [15:05:25] PROBLEM - db7 Current Load on db7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:05:41] PROBLEM - db7 Puppet on db7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
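db7 is being reinstalled here and, per the dbreplication.pp commits that follow, re-pointed as a replica of the new master db6. A minimal sketch of what that re-pointing step typically looks like; the driver (pymysql), hostnames, credentials, and binlog coordinates are all assumed placeholders, not Miraheze's actual tooling or values:

```python
# Hedged sketch: point a freshly reinstalled MariaDB server at the new master.
import pymysql  # assumed client library; any MySQL/MariaDB driver works

# These coordinates would come from `SHOW MASTER STATUS` on db6; placeholders here.
MASTER_LOG_FILE = "db6-bin.000001"
MASTER_LOG_POS = 4

conn = pymysql.connect(host="db7.miraheze.org", user="root", password="...")
try:
    with conn.cursor() as cur:
        cur.execute(
            "CHANGE MASTER TO MASTER_HOST=%s, MASTER_USER=%s, MASTER_PASSWORD=%s, "
            "MASTER_LOG_FILE=%s, MASTER_LOG_POS=%s",
            ("db6.miraheze.org", "repl", "secret", MASTER_LOG_FILE, MASTER_LOG_POS),
        )
        cur.execute("START SLAVE")
        cur.execute("SHOW SLAVE STATUS")
        print(cur.fetchone())  # replication IO/SQL threads should report running
finally:
    conn.close()
```

On the real servers this is driven through Puppet (the dbreplication.pp changes below) rather than an ad-hoc script.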
[15:06:47] PROBLEM - Host db7 is DOWN: PING CRITICAL - Packet loss = 100% [15:13:49] RECOVERY - Host db7 is UP: PING OK - Packet loss = 0%, RTA = 0.20 ms [15:13:51] PROBLEM - db7 SSH on db7 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:13:51] PROBLEM - db7 Disk Space on db7 is CRITICAL: connect to address 51.89.160.143 port 5666: Connection refusedconnect to host 51.89.160.143 port 5666: Connection refused [15:13:51] PROBLEM - ping4 on db7 is CRITICAL: PING CRITICAL - Packet loss = 100% [15:13:52] PROBLEM - db7 Disk Space on db7 is CRITICAL: connect to address 51.89.160.143 port 5666: Connection refusedconnect to host 51.89.160.143 port 5666: Connection refused [15:13:57] PROBLEM - ping4 on db7 is CRITICAL: PING CRITICAL - Packet loss = 100% [15:14:01] RECOVERY - ping4 on db7 is OK: PING OK - Packet loss = 0%, RTA = 0.87 ms [15:14:17] PROBLEM - db7 SSH on db7 is CRITICAL: connect to address 51.89.160.143 and port 22: Connection refused [15:16:13] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfkfV [15:16:14] [02miraheze/puppet] 07paladox 035191d68 - Update db7.yaml [15:16:53] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfkfw [15:16:55] [02miraheze/puppet] 07paladox 0322305d7 - db: Make db7 replicatator [15:17:39] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfkf6 [15:17:41] [02miraheze/puppet] 07paladox 03a9c5acb - Update dbreplication.pp [15:19:52] RECOVERY - db7 SSH on db7 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [15:24:42] RECOVERY - db7 Disk Space on db7 is OK: DISK OK - free space: / 331932 MB (99% inode=99%); [15:26:44] PROBLEM - db7 Puppet on db7 is UNKNOWN: UNKNOWN: Failed to check. 
Reason is: no_summary_file [15:26:55] RECOVERY - db7 Current Load on db7 is OK: OK - load average: 0.38, 0.39, 0.18 [15:29:40] RECOVERY - db7 Puppet on db7 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [15:39:24] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 1.82, 2.05, 1.67 [15:42:28] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 1.42, 1.66, 1.57 [15:50:06] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-2-1 [+0/-0/±1] 13https://git.io/JfkJa [15:50:07] [02miraheze/puppet] 07paladox 036360f37 - Update dbreplication.pp [15:50:09] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfkJw [15:50:10] [02miraheze/puppet] 07paladox 03d68d40c - db: update dbcopy ssh key [15:50:12] uh [15:50:22] that better not saved [15:52:07] paladox: huh [15:59:24] [02puppet] 07paladox created branch 03paladox-patch-2 - 13https://git.io/vbiAS [16:01:03] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3-1 [+0/-0/±1] 13https://git.io/JfkJS [16:01:04] [02miraheze/puppet] 07paladox 033e96363 - Update dbreplication.pp [16:02:12] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3-2 [+0/-0/±1] 13https://git.io/JfkJQ [16:02:14] [02miraheze/puppet] 07paladox 03c3cf517 - Update dbreplication.pp [16:15:23] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-3-1 [16:18:04] [02puppet] 07paladox created branch 03paladox-patch-3-1 - 13https://git.io/vbiAS [16:18:43] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-3-2 [16:18:57] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-2-1 [16:19:12] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3 [+0/-0/±1] 13https://git.io/JfkUJ [16:19:13] [02miraheze/puppet] 07paladox 0393addb2 - db: Update dbcopy ssh key [16:19:15] JohnLewis: monitoring via sigmabot should shortly now ping !staff when icinga-miraheze quits or leaves. It will give a short status check and a full one can be done by SRE, Examknow or me by running !status [16:20:03] [02puppet] 07paladox created branch 03paladox-patch-3-2 - 13https://git.io/vbiAS [16:20:05] !status [16:20:08] Miraheze Service Status: [16:20:09] https://meta.miraheze.org is 03UP03 [16:20:10] https://icinga.miraheze.org is 03UP03 [16:20:12] https://phabricator.miraheze.org is 03UP03 [16:20:14] https://grafana.miraheze.org is 03UP03 [16:20:16] https://publictestwiki.com is 03UP03 [16:20:16] Thats spammy [16:20:18] https://miraheze.org is 03UP03 [16:20:21] RhinosF1: Status report finished. There are currently 0 dead services and 6 alive services. [16:20:30] lets not [16:20:32] and say we did [16:20:45] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-3 [16:20:45] a simple we're up sufficies [16:20:48] Zppix: it’s to be used rarely. Making sure it was on [16:21:00] Zppix: It should be for emergencies really [16:21:04] It checks multiple services as it varies. [16:21:59] someone take us down so we can test if it works [16:22:15] Heh [16:22:32] Should be deployed now [16:22:52] Reception123: The bot tries to visit the webpages and if it gets a normal page it passes. If it gets any other page, it fails. 
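A rough sketch of the check Examknow describes above, assuming the requests library: fetch each service page and count it as alive only on a normal response. The URL list mirrors the earlier !status output; the real SigmaBot code may differ:

```python
# Rough sketch of the !status check: a page that answers normally passes,
# anything else (error page, timeout, connection failure) counts as dead.
import requests  # assumed HTTP client

SERVICES = [
    "https://meta.miraheze.org",
    "https://icinga.miraheze.org",
    "https://phabricator.miraheze.org",
    "https://grafana.miraheze.org",
    "https://publictestwiki.com",
    "https://miraheze.org",
]

def is_alive(url: str, timeout: int = 10) -> bool:
    try:
        return requests.get(url, timeout=timeout).ok  # status < 400 passes
    except requests.RequestException:
        return False

statuses = {url: is_alive(url) for url in SERVICES}
dead = sum(not alive for alive in statuses.values())
print(f"{dead} dead services and {len(statuses) - dead} alive services")
```

A 502 from nginx or a 503 "Backend fetch failed" page fails the `.ok` test, which is how an outage like the one above shows up as dead services in the report.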
[16:24:08] brb [16:24:48] Docs for it: https://phabricator.miraheze.org/T5453#106310 [16:24:49] [ ⚓ T5453 Alert when icinga-miraheze disconnects from IRC ] - phabricator.miraheze.org [16:25:29] [02puppet] 07paladox created branch 03paladox-patch-3 - 13https://git.io/vbiAS [16:25:51] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-7 [+0/-0/±1] 13https://git.io/JfkUn [16:25:53] [02miraheze/puppet] 07paladox 0335942c2 - Update db.pp [16:26:01] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-7-1 [+0/-0/±1] 13https://git.io/JfkUC [16:26:03] [02miraheze/puppet] 07paladox 03a4fa6c1 - Update db.pp [16:26:09] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-2 [16:26:11] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-3 [16:26:18] [02puppet] 07paladox deleted branch 03paladox-patch-3-1 - 13https://git.io/vbiAS [16:26:43] [02puppet] 07paladox deleted branch 03paladox-patch-3-2 - 13https://git.io/vbiAS [16:26:50] [02puppet] 07paladox deleted branch 03paladox-patch-3 - 13https://git.io/vbiAS [16:26:54] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-7-1 [16:26:57] [02puppet] 07paladox deleted branch 03paladox-patch-2-1 - 13https://git.io/vbiAS [16:27:27] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-7 [16:28:45] [02puppet] 07paladox opened pull request 03#1342: db: Update dbcopy ssh key - 13https://git.io/JfkUg [16:28:50] [02puppet] 07paladox edited pull request 03#1342: db: Update dbcopy ssh key - 13https://git.io/JfkUg [16:28:59] [02puppet] 07paladox created branch 03paladox-patch-7 - 13https://git.io/vbiAS [16:29:06] [02puppet] 07paladox closed pull request 03#1342: db: Update dbcopy ssh key - 13https://git.io/JfkUg [16:29:09] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfkUV [16:29:11] [02miraheze/puppet] 07paladox 034977e4f - Update db.pp [16:29:12] [02puppet] 07paladox deleted branch 03paladox-patch-2 - 13https://git.io/vbiAS [16:29:15] [02puppet] 07paladox created branch 03paladox-patch-7-1 - 13https://git.io/vbiAS [16:29:19] [02puppet] 07paladox deleted branch 03paladox-patch-3 - 13https://git.io/vbiAS [16:29:52] [02puppet] 07paladox deleted branch 03paladox-patch-7-1 - 13https://git.io/vbiAS [16:29:54] [02puppet] 07paladox opened pull request 03#1343: db: Update dbcopy ssh key - 13https://git.io/JfkU6 [16:29:58] [02puppet] 07paladox deleted branch 03paladox-patch-3-3 - 13https://git.io/vbiAS [16:30:05] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±2] 13https://git.io/JfkUP [16:30:07] [02miraheze/puppet] 07paladox 039ac1fd4 - db: Update dbcopy ssh key (#1343) [16:30:08] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfkUX [16:30:10] [02miraheze/puppet] 07paladox 03331346c - Update dbreplication.pp [16:30:15] [02puppet] 07paladox closed pull request 03#1343: db: Update dbcopy ssh key - 13https://git.io/JfkU6 [16:30:26] [02puppet] 07paladox deleted branch 03paladox-patch-7 - 13https://git.io/vbiAS [16:30:30] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-2 [16:32:05] ty [16:36:34] * Examknow is back [16:37:04] paladox: big puppet change today? [16:37:17] what big puppet change? 
[16:37:50] paladox: it looked like you were merging a change [16:38:02] yes, but not big [16:38:13] ah [16:54:17] PROBLEM - mon1 IRCEcho on mon1 is CRITICAL: PROCS CRITICAL: 2 processes with args 'ircecho' [17:02:36] PROBLEM - cp6 Current Load on cp6 is CRITICAL: CRITICAL - load average: 2.94, 4.15, 2.89 [17:05:44] RECOVERY - cp6 Current Load on cp6 is OK: OK - load average: 0.80, 2.57, 2.49 [17:15:42] RECOVERY - mon1 IRCEcho on mon1 is OK: PROCS OK: 1 process with args '/usr/local/bin/ircecho' [17:26:19] PROBLEM - mon1 IRCEcho on mon1 is CRITICAL: PROCS CRITICAL: 2 processes with args 'ircecho' [17:30:18] paladox: ^ [17:30:34] yup, we're aware. [17:30:43] Alert to Miraheze Staff: It looks like the icinga-miraheze bot has stopped! Ping !staff. [17:30:44] https://meta.miraheze.org is 03UP03 [17:30:45] https://icinga.miraheze.org is 03UP03 [17:31:00] It works! [17:31:01] !log restart ircecho [17:31:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [17:31:34] thats ironic [17:31:37] * RhinosF1 smiles at Examknow even though he doesn’t want to hear that so soon [17:31:47] icinga-miraheze: comes back revi leaves [17:32:07] Zppix: Everyone hates that bot lol [17:32:44] why was ircecho restarted? [17:33:03] JohnLewis: Because the icinga bot stopped [17:33:46] I don't think so [17:33:52] I feel it left because of the restart [17:34:23] Examknow: can you change the !_staff ping to another one as it’s conflicting for revi. Maybe @sre or something [17:34:58] RhinosF1: !sre sound good? [17:35:08] I can change after lunch [17:35:12] Examknow: fine by me [17:35:35] RhinosF1: {{done}} [17:35:37] https://meta.miraheze.org/wiki/Template:done [17:35:40] I couldn't wait [17:35:53] revi: now !_sre [17:36:13] Awesome [17:36:27] np, it was hours old [17:38:06] RhinosF1: so !staff is no longer a ping but !sre is? [17:38:13] just so I know what I put as my stalkword [17:41:59] RECOVERY - mon1 IRCEcho on mon1 is OK: PROCS OK: 1 process with args '/usr/local/bin/ircecho' [17:42:34] Reception123: !_sre now [17:46:20] who set their nick to ping? [17:46:55] RhinosF1: heh, why the _? [17:47:11] Reception123: to stop the ping going off [17:47:12] so it won't ping everyone [17:47:16] Ignore the _ [17:47:29] oh, I thought you meant that was the ping and I was confused [17:47:38] well I doubt it would since you just announced the stalkword [17:47:57] time to add it and wait for it to be abused [17:48:35] Reception123: Just ban people that abuse it [17:48:46] * Reception123 looks at Zppix [17:50:13] Reception123: why zppix? [17:51:29] Btw, ping is a Freenode staffer [17:52:11] that is not a very good nick [17:52:17] no offense ping [17:54:35] JohnLewis correct, i restarted due to " PROBLEM - mon1 IRCEcho on mon1 is CRITICAL: PROCS CRITICAL: 2 processes with args 'ircecho'" [17:55:03] paladox: but that wasn't the problem :) [17:56:12] JohnLewis what was the problem? 
[17:56:28] icinga-miraheze> RECOVERY - mon1 IRCEcho on mon1 is OK: PROCS OK: 1 process with args '/usr/local/bin/ircecho' [17:57:04] oh [17:57:09] but then it did [18:26:20] PROBLEM - mon1 IRCEcho on mon1 is CRITICAL: PROCS CRITICAL: 2 processes with args 'ircecho' [17:57:15] Reception123: why you looking at me [17:57:17] what did i break [17:57:40] Zppix: Thinks you abuse pingwords [17:58:43] paladox: yes, that was the original issue [17:59:49] oh [17:59:53] Zppix: I was implying you'd be the one to abuse the stalkerword :P [18:01:26] Reception123: why would i need the sre stalkword considering i have access to the servers, and -staff :P [18:03:17] Zppix: someone needs to take the blame [18:03:32] Reception123: that's paladox's job [18:04:03] oh, my bad [18:04:11] X) [18:15:27] paladox: can be unquiet GH now? [18:15:28] *we [18:15:32] yup [18:15:58] [02mw-config] 07paladox closed pull request 03#3031: Add notice for db maintenance - 13https://git.io/JfktF [18:16:00] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfkqJ [18:16:01] [02miraheze/mw-config] 07paladox 0362771e6 - Add notice for db maintenance (#3031) [18:16:03] [02mw-config] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbvb3 [18:16:04] [02miraheze/mw-config] 07paladox deleted branch 03paladox-patch-1 [18:16:27] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfkqU [18:16:29] [02miraheze/puppet] 07paladox 037056a06 - dbt1: Lower buffer pool to 1 [19:08:17] paladox: who is charge of keeping community up to date for db maint [19:08:55] no one [19:09:19] can someone make sure they are kept informed [19:09:22] Zppix: ^ [19:10:01] they have, through the site notice [19:10:04] RhinosF1: if its not affecting them [19:10:05] Zppix: 2020-04-21 - 13:05:23CDT tell Zppix Please see your email, regarding the last message [19:10:27] RhinosF1: if its not affecting them theres not really a need [19:10:28] I’ll be around i suppose [19:10:46] RhinosF1: but im around [19:10:54] Great [19:28:48] Alert to Miraheze Staff: It looks like the MirahezeRC bot has stopped! Recent Changes are no longer avalible from IRC. [19:29:22] The misspelling of "available" grinds my gears [19:33:46] k6ka fixing [19:34:08] staff be advised that the bot just nipped out of -feed [19:39:11] Hello eth01! If you have any questions, feel free to ask and someone should answer soon. [19:44:01] Nice project [19:47:05] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 1.51, 2.19, 1.75 [19:47:11] Hi eth01 [19:59:13] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 0.70, 1.54, 1.83 [20:01:31] Hi Zppix [20:02:12] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 1.08, 1.28, 1.67 [21:06:28] !log stopping mysql on dbt1 [21:06:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [21:06:57] 3:28:48 PM ⇐ +MirahezeRC quit (~MirahezeR@miraheze/bots) Excess Flood [21:06:58] lol [21:07:39] hey eth01 [21:08:06] hey :) [21:09:28] is there an email address I can email the project on generally, or the person in charge? [21:09:42] (I tried looking at your website - but it's down - says server issues) [21:10:08] !s [21:10:11] Please wait while I check the status of Miraheze Services. [21:10:12] RhinosF1: Status report finished. There are currently 1 dead services and 5 alive services. To view the full report, say !status. 
[21:10:17] :( [21:10:27] * RhinosF1 checks in another channel [21:11:03] paladox: ^ we getting issues [21:11:22] ?? [21:11:28] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb [21:11:34] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 41% [21:11:34] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 2001:41d0:800:1056::2/cpweb, 51.161.32.127/cpweb, 2607:5300:205:200::17f6/cpweb [21:11:43] PROBLEM - cp8 Varnish Backends on cp8 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [21:11:43] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [21:11:47] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [21:11:56] paladox: miraheze.org and wikis down [21:12:16] yes, but you knew about the maintenance [21:12:33] oh the db maintenance [21:12:53] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [21:12:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 51% [21:13:21] paladox: 10pm UTC, isn’t it 9pm UTC [21:13:23] PROBLEM - cp7 Varnish Backends on cp7 is CRITICAL: 4 backends are down. mw4 mw5 mw6 mw7 [21:13:27] oh [21:13:31] .t UTC [21:13:31] 2020-04-21 - 21:13:31UTC [21:13:43] grrr, i mixed up the timezones [21:13:53] paladox: why are you stopping it now? [21:13:59] i'm not [21:14:00] PROBLEM - dbt1 MySQL on dbt1 is CRITICAL: Can't connect to MySQL server on '51.77.109.151' (115) [21:14:06] oh [21:14:11] JohnLewis i mixed up the timezones [21:14:15] so DBs are being moved again? [21:14:24] PROBLEM - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is CRITICAL: CRITICAL - NGINX Error Rate is 81% [21:14:30] dreamcast99 no, we're just restarting for a config change. [21:14:40] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is WARNING: WARNING - NGINX Error Rate is 57% [21:14:44] so, is there an email I can email? [21:14:52] so that was the maintenance? paladox [21:15:03] eth01: it depends what you are after :) [21:15:16] dreamcast99 yes [21:15:22] JohnLewis i'll write up an incident report i guess afterwards [21:15:25] People who make decisions , JohnLewis [21:15:37] !log starting mysql [21:16:04] paladox: not technically needed [21:16:07] eth01: about? [21:16:08] it was a planned outage [21:16:11] Mistimed, but planned [21:16:18] JohnLewis well yeh but wrong time. [21:16:29] so, after this is fixed, stuff won't go down again? [21:16:34] But we don't do reports for planned outages [21:16:36] JohnLewis: planned for the wrong time, a message somewhere from SRE would be nice [21:17:39] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 35% [21:18:47] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [21:24:20] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is WARNING: WARNING - NGINX Error Rate is 55% [21:26:17] RECOVERY - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is OK: OK - NGINX Error Rate is 22% [21:27:16] paladox: how long until 503s recover? Still down [21:27:19] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 61% [21:27:55] Should be soon. mysql i think is going over all dbs so it takes abit. 
[21:28:07] Ah, np
[21:28:30] * hispano76 greetings
[21:28:30] At least we got a chance to test & fix SigmaBot’s status function
[21:28:38] hi hispano76
[21:28:44] hey
[21:28:47] hey hispano76
[21:28:51] hey dreamcast99
[21:28:54] how are you doing hispano?
[21:28:54] :D
[21:29:28] perfect! dreamcast99
[21:29:35] nice
[21:30:17] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 32%
[21:33:54] https://phabricator.miraheze.org/T1979#106378 I answered
[21:33:55] [ ⚓ T1979 Error: 503 Backend fetch failed ] - phabricator.miraheze.org
[21:35:56] lol someone should take phab down so ppl can't moan about the server being down
[21:36:31] Used a ticket from 2017. New issue == new ticket
[21:36:35] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 57%
[21:36:52] yeah I saw
[21:36:53] we're up
[21:37:00] PROBLEM - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is CRITICAL: CRITICAL - NGINX Error Rate is 63%
[21:37:05] thankfully I was able to rescue my edit
[21:37:11] !s
[21:37:13] Please wait while I check the status of Miraheze Services.
[21:37:14] Examknow: Status report finished. There are currently 1 dead services and 5 alive services. To view the full report, say !status.
[21:37:19] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JfkZA
[21:37:20] yayyy
[21:37:20] [miraheze/puppet] paladox 6e63d4b - Fix variable name
[21:37:30] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 7 backends are healthy
[21:37:31] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[21:37:32] RECOVERY - dbt1 MySQL on dbt1 is OK: Uptime: 1320 Threads: 22 Questions: 20057 Slow queries: 577 Opens: 2492 Flush tables: 1 Open tables: 2486 Queries per second avg: 15.194
[21:37:33] What’s the dead one
[21:37:44] thanks for your services
[21:38:04] RECOVERY - cp8 Varnish Backends on cp8 is OK: All 7 backends are healthy
[21:38:23] dreamcast99: your welcome
[21:38:32] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JfkZx
[21:38:33] Nothing is dead, the bot is wrong
[21:38:34] [miraheze/puppet] paladox b802ef2 - Update dbt1.yaml
[21:39:09] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy
[21:39:31] RECOVERY - cp7 Varnish Backends on cp7 is OK: All 7 backends are healthy
[21:39:32] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 6%
[21:39:57] RECOVERY - cp8 HTTP 4xx/5xx ERROR Rate on cp8 is OK: OK - NGINX Error Rate is 2%
[21:39:58] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[21:40:04] RECOVERY - cp7 HTTP 4xx/5xx ERROR Rate on cp7 is OK: OK - NGINX Error Rate is 1%
[21:40:05] paladox: icinga guest account just said ‘no dashlet available’
[21:40:22] works for me
[21:41:46] paladox: working now
[21:43:51] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JfknO
[21:43:52] [miraheze/mw-config] paladox 2e6990a - Remove Sitenotice for db maintenance
[21:56:08] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 2.66, 2.68, 1.85
[21:59:11] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 0.65, 1.76, 1.64
[22:02:13] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.76, 1.29, 1.47
[22:10:18] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/Jfkca
[22:10:20] [miraheze/services] MirahezeSSLBot 09c434d - BOT: Updating services config for wikis
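The "All 7 backends are healthy" recoveries above come from Varnish's backend health probes against the mw* application servers. A quick way to confirm the same state from a cache proxy's shell, assuming a stock Varnish install rather than anything Miraheze-specific:

    # List each backend and whether its health probe currently passes.
    varnishadm backend.list
    # End-to-end spot check: this should now return a 200 (or a redirect)
    # instead of the "503 Backend fetch failed" seen earlier.
    curl -sI https://meta.miraheze.org/ | head -n 1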
[22:22:48] [ssl] Pix1234 created branch augovwiki - https://git.io/vxP9L
[22:22:50] [miraheze/ssl] Pix1234 created branch augovwiki https://git.io/JfkCq
[22:24:52] [ssl] Pix1234 deleted branch augovwiki - https://git.io/vxP9L
[22:24:54] [miraheze/ssl] Pix1234 deleted branch augovwiki
[22:25:14] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JfkCO
[22:25:16] [miraheze/services] MirahezeSSLBot ca34e93 - BOT: Updating services config for wikis
[23:35:11] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JfklT
[23:35:13] [miraheze/services] MirahezeSSLBot 26aafb3 - BOT: Updating services config for wikis
[23:39:30] [miraheze/mediawiki] paladox pushed 1 commit to REL1_34 [+0/-0/±1] https://git.io/JfklY
[23:39:32] [miraheze/mediawiki] paladox 2cd9905 - Update Moderation
[23:46:37] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 21.48, 18.60, 15.89
[23:49:38] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 19.59, 19.09, 16.57
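The Current Load alerts scattered through the log (cp8 earlier, cloud1 here) report the standard 1/5/15-minute load averages from the usual load check; a sketch with placeholder thresholds, since the per-host warning and critical levels Miraheze actually uses are not visible in the log:

    # Hypothetical thresholds only: warn above 1.5 and go critical above 2.0
    # on any of the 1, 5 and 15 minute load averages. cloud1 evidently runs
    # with far higher limits than the cp* proxies.
    /usr/lib/nagios/plugins/check_load -w 1.5,1.5,1.5 -c 2,2,2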