[00:01:28] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:03:32] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 8.058 second response time
[00:12:54] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.68, 7.17, 6.60
[00:16:53] PROBLEM - mw3 Current Load on mw3 is CRITICAL: CRITICAL - load average: 9.60, 8.03, 7.02
[00:18:53] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.11, 7.60, 6.99
[00:26:51] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 5.37, 6.49, 6.76
[00:41:04] PROBLEM - test1 Current Load on test1 is WARNING: WARNING - load average: 1.75, 1.42, 1.01
[00:43:03] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 0.95, 1.26, 1.00
[01:03:49] PROBLEM - test1 Current Load on test1 is CRITICAL: CRITICAL - load average: 2.28, 2.16, 1.52
[01:03:58] PROBLEM - cp4 Current Load on cp4 is CRITICAL: CRITICAL - load average: 2.12, 1.71, 0.96
[01:05:46] PROBLEM - test1 Current Load on test1 is WARNING: WARNING - load average: 1.05, 1.81, 1.47
[01:05:58] RECOVERY - cp4 Current Load on cp4 is OK: OK - load average: 0.53, 1.22, 0.87
[01:07:44] RECOVERY - test1 Current Load on test1 is OK: OK - load average: 1.01, 1.58, 1.43
[01:19:37] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:21:39] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 5.792 second response time
[03:58:20] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:02:27] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 4.562 second response time
[04:36:10] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:38:13] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 6.840 second response time
[05:04:50] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:06:48] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 1.806 second response time
[06:27:34] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[06:29:33] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 2.675 second response time
[08:51:19] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:53:20] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 4.266 second response time
[08:59:24] Reception123, Paladox: ^^
[08:59:56] will have to wait for paladox, don't know anything about ES other than it has issues
[09:00:18] Reception123: :)
[09:00:47] !log purge binary logs before '2019-07-14 09:00:00';
[09:00:52] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[09:04:21] Reception123: why isn't https://phabricator.miraheze.org/T4540 UBN? MW3 and ES have been having issues for 24 hours now and I'm guessing that's related.
[09:04:23] [ ⚓ T4540 Purchase db5 ] - phabricator.miraheze.org
[09:11:55] RhinosF1: well it's more of a discussion for now because we need to decide between ES or db5
[09:12:01] and there is still some space left (currently 14GB)
[09:14:27] Reception123: ah
[09:15:25] though soon it will probably be UBN
[09:15:51] Reception123: yeah
[09:16:27] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:22:39] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 2.623 second response time
[09:46:45] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:55:11] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 9.713 second response time
[10:10:02] Hi
[10:14:17] hi @Vabessa2006
[10:19:48] I see you again <:3
[10:20:59] hm?
[10:21:51] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:28:09] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 8.303 second response time
[10:31:33] Oh sorry my bad sorry
[10:36:34] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:38:38] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 6.833 second response time
[10:47:05] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:49:06] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 4.890 second response time
[10:55:30] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:57:34] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 8.154 second response time
[11:01:06] You know Luigipro47 right?
[11:50:22] Not sure who you're referring to, no
[12:23:47] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:25:30] super minor but why do we have monitoring on an lb domain?
[12:25:40] whose sole purpose should be rotation, not serving traffic directly?
[12:25:45] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 1.582 second response time
[12:56:43] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:02:54] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 1.413 second response time
[14:15:10] paladox: ^
[14:15:54] JohnLewis elasticsearch-lb is being monitored so that it catches if nginx/elasticsearch went down.
[14:16:17] paladox: lbs should not be used for monitoring
[14:16:20] We serve traffic over it (since we use https)
[14:16:44] but you're checking if a random server goes down
[14:16:52] you're not checking if elasticsearch1 goes down by using an lb
[14:17:23] ok
[14:17:48] though I wouldn't invest the time to fix it since presumably we're binning the service
[14:17:50] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 6.96, 6.76, 6.04
[14:18:09] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fjX0W
[14:18:10] [miraheze/puppet] paladox 3620aae - Revert "Monitor elasticsearch-lb.miraheze.org" This reverts commit edf331d3e1cfbad7a98c57e68635d0d51d819a1a.
[14:18:17] JohnLewis done ^^
[14:18:39] okay
[14:20:49] PROBLEM - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:31:16] RECOVERY - elasticsearch1 elasticsearch-lb.miraheze.org HTTPS on elasticsearch1 is OK: HTTP OK: HTTP/1.1 200 OK - 790 bytes in 3.662 second response time
[14:44:57] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.47, 6.67, 6.73
[16:38:49] PROBLEM - wiki.zymonic.com - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.zymonic.com' expires in 15 day(s) (Tue 30 Jul 2019 04:34:58 PM GMT +0000).
[16:39:04] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fjXEy
[16:39:06] [miraheze/ssl] MirahezeSSLBot 3cd5d51 - Bot: Update SSL cert for wiki.zymonic.com
[16:44:48] RECOVERY - wiki.zymonic.com - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.zymonic.com' will expire on Sat 12 Oct 2019 03:38:57 PM GMT +0000.
[17:18:45] Hoi :3
[17:19:29] @vabessa2006 hi
[17:19:48] What do you think?
[17:19:48] https://cdn.discordapp.com/attachments/435711390544560128/600013582356316193/image0.jpg
[17:21:02] @Vabessa2006 it's a good profile pic
[17:21:11] Can we help with Miraheze?
[17:21:37] Yea
[17:21:39] Thanks
[17:24:00] @Vabessa2006 okay just ask if you have any questions
[17:27:13] Yea
[18:15:02] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.30, 6.86, 6.38
[18:15:08] Erm... https://phabricator.miraheze.org/T4543
[18:15:10] [ ⚓ T4543 CreateWiki JobQueueError ] - phabricator.miraheze.org
[18:15:22] Reception123, Paladox: ^^
[18:15:41] ^ paladox what I told you about earlier...
[18:16:58] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.01, 6.60, 6.34
[18:19:11] Reception123: anything we can do about it?
[18:19:23] not sure
[18:20:12] Reception123: that’ll be redis restarting..
[18:20:13] So nothing we can do about it atm (unless we give misc2 more ram which we can't due to the budget)
[18:23:07] Paladox: So try again later
[18:23:58] Can one of you update the linked phab task pls as well
[18:24:23] Hmm
[18:24:24] https://grafana.miraheze.org/d/h8GaZJZWz/prometheus-redis?orgId=1&refresh=30s
[18:24:25] [ Grafana ] - grafana.miraheze.org
[18:24:46] JohnLewis: or reception123 can you check the logs please?
[18:25:28] Timing does not fit with the time of the redis OOM
[18:26:06] [mw-config] Reception123 closed pull request #2695: Do not merge - https://git.io/fjXvX
[18:26:08] [mw-config] Reception123 commented on pull request #2695: Do not merge - https://git.io/fjXzs
[18:27:07] Should be 3 logs as I tried it 3 times before realising it definitely didn't work - only saved the id from the first though
[18:28:23] It was when doing https://meta.miraheze.org/wiki/Special:RequestWikiQueue/8681#mw-section-request
[18:28:25] [ Wiki requests queue - Miraheze Meta ] - meta.miraheze.org
[18:30:54] Reception123 or johnLewis you can use salt to look at the php error log and exception log (grepping for that exception)
[18:32:09] Otherwise I can do that later tonight
[18:32:19] It’ll be in the exception log
[18:32:37] * paladox currently mobile
[18:32:39] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.65, 6.98, 6.54
[18:34:39] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.32, 6.72, 6.49
[18:34:52] RhinosF1: did the 2nd and third attempts show the same error?
[18:38:59] Paladox: same error but didn't note the exact exception number.
[18:39:28] Ok
[18:41:34] Paladox: only difference to my usual requests is sitename has a space in.
[18:41:53] the task isn't UBN tbh
[18:42:41] JohnLewis: We don't yet know if it affects all wiki creations
[18:42:46] RhinosF1: https://arg.miraheze.org/wiki/Main_Page
[18:42:47] [ Knowledge Base ] - arg.miraheze.org
[18:42:48] it doesn't
[18:42:53] I know it doesn't :)
[18:43:01] Paladox: nah that's me getting sitename and dbname mixed up
[18:43:19] ^
[18:43:41] JohnLewis: the wiki requester would likely not be given the admin powers.
[18:43:43] JohnLewis: how did that work and CreateWiki still show it as open
[18:43:56] paladox: correct
[18:44:04] RhinosF1: because the exception occurs during the process
[18:44:04] Because anything that used the job queue failed :)
[18:44:17] the status is set at the end of the process, to make sure no errors occurred
[18:44:38] Ah
[18:45:10] Couldn't a steward give the rights manually
[18:45:31] And then close the wiki request while we work out what happened
[18:46:15] Yup
[18:46:16] Which lucky for you johnLewis is :)
[18:46:52] well stewards can't close it
[18:48:28] JohnLewis: the wiki Request, couldn't you decline but leave a message saying it's actually created.
[18:48:45] RhinosF1: well... yes but that's bad and wrong
[18:48:57] because the user will be told their request is declined
[18:49:34] JohnLewis: can sysop close?
[18:49:50] paladox: if a Steward can't a sysop can't obvs?
[18:50:17] Ok
[18:50:28] JohnLewis: well yeah, but we could leave them a handwritten message explaining it in other terms
[18:50:33] You can do it through the db though?
[18:50:36] paladox: yes
[18:50:42] RhinosF1: it's just super wrong
[18:51:36] JohnLewis: k
[18:52:05] !log UPDATE cw_requests SET cw_status = 'approved' WHERE cw_id = 8681;
[18:52:43] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[18:56:04] Thanks johnLewis!
[18:57:08] I'll let someone who at least half knows what happened update the Phab task
[20:57:19] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fjXg8
[20:57:21] [miraheze/mw-config] paladox 74b3ef4 - Revert "Sitenotice for ElasticSearch" This reverts commit 9bafa5652bd5335fdc5d1c8e0be84589cce9edf8.
[21:57:36] PROBLEM - theliteratureproject.org - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Cannot make SSL connection.
[21:58:57] that wiki again, didn't we have issues the other day with it
[22:41:59] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.03, 6.82, 6.34
[22:43:56] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.40, 6.70, 6.36
[23:13:03] PROBLEM - mw3 Current Load on mw3 is WARNING: WARNING - load average: 7.14, 6.75, 6.36
[23:15:00] RECOVERY - mw3 Current Load on mw3 is OK: OK - load average: 6.05, 6.48, 6.31