[00:02:57] RECOVERY - cp8 Disk Space on cp8 is OK: DISK OK - free space: / 3578 MB (18% inode=93%); [00:03:21] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 23.87, 23.17, 17.61 [00:05:17] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 12.32, 19.32, 16.87 [00:14:28] [02ManageWiki] 07The-Voidwalker opened pull request 03#170: apply blacklist to groups by passing ceMW by reference - 13https://git.io/JfpZf [00:14:54] [02ManageWiki] 07paladox closed pull request 03#170: apply blacklist to groups by passing ceMW by reference - 13https://git.io/JfpZf [00:14:56] [02miraheze/ManageWiki] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfpZU [00:14:57] [02miraheze/ManageWiki] 07The-Voidwalker 0330c8526 - apply blacklist to groups by passing ceMW by reference (#170) [00:15:42] [02miraheze/mediawiki] 07paladox pushed 031 commit to 03REL1_34 [+0/-0/±1] 13https://git.io/JfpZt [00:15:43] [02miraheze/mediawiki] 07paladox 03b14ac68 - Update ManageWiki [00:31:45] PROBLEM - jobrunner1 Puppet on jobrunner1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core] [00:33:15] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 22.96, 19.35, 16.84 [00:33:48] RECOVERY - jobrunner1 Puppet on jobrunner1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:35:20] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 15.21, 18.30, 16.82 [01:46:14] !log upgrade phabricator on phab1 [01:46:17] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [02:22:23] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3 [+0/-0/±1] 13https://git.io/Jfpll [02:22:25] [02miraheze/puppet] 07paladox 03dd28171 - openldap: Remove ldapcherry [02:22:26] [02puppet] 07paladox created branch 03paladox-patch-3 - 13https://git.io/vbiAS [02:22:28] [02puppet] 07paladox opened pull request 03#1410: openldap: Remove ldapcherry - 13https://git.io/Jfpl8 [02:22:43] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3 [+0/-1/±0] 13https://git.io/JfplB [02:22:44] [02miraheze/puppet] 07paladox 0344a9ddf - Delete ldapcherry-nginx.conf [02:22:46] [02puppet] 07paladox synchronize pull request 03#1410: openldap: Remove ldapcherry - 13https://git.io/Jfpl8 [02:22:53] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3 [+0/-1/±0] 13https://git.io/JfplR [02:22:55] [02miraheze/puppet] 07paladox 03ebe0350 - Delete ldapcherry.ini.erb [02:22:56] [02puppet] 07paladox synchronize pull request 03#1410: openldap: Remove ldapcherry - 13https://git.io/Jfpl8 [02:23:02] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-3 [+0/-1/±0] 13https://git.io/Jfplu [02:23:04] [02miraheze/puppet] 07paladox 036575e48 - Delete ldapcherry.systemd.erb [02:23:06] [02puppet] 07paladox synchronize pull request 03#1410: openldap: Remove ldapcherry - 13https://git.io/Jfpl8 [02:23:37] [02puppet] 07paladox closed pull request 03#1410: openldap: Remove ldapcherry - 13https://git.io/Jfpl8 [02:23:38] [02miraheze/puppet] 07paladox pushed 035 commits to 03master [+0/-6/±2] 13https://git.io/Jfpl2 [02:23:40] [02miraheze/puppet] 07paladox 03a8e1983 - Merge pull request #1410 from miraheze/paladox-patch-3 openldap: Remove ldapcherry [02:25:24] !log apt-get --purge remove python3-setuptools - ldap1 [02:25:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master 
[02:25:31] !log apt-get --purge remove nginx*
[02:25:34] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[02:25:40] ldap1
[02:26:16] PROBLEM - ldap1 HTTPS on ldap1 is CRITICAL: connect to address 51.68.222.55 and port 443: Connection refusedHTTP CRITICAL - Unable to open TCP socket
[02:29:11] [miraheze/dns] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JfplD
[02:29:13] [miraheze/dns] paladox ac972c9 - Remove ldapcherry
[02:34:04] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_dns]
[02:38:00] !log root@ns1:/etc/gdnsd# git reset --hard origin/master - modified files, reset.
[02:38:03] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[02:38:19] !log restart gdnsd on ns1
[02:38:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[02:38:25] now i'm off
[02:42:06] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[03:02:42] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 1.98, 2.29, 1.51
[03:04:39] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.60, 1.69, 1.38
[05:03:40] PROBLEM - cp8 Current Load on cp8 is CRITICAL: CRITICAL - load average: 3.16, 3.11, 1.76
[05:07:33] PROBLEM - cp8 Current Load on cp8 is WARNING: WARNING - load average: 0.67, 1.71, 1.47
[05:09:31] RECOVERY - cp8 Current Load on cp8 is OK: OK - load average: 0.51, 1.32, 1.36
[06:26:24] !log sudo -u www-data php /srv/mediawiki/w/maintenance/rebuildall.php --wiki testwiki
[06:26:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[08:26:27] !log rhinos@jobrunner1:~$ sudo -u www-data php /srv/mediawiki/w/maintenance/deleteBatch.php --i=3 --r="Requested in [[phab:T5794]]" --wiki=thefinalrumblewiki /home/rhinos/t5794.txt
[08:26:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[08:29:12] !log delete labster[at]miraheze.org from active status.miraheze.wiki invites
[08:29:15] Reception123: ^
[08:29:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[08:29:45] Ack
[08:30:55] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-1 [+0/-0/±1] https://git.io/JfpV9
[08:30:57] [miraheze/mw-config] RhinosF1 e0111e5 - + 60 and 90 to RC Link days Per request on IRC
[08:30:58] [mw-config] RhinosF1 created branch RhinosF1-patch-1 - https://git.io/vbvb3
[08:31:01] [mw-config] RhinosF1 opened pull request #3122: + 60 and 90 to RC Link days - https://git.io/JfpVH
[08:31:18] Reception123: does that look right do you before I deploy?
[08:32:01] RhinosF1: yup, wait for travis and then go ahead
[08:32:08] Reception123: will do
[08:34:25] [mw-config] RhinosF1 closed pull request #3122: + 60 and 90 to RC Link days - https://git.io/JfpVH
[08:34:26] [miraheze/mw-config] RhinosF1 pushed 1 commit to master [+0/-0/±1] https://git.io/JfpVN
[08:34:28] [miraheze/mw-config] RhinosF1 c09ab7f - + 60 and 90 to RC Link days (#3122) Per request on IRC
[08:35:23] .ask JohnLewis do you still need https://github.com/miraheze/mw-config/tree/revert-3102-patch-2 as a branch?
[08:35:24] RhinosF1: I'll pass that on when JohnLewis is around.
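The deleteBatch.php run logged at 08:26 above (and checked on through the morning below) is the stock MediaWiki maintenance script: it reads page titles one per line from a file and deletes them, sleeping the --i interval in seconds between deletions and recording --r as the deletion reason. A minimal sketch of that pattern with placeholder values - the real title list lived in /home/rhinos/t5794.txt:

    # illustrative only; wiki name, reason and titles are placeholders
    cat > titles.txt <<'EOF'
    Some unwanted page
    Talk:Some unwanted page
    File:Unused upload.png
    EOF
    sudo -u www-data php /srv/mediawiki/w/maintenance/deleteBatch.php \
        --wiki=examplewiki --i=3 --r="Batch cleanup per local request" titles.txt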
[08:35:25] [ GitHub - miraheze/mw-config at revert-3102-patch-2 ] - github.com [08:35:31] now to wait for puppet [08:38:28] .in 1 hour check status of delete batch [08:38:29] RhinosF1: Okay, will remind at 2020-06-24 - 10:38:29BST [08:45:44] [02mw-config] 07RhinosF1 deleted branch 03RhinosF1-patch-1 - 13https://git.io/vbvb3 [08:45:45] [02miraheze/mw-config] 07RhinosF1 deleted branch 03RhinosF1-patch-1 [09:02:10] Hello CristalSys! If you have any questions, feel free to ask and someone should answer soon. [09:38:29] RhinosF1: check status of delete batch [09:39:01] ZppixBot: at least 30 mins left [09:39:11] .in 30 mins delete batch [09:39:12] RhinosF1: Okay, will remind at 2020-06-24 - 11:09:11BST [09:42:14] * RhinosF1 forgot how slow delete batch is [10:09:12] RhinosF1: delete batch [10:09:38] 181 pages left [10:09:56] ~16 mins [10:39:31] 48 [10:44:22] 28 [10:46:32] !log running delete batch for another 28 pages that didn't pick up [10:46:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [10:48:50] * RhinosF1 put in screen - lunch! [11:01:43] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 40% [11:02:07] Huh? [11:03:12] paladox, reception123: we're online but ---^ [11:03:44] RECOVERY - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is OK: OK - NGINX Error Rate is 3% [11:04:47] cp3 was same but only got to 1/3 so never fired [11:37:28] Hello Taichi! If you have any questions, feel free to ask and someone should answer soon. [11:37:46] RhinosF1 Ban to joaquinito01 [11:38:05] joaquinito01 is blocked globally for sockpuppet. [11:38:57] Taichi: they are banned [11:39:23] RhinosF1 Ok. Ban inmediately from IRC to joaquinito01 [11:39:38] Taichi: they are not in here [11:40:03] RhinosF1 OK. I'm in public test wiki, i need to test admin rights. 
[11:40:42] Taichi: You're not fooling anyone [11:58:00] !log rhinos@jobrunner1:~$ sudo -u www-data php /srv/mediawiki/w/maintenance/rebuildall.php --wiki=thefinalrumblewiki and initSiteStats after mass deletion [11:58:04] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [12:23:19] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 21.04, 18.02, 15.39 [12:25:19] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 9.63, 14.93, 14.60 [13:17:39] [02miraheze/mw-config] 07RhinosF1 pushed 031 commit to 03RhinosF1-patch-1 [+0/-0/±1] 13https://git.io/JfpH2 [13:17:41] [02miraheze/mw-config] 07RhinosF1 03fcd08be - apply default blacklist to meta as well [13:17:42] [02mw-config] 07RhinosF1 created branch 03RhinosF1-patch-1 - 13https://git.io/vbvb3 [13:18:27] [02mw-config] 07RhinosF1 opened pull request 03#3123: apply default blacklist to meta as well - 13https://git.io/JfpHr [13:20:30] [02mw-config] 07RhinosF1 closed pull request 03#3123: apply default blacklist to meta as well - 13https://git.io/JfpHr [13:20:32] [02miraheze/mw-config] 07RhinosF1 pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfpHK [13:20:33] [02miraheze/mw-config] 07RhinosF1 03629ded6 - apply default blacklist to meta as well (#3123) [13:32:49] [02miraheze/mw-config] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfpQs [13:32:51] [02miraheze/mw-config] 07paladox 039169dc6 - Fix wgTitleBlacklistSources [13:32:52] [02mw-config] 07paladox created branch 03paladox-patch-2 - 13https://git.io/vbvb3 [13:32:54] [02mw-config] 07paladox opened pull request 03#3124: Fix wgTitleBlacklistSources - 13https://git.io/JfpQG [13:41:04] [02mw-config] 07paladox synchronize pull request 03#3124: Fix wgTitleBlacklistSources - 13https://git.io/JfpQG [13:41:06] [02miraheze/mw-config] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/JfpQX [13:41:07] [02miraheze/mw-config] 07paladox 033aa1a48 - Update LocalSettings.php [13:41:13] [02mw-config] 07paladox edited pull request 03#3124: Fix wgTitleBlacklistSources and wgTitleBlacklistUsernameSources - 13https://git.io/JfpQG [13:44:38] [02mw-config] 07paladox closed pull request 03#3124: Fix wgTitleBlacklistSources and wgTitleBlacklistUsernameSources - 13https://git.io/JfpQG [13:44:40] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfpQd [13:44:41] [02miraheze/mw-config] 07paladox 03040182e - Fix wgTitleBlacklistSources and wgTitleBlacklistUsernameSources (#3124) * Fix wgTitleBlacklistSources * Update LocalSettings.php [13:44:43] [02mw-config] 07paladox deleted branch 03paladox-patch-2 - 13https://git.io/vbvb3 [13:44:44] [02miraheze/mw-config] 07paladox deleted branch 03paladox-patch-2 [14:09:08] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 28.71, 21.73, 17.81 [14:13:14] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 18.18, 22.40, 19.16 [14:19:05] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 9.74, 7.99, 5.96 [14:19:15] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 13.15, 17.84, 18.32 [14:23:09] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.62, 7.51, 6.25 [14:25:14] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfpd1 [14:25:16] [02miraheze/services] 07MirahezeSSLBot 037832abd - BOT: Updating services config for wikis [14:32:35] PROBLEM - cloud1 
Current Load on cloud1 is CRITICAL: CRITICAL - load average: 31.18, 24.33, 20.82
[14:34:33] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 14.63, 20.66, 19.91
[14:35:18] Hi there! I have what may seem like a stupid question, but I was curious if there was a way to make {{indent}} work the same as {{Indent}}. At present, it's showing the actual curly brackets and the word "indent" if I use lowercase, but there seems to be no way to create a page/template with lowercase letters.
[14:35:26] RECOVERY - db6 Current Load on db6 is OK: OK - load average: 6.17, 6.76, 6.57
[14:35:48] Onyxdubh: can you link me to the page?
[14:36:35] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 12.91, 17.81, 18.94
[14:36:36] https://ashesst.miraheze.org/wiki/Template:Indent
[14:36:37] [ Permission error - Ashes of My Dead Desire Storyteller Wiki ] - ashesst.miraheze.org
[14:37:15] The un-private version: https://ashesplayer.miraheze.org/wiki/Template:Indent >.> In case it matters.
[14:37:16] [ Template:Indent - Ashes of My Dead Desire Player Wiki ] - ashesplayer.miraheze.org
[14:39:21] Onyxdubh: do you have a link to a page where {{indent}} doesn't work
[14:39:28] * RhinosF1 can see private wikis
[14:39:30] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.60, 7.35, 6.84
[14:39:34] Ahah. Then yes, hold on
[14:39:56] ... And now it's working
[14:40:19] And I changed literally nothing but the capitalization of it. It failed before, and now it's working. Because I went to see the repair people, clearly
[14:41:11] Onyxdubh: We didn't touch it. {{indent}} should work
[14:41:42] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.79, 8.31, 7.29
[14:42:07] Nope, you didn't touch it. I think that because I came and asked, it decided to work. Like when you take your car into the shop and suddenly the "service engine" light is no longer on.
[14:42:40] * Onyxdubh grumbles 'cause he always feels stupid when this happens, but thank you very much for the threat of your interference apparently terrifying it into working.
[14:43:10] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 22.52, 22.99, 20.82
[14:43:29] Onyxdubh: MediaWiki can be strange. Likely just slow. You could just put a colon in front of text though to indent it.
[14:44:02] I didn't know that either. Thank you for that too! I have different sized indents for different things, but I assume that the colon is the default paragraph indent?
[14:44:38] Onyxdubh: Colon is built in. Each colon will do it further. As long as they're at the start of the line.
[14:45:41] You're awesome! (Some of them are legit, two spaces bigger than the last one... It's for formatting character sheets, so the colons only help with text, but that's an amazing shortcut and I wish I'd learnt it earlier.)
[14:45:52] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.62, 7.31, 7.18
[14:46:21] Onyxdubh: we're here to help.
[14:46:30] Have a great day!
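For reference, the colon indentation described at 14:43-14:44 is core wikitext behaviour: each colon at the start of a line adds one level of indentation, for example:

    This line is not indented.
    :This line is indented one level.
    ::This line is indented two levels.

The earlier {{indent}} vs {{Indent}} confusion is also expected: MediaWiki capitalises the first letter of page titles by default, so {{indent}} and {{Indent}} resolve to the same Template:Indent page, while case differences later in the name would point to a different template.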
[14:46:39] Thanks [14:46:59] :) [14:49:41] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 41.13, 28.52, 23.31 [14:50:26] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 9.79, 8.02, 7.42 [14:50:34] Lovely [14:54:30] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.05, 7.16, 7.22 [14:55:46] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 13.25, 21.68, 22.38 [15:10:00] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 28.43, 22.09, 21.49 [15:10:59] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.13, 7.19, 7.04 [15:11:14] paladox: high usage or ? [15:11:44] high i/o [15:13:01] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.75, 6.69, 6.88 [15:13:01] paladox: expected high I/O? [15:14:06] well i don't know what to expect since if it was a SSD i would say no, but a HDD? Yes. [15:14:18] expecially seeing as we're hosting all our services on it [15:14:19] Ok [15:17:35] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 10.01, 8.24, 7.44 [15:18:06] Hello loglake! If you have any questions, feel free to ask and someone should answer soon. [15:18:57] hi [15:19:04] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 16.71, 21.20, 15.92 [15:19:24] HI [15:20:10] Hi loglake [15:20:41] problem [15:20:53] loglake: what problem? [15:21:02] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 8.15, 16.93, 15.03 [15:21:11] big problem [15:21:57] can you Please tell us what the problem is ? [15:22:06] There isn't one [15:23:12] every time I see the IRC nickname loglake I think it's an Intel chip [15:23:39] Please stop commenting. For the record, they were kicked for ban evading [15:28:37] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.49, 7.59, 7.65 [15:30:42] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 9.23, 8.52, 8.01 [15:39:09] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.90, 7.64, 7.97 [15:49:31] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.27, 8.14, 8.06 [15:57:43] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.09, 7.66, 7.92 [15:59:44] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 7.73, 7.95, 8.01 [16:17:06] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 4.21, 6.57, 7.81 [16:31:24] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 11.38, 8.82, 8.10 [16:37:35] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.37, 7.51, 7.84 [16:40:28] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 20.96, 21.62, 23.63 [16:42:28] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 32.99, 24.49, 24.36 [16:43:36] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.45, 7.28, 7.52 [16:45:40] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.95, 7.33, 7.50 [16:47:39] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.21, 7.59, 7.57 [16:49:46] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.45, 7.42, 7.50 [16:50:10] uh what are you doing db6? 
[16:50:11] ^ paladox [16:50:38] Reception123: high I/O [16:50:46] yeah seems so [16:51:20] https://grafana.miraheze.org/d/W9MIkA7iz/miraheze-cluster?orgId=1&var-job=node&var-node=db6.miraheze.org&var-port=9100 [16:51:20] [ Grafana ] - grafana.miraheze.org [16:51:26] Reception123: scroll up an hour [16:52:05] "since if it was a SSD i would say no, but a HDD? Yes." I see [16:52:13] That [16:53:03] well hardware isn't my department (many things aren't :P) so yeah I can't really say much more [16:53:08] maybe SPF|Cloud can take a look [16:53:42] * RhinosF1 is not a hardware expert either [16:58:36] PROBLEM - cp8 Disk Space on cp8 is WARNING: DISK WARNING - free space: / 2116 MB (10% inode=93%); [17:00:19] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 19.46, 11.14, 8.63 [17:02:11] paladox: report of 502 from Cocopuff [17:02:20] is he uploading files? [17:02:34] because i cannot reproduce [17:03:50] * RhinosF1 asks [17:08:28] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.45, 7.45, 7.91 [17:10:31] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 9.46, 8.11, 8.09 [17:25:16] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfhv5 [17:25:18] [02miraheze/services] 07MirahezeSSLBot 03e85eccb - BOT: Updating services config for wikis [17:29:21] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.56, 7.45, 7.98 [17:31:35] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 9.85, 8.29, 8.21 [17:35:37] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.27, 7.02, 7.75 [17:41:49] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.04, 7.49, 7.62 [17:44:18] [02miraheze/mediawiki] 07paladox pushed 0324 commits to 03REL1_34 [+2/-6/±113] 13https://git.io/Jfhfb [17:44:20] [02miraheze/mediawiki] 07reedy 03d29905e - Start 1.34.2 Change-Id: Id94c154302f981b939f1d9789cb5a02a21b2024f [17:44:21] [02miraheze/mediawiki] 07it-spiderman 0364676b5 - Update git submodules * Update extensions/OATHAuth from branch 'REL1_34' to a71bd68ac4cd3258bd80560c344322ae63fa9c5d - Define fallback for request IP when persisting user Bug: T237554 Change-Id: I18f57a523a6515f593963a9c149374bd6f6c73b4 (cherry picked from commit 54fc8a0cbf6145ffa3dfc684465cbd3fe6dea064) [17:44:23] [02miraheze/mediawiki] 07samwilson 03abb5cd7 - Clean up unused $displayPassword return value This is a follow-up to f12a3edff708a1fb73a09d154693dba49b69d921 to remove the now unused $password return variable. Change-Id: I2b12bd7c9f84e915f1bda659a95bab3d63a611d2 [17:44:24] [02miraheze/mediawiki] ... and 21 more commits. [17:47:55] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.70, 7.47, 7.64 [17:52:05] [02miraheze/mediawiki] 07paladox pushed 031 commit to 03REL1_34 [+0/-0/±1] 13https://git.io/JfhJC [17:52:07] [02miraheze/mediawiki] 07paladox 03de400e1 - Update OATHAuth [17:54:38] !log rebuild lc on mw* and jobrunner1 [17:54:46] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [17:58:29] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.44, 7.57, 7.50 [18:01:05] PROBLEM - mw6 Puppet on mw6 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core] [18:01:20] PROBLEM - mw4 Puppet on mw4 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core]
[18:01:25] PROBLEM - mw7 Puppet on mw7 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core]
[18:01:37] PROBLEM - jobrunner1 Puppet on jobrunner1 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core]
[18:02:37] PROBLEM - mw5 Puppet on mw5 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core]
[18:03:31] RECOVERY - mw7 Puppet on mw7 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures
[18:03:46] RECOVERY - jobrunner1 Puppet on jobrunner1 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[18:03:54] PROBLEM - mw4 Current Load on mw4 is CRITICAL: CRITICAL - load average: 8.59, 7.49, 5.52
[18:04:39] PROBLEM - mw5 Current Load on mw5 is CRITICAL: CRITICAL - load average: 8.17, 7.59, 5.56
[18:04:43] RECOVERY - mw5 Puppet on mw5 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[18:05:22] RECOVERY - mw6 Puppet on mw6 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[18:06:49] PROBLEM - mw5 Current Load on mw5 is WARNING: WARNING - load average: 7.23, 7.32, 5.72
[18:08:06] PROBLEM - mw4 Current Load on mw4 is WARNING: WARNING - load average: 6.20, 7.17, 5.90
[18:10:14] PROBLEM - mw4 Current Load on mw4 is CRITICAL: CRITICAL - load average: 9.33, 7.74, 6.26
[18:11:05] PROBLEM - mw5 Current Load on mw5 is CRITICAL: CRITICAL - load average: 8.32, 7.31, 6.07
[18:13:10] PROBLEM - mw5 Current Load on mw5 is WARNING: WARNING - load average: 7.40, 7.52, 6.32
[18:14:22] PROBLEM - mw4 Current Load on mw4 is WARNING: WARNING - load average: 5.94, 7.63, 6.64
[18:16:37] PROBLEM - mw4 Current Load on mw4 is CRITICAL: CRITICAL - load average: 7.81, 8.16, 7.00
[18:18:40] PROBLEM - mw4 Current Load on mw4 is WARNING: WARNING - load average: 5.78, 7.28, 6.82
[18:19:28] RECOVERY - mw5 Current Load on mw5 is OK: OK - load average: 4.65, 6.08, 6.14
[18:22:45] RECOVERY - mw4 Current Load on mw4 is OK: OK - load average: 6.08, 6.71, 6.68
[18:35:04] RECOVERY - mw4 Puppet on mw4 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[18:36:12] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.85, 6.97, 7.98
[18:36:23] PROBLEM - spiral.wiki - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:37:10] paladox: there's nothing wrong with that ^
[18:37:31] loads for me
[18:38:19] paladox: I just said that. That's the point.
[18:38:23] RECOVERY - spiral.wiki - LetsEncrypt on sslhost is OK: OK - Certificate 'spiral.wiki' will expire on Thu 20 Aug 2020 15:07:11 GMT +0000.
[18:38:32] That check is sending false positives way too often
[18:38:44] oh
[18:39:17] paladox: can we work out why it goes off wrongly and revise it?
[18:40:13] Well it looks like it's recovered
[18:40:18] so nothing to really look at
[18:40:24] (since we won't be able to reproduce)
[18:40:41] paladox: I've seen numerous examples over the last few days
[18:40:49] The check is broke
[18:40:59] check is not broken tho
[18:41:08] if it was broken, it would not say
[18:41:12] "RECOVERY - spiral.wiki - LetsEncrypt on sslhost is OK: OK - Certificate 'spiral.wiki' will expire on Thu 20 Aug 2020 15:07:11 GMT +0000."
[18:41:26] paladox: it's firing wrongly.
[18:41:31] actually
[18:41:42] "PROBLEM - spiral.wiki - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds" says it hit the check timeout
[18:41:48] so no issue
[18:41:53] we're using hdds
[18:42:01] if we were using ssds it would work
[18:42:22] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 10.62, 7.75, 7.79
[18:46:26] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.37, 7.88, 7.88
[18:48:40] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 17.40, 20.56, 23.96
[18:51:08] paladox: our checks not being compatible with our own infra isn't acceptable. I'll file a task and bring it to john and SPF|Cloud's attention
[18:51:26] sure, if you want
[18:52:10] paladox: I do want
[18:52:11] there is nothing we can do tbh.
[18:52:19] apart from increasing the timeout
[18:52:43] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.99, 7.84, 7.73
[18:52:54] paladox: https://phabricator.miraheze.org/T5762#112398
[18:52:54] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 36.65, 28.36, 26.07
[18:52:55] [ ⚓ T5762 cp7 load check flapping ] - phabricator.miraheze.org
[18:52:57] Same applies
[18:53:10] https://phabricator.miraheze.org/T5762#112494
[18:53:11] [ ⚓ T5762 cp7 load check flapping ] - phabricator.miraheze.org
[18:53:42] RhinosF1 ^
[18:53:52] SPF|Cloud has got cp7 under control
[18:54:43] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.12, 7.39, 7.57
[18:56:46] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.83, 7.75, 7.66
[18:56:59] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 16.57, 21.90, 23.94
[19:00:53] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.94, 7.83, 7.69
[19:01:05] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 31.54, 24.51, 24.30
[19:02:50] hi all
[19:02:54] today was _hot_
[19:02:58] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 7.70, 8.02, 7.80
[19:04:30] fyi RhinosF1 will be filling a task for the earlier discussion
[19:09:09] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 19.41, 21.95, 23.42
[19:13:18] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 30.01, 25.60, 24.42
[19:17:16] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 7.19, 7.63, 7.91
[19:21:22] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.16, 8.08, 8.03
[19:26:00] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 5.63, 7.07, 7.65
[19:26:00] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 15.04, 21.38, 23.53
[19:30:00] PROBLEM - db6 Current Load on db6 is CRITICAL: CRITICAL - load average: 8.03, 6.95, 7.43
[19:33:42] !log MariaDB [uncyclopediawiki]> truncate objectcache; - db6
[19:33:45] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[19:33:55] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 40.46, 28.58, 25.20
[19:36:17] good, i'm on miraheze
[19:36:30] i think the overflow channel is isolated from the main
[19:37:11] sometimes I wonder. are you (staff) very busy usually or just a little busy? Is ok to ask you on a regular basis? paladox
[19:37:21] You can ask :)
[19:38:21] ok. i wanted to know if i could adopt a wiki. Request for adoption .
[19:38:28] sure [19:38:37] but stewards deal with that [19:38:50] is something restrictive or is easy to adopt one? [19:39:10] https://meta.miraheze.org/wiki/Requests_for_adoption [19:39:11] [ Requests for adoption - Miraheze Meta ] - meta.miraheze.org [19:39:17] Stewards are a different type of staff that is not on the IRC? [19:39:52] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 15.48, 22.30, 23.69 [19:40:05] PROBLEM - db6 Current Load on db6 is WARNING: WARNING - load average: 6.14, 7.70, 7.83 [19:41:29] some are on irc [19:41:30] and yes [19:41:38] void not online atm. [19:41:52] PROBLEM - cloud1 Current Load on cloud1 is CRITICAL: CRITICAL - load average: 26.39, 24.74, 24.45 [19:41:58] ok, i will write on that page [19:43:51] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 14.74, 21.52, 23.36 [19:47:25] !log restarted phd [19:47:29] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [19:48:07] RECOVERY - db6 Current Load on db6 is OK: OK - load average: 2.99, 4.76, 6.48 [19:49:44] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 11.23, 15.05, 19.90 [19:50:01] ok. i completed an adoption request [19:51:03] PROBLEM - mon1 grafana.miraheze.org HTTPS on mon1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 311 bytes in 0.004 second response time [19:51:29] PROBLEM - cp7 Stunnel Http for mon1 on cp7 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 311 bytes in 0.002 second response time [19:51:30] PROBLEM - cp6 Stunnel Http for mon1 on cp6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 311 bytes in 0.014 second response time [19:51:46] PROBLEM - cp8 Stunnel Http for mon1 on cp8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 311 bytes in 0.243 second response time [19:53:05] RECOVERY - mon1 grafana.miraheze.org HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 200 OK - 30279 bytes in 0.010 second response time [19:53:29] RECOVERY - cp7 Stunnel Http for mon1 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 30248 bytes in 0.026 second response time [19:53:30] RECOVERY - cp6 Stunnel Http for mon1 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 30279 bytes in 0.012 second response time [19:53:34] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfhtm [19:53:36] [02miraheze/puppet] 07paladox 0367c33a8 - switch grafana to db9 [19:53:47] RECOVERY - cp8 Stunnel Http for mon1 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 30248 bytes in 1.309 second response time [19:56:36] !log drop grafana from db6 (migrated to db9) [19:56:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [20:00:56] PROBLEM - thelonsdalebattalion.co.uk - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:03:05] RECOVERY - thelonsdalebattalion.co.uk - LetsEncrypt on sslhost is OK: OK - Certificate 'thelonsdalebattalion.co.uk' will expire on Sun 16 Aug 2020 23:10:01 GMT +0000. 
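The thelonsdalebattalion.co.uk flap just above is another instance of the sslhost certificate check discussed at 18:38-18:53: the certificate itself is fine, the probe simply hits its 10-second socket timeout while the backend is under heavy I/O. Raising the timeout, as suggested there, would look roughly like the following if the check wraps the standard check_http plugin (plugin path, the -C threshold and the exact command definition are assumptions, not taken from Miraheze's puppet):

    # default behaviour: certificate check that gives up after 10 seconds
    /usr/lib/nagios/plugins/check_http -H thelonsdalebattalion.co.uk --ssl --sni -C 14 -t 10
    # same check with a more forgiving 30-second timeout
    /usr/lib/nagios/plugins/check_http -H thelonsdalebattalion.co.uk --ssl --sni -C 14 -t 30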
[20:04:37] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 20.80, 20.23, 19.88 [20:06:39] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 11.45, 17.27, 18.86 [20:23:22] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 21.11, 19.97, 18.47 [20:24:32] !log migrate jobrunner1 to cloud2 to reduce some preasure [20:24:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [20:25:30] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 15.01, 18.53, 18.16 [20:31:25] PROBLEM - cloud2 Current Load on cloud2 is CRITICAL: CRITICAL - load average: 27.26, 21.27, 14.61 [20:33:39] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 20.72, 21.63, 15.67 [20:34:04] !log upgrade grafana on mon1 [20:34:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [20:34:32] PROBLEM - cloud1 Current Load on cloud1 is WARNING: WARNING - load average: 20.38, 22.80, 20.32 [20:35:57] !log upgrade icinga2 [20:36:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [20:38:33] RECOVERY - cloud1 Current Load on cloud1 is OK: OK - load average: 17.81, 19.66, 19.58 [20:42:57] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 11.84, 19.03, 17.52 [20:45:18] PROBLEM - ping6 on jobrunner1 is CRITICAL: PING CRITICAL - Packet loss = 100% [20:49:16] Trying to mark a page for translation TestWiki, I got this error: Error: [8f55e79acb9da685a73419ba] 2020-06-24 20:46:04: Fatal exception of type "JobQueueConnectionError" I've let @RhinosF1 (Samuel) know. [20:50:44] oh [20:51:57] @Doug looking [20:53:14] !log migrating test2 to cloud1 [20:53:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master [20:54:32] [02miraheze/dns] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhYu [20:54:33] [02miraheze/dns] 07paladox 0343ff5d9 - Switch test2 and jobrunner1 ip around [20:59:46] PROBLEM - jobrunner1 SSH on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:59:46] PROBLEM - jobrunner1 MirahezeRenewSsl on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:00:04] PROBLEM - jobrunner1 JobRunner Service on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:00:17] PROBLEM - jobrunner1 Current Load on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:00:17] PROBLEM - jobrunner1 HTTPS on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:00:28] PROBLEM - jobrunner1 Puppet on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:00:40] PROBLEM - jobrunner1 JobChron Service on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:00:43] PROBLEM - ping4 on jobrunner1 is CRITICAL: PING CRITICAL - Packet loss = 100% [21:01:04] PROBLEM - jobrunner1 Redis Process on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:01:18] PROBLEM - jobrunner1 APT on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:01:29] PROBLEM - jobrunner1 php-fpm on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:02:02] Hello [21:02:19] PROBLEM - Host jobrunner1 is DOWN: PING CRITICAL - Packet loss = 100% [21:02:45] Can be good title/subpage "Meta:Wiki creators/List" for list of current Wiki Creators? 
[21:05:33] RECOVERY - Host jobrunner1 is UP: PING OK - Packet loss = 0%, RTA = 0.20 ms [21:05:35] PROBLEM - jobrunner1 Disk Space on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:05:36] RECOVERY - jobrunner1 Disk Space on jobrunner1 is OK: DISK OK - free space: / 15282 MB (53% inode=75%); [21:06:02] RECOVERY - jobrunner1 SSH on jobrunner1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [21:06:30] RECOVERY - jobrunner1 Current Load on jobrunner1 is OK: OK - load average: 0.66, 0.30, 0.11 [21:06:51] what? @MrJaroslavik [21:08:18] PROBLEM - test2 SSH on test2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:08:36] PROBLEM - cp7 Stunnel Http for test2 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:08:46] PROBLEM - ping4 on test2 is CRITICAL: PING CRITICAL - Packet loss = 100% [21:11:51] RECOVERY - test2 SSH on test2 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [21:12:06] RECOVERY - cp7 Stunnel Http for test2 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15540 bytes in 0.004 second response time [21:12:26] RECOVERY - ping4 on test2 is OK: PING OK - Packet loss = 0%, RTA = 0.27 ms [21:13:11] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfh3L [21:13:12] [02miraheze/mw-config] 07paladox 03b372034 - Switch jobrunner ip [21:14:35] PROBLEM - jobrunner1 Disk Space on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:16:07] PROBLEM - Host jobrunner1 is DOWN: PING CRITICAL - Packet loss = 100% [21:16:36] PROBLEM - bacula2 Bacula Databases db6 on bacula2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [21:22:10] PROBLEM - ping4 on mail1 is CRITICAL: PING CRITICAL - Packet loss = 100% [21:22:20] PROBLEM - ping4 on rdb1 is CRITICAL: PING CRITICAL - Packet loss = 100% [21:22:42] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 128.199.139.216/cpweb, 2400:6180:0:d0::403:f001/cpweb, 51.77.107.210/cpweb, 51.89.160.142/cpweb, 2001:41d0:800:105a::10/cpweb, 51.161.32.127/cpweb [21:22:50] PROBLEM - services1 citoid on services1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:22:50] PROBLEM - db6 SSH on db6 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:22:54] PROBLEM - services1 APT on services1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:23:45] PROBLEM - services1 Current Load on services1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:23:51] PROBLEM - mw6 Puppet on mw6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:24:01] PROBLEM - cp7 Stunnel Http for mw7 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:24:04] PROBLEM - mw5 Puppet on mw5 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:24:06] PROBLEM - Host mail1 is DOWN: PING CRITICAL - Packet loss = 100% [21:24:19] PROBLEM - Host db6 is DOWN: PING CRITICAL - Packet loss = 100% [21:24:25] PROBLEM - cp6 Stunnel Http for mw7 on cp6 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.003 second response time [21:24:28] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw7 [21:24:34] PROBLEM - services1 Puppet on services1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:24:36] PROBLEM - mw4 Puppet on mw4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:24:41] PROBLEM - puppet2 puppetserver on puppet2 is CRITICAL: connect to address 51.89.160.129 and port 8140: Connection refused [21:24:42] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 1 backends are down. mw7 [21:24:51] PROBLEM - services1 SSH on services1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:25:01] PROBLEM - services1 proton on services1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:25:06] [02miraheze/mw-config] 07paladox pushed 031 commit to 03paladox-patch-2 [+0/-0/±1] 13https://git.io/Jfh3h [21:25:07] [02miraheze/mw-config] 07paladox 038bf59ea - Switch to db7 Emergency fallover [21:25:09] [02mw-config] 07paladox created branch 03paladox-patch-2 - 13https://git.io/vbvb3 [21:25:10] [02mw-config] 07paladox opened pull request 03#3125: Switch to db7 - 13https://git.io/Jfh3j [21:25:21] [02mw-config] 07paladox closed pull request 03#3125: Switch to db7 - 13https://git.io/Jfh3j [21:25:22] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfhsv [21:25:24] [02miraheze/mw-config] 07paladox 03838f94a - Switch to db7 (#3125) Emergency fallover [21:25:29] PROBLEM - services1 restbase on services1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:25:58] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhsU [21:26:00] [02miraheze/puppet] 07paladox 03b723afe - Switch db7 to master [21:26:01] [02puppet] 07paladox created branch 03paladox-patch-6 - 13https://git.io/vbiAS [21:26:02] PROBLEM - rdb1 Puppet on rdb1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:26:03] [02puppet] 07paladox opened pull request 03#1411: Switch db7 to master - 13https://git.io/JfhsT [21:26:08] PROBLEM - Host services1 is DOWN: PING CRITICAL - Packet loss = 100% [21:26:12] [02puppet] 07paladox closed pull request 03#1411: Switch db7 to master - 13https://git.io/JfhsT [21:26:14] [02miraheze/puppet] 07paladox pushed 032 commits to 03master [+0/-0/±2] 13https://git.io/Jfhsk [21:26:15] [02miraheze/puppet] 07paladox 03fd4d83e - Merge pull request #1411 from miraheze/paladox-patch-6 Switch db7 to master [21:26:16] PROBLEM - mw7 Puppet on mw7 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:26:59] PROBLEM - cp6 Stunnel Http for test2 on cp6 is CRITICAL: HTTP CRITICAL - No data received from host [21:27:01] RECOVERY - ping4 on rdb1 is OK: PING OK - Packet loss = 0%, RTA = 0.21 ms [21:27:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:28:02] RECOVERY - cp7 Stunnel Http for mw7 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15533 bytes in 0.007 second response time [21:28:34] RECOVERY - cp6 Stunnel Http for mw7 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15525 bytes in 0.004 second response time [21:28:56] RECOVERY - puppet2 puppetserver on puppet2 is OK: TCP OK - 0.001 second response time on 51.89.160.129 port 8140 [21:30:07] @paladox Just had an issue trying to connect to publictestwiki.com on db7: Sorry! This site is experiencing technical difficulties. 
(Cannot access the database: Cannot access the database: Unknown error (db7.miraheze.org)) [21:30:24] [02mw-config] 07paladox deleted branch 03paladox-patch-2 - 13https://git.io/vbvb3 [21:30:26] [02miraheze/mw-config] 07paladox deleted branch 03paladox-patch-2 [21:30:40] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhsC [21:30:42] [02miraheze/mw-config] 07paladox 03a673353 - fix [21:30:44] @Doug: Ongoing incident [21:30:59] PROBLEM - services2 Puppet on services2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:31:24] >  @Doug: Ongoing incident Yeah, just wasn't sure if that latest error had been seen. You guys are on it, so won't report any different errors now. [21:32:41] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Puppet has 86 failures. Last run 7 minutes ago with 86 failures. Failed resources (up to 3 shown): File[wiki.hrznstudio.com],File[wiki.hrznstudio.com_private],File[vedopedia.witches-empire.com],File[vedopedia.witches-empire.com_private] [21:32:42] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:32:52] @Doug: we know [21:33:04] RECOVERY - cp6 Varnish Backends on cp6 is OK: All 7 backends are healthy [21:33:24] Will update topic when users confirm [21:33:44] greetings [21:34:25] RECOVERY - rdb1 Puppet on rdb1 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [21:35:57] paladox: session issues reported, rdb restoring should recover that? [21:36:14] it should work [21:36:38] RECOVERY - cp3 Puppet on cp3 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [21:37:38] paladox: ack [21:37:40] PROBLEM - bacula2 Bacula Databases db6 on bacula2 is WARNING: WARNING: Full, 22675 files, 97.02GB, 2020-06-07 09:40:00 (2.5 weeks ago) [21:38:43] RECOVERY - Host services1 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms [21:38:44] RECOVERY - services1 restbase on services1 is OK: TCP OK - 0.000 second response time on 51.89.160.132 port 7231 [21:38:45] PROBLEM - services1 parsoid on services1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:38:45] PROBLEM - services1 zotero on services1 is CRITICAL: connect to address 51.89.160.132 and port 1969: Connection refused [21:38:48] RECOVERY - services1 parsoid on services1 is OK: TCP OK - 0.000 second response time on 51.89.160.132 port 8142 [21:39:06] RECOVERY - services1 zotero on services1 is OK: TCP OK - 0.000 second response time on 51.89.160.132 port 1969 [21:39:06] RECOVERY - services1 citoid on services1 is OK: TCP OK - 0.000 second response time on 51.89.160.132 port 6927 [21:39:09] RECOVERY - services1 APT on services1 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [21:39:25] paladox: can you check puppet? Icinga has still got it as failed [21:39:44] RECOVERY - services1 Current Load on services1 is OK: OK - load average: 0.11, 0.42, 0.22 [21:40:09] RECOVERY - Host mail1 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms [21:40:10] PROBLEM - mail1 IMAP on mail1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:40:10] PROBLEM - mail1 Disk Space on mail1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:40:10] PROBLEM - mail1 SMTP on mail1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:40:10] PROBLEM - mail1 Current Load on mail1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:40:10] PROBLEM - mail1 SSH on mail1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:40:10] PROBLEM - mail1 Puppet on mail1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:40:10] PROBLEM - mail1 APT on mail1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:40:18] RECOVERY - Host db6 is UP: PING OK - Packet loss = 0%, RTA = 1.27 ms [21:40:19] RECOVERY - services1 SSH on services1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [21:40:19] RECOVERY - db6 SSH on db6 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [21:40:19] RECOVERY - services1 proton on services1 is OK: TCP OK - 0.000 second response time on 51.89.160.132 port 3030 [21:40:20] PROBLEM - db6 Puppet on db6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:40:28] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhsQ [21:40:29] [02miraheze/puppet] 07paladox 032ab269c - icinga2: switch to db7 [21:40:33] RECOVERY - mail1 Disk Space on mail1 is OK: DISK OK - free space: / 3773 MB (39% inode=80%); [21:40:34] PROBLEM - db6 Puppet on db6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:40:54] RECOVERY - mail1 SSH on mail1 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) [21:40:55] RECOVERY - mail1 SMTP on mail1 is OK: SMTP OK - 0.169 sec. response time [21:41:01] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfhs5 [21:41:03] RECOVERY - mail1 APT on mail1 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [21:41:03] [02miraheze/puppet] 07paladox 039a202ad - matomo switch to db7 [21:41:08] RECOVERY - mail1 Current Load on mail1 is OK: OK - load average: 0.54, 0.26, 0.10 [21:41:09] PROBLEM - mail1 Puppet on mail1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:41:11] RECOVERY - mail1 IMAP on mail1 is OK: IMAP OK - 0.133 second response time on 51.89.160.134 port 143 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ STARTTLS LOGINDISABLED] Dovecot (Debian) ready.] [21:41:24] RECOVERY - ping4 on mail1 is OK: PING OK - Packet loss = 0%, RTA = 0.27 ms [21:41:28] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhsF [21:41:29] [02miraheze/puppet] 07paladox 03f6b47b9 - Update main.pp [21:41:54] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhsN [21:41:56] [02miraheze/puppet] 07paladox 032f2260f - phabricator: switch to db7 [21:42:24] Hey Voidwalker [21:42:43] hi [21:42:51] what nonsense is going on now? [21:43:26] RECOVERY - db6 Puppet on db6 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [21:43:35] umm system is down [21:43:49] @Void [21:44:05] there is no nonsense going on right now. [21:44:20] RECOVERY - mail1 Puppet on mail1 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [21:44:30] Voidwalker: see -staff [21:47:27] [02puppet] 07paladox deleted branch 03paladox-patch-6 - 13https://git.io/vbiAS [21:47:29] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-6 [21:53:42] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[21:53:55] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:53:59] PROBLEM - gluster2 Puppet on gluster2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:01] PROBLEM - rdb1 Puppet on rdb1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:06] PROBLEM - mail1 Puppet on mail1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:07] PROBLEM - bacula2 Puppet on bacula2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:23] RECOVERY - mw5 Puppet on mw5 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [21:54:24] PROBLEM - db6 Puppet on db6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:46] PROBLEM - ldap1 Puppet on ldap1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:47] PROBLEM - cp8 Puppet on cp8 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:48] PROBLEM - cp6 Puppet on cp6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:54:51] RECOVERY - test2 Puppet on test2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:54:54] PROBLEM - phab1 Puppet on phab1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:01] PROBLEM - cp7 Puppet on cp7 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:01] PROBLEM - rdb2 Puppet on rdb2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:01] PROBLEM - cloud2 Puppet on cloud2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:09] RECOVERY - mw4 Puppet on mw4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:55:17] [02miraheze/services] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhGl [21:55:18] [02miraheze/services] 07MirahezeSSLBot 036b40da7 - BOT: Updating services config for wikis [21:55:19] should recover [21:55:21] RECOVERY - mw7 Puppet on mw7 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [21:55:24] PROBLEM - mon1 Puppet on mon1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:26] PROBLEM - db9 Puppet on db9 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:28] PROBLEM - ns1 Puppet on ns1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:31] PROBLEM - cloud1 Puppet on cloud1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:32] PROBLEM - misc1 Puppet on misc1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[21:55:48] PROBLEM - db7 Puppet on db7 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:55:59] PROBLEM - gluster1 Puppet on gluster1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [21:56:30] RECOVERY - services1 Puppet on services1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [21:56:58] RECOVERY - mw6 Puppet on mw6 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [21:57:44] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhGE [21:57:46] [02miraheze/puppet] 07paladox 03fd2cbdc - jobrunner: switch ip [22:01:35] RECOVERY - rdb1 Puppet on rdb1 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [22:01:37] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [22:02:10] RECOVERY - mail1 Puppet on mail1 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [22:02:26] RECOVERY - rdb2 Puppet on rdb2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:02:36] RECOVERY - ldap1 Puppet on ldap1 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [22:02:38] RECOVERY - mon1 Puppet on mon1 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures [22:04:09] RECOVERY - Host jobrunner1 is UP: PING OK - Packet loss = 0%, RTA = 0.94 ms [22:04:14] RECOVERY - ping6 on jobrunner1 is OK: PING OK - Packet loss = 0%, RTA = 0.73 ms [22:04:14] RECOVERY - jobrunner1 MirahezeRenewSsl on jobrunner1 is OK: TCP OK - 0.000 second response time on 51.77.107.211 port 5000 [22:04:24] RECOVERY - jobrunner1 Disk Space on jobrunner1 is OK: DISK OK - free space: / 15280 MB (53% inode=75%); [22:04:39] RECOVERY - jobrunner1 APT on jobrunner1 is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[22:04:44] RECOVERY - jobrunner1 HTTPS on jobrunner1 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 571 bytes in 0.005 second response time [22:04:44] RECOVERY - jobrunner1 JobRunner Service on jobrunner1 is OK: PROCS OK: 1 process with args 'redisJobRunnerService' [22:04:44] RECOVERY - jobrunner1 Redis Process on jobrunner1 is OK: PROCS OK: 1 process with args 'redis-server' [22:04:44] RECOVERY - jobrunner1 php-fpm on jobrunner1 is OK: PROCS OK: 7 processes with command name 'php-fpm7.3' [22:04:59] RECOVERY - ping4 on jobrunner1 is OK: PING OK - Packet loss = 0%, RTA = 0.24 ms [22:05:04] RECOVERY - jobrunner1 JobChron Service on jobrunner1 is OK: PROCS OK: 1 process with args 'redisJobChronService' [22:05:09] RECOVERY - jobrunner1 Puppet on jobrunner1 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [22:05:59] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhGP [22:06:01] [02miraheze/puppet] 07paladox 03da3d044 - bacula: Switch db6 to db7 [22:06:02] [02puppet] 07paladox created branch 03paladox-patch-6 - 13https://git.io/vbiAS [22:06:04] [02puppet] 07paladox opened pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:06:53] [02puppet] 07paladox synchronize pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:06:54] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhGy [22:06:56] [02miraheze/puppet] 07paladox 03097d9dc - Update bacula-dir.conf [22:07:25] [02puppet] 07paladox synchronize pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:07:26] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhG9 [22:07:28] [02miraheze/puppet] 07paladox 0331e2144 - Update bacula-dir.conf [22:07:38] [02puppet] 07paladox synchronize pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:07:39] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhGH [22:07:41] [02miraheze/puppet] 07paladox 0348ff49f - Update bacula-dir.conf [22:08:17] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhG7 [22:08:18] [02miraheze/puppet] 07paladox 032e0960f - icingaweb2: switch to db7 [22:08:40] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhG5 [22:08:42] [02miraheze/puppet] 07paladox 0355c8195 - Update nrpe.cfg.erb [22:08:43] [02puppet] 07paladox synchronize pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:09:00] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhGd [22:09:01] [02miraheze/puppet] 07paladox 0378943df - roundcubemail: Switch to db7 [22:09:21] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JfhGb [22:09:23] [02miraheze/puppet] 07paladox 032afb6a0 - Update director.pp [22:09:24] [02puppet] 07paladox synchronize pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:09:41] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhGN [22:09:43] [02miraheze/puppet] 07paladox 032507e75 - monitoring: Switch to db7 [22:09:58] [02puppet] 07paladox closed pull request 03#1412: bacula: Switch db6 to db7 - 13https://git.io/JfhG1 [22:10:00] [02miraheze/puppet] 07paladox pushed 037 commits to 03master [+0/-0/±9] 13https://git.io/JfhGA [22:10:01] 
[22:10:01] [02miraheze/puppet] 07paladox 035961055 - Merge pull request #1412 from miraheze/paladox-patch-6 bacula: Switch db6 to db7
[22:10:57] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-6
[22:10:59] [02puppet] 07paladox deleted branch 03paladox-patch-6 - 13https://git.io/vbiAS
[22:13:07] PROBLEM - cp6 Stunnel Http for test2 on cp6 is CRITICAL: HTTP CRITICAL - No data received from host
[22:13:12] PROBLEM - gluster1 GlusterFS port 49152 on gluster1 is CRITICAL: connect to address 51.77.107.209 and port 49152: Connection refused
[22:13:16] PROBLEM - bacula2 Bacula Phabricator Static on bacula2 is WARNING: WARNING: Full, 82779 files, 3.261GB, 2020-06-07 15:19:00 (2.5 weeks ago)
[22:13:23] PROBLEM - gluster2 GlusterFS port 49152 on gluster2 is CRITICAL: connect to address 51.68.201.37 and port 49152: Connection refused
[22:13:36] PROBLEM - bacula2 Bacula Private Git on bacula2 is CRITICAL: CRITICAL: Full, 4274 files, 12.40MB, 2020-06-07 15:12:00 (2.5 weeks ago)
[22:13:53] paladox: the cp6 doesn't sound good
[22:14:01] why?
[22:14:04] PROBLEM - cp8 Disk Space on cp8 is WARNING: DISK WARNING - free space: / 1599 MB (8% inode=93%);
[22:14:05] PROBLEM - bacula2 Bacula Databases db9 on bacula2 is CRITICAL: CRITICAL: no terminated jobs
[22:14:08] PROBLEM - bacula2 Bacula Databases db6 on bacula2 is WARNING: WARNING: Full, 22675 files, 97.02GB, 2020-06-07 09:40:00 (2.5 weeks ago)
[22:14:31] paladox: both test2 and mw7 according to web had stunnel issues
[22:14:40] test2 is down, yes
[22:14:44] that is known
[22:14:54] also, please re-ack things and with descriptive messages once we're online
[22:15:07] are you sure about mw7?
[22:15:29] paladox: it did for a bit
[22:15:41] https://icinga.miraheze.org/dashboard#!/monitoring/service/history?host=cp6&service=cp6%20Stunnel%20Http%20for%20mw7
[22:15:42] [ Icinga Web 2 Login ] - icinga.miraheze.org
[22:22:57] PROBLEM - bacula2 Bacula Databases db7 on bacula2 is CRITICAL: CRITICAL: no terminated jobs
[22:23:15] paladox: ack? ^
[22:23:28] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 19.88, 20.59, 18.16
[22:25:07] RECOVERY - cp6 Stunnel Http for test2 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15532 bytes in 0.005 second response time
[22:25:24] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 17.63, 19.40, 17.99
[22:32:44] !log migrating jobrunner1 to cloud1 and test2 to cloud2 (even out the load)
[22:32:47] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[22:33:52] PROBLEM - jobrunner1 Puppet on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:34:07] known
[22:34:33] * RhinosF1 will be here
[22:35:07] PROBLEM - ping6 on test2 is CRITICAL: PING CRITICAL - Packet loss = 100%
[22:35:10] PROBLEM - jobrunner1 JobChron Service on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:35:16] known
[22:35:46] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhZP
[22:35:47] [02miraheze/puppet] 07paladox 0346f324a - Revert "jobrunner: switch ip" This reverts commit fd2cbdc20e1dab14ec2a9fbf83eba8f63fcefecf.
[22:35:47] PROBLEM - jobrunner1 APT on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
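(Editor's note: the "please re-ack things and with descriptive messages" request above refers to acknowledging known problems in Icinga so they stop re-alerting. A minimal sketch of how that could be done through the standard Icinga 2 REST API action acknowledge-problem; the API host, credentials, author, and the example service are assumptions, not taken from this log.)

import requests

ICINGA_API = "https://icinga.miraheze.org:5665"   # assumed API endpoint
AUTH = ("icingaadmin", "secret")                  # placeholder credentials

def ack_service(host: str, service: str, comment: str) -> None:
    """Acknowledge one service problem with a descriptive comment."""
    r = requests.post(
        f"{ICINGA_API}/v1/actions/acknowledge-problem",
        auth=AUTH,
        headers={"Accept": "application/json"},
        json={
            "type": "Service",
            "filter": f'host.name=="{host}" && service.name=="{service}"',
            "author": "RhinosF1",
            "comment": comment,
            "sticky": True,    # keep the ack until the service fully recovers
            "notify": False,
        },
        verify=False,          # assumption: self-signed cert; point at a CA bundle in practice
    )
    r.raise_for_status()

# Hypothetical usage: acknowledge the expected Bacula noise while db7 has no completed backups yet.
ack_service("bacula2", "Bacula Databases db7",
            "db7 is new, first full backup has not run yet")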
[22:36:01] [02miraheze/mw-config] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhZD
[22:36:02] [02miraheze/mw-config] 07paladox 03a6170fc - Revert "Switch jobrunner ip" This reverts commit b3720346199f7fd23fd9fed884bb0d761b41c16d.
[22:36:31] PROBLEM - ping6 on jobrunner1 is CRITICAL: CRITICAL - Destination Unreachable (2001:41d0:800:105a::3)
[22:36:45] All jobrunner and test2 alerts are known
[22:37:03] PROBLEM - jobrunner1 MirahezeRenewSsl on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:37:05] PROBLEM - jobrunner1 Disk Space on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:37:14] * RhinosF1 wonders if we should have downtime'd the host
[22:37:19] PROBLEM - jobrunner1 php-fpm on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:37:19] PROBLEM - jobrunner1 JobRunner Service on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:37:46] PROBLEM - ping4 on jobrunner1 is CRITICAL: CRITICAL - Plugin timed out after 10 seconds
[22:38:01] down
[22:38:03] *done
[22:38:06] PROBLEM - jobrunner1 HTTPS on jobrunner1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:38:30] PROBLEM - cp8 Stunnel Http for test2 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:38:41] paladox: and for test2
[22:39:04] PROBLEM - jobrunner1 Redis Process on jobrunner1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:39:35] PROBLEM - cp7 Stunnel Http for test2 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:39:46] PROBLEM - Host jobrunner1 is DOWN: CRITICAL - Plugin timed out after 30 seconds
[22:39:50] PROBLEM - cp3 Stunnel Http for test2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:40:04] * RhinosF1 gives icinga-miraheze a cookie
[22:40:07] PROBLEM - cp6 Stunnel Http for test2 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:40:32] done
[22:41:10] paladox: thanks
[22:45:36] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 128.199.139.216/cpweb
[22:49:33] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[22:53:37] PROBLEM - bacula2 Bacula Databases db9 on bacula2 is CRITICAL: CRITICAL: no terminated jobs
[22:53:39] PROBLEM - bacula2 Bacula Phabricator Static on bacula2 is WARNING: WARNING: Full, 82779 files, 3.261GB, 2020-06-07 15:19:00 (2.5 weeks ago)
[22:53:40] PROBLEM - gluster2 GlusterFS port 49152 on gluster2 is CRITICAL: connect to address 51.68.201.37 and port 49152: Connection refused
[22:53:41] PROBLEM - cp8 Disk Space on cp8 is WARNING: DISK WARNING - free space: / 1529 MB (7% inode=93%);
[22:53:42] PROBLEM - gluster1 GlusterFS port 49152 on gluster1 is CRITICAL: connect to address 51.77.107.209 and port 49152: Connection refused
[22:53:51] PROBLEM - cp8 Stunnel Http for test2 on cp8 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:53:51] they sound fine
[22:53:55] PROBLEM - cloud2 Current Load on cloud2 is CRITICAL: CRITICAL - load average: 16.49, 27.88, 30.76
[22:53:55] PROBLEM - bacula2 Bacula Private Git on bacula2 is CRITICAL: CRITICAL: Full, 4274 files, 12.40MB, 2020-06-07 15:12:00 (2.5 weeks ago)
[22:54:20] PROBLEM - cp3 Stunnel Http for test2 on cp3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
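(Editor's note: "wonders if we should have downtime'd the host" above refers to scheduling a planned downtime for jobrunner1 before migrating it, which would have suppressed the NRPE timeouts and the host-DOWN alert. A minimal sketch using the standard Icinga 2 REST API action schedule-downtime; the API host, credentials, author, and window length are assumptions.)

import time
import requests

ICINGA_API = "https://icinga.miraheze.org:5665"   # assumed API endpoint
AUTH = ("icingaadmin", "secret")                  # placeholder credentials

def downtime_host(host: str, minutes: int, comment: str) -> None:
    """Schedule a fixed downtime for a host and all of its services."""
    now = int(time.time())
    r = requests.post(
        f"{ICINGA_API}/v1/actions/schedule-downtime",
        auth=AUTH,
        headers={"Accept": "application/json"},
        json={
            "type": "Host",
            "filter": f'host.name=="{host}"',
            "author": "paladox",
            "comment": comment,
            "start_time": now,
            "end_time": now + minutes * 60,
            "fixed": True,          # exact window rather than a flexible duration
            "all_services": True,   # also cover every service check on the host
        },
        verify=False,               # assumption: self-signed cert
    )
    r.raise_for_status()

# Hypothetical usage before the migration logged at 22:32:
downtime_host("jobrunner1", 45, "migrating VM to cloud1, host will be unreachable")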
[22:54:26] PROBLEM - cp7 Stunnel Http for test2 on cp7 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:54:29] [02miraheze/dns] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JfhnB
[22:54:30] [02miraheze/dns] 07paladox 03fe90f56 - Revert "Switch test2 and jobrunner1 ip around" This reverts commit 43ff5d98862e973414c0c36901739409bfde494f.
[22:54:35] PROBLEM - bacula2 Bacula Databases db7 on bacula2 is CRITICAL: CRITICAL: no terminated jobs
[22:54:41] PROBLEM - cp6 Stunnel Http for test2 on cp6 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:55:20] paladox: what's happened?
[22:57:11] [02miraheze/dns] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/Jfhn0
[22:57:12] [02miraheze/dns] 07paladox 03e06b26b - Update miraheze.org
[23:02:49] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 10.00, 13.94, 22.39
[23:04:57] PROBLEM - cloud2 Current Load on cloud2 is CRITICAL: CRITICAL - load average: 24.96, 17.36, 22.50
[23:05:58] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 20.97, 17.79, 22.31
[23:12:15] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 12.47, 15.38, 19.85
[23:14:03] RECOVERY - cp3 Stunnel Http for test2 on cp3 is OK: HTTP OK: HTTP/1.1 200 OK - 15535 bytes in 1.015 second response time
[23:14:03] RECOVERY - cp7 Stunnel Http for test2 on cp7 is OK: HTTP OK: HTTP/1.1 200 OK - 15535 bytes in 0.004 second response time
[23:14:30] RECOVERY - cp6 Stunnel Http for test2 on cp6 is OK: HTTP OK: HTTP/1.1 200 OK - 15535 bytes in 0.005 second response time
[23:15:10] RECOVERY - cp8 Stunnel Http for test2 on cp8 is OK: HTTP OK: HTTP/1.1 200 OK - 15541 bytes in 0.320 second response time
[23:16:56] [02miraheze/mediawiki] 07paladox pushed 031 commit to 03REL1_34 [+0/-0/±1] 13https://git.io/Jfhcf
[23:16:58] [02miraheze/mediawiki] 07paladox 03ea77b5c - Update CommentStreams
[23:19:35] !log rebuild lc on mw* and jobrunner1
[23:19:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[23:23:24] PROBLEM - test2 Puppet on test2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_MediaWiki core]
[23:28:19] PROBLEM - db6 MySQL on db6 is CRITICAL: Can't connect to MySQL server on '51.89.160.130' (115)
[23:41:22] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 23.63, 21.16, 18.43
[23:47:42] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 13.74, 18.46, 18.41
[23:51:52] PROBLEM - cloud2 Current Load on cloud2 is WARNING: WARNING - load average: 21.33, 20.98, 19.51
[23:54:06] RECOVERY - cloud2 Current Load on cloud2 is OK: OK - load average: 13.25, 18.11, 18.66
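(Editor's note: the "!log rebuild lc on mw* and jobrunner1" entry at 23:19 refers to rebuilding the MediaWiki localisation cache on the app servers. A minimal sketch of that step, assuming SSH access from a deploy host; rebuildLocalisationCache.php is a real MediaWiki maintenance script, but the host list, the install path, and the --threads value are assumptions for illustration.)

import subprocess

HOSTS = ["mw6", "mw7", "jobrunner1"]  # assumed expansion of "mw* and jobrunner1"
SCRIPT = "/srv/mediawiki/w/maintenance/rebuildLocalisationCache.php"  # assumed path

def rebuild_lc(host: str) -> None:
    """Rebuild the localisation cache (lc) on one host over SSH."""
    subprocess.run(
        ["ssh", host, "php", SCRIPT, "--threads=4"],
        check=True,  # fail loudly if any host's rebuild does not finish cleanly
    )

for host in HOSTS:
    rebuild_lc(host)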