[00:07:51] RECOVERY - cp5 Disk Space on cp5 is OK: DISK OK - free space: / 5459 MB (23% inode=94%);
[00:14:35] PROBLEM - cp2 Disk Space on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:14:48] PROBLEM - cp2 Puppet on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:14:59] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb
[00:15:11] PROBLEM - misc1 GDNSD Datacenters on misc1 is CRITICAL: CRITICAL - 2 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb
[00:15:31] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:15:44] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:16:05] PROBLEM - cp2 Current Load on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:16:13] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:16:18] PROBLEM - Host cp2 is DOWN: PING CRITICAL - Packet loss = 100%
[00:36:00] Uh bandwidth again
[00:38:02] ....what
[00:41:19] [miraheze/dns] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fhakf
[00:41:20] [miraheze/dns] paladox 963669e - Depool cp2
[01:28:59] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[01:29:10] RECOVERY - misc1 GDNSD Datacenters on misc1 is OK: OK - all datacenters are online
[01:29:12] RECOVERY - Host cp2 is UP: PING OK - Packet loss = 0%, RTA = 95.14 ms
[01:29:19] PROBLEM - cp2 Puppet on cp2 is WARNING: WARNING: Puppet last ran 1 hour ago
[01:29:46] [miraheze/dns] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fhaLD
[01:29:48] [miraheze/dns] paladox a4f4aa2 - Revert "Depool cp2" This reverts commit 963669ea35913b1a8e1ac8aeeb336a4eb2617976.
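[Editor's note] The "GDNSD Datacenters" alerts above count pool members that gdnsd considers down and list them as `address/resource`. The formatting of that alert line can be sketched in shell; the line-per-member state format used here is an assumption for illustration, not gdnsd's actual state output:

```shell
#!/bin/sh
# Toy reconstruction of the "GDNSD Datacenters" alert lines seen above.
# Hypothetical input format: "<address>/<resource> <UP|DOWN>" per line.
states='107.191.126.23/cpweb DOWN
2604:180:0:33b::2/cpweb DOWN
81.4.109.133/cpweb UP'

# Collect the members marked DOWN and count them.
down=$(printf '%s\n' "$states" | awk '$2 == "DOWN" { print $1 }')
count=$(printf '%s\n' "$down" | grep -c .)

if [ "$count" -gt 0 ]; then
  # Join the down members with ", " like the icinga output does.
  msg="CRITICAL - $count datacenters are down: $(printf '%s' "$down" | tr '\n' ',' | sed 's/,/, /g')"
else
  msg="OK - all datacenters are online"
fi
echo "$msg"
```

With the sample state above this prints the same text as the 00:14:59 alert: `CRITICAL - 2 datacenters are down: 107.191.126.23/cpweb, 2604:180:0:33b::2/cpweb`.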
[01:29:59] [ManageWiki] The-Voidwalker opened pull request #75: updates to default private group handling - https://git.io/fhaLy
[01:31:03] Ah so that worked in your testing Voidwalker ^^?
[01:31:27] uhm, the first fix does :P
[01:31:37] great!
[01:31:39] why don't I test the second part before you merge :P
[01:31:54] ah ok
[01:31:56] * paladox waits
[01:31:57] :)
[01:32:14] that could cause more problems ;)
[01:32:50] heh
[01:33:25] PROBLEM - cp2 Puppet on cp2 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[01:35:24] RECOVERY - cp2 Puppet on cp2 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[01:37:07] paladox, looks to work just fine
[01:37:14] Great
[01:37:17] i'll merge
[01:37:26] [ManageWiki] paladox closed pull request #75: updates to default private group handling - https://git.io/fhaLy
[01:37:28] [miraheze/ManageWiki] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fhatv
[01:37:29] [miraheze/ManageWiki] The-Voidwalker a84163b - updates to default private group handling (#75) fix allowing sysops to add it, and make it so that it is completely removed
[01:38:04] [miraheze/mediawiki] paladox pushed 1 commit to REL1_32 [+0/-0/±1] https://git.io/fhatf
[01:38:05] [miraheze/mediawiki] paladox c5286fb - Update ManageWiki
[01:38:16] that was fun to test, because I didn't have managewiki properly setup, I had to eval the hook in order to test it :P
[01:38:24] lol
[02:10:15] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fhaq7
[02:10:17] [miraheze/services] MirahezeSSLBot 916bcb0 - BOT: Updating services config for wikis
[02:20:14] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fhamC
[02:20:15] [miraheze/services] MirahezeSSLBot c1aa84f - BOT: Updating services config for wikis
[13:53:38] PROBLEM - bacula1 Disk Space on bacula1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[13:55:43] PROBLEM - bacula1 Disk Space on bacula1 is WARNING: DISK WARNING - free space: / 98143 MB (20% inode=99%);
[14:43:39] PROBLEM - bacula1 Disk Space on bacula1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:47:44] PROBLEM - bacula1 Disk Space on bacula1 is WARNING: DISK WARNING - free space: / 98143 MB (20% inode=99%);
[15:01:18] Hi! Here is the list of currently open high priority tasks on Phabricator
[15:01:25] No updates for 7 days - https://phabricator.miraheze.org/T4019 - Encrypt Redis traffic - authored by Southparkfan, assigned to None
[15:01:31] No updates for 7 days - https://phabricator.miraheze.org/T4017 - Reconfigure TLS settings inside MariaDB - authored by Southparkfan, assigned to None
[15:34:11] .priotasks
[15:34:12] No updates for 7 days - https://phabricator.miraheze.org/T4019 - Encrypt Redis traffic - authored by Southparkfan, assigned to None
[15:34:15] No updates for 7 days - https://phabricator.miraheze.org/T4017 - Reconfigure TLS settings inside MariaDB - authored by Southparkfan, assigned to None
[16:40:16] [miraheze/services] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/fhVBP
[16:40:18] [miraheze/services] MirahezeSSLBot 39d7915 - BOT: Updating services config for wikis
[16:55:04] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fhV0C
[16:55:06] [miraheze/puppet] paladox 5b3a369 - Update matomo to 3.8.0
[17:25:53] [miraheze/MirahezeMagic] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/fhVzq
[17:25:55] [miraheze/MirahezeMagic] paladox 509a209 - Fix script to support MW 1.32 Since https://github.com/wikimedia/mediawiki/commit/28fc31ccc3b6f7d3d02268a95fdca9a5c2b7c3ab#diff-cc640a6f2072d4f862e71ebb075ebb0a it now uses absolute url
[17:25:55] Title: [ sitemaps: absolute URL for sitemaps · wikimedia/mediawiki@28fc31c · GitHub ] - github.com
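[Editor's note] The bacula1 warnings above ("DISK WARNING - free space: / 98143 MB (20% inode=99%)") come from a Nagios-style disk check run over NRPE. How such an output line is assembled can be approximated with `df`; the 10%/20% thresholds below are illustrative assumptions, not the actual check configuration:

```shell
#!/bin/sh
# Rough sketch of a Nagios-style disk space check for /.
# Uses df -P so each filesystem stays on one line, and -m for megabytes.
free_mb=$(df -P -m / | awk 'NR==2 { print $4 }')
used_pct=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
free_pct=$((100 - used_pct))   # approximation: df's Use% is rounded

# Illustrative thresholds: CRITICAL below 10% free, WARNING below 20%.
if [ "$free_pct" -le 10 ]; then
  msg="DISK CRITICAL - free space: / ${free_mb} MB (${free_pct}%)"
elif [ "$free_pct" -le 20 ]; then
  msg="DISK WARNING - free space: / ${free_mb} MB (${free_pct}%)"
else
  msg="DISK OK - free space: / ${free_mb} MB (${free_pct}%)"
fi
echo "$msg"
```

In the real setup this kind of plugin runs on the monitored host and is invoked remotely via `check_nrpe`, which is why a hung host shows up as "CHECK_NRPE STATE CRITICAL: Socket timeout" rather than a disk result.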
[17:26:36] [miraheze/mediawiki] paladox pushed 1 commit to REL1_32 [+0/-0/±1] https://git.io/fhVzZ
[17:26:37] [miraheze/mediawiki] paladox cb28ba8 - Update MM
[17:26:45] [miraheze/puppet] Southparkfan pushed 1 commit to master [+0/-0/±1] https://git.io/fhVzc
[17:26:47] [miraheze/puppet] Southparkfan 11ac57b - Disallow SemrushBot immediately This bot is potentially putting the servers at risk.
[17:30:01] !log running generateMirahezeSitemap.php on mw1
[17:30:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[17:38:33] !log issues a varnish ban for req.url ~ ^/robots.txt on all cache proxies
[17:38:42] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log, Master
[17:39:05] Please note that there might be a delay of up to two weeks before SEMrushBot discovers the changes you made to robots.txt.
[17:39:32] let's ban it nginx side SPF|Cloud
[17:40:47] Oh, we're still banning ArchiveTeam and ArchiveBot?
[17:41:15] seems we are
[17:41:49] [miraheze/puppet] Southparkfan pushed 1 commit to master [+0/-0/±1] https://git.io/fhVga
[17:41:50] [miraheze/puppet] Southparkfan daad389 - Ban SemrushBot for good inside nginx Up to 2 weeks to fetch an update for robots.txt? Let's do it the quick way
[17:44:19] are we thinking that's the cause of the increased bandwidth?
[17:44:40] or is that just another factor?
[17:44:50] Yep, we think *it* may have been the cause
[17:47:43] access.log.2.gz contains all requests from [21/Jan/2019:05:43:16 +0000] until [22/Jan/2019:05:43:16 +0000]
[17:48:14] it contains 680628 entries, 168066 of them have 'SemrushBot' in their UA
[17:49:05] too bad our grafana data was deleted during the prometheus migration, but I'm confident that if I dig deeper through the old access logs, SemrushBot will show up there as well
[22:22:42] PROBLEM - mw3 JobQueue on mw3 is CRITICAL: JOBQUEUE CRITICAL - job queue greater than 300 jobs. Current queue: 349
[22:26:42] RECOVERY - mw3 JobQueue on mw3 is OK: JOBQUEUE OK - job queue below 300 jobs
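[Editor's note] The tally quoted at 17:48:14 ("680628 entries, 168066 of them have 'SemrushBot' in their UA") is the kind of count you get by grepping a rotated, gzipped nginx access log. A self-contained sketch with a synthetic log (the file path, log lines, and user-agent strings below are made up for illustration):

```shell
#!/bin/sh
# Build a tiny fake access log, gzip it, and count SemrushBot requests,
# mirroring the analysis of access.log.2.gz described in the channel.
tmpdir=$(mktemp -d)
log="$tmpdir/access.log"

cat > "$log" <<'EOF'
1.2.3.4 - - [21/Jan/2019:06:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)"
5.6.7.8 - - [21/Jan/2019:06:00:02 +0000] "GET /wiki/Main_Page HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
1.2.3.4 - - [21/Jan/2019:06:00:03 +0000] "GET /wiki/Foo HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)"
EOF
gzip -f "$log"

# zcat streams the compressed log; wc -l counts all requests,
# grep -c counts only lines whose UA mentions SemrushBot.
total=$(zcat "$log.gz" | wc -l)
semrush=$(zcat "$log.gz" | grep -c 'SemrushBot')
echo "total=$total semrush=$semrush"
```

On the real log this kind of one-liner is what yields the 168066-of-680628 figure (roughly a quarter of all traffic), which is why the bot was first blocked via robots.txt and then, given SEMrush's up-to-two-week refresh delay, banned outright at the nginx layer.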