[03:21:32] crud, NO_PAYMENT_PRODUCTS_AVAILABLE via Connect in Chile?
[03:24:11] Fundraising Sprint They Live, Fundraising Sprint USB stands for underhanded socket bureaucracy, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Patch-For-Review: Extend deletion to multiple silverpop databases - https://phabricator.wikimedia.org/T205332 (Eileenmcnaughton) @CCogdill_WMF I...
[03:27:00] ejegg: is that the failmail cause?
[03:27:27] that's what it looks like
[03:27:45] guessing the recurring-ness might have something to do with it
[03:28:13] sending an email to PPena to ask if she can confirm what we should have available
[03:28:42] ejegg: ok - do we need to take something down for tonight?
[03:29:53] let's see if this is just one donor
[03:35:52] hmph, that description should be translated
[08:31:28] PROBLEM - Host americium is DOWN: PING CRITICAL - Packet loss = 100%
[08:37:48] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=11 [critical = 10]
[08:42:48] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=15 [critical = 10]
[08:47:48] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[08:52:48] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=27 [critical = 10]
[08:57:58] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=35 [critical = 10]
[09:02:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=41 [critical = 10]
[09:07:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=45 [critical = 10]
[09:11:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:12:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=47 [critical = 10]
[09:16:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=11 [critical = 10]
[09:17:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=52 [critical = 10]
[09:21:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[09:21:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:22:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=63 [critical = 10]
[09:26:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[09:26:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=11 [critical = 10]
[09:27:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=68 [critical = 10]
[09:31:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=15 [critical = 10]
[09:31:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=12 [critical = 10]
[09:32:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=73 [critical = 10]
[09:35:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:36:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[09:36:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[09:37:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=82 [critical = 10]
[09:40:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:41:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=17 [critical = 10]
[09:41:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=15 [critical = 10]
[09:42:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=93 [critical = 10]
[09:45:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=11 [critical = 10]
[09:46:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=19 [critical = 10]
[09:46:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[09:47:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=97 [critical = 10]
[09:50:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:50:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=12 [critical = 10]
[09:51:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[09:51:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=18 [critical = 10]
[09:52:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=102 [critical = 10]
[09:55:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=10 [critical = 10]
[09:55:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[09:56:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=21 [critical = 10]
[09:57:00] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[09:57:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=108 [critical = 10]
[10:00:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=11 [critical = 10]
[10:00:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[10:01:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=22 [critical = 10]
[10:01:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=21 [critical = 10]
[10:02:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=128 [critical = 10]
[10:05:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=12 [critical = 10]
[10:05:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[10:07:07] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=23 [critical = 10]
[10:07:07] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=22 [critical = 10]
[10:07:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=159 [critical = 10]
[10:10:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=12 [critical = 10]
[10:10:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[10:11:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=24 [critical = 10]
[10:11:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=24 [critical = 10]
[10:12:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=164 [critical = 10]
[10:15:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[10:15:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=15 [critical = 10]
[10:16:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=25 [critical = 10]
[10:16:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=25 [critical = 10]
[10:17:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=169 [critical = 10]
[10:20:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[10:20:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[10:21:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=26 [critical = 10]
[10:21:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=26 [critical = 10]
[10:22:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=178 [critical = 10]
[10:25:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=13 [critical = 10]
[10:25:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[10:26:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=27 [critical = 10]
[10:26:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=27 [critical = 10]
[10:27:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=187 [critical = 10]
[10:30:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[10:30:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=17 [critical = 10]
[10:31:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=28 [critical = 10]
[10:31:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=29 [critical = 10]
[10:32:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=192 [critical = 10]
[10:35:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=14 [critical = 10]
[10:35:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=18 [critical = 10]
[10:36:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=29 [critical = 10]
[10:36:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=30 [critical = 10]
[10:37:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=195 [critical = 10]
[10:40:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=15 [critical = 10]
[10:40:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=19 [critical = 10]
[10:41:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=30 [critical = 10]
[10:41:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=31 [critical = 10]
[10:42:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=204 [critical = 10]
[10:45:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[10:45:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[10:46:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=31 [critical = 10]
[10:46:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=32 [critical = 10]
[10:47:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=209 [critical = 10]
[10:50:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=16 [critical = 10]
[10:50:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[10:51:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=32 [critical = 10]
[10:51:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=34 [critical = 10]
[10:52:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=220 [critical = 10]
[10:55:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=17 [critical = 10]
[10:55:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=21 [critical = 10]
[10:56:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=33 [critical = 10]
[10:56:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=35 [critical = 10]
[10:57:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=224 [critical = 10]
[11:00:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=17 [critical = 10]
[11:00:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=21 [critical = 10]
[11:01:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=35 [critical = 10]
[11:01:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=37 [critical = 10]
[11:02:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=236 [critical = 10]
[11:05:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=18 [critical = 10]
[11:05:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=22 [critical = 10]
[11:06:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=36 [critical = 10]
[11:06:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=38 [critical = 10]
[11:07:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=242 [critical = 10]
[11:10:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=18 [critical = 10]
[11:10:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=23 [critical = 10]
[11:11:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=37 [critical = 10]
[11:11:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=39 [critical = 10]
[11:12:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=247 [critical = 10]
[11:15:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=19 [critical = 10]
[11:15:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=23 [critical = 10]
[11:16:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=38 [critical = 10]
[11:16:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=41 [critical = 10]
[11:17:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=252 [critical = 10]
[11:20:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=19 [critical = 10]
[11:20:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=24 [critical = 10]
[11:21:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=39 [critical = 10]
[11:21:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=42 [critical = 10]
[11:22:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=256 [critical = 10]
[11:25:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[11:25:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=25 [critical = 10]
[11:26:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=41 [critical = 10]
[11:26:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=43 [critical = 10]
[11:27:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=264 [critical = 10]
[11:30:10] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=20 [critical = 10]
[11:30:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=26 [critical = 10]
[11:31:50] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=42 [critical = 10]
[11:31:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=44 [critical = 10]
[11:32:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=271 [critical = 10]
[11:35:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=21 [critical = 10]
[11:35:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=27 [critical = 10]
[11:36:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=44 [critical = 10]
[11:36:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=46 [critical = 10]
[11:37:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=275 [critical = 10]
[11:40:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=22 [critical = 10]
[11:40:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=28 [critical = 10]
[11:41:01] Jeff_Green, having some issues
[11:41:11] I've mailed a couple of notes
[11:41:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=45 [critical = 10]
[11:41:40] currently looking at switching off the job related to this:
[11:41:41] Fail Mail (civi1001) run-job: Banner impressions loader timed out after 10 minutes
[11:41:48] as civi1001 is showing 600 in top
[11:41:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=48 [critical = 10]
[11:42:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=279 [critical = 10]
[11:45:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=22 [critical = 10]
[11:45:50] PROBLEM - check_rsyslog_backlog on payments1001 is CRITICAL: CRITICAL frlog1001=29 [critical = 10]
[11:46:40] PROBLEM - check_rsyslog_backlog on payments1003 is CRITICAL: CRITICAL frlog1001=47 [critical = 10]
[11:46:50] PROBLEM - check_rsyslog_backlog on payments1002 is CRITICAL: CRITICAL frlog1001=49 [critical = 10]
[11:47:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=284 [critical = 10]
[11:48:34] jgleeson: hey, yup I just replied
[11:49:17] thanks Jeff_Green, read your reply but struggling to restart rsynclog. I don't have sudo perms
[11:49:25] yup
[11:49:28] I'm trying sudo service rsyslog restart
[11:49:30] i'm looking at it
[11:49:35] cool, thanks
[11:50:00] PROBLEM - check_rsyslog_backlog on frpig1001 is CRITICAL: CRITICAL frlog1001=23 [critical = 10]
[11:50:50] RECOVERY - check_rsyslog_backlog on payments1001 is OK: OK
[11:51:06] rsyslog*
[11:51:37] there are rsync problems too, it looks like one of the banner loggers fell over
[11:51:40] RECOVERY - check_rsyslog_backlog on payments1003 is OK: OK
[11:51:46] what the heck happened last night?!
[11:51:50] RECOVERY - check_rsyslog_backlog on payments1002 is OK: OK
[11:52:08] yeah I'm getting them mixed up lol
[11:52:37] I was looking at the log files for that and noticed the extremely high load in tp
[11:52:38] top
[11:52:49] although I can't see an offending process or CPU load to explain it
[11:52:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=283 [critical = 10]
[11:52:53] on which host?
[11:53:22] civi1001
[11:53:45] load average: 666.33, 653.45, 619.75
[11:53:48] Jeff_Green, ^
[11:53:51] oh really
[11:54:05] is that still happening?
[11:55:00] RECOVERY - check_rsyslog_backlog on frpig1001 is OK: OK
[11:55:05] yes
[11:55:11] viewing top now
[11:55:15] looking
[11:55:25] this doesn't behave like a machine with load >600
[11:56:07] I've never seen load that high!
[11:57:39] if it were actually working that hard it should be impossible to do anything b/c there wouldn't be resources for ssh
[11:57:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=283 [critical = 10]
[11:58:25] it is really slow for me
[11:58:39] but yes, I would imagine 600 would mean ground to a halt
[11:58:50] RECOVERY - Host americium is UP: PING OK - Packet loss = 0%, RTA = 0.85 ms
[11:59:44] well americium had some kind of kernel panic
[12:00:34] killing prometheus_node_exporter seems to have coincided with load recovery
[12:01:22] I can see it dropping
[12:01:24] woah
[12:01:27] that was crazy
[12:01:44] I wonder if that is the root cause of the banner job timeouts
[12:02:12] OH!
[12:02:18] ok you just explained it right there
[12:02:26] the root cause was americium falling over
[12:02:47] americium is the banner logger, it exports its banner log archive by nfs
[12:02:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=283 [critical = 10]
[12:03:02] ahh
[12:03:07] so when it fell over, civi1001 freaked out trying to access that nfs export
[12:03:21] nfs is notorious for not handling outages gracefully
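
Note on the load numbers above: on Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so processes blocked on an unreachable NFS export can push load into the hundreds while the CPUs stay mostly idle. That would be consistent with a host that reports load >600 yet still accepts ssh. A minimal sketch for checking this on a host, assuming a Linux /proc filesystem; the function names are illustrative and not part of any tooling mentioned in this log:

```python
import os


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take everything after the LAST ')'.
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def count_uninterruptible(proc_root="/proc"):
    """Count processes currently in state D (uninterruptible sleep)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                if parse_state(f.read()) == "D":
                    count += 1
        except OSError:
            continue  # process exited between listdir() and open()
    return count


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print("load average:", parse_loadavg(f.read()))
    print("processes in D state:", count_uninterruptible())
```

This only walks /proc/<pid>, so it counts processes rather than individual threads; a high D-state count next to low CPU usage points at blocked I/O rather than real compute load.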
[12:04:55] hmmm
[12:05:09] Jeff_Green, the load on civi1001 is increasing again
[12:05:12] 68+
[12:05:47] yup, watching too
[12:06:21] something is hammering rsyslog
[12:07:50] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=283 [critical = 10]
[12:08:40] PROBLEM - check_ipsec on americium is CRITICAL: Strongswan CRITICAL - ok: 0 not-conn: civi1001_v4
[12:10:10] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:10] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:11] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:11] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:12] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:12] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:13] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:13] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:14] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:14] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:15] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:15] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:16] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:16] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:17] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:17] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:18] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:18] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:21] hahahah
[12:10:30] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:30] :)
[12:10:30] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:31] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:31] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:32] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:32] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:33] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:33] PROBLEM - check_load on civi1001 is CRITICAL: CRITICAL - load average: 79.54, 142.74, 354.62
[12:10:51] i think we should just reboot civi1001
[12:12:52] Jeff_Green, do we need to disable paymentswiki first?
[12:12:54] PROBLEM - check_rsyslog_backlog on frdb1001 is CRITICAL: CRITICAL frlog1001=283 [critical = 10]
[12:12:59] no