[00:05:24] !log asher synchronized wmf-config/db.php 'returning db32 to normal weight' [00:05:26] Logged the message, Master [00:12:30] New patchset: Asher; "adding db52/53 to s1" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1971 [00:12:45] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1971 [00:12:51] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1971 [00:12:52] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1971 [00:16:25] PROBLEM - Squid on brewster is CRITICAL: Connection refused [00:16:26] PROBLEM - Squid on brewster is CRITICAL: Connection refused [00:18:42] !log cleared lighttpd logs on brewster and restarted squid and lighttpd [00:18:44] Logged the message, Master [00:21:05] !log rebuilding virt1 as a nova compute node [00:21:06] Logged the message, Master [00:22:50] !log removing virt1 cname [00:22:52] Logged the message, Master [00:26:15] RECOVERY - Squid on brewster is OK: TCP OK - 0.000 second response time on port 8080 [00:26:16] RECOVERY - Squid on brewster is OK: TCP OK - 0.000 second response time on port 8080 [00:34:37] New patchset: Ryan Lane; "Removing virt1.wikimedia.org and adding virt1.pmtpa.wmnet" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1972 [00:34:52] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1972 [00:40:40] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1972 [00:40:41] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1972 [00:41:23] People are making themselves on admins on English Wikipedia, as we speak, right now. Is this a hole in the system? http://en.wikipedia.org/wiki/Special:Log/rights [00:43:01] Okay, perhaps not, as one of them did submit themselves for review. [00:43:12] (Administrator review) [00:50:35] RECOVERY - Puppet freshness on virt1 is OK: puppet ran at Thu Jan 19 00:50:23 UTC 2012 [00:50:36] RECOVERY - Puppet freshness on virt1 is OK: puppet ran at Thu Jan 19 00:50:23 UTC 2012 [00:51:59] zanimum: those users were adding or removing other rights, they were already administrators [00:52:28] weird, but phew [01:01:06] !log installed python-argparse on stat1 [01:01:07] Logged the message, Master [01:12:53] New patchset: Asher; "removing extra frontend cache capacity from mobile" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1973 [01:13:08] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1973 [01:13:09] !log updated zip code/representative data on enwiki to r109465 [01:13:38] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1973 [01:13:39] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1973 [01:15:21] Logged the message, Master [01:15:35] ... lag ;) [01:17:19] !log Leaving cleanupUploadStash.php running against commonswiki in a screen session as me on hume [01:17:21] Logged the message, Master [01:17:52] will the normal editing mode change be automatic or does someone need to manually change the code back? 
[01:18:15] Manually [01:18:21] But Ryan has planned to be around to do it [01:18:32] okay :) [01:20:11] New patchset: Ryan Lane; "Adding virt1 public cert" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1974 [01:20:26] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1974 [01:21:42] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1974 [01:21:43] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1974 [01:26:39] RECOVERY - ps1-d2-pmtpa-infeed-load-tower-A-phase-Y on ps1-d2-pmtpa is OK: ps1-d2-pmtpa-infeed-load-tower-A-phase-Y OK - 1200 [01:26:39] RECOVERY - ps1-d2-pmtpa-infeed-load-tower-A-phase-Y on ps1-d2-pmtpa is OK: ps1-d2-pmtpa-infeed-load-tower-A-phase-Y OK - 1200 [01:42:46] PROBLEM - Varnish HTTP mobile-frontend on cp1040 is CRITICAL: Connection refused [01:42:47] PROBLEM - Varnish HTTP mobile-frontend on cp1040 is CRITICAL: Connection refused [01:46:35] PROBLEM - Varnish HTTP mobile-frontend on cp1039 is CRITICAL: Connection refused [01:46:36] PROBLEM - Varnish HTTP mobile-frontend on cp1039 is CRITICAL: Connection refused [01:46:55] PROBLEM - mobile traffic loggers on cp1040 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa [01:46:56] PROBLEM - mobile traffic loggers on cp1040 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa [01:49:15] PROBLEM - Varnish HTTP mobile-backend on cp1039 is CRITICAL: Connection refused [01:49:16] PROBLEM - Varnish HTTP mobile-backend on cp1039 is CRITICAL: Connection refused [01:50:55] PROBLEM - mobile traffic loggers on cp1039 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa [01:50:55] PROBLEM - mobile traffic loggers on cp1039 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa [01:50:55] ACKNOWLEDGEMENT - Varnish HTTP mobile-backend on cp1039 is CRITICAL: Connection refused asher waiting for puppet [01:50:56] ACKNOWLEDGEMENT - Varnish HTTP mobile-backend on cp1039 is CRITICAL: Connection refused asher waiting for puppet [01:52:45] PROBLEM - Varnish HTTP mobile-backend on cp1040 is CRITICAL: Connection refused [01:52:46] PROBLEM - Varnish HTTP mobile-backend on cp1040 is CRITICAL: Connection refused [01:53:45] ACKNOWLEDGEMENT - mobile traffic loggers on cp1040 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa asher waiting for puppet [01:53:46] ACKNOWLEDGEMENT - mobile traffic loggers on cp1040 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa asher waiting for puppet [01:55:38] !log added virt1 and virt4 to instance volume for gluster [01:55:40] Logged the message, Master [02:00:25] RECOVERY - mobile traffic loggers on cp1044 is OK: PROCS OK: 2 processes with command name varnishncsa [02:00:25] RECOVERY - mobile traffic loggers on cp1044 is OK: PROCS OK: 2 processes with command name varnishncsa [02:05:13] !log rebalancing instance gluster volume. network may get saturated for a while. 
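The gluster work just logged (adding virt1/virt4 to the instance volume, then rebalancing) maps onto a small set of GlusterFS commands. A minimal sketch, assuming a volume named "instances" and illustrative brick paths; neither the real volume name nor the brick paths appear in the log:

    # Add bricks on the new hosts to the existing volume (brick paths are assumptions)
    gluster volume add-brick instances virt1.pmtpa.wmnet:/a/instances virt4.pmtpa.wmnet:/a/instances
    # Spread existing data across the new bricks; this is the step that saturates the network
    gluster volume rebalance instances start
    gluster volume rebalance instances status

The rebalance runs server-side, so the status command is how the completion that gets !logged shortly afterwards would be confirmed.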
[02:05:16] Logged the message, Master [02:05:51] http://ganglia.wikimedia.org/2.2.0/graph_all_periods.php?c=Virtualization%20cluster%20pmtpa&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1326938719&g=network_report&z=large&c=Virtualization%20cluster%20pmtpa [02:05:55] !log LocalisationUpdate completed (1.18) at Thu Jan 19 02:05:55 UTC 2012 [02:05:57] ^^ spike in network stats for rebalance :D [02:05:57] Logged the message, Master [02:06:18] !log rebalance of gluster volume completed [02:06:20] Logged the message, Master [02:17:26] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1321s [02:17:26] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1321s [02:23:35] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1691s [02:23:35] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1691s [02:29:49] !log awjrichards synchronized php/extensions/CongressLookup/CongressLookup.i18n.php 'r109477' [02:29:52] Logged the message, Master [02:30:05] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2488* [02:30:05] PROBLEM - ps1-d2-sdtpa-infeed-load-tower-A-phase-Z on ps1-d2-sdtpa is CRITICAL: ps1-d2-sdtpa-infeed-load-tower-A-phase-Z CRITICAL - *2488* [02:30:31] !log awjrichards synchronized php/extensions/CongressLookup/SpecialCongressLookup.php 'r109477' [02:30:32] Logged the message, Master [02:33:35] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 14s [02:33:35] RECOVERY - Misc_Db_Lag on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 14s [02:37:55] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [02:37:55] RECOVERY - MySQL replication status on storage3 is OK: CHECK MySQL REPLICATION - lag - OK - Seconds_Behind_Master : 0s [03:01:06] !log rebooting virt1 to ensure hardware virtualization is enabled in the bios [03:01:08] Logged the message, Master [03:11:34] !log bringing virt1 back up [03:11:36] Logged the message, Master [03:21:35] RECOVERY - Puppet freshness on db1045 is OK: puppet ran at Thu Jan 19 03:21:14 UTC 2012 [03:21:35] RECOVERY - Puppet freshness on db1045 is OK: puppet ran at Thu Jan 19 03:21:14 UTC 2012 [04:13:33] Are there still blog slowdowns? [04:15:45] RECOVERY - Disk space on es1004 is OK: DISK OK [04:15:46] RECOVERY - Disk space on es1004 is OK: DISK OK [04:16:45] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:16:46] RECOVERY - MySQL disk space on es1004 is OK: DISK OK [04:32:06] !log awjrichards synchronizing Wikimedia installation... : Deploying CongressLookup changes for the lifting of the blackout [04:32:08] Logged the message, Master [04:34:09] sync done. [04:41:56] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [04:41:57] PROBLEM - MySQL slave status on es1004 is CRITICAL: CRITICAL: Slave running: expected Yes, got No [04:42:06] awjr: is that the last scap? [04:42:45] it should be - preilly has a sync-file left to do and i'll have a configchange to do right at nine [04:42:55] Ryan_Lane ^ [04:43:07] which configchange? 
[04:43:17] I'm doing a bunch of config changes at 9 [04:43:26] heh ok maybe you can do mine too, then [04:43:34] is it going to be in InitialiseSettings? [04:43:55] yeah [04:44:07] wmgCongressLookupBlackOnWhite needs to be set to true for enwiki [04:44:13] ok [04:44:27] last stanza in the file. [04:45:31] ok. I have it set and commented out [04:45:36] what does this do? [04:45:46] cool looks good [04:46:03] awjr: ^^ [04:46:16] Ryan_Lane: you aren't going to SCAP right? [04:46:20] no [04:46:32] it essentially inverts the color scheme on Special:CongressLookup [04:46:33] someone just did scap, though [04:46:37] it was me [04:46:39] oh wait [04:46:46] i mean, i scapped like 1- mins ago [04:46:49] er 10 [04:47:51] !log Preparing InitialiseSettings for renabling Wikipedia. DO NOT SCAP, DO NOT PUSH InitializeSettings [04:47:54] Logged the message, Master [04:47:58] or I will kill you [04:53:36] !log preilly synchronized php-1.18/extensions/MobileFrontend/ 'new sopa banner' [04:53:38] Logged the message, Master [04:57:02] !log flushing mobile varnish caches [04:57:03] Logged the message, Master [05:00:09] !log laner synchronized wmf-config/InitialiseSettings.php 'Removing all SOPA changes, excluding editing for anons, and page creation' [05:00:11] Logged the message, Master [05:00:15] !!!!! [05:00:17] Yay :D [05:03:27] binasher: read-only mode on database! [05:03:41] grr [05:03:44] :D [05:04:41] Ryan_Lane: it shouldn't be.. the line is commented out in db.php #>------'s1'>--- => 'Maintenance in progress, please try again in 5 minutes', [05:04:49] what about the master itself? [05:04:53] but its possible that some apaches didn't get it [05:05:04] no one can edit [05:05:09] so it's not just a few [05:05:16] | read_only | OFF | [05:05:19] hmm [05:05:27] ah. it's working now apparently [05:05:35] awjr has been updating congressional info since the master switch too [05:05:37] \o/ [05:05:48] did anything change? [05:05:50] not sure what happened [05:05:51] it was a auto lock, the slaves were behind [05:06:03] It's fixed [05:06:14] !log preilly synchronized php-1.18/extensions/MobileFrontend/ 'new sopa banner' [05:06:16] Logged the message, Master [05:07:38] binasher: heh. sorry for the scare ;) [05:07:47] flood of writes [05:07:55] ahhhh [05:08:05] preilly: still getting black banners on mobile [05:08:12] is that why it was having read-only issues? [05:08:48] IP editing is still disabled, is that intended? [05:08:57] !log preilly synchronized php-1.18/extensions/MobileFrontend/ 'new sopa banner' [05:08:59] Logged the message, Master [05:09:36] Shirik: Phillippe said apparently on #wikimedia-sopa "one minute" [05:09:41] ok thanks [05:10:12] er, "any minute" to be precise [05:12:13] !log laner synchronized wmf-config/InitialiseSettings.php 'Enabling page creation for users' [05:12:15] Logged the message, Master [05:14:46] !log laner synchronized wmf-config/InitialiseSettings.php 'Enabling anon editing for enwiki' [05:14:48] Logged the message, Master [05:16:20] !log preilly synchronized php-1.18/extensions/MobileFrontend/ 'new sopa banner' [05:16:22] Logged the message, Master [05:16:45] it looks like there are a few straggler apaches that aren't getting syncs but are getting traffic.. 
there are some entries in dberror.log about hosts trying to talk to ms2 which has been out of prod for a month or more [06:31:19] !log asher synchronized php-1.18/extensions/CongressLookup/SpecialCongressLookup.php 'new formatting for congresslookup background graphic' [07:28:07] Isn't the blackout supposed to be finished? I am getting a 'whiteout' now [07:29:42] it is supposed to be over, yes [07:29:49] I was able to load a page and read it [07:31:18] When I go to English Wikipedia, I still get the see-the-page-for-a-fragment-of-a-second-then-it-is-replaced-by-the-blackout effect - but with an empty screen instead of the SOPA protest screen as the blackout screen [07:33:51] Andre_Engels: You might need to bypass your browsers cache http://en.wikipedia.org/wiki/Wikipedia:Bypass_your_cache&banner=no [07:35:47] https://en.wikipedia.org/wiki/Wikipedia:Bypass_your_cache?banner=no [07:39:14] Thanks, but still didn't help :-( [07:39:40] Emptying the cache that is; the banner=no did work [07:40:11] One moment, will try stopping and restarting my browser [07:40:15] ok [07:42:30] any better? [07:42:45] Nope, still doesn't work... Now don't even get to see the page for a few milliseconds any more... [07:43:19] if you use another browser, (chrome if you're using ff, for example) what does that do foryou? [07:44:39] Andre - try CTRL + F5 [07:44:54] Thats the force reload combination for most browsers. [07:45:37] @Excirial: I already tried that before coming here [07:46:29] aspergos: it's indeed in the browser, I am getting the page correctly when using IE instead of FF [07:46:38] ok [07:47:00] Ctrl shift del doesnt work either? [07:47:05] clear your browser cache, cookies, everything... for good measure [07:47:08] well you might... hmm.. close the tab in the misbehaving browser, that has the page loaded, toss all cookies and all cache [07:47:08] (in firefox) [07:47:10] exit [07:47:17] then restart and see if that gets it [07:48:16] (um, it's *apergos* btw. no connection to asperger's :-P) [07:51:51] * Prodego pets apergos, he didn't mean it [07:52:00] heh [07:52:11] Asparagus, i mean aspergers, i mean... Apergos... :p [07:52:12] mister "SOPAOnWheels"... :-P [07:52:45] apergos: yea I can't remember the password for that one... might ask one of you guys to set the email for me so I can recover it if I think of a good use :) [07:52:46] I hope someone has a screenshot of rc for en wp during the blackout [07:52:50] it should all fit on one page [07:53:03] hahahah [07:53:31] 's what you get for crossing the picket lines during a blackout :-P [07:53:33] Andre_Engels: how'd that work out for you? [07:53:55] apergos: :) [07:54:04] Ok guys, found it. [07:54:07] oh? [07:54:15] what was it?? [07:54:28] It was a bad setting I created in an add-on during that day [07:54:34] :-) [07:54:39] ok, glad problem solved! [07:54:42] I had attempted to remove the blackout screen using "Remove it permanently" [07:54:56] ah, you wanted to do this - http://meta.wikimedia.org/w/index.php?action=raw&ctype=text/css&title=User:Prodego/enw.css [07:55:02] ... but instead ended up removing all
s on English Wikipedia pages [07:55:26] I personally thought my idea of including a meta .css file in my enwiki .css file was quite clever [07:55:42] some peoplpe (me) did not try to do some goofy workaround [07:55:46] just sayin.... [07:56:02] Just disabling javascript would have worked fine as well. [07:56:06] ?banner=anything worked nicely too [07:56:24] apergos: but.. but what if I wanted to look something up? [07:56:49] guess I'll just have to use conservapedia [07:56:57] www.happyplace.com/13509/alternative-information-sources-while-wikipedia-is-down [07:57:19] in the news at conservapedia - Why isn't Wikipedia protesting Hollywood's insistence on SOPA/PIPA? [07:58:25] ""Wikipedia editors question site's blackout." [4] Why doesn't Wikipedia instead expose the big liberal money being poured into the Democrat Party to pass the bad bill? [07:58:26] " [07:58:39] ah well, read too much of that and your head explodes [07:58:50] good night apergos, Excirial, Andre_Engels [07:58:55] other people [08:03:05] Excirial: I tried the disabling Javascript, but if I have javascript disabled on Wikipedia, I get to see only the left-side vertical strip thingie [08:03:53] Wait, no, does work now. [08:04:02] Ah well, I managede [09:06:35] PROBLEM - Puppet freshness on mw1096 is CRITICAL: Puppet has not run in the last 10 hours [09:06:36] PROBLEM - Puppet freshness on mw1096 is CRITICAL: Puppet has not run in the last 10 hours [09:26:57] Another correction of "zip code" to "ZIP code" is needed: http://en.wikipedia.org/wiki/Special:CongressLookup [09:35:09] New patchset: ArielGlenn; "add snapshot1001-4 to site.pp and to download exports list" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1975 [09:41:30] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1975 [09:41:31] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1975 [09:50:05] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 424992 MB (3% inode=99%): [09:50:05] PROBLEM - Disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 424992 MB (3% inode=99%): [09:50:15] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 424699 MB (3% inode=99%): [09:50:16] PROBLEM - MySQL disk space on es1004 is CRITICAL: DISK CRITICAL - free space: /a 424699 MB (3% inode=99%): [10:16:35] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [10:16:35] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [10:36:15] RECOVERY - MySQL slave status on es1004 is OK: OK: [10:36:15] RECOVERY - MySQL slave status on es1004 is OK: OK: [10:36:35] PROBLEM - Puppet freshness on bast1001 is CRITICAL: Puppet has not run in the last 10 hours [10:36:35] PROBLEM - Puppet freshness on bast1001 is CRITICAL: Puppet has not run in the last 10 hours [10:41:26] PROBLEM - Puppet freshness on fenari is CRITICAL: Puppet has not run in the last 10 hours [10:41:26] PROBLEM - Puppet freshness on fenari is CRITICAL: Puppet has not run in the last 10 hours [11:06:20] New review: Dzahn; "looks like this is related to a new puppet problem on fenari:" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1972 [11:32:52] New patchset: ArielGlenn; "em.. the new snaps are at equid, add to regexp in site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1976 [11:37:26] New patchset: ArielGlenn; "em.. 
the new snaps are at equid, add to regexp in site.pp" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1976 [11:44:39] New review: ArielGlenn; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1976 [11:44:40] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1976 [12:08:58] I'm getting a 500 error on the English Wikinews RSS feed... http://en.wikinews.org/w/index.php?title=Special:NewsFeed&feed=rss&categories=Published&notcategories=No%20publish%7CArchived%7CAutoArchived%7Cdisputed&namespace=0&count=128&hourcount=240&ordermethod=categoryadd&stablepages=only [12:12:04] brianmc: you surely do [12:12:12] Because your URL is incorrect [12:12:26] It has XML entities in it [12:12:54] That worked until 17/01/12 22:22 [12:13:54] Well, you are supposed to strip them [12:17:24] The RSS link on the enWN main page, https://en.wikinews.org/w/index.php?title=Special:NewsFeed&feed=atom&categories=Published¬categories=No%20publish|Archived|AutoArchived|disputed&namespace=0&count=30&hourcount=124&ordermethod=categoryadd&stablepages=only, is similarly failing [12:18:24] PHP fatal error in /usr/local/apache/common-local/php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php line 111 [12:18:30] Access level to FeedSMItem::$title must be public (as in class FeedItem) [12:18:38] Looks like we got a real problem [12:22:20] Thanks, was a bit of head-scratching for me there. [12:40:57] Hello there Masti! After a long time! ;) [12:41:56] hi Tanvir ;) [12:42:28] Okay, whatever broke Wikinews' NewsFeed happened between 22:22 and 22:30UTC on the 17th... [12:44:26] note that shortly the wikitech web site (and therefor the server admin log and the bot that logs to it) will be unavailable, since the hosting site is moving our content off its currently broken instance to a new one [13:01:34] erm, morebots died about 7 1/2 hours ago anyway [13:04:39] good I'm not logging anything then :-/ [13:06:37] !log cleanupUploadStash finished for Commons [13:06:50] probably not [13:07:06] the wikitech instance is being moved or will be soon, by our hosters [13:07:15] ah [13:07:21] besides morebots being dead [13:07:29] Not a big deal [13:07:31] heh [14:24:15] PROBLEM - Apache HTTP on srv263 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:24:15] PROBLEM - Apache HTTP on srv263 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:30:25] PROBLEM - Disk space on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:30:25] PROBLEM - Disk space on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:35:35] PROBLEM - DPKG on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:35:35] PROBLEM - RAID on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:35:36] PROBLEM - DPKG on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:35:36] PROBLEM - RAID on srv263 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:36:13] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'r109532' [14:36:26] PROBLEM - SSH on srv263 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:36:26] PROBLEM - SSH on srv263 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:37:58] Hi, I'm having an issue on en.wiki with renames. I'm getting an error message: There was a problem with receiving the request. Please go back and try again. 
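For the Special:NewsFeed 500 discussed above, there were two separate problems: unstripped XML entities in the pasted URL, and a PHP fatal in FeedSMItem.php. A rough sketch of how the feed could be re-checked from a shell once a fix is synced, assuming xmllint is available; the query string simply mirrors the one quoted above, with literal & separators rather than &amp; entities:

    # Expect a 200 once the FeedSMItem fatal is gone; a 500 means it is still being hit
    curl -s -o /tmp/newsfeed.xml -w '%{http_code}\n' \
        'http://en.wikinews.org/w/index.php?title=Special:NewsFeed&feed=rss&categories=Published&notcategories=No%20publish|Archived|AutoArchived|disputed&namespace=0&count=30'
    # And confirm the body is well-formed XML rather than an error page
    xmllint --noout /tmp/newsfeed.xml && echo 'feed parses'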
[14:39:02] I am trying to fulfill http://en.wikipedia.org/wiki/Wikipedia:Changing_username/Simple#Pfc432_.E2.86.92_Brenda_Fernandez [14:42:10] was the new username created during the rename? [14:44:29] apergos: no [14:44:46] so it didn't even get that far [14:45:15] PROBLEM - MySQL slave status on es2 is CRITICAL: CRITICAL: Connected threads = 1199 (1000) [14:45:16] PROBLEM - MySQL slave status on es2 is CRITICAL: CRITICAL: Connected threads = 1199 (1000) [14:45:26] apergos: correct [14:45:36] PROBLEM - DPKG on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:45:36] PROBLEM - Disk space on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:45:36] PROBLEM - DPKG on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:45:37] PROBLEM - Disk space on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:45:47] PROBLEM - Disk space on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:45:47] PROBLEM - Disk space on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:46:15] PROBLEM - MySQL slave status on es4 is CRITICAL: CRITICAL: Connected threads = 1159 (1000) [14:46:16] PROBLEM - MySQL slave status on es4 is CRITICAL: CRITICAL: Connected threads = 1159 (1000) [14:46:20] uh, things are lagging pretty hard on frwiki [14:47:05] PROBLEM - SSH on srv259 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:47:05] PROBLEM - SSH on srv259 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:47:53] brianmc, about? [14:48:16] Request: POST http://www.mediawiki.org/wiki/Special:Code/MediaWiki/109532, from 208.80.152.72 via sq59.wikimedia.org (squid/2.7.STABLE9) to () [14:48:16] Error: ERR_CANNOT_FORWARD, errno [No Error] at Thu, 19 Jan 2012 14:47:23 GMT [14:48:29] ugh [14:48:30] Had multiple of those in the past 2 mins. [14:48:30] I'm getting Request: GET http://zh.wikipedia.org/wiki/Template:Country, from [ip] via sq65.wikimedia.org (squid/2.7.STABLE9) to () [14:48:31] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 19 Jan 2012 14:47:24 GMT [14:49:55] PROBLEM - Apache HTTP on srv259 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:49:55] PROBLEM - Apache HTTP on srv259 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:50:25] PROBLEM - RAID on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:50:26] PROBLEM - RAID on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:50:34] hello [14:50:35] http://commons.wikimedia.org/w/index.php?title=Commons:Deletion_requests/All_files_copyrighted_in_the_US_under_the_URAA&curid=18088827&diff=65642425&oldid=65642381 [14:50:45] PROBLEM - RAID on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:50:48] Request: GET http://commons.wikimedia.org/w/index.php?title=Commons:Deletion_requests/All_files_copyrighted_in_the_US_under_the_URAA&curid=18088827&diff=65642425&oldid=65642381, from 208.80.152.87 via sq66.wikimedia.org (squid/2.7.STABLE9) to () [14:50:48] PROBLEM - RAID on srv259 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:50:48] Error: ERR_CANNOT_FORWARD, errno (11) Resource temporarily unavailable at Thu, 19 Jan 2012 14:49:27 GMT [14:50:54] are you aware of that? 
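With the squids returning ERR_CANNOT_FORWARD and the es slaves tripping their 1000-connected-threads threshold, the usual first-pass checks are whether a backend apache still answers when squid is bypassed and how many client threads the database is actually holding. A rough sketch, assuming shell access from a bastion, the .pmtpa.wmnet internal names used elsewhere in this log, and working MySQL credentials (all assumptions):

    # Does an individual apache answer if squid is bypassed?
    curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' \
        -H 'Host: en.wikipedia.org' http://srv263.pmtpa.wmnet/wiki/Main_Page
    # How many threads is an external-storage slave holding? (the alert threshold is 1000)
    mysql -h es2 -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'"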
[14:50:56] PROBLEM - SSH on srv275 is CRITICAL: Server answer: [14:50:57] PROBLEM - SSH on srv275 is CRITICAL: Server answer: [14:51:22] yannf: yes [14:51:35] PROBLEM - SSH on srv286 is CRITICAL: Server answer: [14:51:36] PROBLEM - SSH on srv286 is CRITICAL: Server answer: [14:51:37] ok [14:51:43] read appears to work, change/edit is failing. [14:52:05] Still here, Reedy - but might be worth getting out everyone's hair ;) [14:52:15] PROBLEM - DPKG on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:52:15] PROBLEM - DPKG on srv275 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:52:25] PROBLEM - Disk space on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:52:25] PROBLEM - Disk space on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:53:15] PROBLEM - Apache HTTP on srv275 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:53:15] PROBLEM - Apache HTTP on srv275 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:53:23] brianmc, in theory, the rss issue should be fixed... but i'm not familiar with any caching that takes place... Do you know where vvv got the actualy fatal listed? [14:53:44] Oh [14:53:46] I se eanother [14:53:56] Eh, nope. I just posted a bust link, and he dug that out [14:54:18] i just found another fatal [14:54:20] let me fix that [14:55:22] Anyone from ops able to provide some brief info? [14:55:46] I'm looking at it and don't know what's wrong [14:55:57] PROBLEM - DPKG on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:55:57] PROBLEM - DPKG on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:56:02] apergos: need more info, or are you able to reproduce? [14:56:19] I'm looking at these servers that are out to lunch, basically [14:56:26] hah [14:57:09] Thanks. [14:57:55] PROBLEM - RAID on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:57:55] PROBLEM - RAID on srv286 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [14:58:53] Hm. Taking squid ERR_CANNOT_FORWARD, errno (11) on http://wikimediafoundation.org/wiki/SOPA/Blackoutpage [14:58:55] PROBLEM - Apache HTTP on srv286 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:58:56] PROBLEM - Apache HTTP on srv286 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:06] I see 15 apaches not working during sync-file.. [14:59:27] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'r109538' [15:00:06] hi [15:00:10] srv275--had a big jump in cpu wait i/o time [15:00:24] Request: GET http://it.wikipedia.org/wiki/Pagina_principale, from 208.80.152.86 via sq66.wikimedia.org (squid/2.7.STABLE9) to () [15:00:25] Error: ERR_CANNOT_FORWARD, errno (115) Operation now in progress at Thu, 19 Jan 2012 14:59:24 GMT [15:01:01] that sorta feels like nfsfail [15:01:45] RECOVERY - SSH on srv286 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:01:46] RECOVERY - SSH on srv286 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:02:25] RECOVERY - Disk space on srv286 is OK: DISK OK [15:02:25] RECOVERY - Disk space on srv286 is OK: DISK OK [15:02:57] Jeff_Green, a few of the busy apaches have had quite an increase of swap usage [15:03:35] Reedy: yeah, I see that on srv286 for example [15:03:35] PROBLEM - DPKG on srv261 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[15:03:35] PROBLEM - Apache HTTP on srv230 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:03:36] PROBLEM - DPKG on srv261 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:03:36] PROBLEM - Apache HTTP on srv230 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:03:45] PROBLEM - Apache HTTP on srv211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:03:46] PROBLEM - Apache HTTP on srv211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:03:57] PROBLEM - Apache HTTP on mw54 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:03:58] PROBLEM - Apache HTTP on mw54 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:04:12] srv275 too [15:04:28] what just died? [15:04:29] 263, 261 [15:04:42] 279 [15:04:45] PROBLEM - Apache HTTP on mw30 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:04:45] PROBLEM - Apache HTTP on srv270 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:04:45] PROBLEM - Apache HTTP on mw30 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:04:46] PROBLEM - Apache HTTP on srv270 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:04:53] 268 [15:04:54] ? [15:04:59] has this been happening a lot? [15:05:05] PROBLEM - Apache HTTP on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:05] PROBLEM - Apache HTTP on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:13] You get the odd apache do it from time to time [15:05:15] PROBLEM - Apache HTTP on mw28 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:15] PROBLEM - Apache HTTP on mw28 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:25] PROBLEM - Disk space on srv261 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:05:25] PROBLEM - Disk space on srv261 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[15:05:38] might be time for a cron to log processes so we can see which bloats [15:05:55] PROBLEM - Apache HTTP on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:55] RECOVERY - DPKG on srv286 is OK: All packages OK [15:05:56] PROBLEM - Apache HTTP on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:05:56] RECOVERY - DPKG on srv286 is OK: All packages OK [15:05:58] as it stands I can't get a session to investigate [15:06:05] PROBLEM - Apache HTTP on srv190 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:05] PROBLEM - Apache HTTP on srv190 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:13] 38 srv boxes and 25 mw boxes on nagios complaining [15:06:20] I think the usual process is just to bounce the box [15:06:25] PROBLEM - Apache HTTP on srv205 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:25] PROBLEM - Apache HTTP on mw35 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:25] PROBLEM - Apache HTTP on mw42 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:26] PROBLEM - Apache HTTP on srv205 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:26] PROBLEM - Apache HTTP on mw35 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:26] PROBLEM - Apache HTTP on mw42 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:27] well [15:06:29] those machines are idle [15:06:32] apache threads are locked up [15:06:38] well, doing something [15:06:45] PROBLEM - Apache HTTP on srv271 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:46] PROBLEM - Apache HTTP on srv271 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:04] i'd happily slay procs if I could actually get a session :-( [15:07:06] #2 php_sock_stream_wait_for_data (stream=0x7fb63da3f0a0, buf=0x7fb63cc8c3b0 "\230\225\f:\266\177", count=8192) at /tmp/buildd/php5-5.3.2/main/streams/xp_socket.c:131 [15:07:06] #3 php_sockop_read (stream=0x7fb63da3f0a0, buf=0x7fb63cc8c3b0 "\230\225\f:\266\177", count=8192) at /tmp/buildd/php5-5.3.2/main/streams/xp_socket.c:154 [15:07:06] #4 0x00007fb63836db9a in php_openssl_sockop_read (stream=0x7fff764f7220, buf=0x7fb63cc8c3b0 "\230\225\f:\266\177", count=500) at /tmp/buildd/php5-5.3.2/ext/openssl/xp_ssl.c:234 [15:07:06] #5 0x00007fb63858a764 in php_stream_fill_read_buffer (stream=0x7fb63da3f0a0, size=) at /tmp/buildd/php5-5.3.2/main/streams/streams.c:562 [15:07:08] #6 0x00007fb63858a910 in _php_stream_get_line (stream=0x7fb63da3f0a0, buf=0x0, maxlen=500, returned_len=0xffffffffffffffff) at /tmp/buildd/php5-5.3.2/main/streams/streams.c:841 [15:07:11] #7 0x00007fb638500c01 in zif_fgets (ht=1, return_value=0x7fb63e75cd60, return_value_ptr=, this_ptr=, return_value_used=) at /tmp/buildd/php5-5.3.2/ext/standard/file.c:1074 [15:07:15] #8 0x00007fb63861e3fa in zend_do_fcall_common_helper_SPEC (execute_data=0x7fb63ea2ce48) at /tmp/buildd/php5-5.3.2/Zend/zend_vm_execute.h:313 [15:07:18] where do we do SSL reads? [15:07:25] PROBLEM - Apache HTTP on srv261 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:26] PROBLEM - Apache HTTP on srv261 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:30] * domas resolves stack trace manually [15:07:45] RECOVERY - RAID on srv286 is OK: OK: no RAID installed [15:07:45] PROBLEM - Apache HTTP on mw55 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:45] PROBLEM - RAID on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
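The numbered frames pasted above are a gdb backtrace taken from one of the wedged PHP workers. A minimal sketch of how such a trace can be grabbed when the box is still reachable, assuming gdb and strace are installed; the PID mirrors the runaway runJobs process that turns up later in this log:

    # Find the fattest php worker, then take a one-shot backtrace without staying attached
    ps -eo pid,pcpu,rss,args --sort=-rss | grep -m1 '[p]hp'
    gdb --batch --pid 24122 -ex 'bt'
    # strace shows which syscall it is blocked in, per the "stracing would work too" remark below
    strace -p 24122 -f -tt -e trace=network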
[15:07:46] RECOVERY - RAID on srv286 is OK: OK: no RAID installed [15:07:46] PROBLEM - Apache HTTP on mw55 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:46] PROBLEM - RAID on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:07:50] wikipedia is very slow atm. [15:07:54] yes, we know [15:07:55] PROBLEM - Apache HTTP on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:55] PROBLEM - Apache HTTP on srv244 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:56] PROBLEM - Apache HTTP on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:07:56] PROBLEM - Apache HTTP on srv244 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:05] PROBLEM - Apache HTTP on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:06] PROBLEM - Apache HTTP on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:07] domas: what's /tmp/buildd/php5-5.3.2/ext/standard/file.c ? [15:08:09] people will think there's a second blakout, haha [15:08:26] PROBLEM - Apache HTTP on mw27 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:26] PROBLEM - Apache HTTP on mw27 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:29] DarkoNeko: a blackout against blockouts! [15:08:39] a blackout against web scale technologies [15:08:45] PROBLEM - Apache HTTP on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:45] PROBLEM - Apache HTTP on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:54] mmm [15:08:56] PROBLEM - SSH on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:08:57] PROBLEM - SSH on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:09:02] something is doing ssl fgets [15:09:07] i guess stracing would work too :) [15:09:19] maybe people will think congress decided to censor wikipedia in retaliation [15:09:25] RECOVERY - Apache HTTP on srv286 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.063 second response time [15:09:26] RECOVERY - Apache HTTP on srv286 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.063 second response time [15:09:33] oh wait [15:09:39] or stupid me [15:09:55] PROBLEM - DPKG on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:09:56] PROBLEM - DPKG on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:10:07] can be just memcached issue [15:10:15] PROBLEM - Apache HTTP on srv232 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:10:16] PROBLEM - Apache HTTP on srv232 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:10:35] FFFUUUU [15:10:35] php mctest.php enwiki [15:10:35] No MWMultiVersion instance initialized! MWScript.php wrapper not used? [15:10:51] how can people do things like this? [15:10:54] mwscript mctest.php enwiki [15:11:11] heh, interesting [15:11:17] it starts timing out on very first one? [15:12:01] I guess it blew up in memory [15:12:21] there're like 10 bad memcached servers at the moment [15:12:23] or more [15:12:25] PROBLEM - Disk space on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:12:26] PROBLEM - Disk space on srv279 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
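Once suspicion shifts from SSL to memcached, the quickest confirmation is to poke every memcached instance directly rather than going through mctest.php. A rough sketch, assuming a plain host:port list in a file; port 11000 matches the Memcached service checks elsewhere in this log, while the file path is made up:

    # Flag memcached instances that do not answer a stats request within 2 seconds
    while read hostport; do
        host=${hostport%:*}; port=${hostport#*:}
        printf 'stats\nquit\n' | nc -w 2 "$host" "$port" | grep -q 'STAT uptime' \
            || echo "DEAD: $hostport"
    done < /tmp/memcached-servers.txt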
[15:13:13] beautiful [15:13:13] http://ganglia.wikimedia.org/2.2.0/?c=Application%20servers%20pmtpa&h=srv268.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [15:13:35] RECOVERY - DPKG on srv261 is OK: All packages OK [15:13:36] RECOVERY - DPKG on srv261 is OK: All packages OK [15:13:47] RECOVERY - Apache HTTP on srv211 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.027 second response time [15:13:47] RECOVERY - Apache HTTP on srv230 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.040 second response time [15:13:48] RECOVERY - Apache HTTP on srv211 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.027 second response time [15:13:48] RECOVERY - Apache HTTP on srv230 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.040 second response time [15:14:17] what did you do? [15:14:55] RECOVERY - Apache HTTP on srv270 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.698 second response time [15:14:56] RECOVERY - Apache HTTP on srv270 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.698 second response time [15:15:15] RECOVERY - Apache HTTP on mw28 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.020 second response time [15:15:16] RECOVERY - Apache HTTP on mw28 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.020 second response time [15:15:17] it looks like it bounced toward swapdeath, recovered briefly, and did it again [15:15:18] hmm? [15:15:25] RECOVERY - Disk space on srv261 is OK: DISK OK [15:15:26] RECOVERY - Disk space on srv261 is OK: DISK OK [15:15:45] PROBLEM - MySQL slave status on es2 is CRITICAL: CRITICAL: Connected threads = 1016 (1000) [15:15:46] PROBLEM - MySQL slave status on es2 is CRITICAL: CRITICAL: Connected threads = 1016 (1000) [15:16:25] RECOVERY - Apache HTTP on srv205 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.023 second response time [15:16:26] RECOVERY - Apache HTTP on srv205 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.023 second response time [15:16:35] PROBLEM - SSH on srv268 is CRITICAL: Server answer: [15:16:36] PROBLEM - SSH on srv268 is CRITICAL: Server answer: [15:17:25] RECOVERY - Apache HTTP on srv261 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.037 second response time [15:17:26] RECOVERY - Apache HTTP on srv261 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.037 second response time [15:17:45] RECOVERY - Apache HTTP on mw55 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.035 second response time [15:17:46] RECOVERY - Apache HTTP on mw55 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.035 second response time [15:17:55] RECOVERY - Apache HTTP on mw8 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.032 second response time [15:17:56] RECOVERY - Apache HTTP on mw8 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.032 second response time [15:18:05] RECOVERY - Apache HTTP on mw1 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.401 second response time [15:18:06] RECOVERY - Apache HTTP on mw1 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.401 second response time [15:18:25] RECOVERY - Apache HTTP on mw27 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.021 second response time [15:18:26] RECOVERY - Apache HTTP on mw27 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.021 second response time [15:18:45] PROBLEM - Disk space on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:18:45] PROBLEM - DPKG on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:18:46] PROBLEM - Disk space on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
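Picking up the earlier suggestion of "a cron to log processes so we can see which bloats": a minimal sketch of a snapshot job that would have shown which process was marching toward swapdeath. The path, interval, and filename are all assumptions:

    # /etc/cron.d/proc-top (hypothetical): record the top memory consumers every 5 minutes
    */5 * * * *  root  (date; ps -eo pid,user,rss,vsz,etime,args --sort=-rss | head -15) >> /var/log/proc-top.log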
[15:18:46] PROBLEM - DPKG on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:19:32] lol [15:19:32] PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND [15:19:32] 24122 apache 39 19 1391m 1.2g 3616 R 38 15.4 63:04.28 php [15:20:15] RECOVERY - Apache HTTP on srv232 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time [15:20:16] RECOVERY - Apache HTTP on srv232 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time [15:21:45] PROBLEM - RAID on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:21:46] PROBLEM - RAID on srv268 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [15:22:03] I suggest stopping all job queue runners [15:22:16] PROBLEM - Apache HTTP on srv268 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:22:16] PROBLEM - Apache HTTP on srv268 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:22:43] or whole cluster will go down [15:22:44] :) [15:23:21] dominoes effect [15:23:56] not dominoes [15:24:01] i was going to say i just got a 502, but it seems i'm not alone [15:24:35] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.055 second response time [15:24:36] RECOVERY - Apache HTTP on mw54 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.055 second response time [15:24:55] RECOVERY - Apache HTTP on mw30 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.030 second response time [15:24:56] RECOVERY - Apache HTTP on mw30 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.030 second response time [15:25:37] RECOVERY - Apache HTTP on mw11 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time [15:25:38] RECOVERY - Apache HTTP on mw11 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.029 second response time [15:25:55] RECOVERY - Apache HTTP on mw12 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.028 second response time [15:25:56] RECOVERY - Apache HTTP on mw12 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.028 second response time [15:26:06] RECOVERY - Apache HTTP on srv190 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.972 second response time [15:26:06] RECOVERY - Apache HTTP on srv190 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 4.972 second response time [15:26:25] RECOVERY - Apache HTTP on mw35 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.038 second response time [15:26:26] RECOVERY - Apache HTTP on mw35 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.038 second response time [15:26:37] RECOVERY - Apache HTTP on mw42 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.616 second response time [15:26:37] RECOVERY - Apache HTTP on mw42 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.616 second response time [15:26:45] RECOVERY - Apache HTTP on srv271 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [15:26:46] RECOVERY - Apache HTTP on srv271 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [15:26:55] PROBLEM - MySQL slave status on es4 is CRITICAL: CRITICAL: Connected threads = 1038 (1000) [15:26:56] PROBLEM - MySQL slave status on es4 is CRITICAL: CRITICAL: Connected threads = 1038 (1000) [15:27:03] Reedy now seems fast. [15:27:18] crap [15:27:26] ? 
[15:27:36] kill all job queues please [15:27:55] RECOVERY - Apache HTTP on srv244 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [15:27:56] RECOVERY - Apache HTTP on srv244 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.033 second response time [15:29:59] what fricking user do I have to run things as now? grrrrr [15:30:08] * domas whistle [15:30:08] dsh -g job-runners pkill -9 -f obs [15:30:31] well if /etc/init.d stop won't do it [15:30:36] I don't know why pkill would [15:30:41] start-stop-daemon: warning: failed to kill 27943: Operation not permitted [15:30:42] etc [15:30:47] what do you mean? [15:30:53] upido it as root? [15:30:54] err [15:30:57] you're doing that as root? [15:31:04] of course I'm doing that as root [15:31:07] how else would I do that? [15:31:08] tried that and it prompted me for the root pwd [15:31:16] domas: i meant apergos [15:31:19] did not prompt for me [15:31:25] * domas is super-master [15:31:29] guess you had better do it then [15:31:45] * domas sighs, bunch of machines in swapdeath [15:31:52] you can reboot them! [15:32:03] four or so! [15:32:26] PROBLEM - Memcached on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:32:27] PROBLEM - Memcached on srv279 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:32:33] when this is over can we document this here: https://wikitech.wikimedia.org/view/Job_queue b/c I've been sitting here on my thumbs for lack of clue how to kill job queues [15:32:36] see the memory drop? thats me! [15:32:47] jeff_green: pkill -f obs [15:32:48] \o/ [15:32:58] obs? [15:33:03] yeah [15:33:04] it's not better kill -9 ? [15:33:18] obs matches jobs-loop and RunJobs [15:33:20] and few others! [15:33:21] domas: yes, but I don't even know where to do it from the documentation I've been able to find [15:33:21] I see [15:33:31] on the job-runner group [15:34:00] /home/config/others/usr/local/dsh/node_groups [15:34:04] they are in here now [15:34:25] a pretty obscure path but linked to from /etc/dsh/group if you remember that's the original location [15:34:31] lesson for today - overprovisioning hardware doesn't mean that you don't have to do operations :))) [15:35:16] lesson for today: if I need to do something in a hurry on a bunch of machines as root, it will prompt me for root on all of them, so I had better find someone else >_< [15:35:18] (and openssl calls are used to access memcached ;-) [15:35:32] apergos: I have no idea what root password is, by the way [15:35:37] probably thats because I never enter it [15:35:44] I didn't enter it, I backed out [15:35:49] ok [15:36:03] I generally don't have an issue with key forwarding [15:36:09] only when it's an emergency >_< [15:36:55] RECOVERY - DPKG on srv263 is OK: All packages OK [15:36:56] RECOVERY - DPKG on srv263 is OK: All packages OK [15:37:05] RECOVERY - Apache HTTP on srv263 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.601 second response time [15:37:06] RECOVERY - Apache HTTP on srv263 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.601 second response time [15:37:33] * domas restarted srv268 [15:37:49] I'll look at 275 then [15:37:52] so 259, 253, 268, 275, 279 [15:38:37] I guess that you know the root mgmt password though (domas) [15:38:47] for the record: [15:38:51] apache 24119 0.0 0.3 186036 25472 ? SN 14:01 0:00 php MWScript.php runJobs.php --wiki=ocwiki --procs=5 --maxtime=300 [15:38:51] apache 24122 80.4 15.4 1425252 1260932 ? 
RN 14:01 63:10 php MWScript.php runJobs.php --wiki=ocwiki --procs=5 --maxtime=300 [15:38:51] apache 30245 0.0 0.0 10740 572 ? SN Jan03 4:12 /bin/bash /usr/local/apache/common/php/maintenance/jobs-loop.sh [15:39:27] did we end up needing the -9 in this case, or did a generic pkill do it? [15:39:27] I'll add an ulimit there [15:39:35] jeff_green: doesn't matter much, does it? :) [15:39:36] RECOVERY - SSH on srv263 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:39:36] RECOVERY - SSH on srv263 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:40:39] domas: ? [15:40:51] is your point that it's now ancient history? [15:41:00] !log midom synchronized php/maintenance/jobs-loop.sh 'adding 400M ulimit' [15:41:15] RECOVERY - Disk space on srv263 is OK: DISK OK [15:41:16] RECOVERY - Disk space on srv263 is OK: DISK OK [15:41:19] I mean, you can kill a script as hard as you want [15:41:28] it is not like we trap signals from within our PHP scripts [15:41:38] it will not be graceful whatever you do, so kill -9 is the most direct way [15:41:55] ok good. i will hereby document that. [15:42:14] so, what was the problem, someone doesn't know management password? :( [15:42:17] domas: and how does facebook do it? you trap signals? [15:42:34] we don't write leaking code! [15:42:40] so no need to kill it [15:42:45] we test and monitor [15:42:47] too [15:43:41] no, that wasn't my quesstion [15:43:49] you said you didn't know the (cluster) root password [15:44:05] Reedy, topic status is right ? [15:44:11] but I expect you know the management one (since you were able to powercycle the one host :-P) [15:44:40] 275, 279, 259 done [15:44:44] tegra, usually it's easier to leave it until it's all been dealt with [15:45:37] back in the day I used to install SSH keys into host mgmt interfaces too [15:45:39] very handy [15:45:43] no need to enter passwords [15:45:51] that would be nice [15:45:56] RECOVERY - MySQL slave status on es2 is OK: OK: [15:45:56] RECOVERY - MySQL slave status on es2 is OK: OK: [15:45:59] all Suns used to be set up that way [15:46:05] RECOVERY - Disk space on srv275 is OK: DISK OK [15:46:05] RECOVERY - DPKG on srv259 is OK: All packages OK [15:46:06] RECOVERY - Disk space on srv275 is OK: DISK OK [15:46:06] RECOVERY - DPKG on srv259 is OK: All packages OK [15:46:12] 268 still looks unhappy, you said you got that one? because 263 looks better now [15:46:15] RECOVERY - Disk space on srv259 is OK: DISK OK [15:46:15] RECOVERY - RAID on srv263 is OK: OK: no RAID installed [15:46:16] RECOVERY - Disk space on srv259 is OK: DISK OK [15:46:16] RECOVERY - RAID on srv263 is OK: OK: no RAID installed [15:46:35] apergos: if you want an alternative to key forwarding see [[access]] @ labsconsole [15:46:51] all I want is something that works in an emergency [15:46:56] btw, need to restart job runners on those machines that just came up! [15:47:00] and resync the file [15:47:05] because they came up with old copy [15:47:07] RECOVERY - MySQL slave status on es4 is OK: OK: [15:47:08] RECOVERY - MySQL slave status on es4 is OK: OK: [15:47:09] which doesn't have memory constraints [15:47:14] and will blow up as soon as it gets to ocwiki [15:47:32] might be worth scapping [15:47:36] something wrong with that wiki? [15:47:45] domas, did you see my q about 263 vs 268? 
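The "adding 400M ulimit" sync above puts a hard cap on each job runner from the wrapper script, since PHP's own memory_limit evidently did not contain the runaway ocwiki job. A minimal sketch of what that kind of guard looks like in a jobs-loop style wrapper; this is an illustration, not the actual contents of php/maintenance/jobs-loop.sh:

    #!/bin/bash
    wiki="${1:?usage: jobs-loop.sh <wiki>}"
    # Cap the address space of everything started from this shell (value is in KB, ~400 MB).
    # A runJobs.php that tries to grow past it now dies with an allocation failure
    # instead of dragging the whole apache into swap.
    ulimit -v 400000
    while true; do
        php MWScript.php runJobs.php --wiki="$wiki" --procs=5 --maxtime=300
        sleep 5
    done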
[15:47:46] something wrong with mediawiki code [15:47:55] RECOVERY - RAID on srv279 is OK: OK: no RAID installed [15:47:56] RECOVERY - RAID on srv279 is OK: OK: no RAID installed [15:47:59] https://wikitech.wikimedia.org/view/Job_queue <-- comments welcome [15:48:05] RECOVERY - SSH on srv259 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:48:06] RECOVERY - SSH on srv259 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:48:08] domas: but seen only with that wiki? [15:48:16] that exact job, probably [15:48:28] oh [15:48:46] doing 268 [15:48:56] done [15:49:13] wikimedia hire people from home ? [15:49:15] RECOVERY - SSH on srv279 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:49:16] RECOVERY - SSH on srv279 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:49:36] from home? [15:49:49] remote work :P [15:49:52] Yes [15:49:57] sure ? [15:50:01] Yup [15:50:04] yeah I'm full time telecommuter for example [15:50:09] There's 3 of us here now [15:50:21] up and running again? [15:50:25] should be [15:50:38] ok then I'll update the #wikipedia status [15:50:48] !log reedy synchronizing Wikimedia installation... : Updates post outage [15:50:50] !log midom synchronized php/maintenance/jobs-loop.sh 'adding 400M ulimit' [15:50:56] RECOVERY - DPKG on srv279 is OK: All packages OK [15:50:56] RECOVERY - DPKG on srv279 is OK: All packages OK [15:50:58] eh [15:51:00] good reedy [15:51:05] RECOVERY - RAID on srv275 is OK: OK: no RAID installed [15:51:06] RECOVERY - RAID on srv275 is OK: OK: no RAID installed [15:51:25] RECOVERY - SSH on srv275 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:51:26] RECOVERY - SSH on srv275 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:51:45] RECOVERY - RAID on srv259 is OK: OK: no RAID installed [15:51:46] RECOVERY - RAID on srv259 is OK: OK: no RAID installed [15:51:50] still quite a few apaches more than usual timing out [15:52:05] RECOVERY - RAID on srv268 is OK: OK: no RAID installed [15:52:06] RECOVERY - RAID on srv268 is OK: OK: no RAID installed [15:52:18] one more probably stupid question--did the dsh approach work in this case or did you end up going to individual hosts? [15:52:25] RECOVERY - Memcached on srv279 is OK: TCP OK - 0.002 second response time on port 11000 [15:52:26] RECOVERY - Memcached on srv279 is OK: TCP OK - 0.002 second response time on port 11000 [15:52:34] jeff_green: Restarting the hosts had to be done manually [15:52:37] Neeed to look at tidying up the crap that scap spews with stupid errors [15:52:40] those which were completely screwed up [15:52:42] I had to go to mgmt [15:52:46] RECOVERY - DPKG on srv275 is OK: All packages OK [15:52:47] RECOVERY - DPKG on srv275 is OK: All packages OK [15:52:51] because of course one couldn't ssh in to them [15:52:54] jeff_green: now, killing job queue on all alive ones was easy [15:53:03] I trid that when I first saw there was a problem but they were already unresponsive [15:53:05] RECOVERY - Disk space on srv279 is OK: DISK OK [15:53:06] sync done. [15:53:06] RECOVERY - Disk space on srv279 is OK: DISK OK [15:53:12] so you power cycled and then logged in to slay the runner? [15:53:21] which sync was that? [15:53:31] no, I could rerun the dsh [15:53:36] ah ok [15:53:54] ok, let's start modified jobsloop [15:53:59] Reedy: or anyone, did a second sync of the jobloop file go around? [15:54:43] i'm presuming domas' sync-file did.. [15:55:02] domas: did you sync a second tie? 
if not I will [15:55:04] *tie [15:55:06] *time [15:55:25] * jeremyb puts some glucose in apergos's m [15:55:31] thanks [15:55:37] any time [15:55:49] I did [15:55:53] ok [15:56:05] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'Debugging for fatal' [15:56:06] * apergos control-c's their command line [15:57:50] there should be increase in job fatals [15:57:54] as now they're limited in memory [15:57:55] heeehe [15:58:03] yayyyyyy! [15:58:09] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'Debugging for fatal' [15:58:11] I wonder why they were able to grow, memory limit is set to 100M [15:58:24] um you restarted them everywhere? [15:58:45] RECOVERY - DPKG on srv268 is OK: All packages OK [15:58:45] RECOVERY - Disk space on srv268 is OK: DISK OK [15:58:46] RECOVERY - DPKG on srv268 is OK: All packages OK [15:58:46] RECOVERY - Disk space on srv268 is OK: DISK OK [15:58:55] RECOVERY - Apache HTTP on srv279 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [15:58:56] RECOVERY - Apache HTTP on srv279 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [15:59:01] Jeff_Green are you a freelancer ? what's your job ? [15:59:15] RECOVERY - SSH on srv268 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:59:16] RECOVERY - SSH on srv268 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [15:59:36] I'm a full time employee, ops engineer "special projects" [15:59:39] * domas eyes [15:59:40] public function memoryLimit() { [15:59:40] // Don't eat all memory on the machine if we get a bad job. [15:59:40] return "150M"; [15:59:40] } [15:59:50] sort of a noob though, I started in July [15:59:50] does it work? :) [16:00:01] jeff green needs a pic [16:00:03] tegra: I'm so horrible they never employed me in any capacity [16:00:04] http://wikimediafoundation.org/wiki/Job_openings/Operations_Engineer_-_Special_Projects [16:00:09] domas: we could use ulimit :-P [16:00:16] jeff_green: thats what I did [16:00:18] ulimit -v 400000 [16:00:26] i like it [16:00:29] oh nice :) it's my dream to work for wikimedia from home :P [16:00:40] :-) [16:00:48] meh, working for wikimedia from home is easy [16:00:48] it's a little isolating, but so far so good [16:00:51] now try working from home for... [16:01:03] domas: It's not horribleness, ... WMF just couldn't afford the stabbing-incident insurance required. [16:01:06] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'Debugging for fatal' [16:01:10] Jeff_Green: get to BOS or NY much? [16:01:15] gmaxwell: I don't stab people [16:01:19] or you mean everyone would be stabbing me? [16:01:23] gmaxwell: hah! [16:01:25] all I do is make them cry [16:01:40] jeremyb: we moved out here in August, haven't been to Boston yet but we were in NYC this weekend! [16:01:46] Stabbing, finger breaking .. kill -9ing .. same insurance plan. [16:02:01] let's alias "stab" to kill -9 [16:02:12] let's not [16:02:17] I want to actually stab dataset1 [16:02:24] and by that I do not mean kill -9 it [16:02:25] Jeff_Green: ohhh! well maybe see you in NYC sometime. unless last weekend was an anomaly! [16:02:26] ok, I would break fingers [16:02:36] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'Debugging for fatal' [16:02:37] gmaxwell: my better half is already moving to IAD tomorrow [16:02:49] gmaxwell: my flight leaves an hour earlier, but I'm west coast bound this time [16:02:59] apergos: kerosene? 
[16:03:01] FIRST CLASS TRAVEL BABY [16:03:18] hmm. no, not unless that's the last thing [16:03:25] it's too fast, see [16:03:30] At the moment I'm in Kansas city of all places. [16:03:32] jeremyb: I'd imagine we'll make it there a couple times a year, we'll have to plan an east-coast gathering or something [16:03:33] Operations Engineer - Special Projects, only this in remote ? [16:03:48] no [16:04:10] tegra: ~ half of all ppl are remote? (wild guess, don't quote) [16:04:20] nah, less than that [16:04:29] most of hte ops team used to be [16:04:36] but slowly people have been sucked in to moving to sf [16:04:49] Ashame. Made round the clock coverage a bit simpler. [16:05:00] European people still resists though :b [16:05:12] yes we do [16:05:18] I guess it would need a WMF headquarter in Europe to make us move in an office [16:05:23] and I went the other way, left SF for the east coast [16:05:29] i'm from switzerland i think it's very difficult to work for wikimedia. [16:05:30] Jeff_Green: maybe too soon to come back but there's a conference we're doing at NYU in 9 days (wikipedia birthday party) [16:05:58] drdee: ^^^^ are you in toronto? wikipedia bday party + conference in NY on 28th if you're interested [16:06:46] gmaxwell: people being employed made round the clock coverage way more complicated [16:06:52] gmaxwell: we all were working crazy hours as volunteers [16:06:57] jeremyb: might be possible, I'm not sure [16:07:01] once you're employed, you start honoring 5x8 [16:07:13] who does that?? [16:08:03] who does honor 5x8 ? [16:09:01] me probably on average, although I did more during the fundraiser for sure [16:09:29] well, 5*8 is a bit an exaggeration [16:09:37] but again, in early days we were doing crazy hours and were not being paid [16:09:55] I think I did 60+/week on wikipedia at times [16:10:01] yeah, that's not sustainable long term though [16:10:03] maybe even 80 [16:10:11] but how sustainable is that? can you do that for 5 years without burning out? for 10? [16:10:20] cause we've all been through that [16:10:25] you can do that for years and years [16:10:28] imo tech people are most productive if allowed to crank when they're motivated and rest when they're not [16:10:35] hehe, true [16:10:37] my dad been working 60+ hours per week for the last 34 years [16:10:42] ok well you can but it turns out that I can't (and a lot of people I know can't) [16:11:37] i can sit at a desk for 12 hours a day but that doesn't mean my brain will produce useful output that whole time [16:11:41] if you count regular work + wikipedia volunteering + freelance consulting, I have been working roughly 50hours a week for the last 3 years at least [16:11:48] apergos: Meh. What else are you going to do with your time? Play golf? [16:11:52] no it doesn't count [16:12:01] wiki* volunteering is separate [16:12:07] the context shift keeps you motivated [16:12:14] oh, if you count regular work + wikipedia volunteering + gaming, I've been working 160! [16:12:19] hahah [16:12:21] no gaming [16:12:27] that *definitely* doesn't count [16:12:29] :))) [16:12:31] but what if he gets paid for gaming??? [16:12:34] Jeff_Green: sure, and you can have context shifts within a job too. [16:12:35] RECOVERY - Apache HTTP on srv268 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [16:12:36] RECOVERY - Apache HTTP on srv268 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [16:12:48] what if there's a manager scolding you for failing to meet your gaming hours quota? 
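The memory capping being discussed works at two levels: a job class can override memoryLimit() (the pasted snippet returns "150M" as the per-job PHP limit), but since some jobs still outgrew that, the jobs-loop.sh change adds a second, OS-level cap with ulimit before any PHP starts. A minimal sketch of that shape of wrapper; the 400000 KB value matches the "!log ... adding 400M ulimit" earlier, while the loop body is only a placeholder for the real script:

    #!/bin/bash
    # ulimit -v takes kilobytes, so 400000 is roughly 400M; it caps the virtual
    # memory of this shell and of every PHP process it forks, so a runaway job
    # dies with a fatal instead of exhausting the machine
    ulimit -v 400000
    while true; do
        # placeholder invocation; the real loop picks wikis and job types itself
        php maintenance/runJobs.php --maxjobs 300
        sleep 5
    done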
[16:12:51] I found that I can be extremely annoying, sit in a small room, put a portable radar and meet everyone from random position with SMG gunfire! [16:12:55] Yea. "Domas clicked a cow. 324273827 times." "Tests passed!" [16:13:00] hehe [16:13:07] gmaxwell: didn't they shut down the cow clicker? [16:13:11] there was a great wired article about it [16:13:16] domas: yes. Indeed. [16:13:26] I missed out (not at all) [16:13:35] imagine all those mems I will never miss... [16:13:36] oh, cow clicker was awesome [16:14:27] apergos: http://www.wired.com/magazine/2011/12/ff_cowclicker/all/1 [16:14:28] I just remember worrying that it might not be polite to ask Cary if he knew it was supposted to be ironic. [16:14:45] RECOVERY - Apache HTTP on srv275 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [16:14:46] RECOVERY - Apache HTTP on srv275 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.019 second response time [16:14:51] gmaxwell: lol [16:14:59] it is amazing that even that formed a community [16:15:02] do I read it? or do I remain in blissfull ignorance? [16:15:12] * apergos compromises with a quick scan for the funny bits [16:15:14] well, I just suggested reading it [16:15:22] it is a great insight into all [a]social gaming [16:16:03] I am still wondering how many years of life are wasted playing Angry Birds [16:16:13] otoh, apergos doesn't seem to be social [16:16:21] I share a great article, and all I get is skepticism [16:16:24] :) [16:16:25] antisocial, that's me [16:16:31] such people end up in wikipedia [16:16:39] rather than social platforms like facebook!!11 [16:16:40] :) [16:16:44] apergos: no way. You are on IRC!!! [16:16:46] * domas eyes at gmaxwell [16:16:47] which is odd cause I don't actually edit wikipedia [16:16:48] ever [16:16:54] hashar: you're allowed to be antisocial on IRC [16:16:56] that is the first social network thing being widely used!! [16:17:28] hehe [16:17:31] BBS! [16:17:38] fidonet! [16:17:39] cowthulhu bwahahahaha [16:17:39] Fido was social network too [16:17:44] ok that made the article worth reading [16:17:45] hehe [16:17:51] BBS were awesomes [16:17:52] Jeff_Green: 2:471/23.213 was me! [16:17:59] awesome [16:18:40] I used fido too. ... and I hope no one ever finds any of the idiotic things I must have been posting when I was 15. [16:18:53] okay,... hands up who was online pre-DNS? :P [16:19:13] used Minitel :b [16:19:28] I remember Minitel - was pretty neat [16:19:35] http://en.wikipedia.org/wiki/Minitel [16:19:36] I started in the days of 1200 baud [16:19:52] here comes woosters [16:20:00] run, everyone [16:20:02] hi Domas! [16:20:08] hmm I was online when you had to dial into the tacacs to get to the stanford node to get to a given arpanet node, does that count? [16:20:21] woosters: are you going to show up on sunday? at the hackathon? [16:20:22] 1200 baud? [16:20:23] sheesh [16:20:25] that'd be about what was current when I got net access Jeff_Green [16:20:31] 110 baud and it was screamingly fast [16:20:34] will be there for a short while [16:20:37] New patchset: Mark Bergsma; "Revert "Google and possibly others are rate limiting our new ip, so use the old server(s) for delayed messages (for now)"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1977 [16:20:39] brianmc, found it [16:20:44] it is Chinese New Yea eve [16:20:51] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1977 [16:20:52] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1977 [16:20:52] dial into an edinburgh uni pad, then use bangpaths to send emails [16:20:54] Got me beat.. I would have firsted used a BBS in 1985/1986 and didn't get internet access until 1992 or so (via tymnet). [16:20:54] woosters: I may go there as early as possible, jetlagged kind of [16:20:56] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1977 [16:20:57] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1977 [16:21:02] woosters: if you're driving there for the start, pick me up! [16:21:05] RECOVERY - Apache HTTP on srv259 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [16:21:06] RECOVERY - Apache HTTP on srv259 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.024 second response time [16:21:22] otherwise I'll take the caltrain [16:21:24] toot-toot [16:21:42] will do if I am [16:21:47] ok:) [16:21:51] u will be here on Sat? [16:21:55] or Fri? [16:22:12] I'm landing Fri-noon [16:22:46] I was going through old logs and found copies of the EFF gopher site from like 1993/4 that were pretty amusing. [16:22:55] !log reedy synchronized php-1.18/extensions/GoogleNewsSitemap/FeedSMItem.php 'r109543' [16:22:55] ohmygod [16:22:55] http://technolog.msnbc.msn.com/_news/2012/01/19/10185978-wikipedia-traffic-surged-during-sopa-blackout [16:22:58] brianmc, ^ [16:22:59] * domas got mentioned by MSNBC [16:23:19] Reedy, whilst you're looking at the newsfeed code - how frequently will it be updated? [16:23:21] What did you do? [16:23:34] well, not sure if TV mentioned that [16:25:58] brianmc, it has a 30 minute squid cache [16:27:22] ewww. That's not good if it delays a published page by 29 minutes [16:27:46] Well, that's a completely different issue :p [16:28:49] brianmc: use facebook!!! it is real time!!!1 [16:29:13] FB is only realtime because I post articles to the wikinews page when I publish them! [16:29:36] =) [16:34:52] That's the feed functioning again Reedy, thanks for the quick turnaround (despite other things going a bit wonky) [16:42:21] New patchset: Mark Bergsma; "Update MTA hostnames" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1978 [16:42:37] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1978 [16:43:08] New review: Mark Bergsma; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1978 [16:43:09] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1978 [16:45:35] PROBLEM - Puppet freshness on db22 is CRITICAL: Puppet has not run in the last 10 hours [16:45:35] PROBLEM - Puppet freshness on db22 is CRITICAL: Puppet has not run in the last 10 hours [16:58:06] I have a question about the default editing toolbar on WP. [17:00:05] Does the default editing toolbar include Cite-Template? [17:26:25] <[Haekchen]> Hello! Is there an IRC log for this channel somewhere? [17:26:51] no, it's not publiccally logged [18:07:21] Does the default editing toolbar include Cite-Template? I'm asking because I think new editors would benefit from having access to that instead of constructing refs from scratch. [18:19:35] Does anyone know the answer to my question? [18:32:17] Thanks anyway, someone in -en answered. [19:07:12] New patchset: Bhartshorne; "taking out the SOPA filter now that the blackout is over." 
[operations/puppet] (production) - https://gerrit.wikimedia.org/r/1979 [19:07:28] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1979 [19:07:54] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1979 [19:07:55] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1979 [19:16:36] PROBLEM - Puppet freshness on mw1096 is CRITICAL: Puppet has not run in the last 10 hours [19:16:36] PROBLEM - Puppet freshness on mw1096 is CRITICAL: Puppet has not run in the last 10 hours [19:35:32] New patchset: Bhartshorne; "adding rules for the new ms-fe hosts" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1980 [19:35:48] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1980 [19:36:42] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1980 [19:36:42] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1980 [19:46:39] !log testing [19:46:45] !log come on bot! [19:48:20] LeslieCarr, doesn't it need morebots? [19:48:34] It always gets confusing [19:49:16] gah [19:49:19] i'm not sure [19:49:19] :) [19:49:23] know where morebots lives ? [19:49:46] I think that lives on the server that runs wikitech [19:49:56] Which apparently was being migrated earlier today [19:51:21] ahha on the linode server :) [19:51:22] http://wikitech.wikimedia.org/view/Morebots [19:51:24] yus [19:53:16] not that much earlier [19:53:24] I got email at um [19:54:05] oh. now it's somewhat earlier. anyways about 2 hours ago saying "woops sorry, *now* we're almost ready to migrate the data") [19:54:10] lol [19:54:12] useful [19:57:25] so, do we need to get andrew ? [19:57:27] and which andrew ? [19:57:31] Werdna [19:57:35] And no, you shouldn't need him [19:58:20] i think it's just a case of restarting it when the migration has happened [19:59:58] let's see if it joins :) [20:00:04] ~log testing [20:00:07] !log testing [20:00:09] Logged the message, Mistress of the network gear. [20:00:13] yay! [20:05:16] Anyone here who is somehow involved with developing the WikiMiniAtlas? 
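On the feed delay raised earlier: a 30 minute Squid max-age means a freshly published item can be served from a stale cached copy for up to that long unless the object is purged when the page changes. In a classic Squid setup the purge is an HTTP PURGE request from an allowed client; whether the production caches accept that, and for which URLs, is configuration the discussion does not show, so this is a generic sketch with a placeholder URL rather than the Wikimedia procedure:

    # inspect how long the cached copy is allowed to live
    curl -sI 'http://example.org/wiki/Special:SomeFeed' | grep -i -E '^(cache-control|age):'
    # ask the cache to drop the object so the next request reaches the backend
    curl -X PURGE 'http://example.org/wiki/Special:SomeFeed'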
[20:26:35] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:26:36] PROBLEM - Puppet freshness on spence is CRITICAL: Puppet has not run in the last 10 hours [20:46:35] PROBLEM - Puppet freshness on bast1001 is CRITICAL: Puppet has not run in the last 10 hours [20:46:35] PROBLEM - Puppet freshness on bast1001 is CRITICAL: Puppet has not run in the last 10 hours [20:48:14] zzz =_= [20:50:35] PROBLEM - Puppet freshness on fenari is CRITICAL: Puppet has not run in the last 10 hours [20:50:35] PROBLEM - Puppet freshness on fenari is CRITICAL: Puppet has not run in the last 10 hours [21:26:47] New patchset: Ryan Lane; "Making ipv6 enabled or disabled per domain" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1981 [21:27:10] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1981 [21:27:11] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1981 [22:05:25] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 168 MB (2% inode=60%): /var/lib/ureadahead/debugfs 168 MB (2% inode=60%): [22:05:25] PROBLEM - Disk space on srv223 is CRITICAL: DISK CRITICAL - free space: / 168 MB (2% inode=60%): /var/lib/ureadahead/debugfs 168 MB (2% inode=60%): [22:33:58] TimStarling, are you busy? Got a hopefully quick-ish question [22:34:12] fire away [22:34:27] So, bug 33808 suggests the interwikimap is empty on all wikisources [22:35:00] quick bit of digging suggests that the api before 1.19 read from the interwiki table, not the interwiki cache (this is now fixed as of r92528) [22:35:32] Is it worth populating the interwiki tables on the wikisource projects, or just leave it till it's fixed properly with the 1.19 release? [22:35:35] RECOVERY - Disk space on srv223 is OK: DISK OK [22:35:35] RECOVERY - Disk space on srv223 is OK: DISK OK [22:36:22] I don't know if the script we use to generate interwiki links even updates the table anymore [22:36:26] probably it doesn't [22:37:52] yeah, see maintenance/dumpInterwiki.php in 1.18wmf1 [22:38:55] basically when Domas introduced the interwiki cache, he just copied rebuildInterwiki.php to dumpInterwiki.php and changed the SQL bits to write to a CDB file instead [22:39:06] and since then, only dumpInterwiki.php was updated [22:39:23] ah [22:39:30] yay code duplication [22:39:40] rebuildInterwiki.php has got wikisource in it, it's not quite that old [22:39:50] but I doubt anyone has run it in the last 3 years [22:41:02] domas is kind of results-focused, he probably figured someone else would clean it up later [22:41:37] Yeaah [22:41:56] is it worth dragging the script back upto date? 
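To put the table/cache split in one place: dumpInterwiki.php regenerates the CDB file that the 1.19 API reads, while the older rebuildInterwiki.php copy, as the discussion goes on to note, emits SQL text that would repopulate the interwiki tables the pre-1.19 API still consults. A rough sketch of the shape of that operation only; neither script's options nor where its output is meant to go are shown here, and the database name below is a placeholder:

    # rebuild the CDB interwiki cache used by the 1.19 API
    php maintenance/dumpInterwiki.php
    # the older script emits SQL text instead of running queries itself,
    # so repopulating one wiki's interwiki table would look roughly like:
    php maintenance/rebuildInterwiki.php > interwiki.sql
    mysql some_wikisource_db < interwiki.sql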
[22:42:08] I can see it having it's uses for 3 party users [22:42:17] *3rd [22:43:04] maybe merging them would be useful [22:43:47] rebuildInterwiki.php actually output SQL text instead of doing the SQL queries directly [22:43:59] I'm sure I had a good reason for that at the time, but it's a bit out of date now [22:44:18] Yeah, doing a truncate and reinsert isn't going to be that bad for the size of the table [22:47:03] Somewhat annoyingly, it's still WMF specific, referencing files in /h/w/c [22:47:46] Probably worth spending a bit of time to tidy up/merge these scripts, and then update all the interwiki tables [22:48:25] I really hate the interwiki map page on meta [22:48:39] Hah [22:48:49] I just found one pointing to wg.en.wikipedia.org [22:48:59] whenever anyone adds a prefix there, it breaks pages [22:49:06] But there is a lot of random wikis I can't believe we care about [22:49:32] so maybe if you want to run dumpInterwiki for real again, you could use a text file based on a snapshot from meta, rather than the actual meta page [22:49:37] a snapshot from 2008 or so [22:50:38] enwiki has like 660 rows in the interwiki table [22:50:48] New patchset: Asher; "make sure npre.d/* is included" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1982 [22:50:56] Could almost dump that and reimport it.. Obviously is going to be somehwat out of date [22:51:03] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1982 [22:51:25] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1982 [22:51:26] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1982 [22:52:37] !log recompiled wikidiff2 and put the new version up on apt.wikimedia.org [22:52:38] Logged the message, Master [22:57:04] New patchset: Ryan Lane; "Fix ipv6 check logic" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1983 [22:57:19] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1983 [22:57:28] New review: Ryan Lane; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1983 [22:57:29] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1983 [23:15:39] !log rebuilt wikidiff2 with package name php-wikidiff2, removed lucid package php5-wikidiff2 from apt using "reprepro remove" [23:15:40] Logged the message, Master [23:20:15] PROBLEM - DPKG on srv261 is CRITICAL: Connection refused by host [23:20:16] PROBLEM - DPKG on srv261 is CRITICAL: Connection refused by host [23:24:46] PROBLEM - RAID on srv261 is CRITICAL: Connection refused by host [23:24:46] PROBLEM - RAID on srv261 is CRITICAL: Connection refused by host [23:24:56] PROBLEM - DPKG on srv188 is CRITICAL: Connection refused by host [23:24:56] PROBLEM - DPKG on srv188 is CRITICAL: Connection refused by host [23:25:12] New patchset: Asher; "root .my.cnf on all cluster dbs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1984 [23:25:27] New review: gerrit2; "Lint check passed." 
[operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/1984 [23:26:29] New review: Asher; "(no comment)" [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1984 [23:26:30] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/1984 [23:27:35] PROBLEM - RAID on srv267 is CRITICAL: Connection refused by host [23:27:35] PROBLEM - Disk space on srv261 is CRITICAL: Connection refused by host [23:27:36] PROBLEM - RAID on srv267 is CRITICAL: Connection refused by host [23:27:36] PROBLEM - Disk space on srv261 is CRITICAL: Connection refused by host [23:28:16] PROBLEM - Disk space on srv267 is CRITICAL: Connection refused by host [23:28:16] PROBLEM - Disk space on srv267 is CRITICAL: Connection refused by host [23:28:16] PROBLEM - RAID on srv188 is CRITICAL: Connection refused by host [23:28:16] PROBLEM - RAID on srv188 is CRITICAL: Connection refused by host [23:28:25] PROBLEM - Disk space on srv188 is CRITICAL: Connection refused by host [23:28:26] PROBLEM - Disk space on srv188 is CRITICAL: Connection refused by host [23:30:15] RECOVERY - DPKG on srv261 is OK: All packages OK [23:30:16] RECOVERY - DPKG on srv261 is OK: All packages OK [23:33:46] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1471s [23:33:46] PROBLEM - Misc_Db_Lag on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1471s [23:34:05] PROBLEM - DPKG on srv267 is CRITICAL: Connection refused by host [23:34:06] PROBLEM - DPKG on srv267 is CRITICAL: Connection refused by host [23:34:35] RECOVERY - RAID on srv261 is OK: OK: no RAID installed [23:34:36] RECOVERY - RAID on srv261 is OK: OK: no RAID installed [23:34:55] RECOVERY - DPKG on srv188 is OK: All packages OK [23:34:56] RECOVERY - DPKG on srv188 is OK: All packages OK [23:37:25] RECOVERY - RAID on srv267 is OK: OK: no RAID installed [23:37:26] RECOVERY - RAID on srv267 is OK: OK: no RAID installed [23:37:45] RECOVERY - Disk space on srv261 is OK: DISK OK [23:37:46] RECOVERY - Disk space on srv261 is OK: DISK OK [23:38:05] RECOVERY - Disk space on srv267 is OK: DISK OK [23:38:06] RECOVERY - Disk space on srv267 is OK: DISK OK [23:38:15] RECOVERY - RAID on srv188 is OK: OK: no RAID installed [23:38:15] RECOVERY - Disk space on srv188 is OK: DISK OK [23:38:16] RECOVERY - RAID on srv188 is OK: OK: no RAID installed [23:38:16] RECOVERY - Disk space on srv188 is OK: DISK OK [23:38:35] PROBLEM - MySQL replication status on db1025 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1761s [23:38:36] PROBLEM - MySQL replication status on db1025 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1761s [23:39:25] PROBLEM - RAID on mw1122 is CRITICAL: Connection refused by host [23:39:25] PROBLEM - Disk space on tarin is CRITICAL: Connection refused by host [23:39:26] PROBLEM - RAID on mw1122 is CRITICAL: Connection refused by host [23:39:26] PROBLEM - Disk space on tarin is CRITICAL: Connection refused by host [23:39:35] PROBLEM - DPKG on mw1016 is CRITICAL: Connection refused by host [23:39:36] PROBLEM - DPKG on mw1016 is CRITICAL: Connection refused by host [23:39:45] PROBLEM - DPKG on ms5 is CRITICAL: Connection refused by host [23:39:45] PROBLEM - DPKG on mw1044 is CRITICAL: Connection refused by host [23:39:45] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1831s 
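Returning to the wikidiff2 package work logged just before this monitoring burst: swapping a renamed package in a reprepro-managed apt repository is a remove of the old name from the distribution followed by an includedeb of the new build. A small sketch of those two steps; the repository base directory, distribution codename and .deb filename are assumptions, only the "reprepro remove" itself comes from the log:

    # drop the old lucid package name from the repository
    reprepro -b /srv/wikimedia remove lucid-wikimedia php5-wikidiff2
    # publish the rebuilt package under its new name
    reprepro -b /srv/wikimedia includedeb lucid-wikimedia php-wikidiff2_*_amd64.deb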
[23:39:46] PROBLEM - DPKG on ms5 is CRITICAL: Connection refused by host [23:39:46] PROBLEM - DPKG on mw1044 is CRITICAL: Connection refused by host [23:39:46] PROBLEM - MySQL replication status on storage3 is CRITICAL: CHECK MySQL REPLICATION - lag - CRITICAL - Seconds_Behind_Master : 1831s [23:39:55] PROBLEM - Disk space on mw1050 is CRITICAL: Connection refused by host [23:39:55] PROBLEM - RAID on mw44 is CRITICAL: Connection refused by host [23:39:55] PROBLEM - Disk space on cp1043 is CRITICAL: Connection refused by host [23:39:55] PROBLEM - Disk space on mw1084 is CRITICAL: Connection refused by host [23:39:55] PROBLEM - RAID on mw1044 is CRITICAL: Connection refused by host [23:39:56] PROBLEM - Disk space on mw1050 is CRITICAL: Connection refused by host [23:39:56] PROBLEM - RAID on mw44 is CRITICAL: Connection refused by host [23:39:56] PROBLEM - Disk space on cp1043 is CRITICAL: Connection refused by host [23:39:56] PROBLEM - Disk space on mw1084 is CRITICAL: Connection refused by host [23:39:56] PROBLEM - RAID on mw1044 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - RAID on srv192 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - Disk space on mw1044 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - Disk space on db1018 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - DPKG on srv200 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - Disk space on srv243 is CRITICAL: Connection refused by host [23:40:05] PROBLEM - DPKG on mw72 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - DPKG on mw1050 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - RAID on srv192 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - Disk space on mw1044 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - Disk space on db1018 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - DPKG on srv200 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - Disk space on srv243 is CRITICAL: Connection refused by host [23:40:06] PROBLEM - Disk space on srv232 is CRITICAL: Connection refused by host [23:40:07] PROBLEM - RAID on srv190 is CRITICAL: Connection refused by host [23:40:07] PROBLEM - DPKG on mw72 is CRITICAL: Connection refused by host [23:40:07] PROBLEM - DPKG on mw1050 is CRITICAL: Connection refused by host [23:40:07] PROBLEM - Disk space on srv232 is CRITICAL: Connection refused by host [23:40:07] PROBLEM - RAID on srv190 is CRITICAL: Connection refused by host [23:40:15] PROBLEM - RAID on mw1095 is CRITICAL: Connection refused by host [23:40:15] PROBLEM - DPKG on mw1058 is CRITICAL: Connection refused by host [23:40:16] PROBLEM - RAID on mw1095 is CRITICAL: Connection refused by host [23:40:16] PROBLEM - DPKG on mw1058 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - Disk space on db1033 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - RAID on mw1050 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - RAID on srv200 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - RAID on db1033 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - RAID on db1017 is CRITICAL: Connection refused by host [23:40:25] PROBLEM - RAID on mw1041 is CRITICAL: Connection refused by host [23:40:26] PROBLEM - Disk space on db1033 is CRITICAL: Connection refused by host [23:40:26] PROBLEM - RAID on mw1050 is CRITICAL: Connection refused by host [23:40:26] PROBLEM - RAID on srv200 is CRITICAL: Connection refused by host [23:40:26] PROBLEM - RAID on db1033 is CRITICAL: 
Connection refused by host [23:40:26] PROBLEM - RAID on db1017 is CRITICAL: Connection refused by host [23:40:26] PROBLEM - Disk space on srv220 is CRITICAL: Connection refused by host [23:40:27] PROBLEM - DPKG on aluminium is CRITICAL: Connection refused by host [23:40:27] PROBLEM - RAID on mw1041 is CRITICAL: Connection refused by host [23:40:27] PROBLEM - Disk space on srv220 is CRITICAL: Connection refused by host [23:40:27] PROBLEM - DPKG on aluminium is CRITICAL: Connection refused by host [23:40:35] PROBLEM - DPKG on db1006 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - RAID on ms5 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - DPKG on db1005 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - MySQL disk space on db1001 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - MySQL disk space on es4 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - DPKG on srv192 is CRITICAL: Connection refused by host [23:40:35] PROBLEM - RAID on mw1131 is CRITICAL: Connection refused by host [23:40:36] PROBLEM - DPKG on db1006 is CRITICAL: Connection refused by host [23:40:36] PROBLEM - RAID on ms5 is CRITICAL: Connection refused by host [23:40:36] PROBLEM - DPKG on db1005 is CRITICAL: Connection refused by host [23:40:36] PROBLEM - MySQL disk space on db1001 is CRITICAL: Connection refused by host [23:40:36] PROBLEM - MySQL disk space on es4 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - DPKG on db1010 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - MySQL disk space on db1031 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - DPKG on srv192 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - RAID on mw1131 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - DPKG on db1048 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - DPKG on db1010 is CRITICAL: Connection refused by host [23:40:37] PROBLEM - MySQL disk space on db1031 is CRITICAL: Connection refused by host [23:40:38] PROBLEM - DPKG on db1048 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - Disk space on mw1156 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - Disk space on srv227 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - DPKG on mw44 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - DPKG on mw1083 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - RAID on db1008 is CRITICAL: Connection refused by host [23:40:45] PROBLEM - DPKG on db1015 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - RAID on db1010 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - Disk space on mw1156 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - Disk space on srv227 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - DPKG on mw44 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - DPKG on mw1083 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - RAID on db1008 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - MySQL disk space on db1006 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - DPKG on es2 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - DPKG on db1015 is CRITICAL: Connection refused by host [23:40:46] PROBLEM - RAID on db1010 is CRITICAL: Connection refused by host [23:40:47] PROBLEM - Disk space on mw1016 is CRITICAL: Connection refused by host [23:40:47] PROBLEM - Disk space on db1010 is CRITICAL: Connection refused by host [23:40:47] PROBLEM - MySQL disk space on db1006 is CRITICAL: 
Connection refused by host [23:40:47] PROBLEM - DPKG on es2 is CRITICAL: Connection refused by host [23:40:48] PROBLEM - Disk space on mw1016 is CRITICAL: Connection refused by host [23:40:48] PROBLEM - Disk space on db1010 is CRITICAL: Connection refused by host [23:40:55] PROBLEM - DPKG on emery is CRITICAL: Connection refused by host [23:40:55] PROBLEM - RAID on cp1043 is CRITICAL: Connection refused by host [23:40:55] PROBLEM - RAID on srv229 is CRITICAL: Connection refused by host [23:40:55] PROBLEM - DPKG on db1002 is CRITICAL: Connection refused by host [23:40:55] PROBLEM - Disk space on srv200 is CRITICAL: Connection refused by host [23:40:56] PROBLEM - DPKG on emery is CRITICAL: Connection refused by host [23:40:56] PROBLEM - RAID on cp1043 is CRITICAL: Connection refused by host [23:40:56] PROBLEM - RAID on srv229 is CRITICAL: Connection refused by host [23:40:56] PROBLEM - DPKG on db1002 is CRITICAL: Connection refused by host [23:40:56] PROBLEM - Disk space on srv200 is CRITICAL: Connection refused by host [23:41:02] o_0 [23:41:05] PROBLEM - RAID on mw1142 is CRITICAL: Connection refused by host [23:41:05] PROBLEM - DPKG on mw1100 is CRITICAL: Connection refused by host [23:41:05] PROBLEM - RAID on mw1156 is CRITICAL: Connection refused by host [23:41:05] PROBLEM - DPKG on virt3 is CRITICAL: Connection refused by host [23:41:05] PROBLEM - RAID on mw1141 is CRITICAL: Connection refused by host [23:41:06] PROBLEM - RAID on mw1142 is CRITICAL: Connection refused by host [23:41:06] PROBLEM - DPKG on mw1100 is CRITICAL: Connection refused by host [23:41:06] PROBLEM - RAID on mw1156 is CRITICAL: Connection refused by host [23:41:06] PROBLEM - DPKG on virt3 is CRITICAL: Connection refused by host [23:41:06] PROBLEM - RAID on mw1141 is CRITICAL: Connection refused by host [23:41:15] PROBLEM - DPKG on srv225 is CRITICAL: Connection refused by host [23:41:15] PROBLEM - Disk space on db1001 is CRITICAL: Connection refused by host [23:41:15] PROBLEM - RAID on srv286 is CRITICAL: Connection refused by host [23:41:16] PROBLEM - DPKG on srv225 is CRITICAL: Connection refused by host [23:41:16] PROBLEM - Disk space on db1001 is CRITICAL: Connection refused by host [23:41:16] PROBLEM - RAID on srv286 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - DPKG on mw1031 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - RAID on db1031 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - Disk space on mw1095 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - DPKG on mw1028 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - Disk space on mw58 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - Disk space on srv195 is CRITICAL: Connection refused by host [23:41:25] PROBLEM - DPKG on mw1069 is CRITICAL: Connection refused by host [23:41:26] PROBLEM - DPKG on mw1031 is CRITICAL: Connection refused by host [23:41:26] PROBLEM - RAID on db1031 is CRITICAL: Connection refused by host [23:41:26] PROBLEM - Disk space on mw1095 is CRITICAL: Connection refused by host [23:41:26] PROBLEM - DPKG on mw1028 is CRITICAL: Connection refused by host [23:41:26] PROBLEM - Disk space on mw58 is CRITICAL: Connection refused by host [23:41:27] PROBLEM - Disk space on cp1041 is CRITICAL: Connection refused by host [23:41:27] PROBLEM - Disk space on srv195 is CRITICAL: Connection refused by host [23:41:27] PROBLEM - DPKG on mw1069 is CRITICAL: Connection refused by host [23:41:27] PROBLEM - Disk space on cp1041 is CRITICAL: Connection refused by host [23:41:35] PROBLEM 
- RAID on srv218 is CRITICAL: Connection refused by host [23:41:35] PROBLEM - Disk space on db1043 is CRITICAL: Connection refused by host [23:41:36] PROBLEM - RAID on srv218 is CRITICAL: Connection refused by host [23:41:36] PROBLEM - Disk space on db1043 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on srv209 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on mw72 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on searchidx2 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on srv209 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on mw72 is CRITICAL: Connection refused by host [23:41:46] PROBLEM - Disk space on searchidx2 is CRITICAL: Connection refused by host [23:41:55] PROBLEM - Disk space on srv210 is CRITICAL: Connection refused by host [23:41:55] PROBLEM - MySQL disk space on db1038 is CRITICAL: Connection refused by host [23:41:55] PROBLEM - Disk space on es2 is CRITICAL: Connection refused by host [23:41:55] PROBLEM - DPKG on srv195 is CRITICAL: Connection refused by host [23:41:55] PROBLEM - Disk space on snapshot2 is CRITICAL: Connection refused by host [23:41:56] PROBLEM - Disk space on srv210 is CRITICAL: Connection refused by host [23:41:56] PROBLEM - MySQL disk space on db1038 is CRITICAL: Connection refused by host [23:41:56] PROBLEM - Disk space on es2 is CRITICAL: Connection refused by host [23:41:56] PROBLEM - DPKG on srv195 is CRITICAL: Connection refused by host [23:41:56] PROBLEM - Disk space on snapshot2 is CRITICAL: Connection refused by host [23:42:05] PROBLEM - Disk space on virt4 is CRITICAL: Connection refused by host [23:42:05] PROBLEM - Disk space on es1002 is CRITICAL: Connection refused by host [23:42:05] PROBLEM - Disk space on mw1089 is CRITICAL: Connection refused by host [23:42:05] PROBLEM - RAID on mw72 is CRITICAL: Connection refused by host [23:42:05] PROBLEM - DPKG on srv218 is CRITICAL: Connection refused by host [23:42:06] PROBLEM - Disk space on virt4 is CRITICAL: Connection refused by host [23:42:06] PROBLEM - Disk space on es1002 is CRITICAL: Connection refused by host [23:42:06] PROBLEM - Disk space on mw1089 is CRITICAL: Connection refused by host [23:42:06] PROBLEM - RAID on mw72 is CRITICAL: Connection refused by host [23:42:06] PROBLEM - DPKG on srv218 is CRITICAL: Connection refused by host [23:42:15] PROBLEM - DPKG on virt4 is CRITICAL: Connection refused by host [23:42:15] PROBLEM - MySQL disk space on db1029 is CRITICAL: Connection refused by host [23:42:15] PROBLEM - Disk space on mw1001 is CRITICAL: Connection refused by host [23:42:15] PROBLEM - DPKG on db45 is CRITICAL: Connection refused by host [23:42:15] PROBLEM - Disk space on mw1010 is CRITICAL: Connection refused by host [23:42:16] PROBLEM - DPKG on virt4 is CRITICAL: Connection refused by host [23:42:16] PROBLEM - MySQL disk space on db1029 is CRITICAL: Connection refused by host [23:42:16] PROBLEM - Disk space on mw1001 is CRITICAL: Connection refused by host [23:42:16] PROBLEM - DPKG on db45 is CRITICAL: Connection refused by host [23:42:16] PROBLEM - Disk space on mw1010 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - MySQL disk space on db1035 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - DPKG on srv286 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - RAID on es4 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - Disk space on mw44 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - Disk space 
on mw65 is CRITICAL: Connection refused by host [23:42:25] PROBLEM - DPKG on srv226 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - MySQL disk space on db1035 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - DPKG on srv286 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - RAID on es4 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - Disk space on mw44 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - Disk space on mw65 is CRITICAL: Connection refused by host [23:42:26] PROBLEM - DPKG on srv226 is CRITICAL: Connection refused by host [23:42:35] PROBLEM - DPKG on srv215 is CRITICAL: Connection refused by host [23:42:35] PROBLEM - Disk space on virt2 is CRITICAL: Connection refused by host [23:42:35] PROBLEM - RAID on srv210 is CRITICAL: Connection refused by host [23:42:36] PROBLEM - DPKG on srv215 is CRITICAL: Connection refused by host [23:42:36] PROBLEM - Disk space on virt2 is CRITICAL: Connection refused by host [23:42:36] PROBLEM - RAID on srv210 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - Disk space on mw1037 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - RAID on db1041 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - RAID on srv223 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - Disk space on mw1078 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - Disk space on srv225 is CRITICAL: Connection refused by host [23:42:45] PROBLEM - DPKG on db1043 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - Disk space on mw1037 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - RAID on db1041 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - RAID on srv223 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - Disk space on mw1078 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - Disk space on srv225 is CRITICAL: Connection refused by host [23:42:46] PROBLEM - DPKG on db1043 is CRITICAL: Connection refused by host [23:42:55] PROBLEM - Disk space on db1048 is CRITICAL: Connection refused by host [23:42:55] PROBLEM - MySQL disk space on db1046 is CRITICAL: Connection refused by host [23:42:55] PROBLEM - DPKG on mw67 is CRITICAL: Connection refused by host [23:42:56] PROBLEM - Disk space on db1048 is CRITICAL: Connection refused by host [23:42:56] PROBLEM - MySQL disk space on db1046 is CRITICAL: Connection refused by host [23:42:56] PROBLEM - DPKG on mw67 is CRITICAL: Connection refused by host [23:43:05] PROBLEM - DPKG on mw1149 is CRITICAL: Connection refused by host [23:43:05] PROBLEM - MySQL disk space on db1028 is CRITICAL: Connection refused by host [23:43:05] PROBLEM - Disk space on srv196 is CRITICAL: Connection refused by host [23:43:05] PROBLEM - RAID on srv215 is CRITICAL: Connection refused by host [23:43:06] PROBLEM - DPKG on mw1149 is CRITICAL: Connection refused by host [23:43:06] PROBLEM - MySQL disk space on db1028 is CRITICAL: Connection refused by host [23:43:06] PROBLEM - Disk space on srv196 is CRITICAL: Connection refused by host [23:43:06] PROBLEM - RAID on srv215 is CRITICAL: Connection refused by host [23:43:15] PROBLEM - RAID on db1028 is CRITICAL: Connection refused by host [23:43:15] PROBLEM - DPKG on mw1 is CRITICAL: Connection refused by host [23:43:15] PROBLEM - DPKG on srv190 is CRITICAL: Connection refused by host [23:43:16] PROBLEM - RAID on db1028 is CRITICAL: Connection refused by host [23:43:16] PROBLEM - DPKG on mw1 is CRITICAL: Connection refused by host [23:43:16] PROBLEM - DPKG on srv190 
is CRITICAL: Connection refused by host [23:43:25] PROBLEM - RAID on storage3 is CRITICAL: Connection refused by host [23:43:25] PROBLEM - Disk space on db1029 is CRITICAL: Connection refused by host [23:43:25] PROBLEM - Disk space on mw1012 is CRITICAL: Connection refused by host [23:43:25] PROBLEM - Disk space on db43 is CRITICAL: Connection refused by host [23:43:25] PROBLEM - RAID on es2 is CRITICAL: Connection refused by host [23:43:25] PROBLEM - Disk space on srv189 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - RAID on storage3 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - Disk space on db1029 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - Disk space on mw1012 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - Disk space on db43 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - RAID on es2 is CRITICAL: Connection refused by host [23:43:26] PROBLEM - Disk space on srv189 is CRITICAL: Connection refused by host [23:43:35] PROBLEM - RAID on srv227 is CRITICAL: Connection refused by host [23:43:35] PROBLEM - RAID on db1048 is CRITICAL: Connection refused by host [23:43:35] PROBLEM - DPKG on searchidx2 is CRITICAL: Connection refused by host [23:43:35] PROBLEM - Disk space on db1038 is CRITICAL: Connection refused by host [23:43:35] PROBLEM - Disk space on db45 is CRITICAL: Connection refused by host [23:43:36] PROBLEM - RAID on srv227 is CRITICAL: Connection refused by host [23:43:36] PROBLEM - RAID on db1048 is CRITICAL: Connection refused by host [23:43:36] PROBLEM - DPKG on searchidx2 is CRITICAL: Connection refused by host [23:43:36] PROBLEM - Disk space on db1038 is CRITICAL: Connection refused by host [23:43:36] PROBLEM - Disk space on db45 is CRITICAL: Connection refused by host [23:43:45] PROBLEM - Disk space on aluminium is CRITICAL: Connection refused by host [23:43:45] PROBLEM - Disk space on mw1031 is CRITICAL: Connection refused by host [23:43:45] PROBLEM - RAID on aluminium is CRITICAL: Connection refused by host [23:43:45] PROBLEM - Disk space on db1002 is CRITICAL: Connection refused by host [23:43:45] PROBLEM - Disk space on srv215 is CRITICAL: Connection refused by host [23:43:45] PROBLEM - MySQL disk space on db44 is CRITICAL: Connection refused by host [23:43:45] PROBLEM - Disk space on srv226 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on aluminium is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on mw1031 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - RAID on aluminium is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on db1002 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on srv215 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on mw1013 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on mw1 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - MySQL disk space on db44 is CRITICAL: Connection refused by host [23:43:46] PROBLEM - Disk space on srv226 is CRITICAL: Connection refused by host [23:43:47] PROBLEM - Disk space on mw1013 is CRITICAL: Connection refused by host [23:43:47] PROBLEM - Disk space on mw1 is CRITICAL: Connection refused by host [23:43:55] PROBLEM - Disk space on mw1058 is CRITICAL: Connection refused by host [23:43:55] PROBLEM - Disk space on mw1069 is CRITICAL: Connection refused by host [23:43:55] PROBLEM - RAID on mw1106 is CRITICAL: Connection refused by host [23:43:56] PROBLEM - Disk space on mw1058 is CRITICAL: 
Connection refused by host [23:43:56] PROBLEM - Disk space on mw1069 is CRITICAL: Connection refused by host [23:43:56] PROBLEM - RAID on mw1106 is CRITICAL: Connection refused by host [23:44:05] PROBLEM - Disk space on mw1076 is CRITICAL: Connection refused by host [23:44:05] RECOVERY - DPKG on srv267 is OK: All packages OK [23:44:05] PROBLEM - DPKG on mw52 is CRITICAL: Connection refused by host [23:44:05] PROBLEM - RAID on mw69 is CRITICAL: Connection refused by host [23:44:05] PROBLEM - DPKG on srv289 is CRITICAL: Connection refused by host [23:44:05] PROBLEM - RAID on sodium is CRITICAL: Connection refused by host [23:44:06] PROBLEM - Disk space on mw1076 is CRITICAL: Connection refused by host [23:44:06] RECOVERY - DPKG on srv267 is OK: All packages OK [23:44:06] PROBLEM - DPKG on mw52 is CRITICAL: Connection refused by host [23:44:06] PROBLEM - RAID on mw69 is CRITICAL: Connection refused by host [23:44:06] PROBLEM - DPKG on srv289 is CRITICAL: Connection refused by host [23:44:06] PROBLEM - RAID on sodium is CRITICAL: Connection refused by host [23:44:15] PROBLEM - DPKG on mw1131 is CRITICAL: Connection refused by host [23:44:15] PROBLEM - mailman on sodium is CRITICAL: Connection refused by host [23:44:15] PROBLEM - DPKG on mw1041 is CRITICAL: Connection refused by host [23:44:15] PROBLEM - RAID on srv237 is CRITICAL: Connection refused by host [23:44:15] PROBLEM - Disk space on srv229 is CRITICAL: Connection refused by host [23:44:15] PROBLEM - Disk space on mw1149 is CRITICAL: Connection refused by host [23:44:16] PROBLEM - DPKG on mw1131 is CRITICAL: Connection refused by host [23:44:16] PROBLEM - mailman on sodium is CRITICAL: Connection refused by host [23:44:16] PROBLEM - DPKG on mw1041 is CRITICAL: Connection refused by host [23:44:16] PROBLEM - RAID on srv237 is CRITICAL: Connection refused by host [23:44:16] PROBLEM - Disk space on srv229 is CRITICAL: Connection refused by host [23:44:16] PROBLEM - Disk space on mw1149 is CRITICAL: Connection refused by host [23:44:25] PROBLEM - DPKG on virt2 is CRITICAL: Connection refused by host [23:44:25] PROBLEM - MySQL disk space on db1010 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - DPKG on mw1125 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - Disk space on db1039 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - Disk space on db1031 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - DPKG on mw1122 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - DPKG on virt2 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - MySQL disk space on db1010 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - DPKG on mw1125 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - Disk space on db1039 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - Disk space on db1031 is CRITICAL: Connection refused by host [23:44:26] PROBLEM - DPKG on mw1122 is CRITICAL: Connection refused by host [23:44:35] PROBLEM - DPKG on db44 is CRITICAL: Connection refused by host [23:44:35] PROBLEM - RAID on mw1058 is CRITICAL: Connection refused by host [23:44:35] PROBLEM - Disk space on mw1060 is CRITICAL: Connection refused by host [23:44:35] PROBLEM - Disk space on srv208 is CRITICAL: Connection refused by host [23:44:35] PROBLEM - RAID on srv220 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - DPKG on srv241 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - DPKG on db44 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - RAID on mw1058 is 
CRITICAL: Connection refused by host [23:44:36] PROBLEM - Disk space on mw1060 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - Disk space on srv208 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - RAID on srv220 is CRITICAL: Connection refused by host [23:44:36] PROBLEM - DPKG on srv241 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - Disk space on virt3 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - RAID on mw1078 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - RAID on virt4 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - Disk space on mw52 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - MySQL disk space on storage3 is CRITICAL: Connection refused by host [23:44:45] PROBLEM - poolcounter on tarin is CRITICAL: Connection refused by host [23:44:45] PROBLEM - Disk space on mw1083 is CRITICAL: Connection refused by host [23:44:46] PROBLEM - Disk space on virt3 is CRITICAL: Connection refused by host [23:44:46] PROBLEM - RAID on mw1078 is CRITICAL: Connection refused by host [23:44:46] PROBLEM - RAID on virt4 is CRITICAL: Connection refused by host [23:44:46] PROBLEM - Disk space on mw52 is CRITICAL: Connection refused by host [23:44:46] PROBLEM - MySQL disk space on storage3 is CRITICAL: Connection refused by host [23:44:47] PROBLEM - Disk space on srv192 is CRITICAL: Connection refused by host [23:44:47] PROBLEM - poolcounter on tarin is CRITICAL: Connection refused by host [23:44:47] PROBLEM - Disk space on mw1083 is CRITICAL: Connection refused by host [23:44:47] PROBLEM - Disk space on srv192 is CRITICAL: Connection refused by host [23:44:56] PROBLEM - RAID on es1002 is CRITICAL: Connection refused by host [23:44:56] PROBLEM - jenkins_service_running on aluminium is CRITICAL: Connection refused by host [23:44:56] PROBLEM - Disk space on db1003 is CRITICAL: Connection refused by host [23:44:57] PROBLEM - RAID on es1002 is CRITICAL: Connection refused by host [23:44:57] PROBLEM - jenkins_service_running on aluminium is CRITICAL: Connection refused by host [23:44:57] PROBLEM - Disk space on db1003 is CRITICAL: Connection refused by host [23:45:05] PROBLEM - DPKG on db1003 is CRITICAL: Connection refused by host [23:45:05] PROBLEM - DPKG on mw1105 is CRITICAL: Connection refused by host [23:45:05] PROBLEM - RAID on mw1110 is CRITICAL: Connection refused by host [23:45:05] PROBLEM - Disk space on mw67 is CRITICAL: Connection refused by host [23:45:05] PROBLEM - Disk space on mw1105 is CRITICAL: Connection refused by host [23:45:06] PROBLEM - DPKG on db1003 is CRITICAL: Connection refused by host [23:45:06] PROBLEM - DPKG on mw1105 is CRITICAL: Connection refused by host [23:45:06] PROBLEM - RAID on mw1110 is CRITICAL: Connection refused by host [23:45:06] PROBLEM - Disk space on mw67 is CRITICAL: Connection refused by host [23:45:06] PROBLEM - Disk space on mw1105 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - DPKG on db1001 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - RAID on db1002 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - RAID on db1006 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - DPKG on es1002 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - DPKG on db1004 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - DPKG on db1038 is CRITICAL: Connection refused by host [23:45:15] PROBLEM - DPKG on mw1110 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on db1001 is CRITICAL: Connection refused by host 
[23:45:16] PROBLEM - RAID on db1002 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - RAID on db1006 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on es1002 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on db1004 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on db1017 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on db1038 is CRITICAL: Connection refused by host [23:45:16] PROBLEM - DPKG on mw1110 is CRITICAL: Connection refused by host [23:45:17] PROBLEM - DPKG on db1017 is CRITICAL: Connection refused by host [23:45:25] PROBLEM - RAID on srv232 is CRITICAL: Connection refused by host [23:45:25] PROBLEM - Disk space on db44 is CRITICAL: Connection refused by host [23:45:25] PROBLEM - Disk space on mw1127 is CRITICAL: Connection refused by host [23:45:25] PROBLEM - MySQL disk space on db43 is CRITICAL: Connection refused by host [23:45:25] PROBLEM - MySQL disk space on db1043 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - DPKG on srv223 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - RAID on srv232 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - Disk space on db44 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - Disk space on mw1127 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - MySQL disk space on db43 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - MySQL disk space on db1043 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - RAID on srv225 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - DPKG on srv223 is CRITICAL: Connection refused by host [23:45:26] PROBLEM - RAID on srv225 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - RAID on mw58 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - RAID on mw1060 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - DPKG on db1039 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - DPKG on srv237 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - mobile traffic loggers on cp1043 is CRITICAL: Connection refused by host [23:45:35] PROBLEM - MySQL disk space on db1004 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - RAID on db1029 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - RAID on mw58 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - RAID on mw1060 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - DPKG on db1039 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - DPKG on srv237 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - mobile traffic loggers on cp1043 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - MySQL disk space on db1004 is CRITICAL: Connection refused by host [23:45:36] PROBLEM - RAID on db1029 is CRITICAL: Connection refused by host [23:45:45] PROBLEM - DPKG on db1031 is CRITICAL: Connection refused by host [23:45:45] PROBLEM - RAID on srv289 is CRITICAL: Connection refused by host [23:45:45] PROBLEM - DPKG on mw1141 is CRITICAL: Connection refused by host [23:45:45] PROBLEM - MySQL disk space on db1048 is CRITICAL: Connection refused by host [23:45:45] PROBLEM - MySQL disk space on db1041 is CRITICAL: Connection refused by host [23:45:46] PROBLEM - DPKG on db1031 is CRITICAL: Connection refused by host [23:45:46] PROBLEM - RAID on srv289 is CRITICAL: Connection refused by host [23:45:46] PROBLEM - DPKG on mw1141 is CRITICAL: Connection refused by host [23:45:46] PROBLEM - MySQL disk space on db1048 is CRITICAL: 
Connection refused by host [23:45:46] PROBLEM - MySQL disk space on db1041 is CRITICAL: Connection refused by host [23:45:55] PROBLEM - RAID on virt2 is CRITICAL: Connection refused by host [23:45:55] PROBLEM - RAID on db45 is CRITICAL: Connection refused by host [23:45:55] PROBLEM - DPKG on mw1142 is CRITICAL: Connection refused by host [23:45:55] PROBLEM - Disk space on srv223 is CRITICAL: Connection refused by host [23:45:56] PROBLEM - RAID on virt2 is CRITICAL: Connection refused by host [23:45:56] PROBLEM - RAID on db45 is CRITICAL: Connection refused by host [23:45:56] PROBLEM - DPKG on mw1142 is CRITICAL: Connection refused by host [23:45:56] PROBLEM - Disk space on srv223 is CRITICAL: Connection refused by host [23:46:05] PROBLEM - DPKG on db1033 is CRITICAL: Connection refused by host [23:46:05] PROBLEM - RAID on mw65 is CRITICAL: Connection refused by host [23:46:05] PROBLEM - Disk space on mw1028 is CRITICAL: Connection refused by host [23:46:05] PROBLEM - RAID on mw1069 is CRITICAL: Connection refused by host [23:46:05] PROBLEM - DPKG on tarin is CRITICAL: Connection refused by host [23:46:06] PROBLEM - DPKG on mw69 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - RAID on db1003 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - DPKG on db1033 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - RAID on mw65 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - Disk space on mw1028 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - RAID on mw1069 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - DPKG on tarin is CRITICAL: Connection refused by host [23:46:06] PROBLEM - Disk space on mw1100 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - Disk space on mw1125 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - DPKG on mw69 is CRITICAL: Connection refused by host [23:46:06] PROBLEM - RAID on db1003 is CRITICAL: Connection refused by host [23:46:07] PROBLEM - Disk space on mw1100 is CRITICAL: Connection refused by host [23:46:07] PROBLEM - Disk space on mw1125 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - DPKG on mw1106 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - RAID on db44 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - RAID on mw1012 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - Disk space on mw1041 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - Disk space on srv286 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - RAID on mw1001 is CRITICAL: Connection refused by host [23:46:15] PROBLEM - Disk space on db1017 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - DPKG on mw1106 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - RAID on db44 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - RAID on mw1012 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - Disk space on mw1041 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - Disk space on srv286 is CRITICAL: Connection refused by host [23:46:16] PROBLEM - RAID on mw1001 is CRITICAL: Connection refused by host [23:46:17] PROBLEM - Disk space on db1017 is CRITICAL: Connection refused by host [23:46:25] PROBLEM - RAID on srv209 is CRITICAL: Connection refused by host [23:46:25] PROBLEM - RAID on emery is CRITICAL: Connection refused by host [23:46:26] PROBLEM - RAID on srv209 is CRITICAL: Connection refused by host [23:46:26] PROBLEM - RAID on emery is CRITICAL: Connection refused by host [23:46:35] PROBLEM - DPKG on 
[23:46:35] PROBLEM - MySQL disk space on db1008 is CRITICAL: Connection refused by host [23:46:35] PROBLEM - mobile traffic loggers on cp1041 is CRITICAL: Connection refused by host [23:46:35] PROBLEM - RAID on mw1100 is CRITICAL: Connection refused by host [23:46:35] PROBLEM - Disk space on db1041 is CRITICAL: Connection refused by host [23:46:35] PROBLEM - DPKG on db1008 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - Disk space on mw1142 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - MySQL disk space on db1033 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - RAID on srv226 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - RAID on virt3 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - RAID on srv195 is CRITICAL: Connection refused by host [23:46:45] PROBLEM - DPKG on mw1012 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - spamassassin on sodium is CRITICAL: Connection refused by host [23:46:55] PROBLEM - DPKG on db1029 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - RAID on mw52 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - Disk space on srv289 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - RAID on srv208 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - DPKG on srv243 is CRITICAL: Connection refused by host [23:46:55] PROBLEM - RAID on db1038 is CRITICAL: Connection refused by host [23:46:56] PROBLEM - RAID on mw1031 is CRITICAL: Connection refused by host [23:46:56] PROBLEM - RAID on searchidx2 is CRITICAL: Connection refused by host [23:46:57] PROBLEM - Disk space on db1035 is CRITICAL: Connection refused by host [23:46:57] PROBLEM - DPKG on srv229 is CRITICAL: Connection refused by host [23:46:59] PROBLEM - DPKG on snapshot2 is CRITICAL: Connection refused by host [23:46:59] PROBLEM - MySQL disk space on db1015 is CRITICAL: Connection refused by host
[23:47:05] PROBLEM - DPKG on mw1127 is CRITICAL: Connection refused by host [23:47:05] PROBLEM - DPKG on mw1010 is CRITICAL: Connection refused by host [23:47:05] PROBLEM - DPKG on db1046 is CRITICAL: Connection refused by host [23:47:05] PROBLEM - RAID on tarin is CRITICAL: Connection refused by host [23:47:05] PROBLEM - RAID on mw1016 is CRITICAL: Connection refused by host [23:47:06] PROBLEM - DPKG on mw58 is CRITICAL: Connection refused by host [23:47:06] PROBLEM - Disk space on db1005 is CRITICAL: Connection refused by host [23:47:06] PROBLEM - MySQL disk space on db1003 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - RAID on db1046 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - DPKG on cp1043 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - RAID on mw1013 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - Disk space on mw1106 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - RAID on srv189 is CRITICAL: Connection refused by host [23:47:15] PROBLEM - DPKG on srv220 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - Disk space on db1028 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - Disk space on mw1131 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - DPKG on mw1095 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - Disk space on mw1122 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - DPKG on db1018 is CRITICAL: Connection refused by host [23:47:25] PROBLEM - RAID on db1039 is CRITICAL: Connection refused by host [23:47:35] PROBLEM - DPKG on srv227 is CRITICAL: Connection refused by host
[23:47:35] PROBLEM - DPKG on cp1041 is CRITICAL: Connection refused by host [23:47:35] PROBLEM - Disk space on db1004 is CRITICAL: Connection refused by host [23:47:35] PROBLEM - RAID on db1035 is CRITICAL: Connection refused by host [23:47:35] PROBLEM - DPKG on db1028 is CRITICAL: Connection refused by host [23:47:46] PROBLEM - Disk space on mw1141 is CRITICAL: Connection refused by host [23:47:46] PROBLEM - MySQL disk space on es2 is CRITICAL: Connection refused by host [23:47:46] PROBLEM - RAID on db1005 is CRITICAL: Connection refused by host [23:47:46] PROBLEM - DPKG on storage3 is CRITICAL: Connection refused by host [23:47:55] PROBLEM - Disk space on srv241 is CRITICAL: Connection refused by host [23:47:56] PROBLEM - RAID on mw1010 is CRITICAL: Connection refused by host [23:47:56] PROBLEM - DPKG on srv189 is CRITICAL: Connection refused by host [23:47:56] PROBLEM - DPKG on srv208 is CRITICAL: Connection refused by host [23:48:05] PROBLEM - DPKG on mw1084 is CRITICAL: Connection refused by host [23:48:05] PROBLEM - Disk space on db1008 is CRITICAL: Connection refused by host [23:48:05] PROBLEM - MySQL disk space on db1002 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - DPKG on mw1078 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - RAID on db1018 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - RAID on db1043 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - MySQL disk space on db1017 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - Disk space on mw69 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - DPKG on es4 is CRITICAL: Connection refused by host [23:48:16] PROBLEM - Disk space on mw1110 is CRITICAL: Connection refused by host
[23:48:17] PROBLEM - DPKG on mw1013 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - Disk space on storage3 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - RAID on mw1076 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - DPKG on mw1001 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - DPKG on srv210 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - MySQL disk space on db1039 is CRITICAL: Connection refused by host [23:48:25] PROBLEM - RAID on mw1125 is CRITICAL: Connection refused by host [23:48:35] PROBLEM - DPKG on srv232 is CRITICAL: Connection refused by host [23:48:35] PROBLEM - RAID on db1001 is CRITICAL: Connection refused by host [23:48:35] PROBLEM - MySQL disk space on db45 is CRITICAL: Connection refused by host [23:48:35] PROBLEM - Disk space on db1046 is CRITICAL: Connection refused by host [23:48:35] PROBLEM - Disk space on emery is CRITICAL: Connection refused by host [23:48:35] PROBLEM - RAID on mw1 is CRITICAL: Connection refused by host [23:48:45] PROBLEM - RAID on db1015 is CRITICAL: Connection refused by host [23:48:45] PROBLEM - RAID on mw1089 is CRITICAL: Connection refused by host [23:48:45] PROBLEM - RAID on cp1041 is CRITICAL: Connection refused by host [23:48:45] PROBLEM - Disk space on srv237 is CRITICAL: Connection refused by host [23:48:55] PROBLEM - Disk space on srv190 is CRITICAL: Connection refused by host [23:48:55] PROBLEM - RAID on mw1084 is CRITICAL: Connection refused by host [23:48:55] PROBLEM - RAID on srv243 is CRITICAL: Connection refused by host [23:48:55] PROBLEM - DPKG on mw1089 is CRITICAL: Connection refused by host [23:48:55] PROBLEM - Disk space on db1006 is CRITICAL: Connection refused by host [23:49:05] PROBLEM - Disk space on srv218 is CRITICAL: Connection refused by host
[23:49:15] PROBLEM - DPKG on mw1076 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - RAID on mw1083 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - Disk space on db1015 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - RAID on mw1127 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - DPKG on srv196 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - RAID on mw1028 is CRITICAL: Connection refused by host [23:49:15] PROBLEM - RAID on snapshot2 is CRITICAL: Connection refused by host [23:49:16] PROBLEM - MySQL disk space on db1018 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - MySQL disk space on es1002 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - RAID on mw67 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - MySQL disk space on db1005 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - RAID on mw1149 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - RAID on srv241 is CRITICAL: Connection refused by host [23:49:25] PROBLEM - RAID on db1004 is CRITICAL: Connection refused by host [23:49:35] PROBLEM - Disk space on ms5 is CRITICAL: Connection refused by host [23:49:35] PROBLEM - Disk space on es4 is CRITICAL: Connection refused by host [23:49:35] PROBLEM - DPKG on sodium is CRITICAL: Connection refused by host [23:49:45] PROBLEM - DPKG on mw1037 is CRITICAL: Connection refused by host [23:49:45] PROBLEM - DPKG on mw65 is CRITICAL: Connection refused by host [23:49:55] PROBLEM - RAID on db43 is CRITICAL: Connection refused by host [23:49:55] PROBLEM - DPKG on db1035 is CRITICAL: Connection refused by host [23:49:55] RECOVERY - RAID on srv192 is OK: OK: no RAID installed [23:50:05] PROBLEM - RAID on srv196 is CRITICAL: Connection refused by host
[23:50:15] PROBLEM - DPKG on db1041 is CRITICAL: Connection refused by host [23:50:35] RECOVERY - DPKG on srv192 is OK: All packages OK [23:51:35] RECOVERY - Disk space on srv209 is OK: DISK OK [23:51:55] PROBLEM - RAID on mw1105 is CRITICAL: Connection refused by host [23:52:05] PROBLEM - DPKG on mw1060 is CRITICAL: Connection refused by host [23:52:15] PROBLEM - RAID on mw1037 is CRITICAL: Connection refused by host [23:52:25] RECOVERY - Disk space on virt2 is OK: DISK OK [23:56:14] !log changed global roles netadmins and sysadmins to be virtual static groups in ldap that autopopulate with any user that has objectclass=novauser [23:56:15] Logged the message, Master [23:56:29] no more needing to manually add people to those worthless fucking groups! :) [23:58:12] Evening guys, can you check whether centralauth is down? I'm trying to make an account for the en ACC team and the tools which connect to centralauth db aren't doing so
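For context on the 23:56 !log entry above: "virtual static groups" means membership in netadmins and sysadmins is no longer maintained by hand but derived from a directory query, so any LDAP entry carrying objectclass=novauser automatically counts as a member. The sketch below only illustrates that membership-resolution step with the Python ldap3 library; the hostname, base DN and anonymous bind are illustrative placeholders, not the actual Wikimedia LDAP layout.

    # Minimal sketch: resolve the members of an auto-populated ("virtual static")
    # group by querying for every entry that has objectClass=novauser.
    # LDAP_HOST and PEOPLE_BASE are placeholders, not the real Wikimedia values.
    from ldap3 import Server, Connection, ALL

    LDAP_HOST = "ldap.example.org"
    PEOPLE_BASE = "ou=people,dc=example,dc=org"

    server = Server(LDAP_HOST, get_info=ALL)
    conn = Connection(server, auto_bind=True)  # anonymous bind, purely for the example

    # Every entry matching this filter is implicitly a member of the virtual group,
    # so nobody has to add users to netadmins/sysadmins manually.
    conn.search(PEOPLE_BASE, "(objectClass=novauser)", attributes=["uid"])
    members = sorted(str(entry.uid) for entry in conn.entries)
    print(len(members), "members resolved for the virtual group:", members)

In practice the filter would normally live in the directory itself (for example via a dynamic-group mechanism) so that ordinary group lookups see the computed membership; the client-side query above is just the simplest way to show what "autopopulate with any user that has objectclass=novauser" evaluates to.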