[09:39:21] !ping [09:39:21] !pong [09:39:26] o/ [12:24:01] petan: Thanks for fixing "!tr". Is wm-bot the only thing left on Bots? Do you need help migrating it to Tools? It would be nice to be able to dissolve the Bots project, i. e. not migrate it to eqiad :-). [12:24:01] Hey scfc_de, you are welcome! [12:30:10] scfc_de: so far it's not possible to migrate it because tools projects doesn't meet environment requirements for wm-bot as well as wm-bot doesn't fit to tools project terms of use [12:30:31] this is one long time and very boring discussion... [12:31:28] What requirements aren't fulfilled, and where's the clash with Tools' rules? [12:31:28] scfc_de: I don't have anything against nuking bots, as soon as there is replacement project for wm-bot, I requested separate labs project for it to be created, but it was never created [12:31:38] let me find the id's [12:32:56] scfc_de: everything that is marked as WONTFIX here https://bugzilla.wikimedia.org/show_bug.cgi?id=51935 [12:33:30] specifically this one is biggest and final blocker: https://bugzilla.wikimedia.org/show_bug.cgi?id=51936 [12:33:58] quoting Coren: [12:34:02] Not only is this outside the scope of tool labs, but it's going to be specifically prohibited; in order to allow the general Wikimedia privacy policy to apply, tools are not allowed to gather IP addresses from their users (which allowing connections from outside would allow). [12:34:03] Tools that need to host publicly-accessible network services must do so from their own project (and subject to the general Labs TOU, including the necessity of posting disclaimers and a lesser privacy policy). [12:34:34] therefore own project is required for wm-bot to be hosted on labs [12:39:48] petan: I already replied on that bug last summer. Why do you need that? You can connect to any port on the exec nodes with an ssh tunnel, you can write some command line scripts that you execute on tools-login and they connect to the exec node, perform their thing and disconnect, you could even replace wm-bot's control port with something that speaks http and then use portgranter, so it is accessible at [12:39:48] https://tools.wmflabs.org/wm-bot/. [12:40:14] scfc_de: I replied on that message you wrote just now. This query relaying has nothing to do with bouncers BTW [12:41:48] But? [12:42:21] but what [12:42:47] there is no "but" in that sentence :P [12:43:18] Then what's blocking it? You don't need "query relaying" for bouncers. [12:43:26] petan: if you want something special, you have to explain why you need that something special. [12:43:57] scfc_de: I am actually wondering why you even think it has anything to do with bouncers, this thing is used for netcat to work [12:44:04] so: why do you need direct access from ze interwebz, and why is it not possible to do this in any other way? [12:44:09] https://bugzilla.wikimedia.org/show_bug.cgi?id=51936#c12 [12:44:35] petan: Because that's all you replied about on the bug :-). [12:44:37] valhallasw: I never said I need direct access from interwebz (whatever that means) I said I need direct access from any other labs instance [12:45:05] petan: well, clearly Coren interpreted that differently. 
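The ssh-tunnel alternative scfc_de describes could look roughly like this — a minimal sketch in which the exec-node name, the port number and the "#channel text" payload format are placeholders/assumptions, not values from the discussion:

```bash
# Forward a local port through tools-login to the exec node the bot is
# (assumed to be) running on; host names and the port are placeholders.
ssh -f -N -L 64834:tools-exec-07.pmtpa.wmflabs:64834 tools-login.wmflabs.org

# With the tunnel up, a local script can talk to the remote port as if the
# bot were running on this machine (message format is an assumption; see
# the wm-bot docs for the real protocol).
echo "#wikimedia-labs hello from a shell script" | nc -q 1 localhost 64834
```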
[12:45:11] scfc_de: I replied that because you started talking about bouncers, but the bug itself has nothing to do with bouncers [12:45:55] valhallasw: all I need is that labs instance foo of project bar has access to some TCP port open by wm-bot on project where it lives [12:46:04] that is what it now has and what it wouldn't have if it was migrated [12:47:37] petan: you still haven't answered the question of *why* you need that, but OK. [12:47:53] why I need wm-bot? [12:48:01] well I don't... but some people like it [12:48:02] Replace it with a CGI wrapper? http://tools.wmflabs.org/wm-bot/post?text=ABC that looks up wm-bot's exec-node and port and pushes that to it? [12:48:03] why you need to connect to the bot over some random port [12:48:20] otherwise, what scfc_de suggested just now, or the file option suggested by him on the bug [12:48:27] you could even put that file in ~/public_html! [12:48:41] https://meta.wikimedia.org/wiki/Wm-bot#.40relay-on [12:48:50] valhallasw: I'm not sure if project <=> project works; let me test that. [12:49:01] that's one of most used and most useful features of bot, it couldn't work without this [12:49:34] scfc_de: that sounds horribly slow and ineffective [12:50:07] like you would replace that netcat in script with wget or what? [12:50:18] curl, typically, but yes [12:50:59] and you could just ask for your own exec node [12:51:28] but why... I could as well have own instance, that could be even smaller than exec node and would have lower resource footprint [12:51:34] basically, the problem is you did not explain the use case clearly, so other people did not understand what you were trying to accomplish. [12:51:40] I see no reason to solve something simple with something so complicated [12:51:57] because that link you sent me is actually nowhere to be found in the bug report [12:52:21] because it means the system will be maintained by multiple people instead of just you? [12:53:00] but there is a much better reason not to move than all of this [12:53:05] what [12:53:08] valhallasw: No, project <=> project doesn't work without opening up the security groups (https://wikitech.wikimedia.org/wiki/Special:NovaSecurityGroup). [12:53:09] that's not true [12:53:27] there can be multiple people in labs project [12:53:28] which is people are using your current host name, and will have to change their code if you move the bot [12:54:10] valhallasw: yes, that's why I would like to finally move the bot to some final instance where it can happily live until end of times [12:54:16] How many tools use that netcat feature? [12:54:48] idk, but I think that more than 10 channels use that, most notable anti vandalism channels, and I am pretty sure it could have great use in future [12:55:00] because this is what many people actually want to have, they just don't know it already exist [12:55:16] this allows you to relay even "echo messages" from shell scripts to any irc channel you want [12:55:36] which is pretty useful for bot maintainers who are watching irc channels more than their cron output [12:57:48] Yes, but full-sized Labs project can just use the ircecho module and stay in line with production. Do you have statistics in wm-bot on use? [12:58:30] scfc_de: no I don't collect any statistics on this, but having 200 irc echo bots that outputs 1 line of text a week versus 1 bot that handles it all, is horrible idea [12:58:54] you surely don't want to spawn irc echo bot for every simple script [12:59:03] scfc_de: ircecho? Are there docs for that? 
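For readers following along, the netcat relay being described and the CGI wrapper scfc_de proposes compare roughly like this; the bot's host, port and payload format are assumptions, and the /post endpoint is the hypothetical wrapper from the discussion, not an existing service:

```bash
# Today (per the discussion): push one line to wm-bot's relay port and it
# appears in the IRC channel that enabled @relay-on. Host, port and the
# "#channel text" format are assumptions, not taken from the log.
echo "#cvn-sw possible vandalism on Example_page" \
    | nc -q 1 wm-bot.pmtpa.wmflabs 64834   # -q 1: close shortly after stdin ends

# The proposed Tools-friendly alternative: the same line sent over HTTP to a
# (hypothetical) CGI wrapper that looks up the bot and forwards the text.
curl -G --data-urlencode "text=#cvn-sw possible vandalism on Example_page" \
     "https://tools.wmflabs.org/wm-bot/post"
```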
[12:59:33] because google can't find them :-) [12:59:48] valhallasw: modules/ircecho/manifests/init.pp in operations/puppet [13:00:39] IMHO wm-bot is just not suitable to be hosted on tools project, we all would save a lot of work if it just lived on dedicated instance (which it already does) [13:00:40] petan: The bot is spawned just once per project and then idles. [13:01:00] scfc_de: we have more than 200 projects on labs [13:01:12] And how many use wm-bot's netcat feature? [13:01:15] also, this ircecho is harder to implement [13:01:23] It is already implemented. [13:01:34] you are replacing better solution with worse solution for pointless reason [13:01:36] scfc_de: puppet classes are not docs, and the puppet class is also not very informative [13:01:47] scfc_de: I mean setup by target user [13:01:51] at least wm-bot has docs that inform someone how to use it ;-) [13:02:13] it's harder to set it up and you also need to be project admin for that, you can't even set up own irc echo for your own scripts that you would host on project like tool labs [13:02:48] Sure, because that would be petan the Tools administrator's job :-). [13:03:05] for example I have my job that generates doxygen on tools project, now I can make the bot ping me when this job fail, how would you do this using irc echo? [13:03:39] scfc_de: no it wouldn't be administrator's job, because unlike ircecho, netcat plugin isn't meant for administrators, it's meant for target users of labs [13:03:52] it doesn't require root or any big knowledge of anything [13:03:56] Setting up ircecho on an instance is the administrator's job. [13:04:00] it's extremely simple, flexible and it works [13:04:21] eh, is it really so hard to understand the difference? [13:04:34] ircecho bot is entirely different thing than this netcat module [13:05:18] it's meant for completely different userse [13:05:23] ircecho is for project admins [13:05:31] netcat is for labs users. all of them. any of them [13:06:03] every simple shell script running under any user on any project on labs can benefit from netcat module [13:06:19] it doesn't even have to be shell script it can be anything written in any language [13:06:30] even in python? ;-) [13:06:33] yes [13:06:37] even in brainfuck [13:07:16] petan: Okay, I give up :-). All the best! [13:24:27] Does anyone know if there's a hardlimit on disk usage? [13:24:51] Would it be a huge problem if I generated ~6-10GB of data for a few hours? [13:25:30] bjelleklang: no there are no limits, I think it wouldn't be huge problem but optimization is your friend ;) [13:25:46] it always is :) [13:26:53] I'm just looking into grabbing all usernames and user creation date, could be a lot of raw data before I manage to compress it and download it [13:33:06] bjelleklang: maybe use some well optimized queries for database? [13:33:19] you can directly fetch this from cloned db [13:33:32] it can be compressed as you read it [13:33:51] ah right, that would be nice to do :) [14:11:25] !log deployment-prep upgrading beta to Elasticsearch 1.0 [14:11:26] Logged the message, Master [14:31:25] !!!!!!!!!!!!!!!!!!!!!!!!!!!! [14:31:35] ;-] [14:36:05] Coren, is it possible to make a special page on wikitech that can be edited by any logged in user? [14:36:45] andrewbogott: Special pages aren't editable, that's why they're special. :-) [14:36:59] um, sorry, I don't mean Special i just mean... [14:37:00] a page [14:37:03] a wiki page [14:37:16] Or is that already the case? I thought it was only contentadmins that could edit most pages... 
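petan's doxygen example a few messages up could be wired to the relay roughly like this; the job script name, channel, and relay host/port are placeholders (same assumptions as in the earlier sketches), and it assumes jsub passes -sync y through to qsub:

```bash
# Run the job synchronously and only ping IRC when it fails.
if ! jsub -once -sync y ./generate-doxygen.sh; then
    echo "#wikimedia-dev doxygen job FAILED on $(hostname)" \
        | nc -q 1 wm-bot.pmtpa.wmflabs 64834
fi
```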
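The "compress it as you read it" suggestion for bjelleklang's username/creation-date dump might look like this; the replica host, credentials file and database name are assumptions about the environment, while user_name/user_registration are the standard MediaWiki columns:

```bash
# Stream the result set row by row (--quick) straight into gzip instead of
# materialising 6-10 GB of raw text first.
mysql --defaults-file="$HOME/replica.my.cnf" -h enwiki.labsdb enwiki_p \
      --batch --quick \
      -e "SELECT user_name, user_registration FROM user" \
  | gzip > "$HOME/user_registrations.tsv.gz"
```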
[14:37:35] No, the wiki is editable; contentadmin gives delete, move, and so on. [14:37:39] No, unprotected pages should be editable by all users. [14:37:46] great, then nevermind! [15:06:35] andrewbogott: Are you going to have the opportunity to add a few virtX today? [15:06:49] Coren: I did, there are six now. [15:06:57] Yeay! [15:07:07] * Coren finishes setting up tools without fear. [15:07:08] And three more a puppet patch away. [15:07:29] I feel some sort of gut impulse to keep some empty ones 'just in case' although I can't think of why that would be useful. [15:09:29] How many are there in pmtpa and will be in eqiad? [15:10:20] There are 8 in pmtpa. [15:10:31] Right now there are six running in eqiad, I have three more I can turn on at any time. [15:10:50] And then, eventually, we'll probably ship some or all of the tampa boxes to eqiad and retrench them as virt10xx servers. [15:10:59] Are they special hardware? [15:11:17] the compute nodes in tampa and eqiad are the same hardware. [15:11:36] No, I mean compared to DB servers, Apaches, etc. [15:11:45] It is, in theory, special hardware built for VM hosting. https://wikitech.wikimedia.org/wiki/Cisco_UCS_C250_M1 [15:12:06] They were donated, though, so I haven't done any research to verify that they are in fact the best tool for the job. [15:12:43] Oh, um, here is a more interesting link: http://www.cisco.com/c/en/us/products/servers-unified-computing/ucs-c250-m1-extended-memory-rack-mount-server/index.html [15:14:19] scfc_de: thanks for fixing up that page. [15:14:25] andrewbogott: np [15:15:18] Coren, do you have a salt+git_ [15:15:18] buh [15:15:29] a salt+git+puppet script all ready to go? [15:15:38] Update some self-hosted puppet nodes, or at least gather some stats about same? [15:17:53] andrewbogott: Yeah, ima go get some breakfast and be right with you. [15:18:01] ok [15:27:08] Someone using WinSCP around who could advise at https://wikitech.wikimedia.org/wiki/User_talk:Tim_Landscheidt#Error_in_Creating_directory ? [15:28:01] !log tools chmod g-w ~fsainsbu/.forward [15:28:02] Logged the message, Master [15:39:23] scfc_de: {{worksforme}} [15:39:49] scfc_de: the only thing I can think of is tahir not having write permissions to /data/project/tahir [15:40:08] I can copy files and create directories... [15:42:01] valhalla1w: Odd; he hasn't. Even though he is in the group. Hmmm. Let's see. [15:45:56] Coren: If you want some distraction: IIRC we had a similar situation some time ago: User tahir can "become tahir", but cannot "touch ~local-tahir/abc" => "touch: cannot touch `/data/project/tahir/abc': Permission denied". Directory and user permissions look okay. I think it was some NFS/LDAP hiccup on the NFS server in the past? [15:47:33] scfc_de: Winscp sometimes manages to screw itself out of write permission. IIRC there is a setting in it that says something along the lines of "don't try to do permission stuff and leave it to the server" which needs to be set. [15:48:23] The "touch" error was by me on command line in tools-login. [15:48:51] scfc_de: perhaps your command line screwed itself too :P [15:49:18] petan: Wanna try yourself? [15:49:41] no I am pretty sure my command line is just that screwed up [15:49:52] or maybe it's the nfs issue [15:49:54] :P [15:49:54] Coren: the 'Upload options', 'Set permissions:' 'rw-r--r-- (+x)' setting is off by defualt [15:50:22] I can't find any other such settings [15:51:03] /data/project/tahir is "drwxrwsr-x", so if WinSCP tried to change that, it failed :-). 
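A few quick checks for the "in the group but still can't write" situation above; the user and path are the ones from the log, while the group name and the sudo invocation are assumptions about the setup:

```bash
ls -ld /data/project/tahir      # expect drwxrwsr-x with the tool's group
id local-tahir                  # is that group actually in the user's group list?
getent group local-tahir        # what LDAP/NIS currently reports as members
sudo -iu local-tahir touch /data/project/tahir/probe && echo "write works"
```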
[16:05:36] IIRC the permissions problem was with negative caching on the NFS server. [16:18:13] andrewbogott: I'm running into two bits of trouble; the first is that salt doesn't seem to be able to actually run over all instances (*.pmtpa.wmflabs only hits about 75 instances for some reason). The second is that I haven't managed to find a way to test for mergability nondestructively if there are uncommited changes in the tree [16:19:08] Ah, never mind for problem 1: just timeout [16:23:03] test for mergability... [16:24:00] Will puppet still try to merge if there are local file changes? [16:24:16] Yes. And make a merge --abort not work. [16:24:49] Hm, so maybe we should regard instances with local non-commited changes as presumed broken. [16:25:04] That leaves instances /with/ commited changes. In that case --abort is safe, right? [16:26:01] andrewbogott: It should be. [16:26:15] Ok… so that will get us the big picture, statswise. [16:26:36] Allright, lemme make a run to see how many there are in both categories [16:27:18] And I think we can still rescue instances that are unmergable. We'll just set local changes aside in patch and branch and "git fetch origin && git checkout -b formigration origin" [16:27:33] But, sorry, I'm getting ahead of myself. [16:27:39] We should figure out the size of the problem first. [16:36:34] andrewbogott: Ewww. [16:36:48] ? [16:37:01] 300 out of 400 instances use self-hosted puppet? [16:37:23] 45 self-hosted puppets, only 21 of which would merge cleanly. [16:37:36] Everything else has untracked changes or unstaged changes. [16:38:12] Well, 19 clean for sure, two might still have merge conflicts (commits on both sides) [16:38:23] so that leaves 24... [16:38:26] that's not terrible :) [16:38:37] It's way less than a million! [16:38:46] :-) [16:38:55] virt0:~/gitstatus [16:38:59] virt0:~/gitstatuses [16:40:49] Coren: can I sidetrack you briefly? I'm interested in whether anything in https://dpaste.de/anyd concerns you. [16:41:06] That's a puppet run on a recently migrated instance. [16:42:21] Hm. [16:42:32] Looks like the volume manager thing didn't create that project. [16:42:59] ... possibly because it's not running. [16:43:13] that project has existed for ages though [16:43:22] Shouldn't we leave merging for the project admins? They probably know much better than any automatic thingy. [16:43:25] as long as anything else... [16:43:40] (And can always press "Delete instance" if they don't like it.) [16:43:41] scfc_de: yes we should. We're planning ahead for projects whose members are awol [16:43:53] andrewbogott: Hm. I still don't see it. Lemme check to see why. [16:44:15] scfc_de: in theory project admins should have /already/ resolved these problems, since I've emailed about it to the labs list a couple of times. [16:44:35] Those that are still broken are most likely my problem :( [16:45:22] andrewbogott: manage-nfs-volumes-daemon doesn't even /see/ that project for some reason. [16:45:39] Coren: it has the 'shared home' box unchecked. [16:45:46] It has no gluster shared volume. [16:45:47] Well, the last rebase to production apparently locked me out of most Toolsbeta instances :-). [16:45:54] andrewbogott: aaah! [16:45:57] So, either manage-nfs-volumes-daemon is right and puppet is wrong, [16:46:00] or the other way 'round [16:46:38] andrewbogott: The puppet output is perfectly normal then: it tries to mount the shared /home that the manager doesn't care to create. [16:46:58] Yeah… ugly though. 
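The survey being run here could look roughly like the following; the salt target and timeout mirror the conversation, while the repository path, branch name and the trial-merge approach are assumptions, not the actual script used on virt0:

```bash
# Ask every self-hosted puppetmaster for its git state and collect the output.
salt --timeout=60 '*.pmtpa.wmflabs' cmd.run \
    'cd /var/lib/git/operations/puppet && git status --porcelain | head -n 5' \
    > ~/gitstatuses

# Non-destructive mergeability check for a single instance: skip trees with
# uncommitted changes (where --abort is unsafe), otherwise try the merge and
# back out again.
cd /var/lib/git/operations/puppet
if [ -n "$(git status --porcelain)" ]; then
    echo "uncommitted local changes -- needs manual attention"
else
    git fetch origin
    if git merge --no-commit --no-ff origin/production; then
        echo "merges cleanly"
    else
        echo "merge conflicts"
    fi
    git merge --abort 2>/dev/null || true   # nothing to abort if already up to date
fi
```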
[16:47:01] Hence mounting labstore.svc.eqiad.wmnet:/project/akosiaristests/home failed, reason given by server: No such file or directory [16:47:12] Of course I suppose if we create the shared dir now, it will hide any local files in /home [16:47:29] I don't think there's a way around that -- it's not possible to know whether that was selected or not from puppet afaik. [16:47:34] ugly but maybe ok. [16:47:36] * andrewbogott nods [16:48:50] Coren, and that wasn't a problem in pmtpa because... [16:49:01] shared home was managed by autofs instead of puppet? [16:49:14] Right; which means /it/ errored out in the logs rather than puppet. [16:49:19] The net effect is the same. [16:49:30] * Coren ponders. [16:49:35] I guess we could create a puppet var to suppress that warning. [16:49:46] There /is/ a way around; create a puppet var in... yeah, that. [16:49:49] GMTA. :-) [16:49:52] Not an automated thing, just 'raise this variable on labsconsole if this bothers you' [16:49:56] :) [16:50:18] Well, I wouldn't suppress the warning, I'd outright suppress the mount attempt. :-) [16:50:32] right. [16:50:42] I mean, it could be automated, but that would be work. [16:50:52] Of course, it'd be even better if... yeah, also that. [16:50:54] hm… wait, I wonder if that setting is in ldap already? [16:50:58] Stop thinking faster than me! [16:51:04] sry [16:51:08] :-P [16:52:09] If you have the manage-nfs-volumes script in front of you… what does it check? [16:52:23] * Coren looks. [16:53:36] use_volume in info; it's an array. [16:54:36] So, use_volume=project and/or use_volume=info [16:54:56] =home* [16:55:32] * Coren wonders if we could make a fact out of that. [16:55:36] that may be available to puppet. At least, $::dc is, through some magic I don't understand. [16:55:41] That's in ldap, right? [16:55:44] * Coren nods. [16:56:03] But I think $::dc is there because of the LDAP node thingy. Lemme see what else it exposes. [16:57:02] "[...] every attribute returned by LDAP nodes or parent nodes will be assigned as a variable in Puppet configurations during compilation" [16:57:18] So yeah, it should be there. Lemme see if I can figure out how to get at it. [16:57:27] thx [16:58:02] * andrewbogott is pretty close to conking out [16:58:26] Anything you need from me before you run away, then? [16:58:36] I don't think so… I was going to ask you the same :) [16:58:44] I'm good. :-) [16:59:05] I emailed you a link to migration docs. I'll send another labs-l email tomorrow with a link. [16:59:12] Probably to wikitech-l too I expect. [16:59:47] Yeah, I'll add the tools-specific stuff as soon as it gelled. [17:00:49] If you're ambitious… it would be nice to start a job that tars gluster and/or nfs shared dirs and plops the tarballs on eqiad storage. [17:01:05] I don't even know quite what we'll do with them, but I hate the idea of not using the weekend's worth of bandwidth. [17:01:30] I may try that tomorrow if you don't and if I get to it. But I won't be arround to attend anything. [17:01:59] andrewbogott: Oh, that won't work. [17:02:14] ? [17:02:17] andrewbogott: the use_volume thing is in the /project/ LDAP entry, not the instance's. [17:02:28] oh, yeah, that doesn't help [17:02:36] It could be a fact though. [17:02:48] Not that I have any idea how to write it, offhand [17:03:09] * Coren isn't entirely sure he can code a LDAP query in ruby. [17:03:19] So, probably not worth it. [17:03:46] Yeah, just a set-by-hand var will be adequate. Not a lot of projects don't have shared storage anyway. [17:14:06] ok… g'night! 
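The per-project flag being discussed lives on the project's LDAP entry; a query for it might look like this, where the server, base DN and anonymous bind are assumptions modelled on the conversation rather than the real directory layout:

```bash
# Does the project ask for a shared /home volume? (use_volume is the
# attribute the volume manager checks, per the discussion above.)
ldapsearch -x -LLL -H ldap://virt0.wikimedia.org \
    -b 'ou=projects,dc=wikimedia,dc=org' \
    '(cn=akosiaristests)' cn use_volume
```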
[18:25:26] has anything changed significantly in caching/varnish for beta labs? MobileFrontend has some funny issues with CSS overnight [18:25:48] possibly related to https://gerrit.wikimedia.org/r/#/c/115910 ? just a guess... [19:40:40] Coren, how can I check why a job is in an error state, and more importantly, why does it stay stuck? job id 2609604 (local-gerrit-reviewer-bot) [19:41:04] valhallasw: If you qstat -j you get detailed status [19:41:25] error reason 1: can't get password entry for user "local-gerrit-reviewer-bot". Either the user does not exist or NIS [19:41:52] which doesn't really explain why it stays stuck [19:42:26] Hm. Apparently it got stuck while LDAP was ill. [19:43:58] You can clear the error state of a job with 'qmod -cj <job id>' [19:44:34] Coren: maybe that should just be run for all errored jobs centrally every now and then? [19:44:46] imo jobs should just fail instead of hanging in an error state [19:45:39] That is a safeguard to prevent a rogue job from restarting over and over and draining resources. [19:46:44] That doesn't make sense to me. A rogue job could also just submit thousands of jobs [19:48:21] Well yeah, if you do it on /purpose/. This is to prevent accidents (broken script that dies early) not malice. [19:48:57] Having the errored out job prevents further -once from firing; giving you an opportunity to see what the error was and either clear the error or fix things. [19:49:15] Nothing prevents you from regularly cleaning /your/ errored jobs automatically. [19:50:05] ...sigh. [19:50:18] This error state *completely* defeats the purpose of SGE [19:50:35] ... what? [19:50:42] The idea is to have something reliable, not something that breaks randomly when there is an LDAP error [19:52:05] Error state is "something went so wrong it's not even possible to start that job". Those are, by definition, very rare events that require some attention. [19:52:31] Those are error states that are completely irrelevant to me. [19:52:44] That /specific/ error wasn't. [19:53:18] How is LDAP temporarily not being available relevant to me? [19:53:44] Send me an email, spew output to stderr, all fine with me, but *hanging* the job is just stupid [19:54:34] We ganglia the error states, and from time to time (and with Icinga sooner), usually an admin resets the obvious ones. [19:55:06] (and, by the way, no, I cannot automatically reset error states, as the only way to clear them is either by job id (which would have to be retrieved in some way from qstat) or by queue (for which I lack permissions) [19:55:14] Like I said, that /specific/ error wasn't. Sadly, gridengine does not have an AI to determine what errors are interesting to what users; the only thing it knows is "this is broken". [19:55:49] valhallasw: Not by queue; that's to clear errored out queues. But you can use '*' as job id. [19:56:28] that also gets me a whole long list of errors for jobs that are not mine, but OK [19:57:13] And someone just did that for all jobs :-). [19:57:48] scfc_de: I did, under the presumption that the LDAP burp almost certainly hit some other jobs. [19:58:14] scfc_de: Jobs that errored out for some other reasons will just try once more and error out again. [19:58:53] Yep. [19:59:03] Coren: so what other reasons would there be for the job to error out? Things like scripts without +x error out with a message to stderr instead of hanging in an error state [19:59:30] valhallasw: For example, if someone sets -o to a file they can't write to.
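Putting the grid-engine commands from this exchange together; the job id is the one from the log, and the qstat column parsing is a sketch that may need adjusting to the exact output format:

```bash
# Why is this job in the error state?
qstat -j 2609604 | grep -i 'error reason'

# Clear the error state so the scheduler will try the job again.
qmod -cj 2609604

# "Nothing prevents you from regularly cleaning /your/ errored jobs":
# clear every one of my jobs currently sitting in state Eqw.
for job in $(qstat -u "$USER" | awk '$5 == "Eqw" {print $1}'); do
    qmod -cj "$job"
done
```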
[20:00:47] valhallasw: Some things are checked early enough to abort the submission (+x is one); but it can't check for things like permissions at runtime for instance. (scfc's example of -o is a sallient one) [20:00:57] I see. [20:01:29] valhallasw: Also, continuous scripts may break because you removed their working directory for instance -- restarting them can't work so they error out. [20:02:40] That LDAP error is a rare, odd duck: the script errored out because the user it was to run as "didn't exist anymore". [20:03:01] Still, an error in the error log and an aborted job makes more sense to me than the error state. [20:03:55] Error state is "as far as gridengine know, there is no possible way this can work [anymore]". In that case, it was mistaken. [20:04:28] Coren: and if someone has cronned a job without -once, it will clog up the queue [20:04:58] Sure, but it's basically one row in a database for the errored out job -- we'd need a few million of those before they mattered. [20:06:09] (Errored out jobs consume no resources beyond their spool entry) [21:51:09] I'm a n00b learning and studying and now I'm looking for descriptions of of a couple of _actual_ en.wikimedia.org implementation details. Can someone point me where to look for PHP extensions actually available? (Looking for gmp right now, but I'd like to know where to find this kind of thing.) [21:53:14] Anyone? Am I in the right place to ask this? [21:54:39] danorton: #wikimedia-tech / #mediawiki might be more to the point [21:54:55] I'm not sure what GMP has to do with mediawiki, though. [21:55:12] so for questions on that, #php might be better [21:56:37] I'm asking about actual installation. Is the PHP gmp module installed on en.wikimedia.org &c. [21:56:37] I'm working on an extension for wikipedia, and I'd like to know if I can use GMP [21:57:11] I see. #wikimedia-tech is the right place to ask that. [21:57:34] ok, thx valhallasw [22:11:43] Hi [22:11:53] port 22: No route to host [22:12:02] after I rebooted an instance [22:14:24] Are we supposed to do something after rebooting an instance? [22:22:08] huh: No. What's the instance's name? [22:23:10] scfc_de: abusefilter-global-main (yes, I know, horrible name. but from 2012) [22:23:31] pirsquared@bastion1:~$ ssh abusefilter-global-main [22:23:32] ssh: connect to host abusefilter-global-main port 22: No route to host [22:23:57] It worked before I rebooted [22:26:45] That's odd. On bastion3, "host abusefilter-global-main" gives "10.4.1.38", which is listed as "Private IP" on https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000512.pmtpa.wmflabs. Security group "default" looks fine as well. Coren? [22:28:52] "ping 10.4.1.38" => "From 10.4.0.220 icmp_seq=1 Destination Host Unreachable". [22:29:05] scfc_de: I tried rebooting it again [22:29:16] Is the console output link just the result of dmesg ? [22:30:02] Don't know; do you see anything interesting there? [22:30:33] no :/ [22:30:33] ubuntu-12.04-precise (deprecated) <- deprecated? How should it be upgraded? [22:32:33] No; from time to time, Labs admins create new master images. All instances that were launched with "old" images get tagged as "deprecated". [22:32:51] (In short: Nothing to worry about.) [22:34:57] Ryan_Lane2: Any idea why abusefilter-global-main is not reachable, giving "No route to host", ping returns at 10.4.0.220? [22:35:28] huh: Last chance would be andrewbogott_afk, but he is probably sleeping ATM :-). [22:35:39] That's okay [22:46:57] scfc_de: any other ideas? [22:47:14] Unfortunately: Nope :-). 
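For danorton's question, a quick local check of whether the gmp extension is available to a given PHP install looks like this; it only answers for the machine it runs on, not for the production Wikimedia cluster, so it is a sanity check rather than the authoritative answer:

```bash
php -r 'var_dump(extension_loaded("gmp"));'   # true if gmp is compiled in/loaded
php -m | grep -i '^gmp$'                      # or: list loaded modules and filter
```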
[23:24:22] (CR) Dzahn: [C: 1] Migrate check_disk_5_2 to root_disk_space for labs. [labs/nagios-builder] - https://gerrit.wikimedia.org/r/115822 (owner: DamianZaremba) [23:24:38] (CR) Dzahn: "can't merge here, just have +1 in this repo" [labs/nagios-builder] - https://gerrit.wikimedia.org/r/115822 (owner: DamianZaremba) [23:39:27] (CR) DamianZaremba: [C: 2 V: 2] "Merging post other changes" [labs/nagios-builder] - https://gerrit.wikimedia.org/r/115822 (owner: DamianZaremba) [23:57:11] (CR) Dzahn: "merged the one in monitoring base in prod" [labs/nagios-builder] - https://gerrit.wikimedia.org/r/115822 (owner: DamianZaremba) [23:58:07] Damianz: hi [23:58:09] petan: hi [23:58:23] we just merged the monitoring changes [23:58:34] I just saw :) Yays [23:58:37] to me it looks like nothing broke [23:58:39] to you as well? [23:58:46] i just made sure on prod server so far