[00:01:51] YuviPanda: hey if I want to set up a demo mediawiki with some experimental media extensions, should i do a labs-vagrant thing or is there a saner way to set up a MW instance on labs? [00:02:03] brion: labs vagrant atm. [00:02:08] ok [00:02:10] brion: create an instance with trusty! [00:02:14] i need timedmediahandler so … fun times :D [00:02:27] ffmpeg2theora is broken in trusty, but i have a workaround :D [00:02:45] also avconv is broken in trusty, in a different way [00:04:08] hm think i need to make a new project, it doesn’t really fit in my existing ones [00:04:59] brion: there's a timedmediahandler vagrant role [00:05:04] all vagrant roles should work with labsvagrant [00:05:11] ah [00:05:12] yeah, should work with labsvagrant i hope :D [00:05:17] except for the bugs ;) [00:05:22] didn't see ffmpeg2theora being borked [00:05:29] brion: you can also use precise with the 'precise-compat' branch [00:05:41] go to /vagrant, check that branch out and you should be fine [00:05:42] to mirror my vagrant setup i’ll just stick with trusty [00:05:46] ah ok [00:05:49] that sounds sane [00:05:54] need to work out a fix anyway [00:06:06] as i’m sure those media transcode machines are migrating to trusty at some point too [00:06:17] yup [00:06:33] https://github.com/Automattic/socket.io-protocol hm [00:12:33] https://wikitech.wikimedia.org/wiki/Special:Ask/-5B-5BCategory:New-20Project-20Requests-5D-5D-5B-5BIs-20Completed::No-5D-5D/-3FProject-20Name/-3FProject-20Justification/-3FModification-20date/format%3Dbroadtable/sort%3DModification-20date/order%3Dasc/headers%3Dshow/searchlabel%3DOutstanding-20Requests/default%3D(No-20outstanding-20requests)/offset%3D0 [00:12:37] love those special:ask URLs :D [00:19:22] brion: heh, now wait for andrewbogott_afk or Coren or mutante to wake up [00:19:52] i’ll keep playing with my local vagrant until then :) [00:22:36] I know all, see all, and am always awake. I just choose to ignore most of what passes. 
:-) [00:22:44] brion: Project created (but all lowercase name) [00:22:45] :DD [00:22:52] Coren: yay! [00:23:05] yeah i wasn’t sure the convention there [00:23:34] brion: Strictly speaking, /[a-z][A-Za-z0-9_]*/ but all lowercase is less confusing. [00:23:56] yeah… some lists show capitalized forms but i think that’s a mediawiki oddity [00:24:04] damn case-forcing [00:24:07] * brion gets the time machine back out [00:24:44] 2 cpus should be plenty, i only need a few sample files [00:25:30] * brion wanders off while instance builds [00:26:07] that always happens, doesn't it? [00:26:13] 'oooh yeah, let us get to this' [00:26:21] 'oh, instance building, let me check Reddit for a bit' [00:26:23] BAAAM [00:26:26] 3h later... [00:26:27] hehe [00:26:40] brion: FYI, a caveat for Labs newbies, most of the request disk space is available but unallocated. You probably want to turn on role::labs::lvm::srv in the instance config to get it mounted. [00:26:50] that’s why i’m not frantically putting this together the day before my wikimania talk, i want a working demo with plenty of time ;) [00:27:02] aho good to know [00:27:33] brion: Alternately, you can just create whatever LVM volumes you want to mount them wherever, that class does just that as a simple convenience. [00:28:56] yeah i’m lazy so if i can just click a button that’s better ;) [00:30:18] http://jianshu.io/p/03f1f45021cc [00:30:44] * Coren goes back to pretend he sometimes rests. [00:31:17] * YuviPanda continues drinking rum at 6am, despite that being not the best of ideas, but hey, sunday night and I don't have to work till monday night... [00:40:32] hmm [00:40:35] Error: /Stage[main]/Role::Labs::Instance/Mount[/home]: Could not evaluate: Execution of '/bin/mount /home' returned 32: mount.nfs: mounting labstore.svc.eqiad.wmnet:/project/ogvjs-integration/home failed, reason given by server: No such file or directory [00:41:14] Coren: ? 
[00:41:57] brion: That's a "normal" error if you did not configure your project for shared home/project. Harmless as it were, unless you intended /home to be on NFS [00:42:04] ah ok [00:42:23] brion: We haven't found a satisfactory way to propagate configuration down into puppet yet. [00:42:25] just trying to figure out why my instance list shows puppet state as ‘failed’, and that’s the only obvious err i saw in ‘puppet agent --test’ [00:42:43] Yeah, major annoyance that. [00:42:48] fun :D [00:43:17] You can always turn on shared home and project store; it's actually generally useful to have your home survive instance rebuilds. [00:43:59] true [00:44:51] Couldn't the configuration be stored (as well) in LDAP and thus accessible via facter? [00:45:40] Coren: is that role::labsnfs::client or something else? [00:46:37] brion: No, go in manage projects, select yours, and use 'configure' in the action column. [00:47:48] i’m there :) [00:47:53] just don’t know what to push :D [00:50:03] i find the conflation of production and labs stuff on wikitech.wikimedia.org confusing when i’m looking for documentation [00:51:01] ah *slap self* [00:51:06] i was in instance configuration not project configuration [00:52:02] thanks Coren :D [00:52:13] it was much easier to find looking in the right place [00:53:05] now to try labs-vagrant setup [10:51:53] scfc_de: nice fix for tools.! :D [10:52:43] YuviPanda: Creds go to Brandon :-). I wouldn't have guessed that it could be so easy. [10:53:20] yeah [10:53:39] he is primary maintainer for one of the popular dns servers, isn't he? [10:54:23] Hmmm? Didn't know that. [10:54:38] scfc_de: https://github.com/blblack/gdnsd [10:54:42] it's what we use, I think [10:56:13] Ah! Yes, I think prod uses that. Still the patch (and Labs) uses dnsmasq, so he seems to be willing to use the tools at hand :-). [10:56:19] (And capable.)
[10:56:28] :D [10:56:30] yeah [10:59:34] scfc_de: btw, there's a if in exec_environ for trusty hosts, but I don't see those packages installed in exec-12 ;( [10:59:40] my debugging didn't give me anything. [10:59:46] think you can take a look when you have time? [11:00:47] Sure, let me check. [11:00:50] ty [11:06:32] YuviPanda: You're branching depending on $::lsbdistrelease, but that is "14.04" for Trusty, not "trusty". You apparently need to test $::lsbdistcodename instead. You can check the settings on the different hosts by running "facter". [11:19:30] Morning. The usual recurring request: Could somebody please truncate /var/tmp/phd/log on fab.wmflabs.org? See https://bugzilla.wikimedia.org/show_bug.cgi?id=65861#c13 ? Sorry. :-/ [11:19:44] * andre__ pings Coren if he has two minutes ^ [11:28:12] scfc_de: gah [11:28:15] scfc_de: right [11:28:35] scfc_de: I was an idiot. submitting new patch [11:47:33] YuviPanda: BTW, you're aware of Puppet's "case" statement? Feels more natural for these switches to me. [11:47:49] yeah, I'm aware, but two conditions... [12:05:42] petan: hi [12:06:31] the monit site is at http://mmonit.com/monit/ and docs are at http://mmonit.com/monit/documentation/monit.html [12:21:04] matanya: BTW, do you know if there are directory-level options for puppet-lint à la .pep8? For example, manifests/admin.pp generates lots of "line too long" errors in Jenkins, and if that could be disabled for just that file, that would solve that. [12:22:03] scfc_de: you can disable it in a config file [12:24:26] matanya: Yeah, but looking at https://github.com/rodjek/puppet-lint/, this seems to disable the checks for *all* files. Bummer. [12:26:35] no scfc_de, i think i don't understand [12:26:44] if you use : puppet-lint --no-2sp_soft_tabs-check path/to/file.pp [12:26:49] it doesn't work ? 
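Editorial illustration of the facter mix-up explained at 11:06:32. This is a minimal Python sketch, not the Puppet patch itself: the two dictionaries are hypothetical stand-ins for what "facter" is said to report, with the release fact holding the version number and the codename fact holding the name.

```python
# Hypothetical fact values, modelled on the explanation in the log:
# on Trusty, lsbdistrelease is "14.04" and lsbdistcodename is "trusty".
TRUSTY_FACTS = {"lsbdistrelease": "14.04", "lsbdistcodename": "trusty"}
PRECISE_FACTS = {"lsbdistrelease": "12.04", "lsbdistcodename": "precise"}


def is_trusty_by_release(facts):
    """The mistaken check: compares the version number against a codename."""
    return facts["lsbdistrelease"] == "trusty"


def is_trusty_by_codename(facts):
    """The corrected check: compares the codename fact."""
    return facts["lsbdistcodename"] == "trusty"


if __name__ == "__main__":
    print(is_trusty_by_release(TRUSTY_FACTS))    # the branch never fires
    print(is_trusty_by_codename(TRUSTY_FACTS))   # the branch fires on Trusty
    print(is_trusty_by_codename(PRECISE_FACTS))  # and stays off on Precise
```

The mistaken check is false on every host, which would match the symptom of the Trusty-only packages never being installed on exec-12.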
[12:27:33] puppet-lint --no-80chars-check path/to/file.pp [12:27:40] in your case [12:28:21] matanya: No, I mean the automatic Jenkins tests: https://gerrit.wikimedia.org/r/#/c/146050/ => https://integration.wikimedia.org/ci/job/operations-puppet-puppetlint-lenient/4525/console + https://integration.wikimedia.org/ci/job/operations-puppet-puppetlint-strict/4520/console [12:29:25] ah, you need to poke hashar to fix the command line invoked for this job [12:30:04] andre__: truncated. [12:30:15] Coren, thank you! [12:30:31] scfc_de: and even a nicer feature request : https://github.com/rodjek/puppet-lint/issues/247 [12:39:22] matanya: Well, when hashar is already on it, all the better :-). [12:40:09] certainly :) [14:27:20] who runs url2commons tool? [14:30:12] comets: You can look the maintainers up on http://tools.wmflabs.org/; in the case of url2commons it is Magnus Manske. [14:32:54] does he come on IRC much? [14:33:18] comets: He's here regularily. [14:34:15] Coren: wanna +2 https://gerrit.wikimedia.org/r/#/c/146050/ [14:35:11] was wondering if there were changes to his tool in the last few hours, its not uploading like it used too.. [14:40:08] comets: Doesn't it work for you, or are you just looking at some stats? [14:40:44] doesn't seem to be uploading..was working well 12 hours ago.. [14:41:15] hmm, I don't remember seeing Magnus on IRC much [14:41:23] labs-l or emailing him personally would probably be a better option [14:41:32] Or on-wiki. [14:44:49] i avoid manual upload cause it somehow takes 15mins to upload an image which is 1mb .. 
:( [14:45:52] max upload speed is like 0.1 KB/Sec :| [15:44:59] interesting crontab trouble [15:44:59] 01 * * * * jlocal /usr/bin/env > /data/project/phetools/env.out [15:44:59] fails silently, no mail nor error while [15:44:59] 01 * * * * /usr/local/bin/jlocal /usr/bin/env > /data/project/phetools/env.out [15:44:59] works, jsub & al are in /usr/bin but jlocal is in /usr/local/bin [15:45:51] that's on tools-submit host [16:04:47] did i mention i got my labs-vagrant instance up and running last night? http://ogvjs-testing.wmflabs.org/wiki/Main_Page \o/ [16:11:14] brion: Nice. I saw a flurry of bug reports from you over the weekend. :) [16:11:21] :D [16:11:37] now back to crunch time on iOS app… [16:11:55] but damn I’m loving Vagrant for working on MW stuff [16:12:04] much easier than mucking around with packages on osx [16:12:52] Very much so. And easier to have 4 or 5 completely different setups laying around for jmping from project to project. [16:13:07] yep [16:13:42] having the same roles available on labs-vagrant made it super easy to set up a public demo server once i’d done my local testing [16:14:06] I spent my weekend getting a labs-vagrant server setup to produce ISO images for putting on USB sticks -- https://wikimania-vagrant.wmflabs.org/mediawiki-vagrant/ [16:14:20] nice! [16:14:53] We are hoping to have a pile of sticks available at the Wikimania hackathon to make getting started easier for folks. [16:15:46] otherwise… hog ALL the bandwidth! :D [16:16:48] Yeah. The sticks we had worked pretty well in Zürich. Matt F did all the hard work for that. I've just been polishing up version 0.2 [16:24:07] bd808: nice! 
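Editorial illustration of the crontab report at 15:44:59. A plausible reading, though the log does not confirm it, is that cron runs jobs with a narrow PATH that includes /usr/bin but not /usr/local/bin, so a bare "jlocal" cannot be found while the absolute path works. The sketch below reproduces that lookup behaviour in a temporary directory tree; the directory layout and the stub script are made up for the demonstration, and it assumes a POSIX system.

```python
import os
import shutil
import tempfile


def resolvable(command, search_path):
    """Return True if `command` can be found on the given PATH string."""
    return shutil.which(command, path=search_path) is not None


def demo():
    """Look up a stub 'jlocal' with a narrow and a wide PATH."""
    with tempfile.TemporaryDirectory() as root:
        usr_bin = os.path.join(root, "usr", "bin")
        usr_local_bin = os.path.join(root, "usr", "local", "bin")
        os.makedirs(usr_bin)
        os.makedirs(usr_local_bin)

        # Stub executable standing in for jlocal (hypothetical content).
        script = os.path.join(usr_local_bin, "jlocal")
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(script, 0o755)

        narrow_path = usr_bin  # assumed cron-like PATH
        wide_path = os.pathsep.join([usr_local_bin, usr_bin])  # login-shell-like PATH
        return resolvable("jlocal", narrow_path), resolvable("jlocal", wide_path)


if __name__ == "__main__":
    print(demo())
```

If that reading is right, either the absolute path used in the second crontab line or a PATH= assignment at the top of the crontab would avoid the problem.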
[16:24:54] i might try the vmware fusion provider for vagrant in place of the default virtualbox; i’m trying to get all my VMs to play nice with each other :D [16:25:14] parallels won’t run at the same time as virtualbox; vmware fusion will but running them both seems to trigger 100% CPU in virtualbox :P [16:25:44] meanwhile virtualbox doesn’t support retina display resolution in guest operating systems, and improving that in linux/gnome is another of my hobbies :D [16:25:55] it should *theoretically* work [16:26:13] Tyler has done some work with getting mw-vagrant to run in the vmware container. There's a bug somewhere about a few of the problems he hit at various times. [16:26:50] i also noticed the mediawiki-vagrant plugin seems to break other non-mediawiki vagrant instances ;) [16:26:56] i suspect that’s easy to fix though [16:27:20] i want to see if i can encapsulate the ogv.js build environment into a vagrant config [16:27:44] Oh. The plugin hooks things it shouldn't? That should be relatively easy to fix. [16:27:44] so it’s not just “works on my machine” [16:27:57] Coren: For the tools.wmflabs.org IP mapping, could you please restart dnsmasq on (IIRC) virt1000? [16:28:04] yeah i suspect it just needs to check ‘hey is this actually a mediawiki vagrant?’ and then not hook anything if it’s not [16:28:13] * bd808 nods [16:28:16] gotta brush up on my ruby [16:28:29] scfc_de: Eeew. Honestly, that thing is so unstable and rickety the tought of restarting it fills me with dread. [16:29:03] The Vagrant plugin system has "config" hooks that could be used for that. Only enable if flagged on in the Vagrantfile or something. [16:30:17] brion: Dan Duvall would be the guy to ping/assign a bug to on that. He's in the SF office and marxarelli on irc. [16:30:38] spiff [16:30:43] He's our newish "knows about ruby" guy in the QA team [16:33:28] Coren: :-) I think andrewbogott has some experiences with hosts not showing up in DNS, but restarts were never problematic IIRC. 
It's not urgent, just needs to be done some time. [16:34:54] packing up to go to the office, see y’all in a bit [16:35:10] scfc_de: restarting should be safe, I can do it. [16:35:51] oh, looks like Coren already did [16:39:42] Coren, is everything good with the new service-group code? Is now an OK time for me to purge the old ldap entries? [17:05:42] andrewbogott: I saw no issues in the time since the switch; purging should be safe. [17:10:34] Wikimedia Labs / tools: tools.wmflabs.org inaccessible via labs instances - https://bugzilla.wikimedia.org/54052#c28 (Yuvi Panda) PATC>RESO/FIX And finally :) [17:10:40] Coren: Did you restart dnsmasq? If I "dig tools.wmflabs.org" on bastion2 for example, I still get 208.80.155.131, while "dig en.beta.wmflabs.org" gives 10.68.16.16. There may be some levels of caching involved, so it might take a while to drop down the pipeline (=> tomorrow), but I want to make sure that someone opened up the tap :-). [17:11:00] scfc_de: I have. [17:11:17] Though I now realize I should probably kick the one on virt0 also. [17:11:37] Tool Labs tools / X!'s tools: LanguageTool removes article from watchlist - https://bugzilla.wikimedia.org/67991 (tekftu4q) UNCO p:Unprio s:normal a:None https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Language_tool_updates_not_showing_up_in_list [17:12:10] Coren: Ok, purging now. [17:15:33] .... and something broke. [17:15:37] * Coren hates dnsmasq. [17:16:56] ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 64297 [17:16:56] Whu? [17:17:17] But now "dig tools.wmflabs.org" works -- what's broken? [17:17:26] Oh, you mean the header. [17:18:33] And the actual answer. [17:20:12] Coren: looks like ldap is busted [17:20:28] Betacommand: No, it's DNS. I'm on it. [17:20:35] <^demon|away> Krinkle: What you said in -operations, probably ^^ [17:20:38] Ah, no, I see what you mean. DNS because LDAP.
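Editorial illustration of the dig header pasted at 17:16:56. This is a minimal sketch, assuming the header line always has the shape shown in the log; the function names are made up here and are not part of any Labs tooling. It pulls the status field out of dig output so that a REFUSED answer can be told apart from a normal one.

```python
import re

# Matches the header line as pasted in the log, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 64297"
HEADER_RE = re.compile(
    r"->>HEADER<<-\s+opcode:\s*(\w+),\s*status:\s*(\w+),\s*id:\s*(\d+)"
)


def parse_dig_header(output):
    """Return opcode, status and id from dig output, or None if no header is found."""
    for line in output.splitlines():
        match = HEADER_RE.search(line)
        if match:
            return {
                "opcode": match.group(1),
                "status": match.group(2),
                "id": int(match.group(3)),
            }
    return None


def is_refused(output):
    """True when the server answered the query with status REFUSED."""
    header = parse_dig_header(output)
    return header is not None and header["status"] == "REFUSED"


if __name__ == "__main__":
    sample = ";; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 64297"
    print(parse_dig_header(sample))
    print(is_refused(sample))
```

A check like this distinguishes "the server is up but refusing queries", which is what the log goes on to describe, from a timeout where no header comes back at all.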
[17:20:58] error: sge_gethostbyname failed hmm, that occur each time you play with dns ;) [17:21:35] I am getting 502 Bad Gateway everywhere [17:21:56] ... on tool labs [17:22:13] * Coren kicks LDAP [17:22:27] known pb rillke, Coren is working on it [17:24:24] well, even the log server is down ;) [17:24:39] andrewbogott: Hm. dnsmasq is up, but refusing queries. [17:25:31] andrewbogott: It's not just not listening; it's actually answering every query with REFUSED [17:25:52] Coren: that's on virt1000? [17:26:05] labnet1000 [17:26:13] Oh -- did you restart dnsmasq directly? [17:26:18] It's a slave of nova-network. [17:26:26] So if you kill it and then restart nova-network it should shape up. [17:26:29] * Coren facepalms. [17:26:37] * andrewbogott shrugs [17:26:41] it's not like that's obvious [17:26:52] Oh, kay [17:27:06] It's a labs problem, not just a limn problem. [17:27:21] Coren, did that do it? [17:27:59] andrewbogott: It did. And now I really think we should axe the init script for it if using it is known to always break it. :-) [17:28:18] Yes, yes we should :) [17:28:21] * Coren reiterates his desire to have a real DNS server within labs. [17:28:26] Let's see, I guess the nova-network role can just rm it. [17:28:43] Will that upset dpkg when/if dnsmasq is updated? [17:28:54] * anomie "grr"s over getting a generic 500 error page instead of one that says what actual error his script is trying to report [17:29:18] andrewbogott: It shouldn't; though any update will put it back only to have it removed by puppet. [17:29:29] that seems fine… I'll write a patch right now [17:29:37] anomie: yeah, would be nice to get rid of it :) Coren wants it in for preventing everyone from contacting labs folks when a tool goes down [17:30:13] YuviPanda: Makes it basically impossible for me to debug my tool without rewriting it to work around the generic error page, though. [17:30:14] YuviPanda: I need to find some polite way of making it configurable. 
I certainly demand it be there by default though. [17:31:30] something ate my prox?! [17:31:38] *proxy [17:32:01] * anomie comments out all the calls to header(), so it now returns errors as 200-status [17:33:48] Ideally, I'd want the generic 500 to only be output if the tool itself just gave a generic 500. Not obvious. [17:34:15] Coren: yeah, so the ideal way is to move the 500 generic error message to the default lighty confi [17:34:15] *config [17:34:17] Coren: so people can override if they so chose [17:34:28] YuviPanda: That'd work for 500. [17:34:40] (Since that one does always come from lighty) [17:35:01] Coren: yeah, and it should also work for 404s generated by the tool. we can catch 404s that aren't hitting any tool [17:35:25] YuviPanda: Sounds good. Thank you for volunteering. :-P [17:36:35] Coren: :D if I do it, I'll probably also clear up the way the config file is created, etc. [17:37:48] wow, diamond really hates it when dns goes down [17:37:57] yeah [17:38:00] I just got the spam [17:39:28] andrewbogott: heh, ganglia is also spamming :) [18:04:23] Did something break with dns today? wdq.wmflabs.org doesn't resolve anymore from outside of labs [18:04:29] Coren: what are these diamond messages [18:04:31] o.O [18:04:39] I received like 400 emails [18:04:56] petan: DNS broke briefly. Diamon gets very noisy if it does. [18:05:04] I wonder why. [18:05:10] multichill: There was an outage, but it should be fixed since. [18:05:11] those look almost like sudo messages [18:05:31] YuviPanda: They are. It's the sudo to puppet afaict. [18:05:41] right. why would dns failing affect that? [18:05:51] I also got a number of email from other people [18:05:56] I guess all sudo was being affected [18:05:57] Coren: no manual page for diamond. what is it? [18:06:06] Coren: If you guys break something or something collapses, could you send an email to labs-l list? 
[18:09:44] Coren: Same as last time, dig wdq.wmflabs.org @labs-ns0.wikimedia.org doesn't return a result [18:10:11] &replag [18:10:14] @replag [18:10:20] meh bota is down :( [18:10:42] we really need some "I will keep your service up" daemon [18:11:01] I feel like I will have to write some because nobody else can do that :P [18:11:22] !log local-heritage Yeah, dns broken again so 50% chance you can't use the api [18:11:23] local-heritage is not a valid project. [18:11:36] !log tools.heritage Yeah, dns broken again so 50% chance you can't use the api [18:11:36] tools.heritage is not a valid project. [18:11:56] !log bots hi [18:11:58] Logged the message, Master [18:12:03] interesting [18:12:13] I guess tool names are not valid [18:12:55] petan: https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:Local-heritage/SAL&action=history [18:13:07] hmm [18:13:24] If we had LDAP problems, maybe the bot can't find the group [18:14:17] multichill: I think -- unrelated -- the service groups have been revamped recently, so something may have changed. [18:15:00] In the last 4 days? [18:16:00] Maybe. [18:16:06] YuviPanda: do you know of a daemon that would watch system for certain conditions and if they are not met, did something, flexible, scalable with simple per-user, cron-like interface [18:16:11] which means not puppet :P [18:16:24] service groups were renamed in the web interface but it shouldn't matter to running tools. [18:16:26] I guess there is no such a thing [18:16:45] multichill: Wait, the bot looks in LDAP directly for groups? [18:17:01] Coren: That's my assumption. I don't know where the code is [18:17:10] morebots, isn't it? [18:17:13] Oh, if it relied on the old name /and/ queried ldap directly... [18:17:15] yes [18:17:19] then I just broke that half an hour ago [18:17:21] multichill: If it does, it might have been using the pre-eqiad scheme. 
[18:17:48] andrewbogott: Would be nice if you could unbreak it [18:18:20] multichill: I can't really unbreak it -- we'll need to update the tool. [18:18:57] So do that! :P [18:19:00] What tool are we talking about? [18:19:19] andrewbogott: https://wikitech.wikimedia.org/wiki/Special:Contributions/Labslogbot [18:19:40] ah petan, isn't that one of your bots? [18:19:43] Ah, so, petan's bot. Sorry petan :( [18:19:49] lemme find the docs for this change... [18:19:53] which one [18:20:09] tools. makes more sense to me than local- [18:20:23] andrewbogott: Old LDAP tree and new one probably helps petan a lot [18:21:50] labs-morebots is NOT my bot, my bot is wm-bot [18:21:50] I am a logbot running on tools-exec-03. [18:21:50] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [18:21:50] To log a message, type !log . [18:21:59] labs-morebots: thanks [18:22:00] I am a logbot running on tools-exec-03. [18:22:00] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log. [18:22:00] To log a message, type !log . [18:22:03] omg [18:22:08] petan: Here's the context… http://permalink.gmane.org/gmane.org.wikimedia.labs/1652 [18:22:12] (from ages ago) [18:22:20] opening [18:22:22] Lemme find you an example of the new schema [18:22:49] ok I know this [18:22:59] I just don't know how it makes labs-morebots mine :P [18:23:32] um... [18:23:43] (20:23:06) ah petan, isn't that one of your bots? [18:23:43] Sorry, I'm in a meeting so not fully attentive. [18:23:44] (20:23:08) Ah, so, petan's bot. Sorry petan :( [18:23:51] these 2 lines I don't get ^ [18:23:56] ok no problem [18:24:06] So, the problem is with labs-morebots? 
[18:24:28] I don't even know what the problem is I just know you 2 guys pinged me in same time so I got curious what's up :P [18:24:39] In that case I will fix it myself -- after the meeting :) [18:24:47] good [18:25:38] petan: https://wikitech.wikimedia.org/w/index.php?title=User:Labslogbot&action=history [18:25:52] You created the page [18:25:52] hmm? [18:26:17] that is interesting... [18:26:24] I don't even remember that :) [18:26:26] my memory is getting worse [18:26:42] oh true [18:26:48] I did create it because it was a redlink [18:26:59] other than that I don't have much common with the bot [18:27:11] If it isn't you, who runs the bot and can update the code? [18:27:23] as andrewbogott said, he will [18:27:30] Missed that part [18:27:33] multichill, you're talking about the !log bot, right? [18:27:38] if so, that's me [18:27:41] yup [18:27:44] 'k [18:28:04] andrewbogott: Probably good to update https://wikitech.wikimedia.org/wiki/User:Labslogbot to point to who maintains it [18:28:12] (linkie to service group?) [18:28:25] hi cloudy valhallasw [18:28:36] or are you more into hot air? [18:28:51] I'll be in the clouds again tomorrow [18:29:01] Even better [18:31:11] ATM, "local-" is hardcoded in /usr/lib/adminbot/adminlogbot.py (courtesy of package adminbot), so https://git.wikimedia.org/blob/operations%2Fdebs%2Fadminbot.git/master/adminlogbot.py#L224 needs to be fixed, repackaged and uploaded, and then Puppet should update it on the hosts. [18:31:37] ideally that shouldn't be a package :) [18:31:39] (And afterwards morebots' labs bot restarted.) [18:31:43] and while someone's at it, should also build it for trusty :) [18:32:20] scfc_de: that's so you can do !log pywikibot instead of !log local-pywikibot [18:32:33] or is the issue we have to do !log tools.pywikibot now? [18:32:48] valhallasw`cloud: The issue is that at the moment apparently none of that works :-). [18:32:59] !log local-pywikibot meh? [18:33:00] local-pywikibot is not a valid project.
[18:33:06] !log tools.pywikibot meh? [18:33:07] tools.pywikibot is not a valid project. [18:33:09] I see [18:33:33] is LDAP functional? :-p [18:34:33] valhallasw`cloud: I need to fix that, will as soon as I'm out of my current meeting [18:35:51] Coren: BTW, thanks for the restart :-). [18:37:50] !log local-tools.heritage does this work? [18:37:50] local-tools.heritage is not a valid project. [18:37:54] hehe [18:38:07] It would be nice if we could use a proper structure for the service group's wikitech presence. https://wikitech.wikimedia.org/wiki/Nova_Resource:Local-heritage doesn't really exist and can only be confusing :-). [18:39:08] scfc_de: Probably best to move everything in https://wikitech.wikimedia.org/wiki/Category:SAL to the right new name [18:39:32] Only 10 [18:39:58] And I'm pretty sure valhallasw`cloud and I are maintainer of 9 of them ;-) [18:40:52] the bot supports redirects [18:41:02] so just move the pages to somewhere sensible, and the bot will just edit there [18:41:12] (as long as the redirect stays intact, anyway) [18:41:33] multichill: There are some tools that even use the "main" page, e. g. https://wikitech.wikimedia.org/wiki/Nova_Resource:Xtools. "Someone" would need to design a resource type "service group/tool" that fits that use case. [18:41:42] (That is, wiki template.) [18:44:27] Nova Resource:Tools.heritage/SAL or Nova Resource:Tools-heritage/SAL is all fine by me. I don't really care [18:53:54] !log testlabs this is a test message [18:53:55] Logged the message, dummy [18:54:16] multichill: So that I understand the problem… you used to be able to log with instead of ? [18:54:28] !log morebots this is a test message [18:54:28] morebots is not a valid project. [18:54:30] No local- [18:54:37] For things in toollabs [18:54:47] !log tools.morebots this is a test message [18:54:47] tools.morebots is not a valid project. [18:54:55] !log tools-morebots this is a test message [18:54:55] tools-morebots is not a valid project. 
[18:55:14] andrewbogott: You did see https://git.wikimedia.org/blob/operations%2Fdebs%2Fadminbot.git/master/adminlogbot.py#L224 ? [18:55:21] And earlier comments? [18:55:36] andrewbogott: local- worked, and at some point I submitted a patch to /also/ allow to work (namely by falling back to local- if does not exist) [18:57:31] wow, lots of people have edited and repackaged this since I last looked at it :) [18:57:42] andrewbogott: Is project_rdn = "ou=projects" still current? [18:58:24] multichill: yes, but tools are no longer under the project ou. [19:00:08] Now everything is under ou=servicegroups,dc=wikimedia,dc=org [19:00:51] So, for example, "cn=bastion.rezabot,ou=servicegroups,dc=wikimedia,dc=org" [19:01:11] Ah, so easy to fix? [19:01:25] should be, I just have to remember how to package it [19:01:47] andrewbogott: can you package it for trusty as well when you are doing it? :D We're setting up a trusty host, and it's one of the few packages missing... [19:02:04] YuviPanda: the same package should work just fine on Trusty [19:02:13] andrewbogott: right, but wasn't marked as such, I think? [19:02:16] apt-get couldn't find it [19:02:28] sure [19:02:32] andrewbogott: ty [19:06:47] Tool Labs tools / X!'s tools: LanguageTool removes article from watchlist - https://bugzilla.wikimedia.org/67991#c1 (tekftu4q) see the enwiki village pump entry for instructions on duplicating bug. Test cases are listed there. It's repeatable. [19:22:10] YuviPanda you have me on IGNORE D: [19:22:57] !say YuviPanda, why u ignore petan? :/ [19:22:57] YuviPanda, why u ignore petan? :/ [19:23:18] oh, I don't have you on ignore, petan :) [19:23:23] lol [19:23:24] ok [19:23:25] just doing mobile apps stuff atm [19:23:26] just a mental ignore?
:-p [19:23:30] valhallasw`cloud: :) [19:23:42] anyway before you responded to my question I coded that thing so nvm [19:24:06] for some reason programming is easier for me than using google [19:24:21] yes, but please consult labs-l and other people before deploying something on all the nodes. [19:24:24] petan: did you see my note on websockets/socket.io? [19:24:47] YuviPanda I am not deploying anything I just /coded/ something [19:24:53] valhallasw`cloud: no [19:24:54] cool [19:25:43] petan: basically, doing 'just websockets' is not possible - you need to speak socket.io /over/ websockets [19:26:48] does stream.wmflabs.org return any feeds at all? [19:26:59] ori is currently dealing with some prod stuff, I'll poke him when he's a bit more free to ask about the websocket issue [19:27:28] gifti: yeah, but for beta labs [19:27:37] which is not actively edited, so you need to make some edits yourself [19:27:45] ah, yes [19:28:31] Wikimedia Labs / Infrastructure: Transition service groups to new globally unique names and UIDs - https://bugzilla.wikimedia.org/58997#c14 (Andrew Bogott) All groups are renamed and the GUI now reflects the new - scheme. Anything left to do here? [19:30:41] wee [19:36:04] So yeah, good news springle is going to send to labs-l: we are going to switch to mariadb 10 earlier than expected to fix a number of issues; this will allow us to get rid of federation. [19:38:00] Hooray! [19:39:46] w00t [19:40:00] * YuviPanda awaits labs-l email for more details [19:41:35] Coren: hmm, next steps for the graphite machines again? :) [19:55:53] scfc_de: are these IPs tools labs execution nodes? * 10.68.16.37 * 10.68.16.32 * 10.68.16.31 * 10.68.17.64 * 10.68.16.36 * 10.68.16.35 [19:56:25] multichill and/or valhallasw`cloud, review please?
https://gerrit.wikimedia.org/r/#/c/146180/ and https://gerrit.wikimedia.org/r/146181 [20:00:16] andrewbogott: one comment on https://gerrit.wikimedia.org/r/#/c/146180/ , 181 looks good [20:00:23] thanks [20:02:19] YuviPanda: wikibugs doesn't report to huggle :/ [20:02:35] hmm? [20:02:38] the config has it, no? [20:02:45] it worked before [20:02:52] now that you mention it, I don't see wikibugs anywhere... [20:03:08] hmm, I do see on -dev a while ago [20:03:29] I do see it [20:03:35] it's online [20:03:38] wikibugs has been idle 24mins 16secs, signed on Sun Jul 06 06:45:19 2014 [20:03:43] and no ping reply [20:03:50] just it's quiet [20:04:27] !ping [20:04:27] !pong [20:04:38] at least some bot works! :)) [20:04:39] valhallasw`cloud: 6 [20:04:40] err [20:04:41] & [20:04:42] gah [20:04:43] ^ [20:05:31] what a brilliant programmer could it be to write wm-bot... [20:05:37] wm-bot <3 [20:06:09] se4598: -09..-13 should get NAT IPs, I think. For the fixed ones, you should be able to see them if you go to https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools and then look at the individual instance pages. [20:06:10] try ctcp version [20:06:20] hm. [20:07:43] rebooting... [20:07:54] nothing clear in error log; basically just ping/pongs to wilhelm.freenode [20:08:20] last mail received was From wikibugs-l-bounces@lists.wikimedia.org Mon Jul 14 19:38:45 2014 [20:08:40] so something on the wikibugs-l end is malfunctioning, then. [20:10:30] and I have no time to find the bug at the moment :-p [20:39:34] !log tools manually deleted /var/lib/apt/lists/lock, forcing apt to update [20:39:37] Logged the message, dummy [20:39:52] andrewbogott: you should specify on which host :) [20:39:59] !log on tools-login [20:40:00] on is not a valid project. 
[20:40:08] !log tools on tools-login [20:40:11] Logged the message, dummy [20:42:15] andrewbogott: I think it needs to be updated on all hosts [20:42:19] all -exec* hosts [20:42:28] valhallasw`cloud: it will be, just, on tools-login apt was actually broken [20:42:32] ahhhhh [20:42:39] in various ways :( [20:43:52] It leaked a lock file and hadn't updated apt since February [20:44:00] oh wow [20:44:09] it keeps happening to puppet now and then [20:44:25] puppet should error out if apt-get update fails. I don't know why it didn't here [20:44:42] andrewbogott: don't worry, puppet has been failing on all hosts forever [20:44:57] libvips isn't available, and is still in list of packages to try to install, and I've been trying to get it removed... [20:45:28] Ah, so that's you is it? I was just staring at that [20:45:32] Shoudl I ignore it for now? [20:45:43] Or, do you have a patch that removes it? [20:46:10] It's a user-requested package, removing it wouldn't be a proper "solution" :-). [20:46:35] scfc_de: it's not been installed in forever [20:46:43] scfc_de: there's an RT ticket, we can add it back when it's reinstalled. [20:47:07] scfc_de: aaand tools-exec-12 now runs puppet with 0 failures. YAY [20:47:17] Compiling it and uploading takes ... 5 minutes? [20:47:41] (Wall clock time.) [20:47:44] hasn't been done, and my deb skills are... nonexistent :( [20:51:42] andrewbogott: What's the range for instances again? /20? [20:52:41] Coren: I think this page is correct: https://wikitech.wikimedia.org/wiki/IP_addresses So… 10.68.16.0/21 [21:15:16] Coren: btw, the trusty exec node now runs puppet with no errors :D want to put that into a queue? [21:21:05] YuviPanda: It'll take a bit before I add the resources to allow people to select trusty; I don't want bots to fail randomly for mysterious (to the enduser) reasons. [21:21:38] Coren: right, but if we don't let it go there by default, and only have the tool authors selectively send it... 
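Editorial illustration of the instance range quoted at 20:52:41. A minimal sketch using Python's standard ipaddress module: it takes the 10.68.16.0/21 figure from the log at face value (the log itself only says "I think this page is correct") and checks a few of the addresses that appear earlier in the channel against it. The constant and function names are made up for the example.

```python
import ipaddress

# Range as quoted in the log; treat it as an assumption, not an authoritative value.
INSTANCE_RANGE = ipaddress.ip_network("10.68.16.0/21")


def in_instance_range(address):
    """True if the dotted-quad address falls inside the quoted instance range."""
    return ipaddress.ip_address(address) in INSTANCE_RANGE


if __name__ == "__main__":
    # Addresses taken from earlier messages in the log.
    for candidate in ["10.68.16.37", "10.68.17.64", "10.68.16.16", "208.80.155.131"]:
        print(candidate, in_instance_range(candidate))
    print(INSTANCE_RANGE[0], INSTANCE_RANGE[-1], INSTANCE_RANGE.num_addresses)
```

A /21 spans 2048 addresses, here 10.68.16.0 through 10.68.23.255, so the 10.68.16.x and 10.68.17.x addresses listed at 19:55:53 fall inside it while the public 208.80.155.131 does not.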
[21:21:44] Coren: also no webgrid for trusty yet, just exec node
[21:23:35] YuviPanda: Exactly, which needs me to set up a resource so that people can select against it.
[21:23:56] Coren: indeed, so was just poking for that :)
[21:24:00] YuviPanda: Also, I want to notify labs-l first. I'll do that shortly.
[21:24:05] Coren: cool!
[21:24:16] I really, *really* should read up on gridengine
[21:24:29] Coren: do you have any recommended books / tutorials for reading up on SGE, other than the man pages?
[21:26:04] YuviPanda: Not really; there used to be a decent one for SGE ages ago, but it's woefully out of date and probably out of print.
[21:26:09] heh
[21:35:42] !log testlabs.testlabstestgroup this is a test log message
[21:35:43] testlabs.testlabstestgroup is not a valid project.
[21:35:46] dammit
[21:44:20] !log testlabs.testlabstestgroup this is a test log message
[21:44:21] testlabs.testlabstestgroup is not a valid project.
[21:44:27] hm
[21:52:07] Coren, is there a salt master for tools? I want to run 'apt-get install adminbot' on all the exec nodes
[21:53:49] andrewbogott: there isn't
[21:53:53] you've to use dsh or pdsh
[21:53:57] 'k
[21:54:11] andrewbogott: want me to run that for you on the exec nodes?
[21:54:22] sure, if you also tell me how you did it
[21:54:50] andrewbogott: sure.
[21:55:13] andrewbogott: cat exec-nodes | pdsh -w - 'sudo apt-get install adminbot'
[21:55:21] andrewbogott: in the file exec-nodes, I've a list of all the exec nodes
[21:55:24] and 'exec-nodes' is...
[21:55:26] ok :)
[21:55:37] thank you!
[21:56:54] andrewbogott: it was already latest on all nodes
[21:57:07] really? It wasn't on the one I checked...
[21:57:29] andrewbogott: hmm, maybe I should run an update first?
[21:57:36] apt-get is stuck
[21:57:50] same as it was on tools-login. I wonder what that's about...
[21:58:26] andrewbogott: ah, with a manual forced update it's installing newer version now
[21:58:33] great
[21:59:31] andrewbogott: ah, hmm, it also is failing on a few hosts
[21:59:34] with the lock file
[21:59:38] yeah
[21:59:40] which ones? I'll fix
[22:00:28] andrewbogott: I'm fixing on -06, the other ones were 01 and -03, but I don't know if they're lock related
[22:00:45] andrewbogott: ah, -06's apt seems more borked. can you take a look?
[22:00:50] andrewbogott: just removing lock file didn't do anything
[22:00:54] hm, ok.
[22:00:56] I'll look
[22:01:37] actually, 06 looks ok to me now
[22:02:03] E: Problem renaming the file /var/cache/apt/srcpkgcache.bin.MNAIcW to /var/cache/apt/srcpkgcache.bin - rename (2: No such file or directory)
[22:02:04] E: Problem renaming the file /var/cache/apt/pkgcache.bin.xVCEHB to /var/cache/apt/pkgcache.bin - rename (2: No such file or directory)
[22:02:04] E: The package lists or status file could not be parsed or opened.
[22:02:06] andrewbogott: ^ I got this
[22:02:20] during the update
[22:02:23] or the install?
[22:02:27] -01 is ok
[22:02:29] andrewbogott: during update
[22:02:54] -03 is fine too
[22:03:21] YuviPanda|zzz: hm, now it's happy. I don't know why it wasn't :/
[22:03:27] andrewbogott: heh
[22:03:34] !log testlabs.testlabstestgroup this is a test log message
[22:03:36] Logged the message, dummy
[22:03:38] woo, finally
[22:04:04] valhallasw`cloud: can you confirm that the bot is doing what you'd expect now?
[22:10:06] !log pywikibot testmessage
[22:10:07] pywikibot is not a valid project.
[22:10:10] !log tools.pywikibot testmessage
[22:10:12] Logged the message, Master
[22:11:04] !log tools.pywikibot testmessage
[22:11:16] hrm.
[22:11:43] !log tools.pywikibot testmessage 2?
[22:11:50] apparently the redirect breaks something
[22:11:51] !log tools.lolrrit-wm is everything ok?
[22:11:53] Logged the message, Master
[22:11:59] right
[22:12:06] * YuviPanda|zzz goes off to sleep for real
[22:12:38] andrewbogott: so that's broken (but wasn't in use, so doesn't matter), and the 'pywikibot' instead of 'tools.pywikibot' trick doesn't work, but that's also OK I think
[22:12:48] at least tools.pywikibot makes sense, while local-pywikibot was weird
[22:13:54] So, I didn't follow… why did tools.pywikibot work, and then stop working?
[22:15:46] andrewbogott: I placed a redirect on the SAL page
[22:16:07] I *think* that used to work, but I don't think I ever tested it in deployment
[22:16:10] ah, ok.