[00:18:15] drdee, I've learned how to use pig macros and modularized my geocoding script; you can see the links to the code here: http://www.mediawiki.org/wiki/User:LouisDang
[01:28:24] louisdang, that's cool
[01:28:39] thanks
[01:28:48] seems like useful stuff!
[01:30:44] yeah, looking forward to learning more
[01:31:05] about hadoop and pig
[01:45:42] louisdang, thanks so much! ping me tomorrow :) i have some more things up my sleeve
[01:46:22] ok talk to you tomorrow.
[12:37:58] hey average_drifter
[12:46:05] hey drdee
[12:46:18] morning
[12:46:31] do you have a solution for the line break stuff ?
[12:47:30] yes, I made a .gitattributes
[12:47:37] I'm trying it out now
[12:47:42] I'm planning to try it out this way
[12:48:06] echo -e "ABC\nDEF\nGHI\n" > somefile.txt
[12:48:35] cp somefile.txt somefile.win.txt
[12:48:42] cp somefile.txt somefile.unix.txt
[12:48:49] todos somefile.win.txt
[12:49:04] (todos is from the tofrodos package which provides commands fromdos and todos)
[12:49:12] then add them both to a git repo
[12:49:18] git add somefile.win.txt somefile.unix.txt
[12:49:32] and then push and clone it separately and diff them to see if they are identical
[12:49:36] if it worked, these two should be identical
[12:53:28] user@garage:~/gitattribute_test$ git add somefile.win.pl README.md
[12:53:28] warning: CRLF will be replaced by LF in somefile.win.pl.
[12:53:33] The file will have its original line endings in your working directory.
[12:53:33] oh apparently it worked
[12:53:39] this is the kind of thing to expect if it actually worked
[12:53:41] that message above
[12:53:52] drdee: does Erik use the windows github client for git ?
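The check described above (two copies of the same text, one with CRLF and one with LF endings, should become identical once line endings are normalized) can be sketched in Python. This is only an illustration of the idea, not what git or tofrodos actually run; the helper names `to_lf`, `to_crlf` and `same_after_normalization` are made up for this sketch.

```python
def to_lf(data: bytes) -> bytes:
    """Normalize CRLF line endings to LF (roughly what `fromdos` does)."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Convert LF line endings to CRLF (roughly what `todos` does).

    Normalizing to LF first avoids turning an existing CRLF into CR CR LF.
    """
    return to_lf(data).replace(b"\n", b"\r\n")


def same_after_normalization(a: bytes, b: bytes) -> bool:
    """True if the two byte strings differ only in their line endings."""
    return to_lf(a) == to_lf(b)


# Same content as `echo -e "ABC\nDEF\nGHI\n"` (echo appends one more newline).
unix_copy = b"ABC\nDEF\nGHI\n\n"
win_copy = to_crlf(unix_copy)

print(unix_copy == win_copy)                          # raw bytes differ
print(same_after_normalization(unix_copy, win_copy))  # equal once normalized
```

A raw byte comparison of the two working-tree files reports a difference, while the normalized comparison does not, which is the same split seen between the local files and the copies stored in the repository.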
[12:54:07] drdee: or perhaps just the git that comes with Linux on stat1
[12:54:35] don't know, i think he uses Tortoise GIT
[12:57:42] ok, so although I have added the .gitattributes file, with the proper line-endings
[12:57:48] and committed everything
[12:57:57] meld is still confused https://raw.github.com/wsdookadr/gitattribute_test/master/meld_diff_between_local_files.png
[12:58:38] I'll try to clone it and see if diff still says they are different
[12:59:29] no, it seems that they're not, so inside the repo they are identical
[12:59:34] but locally, for me in particular, they are different
[13:07:50] drdee: https://github.com/wsdookadr/gitattribute_test
[13:07:59] drdee: in the README.md you can find a summary of what I did
[13:08:06] it worked I think
[13:08:44] great!
[13:41:44] good morning fine peoples :)
[13:42:29] mornings
[13:52:45] ha this is pretty nice,
[13:52:50] i wrote snoopy
[13:52:57] not snoopy but sqoopy
[13:53:20] and it generates the custom import statements for sqoop
[13:54:14] which was necessary because the mediawiki tables use varbinary quite a lot and so all those columns need to be cast to char
[13:54:20] but that sucks to do manually
[13:54:34] so snoopy inspects the tables and adds the casting automatically
[14:26:53] milimetric ^^
[14:26:56] where is ottomata?
[14:27:10] hi drdee
[14:27:21] snoopy?
[14:27:34] I've no idea what you're talking about :)
[14:28:37] no it's sqoopy
[14:29:21] right, still no idea what you're talking about
[14:34:14] morning ottomata
[14:34:21] morning!
[14:34:24] i have some questions....
[14:34:37] 1) hive
[14:34:44] haha, yessssuhhhh
[14:34:51] i was able to run commands
[14:34:51] do I have answers?
[14:34:51] hmmmm
[14:34:51] maybe.
[14:34:58] but now i get access denied for user hive
[14:35:35] when did that change?
[14:35:41] is this on the hive cli?
[14:36:44] yes
[14:36:53] also, previously you owned the hive dir, and I changed that
[14:36:55] lemme try one thing
[14:36:58] so hive cli does start
[14:37:26] how about now?
[14:39:02] nope, still java.sql.SQLException: Access denied for user 'hive'@'127.0.0.1' (using password: YES)
[14:40:34] what are you doing? i want to try
[14:41:16] just enter hive
[14:41:29] and then 'show tables;'
[14:42:12] oohhhh i think i know what the problem is
[14:42:27] we are no longer using hive's internal metastore but the mysql metastore
[14:43:06] so is there a hive user in mysql?
[14:45:22] did you completely re-install mysql?
[14:45:51] i did, i didn't know you installed it!
[14:46:13] yes i had it all working
[14:46:21] check https://app.asana.com/0/828917834272/2108205677968
[14:46:35] with all the steps to set up hive, this needs to be puppetized as well
[14:46:36] haha, i did the other task
[14:46:41] np
[14:46:43] 'puppetize mysql 5.5'
[14:46:50] cause i need it for oozie
[14:47:04] sqoop, oozie and hive need mysql
[14:47:19] the asana task contains all the steps
[14:47:41] with oozie i also have a problem but i will bug you after hive is done
[14:47:42] :D
[14:48:10] yeah, i was working on oozie yesterday, got stuck at the mysql part then had to go
[14:49:18] k
[14:49:25] with oozie i get the following error when trying to run a job
[14:49:32] E0501: Could not perform authorization operation, User: oozie is not allowed to impersonate hdfs
[14:49:41] brb, have to pick up some stuff
[14:51:18] yeah i didn't finish that
[15:49:46] hey drdee
[16:15:19] hey louisdang
[16:15:40] what's up
[16:15:55] got anything for me to do?
[16:16:28] I'm thinking of writing unit tests for the pig macros
[16:16:34] paste me your link one more time
[16:16:58] http://www.mediawiki.org/wiki/User:LouisDang
[16:17:24] though I've made some changes on my laptop to make it faster
[16:27:37] yeah so you could expand the pig library
[16:27:45] for example functions like:
[16:28:06] is_mobile_site (recognizable if there is '.m.' in the domain name)
[16:28:29] is_desktop_site (if '.m.' is absent from the domain name)
[16:28:45] we need pig functions to count page views by project language
[16:28:57] like en.wikipedia.org is English obviously
[16:29:03] but we have about 280 languages :)
[16:29:12] ok
[16:29:18] is there a lookup table?
[16:29:27] this could also be parametrized, where you supply the language code to the pig function
[16:29:41] you don't need a lookup table
[16:29:41] ok
[16:29:57] the first part of the domain is the language code
[16:30:01] yes i see
[16:30:11] then there is an optional mobile indicator
[16:30:11] and then the project
[16:30:27] what is reportcard using now?
[16:30:33] to fetch its data
[16:32:15] the page view data from dumps.wikimedia.org
[16:34:11] are the parsing scripts available? it could help me get ideas on getting these metrics
[16:35:16] not sure if it's really helpful but here is the link:
[16:36:00] https://gerrit.wikimedia.org/r/gitweb?p=analytics%2Fwikistats.git;a=shortlog;h=HEAD
[16:36:04] best to use git clone
[16:36:14] ok
[16:36:23] like git clone http://gerrit.wikimedia.org/r/p/analytics/wikistats.git or something like that
[16:37:01] can I get added to the project's asana?
[16:37:14] yes i will invite you
[16:37:28] ok email is dangl@uw.edu
[16:37:30] ottomata how is hive doing?
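The domain layout described above (language code first, then an optional '.m.' mobile indicator, then the project) can be sketched as follows. This is Python rather than Pig, and only a rough illustration: `is_mobile_site` and `is_desktop_site` are the names suggested in the chat, while `WikiDomain`, `parse_wiki_domain` and `count_views_by_language` are hypothetical helpers invented for this sketch. It assumes every first label is a language code, so a host like www.wikipedia.org would be miscounted as language "www".

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class WikiDomain(NamedTuple):
    language: str
    is_mobile: bool
    project: str


def is_mobile_site(domain: str) -> bool:
    """Mobile sites are recognizable by '.m.' in the domain name."""
    return ".m." in domain.lower()


def is_desktop_site(domain: str) -> bool:
    """Desktop sites are the ones without '.m.' in the domain name."""
    return not is_mobile_site(domain)


def parse_wiki_domain(domain: str) -> Optional[WikiDomain]:
    """Split language[.m].project.tld into its parts; None if the shape differs."""
    parts = domain.lower().split(".")
    if len(parts) == 3:
        language, project, _tld = parts
        return WikiDomain(language, False, project)
    if len(parts) == 4 and parts[1] == "m":
        language, _mobile, project, _tld = parts
        return WikiDomain(language, True, project)
    return None


def count_views_by_language(domains: Iterable[str]) -> Counter:
    """Count one page view per domain, grouped by language code."""
    counts: Counter = Counter()
    for domain in domains:
        parsed = parse_wiki_domain(domain)
        if parsed is not None:
            counts[parsed.language] += 1
    return counts


views = ["en.wikipedia.org", "en.m.wikipedia.org", "de.wikipedia.org"]
print(count_views_by_language(views))
```

No lookup table is needed because the language code is read straight from the first label; restricting the count to one language would just mean filtering on `parsed.language`.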
[16:38:54] i'm working on oozie
[16:39:58] aight
[16:40:07] yeah, almost good with that
[16:40:08] then will do hive
[16:40:15] wanted to finish what I started yesterday
[16:41:49] didn't mean to bug you
[17:01:27] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[17:19:31] YES! I figured it out :)
[17:19:50] it's so cool, graphs are drawing in limn with d3 :)
[17:28:02] WOOOOOOOOOOOOOOOT
[17:29:31] ohh ottomata, the metastore also needs to be configured for hue (http://analytics1001.wikimedia.org:8888/beeswax/)
[17:30:16] ok cool
[17:30:20] i think I just finished with oozie
[17:31:40] i still have the impersonate error
[17:32:15] impersonate?
[17:33:06] btw, who are you logging in as?
[17:33:08] in hue?
[17:36:05] ottomata see also https://app.asana.com/0/828917834272/2108205677968
[17:36:08] in hue as hdfs
[17:50:18] heh, github quotes kanye
[17:50:18] https://github.com/blog/831-issues-2-0-the-next-generation
[17:50:24] (we're living in the future so the present is our past)
[18:44:19] back
[19:02:36] this is worth reading -- http://hadoop.apache.org/docs/r0.20.2/hdfs_design.html
[19:02:40] it's high-level
[19:02:52] importantly, i didn't know about staging. http://hadoop.apache.org/docs/r0.20.2/hdfs_design.html#Staging
[19:08:42] dschoon I thought you'd like to know that it took a little less than 10 minutes to add bar chart support :)
[19:08:52] :D
[19:08:58] d3 == awesome
[19:09:10] brb, lunch (man I'm more and more on West coast time)
[21:11:08] anyone know where I can find quim?
[21:11:11] on IRC?
[21:11:20] who what?
[21:14:48] I'm trying to talk to Quim Gil about Ohloh and I can't find his IRC whereabouts
[21:15:44] have you asked in #wikimedia-staff ?
[21:17:52] huh, I can't get in there, it says I need an invitation?
[21:18:23] ah, yeah.
[21:18:26] do you have a cloak?
[21:18:29] yep
[21:18:35] https://meta.wikimedia.org/wiki/IRC/Cloaks
[21:18:45] i know, I got one, whois me
[21:18:50] https://meta.wikimedia.org/wiki/IRC_channels
[21:20:43] ah
[21:20:49] chat up james forrester
[21:21:42] he's James_F
[21:21:47] jforrester@wikimedia.org
[21:21:49] k, cool
[21:21:50] thx
[21:22:17] where does he lurk usually?
[21:22:21] no idea.
[21:22:24] just PM?
[21:22:41] oh I didn't even realize you could do that :)
[21:22:54] /msg NAME MESSAGE
[22:06:59] hey
[22:07:05] do you guys know if ottomata will be back later?
[22:07:43] he messaged me and i didn't notice. i've taken the concept of "low-contrast" to needless extremes with my limechat theme :/
[22:07:52] haha
[22:08:06] he's sometimes back in the evening
[22:08:12] like, ~6-7p our time
[22:08:27] i'll shoot him an email
[22:08:34] word.