[01:55:14] Hey there, I need some help with ascii encoding.
[01:55:44] LukeDev: Hi! What's the trouble?
[01:56:00] I've been getting this error:
[01:56:01] UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 73826: ordinal not in range(128)
[01:56:12] Let me give you some context
[01:56:43] Good plan
[01:56:58] Seems like....Python?
[02:00:28] I've been converting html article text that I've gotten through the wikimedia api into python strings, then scraping it with beautiful soup, then isolating sentences and dates with nltk, then writing those strings to a json file.
[02:01:18] The error is definitely coming from writing the strings to a .js file
[02:01:31] not anything else in the process.
[02:01:46] LukeDev: Do you happen to have delicious code for me?
[02:02:02] haha
[02:02:09] pastebin?
[02:02:12] Works for me
[02:03:02] just remember that you have a completely different file system than me
[02:03:34] Um, K
[02:03:59] http://pastebin.com/eyKCjdPf
[02:04:29] Does it give you a line number?
[02:04:30] if you search for "error" you can find the line the error occurs on.
[02:04:52] 72
[02:05:08] actually 73
[02:05:14] Hm
[02:05:59] LukeDev: Add an argument to your call to open
[02:06:17] LukeDev: open( 'file', encoding='utf-8' )
[02:06:26] thanks!
[02:06:40] LukeDev: That's my first instinct, it may not work
[02:08:02] LukeDev: Are you using 8-space indents? :)
[02:08:18] Er...tab characters?
[02:08:31] for the python code yes....
[02:08:37] Iiiinteresting
[02:09:22] for the python code yes....
[02:11:26] not a valid parameter
[02:11:43] now looking up the open syntax
[02:25:42] @marktraceur
[02:26:16] I'm looking through the documentation but can't seem to find something that will open it to write in utf-8
[02:28:26] oh hey
[02:28:30] LukeDev: open('whateverfile.txt', 'w', encoding='utf-8')
[02:28:36] * marktraceur said that before
[02:31:03] sry, the encoding parameter is invalid!
[02:31:41] Looking to hire a dev for some contract work. PM me if interested.
[02:31:58] Really
[02:32:08] really
[02:32:10] Hmmm
[02:32:11] really
[02:32:51] to ask about the conceptual underpinnings of this....
[02:33:29] what will the wikipedia article be written in when it is returned
[02:33:30] Funny, 'cause it's in the python docs
[02:33:37] Oh
[02:33:38] from the simplemediawikiapi
[02:33:42] I didn't read it properly
[02:33:58] LukeDev: import codecs
[02:34:13] LukeDev: codecs.open('filename.txt', 'w', encoding='utf-8'
[02:34:14] )
[02:34:20] *slaps face
[02:34:32] LukeDev: Hopefully mine, because I should have read better
[02:34:52] gamebox: Hi, what sort of contracting work did you mean?
[02:34:53] *own
[02:35:13] LukeDev: You were provided with misinformation by someone who should know better :)
[02:35:46] hahah thanks so much for your help
[02:36:47] marktraceur, I own/run Leaguepedia.com -- 150k-200k pageviews a day so I'm running off of a multi-node environment (2 web heads, 1 db). I'm not having many issues except for some caching problems through memcache/varnish, and I need some help optimizing performance
[02:37:36] I manage a staff of 8-10 and have a full time job, so the amount of time left to maintain and diagnose issues with the site can sometimes be limited.
[02:38:14] it fucking works!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[02:38:21] Thanks bro!
[02:38:29] LukeDev: My pleasure!
[02:39:07] gamebox: I feel you may have better luck posting the offer somewhere else....have you tried, e.g., free software job postings? I'm sure I've seen such listings somewhere.
[02:39:17] ok
[02:39:57] any specific suggestions?
[02:40:56] gamebox: If it were my posting I'd use the fsf.org member forums
[02:41:14] gamebox: http://www.fsf.org/resources/jobs/
[02:41:31] Hm, price may be a problem
[04:38:17] Putting this question here for whenever someone has a chance to answer.. Thank you! What's the best way to mark certain pages to not be cached.
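[Editor's note] The fix that finally worked above — `codecs.open(..., encoding='utf-8')` — can be sketched as below. This is a minimal sketch with a hypothetical filename, not LukeDev's actual script. On Python 2 the built-in `open()` has no `encoding` parameter (which is why the first suggestion failed), while `codecs.open()` and `io.open()` both accept one; on Python 3, `io.open()` is the built-in `open()`.

```python
# -*- coding: utf-8 -*-
# Minimal sketch of the codecs.open fix (hypothetical filename).
# io.open() is the 2/3-portable equivalent of codecs.open(); on
# Python 2, plain open() would try to encode to ASCII and raise
# UnicodeEncodeError on u'\xa0' (the non-breaking space).
import io

# A string containing the offending character u'\xa0'
text = u"sentence one.\xa0sentence two."

# Writing with an explicit UTF-8 encoding avoids the ASCII codec entirely
with io.open('sentences.json', 'w', encoding='utf-8') as f:
    f.write(text)

# Reading it back with the same encoding round-trips the text intact
with io.open('sentences.json', 'r', encoding='utf-8') as f:
    assert f.read() == text
```

The key point is that the encoding is chosen at the file-object level, so every `write()` of a unicode string is transparently encoded as UTF-8.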
IE: time sensitive updates for certain pages
[08:24:10] hello
[08:38:01] hashar: Hi!
[08:45:01] marktraceur: working on https://gerrit.wikimedia.org/r/#/c/8924/ :-D
[08:45:14] I am going to introduce a trivial assert method to test HTML content
[08:45:25] would just add a newline after each >
[08:45:30] would make diff easier to read
[08:45:40] would -> will
[08:50:20] *nod* it would
[08:50:43] hashar: Holiday in the US today, I'm thinking you'll be rather lonely in here :)
[08:52:52] marktraceur: I am used to working alone every morning :-)
[08:52:57] it is very quiet in Europe
[08:53:22] the german folks are in #wikimedia-wikidata and the i18n team is in #mediawiki-i18n
[08:53:35] it improves past 2pm (noon GMT) when east coasters join the fun
[09:01:03] Hm, I suppose
[10:05:25] New patchset: Hashar; "Ext-TitleBlacklist ant less verbose" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22439
[10:05:43] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22439
[13:04:06] New patchset: Siebrand; "Add jobs for running tests on 6 extensions." [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22454
[13:24:53] New patchset: Hashar; "(bug 39765) jobs for running tests on 6 extensions." [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22458
[13:25:14] Change abandoned: Hashar; "wrong change" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22458
[13:25:53] New patchset: Hashar; "(bug 39765) jobs for running tests on 6 extensions." [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22454
[13:26:54] New review: Hashar; "Thanks Siebrand!" [integration/jenkins] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/22454
[13:26:54] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22454
[13:38:08] New patchset: Hashar; "adapt extension jobs" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22459
[13:38:26] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22459
[13:39:08] Hey hashar, about?
[13:39:28] hello Sam :)
[13:39:42] So on fenari we have local commits in php-1.20wmf10
[13:39:48] so we get "Merge made by recursive"
[13:40:04] What's the easiest way to remove them? (superseded by stuff committed via gerrit)
[13:40:12] let me look at it
[13:40:19] I guess some local hacks were there
[13:40:23] then someone did a git pull
[13:40:30] sounds about right
[13:40:34] which triggered a merge of remote wmf10 into the local one
[13:40:40] so you need to rebase the live hacks
[13:40:46] by doing: git pull --rebase
[13:40:53] on fenari in the dir of wmf10
[13:40:54] I think
[13:40:58] need to verify
[13:41:29] * 84469fd - (HEAD, wmf/1.20wmf10) Merge branch 'wmf/1.20wmf10' of ssh://gerrit.wikimedia.org:29418/mediawiki/core into wmf/1.
[13:41:57] Reedy: may I pull ?
[13:42:04] sure
[13:42:22] includes/Revision.php: needs update
[13:42:22] includes/StreamFile.php: needs update
[13:42:23] maintenance/syncFileBackend.php: needs update
[13:42:24] bahhh
[13:42:28] live hacks must die
[13:42:35] anyway
[13:42:37] git status is nice too
[13:42:50] # Your branch is ahead of 'origin/wmf/1.20wmf10' by 13 commits.
[13:43:30] * 1234567890 - Revert "LOCAL SECURITY FIX FOR BUG XXXX - DO NOT PUSH TO GERRIT YET" (3 days ago)
[13:43:32] dohhh
[13:43:58] reverts a change made by Chad 5 days ago
[13:44:19] git log --decorate --oneline origin/wmf/1.20wmf10..wmf/1.20wmf10
[13:44:31] the live hacks I have no idea Reedy :(
[13:45:04] been made by Aaron I guess
[13:45:05] Hmm, the live hacks look like it's most likely going to be Aaron
[13:45:46] and he is not there today
[13:45:55] so one possibility would be to commit all 3 hacks in a new commit
[13:46:09] send that to Gerrit, add Aaron as reviewer and merge
[13:46:16] that will make 1.20wmf10 clean on fenari
[13:46:23] ohh
[13:46:23] no
[13:46:24] hmm
[13:46:39] there are the security hacks
[13:50:09] !log
[13:50:41] Reedy: so now the working copy is clean
[13:51:22] thanks
[13:52:18] Reedy: note that there are two commits we do not want to send to gerrit
[13:52:32] maybe I should reorder the commits and just send the live hack
[13:53:05] let me try an interactive rebase
[13:56:29] Reedy: cleaned up. The 2 commits that are only in 1.20wmf10 are the security fixes
[13:56:43] Heh, should probably cherry pick that into wmf11 later
[13:56:50] well
[13:57:03] the live hacks from aaron, surely
[13:57:07] that's what I mean
[13:57:09] the security fix I have no idea
[13:57:16] Those were temporary
[13:57:24] proper fixes were committed
[13:58:17] ahh
[13:58:24] so we can probably get rid of them
[13:58:45] grabbing a coffee
[14:13:19] Nooooo
[14:13:26] Our live hack revision has finally conflicted!
[14:14:33] http://p.defau.lt/?uncXRrMwOXa3Zo_y_5sK5A
[14:15:12] * Reedy waits for local pulls to run
[14:25:41] good luck on fixing them :)
[14:28:34] It's a simple one
[14:28:37] change of indenting and stuff
[14:30:17] 'refs/changes/06/7606/1' => 'core', // Most of the WMF "live hacks" https://gerrit.wikimedia.org/r/#/c/7606
[14:31:03] Crap.
[14:31:29] It's not in gerrit...
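[Editor's note] The `git pull --rebase` workflow hashar describes above can be demonstrated end to end in a throwaway repo. This is a sketch with hypothetical paths, branch names, and commit messages, not the actual fenari/wmf setup: a deployment checkout carries a local "live hack" commit, upstream advances, and rebasing replays the hack on top instead of producing a "Merge made by recursive" merge commit.

```shell
# Hypothetical repos standing in for gerrit (upstream) and fenari (deploy)
set -e
tmp=$(mktemp -d)

git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.email=d@e -c user.name=demo \
    commit -q --allow-empty -m "base"

git clone -q "$tmp/upstream" "$tmp/deploy"
echo 'live hack' > "$tmp/deploy/hack.php"
git -C "$tmp/deploy" add hack.php
git -C "$tmp/deploy" -c user.email=d@e -c user.name=demo \
    commit -qm "LIVE HACK - DO NOT PUSH"

# Meanwhile, upstream moves on (a change merged via code review)
git -C "$tmp/upstream" -c user.email=d@e -c user.name=demo \
    commit -q --allow-empty -m "upstream change"

# Replay the local commit on top of upstream instead of merging
git -C "$tmp/deploy" -c user.email=d@e -c user.name=demo pull -q --rebase

# The range syntax from the chat lists only the local-only commits
git -C "$tmp/deploy" log --oneline @{u}..HEAD
```

After the rebase, `git log` shows the live hack sitting cleanly on top of "upstream change", which is exactly the state that makes `git status` report "ahead of origin by N commits" with no merge commit in between.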
[14:34:56] Ohhhh
[14:34:58] I've an idea :D
[15:48:02] https://www.mediawiki.org/wiki/MediaWiki_1.20/wmf11
[16:20:25] ahah
[16:20:34] Reedy: so your idea was to make a new branch? :-D
[16:20:52] I am impressed by the number of changes we sneak in 2 weeks
[16:23:20] I just cherry picked it onto trunk, fixed the conflict and submitted that to use for next time
[16:29:58] hashar: if you want some easy CR..
[16:29:59] https://gerrit.wikimedia.org/r/#/q/status:open+project:mediawiki/tools/release,n,z
[16:30:04] Thanks!
[16:30:13] sure
[16:30:21] I was not even aware of that tool hahah
[16:30:35] feel free to add me as a reviewer for the next changes Reedy
[16:30:54] Chad usually does them if he's about
[16:32:29] Reedy: I got him on Gtalk, he is busy with his house :)
[16:32:35] I guess that is why there are closed days
[16:32:40] so we can keep up with housework
[16:35:31] Reedy: https://gerrit.wikimedia.org/r/#/c/22483/ has an issue
[16:35:36] Reedy: there is a trailing i
[16:35:37] after ;
[16:35:43] in make-deploy-notes/make-deploy-notes
[16:36:04] haha
[16:36:54] rest is in :)
[16:38:55] all is in
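[Editor's note] Reedy's "cherry picked it onto trunk, fixed the conflict and submitted that" step can be sketched the same way. Hypothetical branch and commit names below; this shows the mechanics of carrying a deployment-branch-only commit back onto trunk, not the actual wmf repositories.

```shell
# Sketch: carry a fix that exists only on a deployment branch onto trunk
set -e
work=$(mktemp -d)
git init -q "$work/repo"
# Small wrapper so every command has a committer identity configured
g() { git -C "$work/repo" -c user.email=d@e -c user.name=demo "$@"; }

g commit -q --allow-empty -m "trunk base"
g checkout -qb wmf-deploy                 # the deployment branch
echo 'fix' > "$work/repo/fix.php"
g add fix.php
g commit -qm "deployment-only fix"
fix=$(g rev-parse HEAD)

g checkout -q -                           # back to trunk
g cherry-pick -x "$fix"                   # -x records the source commit hash
# If the pick conflicts (as in the chat), you would edit the files,
# `git add` them, then run `git cherry-pick --continue`.
```

The `-x` flag appends a "(cherry picked from commit ...)" line to the message, which keeps the trunk history traceable back to the deployment branch.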