[01:45:56] Hello there, I'm looking for wikimedia api advice!
[01:46:31] I've been using beautiful soup to parse through wikipedia articles that I've pulled
[01:46:35] LukeDev: Hi again! I can try :)
[01:46:49] hahahah glad to know you're here!
[01:46:56] Oh, I'm always here
[01:47:02] Just not always at the computer
[01:47:48] hmmmm that's good to know. One day I'm going to be good enough to help the IRC channel
[01:47:59] hopefully.
[01:48:12] So I'm pulling the article in html
[01:48:41] because beautiful soup works well to clean the html.
[01:48:58] hi
[01:49:46] However I want to use wikitext because it's easier to navigate when I'm cleaning the article.
[01:49:58] a bug has been spotted on http://fr.wikipedia.org/wiki/Sp%C3%A9cial:ExpansionDesMod%C3%A8les
[01:50:14] look at the 's ;-)
[01:50:29] sorry, I meant to say when I'm searching through the article for certain tokens
[01:51:15] desired: just {{FULLPAGENAME}}
[01:51:58] is there anything that cleans wikitext to produce simply text?
[01:53:19] LukeDev: I can probably come up with a regex that strips out special characters or something....
[01:53:42] So that would be the best way to go.....
[01:53:46] Maybe
[01:56:06] s/(([\{\}'=][\{\}'=]+)|<[^>]*>)// might be a good start
[01:56:19] You'll need to deal with.....um, one sec
[01:56:53] s/(([\{\}\[\]'=][\{\}\[\]'=]+)|<[^>]*>)// might be a good start
[01:57:01] (forgot links, don't know how
[02:00:15] okay I'll try it and learn
[02:13:57] @marktraceur: could I use this?
[02:13:59] http://www.mediawiki.org/wiki/Extension:RegexFunctions
[02:17:31] Eeeeuuuugh
[02:18:09] LukeDev: Wouldn't be helpful on the Python side
[02:18:20] LukeDev: Just make a Python regex
[02:29:02] I'm looking at the documentation, it doesn't seem that different from regular regex.
[02:29:26] I think I'll use this to draft it up: http://gskinner.com/RegExr/
[02:31:14] gerrit review makes some of my inline comments monospaced. Is this explained anywhere?
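Editor's note on the wikitext question above (01:51:58 to 02:29:26): the substitution suggested at 01:56:53 could be drafted as a Python regex roughly as follows. This is only a sketch, not what anyone in the channel actually ran. The function name strip_wikitext and the separate link pattern are assumptions added here for illustration; the log itself says links were not handled.

```python
import re

# Assumption (not from the log): unwrap internal links first, so that
# [[Target|label]] becomes "label" and [[Target]] becomes "Target".
LINK = re.compile(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]")

# Python form of the pattern suggested in the log: a run of two or more
# of the characters { } [ ] ' = or anything that looks like an HTML tag.
MARKUP = re.compile(r"[{}\[\]'=]{2,}|<[^>]*>")


def strip_wikitext(text: str) -> str:
    """Remove simple wikitext markup; hypothetical helper, rough by design."""
    text = LINK.sub(r"\1", text)   # keep only the visible link text
    return MARKUP.sub("", text)    # drop quote/brace/heading runs and tags


if __name__ == "__main__":
    sample = "'''Paris''' is the capital of [[France|the French Republic]]."
    print(strip_wikitext(sample))
```

Single apostrophes (as in "don't") are left alone because the pattern only matches runs of two or more markup characters. Template bodies are not removed, only their braces, and nested templates or tables are not handled, so a dedicated wikitext parser is likely the safer choice for anything beyond quick token searches.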
[02:34:17] I think it does an implicit switch to monospace if any line in a comment has leading spaces, but there seems to be no documentation whatsoever of gerrit review mode.
[12:38:38] back
[15:55:07] New patchset: Hashar; "ant: setup-extension should use latest MW master" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22577
[15:55:07] New patchset: Hashar; "ant: git-snapshot fix display of treeish" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22578
[15:55:32] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22578
[15:55:33] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22577
[16:42:30] Change abandoned: Demon; "Not needed." [analytics/graphkit] (master) - https://gerrit.wikimedia.org/r/18647
[19:28:17] New patchset: Ottomata; "Adding prefix-preserving IP anonymization using libanon." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/22614
[19:30:37] New patchset: Ottomata; "Adding prefix-preserving IP anonymization using libanon." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/22614
[19:46:51] New patchset: Ottomata; "Adding prefix-preserving IP anonymization using libanon." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/22614
[19:55:58] TomDaley: would it be possible to grant me the right to remove reviewers on all mediawiki/extensions project ?
[19:56:06] so I could remove jenkins-bot :)
[19:56:21] You should be able to, afaik.
[19:56:25] Lemme double check something
[19:56:47] https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions,access - says you should be able to
[19:58:14] TomDaley: ok worked on WikiEditor at least :)
[19:58:32] thanks! I will check the access next time I have such a limitation
[20:05:07] New patchset: Hashar; "ant: default to mw.database=sqlite , rm job.properties" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22620
[20:05:24] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22620
[20:47:27] New patchset: Hashar; "phase out build.xml symbolic links" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22663
[20:47:47] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22663
[20:50:54] brion: Could you review https://gerrit.wikimedia.org/r/#/c/18654/ so I can get it off my dashboard? It was an automated addition but I don't have review on WLM.
[20:51:30] sure
[20:51:35] we'll merge over there eventually i guess
[20:51:46] done
[20:51:52] Thanks :)
[20:51:57] brion: for someone on vacation you work an awful lot ;]
[20:52:14] i've got a couple hours to kill before heading to the airport, may as well :)
[20:52:24] heh
[20:53:01] brion: vacation? yay!
[20:53:16] sumanah: been in chicago for a sci-fi convention and visiting family :) been fun
[20:53:19] flying back tonight
[20:53:26] brion: were you at WorldCon?
[20:53:27] rock
[20:53:29] yep
[20:53:47] our friend won a hugo :)
[20:53:53] Ursula?
[20:53:55] * sumanah tries to guess
[20:53:57] seanan mcguire
[20:54:08] Ah yes! I remember now
[20:54:11] she was also up for her mira grant books but didn't win for those
[20:54:18] I think she won the Campbell a couple years ago?
[20:54:27] yep
[20:54:37] I was there for that, in Melbourne
[20:54:40] awesome
[20:55:03] Oh, I need to tell you about the 2 times I thoroughly embarrassed myself in front of John Scalzi
[20:55:10] 2 separate occasions, both in Melbourne
[20:55:30] hehe
[20:55:32] brion: it's so cool you got to see your friend win a Hugo! Jealous!
[20:55:42] he was toastmaster for this year's awards
[20:55:48] yeah :D
[21:24:00] TrevorParscal: re Drafts - am I right in inferring that fixing it up is something that will take a few weeks (if you work on it on 20% days)? I ask so I can update https://www.mediawiki.org/wiki/Review_queue
[21:24:29] will do
[22:01:20] I have extracted the l10n cache rebuilding out of scap to mw-update-l10n
[22:01:21] https://gerrit.wikimedia.org/r/22673
[22:01:36] would be great to get some review :-]
[22:03:20] hashar: +1
[22:05:33] New patchset: Hashar; "Revert "phase out build.xml symbolic links"" [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22676
[22:06:28] Change merged: Hashar; [integration/jenkins] (master) - https://gerrit.wikimedia.org/r/22676
[22:17:24] danke TomDaley