[01:12:37] Hey Coren, you about?
[01:12:46] Kinda. What's up?
[01:12:54] Any updates on the deleted edits replication?
[01:13:07] https://bugzilla.wikimedia.org/show_bug.cgi?id=49088
[01:14:24] Thanks
[09:03:10] * jeremyb gives Coren/petan a poke on https://gerrit.wikimedia.org/r/93426
[09:04:07] what do you expect me to do? +1?
[09:05:44] petan: i thought you did work on tool labs?
[09:05:50] maybe i'm confused
[09:05:54] jeremyb: he doesn't have +2 on ops repo
[09:05:59] give me the 4am break :)
[09:06:05] YuviPanda: yeah, right :P
[09:06:13] :D
[09:06:21] jeremyb: Coren or andrewbogott_afk mostly
[09:06:39] YuviPanda: we need to stop using the ops repo for stuff that ain't ops
[09:06:51] jeremyb: agree completely.
[09:06:56] +1 ^
[09:07:05] should be in labs/puppet.git
[09:07:12] not that the prior workflow was any better
[09:07:21] we tell people to 'puppetize everything!', and then make them wait on ops people to +2
[09:07:27] while ops themselves liberally +2
[09:07:35] labs/puppet.git will solve these
[09:07:51] which is one thing for actual production boxen
[09:08:09] but if the target is labs anyway and a box on which you have root...
[09:08:25] !log deployment-prep Restarted bits varnish to clear out the cache.
[09:08:33] Logged the message, Master
[09:08:34] jeremyb: us talking about it on IRC is going to change nothing
[09:09:49] YuviPanda: hehe
[09:14:09] !log deployment-prep removed sudo group 'admin', removing root access from any volunteers
[09:14:15] Logged the message, Master
[09:15:13] !log deployment-prep deleted sudo policy 'webadmins' only had petrb in it with no specific access.
[09:15:18] Logged the message, Master
[09:19:52] !log deployment-prep reenabling puppet on deployment-apache33
[09:19:58] Logged the message, Master
[09:38:55] !log deployment-prep rebooting apache32 for kernel upgrade
[09:39:01] Logged the message, Master
[09:39:46] !log deployment-prep rebooting apache33 for kernel upgrade
[09:39:52] Logged the message, Master
[15:17:35] Coren: how do end users request? :)
[15:17:40] packages that is
[15:17:48] bugzilla is the best way.
[15:18:14] * jeremyb points Coren to https://gerrit.wikimedia.org/r/93426
[15:18:41] also, https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools
[15:18:51] gerrit also works.
[15:18:54] :-)
[15:19:23] if there's any of that that i can address somehow i'm happy to help
[15:19:28] bugzilla is best, because gerrit changesets sometimes get lost in the sea.
[15:19:46] huh, ok
[15:21:14] Well, you can do both, really -- a changeset with an attached bugzilla -> absolute best. Those I can close trivially. :-)
[15:21:22] yeah, sure
[15:21:30] The bugzilla makes sure I see it.
[15:21:39] didn't realize gerrit was so invisible
[15:22:09] gerrit needs something i've been asking for for a while: a way to ask "the wind" for review. or "people that care about X"
[15:22:21] instead of having to pick out individuals
[15:22:34] It's not /invisible/, but I have too many changesets I'm listed as reviewer on for me to triage at a glance; whereas bugs are sortable.
[15:23:06] individuals can "subscribe" to changesets matching some criteria; that's why most everything labs pops up for me.
[15:23:14] right, i know
[15:23:23] (93426 merged, btw)
[15:23:39] yeah, the bot pinged me :)
[15:23:48] doubly (mail and #-operations)
[15:23:50] :P
[15:24:00] it's possible to do some sorting with gerrit. but not as good i think
[15:24:53] Coren: still trying to figure out how tools works... one troubleshooting step i was going to take was logging in to an exec server and looking at what dpkg says is there. wouldn't let me log in
[15:25:28] ... it should; the exec nodes have HBA.
[15:25:28] Coren: i ended up just testing on tools-login instead, and when it worked there but not on the grid i assumed the package was missing
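(For readers following along: the check jeremyb describes does not actually require an interactive login on an exec node. The snippet below is only a sketch; the package name is invented, and the exact output location of a grid job depends on how the job is submitted.)

```
# On tools-login: is the package installed here? (read-only, safe to run)
dpkg -s libvips-tools          # "libvips-tools" is a hypothetical package name

# Submit the same check as a grid job to see what the exec nodes have;
# with the Tool Labs jsub wrapper the output ends up in the job's .out file.
jsub dpkg -s libvips-tools
```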
[15:25:35] Oh - but you need to reach them from tools-login
[15:25:35] HBA?
[15:25:42] Host-Based access.
[15:25:45] huh
[15:25:49] interesting
[15:25:52] not key?
[15:26:09] No; gridengine needs to be able to spawn stuff as you without credentials.
[15:26:10] it should give me a warning about that instead of just closing the connection
[15:26:33] jeremyb: ssh can't be arsed to explain /why/ you're not allowed to login, just that you aren't.
[15:26:51] (For one, because you don't want to explain your security to attackers) :-)
[15:27:24] then how do you tell people?
[15:28:08] Logging in on an exec node is a marginal enough scenario that I don't want to give official support for it. The best way to be told is to ask. :-)
[15:28:42] Coren: well i talked about it here earlier and no one told me. fwiw
[15:29:09] I missed it, but Petr or Tim would have known.
[15:29:32] so, all tools users get all root mail for the project? i guess that should just be projectadmins?
[15:29:35] or something
[15:38:48] o_O? Nobody should be getting root mail.
[15:38:57] Or do you mean mail /from/ root?
[15:39:00] no
[15:49:57] Coren: puppet's every 30 mins?
[15:50:18] 60 mins now. I can force a puppet run if you need it.
[15:50:57] you think we could make /var/log/puppet.log have timestamps? :)
[15:51:43] looks like it's just at 60. i'll watch the log
[15:51:52] i thought it was 30
[15:52:19] It used to be, but the overall load on the openstack infrastructure was getting ridiculous given most runs were dry.
[15:52:37] you mean on the clients?
[15:53:08] Well, since the labs clients are all running on the same infrastructure, that's the one that took the hit.
[15:53:53] i meant vs. the master
[15:54:28] The master was also relieved a bit, but that wasn't the primary concern (since the runs were, mostly, staggered)
[16:18:11] * anomie makes https://tools.wmflabs.org/anomiebot/available-packages.php
[17:33:09] g'afternoon mortals, how do i chmod a file in labs? i'm getting "Operation not permitted"
[17:55:06] alchimista: You can chmod anything you own; I expect you're trying to chmod a file owned by your tool, or vice-verse.
[17:55:10] vice-versa*
[17:55:31] alchimista: You can take ownership of a file with 'take '
[17:59:53] Coren: in fact i was the problem, i was using "-" instead of "~". A little break and a coffee make miracles and solve a lot of bugs XD
[18:00:29] alchimista: That, or a night's sleep which is the overdrive version. :-)
[18:01:41] Coren: i must agree with you, especially when i've spent almost 1 hour on a "bug" which was nothing more than wrong file permissions
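(A minimal illustration of the ownership situation Coren describes; the tool name and file path are invented.)

```
# A file created by the tool account cannot be chmod-ed from your own account:
chmod g+w public_html/index.php      # fails with "Operation not permitted"

# Option 1: switch to the tool account first (the Tool Labs 'become' helper):
become mytool
chmod g+w public_html/index.php

# Option 2: take ownership of the file yourself, as Coren suggests, then chmod:
take public_html/index.php
chmod g+w public_html/index.php
```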
[19:55:01] hello guys, Danilo from Brazil recommended that I create an account on Tool Labs for some tests on ptwikis
[19:55:45] who can help me get approved for access to the server?
[19:56:08] so, Danilo will be able to add me on ptwikis
[19:56:32] Hello, rodrigopadula. I haven't done the shell access today yet, but lemme add you real quick.
[19:57:02] thanks... my user on the wiki is Rodrigo Padula and on the shell is rodrigopadula
[19:58:21] rodrigopadula: All done.
[19:58:52] thank you Coren
[20:15:23] Hello, I have an account at ToolsLab and am currently getting HTTP 500 from my PHP scripts every now and then. Are the servers having difficulties?
[20:20:42] Nobody here who could assist me?
[20:21:30] Coren: ^
[20:23:22] Even calling a simple PHP script such as "echo 'abc'" at https://tools.wmflabs.org/cssk/scripts/test3.php gives me an HTTP 500 Internal Server Error on every second or third refresh currently.
[20:23:24] With the toolserver? Not really -- you want a TS admin for that. Your chances are higher on #wikimedia-toolserver
[20:23:39] Doh.
[20:23:40] OK, thanks, I'll try there.
[20:23:45] Why did I read "Toolserver"
[20:23:49] No, you're in the right place. :-)
[20:23:49] I see.
[20:23:52] OK
[20:24:52] Blahma2: PHP errors can be found in your tool's ~/php_errors.log
[20:25:17] A quick glance at it shows that you're often running out of memory.
[20:25:51] Thanks. That might be, because I am processing Parsoid output of long articles.
[20:26:34] Blahma2: You'd probably be better off with your own webservice. Take a look at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb
[20:27:32] Thank you, I'll migrate to that and see if the issue persists.
[20:30:41] Blahma2: interesting, what are you working on?
[20:31:08] Now the tiny script does not raise the error anymore, but I'll see with my large one.
[20:31:44] gwicke: I am developing a tool that simplifies machine translation of articles between cswiki and skwiki as much as possible.
[20:32:13] Parsoid output is great at stripping wikisyntax out of pure text and then putting it back into the translated text.
[20:32:36] And I also do various postprocessing to translate link targets via Wikidata etc.
[20:32:56] The whole is/will be packaged as a gadget.
[20:34:37] Blahma2: cool to see uses of our output for such stuff
[20:35:01] are you using libxml for DOM parsing?
[20:35:31] Nice to meet a Parsoid developer. I talked to some of you at Wikimania, where I also got the idea to use this for my purpose.
[20:35:44] AFAIK there is still no real Parsoid API available to the public, is there?
[20:35:59] I am currently hooked to the POST/GET requests of the demo, unfortunately.
[20:36:04] Blahma2: any day now, will be http://parsoidcache.svc.eqiad.wikimedia.org/
[20:37:00] that won't be the final content API we are working on, but it will give you access to the cached internal Parsoid cluster
[20:37:04] And no, I actually care very little about the actual markup - I simply replace any markup with placeholders, which I hope stay in the translated text, and then replace those placeholders back with the markup pieces and convert it back to wiki code.
[20:37:12] which is much faster than our labs vm
[20:38:27] Cache should be OK, so I am definitely interested in learning when that is out (does not work for me at the moment).
[20:38:44] Blahma2: interesting, so you are not converting back to wikitext using Parsoid?
[20:39:18] https://gerrit.wikimedia.org/r/#/c/93527/
[20:39:22] At least with ToolLabs I can get local network access to the WMFLabs server without having to host my scripts with yet another party and causing delays by that.
[20:40:09] yeah
[20:40:17] it is a small labs vm though..
[20:40:46] Blahma2: we recently moved to XML serialization btw, so you should be able to use any XML parser directly
[20:41:20] I am. The whole chain is: source article title -> Parsoid DOM -> pure text with markup placeholders in lang. 1 -> translation into lang. 2, still with placeholders -> translation in lang. 2 with placeholders replaced by what they originally meant, i.e. a valid Parsoid DOM again -> wikicode of a new article in lang. 2
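(A rough sketch of the placeholder round trip Blahma2 describes above, not the tool's actual code. The endpoint URL, page title and translate_cs_to_sk() are invented for illustration; the real chain would also rebuild a DOM from the translated text and post it back to Parsoid for HTML-to-wikitext serialization.)

```php
<?php
// Sketch only: fetch Parsoid HTML, swap links for placeholders, translate the
// plain text, then put the markup back. Endpoint, title and the translation
// call are hypothetical.
$endpoint = 'http://parsoid.wmflabs.org/cswiki/';   // hypothetical Parsoid instance
$title    = 'Bratislava';

$html = file_get_contents($endpoint . rawurlencode($title));

$doc = new DOMDocument();
// Parsoid output uses HTML5 elements libxml may warn about; silence those.
@$doc->loadHTML('<?xml encoding="utf-8"?>' . $html);

// Replace each link with a numbered placeholder and remember the original node.
$placeholders = array();
foreach (iterator_to_array($doc->getElementsByTagName('a')) as $i => $a) {
    $key = "\x02" . $i . "\x03";          // control characters survive most translators
    $placeholders[$key] = $a->cloneNode(true);
    $a->parentNode->replaceChild($doc->createTextNode($key), $a);
}

// The plain text (markup hidden behind placeholders) goes to the translator.
$translated = translate_cs_to_sk($doc->textContent);   // hypothetical function

// Restore the remembered markup in place of the placeholders; the result would
// then be rebuilt into a DOM and handed back to Parsoid for HTML -> wikitext.
foreach ($placeholders as $key => $node) {
    $translated = str_replace($key, $doc->saveHTML($node), $translated);
}
echo $translated;
```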
[20:42:42] So I kind of make a round trip, but have the text translated in the course of it (hiding the markup from the translation engine).
[20:44:04] and markup is HTML markup in this case?
[20:45:22] sounds cool in any case
[20:45:35] It's HTML markup with Parsoid's data attributes. I actually do not care about the HTML, and if I need to change something (like "translate" the target of a link), I modify those data attributes.
[20:46:09] our DOM will soon also be smaller as we are removing the private data-parsoid stuff
[20:46:11] I have observed that the conversion prefers changes in those attributes and does not care about the HTML that much if at all.
[20:46:49] k
[20:47:03] it sounds like you are basically building a generic HTML translation tool
[20:49:35] That's true. With additional modifications that perform some tasks that users usually do when translating articles from one language to another. Some of those are also general (such as looking up link equivalents at Wikidata) and some are more or less specific (such as conversion of infobox templates).
[20:50:40] I'll be having a talk about this project in a week at the Central & Eastern European Wikimedia meeting in Slovakia, and then at the end of October at WMCZ's conference in Prague.
[20:51:00] Hopefully the system is up then, with all the changes still being done by both you and me :)
[20:51:33] hehe, we are doing our best ;)
[20:51:49] you can also always ping me or join #mediawiki-parsoid
[20:52:36] Thanks, I'll try to remember that. This is the first time I've gone to a Wikimedia IRC, because I don't do that often - it feels like in the 90's :)
[20:53:00] In your opinion, could any of the recent or future development that you have pointed to help me in some way?
[20:53:22] I did not really understand that thing on Gerrit, perhaps because I do not use those systems.
[20:53:39] the main upcoming changes will be removal of private information
[20:53:47] data-parsoid chiefly
[20:54:26] That could be good news for performance, but only as long as you do not get rid of all of that?!
[20:54:27] the public API should make your tool much faster
[20:54:30] and more reliable
[20:54:45] But the public API is not yet out, is that right?
[20:54:49] our lab vm is really just us testing things, without any guarantees that it will be up
[20:55:05] the public API is basically waiting on the ticket I linked to
[20:55:16] should become available any day now
[20:55:32] I'll send a post to wikitext-l and wikitech-l about it when it is ready
[20:55:41] I see... And what's VisualEditor running on? That must be much more reliable... But I can imagine it is not possible for me to access that.
[20:55:52] VE is using the same backend
[20:56:00] OK, I'll make sure to be on one of those lists.
[20:56:46] cool
[20:57:22] As for the data-parsoid attribute, I rely on it in cases such as getting link targets and modifying them. It's much easier to get the original link text than to mess with the A HREF attribute.
[20:57:55] hmm- that is really not part of our API
[20:58:10] which difficulty do you have with the href?
[20:59:03] (the API is documented in http://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec)
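(To illustrate what gwicke is suggesting here: retargeting a wikilink by editing only its href, without touching data-parsoid. The snippet is a guess at the workflow, not tool code; the input HTML is invented, lookup_sitelink() is a hypothetical Wikidata lookup, and the rel="mw:WikiLink" and "./Title" forms follow the Parsoid DOM spec linked above.)

```php
<?php
// Sketch: rewrite wikilink targets via href only. Input and helper are invented.
$html = '<p>Bydlí v <a rel="mw:WikiLink" href="./Německo">Německu</a>.</p>';

$doc = new DOMDocument();
@$doc->loadHTML('<?xml encoding="utf-8"?>' . $html);

$xpath = new DOMXPath($doc);
foreach ($xpath->query('//a[@rel="mw:WikiLink"]') as $a) {
    // Strip the "./" prefix the DOM spec uses, giving the original page title.
    $oldTarget = rawurldecode(substr($a->getAttribute('href'), 2));
    // Hypothetical: find the corresponding skwiki title via Wikidata sitelinks.
    $newTarget = lookup_sitelink($oldTarget, 'skwiki');
    $a->setAttribute('href', './' . str_replace(' ', '_', $newTarget));
}

// The modified DOM would then go back to Parsoid for serialization to wikitext.
echo $doc->saveHTML();
```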
[21:01:02] Sorry, browser crash.
[21:01:06] Let me find an example for you.
[21:09:34] Well, I am actually using data-parsoid in a couple of instances at this moment, but that might not be necessary indeed.
[21:09:50] As long as you keep stuff like data-mw or that's low-traffic though, so good to subscribe
[21:13:07] Blahma2_: do you need to know how the link was written?
[21:13:10] I'll subscribe.
[21:13:24] normally we abstract such issues
[21:13:27] If I want to rewrite the link in a form as similar to the original as possible, I might care.
[21:13:55] you are kind of reverse engineering what we are doing there
[21:14:13] Because very often Czech and Slovak use the very same words, so it feels unnecessary to have to "normalize" links in the process, if I could simply leave everything like it was.
[21:14:17] we don't do a terribly good job yet at using link tails on new content when possible
[21:14:45] is keeping link tails the main use case?
[21:15:05] That's the only thing I've just figured out I could not get from somewhere else.
[21:15:25] And the code still works, so it's not crucial, just the output might not be "that nice".
[21:16:07] if we add support for link tail serialization on modified content, would that solve your issue?
[21:17:33] Apart from processing links, I also translate localized namespace prefixes ("Image:") and parameters (such as "thumb") in images, and I need to be able to access and generate other details such as template name and parameters.
[21:20:53] I'd need to be able to both access link tail information of the source wikitext and push links with tails into the target wikitext. Just one of them would be of no use.
[21:21:20] (sorry I might not understand exactly what kind of "serialization" you are talking about - is that how you call the process of converting wikitext to DOM?)
[21:21:55] sorry if that was unclear - we mainly call html to wikitext 'serialization' for historic reasons
[21:22:49] normally we can figure out the link tail automatically by comparing the href with the link text
[21:23:36] that might indeed be the case, at least as long as you can ignore whitespace, capitalization and similar issues
[21:23:40] you would not have to do any manual link tail pushing any more, we'd automatically convert to wikitext with a link tail when possible
[21:24:07] we already have much of the logic, just don't apply it to modified content currently
[21:26:23] I always thought that you were doing round trips in order to check that you can convert wikitext2dom2wikitext and have the very same wikitext at the beginning and the end...
[21:27:00] By introducing any of these features (either giving up tails or automatically creating them when possible), you'd miss that goal, wouldn't you?
[21:27:12] Or is it the modified/unmodified content distinction that plays a role?
[21:27:30] we'll preserve tail or not tail when the content is not edited
[21:27:39] but for an edited link, we'd minimize
[21:28:58] In my use case, I've been happy to kind of think of the DOM as a more pleasant way of presenting wikisyntax. Because wikisyntax uses rich markup, while in DOM, markup is everything between brackets and everything out of them is text.
[21:29:21] That may explain why I took for granted that I can control things such as link tails to the very last character...
[21:29:49] would default minimization (with link tails) work fine for you?
[21:30:04] my impression is that most wikis prefer that form when possible
[21:30:28] I work a lot with wikis in languages that use flection, so I can confirm your impression.
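(For anyone unfamiliar with the term: a "link tail" (link trail) lets wikitext extend the visible link text past the target without a pipe, which is what "minimization" refers to above. The Czech word below is only an illustration.)

```
[[hrad|hradu]]   <!-- explicit, piped form -->
[[hrad]]u        <!-- minimized form using a link trail; both render as a link labelled "hradu" -->
```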
[21:30:40] Although software and bots usually prefer it the other way round.
[21:30:51] https://bugzilla.wikimedia.org/show_bug.cgi?id=52240
[21:30:51] So I think it would indeed work fine for me.
[21:31:47] I thought there might be cases when such automatic minimization creates unintelligible results (i.e. over-shortening something that has the same ending only by coincidence), but it is difficult to think of an example.
[21:32:26] especially with the rule that the tail cannot normally contain white space etc
[21:37:05] One might think of links where the link target and the link text are unrelated, but happen to share a common beginning.
[21:37:10] Such as: Apart from B, there is [[A|another]] letter close to the beginning of the alphabet.
[21:37:29] Nobody but the algorithm would shorten it to: Apart from B, there is [[a]]nother letter close to the beginning of the alphabet.
[21:37:58] Such a result would work, but it would definitely surprise editors who would later come across it in wikitext.
[21:38:12] yeah
[21:38:14] Although I think we might consider such cases rare and therefore tolerable.
[21:38:18] it is still readable though
[21:38:32] It is. Though that's what I've called "unintelligible".
[21:38:59] Some people might have similar feelings towards tails that are not full grammatical suffixes but merely fragments of those.
[21:39:14] But that is a hint that there is probably no clear threshold.
[21:39:41] we can only do so much without full AI ;)
[21:40:23] granted, this would not require full AI, but it would still not be trivial
[21:41:26] I can imagine.
[21:42:13] Thanks for discussing this with me.
[21:42:33] I'll see in the upcoming days what else I can do in my scripts and whether I can manage that all without data-parsoid.
[21:43:01] Is there a way in which I could contact you or someone else from the team if I get more questions or comments?
[21:43:39] I mean, it's invaluable if I can talk to developers, because otherwise discovering it all for myself can be a great adventure, but also lead to great confusion very easily.
[21:49:33] Blahma2_: sure, we normally hang out in #mediawiki-parsoid
[21:49:44] and you can mail to wikitext-l
[21:50:17] see the links in https://www.mediawiki.org/wiki/Parsoid#Getting_started
[21:55:58] Thanks for the info.
[21:56:49] And I will post a comment to that last bug report you've mentioned.
[21:59:27] Done: https://bugzilla.wikimedia.org/show_bug.cgi?id=52240#c3
[22:01:15] In the meantime, I wish you good night, at least if you are in Europe like me. Good bye.
[22:29:12] How long does it take to create a new tool?
[22:30:02] More than a few seconds. /me tuts.