[19:06:00] Hiya TimStarling! yt? I am trying to do some HHVM extension porting stuff, and I have never done any PHP C stuff, and need some tips
[19:20:41] ottomata: yo
[19:20:48] where j00 at
[19:21:22] yooo i'm in the engineering manager q&a
[19:21:29] but only kinda listening
[19:21:39] whatcha doin? will you be my guide? :)
[19:21:46] yeah i have a few minutes at least
[19:21:53] which extension(s) were you considering for porting?
[19:21:53] haha, ok, i'll come down and find you
[19:22:13] this one
[19:22:14] https://github.com/EVODelavega/phpkafka
[22:01:37] A quick question, could someone please remind me how to connect to an instance through bastion? My ssh tunnel isn't working and I don't mind doing this manually. I have been looking at https://wikitech.wikimedia.org/wiki/Help:Access_to_instances_with_PuTTY_and_WinSCP and I do not know why my setup isn't working.
[22:05:04] is it not just "ssh instance"
[22:08:42] can you log in to the bastion?
[22:12:13] White_Cat: I'd log in to the bastion first, then check with "ssh-add -l" that your credentials are forwarded properly, and then "ssh targetinstance" from there
[22:12:37] most probably your credentials are not being forwarded or something
[22:13:25] and the screenshots for PuTTY still say "pmtpa"; replace that with "eqiad". We moved to a different location
[22:44:40] I can log into the bastion
[22:44:49] I can log in to the targetinstance now
[22:44:57] I think I wasn't forwarding ssh
[22:45:25] but I still can't seem to make WinSCP work :/
[22:47:12] saper / mutante do you have a fix for me?
[22:47:24] are you at wikimania by any chance actually?
[22:53:35] White_Cat: not yet, will be on the weekend
[22:54:37] White_Cat: so if PuTTY works, WinSCP should also work. what is the error message
[22:55:39] oh, maybe not, since that example uses agent forwarding and we disabled that for security reasons
[22:55:47] do you use plink.exe?
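[Note: since agent forwarding was disabled on the bastions for security, the usual workaround is to tunnel through the bastion rather than forward your key. A minimal ~/.ssh/config sketch, assuming the bastion is bastion.wmflabs.org and instances live under eqiad.wmflabs; substitute your own shell name and host names:

    # Reach any Labs instance by tunnelling through the bastion.
    # No agent forwarding needed; the private key stays on your machine.
    Host *.eqiad.wmflabs
        User yourshellname
        ProxyCommand ssh -W %h:%p yourshellname@bastion.wmflabs.org

WinSCP can do the equivalent by configuring a plink.exe proxy command in its connection settings, which matches the plink.exe suggestion above.]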
[23:03:31] Hi. Is there a convenient way (such as a web form) to find out which wiktionaries have a page with a given title? If I know at least one such wiktionary, then the interwikimedia links are there automatically, but what if I don't know any wiktionary with such a page?
[23:22:56] b_jonas: i'm afraid not yet. checked on toollabs. OmegaWiki used to try to be a single dictionary, but i see search is disabled. i hope in the future it can be wikidata
[23:25:08] mutante: thank you
[23:26:01] b_jonas: maybe you could make one in tool labs? :) it would be a good one
[23:26:20] at least you'd get a free shell and other developers
[23:27:11] mutante: yes, I was wondering. One way would be to download the files that contain only the page titles; there are such files on the data dumps server for at least some wiktionaries, but I'm not sure if there's one for each.
[23:28:56] b_jonas: there is. for example http://dumps.wikimedia.org/ruwiktionary/ , http://dumps.wikimedia.org/dewiktionary/ etc
[23:28:58] mutante: does the tool server already run some cron job that keeps downloading fresh data dumps (or some of them), or mirrors the databases using api requests?
[23:29:14] b_jonas: it mirrors databases, yes
[23:29:27] mutante: yes, certainly, but I don't know if those exist for _all_ wiktionary sites, or only some. I know the dumps produced differ between the sites.
[23:29:47] and for this, it would be important to follow all wiktionaries plus the incubator
[23:29:52] let me check
[23:30:05] (or at least all but a few that I query dynamically)
[23:30:21] (the biggest ones could be omitted but I don't think it's worth it)
[23:32:23] Different question meanwhile. Has it ever happened that the preferred main url of a wikimedia project has changed, other than by changing the protocol from http to https, e.g. because the language code part of the domain got changed from a temporary code to a new iso language code after the project graduated to a site?
[23:32:38] And if so, in what cases and when?
[23:33:14] b_jonas: there are 172 wiktionaries per Special:SiteMatrix but 217 directories on the dump server that have "wiktionary" in the name
[23:33:22] so.. all
[23:33:24] plus
[23:33:52] b_jonas: people _want_ it to change but we could never do it
[23:34:08] mainly because of the database names being tied to it
[23:34:25] mutante: directories alone aren't enough; like I said, for some projects a different set of dumps is produced. that's not a deal-breaker, because I can get the title list from a full xml dump, but it may complicate this.
[23:34:36] b_jonas: https://phabricator.wikimedia.org/T21986
[23:34:48] mutante: I know about the simple English wikipedia which people want to change,
[23:35:05] from simple to ... ?
[23:35:37] but are the database names necessarily tied to the domain? there doesn't seem to be a very obvious mapping, you have to look them up in a table I think, though for any one project the language code is just prepended.
[23:35:52] mutante: I think to en-simple or en_simple or en.simple or something like that
[23:35:55] the current blocker is the external storage
[23:36:10] to make it something that actually works as an iso language code according to their formatting rules
[23:36:41] or according to that rfc that tells how to use extended iso language codes or whatever
[23:36:59] b_jonas: en.simple would not work because we can't have an SSL cert that covers *.*.wikipedia.org
[23:37:18] mutante: yeah, and en_simple wouldn't work either because it's not a valid domain name
[23:37:27] but I don't know what the right delimiter is
[23:37:33] it should be -
[23:37:44] let me check the RFC
[23:38:25] https://meta.wikimedia.org/wiki/Special_language_codes may be relevant
[23:38:40] so regarding the wiktionary dumps
[23:38:44] this is what you want for all?
[23:38:46] enwiktionary-20140728-all-titles.gz
[23:38:52] all-titles should be enough?
[23:39:18] there is even "all-titles-in-ns0", maybe even just that
[23:39:19] mutante: I think so, yes, plus some rarely changing information I can query from the api like namespace names or whatever
[23:39:31] probably not namespace names, but the url so I can link there
[23:39:50] yes, you can build the URLs from the API, i do that for some statistics stuff
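[Note: a minimal Python sketch of the lookup b_jonas describes, built only from pieces mentioned above: the sitematrix API module behind Special:SiteMatrix to enumerate the wiktionaries, then a title query against each one's api.php. The function names are illustrative, not an existing tool:

    import requests

    # Enumerate the base URLs of all open wiktionaries via the sitematrix module.
    def wiktionary_urls():
        r = requests.get("https://meta.wikimedia.org/w/api.php",
                         params={"action": "sitematrix", "format": "json"})
        for key, group in r.json()["sitematrix"].items():
            if not key.isdigit():
                continue  # skip the "count" and "specials" entries
            for site in group.get("site", []):
                if site.get("code") == "wiktionary" and "closed" not in site:
                    yield site["url"]

    # Ask each wiktionary whether a page with this exact title exists.
    def wikis_with_title(title):
        for url in wiktionary_urls():
            r = requests.get(url + "/w/api.php",
                             params={"action": "query", "titles": title,
                                     "format": "json"})
            # The API returns the page under key "-1" when it is missing.
            if "-1" not in r.json()["query"]["pages"]:
                yield url

    for url in wikis_with_title("dictionary"):
        print(url)

This makes one live request per wiktionary (around 170), so a dump-based variant that fetches the all-titles-in-ns0 files mutante mentions and searches them locally would be kinder to the servers for repeated lookups.]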
[23:40:40] mutante: I know I can... but I need to know the url to even call the api, though I think I can find that out from interwikimedia codes on en.wiktionary
[23:41:17] though I'm not sure the api is enabled on all projects, or that api.php has a consistent path under the base url
[23:41:36] b_jonas: it's consistent enough that you should only need the list of language codes to start with
[23:42:05] anyway, existing tools have probably solved the problem of how to contact all projects
[23:42:29] http://wikistats.wmflabs.org/display.php?t=wt
[23:42:52] ^ all it stores in the db is the domain / language prefix
[23:44:10] https://en.wikipedia.org/wiki/Special:SiteMatrix probably lists all of them
[23:44:26] it lists even the most obscure projects
[23:44:41] and anyway, this must be solved by other tools already
[23:44:45] yea, same thing, i just synced those 2
[23:45:23] oh, nice statistics though
[23:47:20] Has anyone made statistics on how many entries some of the big wiktionaries contain if you don't count the entries whose only content is bot-generated from conjugated/declined forms of words?
[23:48:11] I don't mean this for all wiktionaries, that would be difficult, but for the big ones, especially fr.wiktionary which seems to contain a lot of bot-created conjugation entries.
[23:51:24] On language codes, the relevant document is http://www.ietf.org/rfc/bcp/bcp47.txt
[23:55:29] And I think it says the code should be en-x-simple, or en-simple if that's defined in the registry
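[Note: "defined in the registry" refers to the IANA Language Subtag Registry that BCP 47 points to. A small Python sketch (names illustrative) that answers the question directly: "en-simple" would only be valid if "simple" were registered as a variant subtag, whereas anything after "-x-" is private use and needs no registration, so "en-x-simple" is always well-formed:

    import urllib.request

    # The IANA registry referenced by BCP 47; records are separated by "%%".
    REGISTRY_URL = ("https://www.iana.org/assignments/"
                    "language-subtag-registry/language-subtag-registry")

    def registered(type_, subtag):
        text = urllib.request.urlopen(REGISTRY_URL).read().decode("utf-8")
        for record in text.split("%%"):
            fields = {}
            for line in record.strip().splitlines():
                if ": " in line:
                    key, value = line.split(": ", 1)
                    fields[key] = value
            if fields.get("Type") == type_ and fields.get("Subtag") == subtag:
                return True
        return False

    # False unless "simple" has since been registered as a variant,
    # in which case "en-simple" would be the valid form.
    print(registered("variant", "simple"))

The parsing is deliberately rough (it ignores continuation lines and keeps only the last value per field), which is enough for a Type/Subtag membership test.]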