[02:10:08] Hi
[02:10:17] Can someone do a server-side upload to Commons?
[02:10:24] https://phabricator.wikimedia.org/T122685
[02:36:13] my god Ian Murdock :'(
[02:42:39] O_O
[03:14:37] So saddening.
[08:50:59] hello
[08:54:00] Hi ToAruShiroiNeko
[08:54:23] I am looking for information on Chinese Wikipedia
[08:57:23] ToAruShiroiNeko: What information?
[08:57:33] I want to learn how text is stored
[08:57:38] which variant of Chinese is used
[08:57:44] traditional or simplified
[08:57:49] I know it is converted
[08:58:06] ToAruShiroiNeko: As an example, take a look at https://zh.wikipedia.org/w/index.php?title=%E6%B3%B0%E5%9D%A6%E5%B0%BC%E5%85%8B%E5%8F%B7&action=edit
[08:58:29] We use notations like -{zh-cn:'''泰坦尼克號'''; zh-tw:'''鐵達尼號''';zh-sg:'''鐵達尼'''; zh-hk:'''鐵達尼號''';}-
[09:01:27] The software automatically transliterates text between variants, and converts many commonly used phrases. When things can't be automatically converted, we use -{}- in wikitext.
[09:02:05] Hmm
[09:02:13] so how is that stored in the database?
[09:02:31] I know no Chinese so I cannot tell the difference unfortunately
[09:04:30] Let me specify
[09:04:46] so I am curious if I can see the variant the editor used
[09:05:01] if the editor inputs in zh-cn can I determine this information?
[09:05:37] ToAruShiroiNeko: I think you can get this information from the diffs.
[09:06:30] It's not a simple flag in the database. It's a complex problem, like how you distinguish between American English and British English.
[09:07:28] But it's simpler in this case, as different variants of Chinese use different characters.
[09:07:33] yes
[09:07:41] can you give me an example diff for each variant?
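Note: the -{...}- notation and the point about variants using different characters can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not how MediaWiki's LanguageConverter or ORES actually works: the names `parse_conversion_rule`, `extract_rules` and `guess_script` are hypothetical, the parser handles only the simple `variant:text;` form quoted above, and the two character sets hold just three hand-picked pairs (号/號, 铁/鐵, 达/達) rather than a real conversion table.

```python
import re

# Matches a manual conversion block such as -{zh-cn:A; zh-tw:B;}-
# (simplified: flags and nested markup are not handled).
RULE_BLOCK = re.compile(r"-\{(.*?)\}-", re.DOTALL)

# Tiny illustrative samples only; a real tool would need a full table.
SIMPLIFIED_ONLY = set("号铁达")
TRADITIONAL_ONLY = set("號鐵達")


def parse_conversion_rule(body):
    """Split 'zh-cn:A; zh-tw:B;' into {'zh-cn': 'A', 'zh-tw': 'B'}."""
    rules = {}
    for part in body.split(";"):
        variant, sep, text = part.partition(":")
        # Skip empty trailing pieces and pieces without a 'variant:' prefix.
        if sep and variant.strip():
            rules[variant.strip()] = text.strip()
    return rules


def extract_rules(wikitext):
    """Return one dict per -{...}- block found in the wikitext."""
    return [parse_conversion_rule(m.group(1)) for m in RULE_BLOCK.finditer(wikitext)]


def guess_script(text):
    """Crude guess: count characters that appear in only one sample set."""
    simplified = sum(ch in SIMPLIFIED_ONLY for ch in text)
    traditional = sum(ch in TRADITIONAL_ONLY for ch in text)
    if simplified > traditional:
        return "simplified"
    if traditional > simplified:
        return "traditional"
    return "unknown"


if __name__ == "__main__":
    sample = "-{zh-cn:'''泰坦尼克號'''; zh-tw:'''鐵達尼號''';zh-sg:'''鐵達尼'''; zh-hk:'''鐵達尼號''';}-"
    for rule in extract_rules(sample):
        for variant, text in rule.items():
            print(variant, text, guess_script(text))
```

Counting characters that occur in only one script is a rough heuristic: short edits, or edits made mostly of characters shared by both scripts, come back as "unknown", and it cannot separate zh-hk from zh-tw because both use Traditional Chinese.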
[09:08:13] we create machine learning tools for wikis at revision scoring and we hope to serve Chinese Wikipedia
[09:08:25] and it would be helpful if we can determine the variant like this
[09:08:51] I imagine people using the zh-cn variant vandalise differently from ones using zh-tw, for example
[09:12:48] Actually have you heard about us? :o
[09:12:55] Traditional Chinese (zh-hk/zh-tw): https://zh.wikipedia.org/w/index.php?title=%E9%BB%83%E6%96%87%E6%84%8F&curid=5185136&diff=38614642&oldid=38614618
[09:13:26] ToAruShiroiNeko: Hmm... ORES?
[09:13:42] yes
[09:13:51] And simplified: https://zh.wikipedia.org/w/index.php?title=%E7%8B%AC%E7%89%B9%E7%A4%BE%E4%BC%9A_(%E5%8A%A0%E6%8B%BF%E5%A4%A7%E6%94%BF%E6%B2%BB)&curid=5185622&diff=38614646&oldid=38614626
[09:16:01] zh-hk/zh-tw is the same?
[09:16:53] There may be differences in some expressions, which are not evident in this edit. They both use traditional Chinese.
[09:17:05] By the way, changing text from one variant to another in the source wikitext is considered vandalism.
[09:17:41] noted that down, we will treat it as such.
[09:18:03] But anyway, there is [[User:Liangent-bot]] which automatically reverts such edits.
[09:18:13] A prolific example: https://zh.wikipedia.org/wiki/User_talk:203.210.7.86
[09:18:17] can you give a different example of traditional?
[09:18:48] it has urls in it which would confuse the initial tests
[09:18:56] Okay, let me see...
[09:19:16] We hope to provide scores for existing bots should they choose to use them, we will not interfere with anything on-wiki
[09:20:08] Traditional one: https://zh.wikipedia.org/w/index.php?title=%E5%8F%B0%E6%B9%BE%E5%90%8D%E5%98%B4%E6%B1%87&curid=4034839&diff=38614612&oldid=34971394
[09:21:13] zh-sg is different from the other two variants?
[09:21:52] zh-sg uses simplified Chinese, like zh-cn.
[09:22:34] so that is four variants total?
[09:22:44] two using simplified and two using traditional
[09:23:42] There are five in total. zh-cn, zh-sg are simplified, and zh-hk, zh-mo and zh-tw are traditional.
[09:23:58] "mo" is for Macau, by the way.
[09:24:06] ah
[09:29:42] approaching 15k GCI edits https://www.mediawiki.org/wiki/Special:CentralAuth/IoannisKydonis
[09:29:57] so would this sentence be accurate and fair?
[09:31:54] Nemo_bis: Wow, awesome.
[09:32:02] Chinese Wikipedia holds four variants: zh-cn, zh-sg, zh-hk, zh-tw, zh-mo. First two of these use Simplified Chinese and latter three uses Traditional Chinese. I am told all five has a level of unique phrases to a degree, it would be difficult to distinguish between them. It is possible to distinguish between Simplified Chinese and Traditional Chinese as they use different character sets.
[09:33:20] ToAruShiroiNeko: Looks fine to me, except it's "five" instead of "four" (but you have probably already noticed that)
[09:34:08] Pinging liangent, as he is more experienced in this area :P
[09:35:31] good eye :D
[09:37:17] sure
[09:37:27] I posted it @ https://phabricator.wikimedia.org/T119687
[09:38:06] feel free to comment on anything you feel is inadequate or inaccurate
[09:38:17] ToAruShiroiNeko: Okay!
[09:38:49] I've told editors in #wikipedia-zh about this. They will join if they can help.
[09:39:01] james970028: ToAruShiroiNeko is an ORES member
[09:41:11] thanks that is great!
[16:57:08] Hey all, I have a problem. How do I 'cancel' the special characteristics of the pipe character ('|') in a citation (e.g. {{ cite web | title = this that | and more | date = 25-12-30}})
[16:57:35] The title of the page should be "this that | and more"
[16:58:17] Does {{!}} work?
[16:58:24] Or do you need to escape it?
[16:59:03] Reedy, that works, thank you :)
[19:37:01] cscott and I mulling putting T112987 and T114072 in the same slot on Monday, 3:40pm at WikiDev '16
[19:37:50] cscott advocating for dropping T114072 on the floor, despite potential howls. cf https://phabricator.wikimedia.org/T119593#1911777
[19:38:14] https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016#Program
[19:38:29] anyway, I'm just throwing that out there before I step afk for a little bit. discuss ;-)
[19:38:47] * cscott considering wearing a nametag anointing himself as the "king of R2"
[20:11:40] robla: do we know if anyone from wikia will be at the summit?
[20:22:17] cscott: I don't know for sure. There's an incomplete attendee list linked to from the main summit page ^
[20:22:38] people need to explicitly opt in to having their name published
[20:57:58] * robla responds to cscott at https://phabricator.wikimedia.org/T113002#1911802
[20:59:57] i'm baffled why we'd want to discuss
and not LanguageConverter.
[21:00:29] the reading folks really really really want
[21:00:39] yes, they *want* it, they don't want to *talk* about it.
[21:00:43] as far as I can tell.
[21:01:00] yeah they just want the magic parsoid/core elves to make it happen
[21:01:13] right. so why is it worth talking about? it's already on the parsoid roadmap.
[21:01:17] cscott: I'm rereading T113002 right now, and it appears this is an example of where you aren't taking "no" for an answer. Not done reading yet though, so that may be unfair
[21:01:35] whereas there is a great difference of opinion re language converter and content translation. it seems we really *ought* to talk about that.
[21:02:11] robla: i believe i am consistent in saying we should keep our options open.
[21:02:32] cscott: are you looking for an opportunity to talk *at* people more, or is this something you really intend to learn more about next week?
[21:02:45] i've heard from both sides equally often. you should ask @gwicke whether he agrees with @tstarling that we should just add LanguageConverter support to VE and Parsoid, for instance.
[21:03:17] well, i'd rather put @tstarling and @gwicke in a room (together with all the others who have strong opinions, including the language team)
[21:03:41] last time we did that, we actually came out with a reasonable (short-term) plan forward.
[21:03:52] my only advocacy here is that we should have *some* plan.
[21:04:46] cscott, I think you've got an opportunity to say your piece about T113002 in the T119022 session at 2pm on Monday
[21:05:15] you are causing me to lose all hope in this dev summit.
[21:05:35] T119022 doesn't even have an agenda
[21:05:46] it has a list of *other* session topics
[21:06:25] T113002 has a specific goal, and two very specific options to discuss. i am optimistic we can make progress on concrete technical goals.
[21:08:30] and Tim very clearly said T113002 is "potentially distracting from the technical discussion we need to have at the dev summit."
[21:09:08] I'm pretty sure he didn't mean distracting from the technical discussion *in general*, and thus that we shouldn't discuss T113002 at all.
[21:09:15] I'm going to ask my question again and then ignore this channel until you answer it
[21:09:20] cscott: are you looking for an opportunity to talk *at* people more, or is this something you really intend to learn more about next week?
[21:09:40] he's welcome to advocate his point in R2 at the T113002 discussion. i don't believe it is a consensus opinion, but if it is, i'm happy to go with that. that would be a productive outcome of the meeting.
[21:10:05] robla: i find that question rather insulting, which is why i politely did not respond to it earlier.
[21:10:40] i have spent three years talking to people about T113002. If Tim was expressing a consensus opinion, we'd be done with the matter already and we wouldn't need to continue discussing it.
[21:11:41] i'm not saying his opinion is *wrong* (or *right*). I'm just saying that *i* don't have a dog in this fight, other than trying to get the warring sides to reach consensus. and working toward consensus is supposed to be the point of this summit.
[21:12:46] also: @tstarling was just referring to the "splitting the wikis" part. you seem to be taking his comment to mean we shouldn't discuss T113002 in a session at all. why is that?
[21:15:17] I just fully reread T113002. Tim's comment here I think states the community consensus pretty well: https://phabricator.wikimedia.org/T113002#1837794
[21:15:55] gwicke hasn't weighed in at all in this task, nor has anyone who disagrees. it's just you versus the world if you read T113002
[21:18:26] robla: perhaps you should have come to my wikimania session last year then.
[21:19:07] sorry, I was a little busy then :-P
[21:20:26] cscott: is your expectation that you get to use force of personality in undocumented in-person conversations, and then tell everyone else that "sorry, you missed it. we discussed it already"?
[21:21:18] robla: your ad hominem attacks are really quite depressing, considering that you haven't actually attended any of the talks I've given on this topic.
[21:22:09] cscott: do I need to go into further detail as to why I wasn't at Wikimania 2015?
[21:22:15] 3~/win 29
[21:23:54] paravoid: ?
[21:24:46] * robla is trying to figure out if paravoid is making obscure snark that he's not getting, or if paravoid accidentally typed something in the wrong window :-)
[21:27:04] robla: i am aware. you had an opportunity at last year's dev summit as well.
[21:27:44] cscott: is there video of your talk?
[21:28:23] you just seem to be making assumptions about how i've handled this discussion in the past that seem to be at odds with reality.
[21:28:35] sadly, no. the wmf is pretty terrible about recording sessions.
[21:28:37] cscott: is there video of your talk?
[21:28:41] perhaps dev summit '16 will be different?
[21:29:07] cscott: you've had ample opportunity to make a tech talk this year
[21:29:26] Robertson 1 is the only room that has video this year
[21:29:30] ok, i need to step away for a while.
[21:29:40] you seem to be making this extremely personal.
[22:14:30] * quiddity pours everyone a cup of tea. (208 cups of tea!) Plus platefuls of unspecified types of biscuit/cookie.
[22:15:33] thanks quiddity
[22:15:43] * cscott thanks quiddity
[22:22:56] tea sandwiches? scones?
[22:32:32] bd808: what do you envision we should get done on the
side of things?
[22:33:35] (if you have an opinion)
[22:34:57] I'm not sure what needs to be done other than coding it. There are some technical questions as I recall about how to handle various edge cases, but I think that's mostly implementation details
[22:49:24] thanks! yeah, I'm going to Phiddle in Phab phor a bit
[22:50:33] * robla will never get tired of making s/^f/^ph/ jokes, even if everyone else is already tired of it :-P
[22:51:25] oops: s/^f/ph/
[23:51:25] what do tokens mean in phab?
[23:52:43] ToAruShiroiNeko: nothing really. some people try to use them a bit like votes in bugzilla (which also really meant nothing to us)
[23:53:43] so it isn't cold hard cash showering my thread then. :/
[23:53:44] :p