[07:40:47] * yuvipanda waves
[07:41:41] since I've discovered some free time, I'm playing around with the wikidata json dumps
[07:42:02] 👍
[07:42:19] am currently running a script that's dumping all labels for all items (and properties) in all languages into a rocksdb
[07:42:26] in a form suitable for caseless comparison
[07:42:40] my hopeful next step
[07:42:52] is to build a simpler way to query WDQS
[07:43:29] so instead of
[07:44:18] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Cats
[07:44:19] I can write
[07:44:27] {'instance of': 'cat'}
[07:44:55] and very efficiently translate that into WDQS
[07:45:10] I spent some time googling around, but couldn't find anything like it tho
[07:50:30] since I'll have labels for all languages, I can easily make this work in any language :)
[07:50:33] with fallbacks
[07:54:56] looks like initial population will take about 8 more h to finish
[07:59:20] anyway, please let me know if there have been other attempts at simplified forms of SPARQL specifically targeting wikidata :D
[08:05:53] except ofc labels aren't case insensitive :(
[08:05:56] https://www.wikidata.org/wiki/Q11345716 is also same as Q1
[08:06:31] hmm, there's like 40 things called universe
[08:48:42] ok, apparently python and rocksdb don't go together very well
[08:48:47] I'll dig into this again :)
[09:30:03] yuvipanda: I don't think it could automatically guess since labels aren't unique, but maybe something like the item search when adding statements (i.e. write "cat", it lists matches and you pick the one you mean) could make it easier to write queries without having to copy and paste ids
[09:30:29] nikki: yeah, or I could also just put all of the things with that name into an 'or' :D
[09:30:33] but that sounds terrible and inefficient
[09:32:10] heh, it would also return things people didn't intend, like if you want to find things in london, you probably don't mean any london
[09:33:41] that's true!
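The idea discussed above (a caseless label -> id mapping, then turning {'instance of': 'cat'} into a WDQS query, with every entity sharing a label put into an 'or') might look roughly like the sketch below. It is only an illustration under assumptions: a plain dict stands in for the rocksdb store, the function names (build_label_index, lookup, to_sparql) and SAMPLE_LABELS are made up for this example, and the sample labels are toy data rather than anything read from the dumps. Ambiguous labels are expressed with a SPARQL VALUES clause, which has exactly the problem raised in the chat: it matches every entity with that name.

```python
from collections import defaultdict


def build_label_index(labels):
    """Map each casefolded label to the set of entity ids that carry it."""
    index = defaultdict(set)
    for entity_id, label in labels:
        index[label.casefold()].add(entity_id)
    return index


def lookup(index, label, prefix):
    """Return the sorted ids for a label, keeping only ids with the given prefix."""
    ids = sorted(i for i in index.get(label.casefold(), ()) if i.startswith(prefix))
    if not ids:
        raise KeyError(f"no entity labelled {label!r}")
    return ids


def to_sparql(query, index):
    """Translate {'property label': 'item label'} into a SPARQL query string."""
    lines = ["SELECT ?item WHERE {"]
    for n, (prop_label, value_label) in enumerate(query.items()):
        # "P" ids are properties, "Q" ids are items.
        props = " ".join(f"wdt:{p}" for p in lookup(index, prop_label, "P"))
        values = " ".join(f"wd:{v}" for v in lookup(index, value_label, "Q"))
        # VALUES acts as the 'or' over every entity sharing the label.
        lines.append(f"  VALUES ?p{n} {{ {props} }}")
        lines.append(f"  VALUES ?v{n} {{ {values} }}")
        lines.append(f"  ?item ?p{n} ?v{n} .")
    lines.append("}")
    return "\n".join(lines)


# Toy data for illustration only; a real index would be built from the JSON dumps.
SAMPLE_LABELS = [
    ("P31", "instance of"),
    ("Q146", "Cat"),
    ("Q1", "universe"),
    ("Q11345716", "Universe"),
]

if __name__ == "__main__":
    index = build_label_index(SAMPLE_LABELS)
    print(to_sparql({"instance of": "cat"}, index))
```

Picking a single match interactively, as suggested at 09:30:03, would replace the VALUES lists with one id each and avoid returning things the user did not mean.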
[09:34:20] looks like this wall can't be climbed by me atm, so I'm going to give up and do other things :(
[09:34:24] it was fun playing with rocksdb tho :D
[09:36:55] :)
[09:39:53] trying things is good even if it doesn't always work... who knows when something you did previously might be useful
[09:48:08] nikki: yeah
[09:48:20] nikki: I'm still going to try to do *some* of it tho
[09:48:28] nikki: mostly as a way to brush up my C++ :D
[09:54:50] Why can't I save P2093:"George B. Wallace"?
[09:54:56] Malformed input: George B. Wallace
[09:55:14] ...I'm not seeing any malformatting
[10:00:03] on https://www.wikidata.org/wiki/Q24669646
[10:01:19] That is supposed to work… can you try it again?
[10:01:36] Maybe re-type the input to make sure there are no zero-width spaces in there or things like that
[10:06:26] I sometimes get errors which go away when I try to save again
[10:21:24] hoo: hmm..yeah, might have been a unicode control character yeah...
[10:21:54] Good to hear
[16:07:35] is there an established practice for storing album durations? should I just store as seconds or is there a better way?
[17:06:41] nyuszika7h: hi, at the moment: no.
[19:26:36] nyuszika7h: Follow the source? If the size of a painting is in cm in the source, I just use cm, if the source says inch, I would use inch
[19:29:07] sjoerddebruin: It was fun yesterday!
[19:30:41] multichill: yeah <3
[20:19:33] multichill: well, it's minutes+seconds
[20:19:51] would be nice to store it as XXm XXs like shown in the infobox (though that doesn't get its info from Wikidata on enwiki anyway)
[20:20:09] Platonides: Hey, you in? Importing paintings from the Prado. In English I have "painting by ", in Spanish "pintura de ". Is this correct? :-)
[20:20:42] nyuszika7h: I think you need to wait for unit conversion or build it into the template with Lua.
[20:20:57] I would just do it in seconds
[20:21:02] well it doesn't use Wikidata for now so meh
[20:21:07] I just entered it in seconds for now
[20:21:22] Template can convert 140 seconds to 2:20
[20:21:33] yeah
[20:21:55] just looks weird to see it on Wikidata in seconds but I guess it doesn't matter there
[20:22:19] The lack of plural words there annoys me more. :P
[20:22:42] * hoo mumbles unit symbols
[20:25:18] So much stuff to do...
[20:38:48] I don't understand https://phabricator.wikimedia.org/T142082#2623862
[20:39:51] Nemo_bis: I think the idea is that you read through the entity before you start adding things
[20:40:17] Does someone seriously think that happens?
[20:40:26] No idea
[20:40:31] Mostly I cmd + f the property name :P
[20:40:36] * nikki grumbles at the entity suggester suggesting malé in the maldives as the best match for "male" for sex/gender
[20:40:47] sjoerddebruin: that, or just give up, is what I expect from users
[20:41:18] I never understood the removal of the TOC
[20:41:31] The TOC didn't help with this bug though
[20:42:07] Maybe.
[20:42:09] Reaching the #wikipedia-sitelinks anchor or similar is just a workaround for geeks who can as well find other workarounds
[20:42:27] (A workaround which I found comfortable, admittedly)
[20:57:08] multichill: I'd say "cuadro de "
[20:59:09] hoo: would it be possible for the item search to take into account the number of incoming links when ranking the results?
[21:01:01] nikki: In theory yes, but probably not in practice
[21:01:15] :/
[21:01:16] but with the change of that to elastic, this should improve considerably
[21:01:18] The current search infrastructure can't handle that AFAIK.
[21:01:31] the current weighting is just a hack dennyvrandecic_ implemented as a stopgap
[21:02:07] sjoerddebruin: If we really really wanted, we could… but not worth it, especially as elastic based search is in the making
[21:02:22] any idea when that'll be done?
[21:02:23] :O
[21:02:24] Anyway, my tea is ready
[21:02:29] nikki: Nothing concrete, no
[21:02:38] I think most of the ground work is done by now
[21:02:48] but not sure what's left on our side and from discovery
[21:03:06] I guess I'll just have to continue making a list of things which are weird and then I can test all of them again when it's done :P
[21:03:14] Yes :D
[21:03:18] https://phabricator.wikimedia.org/T117520#2750023 :O
[21:03:23] Also for the property suggester, please
[21:03:35] We have my page already, right?
[21:03:44] sjoerddebruin: Yes
[21:03:53] not sure how many examples of problem items you have there
[21:04:05] Oh, I could add those if you want.
[21:04:10] but I also don't know when (if at all) I can come around to changing it
[21:04:22] yeah, I've been adding my examples for that to sjoerd's page
[21:04:25] I hope soonish… but that's just hope
[21:04:27] I'll bribe Lydia next week.
[21:04:55] anyway, cu later
[21:04:57] although those are a bit awkward since adding properties can change the suggestions and I usually want to add more info to the items where I spot things
[21:06:43] Jep
[21:07:18] Films and scientific articles are still too close with suggestions at moments.
[21:07:36] But we're on bad suggestions/no suggestions now, as we removed a lot already.
[21:37:23] Platonides: Ok, I've heard both. Some sort of regional difference?
[21:37:33] https://www.museodelprado.es/en/the-collection/art-works?cidoc:p2_has_type@@@pm:objectTypeNode=http://museodelprado.es/items/objecttype_20
[21:37:36] Working on those
[21:38:09] Platonides: And how would you translate "painting by anonymous painter"?
[21:38:35] multichill: they *are* "pinturas", but it's not how I'd translate it
[21:38:43] oh, those are murals
[21:38:50] No, scroll down
[21:39:01] These are transferred difficult thingies
[21:42:55] Platonides: And how would you translate "painting by anonymous painter"?
[21:43:47] "cuadro de autor desconocido", probably
[21:57:46] https://www.wikidata.org/w/index.php?title=Q22082602&type=revision&diff=398589917&oldid=398283165 :-)
[22:14:19] nikki: I'm trying again, this time in raw C++ :D
[22:14:54] (am trying to make a rocksdb with access to a labels -> id mapping)
[23:05:37] no matter how much time passes, there is always an IP that removes links in Q15057455, may someone semi-protect it for a long period? Thx
[23:06:27] !admin