[08:25:44] What SPARQL query will give me a list of Wikidata items that have no statements, sitelinks, labels, descriptions, anything at all?
[09:18:39] aude: morning
[09:41:16] mvolz: What would it take for Citoid to support ISBN input?
[09:41:54] harej: a publicly accessible repository with ISBNs and metadata in it :)
[09:42:16] In the past we've talked to OCLC about using WorldCat
[09:42:27] Isn't Wikipedia Library working on a deal with them?
[09:42:41] Yes, Jake was doing so
[09:43:01] but WMF tried working out a deal with them more than a year ago, and that fell through
[09:43:32] and then Jake tried to get it working as well and I haven't heard anything recently about that
[09:43:52] WMF being me and JamesF and Legal :)
[09:44:10] And then Jake brought some stuff to JamesF again.
[09:45:39] I can ask Jake if there are any updates
[09:45:44] Merrilee is there, correct?
[09:45:52] Have you talked to her?
[09:52:26] Has there been any thought of using openlibrary.org data?
[09:55:09] involans: yes!
[09:55:21] I have chatted with the developers and that is definitely something we could do
[09:55:44] mvolz: I am in the room with Jake. I can ask him
[09:56:21] their database isn't as complete, so we were hoping to get WorldCat, and to be honest I haven't devoted the developer time to using it, partially because it keeps on seeming like we're about to reach a deal with WorldCat :P
[09:56:57] if someone else wants to take a crack at it, that is of course more than welcome :)
[09:57:19] harej: sweet
[09:57:44] i asked james_F in #mediawiki-visualeditor but he's on PST so is probably asleep ^-^
[09:57:51] "Great question. Let me check my email and I will forward you the most recent email."
[11:01:03] * aude waves
[11:01:09] hi mvolz
[11:04:59] :D
[11:05:23] we are figuring out the mapping from citoid / Zotero to Wikidata properties
[11:05:25] https://etherpad.wikimedia.org/p/wikicite-citoid
[11:05:46] and i am poking at the script
[11:08:30] hi aude :)
[11:08:42] Do you have an opinion about https://phabricator.wikimedia.org/T110399?
[11:08:46] (see my last comment)
[11:12:03] aude: great! I have it working on test wiki, sorta. (labelData.entities[entityId].labels.en.value is undefined, but the title shows up :D)
[11:13:28] Citoid for Wikidata? :)
[11:16:25] hoo: yes, the script that aude has written :)
[11:19:35] \o/
[11:19:50] i am changing how the script loads to avoid timing issues
[11:26:28] aude: how is P78 title? :)
[11:26:28] P78 phab_update_tag sprint error - https://phabricator.wikimedia.org/P78
[11:26:54] * aude stabs the bot :P
[11:27:16] P78 is configuration and might be title on my wiki or test.wikidata
[11:27:16] P78 phab_update_tag sprint error - https://phabricator.wikimedia.org/P78
[11:27:55] no! bad bot!
[11:28:51] how do you represent the name of an item?
[11:29:02] is there some notation i could use for that?
[11:29:12] it's not quite a property.
[11:29:38] "label"?
[11:33:27] so this JSON as it is done here is going to be a bit hard
[11:33:38] we might actually need a wikibase format or something
[11:33:44] particularly for authors
[11:33:54] authors are what tripped us up with TemplateData on mediawiki as well :)
[11:34:19] the main problem here is that we have to represent that the label is the firstName and lastName concatenated.
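As a rough illustration of the firstName/lastName concatenation aude and mvolz are wrestling with here, a label-building helper might look like the following. This is a minimal sketch, not a piece of aude's actual script; `buildAuthorLabel` is a hypothetical name, and the `creator` shape follows Zotero's creator objects.

```javascript
// Minimal sketch: derive a Wikidata-style label from a Zotero/citoid creator.
// Zotero creators carry either firstName/lastName or a single `name` field.
function buildAuthorLabel( creator ) {
	if ( creator.name ) {
		return creator.name;
	}
	return [ creator.firstName, creator.lastName ]
		.filter( Boolean ) // drop whichever part is missing
		.join( ' ' );
}

// e.g. buildAuthorLabel( { firstName: 'Ada', lastName: 'Lovelace' } )
// -> 'Ada Lovelace'
```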
[11:34:32] probably a wikibase format would be good (eventually)
[11:34:45] we don't need it until you get to authors, really
[11:35:15] i would just have the widget search for the author item
[11:35:29] if there is a good enough confidence match, then suggest them
[11:35:49] if not, then help the user to create the item and/or search
[11:35:49] yes, but it's a bit hard to represent that in the json
[11:35:56] is what I am saying :D
[11:35:58] yeah
[11:36:02] so with publicationTitle
[11:36:07] you need to add an item there
[11:36:17] and you just search for the literal publicationTitle
[11:36:20] but with author
[11:36:46] i would find the publication by looking up the ISSN number
[11:36:50] if available
[11:37:21] we would want these identifiers, ideally, indexed in elasticsearch or somewhere for lookups
[11:37:48] yeah, I'm just saying you can't really represent that in the json. or at least not in a way that pops to mind
[11:37:49] on the etherpad, it would help to note these issues :)
[11:37:49] :)
[11:38:15] right now, the widget is doing these steps
[11:38:23] I guess it's sort of already there at the bottom.
[11:38:28] ideally, citoid would do them in the backend and then provide a wikibase format
[11:41:38] @mvolz -- are you around?
[11:42:06] skarcher: not physically in Berlin unfortunately ^-^
[11:42:09] but on IRC, ya
[11:43:07] yes, I'd know if you were here ;) -- I have a couple of Citoid questions, haven't dug into the code at all, but would like to understand better
[11:43:21] for DOIs and PMIDs, you're not using Zotero code at all, correct?
[11:44:07] skarcher: nope. I know there are parts of Zotero that can do that, but we're only using the functionality exposed by translation-server
[11:44:10] Can you point me to the code there? I'd like to see how hard it'd be to add support for DataCite DOIs and for ISBNs via non-WorldCat APIs (Library of Congress, in particular)
[11:48:03] It's pretty gnarly :S
[11:48:05] https://github.com/wikimedia/citoid/blob/master/lib/Scraper.js#L293
[11:48:14] that's where we ask for CrossRef metadata
[11:48:37] so after that we could request DataCite metadata if it fails
[11:49:03] to use DataCite metadata you'll need a translator for whatever format it comes in
[11:49:15] skarcher:
[11:49:24] yup, I know that -- I've seen the CrossRef one
[11:49:57] writing a translator is a huge PITA. (for citoid). :)
[11:50:15] there are tests that can help a lot
[11:50:21] I'll see how far I can get on that; their API is very similar to CrossRef's, so I might be able to use a fair amount of C&P
[11:50:43] yeah, unfortunately it was built for CrossRef's OLD API
[11:50:47] not the new JSON one
[11:50:50] oh nos
[11:50:55] so no help there.
[11:50:57] :)
[11:51:01] I need to update that.
[11:51:31] OK, I'll poke around a bit -- do you have recommendations for a dev/testing environment?
[11:52:49] out of curiosity -- if Z's translation server did add "add by identifier" it'd be pretty easy to add to citoid, correct?
[11:54:55] mostly https://www.mediawiki.org/wiki/Citoid#Installation_2; the tests relevant for the translators are in https://github.com/wikimedia/citoid/tree/master/test/features/unit/translators
[11:55:02] thanks
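The CrossRef-then-DataCite fallback mvolz sketches above (around 11:48) could look roughly like this. The helper names `requestFromCrossRef` and `requestFromDataCite` are hypothetical stand-ins for real HTTP requests, not functions that exist in Scraper.js.

```javascript
// Rough sketch of the proposed fallback: try CrossRef for DOI metadata,
// and only ask DataCite if that fails. DataCite's response format would
// still need its own citoid translator, as discussed above.
function requestDOIMetadata( doi ) {
	return requestFromCrossRef( doi ).catch( function () {
		// CrossRef doesn't know this DOI (e.g. a DataCite-registered
		// dataset), so fall back to DataCite's API instead.
		return requestFromDataCite( doi );
	} );
}
```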
[11:55:27] skarcher: so, yes. What we currently do is resolve the DOI, and then put that URL into translation-server
[11:56:12] but if you're saying that you could add additional functionality to translation-server that would take a DOI that normally doesn't resolve to a page that matches a URL in Zotero
[11:56:22] but gives a result
[11:56:25] we could use that
[11:57:08] skarcher: have you used mocha before?
[11:57:27] no :(
[11:57:29] if you add .only after the it or the describe, it will only run the tests you're interested in
[11:57:34] so describe.only
[11:57:36] or it.only
[11:57:44] will run only the tests inside the block
[11:57:55] otherwise when you run the tests it will run ALL the tests, and they take long
[11:58:00] and will be annoying
[11:58:01] :)
[11:58:38] so if you want to only run the dublin core translator's tests
[11:58:39] https://github.com/wikimedia/citoid/blob/master/test/features/unit/translators/dublinCore.js#L8
[11:58:40] Mocha also supports filtering by search terms when you call it
[11:59:07] describe.only('dublinCore translator unit', function() {
[11:59:12] involans: oooh
[11:59:15] didn't know that
[12:00:16] see: http://mochajs.org/#usage, specifically the `--grep` option
[12:00:49] this matches any of the 'describe' or 'it' titles
[12:01:42] skarcher: another tip, some of the tests will fail if you aren't using OpenDNS. :)
[12:02:12] you use mocha for unit tests for the translators?
[12:02:33] skarcher: yes
[12:04:04] I am looking now also at Citoid
[12:04:52] and unfortunately I just realised I don't have good tests built for something like CrossRef or DataCite :( only from HTML metadata
[12:07:16] quit
[12:07:40] Can you post the test url again?
[12:07:57] https://citoid.wikimedia.org/
[12:08:00] that one?
[12:08:17] No, the specific unit test
[12:08:35] * aude taking quick break to investigate a bug on wikidata (and wikivoyage, ...)
[12:10:45] ok, thanks Sebastian
[12:16:17] skarcher, zuphilip: https://github.com/wikimedia/citoid/blob/master/test/features/unit/scraper.js tests the translators really thoroughly, based on some raw HTML files for embedded metadata
[12:16:39] should probably have something similar for some raw JSON from DataCite
[12:16:55] and CrossRef, once we switch to the newer API
[12:17:16] oh wow, https://www.wikidata.org/wiki/User:Aude/refs-template.json is formatted visually
[12:17:49] sorry, we were distracted by some MARC questions ;-)
[12:18:37] aude: Do you have an opinion re https://phabricator.wikimedia.org/T110399#2329562 ?
[12:19:59] loading from a slave is probably okay (with not so much caching)
[12:20:24] the chance of an edit conflict or something is probably small
[12:20:31] due to slave lag
[12:22:58] mvolz: What is the reason to do https://github.com/wikimedia/citoid/tree/master/lib/translators instead of using the Zotero translators for that?
[12:24:29] So, Zotero works by using a regex to match the url. If a url is given that doesn't match a translator, translation-server won't run on it
[12:24:46] those translators are for urls that Zotero won't run on, because it doesn't have a matching translator for them
[12:24:55] they will run on any page it is given
[12:25:12] we did at one point look and see if we could use Zotero translators directly
[12:25:27] but it turned out to be too intractable :)
[12:25:54] there definitely is duplication of effort there.
[12:25:57] though
[12:26:52] and the way we chose to write the translators was to stick to the data model directly, whereas Zotero will just throw out invalid fields instead, which in retrospect would have been a lot easier to code. :P
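Concretely, the two test-narrowing tricks mentioned earlier in this exchange (mvolz's `.only` and involans's `--grep`) look like this; the assertion body here is just a placeholder.

```javascript
// Appending .only to describe (or it) makes mocha run just that block
// and skip the rest of the (slow) suite.
describe.only( 'dublinCore translator unit', function () {
	it( 'translates a title', function () {
		// ... assertions for the translator under test ...
	} );
	// it.only( '...', function () { ... } ) narrows to a single case.
} );
```

Alternatively, `mocha --grep dublinCore` filters by the describe/it titles without editing the test file at all.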
[12:27:03] mvolz: https://github.com/filbertkm/wikidata-refs/pull/1
[12:27:15] aude: Ok
[12:27:54] suppose the next thing might be to add some stuff like 'use strict' and syntax checks
[12:28:03] hoo: it's just my opinion
[12:28:19] maybe also ask Daniel
[12:29:15] He's not working today, is he?
[12:29:55] he's at the conference
[12:30:28] if we want to work on this asap, then i could ask him to look
[12:35:01] aude: Not ASAP
[12:35:05] but this week or next week
[12:35:10] Aaron asked me to
[12:35:18] * hoo is at a conference too
[12:40:15] hoo: ok
[12:40:31] * aude waves to lucie
[13:16:49] anybody got news from nikki? ^^'
[13:21:26] Alphos: context?
[13:22:02] andre__: a bot request they made, been trying to get a bit more details before launching it
[13:22:43] i tried PMing them on irc a few days ago, to no avail :-(
[13:22:46] ah
[13:22:57] bnc maybe?
[13:23:25] come to think of it, i could {{ping}} them :/
[13:23:25] https://www.wikidata.org/wiki/Template:Reply_to - redirect from https://www.wikidata.org/wiki/Template:ping?redirect=no
[13:23:38] shush AsimovBot ! :p
[13:30:26] aude: Incabell said you waved at me? :)
[13:35:51] Hi frimelle
[13:36:36] \o/
[13:39:54] mvolz: The installation of Citoid looks quite complicated. Have you ever thought about a Docker container for it?
[13:41:56] zuphilip: we have a role installed for it on vagrant https://www.mediawiki.org/wiki/MediaWiki-Vagrant
[13:43:12] which is kind of similar.
[13:44:09] zuphilip: it's fairly simple to install if you don't install Zotero as well
[13:44:39] if you are only working on citoid innards you don't necessarily need to install Zotero
[13:44:43] I got citoid working
[13:44:47] just a lot of Zotero-dependent tests will fail
[13:44:58] audephone: yay!
[13:45:01] Not with vagrant
[13:46:55] I will try to install it when I have better internet... (I was just saying that there is a lot of text to read about it ;-)
[13:48:51] audephone: also completely ignore what I said about the api
[13:49:02] application/json isn't valid for raw, and neither is format=
[13:49:03] lol
[13:49:07] worst code review ever!
[13:49:20] Oh ok
[13:50:00] Somehow I think loading the scripts can be better
[13:51:18] audephone: you have to use action=query to get json
[13:51:28] Yeah
[13:52:12] Don't know if I need to use getJSON
[13:53:03] But it works
[14:01:27] audephone: that'll all get replaced by ResourceLoader stuff eventually anyway, so it doesn't matter
[14:01:29] :)
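To make the API exchange between mvolz and audephone above concrete: since action=raw won't serve JSON, a page like User:Aude/refs-template.json can be fetched through action=query instead. A minimal sketch assuming jQuery is available (as it is on-wiki), not aude's actual script; error handling omitted.

```javascript
// Fetch a wiki page's content as JSON via action=query (action=raw does
// not support format= or application/json). Pre-rvslots revision format.
$.getJSON( 'https://www.wikidata.org/w/api.php', {
	action: 'query',
	prop: 'revisions',
	rvprop: 'content',
	titles: 'User:Aude/refs-template.json',
	format: 'json',
	origin: '*' // for anonymous cross-domain use; on-wiki scripts can omit
}, function ( data ) {
	var pages = data.query.pages;
	Object.keys( pages ).forEach( function ( pageId ) {
		// The page text itself is JSON, so parse it a second time.
		var config = JSON.parse( pages[ pageId ].revisions[ 0 ][ '*' ] );
		console.log( config );
	} );
} );
```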
[14:27:11] Jonas_WMDE: I lost the link you sent me to play with the new wdqs UI stuff, care to share it again? :D
[14:32:13] hi, could someone look at this PR? https://github.com/wmde/WikidataApiGem/pull/9
[14:32:47] *looks*
[14:33:24] tgr: looks trivial! is it tested? :P
[14:34:00] addshore: not sure how I could test it before it's published
[14:35:02] cool, well I don't do ruby, but it looks sane. care to give it a once-over one more time and I'll merge it!
[14:40:48] addshore: I don't do much ruby either, just tried to copy those who do :) I'll update the gem version in Wikibase once it's merged; any problems should show up at that point (although mediawiki_api itself has already been updated in all other repos and works fine, so I doubt this could cause an error)
[14:41:22] oh, hah, I don't actually have a merge button for that repo! Lydia_WMDE ??? I can have? ;)
[14:43:56] addshore: jonas-wmde.github.io
[14:44:07] :D
[14:44:18] addshore: I am next to Lydia
[14:44:19] Jonas_WMDE: 404 :(
[14:44:40] Can you stop by here?
[14:44:54] where are you both? :P
[14:44:59] @mvolz -- earlier you wrote: "those translators are for urls that Zotero won't run on because it doesn't have a matching translator for [14:24] they will run on any page it is given"
[14:45:06] Addshore: next to sandra
[14:45:11] Haven't moved
[14:45:17] ahh okay!!!
[14:45:35] So Zotero does that, too, with a bunch of translators, and they run on their version of translation-server
[14:45:50] their target regex is simply empty. Why can't Citoid use those?
[14:46:11] (That's COinS, Embedded Metadata, unAPI, and DOI)
[14:46:11] skarcher: maybe we can!
[14:46:17] oooh
[14:46:51] I didn't see anything like that from reading the translation-server docs or code, though -- any chance you know how? :D
[14:47:25] they don't run automatically? If I just send them to the Zotero instance of that via bookmarklet it works
[14:47:42] skarcher: yeah, that is the standalone version of Zotero
[14:48:03] nono, I didn't have that running
[14:48:06] or with the chrome extension
[14:48:10] nope
[14:48:22] I can tell because it sends me to the server
[14:48:48] https://github.com/zotero/translation-server
[14:49:03] if you give that a url that doesn't match the regex
[14:49:14] it will give you a non-200 status and no response
[14:49:27] interesting, let me try and see if I can make sense of that
[14:49:33] skarcher: possible we're talking about different servers?
[14:49:41] I know zotero.org is a whole different ballgame
[14:50:07] I'm not aware of any additional translation code they're running, though
[14:50:07] which I now remember is 501 Not Implemented
[14:50:20] for any non-matching url.
[14:51:03] addshore: https://jonaskress.github.io/#%23Chemical%20elements%20and%20their%20properties%0A%23defaultView%3AMultiDimension%0ASELECT%20%3FelementLabel%20%3F_boiling_point%20%3F_melting_point%20%3F_electronegativity%20%3F_density%20%3F_mass%20WHERE%20%7B%0A%20%20%3Felement%20wdt%3AP31%20wd%3AQ11344.%0A%20%20%3Felement%20wdt%3AP2102%20%3F_boiling_point.%0A%20%20%3Felement%20wdt%3AP2101%20%3F_melting_point.%0A%20%20%3Felement%20wdt%3AP1108%20%3F_electronegativity.%0A%20%20%3Felement%20wdt%3AP2054%20%3F_density.%0A%20%20%3Felement%20wdt%3AP2067%20%3F_mass.%20%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22.%20%7D%0A%7D%0ALIMIT%20100
[14:51:54] Jonas_WMDE: awesome!
[14:52:21] hmm OK, so I'll investigate this. It seems such a waste to not use those. We put a ton of effort into the generic translators.
[14:52:32] mvolz: My colleague did some work on the translation server as well: https://github.com/infolis/translation-server and https://github.com/infolis/translation-server-cli
[14:54:08] what on earth happened to the size/date in the search results? it's a barely readable lime green D:
[14:54:47] Nikki: will be fixed in SWAT
[14:54:53] In a few minutes
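A sketch of probing translation-server the way the 14:49 exchange above describes: POST a URL to /web and check the status. The payload shape and default port follow the translation-server README of this era, so treat the details as assumptions rather than a verified recipe.

```javascript
// Ask a local translation-server to translate a URL. A 501 response means
// no Zotero translator's target regex matched the URL.
var http = require( 'http' );

var payload = JSON.stringify( {
	url: 'https://example.org/some/article',
	sessionid: 'abc123'
} );

var req = http.request( {
	host: 'localhost',
	port: 1969, // translation-server's default port
	path: '/web',
	method: 'POST',
	headers: { 'Content-Type': 'application/json' }
}, function ( res ) {
	var body = '';
	res.on( 'data', function ( chunk ) { body += chunk; } );
	res.on( 'end', function () {
		if ( res.statusCode === 501 ) {
			console.log( 'No matching translator for this URL' );
		} else {
			console.log( res.statusCode, body );
		}
	} );
} );

req.end( payload );
```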
[14:56:05] skarcher: yes, it seems silly to duplicate your efforts... I probably should have looked into this more myself as well when I first noticed it in the bookmarklet like 6 months ago or something :)
[14:56:36] the problem is I don't really use the browser Zotero, so I miss many of the cool things which (i think) are missing in translation-server
[14:57:15] phew :)
[14:58:03] not exactly sure where that code lives that runs on arbitrary urls.
[14:58:27] zuphilip: has he done a pull request upstream?
[14:59:09] mvolz: no, not yet
[14:59:53] mvolz: configuring to run as gecko "g" instead of "v" (or a similar hack) made a difference for us...
[15:00:29] zuphilip: what are you using it for?
[15:00:56] zuphilip: yes, we have taken a slightly different approach, which is to fork the translators and add 'v' flags where they seem to work
[15:01:46] mvolz: resolving DOIs to URLs and finding the PDFs
[15:01:53] @mvolz -- we'd take those as a massive PR back to Zotero; that way your updates become easier and other people benefit, too
[15:02:10] skarcher: yes, we absolutely should
[15:02:20] the problem is that some of them are kind of hacky
[15:02:38] in particular google books, which causes an internal server error
[15:02:44] whoops
[15:02:53] would need to fix that before pushing upstream
[15:02:54] :P
[15:03:12] for some reason on the multiple results pages it is not happy
[15:03:19] and causes an error
[15:03:36] but individual book pages give good results.
[15:04:20] the other issue is that the tests for translation-server have been broken for a while and I don't feel comfortable pushing things upstream which I haven't been able to run the tests on :/
[15:05:06] skarcher: I guess I will try anyway and maybe get input on some of those issues?
[15:05:35] mvolz: since I'm the most likely person to merge them... yes
[15:05:43] woo!
[15:06:43] not too worried about the lack of automated tests
[15:07:31] tgr: merged
[15:07:49] mvolz: multiples failing is more of an issue -- there's also just not that much of a downside. Zotero will try Standalone first anyway, so if something is marked wrongly compatible, I don't really see anything breaking
[15:09:59] addshore: thanks! could you also tag it as 0.2.1?
[15:40:16] ah, I forgot how to github
[17:48:19] anyone with access to https://github.com/wmde/WikidataApiGem still around?
[17:48:34] Hi, is there a way to get Special:WhatLinksHere as json?
[18:01:38] aeroid: https://www.mediawiki.org/wiki/API:Backlinks
[18:30:12] tgr: Lydia_WMDE can help you
[18:39:03] Lydia_WMDE: if you could tag https://github.com/wmde/WikidataApiGem/commit/d3965d9c000bfb244e9cd63d4df94291840b6817 as "0.2.1" that would be great
[18:39:31] I'm trying to fix tests which will break when AuthManager gets enabled, and this is the last missing piece
[21:16:27] how can I save a new value on an entity?
[21:16:54] I have filled in the value, the qualifier and a reference
[21:17:00] but save is disabled :(
[21:40:16] turns out the reference was wrong
[21:42:24] hmm
[21:42:36] why is it showing ±1 now? :S
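As a coda to aeroid's question above: per the API:Backlinks page linked in the log, Special:WhatLinksHere data is available as JSON through list=backlinks. A minimal sketch assuming jQuery; the target title here is just an example.

```javascript
// JSON equivalent of Special:WhatLinksHere for a given page.
$.getJSON( 'https://www.wikidata.org/w/api.php', {
	action: 'query',
	list: 'backlinks',
	bltitle: 'Q42', // example target page
	bllimit: 50,
	format: 'json',
	origin: '*' // for anonymous cross-domain use; on-wiki scripts can omit
}, function ( data ) {
	data.query.backlinks.forEach( function ( page ) {
		console.log( page.title );
	} );
} );
```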