[22:06:15] is there a way (via API) to search for file metadata at commons?
[22:06:35] Hi Sagan
[22:06:36] use case: I want to have a list of all files from User A that were taken by Camera B
[22:06:40] hello gry
[22:06:52] the second part is exif metadata I guess?
[22:09:10] if the exif is the data at the bottom of the file page, then yes
[22:17:39] Sagan, do you have an example of a file whose metadata says what camera it is?
[22:18:26] i think you may want to do a chained request (get image metadata for all uploads of user A), but i've not done it before, i'm not sure how it works
[22:18:41] https://commons.wikimedia.org/wiki/File:Ermita_de_Santo_Cristo_de_Miranda,_Santa_Mar%C3%ADa_de_las_Hoyas,_Soria,_Espa%C3%B1a,_2017-05-26,_DD_65.jpg has a camera model field
[22:18:45] gry: for example: https://commons.wikimedia.org/wiki/File:Talalpsee_%C3%9Cbersicht.jpg
[22:19:21] or if it helps: I do not need the data of all files, my request is more like "did User A use Camera B at any time"
[22:22:01] it's the same either way, you'd have to filter client-side
[22:23:25] Can't even do it nicely with sql
[22:23:27] https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=allimages&aisort=timestamp&aiprop=extmetadata&aiuser=Luke081515
[22:23:36] You can do it non-nicely though
[22:23:44] eh
[22:23:45] Use LIKE queries
[22:23:53] trudat
[22:23:57] to try and read the serialized php blob
[22:24:03] I've done that in the past
[22:24:46] e.g. https://quarry.wmflabs.org/query/23209
[22:24:48] I suppose for this case it would work well enough, since you're looking for one specific string
[22:24:56] not exactly what you need, but demonstrates the idea
[22:25:25] There has been on-and-off talk about exposing exif metadata to cirrus search. I don't think anyone has ever gotten around to doing it
[22:25:30] if you wanted "What cameras did User A use?"
then trying to do it pure SQL is a less great idea
[22:25:50] better solution long term is just copy it to structured data
[22:26:26] https://phabricator.wikimedia.org/T150809
[22:27:31] Well it is structured data in a sense. I would say that it's a failure of the structured data stuff, if the recommended approach is all users manually copy all the exif stuff into structured data. Users shouldn't have to do that manually
[22:27:55] it's mostly being done by bot
[22:29:10] but yes, there have definitely been problems with how SDC has been rolled out.
[22:29:20] Then again, there would have been problems no matter how it was done
[22:45:21] AntiComposite, hm, with aiprop=metadata I get it shown in the API, would at least be enough for my use case
[22:45:36] then I can get a list of all files + and search via ctrl f then
[22:47:11] AntiComposite: I mean, i still think copying exif data to sdc via bot is the wrong model. It should be imported automatically on upload
[22:56:52] Sagan, i saw https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=&list=allimages&generator=revisions&aisort=timestamp&aiuser=Anidaat&ailimit=max&grvprop=ids%7Ctimestamp%7Cflags%7Ccomment%7Cuser%7Ctags but really don't see how to either (a) get image exif from this or (b) chain another request on top of it which searches by the metadata.
(unless it is in image categories, they can be shown)
[22:58:01] gry: hm, my approach was https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=allimages&aisort=timestamp&aiprop=metadata&aiuser=Anidaat
[22:58:16] the I get "name": "model", "value" "DMC-FX01" etc
[22:58:19] *then
[22:58:52] that sounds efficient :-)
[22:59:35] since it's a one time thing I don't really care about performance, as long as it does not take hours ;)
[22:59:48] So you do care about performance ;)
[23:00:04] hm, a bit, ok :D
[23:02:28] There's only so much time until the heat death of the universe
[23:25:48] [[Tech]]; Ecko complex24; /* Cyber security involving critical infrastructure */ new section; https://meta.wikimedia.org/w/index.php?diff=19858503&oldid=19854181&rcid=14971319
[23:26:26] well that ought to be good
[23:26:49] [[Tech]]; Bawolff; Undo revision 19858503 by [[Special:Contributions/Ecko complex24|Ecko complex24]] ([[User talk:Ecko complex24|talk]]) spam; https://meta.wikimedia.org/w/index.php?diff=19858504&oldid=19858503&rcid=14971322
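
Editor's sketch, not part of the log: the approach Sagan settled on above (list=allimages with aiuser and aiprop=metadata, then filtering client-side instead of ctrl-f) could be scripted roughly as follows. The query parameters are the ones from the ApiSandbox links in the log; the function names, the User-Agent string and the assumed response shape (a "metadata" list of name/value pairs per image, possibly null) are illustrative assumptions, so treat this as a starting point rather than a tested tool.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_params(user):
    """Query parameters taken from the ApiSandbox links in the log."""
    return {
        "action": "query",
        "format": "json",
        "list": "allimages",
        "aisort": "timestamp",  # the sandbox links pair aiuser with this sort
        "aiuser": user,
        "aiprop": "metadata",
        "ailimit": "max",
    }


def camera_model(image):
    """Return the EXIF camera model of one allimages entry, or None.

    Assumes "metadata" is a list of {"name": ..., "value": ...} dicts and
    may be null or absent for files without EXIF data.
    """
    for entry in image.get("metadata") or []:
        if str(entry.get("name", "")).lower() == "model":
            return str(entry.get("value", "")).strip()
    return None


def files_taken_with(images, model):
    """Client-side filter: names of images whose camera model matches."""
    wanted = model.strip().lower()
    return [
        image.get("name")
        for image in images
        if (camera_model(image) or "").lower() == wanted
    ]


def fetch_all_images(user):
    """Yield every allimages entry for a user, following API continuation."""
    params = build_params(user)
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        request = urllib.request.Request(
            url,
            # Hypothetical User-Agent; replace with your own contact details.
            headers={"User-Agent": "exif-camera-check/0.1 (example script)"},
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        yield from data.get("query", {}).get("allimages", [])
        if "continue" not in data:
            break
        # Merge the continuation values into the next request.
        params.update(data["continue"])
```

For the "did User A use Camera B at any time" question, something like `bool(files_taken_with(fetch_all_images("Anidaat"), "DMC-FX01"))` would do; every upload still has to be fetched, so this is the same client-side filtering described in the log, just automated.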