[09:47:03] ciao sabas
[10:03:28] ciao Nemo_bis
[10:35:05] SMalyshev! When I run runBlazegraph.sh I just get a bunch of output and then it says "Killed" =[
[10:35:38] addshore: hm... what is a bunch of output? could you post it like in gist or pastebin?
[10:35:43] yarp!
[10:36:12] I suspect some memory error but need output to see what it is
[10:36:29] wait... BAH, it's working this time
[10:36:45] I thought I had this working around an hour ago, and just tried again and it failed, but it's working again now!
[10:36:59] heh :) ok, problem solved ;)
[10:37:15] just about to try and link the updater up to my wikibase and the query service backend!
[10:37:56] hmm, okay, I ended the process and tried running it again and it came up as "Killed" again!
[10:38:00] I'll paste the output
[10:38:40] SMalyshev: https://pastebin.com/cYHmK2ek
[10:39:52] That's the 2 runs, the first of which seems to be just fine, and the second of which gets killed
[10:40:07] Running with a HEAP_SIZE of 2g
[10:43:42] Wait, perhaps the issue is the docker container only reports as having 2GB memory, so I should use a smaller heap size...
[10:53:09] hmm yes this looks like the work of the OOM killer or something
[10:53:09] "2017-09-25: The forty-one-millionth item is skipped"
[10:53:12] why is it skipped?
[10:59:18] bah, this is annoying
[11:21:07] why does searching for simple, lowercase notions (e.g. 'truth') not yield the lowercase entity in the first page of results, but only people, albums, etc.? Looks like some tweaking could help.
[11:21:15] SMalyshev?
[11:21:38] abartov: by searching you mean fulltext search?
[11:21:44] yeah
[11:22:30] working on it. fulltext search will be improved soon.
Right now it's not very good
[11:22:52] abartov: watch https://phabricator.wikimedia.org/T178851
[11:23:44] once we have a proper index for labels and descriptions, and proper wikidata support in the elastic query builder, we can use proper search profiles too
[11:24:06] and then Q7949 will be on top, like it is in prefix search
[11:24:39] abartov: right now what probably happens is that ranking is done according to wikipedia rules. which are totally wrong for wikidata of course
[11:24:57] SMalyshev: ah, excellent. thank you!
[11:36:42] 2017-10-28 11:36:02.152:WARN:oejuc.AbstractLifeCycle:main: FAILED o.e.j.w.WebAppContext@6f75e721{/bigdata,file:/tmp/jetty-0.0.0.0-9999-blazegraph-service-0.3.0-SNAPSHOT.war-_bigdata-any-4519211453306268408.dir/webapp/,STARTING}{file:/wikidata-query-rdf/dist/target/service-0.3.0-SNAPSHOT/blazegraph-service-0.3.0-SNAPSHOT.war}: java.lang.OutOfMemoryError: Direct buffer memory
[11:37:40] SMalyshev: so, I have to remove the -Xmx param that is being set, as docker doesn't play nicely with that, but then it seems it just actually runs out of memory rather than being killed for using too much
[11:38:26] addshore: would setting Xmx to something like 500M work? Also, what are you trying to do there? which data are you loading?
[11:38:40] just starting the service, not loading any data
[11:39:30] hmm... then it shouldn't need too much memory. Try running with 1g
[11:39:56] i.e. set HEAP_SIZE=1g
[11:56:27] SMalyshev: managed to get that working!
[11:56:33] cool
[11:56:37] How do I pass the updater the correct API to look at to poll for changes?
[11:56:51] what do you mean?
[11:56:54] Right now I have runUpdate.sh -h http://wdqs:9999
[11:57:11] ahh.
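A minimal sketch of the sizing logic behind the "try 1g" advice above: the container's OOM killer terminates the JVM when heap plus overhead exceeds the container's memory limit, so HEAP_SIZE should sit well below that limit to leave room for direct buffers. The 2 GB limit and the 50% ratio here are illustrative assumptions, not defaults of runBlazegraph.sh.

```shell
# Pick a heap roughly half the container's memory limit, leaving headroom
# for direct buffers and JVM overhead (the ratio is an assumption, not a rule).
LIMIT_BYTES=2147483648                        # e.g. a 2 GB docker memory limit
HEAP_MB=$(( LIMIT_BYTES / 1024 / 1024 / 2 ))
echo "HEAP_SIZE=${HEAP_MB}m"                  # export this before running runBlazegraph.sh
```

With a 2 GB limit this yields HEAP_SIZE=1024m, matching the 1g suggestion in the chat.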
hmm let me see
[11:57:16] but I guess I need to pass the location of my mediawiki / wikibase install somewhere :)
[11:59:27] https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#Configurable_properties - wikibaseHost is the property
[12:00:56] hmm, so runUpdate.sh -h http://wdqs:9999 --wikibaseHost wikibase ?
[12:02:06] addshore: Where's sourcegoat?!
[12:03:03] addshore: no, UPDATER_OPTS="-DwikibaseHost=http://wikibase"
[12:03:33] addshore: it's a java property (yeah it's messy, some are options, some Java props)
[12:06:31] SMalyshev: awesome!
[12:06:35] RDF store reports the last update time is before the minimum safe poll time
[12:06:58] it is all running!!!! :D I just need to set an initial time within blazegraph!
[12:08:07] I guess I could make a mock dump to pass to loadData.sh that just contains a latest timestamp to add?
[12:33:10] Am I in
[12:41:46] SMalyshev: what is the Jolokia agent?
[12:42:51] Also, with "UPDATER_OPTS=-DwikibaseHost=http://wikibase" the updater still seems to poll from wikidata
[12:43:06] wdqs-updater_1 | 12:42:57.464 [main] INFO org.wikidata.query.rdf.tool.Updater - Polled up to 2017-09-28T12:43:22Z (next: 20170928124322|602936936) at (11.7, 6.8, 2.9) updates per second and (1101.3, 637.5, 274.3) milliseconds per second
[12:43:07] wdqs-updater_1 | 12:42:57.732 [main] INFO o.w.q.r.t.change.RecentChangesPoller - Got 91 changes, from Q2842405@567959710@20170928124322|602936936 to Q25880209@567959805@20170928124329|602937033
[13:31:00] addshore: jolokia is the library that allows publishing metrics
[13:31:46] https://jolokia.org/
[13:34:48] addshore: hmm... wikibaseHost should be a hostname I think... But even then it should still be working. What does running it with the -v option produce? What are the URLs?
[13:35:08] addshore: if you want, I'm in the booth area, we could debug it
[13:35:12] SMalyshev: runUpdate.sh -h http://wdqs:9999 -- --wikibaseHost wikibase
[13:35:18] now I need to make it use http not https!
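Collecting the pieces of this exchange into one invocation: `-h` is a regular script option pointing at the local Blazegraph endpoint, while `wikibaseHost` (a hostname, not a URL, per SMalyshev's correction) can go either after `--` or in via `UPDATER_OPTS` as a Java system property. The hostnames `wdqs` and `wikibase` are the chat's docker-compose service names; this is a sketch assembled from the messages above, not verified documentation.

```shell
# Poll the local wikibase for changes and write updates into the local
# Blazegraph instance (hostnames are assumptions from the chat's setup).
./runUpdate.sh -h http://wdqs:9999 -- --wikibaseHost wikibase
```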
[13:40:19] SMalyshev: --wikibaseScheme http :D
[13:40:32] yep just found it :) was literally typing this
[13:40:41] SMalyshev: now just to make it not look at "GET /w/api.php" and instead look at "GET /api.php"
[13:41:40] hmm this may be hardcoded...
[14:42:23] Jonas_WMDE! around?
[14:42:37] addshore: sure!
[14:42:41] where are you? :D
[14:43:03] where would you assume me to be?
[14:43:34] somewhere in the wikidatacon venue ;)
[14:44:34] no, query workshop with lucas
[15:00:38] Jonas_WMDE: aaaaaah!
[15:13:09] Jonas_WMDE: oooh, fixed the layout, but now clicking the run query button downloads the result instead of displaying it!
[15:14:20] no js?
[15:14:36] hmmmmm,mmm
[15:15:48] nope
[15:18:20] Jonas_WMDE: I have some evil stuff in the console https://usercontent.irccloud-cdn.com/file/jj2gTrVw/image.png
[15:18:36] also, I can see you, mwhahahahaaaaa
[15:18:37] ohoh
[15:18:50] your config.js file is outdated
[15:18:54] ooooh
[15:18:57] *looks*
[15:19:10] https://usercontent.irccloud-cdn.com/file/THfC86By/image.png
[15:19:18] What do I need to change? :D
[15:19:28] look at master
[15:19:39] variable names have changed (root)
[15:21:40] var root = 'https://query.wikidata.org/'; seems evil
[15:24:48] Jonas_WMDE: hooraaay, working!
[15:24:54] Lydia_WMDE: done ;)
[15:24:59] addshore :)
[15:25:18] addshore: deploy all the dockers?
[15:26:15] Jonas_WMDE: the whole chain of everything is working, a compose file to run wikibase, with a query service and UI and updater :)
[15:26:52] wikibase as a service!
[15:56:03] SMalyshev: one final hurdle
[15:56:09] I just tried rebuilding all of the images
[15:56:09] yes?
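The config.js fix above amounts to pointing the GUI's `root` at the local backend instead of query.wikidata.org. The sketch below illustrates the substitution on a stand-in file; the `/tmp` path, the local endpoint URL, and the exact variable layout are assumptions for illustration (the real values should come from the GUI's master branch, per the chat).

```shell
# Stand-in copy of the outdated config.js line, then point "root" at the
# local query service backend instead of the public one.
printf "var root = 'https://query.wikidata.org/';\n" > /tmp/config.js
sed -i "s|https://query.wikidata.org/|http://wdqs:9999/|" /tmp/config.js
cat /tmp/config.js
```

After the substitution the file reads `var root = 'http://wdqs:9999/';`, which is what makes the run-query button talk to the local service.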
[15:56:09] [INFO] Blazegraph extension to improve performance for Wikibase FAILURE [01:22 min]
[15:56:30] Failed to execute goal on project blazegraph: Could not resolve dependencies for project org.wikidata.query.rdf:blazegraph:jar:0.3.0-SNAPSHOT: The following artifacts could not be resolved: com.blazegraph:bigdata-cache:jar:2.1.5-SNAPSHOT, com.blazegraph:bigdata-client:jar:2.1.5-SNAPSHOT, com.blazegraph:bigdata-common-util:jar:2.1.5-SNAPSHOT, com.blazegraph:bigdata-core:jar:2.1.5-SNAPSHOT, com.blazegraph:bigdata-util:jar:2.1.5-SNAPSHOT,
[15:56:30] com.blazegraph:ctc-striterators:jar:2.1.5-SNAPSHOT: Could not find artifact com.blazegraph:bigdata-cache:jar:2.1.5-SNAPSHOT in wmf.mirrored (http://archiva.wikimedia.org/repository/mirrored) -> [Help 1]
[15:57:05] hmm weird...
[15:58:28] they should be there...
[15:59:14] builds fine for me though
[15:59:40] maybe some network issue?
[15:59:51] hmmm
[15:59:53] *rebuilds*
[16:06:17] hmm, a rebuild did the same :(
[16:06:54] PROBLEM - High lag on wdqs1004 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [1800.0]
[16:07:22] SMalyshev: could it be that you have some dependency downloaded which is no longer in the repo / downloadable? So your build works and mine doesn't?
[16:07:34] PROBLEM - High lag on wdqs1003 is CRITICAL: CRITICAL: 31.03% of data above the critical threshold [1800.0]
[16:07:38] hmm possible. let me check
[16:08:34] PROBLEM - High lag on wdqs1005 is CRITICAL: CRITICAL: 34.48% of data above the critical threshold [1800.0]
[16:08:35] hmm yeah looks like blazegraph 2.1.5 is no longer on maven... weird
[16:08:42] Could not find artifact com.blazegraph:bigdata-cache:jar:2.1.5-SNAPSHOT in wmf.mirrored (http://archiva.wikimedia.org/repository/mirrored)
[16:08:45] silly maven
[16:27:42] addshore: I can try to re-deploy my copy to archiva
[16:28:08] but it may require adding the snapshot repo to the pom, I'm not sure it picks that up now
[16:29:11] :/
[16:29:15] I have no idea!
:D
[16:31:15] As far as I know this is now the one thing that blocks me from showing my birthday present tomorrow :D
[16:35:10] addshore: do you really need to rebuild it?
[16:35:36] you can get all binaries from https://github.com/wikimedia/wikidata-query-deploy/
[16:36:15] you'll have to set up git-fat, but that should be trivial
[16:36:38] hmm, okay, let me give that a go
[16:41:55] and I guess I will not need git-fat if I get it from a zip!
[16:57:48] addshore: true
[16:57:56] zip contains everything.
[16:59:11] Actually, maven release 0.2.5 should be pretty fresh too, I've done it just a week or so ago
[16:59:17] http://search.maven.org/#artifactdetails%7Corg.wikidata.query.rdf%7Cservice%7C0.2.5%7Cpom
[16:59:52] so something like http://search.maven.org/remotecontent?filepath=org/wikidata/query/rdf/service/0.2.5/service-0.2.5-dist.zip should work
[17:23:55] SMalyshev: bah, Error: Invalid or corrupt jarfile jetty-runner-9.2.3.v20140905.jar from the zip
[17:24:09] addshore: hmm weird
[17:24:12] that was from the github zip
[17:24:17] let me try the maven one! :)
[17:24:29] let me check
[17:24:56] addshore: where did you d/l the zip?
[17:25:04] master branch of github!
[17:25:15] which github?
[17:25:34] https://github.com/wikimedia/wikidata-query-deploy/archive/master.zip
[17:25:43] addshore: ahh it doesn't work that way
[17:25:47] you'd still need git-fat
[17:25:58] awwww :(
[17:26:01] that zip still has git-fat stubs
[17:26:09] zipping them up does not resolve git-fat
[17:26:38] okay, but the maven zip should work without git-fat then? :D
[17:36:53] yep
[17:37:00] the maven one needs no git-fat
[17:37:13] it is fat enough by itself :)
[17:49:47] SMalyshev: https://github.com/addshore/wikibase-docker/commit/5fa1fe390ad95312c3b1c821af55eea8ebb20ca9 done and works!
[17:49:54] WOO!
[17:58:03] SMalyshev: another question! where on disk does blazegraph store its data? / what do I want to make sure I don't throw away?
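The Maven Central dist-zip workaround discussed above sidesteps git-fat entirely, because the download URL can be built mechanically from the artifact coordinates. The sketch below just reconstructs the URL SMalyshev pasted from the group/artifact/version triple, following the standard Maven repository layout:

```shell
# Build the Maven Central download path for the 0.2.5 service dist zip:
# groupId dots become path separators, followed by artifact/version/file.
GROUP="org.wikidata.query.rdf"
ARTIFACT="service"
VERSION="0.2.5"
URL="http://search.maven.org/remotecontent?filepath=$(echo "$GROUP" | tr '.' '/')/${ARTIFACT}/${VERSION}/${ARTIFACT}-${VERSION}-dist.zip"
echo "$URL"
```

This reproduces the URL from the chat; the same pattern works for any released version, which is handy when pinning a WDQS version in a Dockerfile.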
[18:37:58] aude: Error from line 120 of /var/www/html/extensions/WikibaseImport/src/EntityImporterFactory.php: Call to undefined method Wikibase\Repo\WikibaseRepo::getBaseDataModelDeserializerFactory()
[18:38:02] =[
[19:21:17] SMalyshev: next task, to get federated queries working on my wdqs instance!
[19:21:46] It looks like the default for wikibaseServiceWhitelist is whitelist.txt and that file is included in https://github.com/wikimedia/wikidata-query-deploy
[19:21:52] ahhh, but wait, I'm not using https://github.com/wikimedia/wikidata-query-deploy now!
[19:24:45] addshore: yes, whitelist.txt is basically the list of URLs
[19:25:35] you can copy it from the deploy repo or add your own, as you like
[19:25:55] I guess I just need to add query.wikidata.org :)
[19:26:06] yep good idea :)
[19:48:26] SMalyshev: 19:45:25.029 [update 7] WARN org.wikidata.query.rdf.tool.Updater - Contained error syncing. Giving up on Q83 :P
[19:48:43] So, I am running mediawiki and wikibase 1.29 and the version of wdqs that you gave me
[19:49:00] addshore: hmm weird. did it say anything else?
[19:49:04] might there be incompatibilities between that version of wikibase rdf & what the wdqs updater expects?
[19:49:24] https://www.irccloud.com/pastebin/Uuy7lHEf/
[19:49:34] 1.29 is a bit old but not that old... try running with runUpdate.sh -- -v and see what it says
[19:49:54] wdqs-updater_1 | Caused by: org.openrdf.rio.RDFParseException: IRI included an unencoded space: '32' [line 87]
[19:50:02] ah this is the problem
[19:50:31] hmm not sure why. maybe a fixed problem
[19:50:44] addshore: what do you have at http://wikibase/wiki/Special:EntityData/Q83.ttl?nocache=1509219844221&flavor=dump ?
[19:50:59] https://www.irccloud.com/pastebin/zj5YbZaY/
[19:51:23] localhost:8181 === wikibase
[19:52:13] err it doesn't have line 87?
[19:52:39] haha, that is the full output
[19:53:08] https://usercontent.irccloud-cdn.com/file/4KwUqofG/image.png
[19:53:18] hmm. that's strange.
No idea what's going on there.
[19:54:01] okay, might not be worth looking into too much
[19:54:36] java definitely thinks there's line 87 and a space there
[19:55:24] the same entity on wikidata.org is https://www.wikidata.org/wiki/Q18040662
[19:55:50] can it be some cache issue or some URL difference or something? Because I don't see how it can happen otherwise
[19:56:03] that entity has statements
[19:56:05] but that seems to display the ttl just fine https://www.wikidata.org/wiki/Special:EntityData/Q18040662.ttl?nocache=1509219844221&flavor=dump
[19:56:06] your dump doesn't
[19:56:37] Okay, it looks like something probably went wrong with the import of the item!
[19:56:47] probably more an issue with the import script than the updater
[19:56:56] possible
[20:21:44] RECOVERY - High lag on wdqs1004 is OK: OK: Less than 30.00% above the threshold [600.0]
[20:23:14] RECOVERY - High lag on wdqs1005 is OK: OK: Less than 30.00% above the threshold [600.0]
[21:27:24] RECOVERY - High lag on wdqs1003 is OK: OK: Less than 30.00% above the threshold [600.0]
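The federation setup discussed earlier in the log (wikibaseServiceWhitelist / whitelist.txt) boils down to listing one allowed SPARQL endpoint URL per line. The sketch below writes such a file at an assumed path; the `/tmp` location and the exact endpoint URL are illustrative, not taken from any verified config.

```shell
# Build a minimal federation whitelist: one SPARQL endpoint URL per line.
# The file path is a stand-in; point wikibaseServiceWhitelist at the real one.
WHITELIST=/tmp/whitelist.txt
printf '%s\n' "https://query.wikidata.org/sparql" > "$WHITELIST"
cat "$WHITELIST"
```

With the public Wikidata endpoint whitelisted, SERVICE clauses against query.wikidata.org should be accepted by the local query service.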