[09:31:19] hmm this site notice thing is interesting
[09:31:53] only appearing when browsing in en?
[09:32:00] Auregann_WMDE: ^^?
[09:32:08] apparently yeah
[09:38:36] Auregann_WMDE: is it only on search?
[09:39:40] addshore: no, sjoerd found some on other pages
[09:39:54] * addshore scrolls up
[09:40:02] * addshore re-reads ticket
[13:48:53] hey sjoerddebruin (or another admin) - can I request undeletion of another wikidata qid that our project is using?
[13:48:59] https://www.wikidata.org/wiki/Q7579310
[13:49:17] as before I'm happy to fill in the details if the page doesn't have enough of them to meet the notability threshold
[14:33:31] can someone do a mass spam delete?
[14:33:57] https://www.wikidata.org/w/index.php?title=Special:Contributions/74.213.17.130&offset=&limit=500&target=74.213.17.130 all med spam up to at least global healing ctr
[15:06:19] SMalyshev: Is there documentation on how the WQS Blazegraph instance is protected from malicious update/delete queries? E.g. is it running in read-only mode, or are incoming queries validated/filtered?
[15:08:21] earldouglas: there is a proxy in front of it
[15:08:49] earldouglas: and the proxy sets X-BIGDATA-READ-ONLY "yes"
[15:08:53] as a header
[15:09:37] Oh, that's clever. Thanks for the pointer!
[15:09:39] earldouglas: in wmf production, the config could be found at https://github.com/wikimedia/puppet/blob/e6a3cf3f2345c7907aad6e7b9c5eb1d4f512e634/modules/wdqs/templates/nginx.erb#L105
[15:10:05] earldouglas: another example is this from the wikibase-docker repo https://github.com/wmde/wikibase-docker/tree/master/wdqs-proxy
[15:10:20] then the blazegraph instance itself is not public other than via the proxy :)
[15:11:22] Right. I was curious how it could be protected even behind a proxy, given that the select and update queries are both POSTs to /sparql.
[15:12:21] That read-only header is perfect -- I can simply inject that on unauthenticated/unauthorized calls to my Blazegraph instance.
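A minimal sketch of the idea in the last message above: injecting the read-only header on unauthenticated or unauthorized calls before they are forwarded to Blazegraph. The function name `headers_for_upstream` and the `authorized` flag are hypothetical, chosen for illustration only; the header name and the value "yes" come from the log. How the caller is authenticated, and how the request is actually forwarded, are out of scope here.

```python
READ_ONLY_HEADER = "X-BIGDATA-READ-ONLY"  # header name as quoted in the log


def headers_for_upstream(incoming: dict, authorized: bool) -> dict:
    """Return the headers to forward to Blazegraph (hypothetical helper).

    Unauthorized callers get the read-only header added; authorized
    callers have their headers forwarded unchanged.
    """
    forwarded = dict(incoming)  # copy so the caller's dict is not mutated
    if not authorized:
        forwarded[READ_ONLY_HEADER] = "yes"
    return forwarded


if __name__ == "__main__":
    print(headers_for_upstream({"Accept": "application/json"}, authorized=False))
```

As the rest of the log shows, this only helps if the Blazegraph build behind the proxy actually honors the header.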
[15:13:01] addshore: Thanks for your help!
[15:13:40] :)
[15:15:24] For reference, here's where that header is defined: https://github.com/blazegraph/database/blob/f4164a9/bigdata-core/bigdata-sails/src/java/com/bigdata/rdf/sail/webapp/BigdataServlet.java#L124
[15:45:31] addshore: Does Blazegraph need to be configured to respect that header? When I pass it to my instance, I can still mutate data:
[15:45:41] curl -s -i 'http://localhost:8080/blazegraph/namespace/kb/sparql' -H 'X-BIGDATA-READ-ONLY: yes' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' --data 'update=INSERT...'
[15:45:51] as far as I know no
[15:46:49] are you posting?
[15:46:58] i also notice this bit https://github.com/wmde/wikibase-docker/blob/master/wdqs-proxy/latest/wdqs.template#L29-L31
[15:47:01] Yes, --data implies -X POST
[15:47:16] perhaps X-BIGDATA-READ-ONLY does not apply to post
[15:47:55] If so, there might be a hole in WQS's security. Has it been tested with SPARQL update queries?
[15:48:24] wdqs also does not allow post https://github.com/wikimedia/puppet/blob/e6a3cf3f2345c7907aad6e7b9c5eb1d4f512e634/modules/wdqs/templates/nginx.erb#L123-L125
[15:48:46] can you try sending a request to your instance with X-BIGDATA-READ-ONLY but only a get?
[15:53:34] Yep, using a get request causes updates to be rejected as invalid queries, whether or not read-only is set.
[15:56:12] addshore: WDQS definitely allows POST
[15:56:20] the query service UI falls back to POST for queries that are too long to GET
[15:56:33] (GET is preferred because POST bypasses the cache)
[15:56:45] aaah yes, wait, im just reading that nginx config thing wrong
[15:57:18] * addshore is away for a bit
[15:57:21] Lucas_WMDE: Does the injected read-only header prevent update queries? When I try it locally it does not.
[15:57:36] I assume it must, but I haven’t tried it
[17:15:54] earldouglas: WDQS does not allow update queries from outside
[17:18:26] readonly header should work for everything... let me check
[17:18:47] SMalyshev: How does it enforce that? It looks like POST requests are allowed, and I am able to mutate state on my local instance when I use the read-only header.
[17:19:29] in BigdataServlet, it checks the header in isWritable
[17:19:46] if (req.getHeader(HTTP_HEADER_BIGDATA_READ_ONLY) != null || getConfig(servletContext).readOnly) {
[17:19:47] buildAndCommitResponse(resp, HTTP_METHOD_NOT_ALLOWED, MIME_TEXT_PLAIN,
[17:19:47] "Not writable.");
[17:20:09] Yeah, I thought that should work, but I am able to make changes with the following request:
[17:20:25] curl -s -i 'http://localhost:8080/blazegraph/namespace/kb/sparql' -H 'X-BIGDATA-READ-ONLY: yes' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' --data 'update=INSERT...'
[17:20:35] hmm this is weird
[17:20:37] let me see
[17:20:47] can you send me the full command line?
[17:20:52] Sure
[17:21:27] curl -s -i 'http://localhost:8080/blazegraph/namespace/kb/sparql' -H 'X-BIGDATA-READ-ONLY: yes' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' --data 'update=INSERT%20DATA%20%7B%0A%20%20%3Cfoo%3E%20%3Cbar%3E%20%3Cbaz%3E%20.%0A%7D'
[17:21:37] That inserts a single triple: <foo> <bar> <baz> .
[17:22:04] It can be deleted by running the same curl command with DELETE in place of INSERT.
[17:23:02] earldouglas: I get "not writable" response... weird
[17:23:11] are you sure you are running a recent version?
[17:23:33] Do you get that from Blazegraph directly, or from a proxy in front of it?
[17:23:45] blazegraph shows version on title screen when launched. should be visible in logs too
[17:24:07] I am running 2.1.4
[17:24:35] earldouglas: no it's from blazegraph, it's blazegraph code that produces "not writable" message
[17:24:47] earldouglas: 2.1.4 may be too old
[17:25:44] earldouglas: see https://jira.blazegraph.com/browse/BLZG-7794 it's the bug that implements it
[17:25:57] it's 2.1.5 (which btw was released 2 days ago :)
[17:26:16] Ha, really?
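For readers who want to see what the URL-encoded `--data` payload in the full curl command at 17:21:27 actually sends, here is a small sketch that decodes it with Python's standard library. The helper name `extract_update` is hypothetical; the payload string is copied from the log.

```python
from urllib.parse import parse_qs


def extract_update(form_body: str) -> str:
    """Decode a form-encoded body and return its 'update' parameter (hypothetical helper)."""
    return parse_qs(form_body)["update"][0]


if __name__ == "__main__":
    body = (
        "update=INSERT%20DATA%20%7B%0A%20%20%3Cfoo%3E%20%3Cbar%3E"
        "%20%3Cbaz%3E%20.%0A%7D"
    )
    # Prints the SPARQL update: an INSERT DATA block with one triple.
    print(extract_update(body))
```

The decoded text is an `INSERT DATA` block containing the single triple `<foo> <bar> <baz> .`, matching the description in the log.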
[17:26:18] but before that we used a local fork. Now it probably makes sense to just use 2.1.5
[17:28:07] that should support the read-only header. Also, a simpler way would be to ban POSTs on the proxy, but that would not allow long queries
[17:28:15] because GET has limits on url size
[17:28:29] we used to only allow GET but with the header we now allow POST too
[17:28:46] For my use case, GETs are sufficient. Mainly I wanted to figure out why the read-only header was ineffective for POSTs.
[17:29:03] "Not writable."
[17:29:12] Deploying 2.1.5 solved it!
[17:29:14] earldouglas: yeah, so that's probably because your build did not have it yet :)
[17:29:16] Thanks a lot, SMalyshev
[17:29:17] earldouglas: great
[21:40:52] * thedj ponders how to model the distinct ingredient used for distilled alcohol...
[21:42:40] * thedj pours more whiskey
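As a recap of the Blazegraph thread, the condition quoted from BigdataServlet at 17:19:46 can be sketched in Python as below. This is an illustration of the quoted logic only, not Blazegraph's code: the function name `check_writable` and the 200 "OK" result for the writable case are assumptions, and 405 is assumed to be the numeric value behind `HTTP_METHOD_NOT_ALLOWED`.

```python
from typing import Optional, Tuple


def check_writable(header_value: Optional[str], config_read_only: bool) -> Tuple[int, str]:
    """Mirror the quoted condition: header present OR instance configured read-only.

    header_value is the X-BIGDATA-READ-ONLY request header, or None if absent.
    """
    if header_value is not None or config_read_only:
        # Assumed numeric status for HTTP_METHOD_NOT_ALLOWED.
        return 405, "Not writable."
    return 200, "OK"  # placeholder result for the writable case


if __name__ == "__main__":
    print(check_writable("yes", False))  # header present
    print(check_writable(None, False))   # no header, writable instance
```

Note that the quoted condition only tests whether the header is present, so under this reading any value would make the request read-only. Per the log, builds without this check (2.1.4 in this case) simply ignore the header.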