jena-fuseki-access - Propagate request/service context
What
Process request/service context in AccessControlledDataset the same way as the (parent) query processor does, i.e. honour the server => dataset => endpoint context value priority order.
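As a rough illustration of that priority order (not the actual Jena implementation), the sketch below layers three plain maps so that an endpoint value overrides a dataset value, which in turn overrides a server value. The class name, the merge helper and the example:timeoutMillis key are hypothetical; plain maps stand in for Jena's Context objects.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain maps stand in for Jena's Context objects.
final class ContextPrioritySketch {

    /**
     * Merge context layers given in increasing priority order.
     * A later layer overrides any earlier value for the same key.
     */
    static Map<String, Object> merge(List<Map<String, Object>> layersLowToHigh) {
        Map<String, Object> effective = new LinkedHashMap<>();
        for (Map<String, Object> layer : layersLowToHigh) {
            effective.putAll(layer);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Hypothetical key, used only for this sketch.
        Map<String, Object> server = new LinkedHashMap<>();
        server.put("example:timeoutMillis", 10000L);

        // The dataset level supplies a (placeholder) text index.
        Map<String, Object> dataset = new LinkedHashMap<>();
        dataset.put("http://jena.apache.org/text#index", "placeholder-for-text-index");

        // The endpoint level disables it, as in the use-case in this description.
        Map<String, Object> endpoint = new LinkedHashMap<>();
        endpoint.put("http://jena.apache.org/text#index", false);

        Map<String, Object> effective = merge(List.of(server, dataset, endpoint));
        System.out.println(effective);
    }
}
```

With this layering, the endpoint's false value for the text index key is what a query against that endpoint would see, while the server-level timeout is inherited unchanged.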
Why
Standard SPARQL queries honour the context of the endpoint (see docs) as well as the setting of a request-specific timeout via the timeout URL parameter. These two features allow one to:
- Override context values specified at server and/or dataset level (e.g. disable access to text indexing or change the default timeout for the endpoint)
- Specify per-request timeouts
(See the bottom of the summary for an example. See also the mailing list thread.)
How
- Use QueryExec to create the QueryExecution instead of QueryExecutionFactory (the latter does not consider the HTTPAction's context)
- Use the same timeout parameter logic for per-request timeouts as in the base SPARQL query processor
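The per-request timeout handling can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the helper from SPARQLQueryProcessor.java: it assumes the timeout parameter is given in seconds and that a request may only shorten the timeout, never exceed a configured maximum. The class and method names are hypothetical.

```java
import java.util.OptionalLong;

// Hypothetical helper; it does not reproduce Fuseki's own code.
final class RequestTimeoutSketch {

    /**
     * Resolve the timeout to apply to one request.
     *
     * @param timeoutParam raw value of the "timeout" URL parameter, or null if absent
     *                     (assumed here to be expressed in seconds)
     * @param maxMillis    upper bound allowed by the server/dataset/endpoint configuration
     * @return the timeout in milliseconds, or empty if the request supplied no usable value
     */
    static OptionalLong effectiveTimeoutMillis(String timeoutParam, long maxMillis) {
        if (timeoutParam == null || timeoutParam.trim().isEmpty()) {
            return OptionalLong.empty();
        }
        double seconds;
        try {
            seconds = Double.parseDouble(timeoutParam.trim());
        } catch (NumberFormatException e) {
            // Unparsable value: ignore it and let the configured default apply.
            return OptionalLong.empty();
        }
        if (!(seconds >= 0)) {
            // Rejects negative values and NaN.
            return OptionalLong.empty();
        }
        long requestedMillis = Math.round(seconds * 1000.0);
        // A request can shorten the timeout but not exceed the configured maximum.
        return OptionalLong.of(Math.min(requestedMillis, maxMillis));
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMillis("2.5", 10_000L));
        System.out.println(effectiveTimeoutMillis("60", 10_000L));
        System.out.println(effectiveTimeoutMillis(null, 10_000L));
    }
}
```

Returning an empty value for a missing or malformed parameter keeps the decision about the default timeout with the caller, which is where the server => dataset => endpoint context would be consulted.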
Note: I am not convinced that the way I've updated AccessControlledDataset (and promoted a helper method from SPARQLQueryProcessor.java to public) is the right way to go. But I can confirm that the following use-case works:
- Define a dataset DS1 with jena:text indexing enabled
- Define a dataset DS2, with AccessControlledDataset wrapping DS1. (The actual access rules are irrelevant here.)
- Define service A exposing a query endpoint for DS1, with extended context: ja:context [ ja:cxtName "http://jena.apache.org/text#index" ; ja:cxtValue false ] ;
- Define service B the same as A, but for DS2
Current behaviour (pre-patch):
- jena:text is exposed only with query endpoint of B. SPARQL queries against A do not match text-indexed properties.
Expected behaviour (post patch):
- Neither A nor B match text-indexed properties
From users@ thread https://lists.apache.org/thread/h0c81qjl8oc83yl2xf7xvt4l0pw4grrf
It looks like there is an issue as to whether the text dataset should push down the context setting for the index or not. Requiring endpoint configuration isn't so user-friendly.
This PR may contain a change worth making anyway, but jena-fuseki-access itself uses the context setting DataAccessCtl.symAuthorizationService, so allowing endpoints to modify the context may be a security risk. Needs investigation - it's been a long time since I looked at the code!
An outstanding question from email:
- Doesn't that [setting the context to an illegal value] cause warnings in the Fuseki log?
It'll need some tests.
- Doesn't that [setting the context to an illegal value] cause warnings in the Fuseki log?
(copied from thread reply):
Yes it does - three (expected) warnings from TextQueryPF:
- Context setting 'symbol:http://jena.apache.org/text#index'is not a TextIndex
- Failed to find the text index : tried context and as a text-enabled dataset
- No text index - no text search performed
It'll need some tests.
I'd be happy to assist with this (time permitting) if I'm pointed in the right direction.
Apologies - I updated the description since it had a mistake. The expected behaviour section has changed to:
Current behaviour (pre-patch):
- jena:text is exposed only with query endpoint of B. SPARQL queries against A do not match text-indexed properties.
Expected behaviour (post patch):
- Neither A nor B match text-indexed properties
(Changing to draft as suggested, since the best way to address this is still being debated.)
@vtermanis -- there have been bug fixes for general context handling with queries and these are now on the main branch. I don't think they immediately relate to your report but I wanted to make sure you are working against the right state of the codebase.
@afs thanks for the heads-up, do you mean:
- https://github.com/apache/jena/pull/1444
Is that the main one, or are there others in particular? (Though it'll probably be clear what's changed when I rebase and look at the query servlet & processor code.)
#1444 is about transactions.
#1375 and parts of some others around that time.