
[DISCUSSION] Should we implement access control hooks in DataFusion?

Open alamb opened this issue 2 years ago • 5 comments

Is your feature request related to a problem or challenge?

@comphead brings up an excellent point here: https://github.com/apache/arrow-datafusion/pull/7441#issuecomment-1698341294

Many database systems support some form of access control, such as letting certain users read data while restricting writes.

Such controls come in many granularities that a system might want to implement: restrictions per schema or per table, permission to read the schema versus permission to read the data, and so on.

Describe the solution you'd like

If anyone needs this today, they can build it on top of LogicalPlan (by checking LogicalPlan contents and implementing whatever controls they want).
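A minimal sketch of that approach, assuming a toy plan type rather than DataFusion's actual `LogicalPlan`. The `Plan` enum, the `Permissions` struct, and `check_access` are all hypothetical names made up for illustration; a real implementation would walk the real plan nodes instead.

```rust
use std::collections::HashSet;

/// Toy stand-in for a logical plan. This is NOT DataFusion's `LogicalPlan`;
/// it only models the shape needed to illustrate the check.
#[derive(Debug)]
enum Plan {
    /// Reads rows from a named table.
    Scan { table: String },
    /// Filters the output of its input.
    Filter { input: Box<Plan> },
    /// Joins two inputs.
    Join { left: Box<Plan>, right: Box<Plan> },
    /// Writes the output of `input` into a named table.
    Insert { table: String, input: Box<Plan> },
}

/// Hypothetical per-user permissions: which tables may be read or written.
struct Permissions {
    readable: HashSet<String>,
    writable: HashSet<String>,
}

/// Walks the plan tree and returns an error for the first disallowed access.
fn check_access(plan: &Plan, perms: &Permissions) -> Result<(), String> {
    match plan {
        Plan::Scan { table } => {
            if perms.readable.contains(table) {
                Ok(())
            } else {
                Err(format!("read access denied on table '{table}'"))
            }
        }
        Plan::Filter { input } => check_access(input, perms),
        Plan::Join { left, right } => {
            check_access(left, perms)?;
            check_access(right, perms)
        }
        Plan::Insert { table, input } => {
            // The write target is checked first, then everything the insert reads from.
            if !perms.writable.contains(table) {
                return Err(format!("write access denied on table '{table}'"));
            }
            check_access(input, perms)
        }
    }
}
```

Running a check like this after planning and before execution lets a read-only user run a join over tables they can read while rejecting an insert into the same tables.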

It might be interesting to add some sort of built in (extensible) mechanism into DataFusion to make this process easier.

Describe alternatives you've considered

I think we should wait for someone with a real use case and an implementation on top of LogicalPlan, which we could upstream once we work out the details, rather than designing this in advance. I wanted to file the ticket to track the idea.

Additional context

No response

alamb avatar Sep 07 '23 21:09 alamb

cc @waynexia @liukun4515, what do you think about it?

jackwener avatar Sep 13 '23 18:09 jackwener

@alamb do you mean that DataFusion should add a component to support ACLs? I think it would make DataFusion more complex, and DataFusion is not positioned as a DBMS.

liukun4515 avatar Sep 14 '23 02:09 liukun4515

Adding a complete implementation of access control in DataFusion might be hard. But it looks viable to me to add some basic components, to make it easier to build customized ACL on top of DataFusion.

Currently in our project, this functionality is implemented by matching the LogicalPlan and the SQL AST before executing it, just like @alamb mentioned above. https://github.com/GreptimeTeam/greptimedb/blob/9ff7670adfb56a80fe6ffeab8bdab9bcfe55543c/src/servers/src/interceptor.rs#L52-L60

For the cases I can come up with, these hooks should be enough to meet ACL requirements, since in theory the query and the plan contain all the necessary information. We could consider evolving DataFusion's hooks from that, for example by replacing QueryContext with TaskContext or SessionContext, and maybe adding an extra ACL-related field to that context.
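As a rough sketch of the "extra ACL-related field" idea, under the assumption of made-up types: `SessionCtx`, `PlanSummary`, and `before_execute` below are hypothetical and do not correspond to DataFusion's `SessionContext` or `TaskContext`, or to GreptimeDB's interceptor.

```rust
use std::collections::HashMap;

/// Kind of privilege a user can hold on a table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Privilege {
    Read,
    Write,
}

/// Hypothetical per-session context carrying an extra ACL-related field.
struct SessionCtx {
    user: String,
    /// Table name -> privileges granted to `user` on that table.
    acl: HashMap<String, Vec<Privilege>>,
}

impl SessionCtx {
    /// Returns true if the session's user holds `privilege` on `table`.
    fn allows(&self, table: &str, privilege: Privilege) -> bool {
        self.acl
            .get(table)
            .map_or(false, |granted| granted.contains(&privilege))
    }
}

/// Hypothetical summary of the tables a planned query touches.
/// A real system would derive this from the plan and/or the SQL AST.
struct PlanSummary {
    reads: Vec<String>,
    writes: Vec<String>,
}

/// Hook intended to run after planning and before execution.
fn before_execute(ctx: &SessionCtx, plan: &PlanSummary) -> Result<(), String> {
    for table in &plan.reads {
        if !ctx.allows(table, Privilege::Read) {
            return Err(format!("user '{}' may not read '{}'", ctx.user, table));
        }
    }
    for table in &plan.writes {
        if !ctx.allows(table, Privilege::Write) {
            return Err(format!("user '{}' may not write '{}'", ctx.user, table));
        }
    }
    Ok(())
}
```

Keeping the grants on the context means the hook itself stays stateless: the same `before_execute` function can serve every session, and only the context differs per user.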

waynexia avatar Sep 14 '23 09:09 waynexia

@alamb do you mean that DataFusion should add a component to support ACLs?

@liukun4515 I was thinking more like what @waynexia mentions -- datafusion would have some hooks that are extensible (aka based on a Trait) and have a simple default implementation (perhaps a noop) built into DataFusion. I am still not sure how useful such a feature would be
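To make the shape of such a hook concrete, here is one possible sketch of a trait with a no-op default. Every name in it (`AccessControl`, `AllowAll`, `ReadOnly`, `Engine`) is hypothetical; none of them exists in DataFusion.

```rust
/// Kind of access a query requests on a table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    Read,
    Write,
}

/// Hypothetical extension point; not an existing DataFusion trait.
trait AccessControl {
    /// Returns `Ok(())` if `user` may perform `access` on `table`.
    fn check(&self, user: &str, table: &str, access: Access) -> Result<(), String>;
}

/// Default no-op policy: every request is allowed.
struct AllowAll;

impl AccessControl for AllowAll {
    fn check(&self, _user: &str, _table: &str, _access: Access) -> Result<(), String> {
        Ok(())
    }
}

/// Example custom policy: reads are allowed, writes are rejected.
struct ReadOnly;

impl AccessControl for ReadOnly {
    fn check(&self, user: &str, table: &str, access: Access) -> Result<(), String> {
        match access {
            Access::Read => Ok(()),
            Access::Write => Err(format!("user '{user}' may not write to '{table}'")),
        }
    }
}

/// Toy engine that consults its policy before running anything.
struct Engine {
    policy: Box<dyn AccessControl>,
}

impl Default for Engine {
    /// Without configuration, the engine uses the no-op policy.
    fn default() -> Self {
        Engine { policy: Box::new(AllowAll) }
    }
}

impl Engine {
    /// Builds an engine with a user-supplied policy.
    fn with_policy(policy: Box<dyn AccessControl>) -> Self {
        Engine { policy }
    }

    /// Delegates the decision to whichever policy is installed.
    fn authorize(&self, user: &str, table: &str, access: Access) -> Result<(), String> {
        self.policy.check(user, table, access)
    }
}
```

With the no-op default, users who never install a policy see no change in behavior, while a system that needs ACLs swaps in its own implementation.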

alamb avatar Sep 14 '23 19:09 alamb

Hey folks. We've somehow made our way to this issue, so I can chime in here:

We're interested in using Postgres RLS around our DataFusion integration. I don't believe it's DataFusion's place to handle that, but some hooks to help us connect Postgres's permission/access control features to DataFusion would be super useful to us. Happy to discuss here or in Discord

philippemnoel avatar Feb 23 '24 05:02 philippemnoel

@philippemnoel if you have any results that come from your discussions, it would be most helpful if you can post them (or a link to them) on this ticket for anyone in the future who might also be interested

alamb avatar Feb 26 '24 18:02 alamb