add datafusion-cli support of external table locations that object_store supports
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Describe the solution you'd like
Within datafusion-cli, I'd like to be able to run a CREATE EXTERNAL TABLE statement with a location such as an S3 bucket and have it attempt to acquire credentials from the environment.
create external table lineitem stored as parquet location 's3://my_bucket/lineitem/';
-- or
create external table logs stored as csv partitioned by (year, month) location 's3://my_other_bucket/logs/';
-- or any other location prefix that object_store supports, such as Azure or GCP.
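For illustration, a rough sketch of what the CLI might do internally before executing such a statement, assuming the `AmazonS3Builder::from_env` API from the `object_store` crate and DataFusion's `register_object_store` (names and exact signatures may differ by version; the bucket name is taken from the example above):

```rust
// Sketch only: not the actual datafusion-cli implementation.
use std::sync::Arc;
use datafusion::prelude::SessionContext;
use object_store::aws::AmazonS3Builder;
use url::Url;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = SessionContext::new();

    // Pick up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION
    // from the environment.
    let s3 = AmazonS3Builder::from_env()
        .with_bucket_name("my_bucket")
        .build()?;

    // Route s3://my_bucket/... paths to this store.
    let url = Url::parse("s3://my_bucket")?;
    ctx.runtime_env().register_object_store(&url, Arc::new(s3));

    // With the store registered, the external-table location resolves.
    ctx.sql(
        "create external table lineitem stored as parquet \
         location 's3://my_bucket/lineitem/'",
    )
    .await?;
    Ok(())
}
```

The CLI would presumably dispatch on the location's URL scheme (`s3://`, `az://`, `gs://`, ...) to decide which object_store builder to use.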
This will be useful for testing, as well as for tutorials showing features of DataFusion.
Hi @kmitchener, I did some digging on this and it looks like there is some prior thinking here, most specifically this issue.
@turbo1912 yes, and the object_store crate is new since then as well. Also, @timvw has just today published some cookbooks covering object_store and S3 that may help with this: https://github.com/datafusion-contrib/datafusion-cookbook
If you want to give it a shot, go for it!
More inspiration can be found in https://github.com/datafusion-contrib/datafusion-catalogprovider-glue, where we take info from the metastore to register external tables.