Support Azure Blobs as application resources in Spark
Hello @jafreck @timotheeguerin
Right now, when aztk.spark.client.Client.submit() is called in the AZTK Spark SDK,
it assumes that the jars and files fields of ApplicationConfiguration contain paths to local files.
In our case, the Spark job resources are already uploaded to Azure Blob Storage, so we want to avoid downloading them and uploading them again.
From what I see, aztk.spark.client.Client.submit() calls generate_task, which uploads the files to blob storage, generates ResourceFiles for them, replaces the local paths in the application config with file names, and uploads the config to blob storage as application.yml.
I would like to have an option to provide resource_files directly to Client.submit() and thus skip uploading files.
Right now we use a workaround where we basically reimplement generate_task and generate the resource_files for our blobs ourselves. This seems brittle, as it is coupled to the AZTK SDK implementation and can break when AZTK changes in the future.
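To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not AZTK code: ResourceFile below is a hypothetical stand-in for whatever resource file type the SDK uses, and build_resource_files and the upload callable are names made up for illustration. The idea is only that local paths are still uploaded, while resource files passed in by the caller are used as-is.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ResourceFile:
    """Hypothetical stand-in: a blob URL plus the file name used on the node."""
    blob_source: str
    file_path: str


def build_resource_files(
    local_paths: List[str],
    upload: Callable[[str], str],
    resource_files: Optional[List[ResourceFile]] = None,
) -> List[ResourceFile]:
    """Upload local files; pass caller-supplied blob references through untouched.

    `upload` is an assumed helper that takes a local path and returns the
    URL of the uploaded blob.
    """
    # Start with the references the caller already has in blob storage.
    result = list(resource_files or [])
    for path in local_paths:
        blob_url = upload(path)  # only local files trigger an upload
        result.append(
            ResourceFile(blob_source=blob_url, file_path=os.path.basename(path))
        )
    return result
```

With something like this, a caller whose jars already live in blob storage would pass them in resource_files and leave the local path list empty, so nothing is downloaded or uploaded again.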
I think this is a great feature. We should support both scenarios - local upload and referencing existing files in storage. Thanks for the feature request!