weasel icon indicating copy to clipboard operation
weasel copied to clipboard

Fix Azure storage for assets and remote storage

Open jtnicholl-cosairus opened this issue 1 year ago • 1 comments

Though the documentation claims you can use any protocol that Smart Open supports, Azure storage accounts don't work. I wrote a quick fix to get it working.

Description

Smart Open allows providing transport parameters, which are optional, unless you're using Azure storage containers, then they are required and must include a client that has the connection string.

I made a quick fix to grab the connection string from the environment variable AZURE_STORAGE_CONNECTION_STRING if the URL is Azure. I know I probably didn't follow conventions correctly, but I needed to use Azure and that got it working.

Types of change

Small change to remote.py.

Checklist

  • [x] I confirm that I have the right to submit this contribution under the project's MIT license.
  • [x] I ran the test suite, and all new and existing tests passed.
  • [x] My changes don't require a change to the documentation, or if they do, I've added all required information.

jtnicholl-cosairus avatar Jun 21 '24 20:06 jtnicholl-cosairus

So after this I moved on to remote storage with push and pull, and ran into another problem. Smart Open expects Azure URLs to start with azure://, but Cloudpathlib expects az://. So when Cloudpathlib sees azure:// it thinks it's a local file path and changes it to azure:/, with the added bonus on Windows that it becomes azure:\ instead (along with all the other slashes). This prevents Azure from working for remote storage even after the above fix. I pushed another commit to replace az:// with azure:// in remote.py before passing it to Smart Open, which feels a bit hacky but I don't know how else to get it working.

jtnicholl-cosairus avatar Jun 28 '24 18:06 jtnicholl-cosairus