Copy blobs with AD auth
- Package Name: azure.storage.blob
- Package Version: 12.12.0
- Operating System: Linux
- Python Version: 3.8
Describe the bug
Copying blobs between accounts doesn't work with AD Auth - returns error: CannotVerifyCopySource.
I tried other tools like azcopy and saw that this same action works correctly there (without any download/reupload or accessing the account keys to generate SAS).
To Reproduce
Steps to reproduce the behavior:
- Create 2 storage accounts, and upload a file to the first account (via the Azure portal).
- Give your identity/service principal the Storage Blob Data Contributor role on both accounts.
- Then run this code:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
credential = DefaultAzureCredential()
container_name = "abc"
blob_name = "myfile_1M"
source_account_url = "https://somesource.blob.core.windows.net/"
destination_account_url = "https://somedestination.blob.core.windows.net/"
source_blob_service_client = BlobServiceClient(account_url=source_account_url, credential=credential)
source_container_client = source_blob_service_client.get_container_client(container_name)
source_blob = source_container_client.get_blob_client(blob_name)
source_blob_url = source_blob.url
destination_blob_service_client = BlobServiceClient(account_url=destination_account_url, credential=credential)
destination_blob = destination_blob_service_client.get_blob_client(container_name, source_blob.blob_name)
copy = destination_blob.start_copy_from_url(source_blob_url)
print("done")
- Get a CannotVerifyCopySource error
Expected behavior
The copy operation should be submitted successfully (just as if I were using SAS).
Hi @tamirkamara Tamir, thanks for reaching out.
When doing copy operations using AAD with the Python SDK, you need to generate the OAuth token yourself and provide it to the source_authorization keyword argument. Here is a sample based on your code:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
credential = DefaultAzureCredential()
...
# Requesting a token to the Storage resource
token = "Bearer {}".format(credential.get_token("https://storage.azure.com/.default").token)
# Note: OAuth copy is only supported currently for sync copy so you need requires_sync=True
copy = destination_blob.start_copy_from_url(source_blob_url, source_authorization=token, requires_sync=True)
Hopefully this helps resolve your issue but please let me know if it does not. Thanks.
Thanks @jalauzon-msft. That requires_sync requirement is a downside since I deal with large files and not sure the caller can wait... Any plans/work to remove that requirement?
Hi @tamirkamara Tamir, unfortunately this is a service limitation. For some reason only the Copy Blob From URL operations support AAD auth while Copy Blob does not. This is sort of what the requires_sync parameter controls.
I think this may actually be a more severe limitation for you, since Copy Blob From URL only supports source blobs up to 256 MiB. If you are trying to copy a larger blob, there does not currently seem to be a way to do it with AAD auth using standard copy operations.
AzCopy uses a different mechanism for copying blobs: it issues a series of Put Block from URL calls and then a final Put Block List. Put Block from URL supports AAD auth (via the same source_authorization keyword), and this is why the copy works in AzCopy. We don't currently offer this form of copy in the SDK, but it would be possible to implement yourself. We've also been considering adding something like this form of copy in Python but have no ETA on when that would be available.
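For anyone who wants to try the Put Block from URL approach described above, here is a rough sketch of what it might look like with the existing BlobClient methods stage_block_from_url and commit_block_list. This is not an official SDK feature and has not been tested against a live account. The helper names plan_blocks and copy_blob_in_blocks and the 64 MiB default block size are assumptions made for illustration. It also assumes the token stays valid for the whole copy, and it does not carry over metadata or other properties from the source blob.

```python
from typing import List, Tuple


def plan_blocks(total_size: int, block_size: int) -> List[Tuple[str, int, int]]:
    """Split a blob of total_size bytes into (block_id, offset, length) ranges.

    Hypothetical helper. Block IDs are zero-padded so they all have the
    same length, which the service requires within a single blob.
    """
    if total_size < 0 or block_size <= 0:
        raise ValueError("total_size must be >= 0 and block_size must be > 0")
    blocks = []
    for index, offset in enumerate(range(0, total_size, block_size)):
        length = min(block_size, total_size - offset)
        blocks.append((f"{index:08d}", offset, length))
    return blocks


def copy_blob_in_blocks(source_blob, destination_blob, credential,
                        block_size: int = 64 * 1024 * 1024) -> None:
    """Copy source_blob to destination_blob block by block using AAD auth.

    Hypothetical helper. source_blob and destination_blob are BlobClient
    instances; credential is e.g. DefaultAzureCredential().
    """
    # Imported here so plan_blocks stays usable without the SDK installed.
    from azure.storage.blob import BlobBlock

    # Same token format as in the start_copy_from_url sample above.
    # Assumption: the token does not expire before the last block is staged.
    token = "Bearer {}".format(
        credential.get_token("https://storage.azure.com/.default").token
    )

    total_size = source_blob.get_blob_properties().size
    block_list = []
    for block_id, offset, length in plan_blocks(total_size, block_size):
        # Put Block from URL: the service reads this range from the source.
        destination_blob.stage_block_from_url(
            block_id,
            source_blob.url,
            source_offset=offset,
            source_length=length,
            source_authorization=token,
        )
        block_list.append(BlobBlock(block_id=block_id))

    # Put Block List: commit the staged blocks in order.
    destination_blob.commit_block_list(block_list)
```

With the clients from the original repro, the call would be copy_blob_in_blocks(source_blob, destination_blob, credential). The blocks are staged sequentially here to keep the sketch short; staging them in parallel is possible because the final commit_block_list call fixes the order.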