[engsys] Global Sanitizers inconsistently sanitize storage account names, recordings unreplayable
Describe the bug
https://github.com/Azure/azure-sdk-for-python/pull/35196 introduced a collection of "global" sanitizers that scrub secrets from recordings as they are written to disk.
I'm currently writing a test where the code path involves:
- Fetching details about a storage account
- Using those details to build the URI for the next request
The sanitizer linked below redacts the storage account name in the recorded response from the first step:
https://github.com/Azure/azure-sdk-for-python/blob/511aef315bf6919f52c90adb1803a3b9079cbb05/tools/azure-sdk-tools/devtools_testutils/proxy_startup.py#L379
However, there is no "global" sanitizer that sanitizes storage account names in request URLs.
This leaves my recording unreplayable.
When the test runs against the recording, the code receives the sanitized response and sends the subsequent request to a URL it builds from the sanitized values: https://sanitized.blob.core.windows.net. But the recording stored an unsanitized URL for that subsequent request, https://account-name.blob.core.windows.net, so the proxy is unable to find a match.
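To make the mismatch concrete, here is a minimal stdlib-only sketch. It does not call the test proxy; `build_blob_url` and `sanitize_body` are hypothetical helpers that approximate what my client code and the body sanitizer do, using the field names from the example response in this issue.

```python
import copy

# Stand-in for the live response from the first step (values taken from this issue).
LIVE_RESPONSE = {
    "properties": {
        "accountName": "account-name",
        "containerName": "d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore",
        "endpoint": "core.windows.net",
        "protocol": "https",
    }
}


def build_blob_url(response: dict) -> str:
    """Hypothetical client logic: build the blob URL from the datastore details."""
    props = response["properties"]
    return (
        f'{props["protocol"]}://{props["accountName"]}.blob.'
        f'{props["endpoint"]}/{props["containerName"]}'
    )


def sanitize_body(response: dict) -> dict:
    """Assumed effect of the global sanitizer: only the body value is redacted."""
    sanitized = copy.deepcopy(response)
    sanitized["properties"]["accountName"] = "sanitized"
    return sanitized


# While recording, the client sees the real account name, and the request URL
# is written to the recording unchanged because no URI sanitizer applies.
recorded_url = build_blob_url(LIVE_RESPONSE)

# During playback, the client only sees the sanitized body from the recording.
playback_url = build_blob_url(sanitize_body(LIVE_RESPONSE))

urls_match = recorded_url == playback_url
print(recorded_url)
print(playback_url)
print("match:", urls_match)
```

The two URLs differ only in the account subdomain, which is the part the proxy reports as not matching.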
To Reproduce
Steps to reproduce the behavior:
- Successfully record a test in live mode that:
  - Fetches some response with details about a storage account
    // Example response
    {
      "id": "/subscriptions/00000000-0000-0000-0000-000000000/resourceGroups/00000/providers/Microsoft.MachineLearningServices/workspaces/00000/datastores/workspaceblobstore",
      "name": "workspaceblobstore",
      "type": "Microsoft.MachineLearningServices/workspaces/datastores",
      "properties": {
        ...,
        "subscriptionId": "00000000-0000-0000-0000-000000000",
        "resourceGroup": "resource-group",
        "datastoreType": "AzureBlob",
        "accountName": "account-name",
        "containerName": "d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore",
        "endpoint": "core.windows.net",
        "protocol": "https",
        "serviceDataAccessAuthIdentity": "WorkspaceSystemAssignedIdentity"
      },
      "systemData": { ... }
    }
  - Uses that response to build the URL for a subsequent request
    https://account-name.blob.core.windows.net/d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore/path/to/files
- Attempt to re-run the test in playback mode (against the recording)
Expected behavior
The test should run off the recording and pass.
Actual behavior
The test fails with the following error:
ERROR root:proxy_fixtures.py:312
-----Test proxy playback error:-----
Unable to find a record for the request PUT https://sanitized.blob.core.windows.net/d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore/LocalUpload/0e7abff4dcb2ddd489d3e72fa2039bf6/README.md?sv=2021-10-04&si=azureml-system-datastore-policy&sr=c&sig=Sanitized
Method doesn't match, request <PUT> record <HEAD>
Uri doesn't match:
request <https://sanitized.blob.core.windows.net/d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore/LocalUpload/0e7abff4dcb2ddd489d3e72fa2039bf6/README.md?sv=2021-10-04&si=azureml-system-datastore-policy&sr=c&sig=Sanitized>
record <https://account-name.blob.core.windows.net/d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore/LocalUpload/0e7abff4dcb2ddd489d3e72fa2039bf6/README.md?sv=2021-10-04&si=azureml-system-datastore-policy&sr=c&sig=Sanitized>
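For illustration, the sketch below shows the kind of URI rewrite that would make the recorded entry line up with the replayed request. It is a stdlib approximation, not the test proxy's implementation: the `ACCOUNT_IN_URI` pattern and the `sanitize_uri` helper are hypothetical, and the pattern assumes the public-cloud `blob.core.windows.net` endpoint.

```python
import re

# Hypothetical pattern: the account subdomain directly in front of the blob endpoint.
# Hyphens are allowed only so the "account-name" placeholder from this issue matches;
# real storage account names contain just lowercase letters and digits.
ACCOUNT_IN_URI = re.compile(r"(?<=://)[a-z0-9-]+(?=\.blob\.core\.windows\.net)")


def sanitize_uri(uri: str, replacement: str = "sanitized") -> str:
    """Replace the storage account subdomain, leaving path and query untouched."""
    return ACCOUNT_IN_URI.sub(replacement, uri)


PATH_AND_QUERY = (
    "/d49eda6a-ab96-4d00-b108-33768a3d0aee-azureml-blobstore/LocalUpload/"
    "0e7abff4dcb2ddd489d3e72fa2039bf6/README.md"
    "?sv=2021-10-04&si=azureml-system-datastore-policy&sr=c&sig=Sanitized"
)

# The two URIs reported by the playback error above.
request_uri = "https://sanitized.blob.core.windows.net" + PATH_AND_QUERY
record_uri = "https://account-name.blob.core.windows.net" + PATH_AND_QUERY

# With the same replacement applied to the recorded URI, the two would match.
uris_match_after_sanitizing = sanitize_uri(record_uri) == request_uri
print("match after sanitizing:", uris_match_after_sanitizing)
```

The rewrite is idempotent, so applying it to an already sanitized URI leaves it unchanged.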