[Azure ML SDK v2] Issue while registering new data asset and then getting it back right after registration
- Package Name: azure-ai-ml
- Package Version: 1.0.0
- Operating System: Windows Server 2022 Standard
- Python Version: 3.9.13
Describe the bug
When I register a new data asset using MLClient.data.create_or_update() and then try to get the newly registered entity by name using MLClient.data.get() right after creation, it throws an error like:
ValidationException Traceback (most recent call last)
---> 12 registered_data_asset = ml_client.data.get(name='new_dataset_name', label='latest')
File ...\site-packages\azure\ai\ml\operations\_data_operations.py:135, in DataOperations.get(self, name, version, label)
126 raise ValidationException(
127 message=msg,
128 target=ErrorTarget.DATA,
(...)
131 error_type=ValidationErrorType.INVALID_VALUE,
132 )
134 if label:
--> 135 return _resolve_label_to_asset(self, name, label)
137 if not version:
138 msg = "Must provide either version or label."
File ...\site-packages\azure\ai\ml\_utils\_asset_utils.py:797, in _resolve_label_to_asset(assetOperations, name, label)
790 msg = "Asset {} with version label {} does not exist in workspace."
791 raise ValidationException(
792 message=msg.format(name, label),
793 no_personal_data_message=msg.format("[name]", "[label]"),
794 target=ErrorTarget.ASSET,
...
700 error_type=ValidationErrorType.RESOURCE_NOT_FOUND,
701 )
702 return latest
ValidationException: Asset new_dataset_name does not exist in workspace workspace_name.
However, after a couple of seconds the get method works.
It seems that data asset creation is asynchronous and there is a small time lag between creating the data asset and being able to get it from the workspace.
The CLI version works fine, but az ml data create... takes significantly more time than the SDK version.
To Reproduce
Steps to reproduce the behavior:
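from azure.ai.ml.entities import Data

# ml_client is assumed to be an already authenticated MLClient for the workspace
dataset = Data(
    path='azureml://datastores/...',
    type='uri_folder',
    description='Test',
    name='new_dataset_name',
)
dataset = ml_client.data.create_or_update(dataset)
# Raises ValidationException when called immediately after creation
registered_data_asset = ml_client.data.get(name='new_dataset_name', label='latest')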
Expected behavior
The last command in the sequence should finish successfully and return registered dataset details.
@glebrh thanks for reporting this. We're working on a fix now; in the meantime you can add a small (1s) sleep between create and get.
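For anyone applying the suggested workaround, a fixed sleep can be replaced with a small bounded retry, so the call returns as soon as the asset becomes visible. The following is only a sketch: `get_with_retry` and its parameters are hypothetical names and not part of azure-ai-ml, and the retry count and delay are guesses based on the "couple of seconds" lag described above.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def get_with_retry(
    fetch: Callable[[], T],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or the attempts are used up.

    Hypothetical helper, not part of azure-ai-ml. Pass the exception
    type you expect in retry_on (in this issue, the ValidationException
    shown in the traceback) to avoid retrying on unrelated errors.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: surface the original error.
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")


# Usage with the repro above (assumes ml_client already exists):
# registered_data_asset = get_with_retry(
#     lambda: ml_client.data.get(name='new_dataset_name', label='latest')
# )
```

The helper takes a zero-argument callable, so it does not depend on the SDK and can wrap any get call. If the asset still cannot be found after the last attempt, the original exception is re-raised unchanged.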
Hi @glebrh. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text “/unresolve” to remove the “issue-addressed” label and continue the conversation.
Hi @glebrh, since you haven’t asked that we “/unresolve” the issue, we’ll close this out. If you believe further discussion is needed, please add a comment “/unresolve” to reopen the issue.