
Local deployment failing when trying to deploy a model registered in the workspace.

Open · Gunnvant opened this issue on Mar 03 '24

  • Package Name: azure.ai.ml
  • Package Version: 1.12.1
  • Operating System: macOS
  • Python Version: 3.10

Describe the bug Trying to do a local deployment based on the docs here: https://learn.microsoft.com/en-in/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli

I am able to create a local endpoint, but the deployment is failing. Below is the error trace:

Creating local deployment (mobile-pricing-endpoint-a116c8ce / blue) ....Done (0m 20s)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[19], line 1
----> 1 ml_client.online_deployments.begin_create_or_update(blue_deployment,local=True).result()

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/core/tracing/decorator.py:78, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     76 span_impl_type = settings.tracing_implementation()
     77 if span_impl_type is None:
---> 78     return func(*args, **kwargs)
     80 # Merge span is parameter is set, but only if no explicit parent are passed
     81 if merge_span and not passed_in_parent:

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/_telemetry/activity.py:275, in monitor_with_activity.<locals>.monitor.<locals>.wrapper(*args, **kwargs)
    272 @functools.wraps(f)
    273 def wrapper(*args, **kwargs):
    274     with log_activity(logger, activity_name or f.__name__, activity_type, custom_dimensions):
--> 275         return f(*args, **kwargs)

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/operations/_online_deployment_operations.py:216, in OnlineDeploymentOperations.begin_create_or_update(self, deployment, local, vscode_debug, skip_script_validation, local_enable_gpu, **kwargs)
    214     log_and_raise_error(ex)
    215 else:
--> 216     raise ex

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/operations/_online_deployment_operations.py:142, in OnlineDeploymentOperations.begin_create_or_update(self, deployment, local, vscode_debug, skip_script_validation, local_enable_gpu, **kwargs)
    135         except Exception as ex:
    136             raise LocalDeploymentGPUNotAvailable(
    137                 msg=(
    138                     "Nvidia GPU is not available in your local system."
    139                     " Use nvidia-smi command to see the available GPU"
    140                 )
    141             ) from ex
--> 142     return self._local_deployment_helper.create_or_update(
    143         deployment=deployment,
    144         local_endpoint_mode=self._get_local_endpoint_mode(vscode_debug),
    145         local_enable_gpu=local_enable_gpu,
    146     )
    147 if deployment and deployment.instance_type and deployment.instance_type.lower() in SmallSKUs:
    148     module_logger.warning(
    149         "Instance type %s may be too small for compute resources. "  # pylint: disable=line-too-long
    150         "Minimum recommended compute SKU is Standard_DS3_v2 for general purpose endpoints. Learn more about SKUs here: "  # pylint: disable=line-too-long
    151         "https://learn.microsoft.com/en-us/azure/machine-learning/referencemanaged-online-endpoints-vm-sku-list",
    152         deployment.instance_type,  # pylint: disable=line-too-long
    153     )

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/operations/_local_deployment_helper.py:103, in _LocalDeploymentHelper.create_or_update(self, deployment, local_endpoint_mode, local_enable_gpu)
    101     log_and_raise_error(ex)
    102 else:
--> 103     raise ex

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/operations/_local_deployment_helper.py:88, in _LocalDeploymentHelper.create_or_update(self, deployment, local_endpoint_mode, local_enable_gpu)
     82     deployment_metadata = json.dumps(deployment._to_dict())
     83     endpoint_metadata = (
     84         endpoint_metadata
     85         if endpoint_metadata
     86         else _get_stubbed_endpoint_metadata(endpoint_name=deployment.endpoint_name)
     87     )
---> 88     local_endpoint_polling_wrapper(
     89         func=self._create_deployment,
     90         message=f"{operation_message} ({deployment.endpoint_name} / {deployment.name}) ",
     91         endpoint_name=deployment.endpoint_name,
     92         deployment=deployment,
     93         local_endpoint_mode=local_endpoint_mode,
     94         local_enable_gpu=local_enable_gpu,
     95         endpoint_metadata=endpoint_metadata,
     96         deployment_metadata=deployment_metadata,
     97     )
     98     return self.get(endpoint_name=deployment.endpoint_name, deployment_name=deployment.name)
     99 except Exception as ex:  # pylint: disable=broad-except

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/_utils/_endpoint_utils.py:101, in local_endpoint_polling_wrapper(func, message, **kwargs)
     99 event = pool.submit(func, **kwargs)
    100 polling_wait(poller=event, start_time=start_time, message=message, is_local=True)
--> 101 return event.result()

File ~/miniforge3/envs/azure-ml/lib/python3.10/concurrent/futures/_base.py:451, in Future.result(self, timeout)
    449     raise CancelledError()
    450 elif self._state == FINISHED:
--> 451     return self.__get_result()
    453 self._condition.wait(timeout)
    455 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File ~/miniforge3/envs/azure-ml/lib/python3.10/concurrent/futures/_base.py:403, in Future.__get_result(self)
    401 if self._exception:
    402     try:
--> 403         raise self._exception
    404     finally:
    405         # Break a reference cycle with the exception in self._exception
    406         self = None

File ~/miniforge3/envs/azure-ml/lib/python3.10/concurrent/futures/thread.py:58, in _WorkItem.run(self)
     55     return
     57 try:
---> 58     result = self.fn(*self.args, **self.kwargs)
     59 except BaseException as exc:
     60     self.future.set_exception(exc)

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/operations/_local_deployment_helper.py:219, in _LocalDeploymentHelper._create_deployment(self, endpoint_name, deployment, local_endpoint_mode, local_enable_gpu, endpoint_metadata, deployment_metadata)
    202 (model_name, model_version, model_directory_path,) = get_model_artifacts(
    203     endpoint_name=endpoint_name,
    204     deployment=deployment,
    205     model_operations=self._model_operations,
    206     download_path=deployment_directory_path,
    207 )
    209 # Resolve the environment information
    210 # - Image + conda file - environment.image / environment.conda_file
    211 # - Docker context - environment.build
    212 (
    213     yaml_base_image_name,
    214     yaml_env_conda_file_path,
    215     yaml_env_conda_file_contents,
    216     downloaded_build_context,
    217     yaml_dockerfile,
    218     inference_config,
--> 219 ) = get_environment_artifacts(
    220     endpoint_name=endpoint_name,
    221     deployment=deployment,
    222     environment_operations=self._environment_operations,
    223     download_path=deployment_directory,
    224 )
    225 # Retrieve AzureML specific information
    226 # - environment variables required for deployment
    227 # - volumes to mount into container
    228 image_context = AzureMlImageContext(
    229     endpoint_name=endpoint_name,
    230     deployment_name=deployment_name,
   (...)
    236     model_mount_path=f"/{model_name}/{model_version}" if model_name else "",
    237 )

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/_local_endpoints/validators/environment_validator.py:51, in get_environment_artifacts(endpoint_name, deployment, environment_operations, download_path)
     29 """Validates and returns artifacts from environment specification.
     30 
     31 :param endpoint_name: name of endpoint which this deployment is linked to
   (...)
     48 :raises: azure.ai.ml._local_endpoints.errors.CloudArtifactsNotSupportedError
     49 """
     50 # Validate environment for local endpoint
---> 51 if _environment_contains_cloud_artifacts(deployment=deployment):
     52     if isinstance(deployment.environment, Environment):
     53         environment_asset = deployment.environment

File ~/miniforge3/envs/azure-ml/lib/python3.10/site-packages/azure/ai/ml/_local_endpoints/validators/environment_validator.py:208, in _environment_contains_cloud_artifacts(deployment)
    207 def _environment_contains_cloud_artifacts(deployment: OnlineDeployment):
--> 208     return isinstance(deployment.environment, str) or deployment.environment.id is not None

AttributeError: 'NoneType' object has no attribute 'id'

To Reproduce Steps to reproduce the behavior:

  1. Do a model run and register the model using mlflow.log_model (a minimal sketch of this step is shown after the list)
  2. Then follow the instructions in the link provided above
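
A minimal, hypothetical sketch of what step 1 might look like (the model, features, and labels below are placeholders, not the actual training code from this issue; it assumes the MLflow tracking URI already points at the Azure ML workspace):

import mlflow
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the real mobile-pricing dataset
X = np.random.rand(200, 4)
y = np.random.randint(0, 4, size=200)

with mlflow.start_run():
    clf = RandomForestClassifier().fit(X, y)
    # registered_model_name registers a model version in the workspace,
    # which is later fetched with ml_client.models.get(...)
    mlflow.sklearn.log_model(
        sk_model=clf,
        artifact_path="model",
        registered_model_name="mobile_pricing_model",
    )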

Expected behavior The local deployment should happen as described in the docs.

Gunnvant avatar Mar 03 '24 11:03 Gunnvant

Providing the code used:

from azure.ai.ml import MLClient
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential

keyVaultName = ""
KVUri = f"https://{keyVaultName}.vault.azure.net"
credential = DefaultAzureCredential()
client = SecretClient(vault_url=KVUri, credential=credential)

subs_id = client.get_secret("subscription-id").value
rg_name = client.get_secret("ml-resource-group").value
ws_name = client.get_secret("ml-workspace-name").value


ml_client = MLClient(
    credential=credential,
    subscription_id=subs_id,
    resource_group_name=rg_name,
    workspace_name=ws_name,
)

mobile_pricing_model = ml_client.models.get(name="mobile_pricing_model",version="2")

from azure.ai.ml.entities import ManagedOnlineEndpoint

import uuid
# endpoint_name = "mobile-pricing-endpoint-" + str(uuid.uuid4())[:8]
endpoint_name = "mobile-pricing-endpoint-a116c8ce"

endpoint = ManagedOnlineEndpoint(
    name = endpoint_name, 
    description="this is a sample endpoint",
    auth_mode="key"
)

ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

from azure.ai.ml.entities import ManagedOnlineDeployment, Environment

endpoint_name="mobile-pricing-endpoint-a116c8ce" ## endpoint name has format requirements
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=mobile_pricing_model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

ml_client.online_deployments.begin_create_or_update(blue_deployment,local=True).result()

Gunnvant avatar Mar 03 '24 11:03 Gunnvant

@Gunnvant you have not specified the deployment environment in the code. Please re-check the docs: https://learn.microsoft.com/en-in/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli

Example:

from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)

model = Model(path="../model-1/model/sklearn_regression_model.pkl")

env = Environment(
    conda_file="../model-1/environment/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        code="../model-1/onlinescoring", scoring_script="score.py"
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

sshiri-msft avatar Mar 04 '24 18:03 sshiri-msft

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

github-actions[bot] avatar Mar 05 '24 18:03 github-actions[bot]

@sshiri-msft, this is an MLflow model. For local deployment, do we need to provide the environment? That is not required when deploying to Azure.

It would help if the docs specified that local deployment scenarios (for testing and debugging) differ from deployment to Azure.

Gunnvant avatar Mar 06 '24 12:03 Gunnvant

@Gunnvant yes, for local deployment you will have to provide the environment.

Good suggestion; I will follow up with the relevant PM to update the docs.
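
For illustration, a hedged sketch of how your snippet could be adapted for local deployment (the conda file path, scoring folder, and score.py below are placeholders, not files from this issue; the conda file would need the model's runtime dependencies such as mlflow and scikit-learn plus an inference server package such as azureml-inference-server-http or azureml-defaults):

from azure.ai.ml.entities import ManagedOnlineDeployment, Environment, CodeConfiguration

# Hypothetical environment: conda.yaml must list the model's dependencies
# and an inference server package so the local container can serve requests.
env = Environment(
    conda_file="./environment/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=mobile_pricing_model,
    environment=env,
    code_configuration=CodeConfiguration(
        code="./onlinescoring",      # placeholder folder
        scoring_script="score.py",   # placeholder scoring script
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

ml_client.online_deployments.begin_create_or_update(blue_deployment, local=True).result()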

sshiri-msft avatar Mar 07 '24 00:03 sshiri-msft