Can't deploy pretrained model even after following the documentation
Discussed in https://github.com/aws/sagemaker-python-sdk/discussions/3638
Originally posted by monika-prajapati February 6, 2023. I have a model that I want to deploy as a SageMaker endpoint. I followed this documentation and did the following:
- Created an `inference.py` script with `model_fn`, `input_fn`, `predict_fn`, and `output_fn`, using this as a reference
- Arranged the files/folders according to the documentation and built the `model.tar.gz` file:
```
.
├── code
│   ├── inference.py
│   └── requirements.txt
└── model.pth
```
I created `model.tar.gz` with `.` as the root, from inside the directory containing the `code` folder.
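The packaging step above can be sketched in Python with the standard-library `tarfile` module (the helper name is my own; the archive paths follow the layout shown above, which expects `model.pth` at the archive root and the inference code under `code/`):

```python
import tarfile
from pathlib import Path

def build_model_archive(root: Path, out: Path) -> Path:
    """Package model.pth and the code/ folder into model.tar.gz.

    Paths inside the archive must be relative ("model.pth",
    "code/inference.py"), not absolute, or the serving container
    will not find them.
    """
    with tarfile.open(out, "w:gz") as tar:
        tar.add(root / "model.pth", arcname="model.pth")
        tar.add(root / "code" / "inference.py", arcname="code/inference.py")
        tar.add(root / "code" / "requirements.txt", arcname="code/requirements.txt")
    return out
```

Uploading the resulting archive to S3 (e.g. with `sagemaker.Session().upload_data`) gives the `model_data` URI used below.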
My code in the SageMaker notebook looks like this:

```python
import boto3
import sagemaker
from sagemaker.pytorch import PyTorchModel

session = boto3.Session()
sagemaker_client = session.client('sagemaker')
role = sagemaker.get_execution_role()

# Define the model data location in S3
model_data = 's3://speech2textmodel/model.tar.gz'

# Define the model
model1 = PyTorchModel(model_data=model_data,
                      role=role,
                      entry_point='inference.py',
                      framework_version='1.6.0',
                      py_version='py3')

predictor = model1.deploy(instance_type='ml.m5.xlarge', initial_instance_count=1)
```
I got this error:

```
UnexpectedStatusException: Error hosting endpoint pytorch-inference-2023-02-06-09-28-21-891: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.
```
This is the error in CloudWatch:

```
ERROR - /.sagemaker/ts/models/model.mar already exists.
```
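As a later comment notes, the real cause is often earlier in the CloudWatch log than this final line. A minimal sketch for tailing the latest log stream of an endpoint with boto3 (the helper names are my own; the log-group naming follows the standard `/aws/sagemaker/Endpoints/<endpoint-name>` convention):

```python
def endpoint_log_group(endpoint_name: str) -> str:
    # SageMaker endpoints write to this CloudWatch log group.
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"

def tail_endpoint_logs(endpoint_name: str, limit: int = 50) -> list:
    """Return the most recent log messages for a SageMaker endpoint."""
    import boto3  # imported here so the helper is only needed at call time
    logs = boto3.client("logs")
    group = endpoint_log_group(endpoint_name)
    streams = logs.describe_log_streams(
        logGroupName=group, orderBy="LastEventTime", descending=True
    )["logStreams"]
    if not streams:
        return []
    events = logs.get_log_events(
        logGroupName=group,
        logStreamName=streams[0]["logStreamName"],
        limit=limit,
    )["events"]
    return [e["message"] for e in events]
```

Scanning the start of these messages (pip install output, import errors) usually points at the actual failure.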
Do you have a solution for this? I am facing the same problem.
@KennyTC Nope.
I am facing the same issue... @KennyTC @bhattbhuwan13 Have you fixed this?
I faced the same issue with the same error; the error message itself doesn't seem meaningful. In my case, `requirements.txt` pinned library versions that weren't compatible with the Python version I chose for the container image. I realized that by reading the beginning of the CloudWatch log for that particular deploy execution. After I fixed the requirements, I was able to deploy my `PyTorchModel` and get the endpoint created and running.
I was able to resolve this by ensuring the PyTorch image version specified matched my custom `requirements.txt` and Python version, e.g.
```python
pytorch_model = PyTorchModel(model_data=fname,
                             role=role,
                             entry_point='inference.py',
                             framework_version='2.1.0',
                             py_version='py310')
```
requirements.txt:

```
boto3==1.33.3
botocore==1.33.3
torch==2.0.0
```
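Building on the fix above, here is a small hypothetical helper (the function name and the simple line-based parsing are my own) that flags a mismatch between the `torch` pin in `requirements.txt` and the `framework_version` passed to `PyTorchModel` before you deploy:

```python
def check_torch_pin(requirements_path: str, framework_version: str):
    """Return the pinned torch version, warning if it differs from
    the PyTorch container's framework_version."""
    with open(requirements_path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("torch=="):
                pinned = line.split("==", 1)[1]
                if pinned != framework_version:
                    print(f"Warning: requirements.txt pins torch=={pinned}, "
                          f"but framework_version is {framework_version}")
                return pinned
    return None  # no explicit torch pin found
```

A mismatch here isn't always fatal (pip may override the container's torch), but keeping the two aligned avoids exactly the kind of opaque health-check failure described in this thread.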