
flan-t5-xl and flan-t5-xxl model deployment on SageMaker fails when deploying from the Hugging Face Hub

Open rags1357 opened this issue 2 years ago • 4 comments

System Info

Code to reproduce, generated from the model hub - https://huggingface.co/google/flan-t5-large/tree/main -> Deploy -> Amazon SageMaker endpoint -> AWS

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
	'HF_MODEL_ID':'google/flan-t5-xl',
	'HF_TASK':'text2text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
	transformers_version='4.17.0',
	pytorch_version='1.10.2',
	py_version='py38',
	env=hub,
	role=role, 
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
	initial_instance_count=1, # number of instances
	instance_type='ml.m5.xlarge' # ec2 instance type
)

predictor.predict({
	'inputs': "The answer to the universe is"
})

The endpoint invocation fails with the error below:

2023-03-10 19:15:14,508 [INFO ] W-google__flan-t5-xl-5-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index', 'flax_model.msgpack'] found in directory /.sagemaker/mms/models/google__flan-t5-xl or from_tf and from_flax set to False.

This is possibly because the model's weights have been split into multiple shards due to their size (pytorch_model-00001-of-00002.bin, etc., visible under "Files and Versions" (Link)), while the out-of-the-box solution looks for a single pytorch_model.bin file and fails.
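To illustrate the mismatch, here is a hypothetical sketch (not the actual Transformers loader code) of the difference between the two lookup strategies: older versions only accept the single-file names listed in the error, while newer versions also resolve a sharded repo via its pytorch_model.bin.index.json index file.

```python
# Hypothetical sketch of checkpoint resolution; file names match the
# Transformers conventions, but the function itself is illustrative only.
WEIGHTS_NAME = "pytorch_model.bin"
WEIGHTS_INDEX_NAME = "pytorch_model.bin.index.json"

def resolve_checkpoint(files):
    """Return which weight file(s) a loader should use from a repo listing."""
    if WEIGHTS_NAME in files:
        # Single-file checkpoint: the only case transformers 4.17.0 handles.
        return [WEIGHTS_NAME]
    if WEIGHTS_INDEX_NAME in files:
        # Sharded checkpoint: load every shard referenced next to the index.
        return sorted(f for f in files
                      if f.startswith("pytorch_model-") and f.endswith(".bin"))
    raise OSError(f"Error no file named {WEIGHTS_NAME} found")

# A flan-t5-xl style listing: two shards plus the index, no single weight file.
repo = [
    "config.json",
    "pytorch_model.bin.index.json",
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
]
print(resolve_checkpoint(repo))
# -> ['pytorch_model-00001-of-00002.bin', 'pytorch_model-00002-of-00002.bin']
```

A loader with only the first branch raises exactly the OSError shown in the log when pointed at this directory.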

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Provided code in the issue description

Expected behavior

Deployment should work out of the box on SageMaker.

rags1357 avatar Mar 10 '23 19:03 rags1357

@philschmid could you please help here? I've gone through your workaround here

rags1357 avatar Mar 10 '23 19:03 rags1357

You need to install a more recent version of Transformers; 4.17.0 won't support sharded checkpoints.

sgugger avatar Mar 10 '23 21:03 sgugger
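For reference, a minimal sketch of the same deployment with a newer container, per the suggestion above. The exact transformers_version/pytorch_version/py_version triple must be one published for the SageMaker Hugging Face DLCs; 4.26.0/1.13.1/py39 is assumed here, and running this requires an AWS account with a SageMaker execution role.

```python
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration, unchanged from the original reproduction.
hub = {
    'HF_MODEL_ID': 'google/flan-t5-xl',
    'HF_TASK': 'text2text-generation'
}

# A container with a recent Transformers release can resolve
# sharded checkpoints (pytorch_model.bin.index.json plus shards).
huggingface_model = HuggingFaceModel(
    transformers_version='4.26.0',  # assumed available DLC version
    pytorch_version='1.13.1',
    py_version='py39',
    env=hub,
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge'
)
```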

@rags1357 you can check out this blog post: Deploy FLAN-T5 XXL on Amazon SageMaker

philschmid avatar Mar 11 '23 10:03 philschmid

Thank you @sgugger and @philschmid , will try it with the updated Transformers version

rags1357 avatar Mar 13 '23 19:03 rags1357

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 10 '23 15:04 github-actions[bot]