Extra quotation marks from training job description

Open b5y opened this issue 2 years ago • 0 comments

Describe the bug Extra quotation marks are returned when you read sagemaker_submit_directory from the HyperParameters of a training job description.

To reproduce Suppose we have a PyTorch estimator and run:

import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch


boto_session = boto3.session.Session()
sagemaker_session = sagemaker.Session(boto_session=boto_session)
role = get_execution_role(sagemaker_session=sagemaker_session)
code_location = "s3://path/to/code/location/sourcedir.tar.gz"
output_path = "s3://path/to/output"
train_instance_type="ml.g5.4xlarge"

def get_hyperparameters():
    hyperparameters = {
       "some hyperparameters"
    }
    
    return hyperparameters

estimator = PyTorch(
            entry_point="train.py",
            source_dir="./source_dir",  # directory of your training script
            code_location=code_location,
            role=role,
            framework_version="2.1",
            py_version="py310",
            instance_type=train_instance_type,
            instance_count=1,
            volume_size=10,  # size of the storage volume in GB
            output_path=output_path,
            hyperparameters=get_hyperparameters()
        )
train_data_loc = "s3://path/to/train/data"
val_data_loc = ''
test_data_loc = ''
channels = {
    'training': train_data_loc,
    # 'validation': val_data_loc,
    # 'test': test_data_loc
}

training_job_name = "some-training-job-name"
estimator.fit(
    inputs=channels,
    wait=False,
    job_name=training_job_name
)

describe_training_job = estimator.latest_training_job.describe()
model_data_url = describe_training_job["ModelArtifacts"]["S3ModelArtifacts"]
# PROBLEM IS HERE
source_dir = describe_training_job['HyperParameters']['sagemaker_submit_directory']

source_dir is returned as a string of the form '"s3://path/to/the/sourcedir.tar.gz"', so every time I need to call source_dir.strip('\"').
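A possible workaround is sketched below. It assumes the extra quotes appear because the hyperparameter values are stored as JSON-encoded strings, which this report does not confirm. Under that assumption, json.loads recovers the original value more reliably than strip. The helper name decode_hyperparameter and the sample dictionary (including the epochs entry) are hypothetical stand-ins for describe_training_job['HyperParameters'].

```python
import json


def decode_hyperparameter(raw):
    """Hypothetical helper: undo JSON encoding if present, else return the value unchanged."""
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        # Not a JSON document (e.g. a bare "s3://..." string) or not a string at all.
        return raw


# Sample values standing in for describe_training_job["HyperParameters"].
hyperparameters = {
    "sagemaker_submit_directory": '"s3://path/to/the/sourcedir.tar.gz"',
    "epochs": "10",  # illustrative user hyperparameter, not from the report
}

decoded = {key: decode_hyperparameter(value) for key, value in hyperparameters.items()}
source_dir = decoded["sagemaker_submit_directory"]
print(source_dir)
```

Note that with this approach numeric strings such as "10" decode to numbers rather than staying strings, so it is only appropriate if that conversion is acceptable for the caller.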

Expected behavior No extra quotation marks after getting the sagemaker_submit_directory

System information A description of your system. Please provide:

  • SageMaker Python SDK version: 2.199.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): from sagemaker.pytorch import PyTorch
  • Framework version: not able to import torch in ml.t3.medium instance
  • Python version: 3.10.6
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context Happens on an ml.t3.medium instance in SageMaker Domains Studio.

b5y, Feb 04 '24 07:02