release: v0.13.0

Open github-actions[bot] opened this issue 1 year ago • 7 comments

This pull request was created by GitHub Actions/AWS CodeBuild! Before merging, please do the following:

  • [ ] Review changelog/staleness report.
  • [ ] Review build/test results by clicking Build Logs in CI Report (be patient, tests take ~4hr).
  • [ ] Review ECR Scan results.

github-actions[bot], May 06 '24 21:05

Staleness Report: 0.13.0(gpu)

| Package | Current Version in the Distribution image | Latest Relevant Version in Upstream |
|---|---|---|
| ${\color{red}ipython}$ | 8.12.2 | 8.22.2 |
| jinja2 | 3.1.3 | 3.1.3 |
| ipywidgets | 7.8.0 | 7.8.0 |
| ${\color{red}numpy}$ | 1.24.4 | 1.26.4 |
| ${\color{red}pandas}$ | 2.0.3 | 2.2.2 |
| boto3 | 1.34.98 | 1.34.98 |
| aws-glue-sessions | 1.0.5 | 1.0.5 |
| conda | 23.11.0 | 23.11.0 |
| jupyterlab | 3.6.7 | 3.6.7 |
| ${\color{red}keras}$ | 2.13.1 | 2.15.0 |
| ${\color{red}matplotlib}$ | 3.7.3 | 3.8.4 |
| pip | 23.3.2 | 23.3.2 |
| ${\color{red}scipy}$ | 1.10.1 | 1.13.0 |
| ${\color{red}scikit-learn}$ | 1.3.2 | 1.4.2 |
| py-xgboost-gpu | 1.7.6 | 1.7.6 |
| thrift_sasl | 0.4.3 | 0.4.3 |
| pyhive | 0.7.0 | 0.7.0 |
| python-gssapi | 1.8.3 | 1.8.3 |
| ${\color{red}pytorch-gpu}$ | 2.0.0 | 2.1.2 |
| sagemaker-headless-execution-driver | 0.0.12 | 0.0.12 |
| sagemaker-kernel-wrapper | 0.0.2 | 0.0.2 |
| sagemaker-python-sdk | 2.218.1 | 2.218.1 |
| sagemaker-studio-analytics-extension | 0.0.21 | 0.0.21 |
| sasl | 0.3.1 | 0.3.1 |
| ${\color{red}tensorflow}$ | 2.13.1 | 2.15.0 |
| ${\color{red}torchvision}$ | 0.15.2 | 0.16.1 |

Staleness Report: 0.13.0(cpu)

| Package | Current Version in the Distribution image | Latest Relevant Version in Upstream |
|---|---|---|
| ${\color{red}ipython}$ | 8.12.2 | 8.22.2 |
| jinja2 | 3.1.3 | 3.1.3 |
| ipywidgets | 7.8.0 | 7.8.0 |
| ${\color{red}numpy}$ | 1.24.4 | 1.26.4 |
| ${\color{red}pandas}$ | 2.0.3 | 2.2.2 |
| boto3 | 1.34.98 | 1.34.98 |
| aws-glue-sessions | 1.0.5 | 1.0.5 |
| conda | 23.11.0 | 23.11.0 |
| jupyterlab | 3.6.7 | 3.6.7 |
| ${\color{red}keras}$ | 2.13.1 | 2.15.0 |
| ${\color{red}matplotlib}$ | 3.7.3 | 3.8.4 |
| pip | 23.3.2 | 23.3.2 |
| ${\color{red}scipy}$ | 1.10.1 | 1.13.0 |
| ${\color{red}scikit-learn}$ | 1.3.2 | 1.4.2 |
| py-xgboost-cpu | 1.7.6 | 1.7.6 |
| thrift_sasl | 0.4.3 | 0.4.3 |
| pyhive | 0.7.0 | 0.7.0 |
| python-gssapi | 1.8.3 | 1.8.3 |
| ${\color{red}pytorch}$ | 2.0.0 | 2.1.2 |
| sagemaker-headless-execution-driver | 0.0.12 | 0.0.12 |
| sagemaker-kernel-wrapper | 0.0.2 | 0.0.2 |
| sagemaker-python-sdk | 2.218.1 | 2.218.1 |
| sagemaker-studio-analytics-extension | 0.0.21 | 0.0.21 |
| sasl | 0.3.1 | 0.3.1 |
| ${\color{red}tensorflow}$ | 2.13.1 | 2.15.0 |
| ${\color{red}torchvision}$ | 0.15.2 | 0.16.1 |
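
In the tables above, the packages shown in red appear to be the ones whose version in the image trails the latest relevant upstream version. The generator that produces this report is not shown in the thread, so the following is only a minimal sketch of that comparison, assuming plain dotted numeric versions; `parse_version` and `stale_packages` are hypothetical names, not functions from the repository.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '1.24.4' into (1, 24, 4).

    Assumption: every component is a plain integer (true for the rows in
    this report, but not for versions with suffixes like 'rc1').
    """
    return tuple(int(part) for part in version.split("."))


def stale_packages(rows: list[tuple[str, str, str]]) -> list[str]:
    """Return names of packages whose current version is older than upstream.

    Each row is (package, current_version, latest_upstream_version).
    """
    return [
        name
        for name, current, latest in rows
        if parse_version(current) < parse_version(latest)
    ]


# A few rows copied from the report above.
rows = [
    ("ipython", "8.12.2", "8.22.2"),
    ("jinja2", "3.1.3", "3.1.3"),
    ("numpy", "1.24.4", "1.26.4"),
    ("boto3", "1.34.98", "1.34.98"),
]

print(stale_packages(rows))  # ['ipython', 'numpy']
```

Comparing integer tuples rather than raw strings matters here: as strings, "8.12.2" would sort after "8.2.0", which would misreport staleness.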

AWS CodeBuild CI Report

  • CodeBuild project: buildtestpublicimage1C7307A-9AzES2hf19lW
  • Commit ID: 4ab51b228e3d4d8f0a127e4620559b5782ba6bf3
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

=========================== short test summary info ============================
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[keras.test.Dockerfile-required_packages0]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[scipy.test.Dockerfile-required_packages4]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[sagemaker-studio-analytics-extension.test.Dockerfile-required_packages20]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[pandas.test.Dockerfile-required_packages7]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[sm-python-sdk.test.Dockerfile-required_packages8]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[pytorch.examples.Dockerfile-required_packages9]
====== 6 failed, 7 passed, 13 skipped, 6 warnings in 18020.18s (5:00:20) =======
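
When triaging a summary like the one above, it can help to pull out just the Dockerfile name from each `FAILED` line. The following is a rough sketch, assuming the parametrized test IDs always have the `[<name>.Dockerfile-<param>]` shape seen here; `failed_dockerfiles` is a hypothetical helper and not part of the test harness.

```python
import re

# Matches lines such as:
# FAILED test/x.py::test_dockerfiles_for_gpu[keras.test.Dockerfile-required_packages0]
FAILED_LINE = re.compile(
    r"^FAILED \S+::(?P<test>\w+)\[(?P<dockerfile>.+\.Dockerfile)-(?P<param>\w+)\]$"
)


def failed_dockerfiles(summary: str) -> list[str]:
    """Extract the Dockerfile names from the FAILED lines of a pytest summary."""
    names = []
    for line in summary.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            names.append(match.group("dockerfile"))
    return names


# Two lines copied from the summary above, plus the closing totals line.
summary = """\
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[keras.test.Dockerfile-required_packages0]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[sm-python-sdk.test.Dockerfile-required_packages8]
====== 6 failed, 7 passed, 13 skipped, 6 warnings in 18020.18s (5:00:20) =======
"""

print(failed_dockerfiles(summary))
# ['keras.test.Dockerfile', 'sm-python-sdk.test.Dockerfile']
```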

TRNWWZ, May 10 '24 16:05

Staleness Report: 0.13.0(gpu)

| Package | Current Version in the Distribution image | Latest Relevant Version in Upstream |
|---|---|---|
| ${\color{red}ipython}$ | 8.12.2 | 8.24.0 |
| jinja2 | 3.1.4 | 3.1.4 |
| ipywidgets | 7.8.0 | 7.8.0 |
| ${\color{red}numpy}$ | 1.24.4 | 1.26.4 |
| ${\color{red}pandas}$ | 2.0.3 | 2.2.2 |
| ${\color{red}boto3}$ | 1.34.101 | 1.34.102 |
| aws-glue-sessions | 1.0.5 | 1.0.5 |
| conda | 23.11.0 | 23.11.0 |
| jupyterlab | 3.6.7 | 3.6.7 |
| ${\color{red}keras}$ | 2.13.1 | 2.15.0 |
| ${\color{red}matplotlib}$ | 3.7.3 | 3.8.4 |
| pip | 23.3.2 | 23.3.2 |
| ${\color{red}scipy}$ | 1.10.1 | 1.13.0 |
| ${\color{red}scikit-learn}$ | 1.3.2 | 1.4.2 |
| py-xgboost-gpu | 1.7.6 | 1.7.6 |
| thrift_sasl | 0.4.3 | 0.4.3 |
| pyhive | 0.7.0 | 0.7.0 |
| python-gssapi | 1.8.3 | 1.8.3 |
| ${\color{red}pytorch-gpu}$ | 2.0.0 | 2.1.2 |
| sagemaker-headless-execution-driver | 0.0.12 | 0.0.12 |
| sagemaker-kernel-wrapper | 0.0.2 | 0.0.2 |
| sagemaker-python-sdk | 2.219.0 | 2.219.0 |
| sagemaker-studio-analytics-extension | 0.0.21 | 0.0.21 |
| sasl | 0.3.1 | 0.3.1 |
| ${\color{red}tensorflow}$ | 2.13.1 | 2.15.0 |
| ${\color{red}torchvision}$ | 0.15.2 | 0.16.1 |

Staleness Report: 0.13.0(cpu)

| Package | Current Version in the Distribution image | Latest Relevant Version in Upstream |
|---|---|---|
| ${\color{red}ipython}$ | 8.12.2 | 8.24.0 |
| jinja2 | 3.1.4 | 3.1.4 |
| ipywidgets | 7.8.0 | 7.8.0 |
| ${\color{red}numpy}$ | 1.24.4 | 1.26.4 |
| ${\color{red}pandas}$ | 2.0.3 | 2.2.2 |
| ${\color{red}boto3}$ | 1.34.101 | 1.34.102 |
| aws-glue-sessions | 1.0.5 | 1.0.5 |
| conda | 23.11.0 | 23.11.0 |
| jupyterlab | 3.6.7 | 3.6.7 |
| ${\color{red}keras}$ | 2.13.1 | 2.15.0 |
| ${\color{red}matplotlib}$ | 3.7.3 | 3.8.4 |
| pip | 23.3.2 | 23.3.2 |
| ${\color{red}scipy}$ | 1.10.1 | 1.13.0 |
| ${\color{red}scikit-learn}$ | 1.3.2 | 1.4.2 |
| py-xgboost-cpu | 1.7.6 | 1.7.6 |
| thrift_sasl | 0.4.3 | 0.4.3 |
| pyhive | 0.7.0 | 0.7.0 |
| python-gssapi | 1.8.3 | 1.8.3 |
| ${\color{red}pytorch}$ | 2.0.0 | 2.1.2 |
| sagemaker-headless-execution-driver | 0.0.12 | 0.0.12 |
| sagemaker-kernel-wrapper | 0.0.2 | 0.0.2 |
| sagemaker-python-sdk | 2.219.0 | 2.219.0 |
| sagemaker-studio-analytics-extension | 0.0.21 | 0.0.21 |
| sasl | 0.3.1 | 0.3.1 |
| ${\color{red}tensorflow}$ | 2.13.1 | 2.15.0 |
| ${\color{red}torchvision}$ | 0.15.2 | 0.16.1 |

AWS CodeBuild CI Report

  • CodeBuild project: buildtestpublicimage1C7307A-9AzES2hf19lW
  • Commit ID: 013239a45fd990af14448708cec73730344e0ae9
  • Result: FAILED
  • Build Logs (available for 30 days)

=========================== short test summary info ============================
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[keras.test.Dockerfile-required_packages0]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[scipy.test.Dockerfile-required_packages4]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[sagemaker-studio-analytics-extension.test.Dockerfile-required_packages20]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[pandas.test.Dockerfile-required_packages7]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[sm-python-sdk.test.Dockerfile-required_packages8]
FAILED test/test_dockerfile_based_harness.py::test_dockerfiles_for_gpu[pytorch.examples.Dockerfile-required_packages9]
====== 6 failed, 7 passed, 13 skipped, 6 warnings in 18038.08s (5:00:38) =======

Triage of the six failures:

  • keras - fails with `Failed to load in-memory CUBIN (compiled for a different GPU?).: CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid`. Verified that v0.12.0 has the same issue. Because this version is not external and nothing new is breaking, this failure is acceptable. We should deprecate the V0 versions everywhere ASAP.
  • scipy - failures documented in https://github.com/aws/sagemaker-distribution/issues/30
  • sagemaker-studio-analytics-extension - failures documented in https://github.com/aws/sagemaker-distribution/issues/315
  • pandas - a single test failure, which is fixed in a newer version: https://github.com/pandas-dev/pandas/issues/54709
  • sm-python-sdk - failures documented in https://github.com/aws/sagemaker-distribution/issues/316
  • pytorch - a local run succeeded, so this is a transient issue.

TRNWWZ, May 12 '24 19:05

Confirmed that the keras test failure is caused not by the image itself but by the version of cuda-nvcc installed in the test. We need to update the test to install the correct version (matching CUDA 11.8 on the image) and to install the additional jax dependency.
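
As a rough illustration of the mismatch described here, the check below compares the major.minor part of a package version against the CUDA release on the image. It is only a sketch: `matches_cuda_release` is a hypothetical helper, the example version strings are illustrative rather than taken from the build logs, and the assumption that cuda-nvcc versions share their major.minor with the CUDA release should be verified against the package index.

```python
def matches_cuda_release(package_version: str, cuda_release: str = "11.8") -> bool:
    """Return True if the package's major.minor equals the CUDA release.

    Assumption: both strings are dotted versions whose first two
    components identify the CUDA release (e.g. '11.8.x' for CUDA 11.8).
    """
    return package_version.split(".")[:2] == cuda_release.split(".")[:2]


# Illustrative version strings, not values from the thread.
print(matches_cuda_release("11.8.89"))   # True
print(matches_cuda_release("12.4.131"))  # False
```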

claytonparnell, May 13 '24 17:05