sagemaker-inference-toolkit
Serve machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker.
**Issue #, if available:** #110 **Description of changes:** Initial draft of a non-breaking option to support passing an extra "request context" parameter to handler override functions (`transform_fn`, `input_fn`, `predict_fn`, `output_fn`),...
**Describe the feature you'd like** Users of {PyTorch, HuggingFace, XYZ...} SageMaker DLCs in script mode should be able to access the [SageMaker CustomAttributes request header](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpointAsync.html#API_runtime_InvokeEndpointAsync_RequestParameters) (and maybe other current/future request...
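The two entries above describe the same idea: letting handler overrides receive a request context (carrying headers such as `CustomAttributes`) without breaking existing two-argument handlers. A minimal sketch of that non-breaking dispatch, with hypothetical names (`call_handler`, the `context` dict keys, and this `input_fn` are illustrative assumptions, not the toolkit's actual API):

```python
import inspect

def input_fn(input_data, content_type, context=None):
    # Hypothetical user handler: the optional `context` dict could carry
    # request headers such as X-Amzn-SageMaker-Custom-Attributes.
    custom_attrs = (context or {}).get("custom_attributes", "")
    return {"data": input_data, "attrs": custom_attrs}

def call_handler(fn, *args, context=None):
    # Non-breaking dispatch: pass `context` only if the user's handler
    # declares that parameter, so existing handlers keep working unchanged.
    if "context" in inspect.signature(fn).parameters:
        return fn(*args, context=context)
    return fn(*args)
```

A legacy `input_fn(input_data, content_type)` still runs through `call_handler` untouched, while new handlers can opt in to the extra parameter.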
*Issue #, if available:* This PR fixes the issue described here in the PyTorch (PT) inference toolkit repo, since the fix can be applied in the transform function in the sagemaker inference...
**Describe the feature you'd like** The sagemaker inference toolkit `sagemaker_inference` should support Python 3.8 at a bare minimum. It is currently tested only against Python 2.7, 3.6, and 3.7. **How would this...
**Describe the bug** The sagemaker inference toolkit `sagemaker_inference` is only tested against Python 2.7, 3.6, and 3.7, yet installers such as pip and poetry will happily attempt to install the...
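One way packaging metadata can close the gap the two entries above describe is a `python_requires` constraint, so pip refuses to install on interpreters outside the tested range. A hypothetical `setup.py` fragment (the version floor and classifiers here are illustrative assumptions, not the toolkit's actual metadata):

```python
# Sketch of a setup.py declaring the supported interpreter range so that
# installers only resolve this package on Pythons it is tested against.
from setuptools import setup, find_packages

setup(
    name="sagemaker-inference",
    packages=find_packages("src"),
    package_dir={"": "src"},
    python_requires=">=3.6",  # assumed floor; maintainers would choose this
    classifiers=[
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
    ],
)
```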
**Describe the feature you'd like** When starting up MMS, the inference toolkit repackages the model by copying its contents from /opt/ml/model to /.sagemaker/mms/models: https://github.com/aws/sagemaker-inference-toolkit/blob/master/src/sagemaker_inference/model_server.py#L76. This copy is unnecessary, and MMS...
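One possible way to avoid the copy described above is to expose the model directory to MMS via a symlink rather than duplicating its contents on disk. A hedged sketch (the function name and default paths are assumptions for illustration, not the toolkit's code):

```python
import os

def stage_model(src="/opt/ml/model", dst="/.sagemaker/mms/models/model"):
    # Hypothetical alternative to the toolkit's copy step: point MMS at
    # the original model directory through a symlink instead of copying it.
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    if os.path.islink(dst):
        os.remove(dst)  # refresh a stale link from a previous start
    os.symlink(src, dst, target_is_directory=True)
    return dst
```

Whether MMS follows symlinks for every artifact layout would need verification, which is presumably part of why this is filed as a feature request rather than a trivial change.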
**Describe the bug** Invoking the `torch.jit.attach_eia()` method on the Hugging Face Transformers RoBERTa model results in the following error: `RuntimeError: class '__torch__.torch.nn.modules.normalization.___torch_mangle_6.LayerNorm' already defined`. **To reproduce** ``` import torch, torcheia from...
**Describe the bug** The model server timeout ("used for model server's backend workers before they are deemed unresponsive and rebooted"), currently set via the `SAGEMAKER_MODEL_SERVER_TIMEOUT` environment variable, is listed...
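For context on the entry above, reading the timeout from that environment variable amounts to something like the following sketch (the default value here is an assumption for illustration; check the toolkit's actual default):

```python
import os

# Assumed fallback of 60 seconds; the toolkit's real default may differ.
DEFAULT_MODEL_SERVER_TIMEOUT = "60"

def model_server_timeout():
    # Worker timeout (seconds) before MMS deems a backend worker
    # unresponsive and reboots it, overridable via the documented env var.
    return int(os.environ.get("SAGEMAKER_MODEL_SERVER_TIMEOUT",
                              DEFAULT_MODEL_SERVER_TIMEOUT))
```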
**What did you find confusing? Please describe.** Based on the documentation and the complete example [multi_model_bring_your_own](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_bring_your_own), it seems that `sagemaker-inference-toolkit` is only for multi-model requirements. But I have also seen links...
This allows it to correctly find the path to install from .whl files. *Issue #, if available:* n/a *Description of changes:* Change install requirements from relative paths. This allows it...