Sergey Mkrtchyan
Hi, great notebook! Just wanted to mention that there is no need to pass the `generator` to the constructor of the `EncoderDecoder` class. It makes it a bit confusing as...
Hello, I have a SageMaker training job that uses `763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.9.1-gpu-py38-cu111-ubuntu20.04` as its base image. With this image I observe processing speeds of approximately `2,340,707 smpl/s`. Upgrading this image to anything else,...
Fixes an issue introduced in commit `984ca2dc` where the `else` is misaligned, causing the following crash when running the data validation tool. ``` Traceback (most recent call last): File "",...
**Describe the bug** When the Mistral 7B Instruct v0.3 model is deployed as a SageMaker endpoint, the tokenizer always produces one extra token compared to the tokenizer being loaded...
Hello, Congratulations on an excellent paper! Do you have any insights on runtime efficiency compared to an auto-regressive setup? I understand that each sampling step can be parallelized since prediction...