Trian Xylouris

33 comments by Trian Xylouris

> Hi @mayank31398,
>
> I am still working on this. Can I ask what an average maximum number of tokens for an input would be? Potentially, this can go...

Hey @henrydylan - have you tried just specifying an optimizer via e.g.

```
..."optimizer": { "type": "Adam", "params": { "lr": 0.00015 } },...
```

Maybe that could help. Also, you...
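For reference, the optimizer suggestion in the comment above could be sketched as a complete config built in Python. This is only an illustration, not the poster's actual setup: the helper name `build_ds_config` and the `train_batch_size` value are assumptions added here, and only the `optimizer` section comes from the comment itself.

```python
import json


def build_ds_config(lr: float = 0.00015) -> dict:
    """Return a minimal DeepSpeed-style config dict with an explicit optimizer.

    Hypothetical helper for illustration; it is not part of DeepSpeed.
    """
    return {
        # Assumption: placeholder batch size, not taken from the original comment.
        "train_batch_size": 8,
        # This section mirrors the snippet quoted in the comment.
        "optimizer": {
            "type": "Adam",
            "params": {"lr": lr},
        },
    }


ds_config = build_ds_config()
# Serialize to JSON so the result can be saved as a config file.
config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

The resulting JSON could then be saved to a file and supplied as the DeepSpeed config; whether this resolves the original problem depends on the rest of the setup, which the truncated comment does not show.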

Also, thanks from my side @stephenroller for the huge amount of work you have made available to all of us! One question: I understand the limitations of this technology and...

Hey @tomerip - were you able to find a workaround? I am experiencing the same problem with GPT models.

Hi @RezaYazdaniAminabadi, just checking whether you have had a chance to work on that PR yet?

Happy to help with testing any potential fixes! If it will still take some time, then it would be great if there is a link with Bloom's fix, so that...

Thanks @RezaYazdaniAminabadi for fixing this! Commit 4abd455521965930d0e921de8afc0073ea7df9d1 from the [PR you mentioned](https://github.com/microsoft/DeepSpeed/pull/2212) fixes the problem when I tested it using a Hugging Face `gpt2` model. By the way, the commit aafba00c81eaf29c0c2b209a94bc31f4de942936...

Below is a possibly related bug. I added some sample code to reproduce this error for a `GPT2` model on an NVIDIA A10G. Let me know @RezaYazdaniAminabadi @cmikeh2 if you...

FYI @mallorbc, @tomeras91, @RezaYazdaniAminabadi: My related issue, which I detailed above, is fixed in [this PR](https://github.com/microsoft/DeepSpeed/pull/2212). More precisely, my issue does not appear when I install the...