hanlinxuy

Results 3 issues of hanlinxuy

Hello, I have followed the training configuration introduced here (https://github.com/microsoft/torchscale/issues/52) with the retnet_medium architecture. I have some questions and would appreciate it if anyone could answer them. The first is about...

I am wondering how much improvement eagle-2 can achieve with other methods like sequoia/mcsd?

Hello guys, I am wondering if you have any plans to release the scripts for Persimmon-8B sparsification and training? I only saw T5, BERT, and GPT in this repo.