Srinivasan Nandakumar
Thanks. More info on this from my other run: the same script runs perfectly fine on an RTX 4090 where I set bf16 training to true. So my guess is...
(An update here for more info) I tried another model (Qwen 1.5B) with fp16 training and it works fine. The problem seems specific to TinyLlama, I think.
Hi, I figured out a workaround. The underlying search code increases the beam width by a factor of 2. So when initializing the LLM, set max_logprobs to twice the...
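Based on that workaround, a minimal sketch of the initialization, assuming the vLLM `LLM` constructor and its `max_logprobs` parameter; the beam width value and model name here are illustrative, not from the original thread:

```python
from vllm import LLM

# Hypothetical beam width; adjust to match your search configuration.
beam_width = 4

# Workaround sketch: since the underlying search doubles the beam width,
# request at least 2 * beam_width logprobs at initialization so the
# doubled beam does not exceed the max_logprobs limit.
llm = LLM(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative model
    max_logprobs=2 * beam_width,
)
```

This is a configuration sketch, not a verified fix; the exact multiplier to use depends on what the truncated comment above specifies.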