tcluoct

3 comments by tcluoct

I think @JingerAI meant 1.3B. Training the demo 1.3B model was also slow for me. Maybe some setting is missing.

I'm using an A100, which has better performance than an A6000.

After training, when I run the final model, its responses are very weird. ![image](https://user-images.githubusercontent.com/4354899/233229125-90df3068-7d82-40cc-9a21-743e314b289b.png)