rsong0606
@jeejeelee Hey Jee! I added the chat template as you described above, but I noticed slower inference speed compared to other models I've experimented with before, like Llama 2. Do you...
@Eric-mingjie Thanks Eric, mine has 24 GB of GPU memory. Given that at least 14 GB would be used to load the model, I should still have ~10 GB left on the NVIDIA L4....
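For reference, the numbers above line up with a quick back-of-envelope estimate: a 7B-parameter model in fp16 takes about 2 bytes per parameter just for the weights. This is a minimal sketch of that arithmetic (the 7B/fp16 figures are my assumption, not stated in the thread):

```python
def weight_memory_gb(num_params_billion, bytes_per_param=2):
    """Approximate memory needed just for model weights, in GB.

    bytes_per_param=2 assumes fp16/bf16 weights; use 4 for fp32,
    or ~0.5-1 for 4/8-bit quantized models.
    """
    return num_params_billion * 1e9 * bytes_per_param / 1e9


# Assumed example: a 7B model in fp16 on a 24 GB NVIDIA L4.
weights_gb = weight_memory_gb(7)      # ~14 GB for the weights alone
headroom_gb = 24 - weights_gb         # ~10 GB left for KV cache, activations, etc.
print(weights_gb, headroom_gb)
```

Note that the leftover ~10 GB must also cover the KV cache and activation buffers, which grow with batch size and sequence length, so the usable headroom in practice is smaller than this estimate suggests.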
@simlaharma I had a similar issue. Check this post; it worked for me: https://github.com/huggingface/datasets/issues/6746