Yidan Wang
Is it necessary to fine-tune all parameters during training? Why does the loss explode when I use LoRA to fine-tune Llama 2 7B?
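For context, a minimal LoRA setup with the Hugging Face peft library might look like the sketch below. The target modules and hyperparameters are illustrative assumptions, not the authors' configuration; loss explosions with LoRA are often mitigated by a lower learning rate, gradient clipping, or bf16 instead of fp16.

```python
# Minimal LoRA fine-tuning sketch (hypothetical hyperparameters; tune the
# learning rate and enable gradient clipping if the loss explodes).
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,  # bf16 tends to be more stable than fp16
)
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```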
Thank you for your work. I have the following questions to discuss with you: 1. Why does the loss mentioned in Equation 2 of the paper need to sum the...
I would like to inquire whether hash collisions may occur in the scheme proposed in this paper, given that the entire message space is mapped to a smaller space. If so, how can I...
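By the pigeonhole principle, any map from a larger message space into a smaller one must collide somewhere; how quickly collisions become likely can be estimated with the standard birthday bound, sketched below. The hash width and message counts are hypothetical numbers for illustration, not parameters from the paper.

```python
# Rough birthday-bound estimate of collision probability when k messages
# are mapped into a space of N = 2**bits values (illustrative numbers only).
import math

def collision_prob(k: int, bits: int) -> float:
    n = 2 ** bits
    # P(at least one collision) ~= 1 - exp(-k * (k - 1) / (2 * N))
    return 1.0 - math.exp(-k * (k - 1) / (2 * n))

print(collision_prob(k=10_000, bits=32))   # ~0.012: collisions already plausible
print(collision_prob(k=100_000, bits=32))  # ~0.69: collisions near-certain
```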
Why do two programs interfere with each other's speed when I run the LLaDA model on two A100 GPUs? For example, running LLaDA on A100 GPU 0 alone takes...
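One common cause of this kind of slowdown, offered here only as an assumption to check, is that both processes silently contend for the same device (or for shared CPU dataloader resources) rather than each using its own GPU. Pinning each process to a single device before importing torch, as sketched below, helps rule that out.

```python
# Pin this process to one GPU before importing torch, so two concurrent
# runs cannot contend for the same device (set "1" in the second process).
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

device = torch.device("cuda:0")      # always index 0 within the visible set
print(torch.cuda.device_count())     # should report 1 if isolation worked
```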
Is `argmax` the only sampling method for the LLaDA model? Why are the model outputs sometimes filled with '\n'? How can I mitigate this effect?
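On the sampling question: greedy `argmax` is not the only option in principle. A generic temperature plus top-k sampling step over per-token logits is sketched below; this is a standard decoding technique written against plain PyTorch tensors, not LLaDA's official API, and the function name is illustrative.

```python
# Generic temperature + top-k sampling over a logits tensor, as an
# alternative to greedy argmax (illustrative, not LLaDA's official API).
import torch

def sample_tokens(logits: torch.Tensor,
                  temperature: float = 0.7,
                  top_k: int = 50) -> torch.Tensor:
    """logits: (..., vocab_size) -> sampled token ids of shape (...)."""
    logits = logits / max(temperature, 1e-5)
    if top_k > 0:
        # Keep only the top-k logits per position; mask the rest to -inf.
        kth = torch.topk(logits, top_k, dim=-1).values[..., -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    flat = probs.reshape(-1, probs.shape[-1])        # multinomial wants 2-D
    ids = torch.multinomial(flat, num_samples=1)
    return ids.reshape(probs.shape[:-1])
```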