DiffuSeq

Is there any rule for modifying the parameters?

Open • zkzhou126 opened this issue 2 years ago • 1 comment

Hello! I trained the model on the WMT16 dataset with the parameters modified to the following values: [image] The main changes were to dim and seq_len. I also changed learning_step to 120000 to improve the results, but the results are still very poor: [image] When I change these parameters, do I have to change other parameters along with them?
When I trained the model with your original parameters, the results were also not good enough, which I attribute to dim and seq_len, but they were still better than my current results.

zkzhou126 Jan 26 '24 04:01

Hi, many hyper-parameters can affect the final results, including bsz, seq_len, dim, steps, and the tokenizer. Other techniques, such as self-conditioning and length prediction, may also help training.

summmeer Feb 23 '24 06:02
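Since seq_len, bsz, and steps interact with the dataset, a quick sanity check before retraining may help. The following is only a minimal sketch and not part of the DiffuSeq code base: `suggest_seq_len` and `epochs_seen` are hypothetical helper names, whitespace splitting stands in for the real tokenizer, and it assumes that source and target are packed into one sequence with 3 special tokens.

```python
import math


def suggest_seq_len(pairs, coverage=0.99, multiple=8):
    """Return a seq_len that fits `coverage` of the (source, target) pairs.

    Assumptions (hypothetical, adjust to your setup): source and target share
    one sequence with 3 special tokens, and whitespace splitting approximates
    the tokenizer. A subword tokenizer will produce longer sequences.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    # Combined length of every example, sorted ascending.
    lengths = sorted(len(s.split()) + len(t.split()) + 3 for s, t in pairs)
    # Index of the shortest length that still covers the requested fraction.
    idx = min(len(lengths) - 1, math.ceil(coverage * len(lengths)) - 1)
    needed = lengths[max(idx, 0)]
    # Round up to a hardware-friendly multiple.
    return multiple * math.ceil(needed / multiple)


def epochs_seen(steps, bsz, num_examples):
    """Approximate number of passes over the data for a given step budget."""
    return steps * bsz / num_examples


if __name__ == "__main__":
    toy_pairs = [
        ("a", "b"),
        ("a b", "c"),
        ("a b c", "d e"),
        ("a b c d e f", "g h i j"),
    ]
    print(suggest_seq_len(toy_pairs, coverage=1.0))
    print(epochs_seen(steps=100, bsz=10, num_examples=500))
```

If the suggested seq_len is much larger than the configured one, many targets are being truncated. If `epochs_seen` is very small, the step budget or batch size is probably too low for the dataset.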