rui
> I'm seeing something similar when fine-tuning on a custom dataset. Did you plot your training loss with wandb? > > I wonder whether this is a learning rate...
> Your loss seems fine, maybe train longer or increase the learning rate? Repetitive answers usually mean that the model is still adapting to the new domain. Yup, those...
> yeah I agree, my finetuned model performs worse than a smaller finetuned LLaVA-Onevision model on my custom dataset. The loss doesn't go down as much. Let's see...
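Since loss curves came up in the quoted replies: here's a minimal sketch of logging a training loss to wandb. The project/run names are placeholders, and the decaying loss is just a stand-in for a real training step.

```python
import math
import wandb

# Minimal sketch, not the repo's actual training code.
# Project and run names below are hypothetical placeholders.
wandb.init(project="llava-finetune", name="custom-dataset-run")

for step in range(100):
    # Stand-in for your real per-step training loss.
    loss = 2.0 * math.exp(-step / 30)
    wandb.log({"train/loss": loss}, step=step)

wandb.finish()
```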
I guess it comes down to compatible versions between transformers and deepspeed; in my case, I have to use deepspeed 0.15.4 with transformers==4.37.2.
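In case it helps others, that pairing can be pinned explicitly (e.g. `pip install "transformers==4.37.2" "deepspeed==0.15.4"`) and sanity-checked at runtime. The check below is just a sketch using the versions from this comment; adjust it to whatever pairing works for your setup.

```python
# Sanity check: the transformers/deepspeed pairing reported in this thread.
import transformers
import deepspeed

assert transformers.__version__ == "4.37.2", transformers.__version__
assert deepspeed.__version__ == "0.15.4", deepspeed.__version__
print("versions match the pairing from this thread")
```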