LIU, Shih-Yang
Hi, thanks for your interest in our work. Unfortunately, the projector weight for llama3 is currently not available in LLaVA. Unless you train the projector yourself, using llama3 within the...
@JieShibo Hi, could you elaborate more on how to manually synchronize the gradient and address the synchronization problem? Thank you so much!
Hello OP, could you please send me the TaxiBJ dataset? Thank you very much. Email: [email protected]
Does anyone know the solution? I am assuming that setting per_device_train_batch_size = 4 on a single GPU is equivalent to a total batch size of 4, which is the paper's setting,...
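For what it's worth, my understanding is that with the HF Trainer the effective batch size is per_device_train_batch_size × gradient_accumulation_steps × number of GPUs. A rough sketch with illustrative values (not necessarily the exact config used in the paper):

```python
from transformers import TrainingArguments

# Effective (total) batch size with the HF Trainer:
#   per_device_train_batch_size * gradient_accumulation_steps * num_gpus
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,   # batch per GPU
    gradient_accumulation_steps=1,   # increase this to emulate a larger total batch
)

num_gpus = 1  # single-GPU run
effective_batch = args.per_device_train_batch_size * args.gradient_accumulation_steps * num_gpus
print(effective_batch)  # 4
```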
I changed the lr in the CoLA training script to 2e-4, which solved the constant-0 eval correlation problem on CoLA, but I still couldn't reproduce the MNLI result :(
But I am still only getting a CoLA score of 62.82. Has anyone encountered a similar problem when trying to reproduce the result?
I also got the same result for the FFT checkpoint and a significantly lower score for the LoRA checkpoint compared to the reported LoRA score. I would appreciate it if anyone could help...
Hi, your implementation is not correct; please refer to Section 4.2 of the paper and the official implementation: https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/layer.py
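For anyone comparing against the paper, here is a minimal, simplified sketch of the weight-decomposed update from Section 4.2. This is not the official peft code: it omits dropout and the gradient-detach trick on the norm, and the class and parameter names are just illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinearSketch(nn.Module):
    """Simplified DoRA linear layer: W' = m * (W0 + s*BA) / ||W0 + s*BA||."""

    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        # frozen pretrained weight W0, shape (out_features, in_features)
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # B = 0, so training starts at W0
        self.scaling = alpha / r
        # magnitude vector m, initialized to the norm of W0 per output feature
        # (this corresponds to the column-wise norm ||W0||_c in the paper's notation)
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=1))

    def forward(self, x):
        delta = self.scaling * (self.lora_B @ self.lora_A)           # low-rank update  s * B A
        directional = self.weight + delta                            # V = W0 + s * B A
        norm = directional.norm(p=2, dim=1, keepdim=True)            # per-output-feature norm of V
        w_prime = self.magnitude.unsqueeze(1) * directional / norm   # m * V / ||V||
        return F.linear(x, w_prime)
```

In peft itself, this behavior is enabled by passing `use_dora=True` to `LoraConfig` rather than writing the layer by hand.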
Hi, for retraining the embedding layer, I recommend using standard LoRA, as DoRA has not yet been tested in that setting.
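If it helps, a rough sketch of what that could look like with peft; the model name and the target module names (e.g. "embed_tokens") are just placeholders and depend on your architecture:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # example base model

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # target the embedding layer with standard LoRA; the module name
    # ("embed_tokens" here) depends on the model architecture
    target_modules=["embed_tokens", "q_proj", "v_proj"],
    use_dora=False,  # plain LoRA; DoRA has not been tested on embedding layers
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```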
Hi, I think this may be due to the Transformers package version being too new for the local peft folder; you can check here: https://github.com/huggingface/transformers/blob/2fc33ebead50383f7707b17f0e2a178d86347d10/src/transformers/integrations/peft.py#L33
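A quick sanity check you can run from the repo root to confirm which transformers version and which peft copy are actually being picked up (just a sketch):

```python
# Run from the repo root so the local ./peft folder (if present) shadows a pip-installed peft.
import transformers
import peft

print("transformers:", transformers.__version__)
print("peft:", peft.__version__, "loaded from", peft.__file__)

# If transformers is newer than what the local peft copy was written against,
# downgrading transformers to the version the repo was tested with usually resolves the import error.
```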