MoRA
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Thank you for sharing your results. In return I will share my own: If you reformulate the code so that during the forward pass, it adds the decompressed MoRA weights...
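For what it's worth, here is a minimal sketch of that reformulation as I read it, not the repo's actual implementation: the square MoRA matrix is decompressed into a full weight-shaped update once per forward pass and added to the frozen base weight. The truncation-style compress/decompress and all class and argument names below are assumptions for illustration only.

```python
# Sketch: decompress the square MoRA matrix M into a full (out, in) update and
# add it to the frozen base weight during forward. Illustrative, not the repo's code.
import torch
import torch.nn as nn


class MoRADecompressedLinear(nn.Module):
    def __init__(self, base: nn.Linear, r_hat: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained weight
        self.r_hat = r_hat
        # Square trainable matrix, zero-initialized so training starts from
        # the pretrained behaviour.
        self.M = nn.Parameter(torch.zeros(r_hat, r_hat))

    def delta_weight(self) -> torch.Tensor:
        """Decompress M into a full (out_features, in_features) update.

        Truncation scheme (an assumption): M acts on the first r_hat input
        dims and writes to the first r_hat output dims; everything else is zero.
        """
        out_f, in_f = self.base.weight.shape
        delta = self.base.weight.new_zeros(out_f, in_f)
        delta[: self.r_hat, : self.r_hat] = self.M
        return delta

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.base.weight + self.delta_weight()
        return nn.functional.linear(x, w, self.base.bias)


if __name__ == "__main__":
    layer = MoRADecompressedLinear(nn.Linear(64, 64), r_hat=16)
    print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```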
Please tell me the solution. I'm using gemma-2-9b-it, meta-llama-2-7b-hf, and my custom dataset. Versions: torch 2.4.0, bitsandbytes 0.43.2, accelerate 0.33.0. Error message: `[rank0]: return...`
is it possible to support QMoRA with huggingface bitsandbytes?
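If the repo's patched peft dispatches on bitsandbytes layers, something like the sketch below would be the natural way to try it. Whether the 4-bit path is actually supported is exactly the open question here; the `use_mora` / `mora_type` arguments are assumed to exist on the patched `LoraConfig`, and the model name and values are placeholders.

```python
# Rough sketch of a "QMoRA" attempt: 4-bit base model via bitsandbytes plus the
# repo's MoRA-enabled LoraConfig. Not verified against the codebase.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

config = LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
    use_mora=True,   # assumed flag from the patched peft in this repo
    mora_type=6,     # assumed; check the repo for valid values
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```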
Fix: 1. Before line 261 of the source file, insert `while pad_size > in_f: x = torch.cat([x, x[..., :]], dim=-1); pad_size -= in_f`; this is needed because the case `pad_size > in_f` was not handled. 2. Replace the code at line 293 of the source file with `if out_x.numel() == 0: out_x = out_x.view(*x.shape[:-1], out_f) else: out_x =...`
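To make the intent of fix 1 concrete, here is a standalone sketch of the tiling-style padding it implements. Names follow the issue, the slice width is made explicit, and this is illustrative rather than the repo's exact code.

```python
# Tile the last dimension of x with copies of itself until it reaches the
# target width, even when the required pad exceeds in_f.
import torch


def pad_by_tiling(x: torch.Tensor, target_width: int) -> torch.Tensor:
    in_f = x.shape[-1]
    pad_size = target_width - in_f
    # The original code assumed pad_size <= in_f; the loop handles the general case.
    while pad_size > in_f:
        x = torch.cat([x, x[..., :in_f]], dim=-1)
        pad_size -= in_f
    if pad_size > 0:
        x = torch.cat([x, x[..., :pad_size]], dim=-1)
    return x


print(pad_by_tiling(torch.randn(2, 5), 18).shape)  # torch.Size([2, 18])
```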
Could you provide a slurm script to run the fine-tuning code? Apparently there are some issues with deepspeed, by just using the provided instructions.
I noticed that the code in peft-tuners-lora-layers.py does not seem to provide a decoupled compression/decompression method: mora_type = 1, 2, 3, 4 all appear to apply some form of summation to the input x, which looks like a variant of sharing, and the rotation method seems to be built on top of that as well?
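For reference, a toy sketch of the distinction being asked about, based on a reading of the paper rather than the exact layer code: "sharing" compresses x by summing chunks of size r_hat, so many input dims share each compressed slot, whereas a decoupled scheme such as plain truncation keeps a one-to-one mapping for the dims it retains.

```python
# Toy comparison of sharing-style vs truncation-style compression (illustrative).
import torch


def compress_sharing(x: torch.Tensor, r_hat: int) -> torch.Tensor:
    # assumes x.shape[-1] is already padded to a multiple of r_hat
    return x.view(*x.shape[:-1], -1, r_hat).sum(dim=-2)


def compress_truncation(x: torch.Tensor, r_hat: int) -> torch.Tensor:
    return x[..., :r_hat]


x = torch.randn(2, 32)
print(compress_sharing(x, 8).shape)     # torch.Size([2, 8])
print(compress_truncation(x, 8).shape)  # torch.Size([2, 8])
```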
Hi, thank you very much for your great repo again! I would like to use this codebase to run full fine-tuning experiments. However, I find the results are...
Dataset format
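No format is spelled out here, but since the training command below points `--data_path` at meta-math/MetaMath, a reasonable guess (not verified against train.py's prompting code) is a list of question/answer records, e.g.:

```python
# Hypothetical MetaMathQA-style records; field names are an assumption.
records = [
    {
        "query": "What is 15% of 200?",
        "response": "15% of 200 is 0.15 * 200 = 30. The answer is 30.",
    },
]
```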
`RANK=8 deepspeed --num_gpus=8 --num_nodes=2 train.py \ --base_model --micro_batch_size 4 \ --wandb_run_name mora_math_r8 --lora_target_modules q_proj,k_proj,v_proj,o_proj,gate_proj,down_proj,up_proj \ --num_epochs 3 --deepspeed ds.config --wandb_project lora-math --lora_r $RANK --batch_size 128 \ --data_path meta-math/MetaMath \ --save_steps 3000...
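The command passes `--deepspeed ds.config` but its contents are not shown. A minimal ZeRO-2 config along the following lines is a common starting point; all values below are assumptions chosen to be consistent with the batch sizes in the command, not the authors' settings.

```python
# Write a minimal DeepSpeed ZeRO-2 config to ds.config (values are assumptions).
import json

ds_config = {
    "train_batch_size": 128,              # matches --batch_size 128
    "train_micro_batch_size_per_gpu": 4,  # matches --micro_batch_size 4
    "gradient_accumulation_steps": 2,     # 4 * 16 GPUs (2 nodes x 8) * 2 = 128
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

with open("ds.config", "w") as f:
    json.dump(ds_config, f, indent=2)
```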
Base model: llama-7b-hf. I tried to merge the fine-tuned parameters, but I got an error: `Traceback (most recent call last): File "/home/kww/test_model.py", line 24, in merged_model = model.merge_and_unload(safe_merge=True) File "/home/kww/MoRA/peft-mora/src/peft/tuners/lora/model.py", line 721, in merge_and_unload return self._unload_and_optionally_merge( File "/home/kww/MoRA/peft-mora/src/peft/tuners/lora/model.py", line 375, in _unload_and_optionally_merge target.merge(safe_merge=safe_merge,...`
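For context, a sketch of the merge workflow the traceback above is hitting, using standard peft API calls; paths are placeholders and this has not been verified against the patched peft-mora.

```python
# Load the base model, attach the saved adapter, then merge it into the weights.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "llama-7b-hf", torch_dtype=torch.float16   # placeholder path
)
model = PeftModel.from_pretrained(base, "path/to/mora_adapter")  # placeholder path
merged = model.merge_and_unload(safe_merge=True)  # the call that raises above
merged.save_pretrained("path/to/merged_model")
```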
I don't see any adoption of this method whatsoever. Why is it so unpopular? The most recent articles date back to a year ago. Or maybe it exists under another name?