Christopher Chou
To fix the "found optimizer but no scheduler" issue, simply remove the optimizer block from the deepspeed config. This was a new change in the new version of transformers...
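For reference, a minimal sketch of what that config can look like once the optimizer (and scheduler) blocks are dropped, so the HF Trainer creates both itself. The specific keys and `"auto"` values below are illustrative placeholders, not copied from the actual config:

```python
import json

# DeepSpeed config with no "optimizer" / "scheduler" block; the transformers
# DeepSpeed integration fills in the "auto" values from TrainingArguments.
ds_config = {
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```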
My deepspeed config and training script is the same as listed [here](https://github.com/lm-sys/FastChat/blob/main/docs/training.md)
Yes, reloading from the new checkpoint still works, and the checkpoint size for the `adapter_model` was 17M for me. I was able to test `apply_lora` and it works with the...
I discovered and tested (following [this thread](https://discuss.huggingface.co/t/trainer-option-to-disable-saving-deepspeed-checkpoints/13262/4)) that the program will not create the `pytorch_model.bin` file if we set `"stage3_gather_16bit_weights_on_model_save": false` in `ds_config.json`, and resuming from the checkpoint still works. So, now the...
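As a small sketch, that flag lives under the `zero_optimization` block of `ds_config.json`; flipping it could look roughly like this (file path and surrounding keys are placeholders):

```python
import json

# Disable gathering the full 16-bit weights at save time, so DeepSpeed only
# writes its sharded checkpoint and no consolidated pytorch_model.bin.
with open("ds_config.json") as f:
    ds_config = json.load(f)

ds_config.setdefault("zero_optimization", {})[
    "stage3_gather_16bit_weights_on_model_save"
] = False

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```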
I tested resuming from a checkpoint after deleting all of the `zero_pp_rank_x_mp_rank_00_model_states.pt` files, and I ran into an AssertionError when loading from the checkpoint, presumably because it is still looking for those files.
I was working on this PR, but the best I can do so far is load from an adapter, and we lose the LR schedule, optimizer...
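For context, a rough sketch of that fallback, assuming PEFT: load only the adapter weights back onto the base model and start training again with a fresh optimizer and LR schedule (the model path, checkpoint path, and hyperparameters here are placeholders, not taken from the PR):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

# Attach the saved LoRA adapter to the base model; this recovers the trained
# adapter weights but not the trainer state (optimizer, LR scheduler, step count).
base = AutoModelForCausalLM.from_pretrained("llama-7b")
model = PeftModel.from_pretrained(base, "output/checkpoint-1000", is_trainable=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="output-resumed", learning_rate=2e-5),
    train_dataset=None,  # placeholder: plug in the real dataset
)
# trainer.train() then starts from step 0 with a new optimizer and schedule.
```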
I heard that people also get good results fine-tuning the "fc1" and "fc2" modules, based on [this paper](https://arxiv.org/pdf/2110.04366.pdf): > "we conclude that modifying head attention shows the best results when the parameter...
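A minimal sketch of targeting those modules with PEFT, assuming a model whose MLP layers are actually named `fc1`/`fc2` (e.g. OPT-style models; LLaMA names them `gate_proj`/`up_proj`/`down_proj` instead), and with placeholder LoRA hyperparameters:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Target the MLP layers instead of (or in addition to) the attention projections.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["fc1", "fc2"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```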
Yes, I believe it does add additional parameters.
I think you should use `apply_lora.py` to merge the adapter `project-baize/baize-lora-7b` into the base model `llama-7b`.
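Conceptually that merge looks roughly like the following PEFT-based sketch; the paths are placeholders, and `apply_lora.py` in the repo wraps the same steps behind its own CLI arguments:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, apply the LoRA adapter, then fold the adapter weights
# into the base weights so the result is a plain standalone model.
base = AutoModelForCausalLM.from_pretrained("llama-7b")
model = PeftModel.from_pretrained(base, "project-baize/baize-lora-7b")
merged = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("llama-7b")
merged.save_pretrained("baize-7b-merged")
tokenizer.save_pretrained("baize-7b-merged")
```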
Closing, as this is resolved by #112.