TensorRT-LLM
Alpha scaling incorrect when using rslora
System Info
Any
Who can help?
@kaiyux
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
If a model is trained with LoRA adapters using rsLoRA, TensorRT-LLM calculates the scaling for the LoRA weights incorrectly:
https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/lora_manager.py#L632
This line should be updated to check the `use_rslora` value in adapter_config.json and, when it is set, normalize by the square root of the rank instead of the rank itself. A minimal sketch of what I mean is below.
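For illustration only (the `lora_scale` helper is hypothetical, not existing code in lora_manager.py, but `lora_alpha`, `r`, and `use_rslora` are the keys PEFT writes to adapter_config.json):

```python
import json
import math
import os


def lora_scale(adapter_dir: str) -> float:
    """Select the LoRA weight scale based on adapter_config.json.

    Plain LoRA scales the BA product by alpha / rank; rank-stabilized
    LoRA (rsLoRA) scales it by alpha / sqrt(rank). PEFT records the
    choice in the use_rslora field of adapter_config.json.
    """
    with open(os.path.join(adapter_dir, "adapter_config.json")) as f:
        cfg = json.load(f)

    alpha = cfg["lora_alpha"]
    rank = cfg["r"]

    if cfg.get("use_rslora", False):
        return alpha / math.sqrt(rank)  # rsLoRA: normalize by sqrt(rank)
    return alpha / rank                 # standard LoRA: normalize by rank
```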
Expected behavior
LoRA adapters give the expected performance even when trained with rsLoRA.
actual behavior
LoRA adapters trained with rsLoRA give very poor generations, because the adapter weights are applied with the wrong scale.
additional notes
I would also like the LoRA scaling information to be used in the examples/hf_lora_convert.py script, which currently reads adapter_config.json but does not take alpha scaling into account.
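As a rough sketch of what that could look like (the `apply_lora_scale` helper is hypothetical and assumes the converter keeps the split A/B weights):

```python
import numpy as np


def apply_lora_scale(lora_b: np.ndarray, alpha: float, rank: int,
                     use_rslora: bool) -> np.ndarray:
    """Fold the LoRA scale into the B (lora_up) weights so downstream
    code can apply the adapter without any further scaling."""
    scale = alpha / np.sqrt(rank) if use_rslora else alpha / rank
    return lora_b * scale
```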