chenqianfzh
Can you add a line in your script to download the repo to a local path and run from there? For instance, you can add lines like the following before...
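The suggested snippet is truncated above; a minimal sketch of what such lines could look like, assuming `huggingface_hub`'s `snapshot_download` and a placeholder repo id (not necessarily the model used in the script):

```python
# Hypothetical sketch: fetch the model repo to a local path first,
# then point the script at that path instead of the hub id.
from huggingface_hub import snapshot_download

# Placeholder repo id; substitute the model the script actually uses.
local_path = snapshot_download(repo_id="huggyllama/llama-7b")

# ...then pass `local_path` wherever the script previously took the hub id.
```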
> This completely bypasses the existing LoRA logic and implements its own. I don't think this is a good design and it clashes with already existing code. We should instead...
> Is it theoretically possible for the QLoRA adapter to be loaded and unloaded at will?

I am not sure what you mean by "at will". Do you mean load/unload...
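For reference, vLLM's existing (non-quantized) LoRA path already loads adapters per request rather than baking them into the model, which is one reading of "at will". A minimal sketch of that pattern, with placeholder model and adapter paths:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder model id for illustration.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)

# Each request can name a different adapter; vLLM loads and evicts
# adapters on demand as requests come in.
outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(max_tokens=32),
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora/adapter"),
)
```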
> Ok, that's what I wanted to confirm. Thanks for clearing it up. In that case:
>
> 1. for consistency, I would suggest ditching the `qlora_supported` decorator and just...
> Thank you for your excellent work. Here are some personal opinions:
>
> * vLLM has supported quantized models with LoRA, refer to [quant model+lora](https://github.com/vllm-project/vllm/blob/main/tests/lora/test_quant_model.py). These can be generalized...
@Yard1 @jeejeelee I just updated the QLoRA/BitsAndBytes PR with the suggested changes. Could you please take another look? Thanks for all the great advice. Learned a lot and...
@mgoin @Yard1 @jeejeelee Thanks for the feedback. Working on the changes now.
> We should also add a test for this - it's ok if it's just an end to end one (load a small model from huggingface hub and see if...
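A minimal sketch of what such an end-to-end test might look like; the model id is a placeholder, and the `quantization`/`load_format` values are assumptions following this PR's naming rather than the final merged test:

```python
import pytest
from vllm import LLM, SamplingParams


@pytest.mark.parametrize("model_id", ["huggyllama/llama-7b"])  # placeholder
def test_bitsandbytes_end_to_end(model_id):
    # Assumed flags: load the checkpoint from the hub and quantize
    # with bitsandbytes, per the approach discussed in this PR.
    llm = LLM(
        model=model_id,
        quantization="bitsandbytes",
        load_format="bitsandbytes",
        enforce_eager=True,
    )
    outputs = llm.generate(
        ["The capital of France is"],
        SamplingParams(temperature=0.0, max_tokens=8),
    )
    # Smoke check: the model loads and produces non-empty text.
    assert outputs and outputs[0].outputs[0].text
```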
@jeejeelee @Yard1 @mgoin I have updated the PR, addressing and resolving all the comments. Additionally, I have added the necessary unit tests. Could you please review it again? However, I...
> > @jeejeelee @Yard1 @mgoin
> > I have updated the PR, addressing and resolving all the comments. Additionally, I have added the necessary unit tests. Could you please review...