LiMa-cas
When I run inference, is it much slower since I need an if/else to decide which precision to dequantize?
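A minimal sketch of the per-precision dispatch the question describes: a dequantize helper that branches on the stored bit-width. The function name, packing layout, and scale/zero-point scheme here are illustrative assumptions, not the API of any specific library; note the branch itself runs once per tensor, so its cost is tiny next to the elementwise arithmetic.

```python
import numpy as np

def dequantize(q, scale, zero_point, bits):
    """Map integer codes back to float; the if/else picks the precision path."""
    if bits == 4:
        # hypothetical 4-bit layout: two codes packed per byte
        low = q & 0x0F
        high = (q >> 4) & 0x0F
        codes = np.stack([low, high], axis=-1).reshape(q.shape[0], -1)
    elif bits == 8:
        codes = q
    else:
        raise ValueError(f"unsupported bit-width: {bits}")
    # affine dequantization: (code - zero_point) * scale
    return (codes.astype(np.float32) - zero_point) * scale

q8 = np.array([[0, 128, 255]], dtype=np.uint8)
w = dequantize(q8, scale=0.1, zero_point=128, bits=8)
# w is approximately [[-12.8, 0.0, 12.7]]
```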
Hi, what's the difference between llm-awq and autoawq? Thanks in advance!
as mentioned above
Hello, how much time does it take, and which datasets did you use?
1. Does the fine-tuning have to be done per layer, or can I fine-tune several layers at once? 2. Is the codebook quantization method slower than AWQ? 3. When I run inference, it is successful...
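For the codebook question above, a hedged sketch of what lookup-table (codebook) dequantization generally looks like: each weight stores a small index into a learned codebook, so reconstruction is a gather rather than AWQ-style scale-and-shift. The codebook values and shapes are made up for illustration.

```python
import numpy as np

# 2-bit codebook: four representative float values (assumed, not learned here)
codebook = np.array([-1.0, -0.25, 0.25, 1.0], dtype=np.float32)

# each weight is stored as a 2-bit index into the codebook
indices = np.array([[0, 3, 2],
                    [1, 1, 0]], dtype=np.uint8)

# dequantization is a table lookup (gather)
weights = codebook[indices]
# weights == [[-1.0, 1.0, 0.25], [-0.25, -0.25, -1.0]]
```

Whether this is slower than AWQ in practice depends on how well the gather maps to the hardware; the sketch only shows the operation, not its kernel cost.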
 torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 28.00 GiB. GPU 0 has a total capacity of 47.54 GiB of which 9.50 GiB is free. Process 1509125 has 9.68...
Hi, where can I get the file from this error: FileNotFoundError: Unable to find '/home/lyh/data/hf/Shargpt/ShareGPT_V4.3_unfiltered_cleaned_split.json'? Thanks a lot.
QLoRA question
Does QLoRA fine-tune in fp16 or in int4? Why do the results I get require adding the original fp16 model parameters? That makes the output very large.
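A toy numpy sketch (not bitsandbytes/PEFT code; sizes and rank are assumed) of why a merged QLoRA checkpoint is large: only the small LoRA factors are trained, so saving just the adapter is tiny, but merging the adapter back requires materializing the full-size base weights again.

```python
import numpy as np

d, r = 1024, 8                                        # hidden size, LoRA rank (assumed)
base_fp16 = np.random.randn(d, d).astype(np.float16)  # frozen base weight
A = np.zeros((r, d), dtype=np.float16)                # LoRA factors: only these
B = np.random.randn(d, r).astype(np.float16)          # two matrices are trained

adapter_bytes = A.nbytes + B.nbytes   # what the adapter-only checkpoint stores (~32 KB)
merged = base_fp16 + (B @ A)          # merging needs the full d x d base weight
merged_bytes = merged.nbytes          # back to full model size (~2 MB for this layer)
```

So saving adapters alone stays small; it is the merge step that reintroduces the full fp16 parameter count.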