LiMa-cas
When I run inference, is it much slower since I need an if/else to decide which precision to dequantize?
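A minimal sketch of the per-precision dispatch the question describes: a dequantize helper that branches on the stored bit-width. The function name, packing layout, and scale/zero-point scheme here are illustrative assumptions, not the API of any specific library; note the branch itself runs once per tensor, so its cost is tiny next to the elementwise arithmetic.

```python
import numpy as np

def dequantize(q, scale, zero_point, bits):
    """Map integer codes back to float; the if/else picks the precision path."""
    if bits == 4:
        # hypothetical 4-bit layout: two codes packed per byte
        low = q & 0x0F
        high = (q >> 4) & 0x0F
        codes = np.stack([low, high], axis=-1).reshape(q.shape[0], -1)
    elif bits == 8:
        codes = q
    else:
        raise ValueError(f"unsupported bit-width: {bits}")
    # affine dequantization: (code - zero_point) * scale
    return (codes.astype(np.float32) - zero_point) * scale

q8 = np.array([[0, 128, 255]], dtype=np.uint8)
w = dequantize(q8, scale=0.1, zero_point=128, bits=8)
# w is approximately [[-12.8, 0.0, 12.7]]
```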
Hi, what's the difference between llm-awq and autoawq? Thanks in advance!
as mentioned above
Hello, how much time does it take, and which datasets did you use?
1. Does the fine-tuning have to be done per layer, or can I fine-tune several layers at once? 2. Is the codebook quantization method slower than AWQ? 3. When I run inference, it is successful...
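For the codebook question above, a hedged sketch of what lookup-table (codebook) dequantization generally looks like: each weight stores a small index into a learned codebook, so reconstruction is a gather rather than AWQ-style scale-and-shift. The codebook values and shapes are made up for illustration.

```python
import numpy as np

# 2-bit codebook: four representative float values (assumed, not learned here)
codebook = np.array([-1.0, -0.25, 0.25, 1.0], dtype=np.float32)

# each weight is stored as a 2-bit index into the codebook
indices = np.array([[0, 3, 2],
                    [1, 1, 0]], dtype=np.uint8)

# dequantization is a table lookup (gather)
weights = codebook[indices]
# weights == [[-1.0, 1.0, 0.25], [-0.25, -0.25, -1.0]]
```

Whether this is slower than AWQ in practice depends on how well the gather maps to the hardware; the sketch only shows the operation, not its kernel cost.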
 torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 28.00 GiB. GPU 0 has a total capacity of 47.54 GiB of which 9.50 GiB is free. Process 1509125 has 9.68...
Hi, where can I get the file from this error: FileNotFoundError: Unable to find '/home/lyh/data/hf/Shargpt/ShareGPT_V4.3_unfiltered_cleaned_split.json'? Thanks a lot.
QLoRA question
Does QLoRA fine-tune in fp16 or in int4? Why do the results I get require adding the original fp16 model parameters? That makes the output very large.
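A toy numpy sketch (not bitsandbytes/PEFT code; sizes and rank are assumed) of why a merged QLoRA checkpoint is large: only the small LoRA factors are trained, so saving just the adapter is tiny, but merging the adapter back requires materializing the full-size base weights again.

```python
import numpy as np

d, r = 1024, 8                                        # hidden size, LoRA rank (assumed)
base_fp16 = np.random.randn(d, d).astype(np.float16)  # frozen base weight
A = np.zeros((r, d), dtype=np.float16)                # LoRA factors: only these
B = np.random.randn(d, r).astype(np.float16)          # two matrices are trained

adapter_bytes = A.nbytes + B.nbytes   # what the adapter-only checkpoint stores (~32 KB)
merged = base_fp16 + (B @ A)          # merging needs the full d x d base weight
merged_bytes = merged.nbytes          # back to full model size (~2 MB for this layer)
```

So saving adapters alone stays small; it is the merge step that reintroduces the full fp16 parameter count.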