OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Hi, as I've come to find out, it appears to be a bit more complicated than the .ipynb makes it seem. As far as...
Hi, when I tried to reproduce the evaluation results for Llama-2-13b w4a4, I got "nan" for both WIKI and C4. However, the reproduction results are good for Llama-2-13b w6a6 and...
```bash
CUDA_VISIBLE_DEVICES=0 python main.py \
  --model /home/Projects/model_zoo/facebook/opt-30b \
  --epochs 20 --output_dir ./log/opt-30b-w6a6 \
  --wbits 6 --abits 6 --lwc --let --alpha 0.75 --eval_ppl \
  --net opt-30b
```
When I use omniquant...
I have obtained the weight offset factor for llama3-8b, but I ran into a mismatch issue during my compression process. My scaling-factor code has not been changed,...
Why is the compressed model saved as a single file, while the pretrained weights are split across many files? Can the compressed file be used for...
When using transformers version 4.35.2, I got this error, and a similar error when quantizing llama: it seems you are using version
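Several of the errors in this thread come from running the repo against a different transformers release than it was developed with. A quick way to fail fast with a clear message (a hypothetical guard, not project code; the pinned version below is an assumption taken from the issue reports, and in practice you would obtain the installed version via `importlib.metadata.version("transformers")`) is:

```python
# Hypothetical startup guard: compare the installed transformers version
# against the one the code was reportedly tested with, and fail with a
# readable message instead of a confusing AttributeError deep in a layer.

SUPPORTED = (4, 35, 2)  # assumption based on the issue reports above

def parse(v: str) -> tuple:
    """Keep only the leading numeric components, e.g. '4.36.0.dev0' -> (4, 36, 0)."""
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)

def check(installed: str) -> None:
    """Raise if the installed version differs from the supported one."""
    if parse(installed) != SUPPORTED:
        raise RuntimeError(
            f"transformers {installed} detected; this code was tested "
            f"against {'.'.join(map(str, SUPPORTED))} -- please pin that version."
        )

check("4.35.2")  # matching version: passes silently
```

Pinning the exact version in `requirements.txt` avoids the problem entirely, at the cost of blocking newer model support.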
```
OmniQuant-main/models/int_falcon_layer.py", line 52, in __init__
    self.maybe_rotary = copy.deepcopy(org_module.maybe_rotary)
  File "local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'FalconAttention' object has no attribute 'maybe_rotary'
```
transformers version:...
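One defensive workaround (a sketch, not the repo's actual fix; it assumes the attribute was renamed or removed in a newer transformers release, so the wrapper should copy it only when it exists) looks like this:

```python
import copy

def copy_attr_if_present(dst, src, name):
    """Deep-copy attribute `name` from src to dst only if src actually has it.

    Guards against AttributeError when a library refactor (e.g. a newer
    transformers release) removes an attribute the wrapper layer expects.
    Returns the copied value, or None if the attribute was absent.
    """
    value = getattr(src, name, None)
    if value is not None:
        setattr(dst, name, copy.deepcopy(value))
    return value

# Illustration with plain stand-ins for the real attention modules:
class OldFalconAttention:          # older transformers: attribute present
    def __init__(self):
        self.maybe_rotary = "rotary-fn"

class NewFalconAttention:          # newer transformers: attribute removed
    pass

class Wrapper:
    pass

w1, w2 = Wrapper(), Wrapper()
copy_attr_if_present(w1, OldFalconAttention(), "maybe_rotary")  # copied
copy_attr_if_present(w2, NewFalconAttention(), "maybe_rotary")  # skipped, no crash
```

This only silences the crash; the quantized Falcon layer would still need the rotary embedding wired up from wherever the newer transformers release moved it.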
Some time ago, in README there was a link to the "fixed version" of AutoGPTQ: [AutoGPTQ-bugfix](https://github.com/ChenMnZ/AutoGPTQ-bugfix). However, current README gives link to the original repo: [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ). So, does this mean...
I ran main.py with `--tasks piqa,arc_easy,arc_challenge,boolq,hellaswag,winogrande`. Since the datasets could not be loaded automatically, I had to download them from Hugging Face and load them locally. Then I...
Dear author, how can I evaluate the W6A6 OPT model using the provided [ChenMnZ/OmniQuant](https://huggingface.co/ChenMnZ/OmniQuant/tree/main), `act_shifts`, and `act_scales`? Here is my command:
```bash
python main.py --model $PATH_TO_MY_OPT_CHECKPOINT \...
```