OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Hi, as I've come to find out, it appears to be a bit more complicated than the .ipynb makes it seem. As far as...
Hi, when I tried to reproduce the evaluation results for Llama-2-13b w4a4, I got "nan" for both WIKI and C4. However, the reproduction results are good for Llama-2-13b w6a6 and...
```bash
CUDA_VISIBLE_DEVICES=0 python main.py \
  --model /home/Projects/model_zoo/facebook/opt-30b \
  --epochs 20 --output_dir ./log/opt-30b-w6a6 \
  --wbits 6 --abits 6 --lwc --let --alpha 0.75 --eval_ppl \
  --net opt-30b
```
When I use omniquant...
I have obtained the weight offset factor for llama3-8b, but I ran into a mismatch issue during my compression process. My scaling-factor code has not been changed,...
Why is the compressed model saved as a single file, while the pretrained weights are split across many files? Can the compressed file be used for...
When using transformers version 4.35.2, I got this error, and a similar error when quantizing llama: it seems you are using version
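Several of the errors in this thread come from running the repo against a different transformers release than it was developed with. A quick way to fail fast with a clear message (a hypothetical guard, not project code; the pinned version below is an assumption taken from the issue reports, and in practice you would obtain the installed version via `importlib.metadata.version("transformers")`) is:

```python
# Hypothetical startup guard: compare the installed transformers version
# against the one the code was reportedly tested with, and fail with a
# readable message instead of a confusing AttributeError deep in a layer.

SUPPORTED = (4, 35, 2)  # assumption based on the issue reports above

def parse(v: str) -> tuple:
    """Keep only the leading numeric components, e.g. '4.36.0.dev0' -> (4, 36, 0)."""
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)

def check(installed: str) -> None:
    """Raise if the installed version differs from the supported one."""
    if parse(installed) != SUPPORTED:
        raise RuntimeError(
            f"transformers {installed} detected; this code was tested "
            f"against {'.'.join(map(str, SUPPORTED))} -- please pin that version."
        )

check("4.35.2")  # matching version: passes silently
```

Pinning the exact version in `requirements.txt` avoids the problem entirely, at the cost of blocking newer model support.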
```
OmniQuant-main/models/int_falcon_layer.py", line 52, in __init__
    self.maybe_rotary = copy.deepcopy(org_module.maybe_rotary)
  File "local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'FalconAttention' object has no attribute 'maybe_rotary'
```
transformers version:...
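One defensive workaround (a sketch, not the repo's actual fix; it assumes the attribute was renamed or removed in a newer transformers release, so the wrapper should copy it only when it exists) looks like this:

```python
import copy

def copy_attr_if_present(dst, src, name):
    """Deep-copy attribute `name` from src to dst only if src actually has it.

    Guards against AttributeError when a library refactor (e.g. a newer
    transformers release) removes an attribute the wrapper layer expects.
    Returns the copied value, or None if the attribute was absent.
    """
    value = getattr(src, name, None)
    if value is not None:
        setattr(dst, name, copy.deepcopy(value))
    return value

# Illustration with plain stand-ins for the real attention modules:
class OldFalconAttention:          # older transformers: attribute present
    def __init__(self):
        self.maybe_rotary = "rotary-fn"

class NewFalconAttention:          # newer transformers: attribute removed
    pass

class Wrapper:
    pass

w1, w2 = Wrapper(), Wrapper()
copy_attr_if_present(w1, OldFalconAttention(), "maybe_rotary")  # copied
copy_attr_if_present(w2, NewFalconAttention(), "maybe_rotary")  # skipped, no crash
```

This only silences the crash; the quantized Falcon layer would still need the rotary embedding wired up from wherever the newer transformers release moved it.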
Some time ago, in README there was a link to the "fixed version" of AutoGPTQ: [AutoGPTQ-bugfix](https://github.com/ChenMnZ/AutoGPTQ-bugfix). However, current README gives link to the original repo: [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ). So, does this mean...
I ran main.py with `--tasks piqa,arc_easy,arc_challenge,boolq,hellaswag,winogrande`. Since the datasets could not be loaded automatically, I had to download them from Hugging Face and load them locally. Then I...
Dear author, how can I evaluate the W6A6 OPT model using the provided [ChenMnZ/OmniQuant](https://huggingface.co/ChenMnZ/OmniQuant/tree/main), `act_shifts`, and `act_scales`? Here is my command:
```bash
python main.py --model $PATH_TO_MY_OPT_CHECKPOINT \...
```