Obtained different PPL for Wikitext and C4 compared to results reported in the paper
Hi, thank you so much for the amazing paper and repo.
I am trying to reproduce the Wikitext and C4 perplexity from the OmniQuant paper. I downloaded the repo and ran the following experiment:
CUDA_VISIBLE_DEVICES=0 python main.py --model meta-llama/Llama-2-7b-hf --epochs 20 --output_dir ./log/llama-7b-w3a16g128 --eval_ppl --wbits 3 --abits 16 --group_size 128 --lwc
According to the paper, the perplexity on Wikitext and C4 for Llama-2-7B at W3A16 g128 should be 6.03 and 7.75, respectively, but I obtained the following results from the log:
[2024-09-12 03:20:31 root] (main.py 144): INFO wikitext2 : 6.098666191101074
[2024-09-12 03:23:30 root] (main.py 144): INFO c4 : 7.8100385665893555
Did I set the hyperparameters wrongly? Hope you could help me clarify, thanks!
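For reference, my understanding is that the reported perplexity is computed as the exponential of the mean per-token negative log-likelihood over the eval set. A minimal sketch of that calculation (my own illustration, not the repo's actual evaluation code):

```python
import math

def perplexity(segment_nlls, segment_token_counts):
    """Perplexity as exp of the token-weighted mean negative log-likelihood.

    segment_nlls: summed NLL (in nats) of each evaluated segment.
    segment_token_counts: number of tokens in each segment.
    """
    total_nll = sum(segment_nlls)
    total_tokens = sum(segment_token_counts)
    return math.exp(total_nll / total_tokens)

# Example with two hypothetical segments of 1000 and 500 tokens:
ppl = perplexity([1803.2, 905.1], [1000, 500])
print(f"{ppl:.4f}")
```

So even a small shift in the average NLL (e.g. from a slightly different checkpoint or code version) moves the final perplexity by a few hundredths, which seems consistent with the size of the gap I am seeing.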
The released checkpoints have a slight mismatch with the current code. Retraining with the current code should successfully reproduce the reported results.
Thank you for the quick response!