Perplexity is off for Llama 2-7b
Hello, I hope this finds you well.
I was trying to prune Llama 2-7b with wanda (cloned directly from your codebase), so I ran the following command:

```
python main.py --model meta-llama/Llama-2-7b-hf --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama2_7b/unstructured/wanda/
```
but I get a perplexity of 10.27, which is way higher than what you are reporting. The model is pruned with C4 calibration data and evaluated on wikitext2 (I changed nothing in the codebase). Do you maybe have a guess as to what I might be doing wrong?
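For reference, the perplexity number above comes from the standard wikitext2 protocol. A minimal sketch of that kind of evaluation (not this repo's exact eval code; the non-overlapping-segment setup and variable names here are just illustrative):

```python
# Sketch of a standard wikitext2 perplexity check (illustrative, not the repo's eval code).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Tokenize the full wikitext2 test split as one long stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

seqlen = 4096  # evaluation context size
nsamples = enc.input_ids.shape[1] // seqlen

nlls = []
for i in range(nsamples):
    batch = enc.input_ids[:, i * seqlen:(i + 1) * seqlen].to(model.device)
    with torch.no_grad():
        # .loss is the mean token-level NLL for this segment (teacher forcing).
        loss = model(batch, labels=batch).loss
    nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))
print(f"wikitext2 perplexity: {ppl.item():.2f}")
```

If a standalone loop like this already disagrees with the repo's output, the issue is probably in my environment (tokenizer/dataset versions) rather than in the pruning itself.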
TIA
Hi, could you check whether the performance of dense LLaMA-2-7b matches our number in the paper as well?
Hi, thanks for your prompt response. Yes, the dense model is off too: I'm getting 7.72 for LLaMA-2-7b, while you report 5.12. Could you maybe clone your repository again yourself and see if you can reproduce the results?
Hmm, the number I get from rerunning is still 5.12 for dense LLaMA-2 (context size 4096). Even with context size 2048, the number would be around 5.5, as verified by other works (e.g., Table 4 in https://arxiv.org/abs/2306.00978).
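If it helps with debugging, the model's native context window can be read straight from the Hugging Face config, independently of this repo (Llama-2-7b should report 4096):

```python
# Confirm the native context window of the checkpoint (expected: 4096 for Llama-2).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
print(config.max_position_embeddings)
```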
I'm running with context size 4096 as well (nsamples = 333). This is so weird. What versions of datasets and transformers are you using?
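For comparison, here is how I'm dumping my environment (a generic version check; the package list is just the ones I suspect matter):

```python
# Print installed versions of the relevant packages to compare environments.
from importlib.metadata import version

for pkg in ("transformers", "datasets", "tokenizers", "torch"):
    print(pkg, version(pkg))
```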