
OPT-66B, unstructured sparsity gets wikitext perplexity 3404.0751953125

Open dhjoo98 opened this issue 1 year ago • 1 comment

Hello, I used the scripts to prune OPT-66B (unstructured, n_samples 128). After pruning, I get a wikitext perplexity of 3404, which is far from the number reported in the paper.

I was wondering whether the code's output metric should be scaled by 0.01 (giving a perplexity of 3.404), or whether this is simply an outlier result.
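As a side note on the scaling question: perplexity is the exponential of the mean per-token negative log-likelihood, so a fixed 0.01 rescaling would not be a natural artifact of an evaluation script. A minimal sketch (the specific loss values are just illustrations derived from the two perplexity numbers above) shows how different the implied losses are:

```python
import math

# Perplexity = exp(mean negative log-likelihood per token), so it cannot
# be off by a simple linear factor like 0.01; the two candidate values
# imply very different average losses.
def implied_mean_nll(perplexity: float) -> float:
    """Invert perplexity back to the mean NLL (nats/token) it implies."""
    return math.log(perplexity)

loss_reported = implied_mean_nll(3404.0751953125)  # the value the script printed
loss_hoped = implied_mean_nll(3.404)               # the value if scaled by 0.01

print(f"reported ppl 3404.08 -> mean NLL {loss_reported:.2f} nats/token")
print(f"scaled   ppl 3.404   -> mean NLL {loss_hoped:.2f} nats/token")
```

A mean loss around 8 nats/token points to a genuinely degraded model rather than a units mix-up, which is consistent with the "outlier" explanation below.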

dhjoo98 avatar May 02 '24 16:05 dhjoo98

This seems to be an outlier result, which I have also seen before when running on OPT-66B. I wasn't able to look into it (mainly because LLaMA and LLaMA-2 are much more popular), but it would be interesting to study why this happens from a scientific perspective.

Eric-mingjie avatar May 03 '24 02:05 Eric-mingjie