MMLU results look strange

Open milktea888 opened this issue 1 year ago • 1 comment

Hello, I tested MMLU at sparsity levels from 0 to 90%. But the MMLU score stays around 50% and shows no significant decline as the sparsity increases. Since MMLU consists of multiple-choice questions, the sparsified model can still pick the correct choice, but garbled text appears in the explanation part. Could I say that sparsifying the model this way does not affect its logical reasoning but does affect its language ability (hence the garbled text)?
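To make the question concrete, here is a minimal sketch of how a correct choice and garbled free-form text can coexist. It assumes the evaluation scores each question by comparing only the next-token scores of the four answer letters, which may not match the actual evaluation setup. The names `pick_choice`, `accuracy`, and the toy scores are hypothetical and for illustration only.

```python
# Assumption: the evaluator only compares scores of the four answer letters.
CHOICES = ["A", "B", "C", "D"]


def pick_choice(next_token_scores):
    """Return the letter with the highest score, ignoring all other tokens."""
    return max(CHOICES, key=lambda c: next_token_scores.get(c, float("-inf")))


def accuracy(score_rows, gold):
    """Fraction of questions where the restricted argmax matches the gold letter."""
    correct = sum(pick_choice(s) == g for s, g in zip(score_rows, gold))
    return correct / len(gold)


# Made-up scores for two questions. In the first row a garbage token has the
# highest score overall, so free generation would emit it, yet the restricted
# comparison over A-D still selects "B".
rows = [
    {"A": 0.1, "B": 2.3, "C": -1.0, "D": 0.0, "<garbage>": 9.9},
    {"A": 1.5, "B": 0.2, "C": 0.3, "D": -0.7},
]
gold = ["B", "A"]

print(pick_choice(rows[0]))
print(accuracy(rows, gold))
```

With four options per question, random guessing gives about 25%, so a score near 50% is still above chance under this kind of scoring.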

Looking forward to your reply! Thanks

Dec 10 '24 10:12 milktea888

Yeah, this is interesting. I wonder if MMLU is contaminated.

Sep 26 '25 00:09 chromecast56