kkwhale7

Results: 15 comments of kkwhale7

@tonysy @lvhan028 @so2liu @cdpath I need your help!!

You haven't implemented the evaluation logic for subjective questions, so why are the values displayed on the official website different from ours? ![image](https://github.com/open-compass/opencompass/assets/79788571/28cff94c-2551-4201-83c1-04565eb1bdb1)

> We only include the objective questions of Gaokao in OpenCompass.

But your website shows a score of 18.9 on GAOKAO ![image](https://github.com/open-compass/opencompass/assets/79788571/1731cf7b-e7ba-4143-bc59-4ac5e32c717d) and we can't reproduce it!

Calculated my way, I only get an objective score of 15.13 ![image](https://github.com/open-compass/opencompass/assets/79788571/c4fb1add-6443-49ce-9531-751ac3aa4397)

> The MCQ problems select one answer from 4 choices. The result is meaningless when smaller than 25%.

I got it. So do you directly ignore the scores of multiple topic...
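For anyone following along, the 25% figure is just the random-guess baseline for single-answer MCQs with 4 options. A minimal sketch of that arithmetic (the numbers here are illustrative, not OpenCompass internals):

```python
# Random-guess baseline for single-answer MCQs: with 4 equally likely
# options, expected accuracy from blind guessing is 1/4 = 25%.
num_choices = 4
baseline = 1 / num_choices  # 0.25

# Any measured accuracy at or below this baseline carries no signal,
# so a score like 15.13% is indistinguishable from (or worse than) chance.
measured = 0.1513
print(f"baseline={baseline:.2%}, measured={measured:.2%}, "
      f"above chance: {measured > baseline}")
```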

I have discovered a new phenomenon: the predictions generated by the gen task in GAOKAO are also inconsistent, even with ZeroRetriever. All results below are with llama2-7b models. ![image](https://github.com/open-compass/opencompass/assets/79788571/51bf47f4-4078-4312-b64c-ed72a4c3b126) ![image](https://github.com/open-compass/opencompass/assets/79788571/31381eb0-aa8a-4a1c-854f-1c45854d648a)
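To rule out sampling as the source of the run-to-run differences, here is a minimal sketch that pins the seed and forces greedy decoding. This assumes llama2-7b is loaded directly through Hugging Face transformers rather than through OpenCompass's wrappers, and the prompt is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(0)  # fix the RNG so any sampling path is repeatable

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("An example GAOKAO question...", return_tensors="pt")
# do_sample=False makes decoding greedy, so two runs with the same
# config and weights should produce byte-identical predictions.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the outputs still differ across runs under greedy decoding, the nondeterminism lies elsewhere (e.g. batching or kernel nondeterminism), not in the sampling config.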

Thank you for your patience. I understand your score calculation method now. But why are the two predictions different when I have the same config? This postprocess method only answers...
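For context, the kind of postprocess under discussion pulls a single option letter out of the raw generation before scoring. A hypothetical sketch of such an extractor (my illustration, not OpenCompass's actual implementation):

```python
import re

def extract_option(prediction: str) -> str:
    """Return the first standalone option letter A-D found in the text,
    or an empty string if none is present."""
    match = re.search(r"\b([A-D])\b", prediction)
    return match.group(1) if match else ""

# Two differently worded generations can still postprocess to the same
# answer; genuinely different predictions point at nondeterminism
# upstream of this step.
print(extract_option("The correct answer is B because..."))  # -> "B"
print(extract_option("I think the answer should be (C)."))   # -> "C"
```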
