Using BARTScore to compare two summaries without human evaluation
I went through the analysis script for comparing two evaluation metrics against human evaluation (i.e., meta-evaluating the metrics).
I wanted to know whether there is a way to compare two summaries using BARTScore alone.
For example, a higher ROUGE score indicates a better summary. Similarly, can we compute BARTScore for two summaries and conclude that the one with the higher BARTScore is better?
Yes, as with ROUGE, the higher the BARTScore, the better the summary.
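To make the "higher is better" comparison concrete, here is a minimal sketch that ranks candidate summaries of the same source by score. The helper `rank_summaries` and the `score_fn` parameter are hypothetical names introduced for illustration; `score_fn` is assumed to be any callable that takes a list of sources and a list of candidates and returns one score per pair. Since BARTScore is a log-likelihood, the values are typically negative, so "higher" means closer to zero.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: (sources, candidates) -> one score per pair.
Scorer = Callable[[Sequence[str], Sequence[str]], List[float]]


def rank_summaries(
    score_fn: Scorer, source: str, summaries: Sequence[str]
) -> List[Tuple[str, float]]:
    """Return (summary, score) pairs sorted best-first (highest score first)."""
    # Score every candidate against the same source document.
    scores = score_fn([source] * len(summaries), list(summaries))
    # Higher score = better, so sort in descending order.
    return sorted(zip(summaries, scores), key=lambda pair: pair[1], reverse=True)


# Possible usage with the scorer from the BARTScore repository (assumption:
# the README-style API `BARTScorer(...).score(srcs, tgts, batch_size=...)`):
#
#   from bart_score import BARTScorer
#   scorer = BARTScorer(device="cuda:0", checkpoint="facebook/bart-large-cnn")
#   ranked = rank_summaries(
#       lambda srcs, tgts: scorer.score(srcs, tgts, batch_size=4),
#       source_text,
#       [summary_a, summary_b],
#   )
```

The ranking is probably only meaningful when both summaries are scored against the same source with the same checkpoint, since the scores depend on the underlying model.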
I had a similar question about interpreting the scores. I understand that higher is better, but I'm still unsure how to interpret absolute values (how high does a score need to be to count as good?). In the paper, what were the absolute scores on the REALSumm and SummEval datasets? That would give a useful reference point.