Weizhe Yuan

Results 10 comments of Weizhe Yuan

On the SummEval dataset, for FLU, COH and INFO, we also used BARTScore(s->h).

Here are some rules we have followed when deciding which BARTScore variant to use. * based on the definition of the evaluation perspective (for example, factuality must rely on the...

We used Python 3.7

You can customize the inputs to the scorer, for example:

```
bart_scorer.score(['my src sentence'], ['my hyp sentence'])
```

I did reproduce this with the model trained on parabank2, although the numbers are slightly different from yours. When changing "I have 2 siblings" to "I have two siblings", the...

The results shown in our paper are the average over all prompts (see section 4.2.2 Settings -- Selection of Prompts for more details). Specifically, we get the score for one...
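As a rough illustration of averaging over prompts, here is a minimal sketch. It assumes a plain arithmetic mean per sample across prompts; the function name `average_over_prompts`, the prompt keys, and the scores are all hypothetical, and the exact procedure in the paper may differ.

```python
from statistics import mean


def average_over_prompts(scores_by_prompt):
    """Average each sample's score across all prompts.

    scores_by_prompt maps a prompt name to a list of scores,
    one score per sample, in the same sample order for every prompt.
    (Hypothetical helper, not part of the BARTScore code base.)
    """
    lengths = {len(scores) for scores in scores_by_prompt.values()}
    if len(lengths) != 1:
        raise ValueError("every prompt needs the same number of sample scores")
    # zip(*...) groups the scores of the same sample across prompts
    return [mean(per_sample) for per_sample in zip(*scores_by_prompt.values())]


# Made-up scores for two samples under two prompts
scores_by_prompt = {
    "prompt_a": [-2.0, -3.0],
    "prompt_b": [-4.0, -1.0],
}
averaged = average_over_prompts(scores_by_prompt)
print(averaged)
```

Each entry of `averaged` is then the prompt-averaged score for one sample.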

I'm not exactly sure, since it's been a long time. You can check the numbers with `analysis.ipynb` to see which one matches the results. As I recall, we may have...

`bart_score.pth` is unrelated to importing the library. It is just our trained model. If you don't specify the checkpoint to be `bart_score.pth`, the default setting will download the `facebook/bart-large-cnn`...

Yes, as with ROUGE, the higher the BARTScore, the better the summary.
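Since BARTScore values are log-likelihoods, they are typically negative, so "higher" means closer to zero. A minimal sketch of picking the best summary from a set of scores follows; the helper `best_candidate`, the candidate strings, and the scores are made up for illustration.

```python
def best_candidate(candidates, scores):
    """Return the candidate with the highest score (higher is better).

    Hypothetical helper, not part of the BARTScore code base.
    """
    if not candidates or len(candidates) != len(scores):
        raise ValueError("need one score per candidate, and at least one candidate")
    # Index of the largest score; for negative scores this is the one closest to zero
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index]


candidates = ["summary A", "summary B", "summary C"]
scores = [-3.2, -1.7, -2.5]  # made-up scores, one per candidate
best = best_candidate(candidates, scores)
print(best)
```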