Daniel Deutsch
@niansong1996 Check out the changes I made in #5207 and #5208. I was able to match the BART paper's scores using [these hyperparameters](https://github.com/pytorch/fairseq/blob/425c36eafff535fe7337f8bdd5ace22ebacc78cb/examples/bart/summarize.py#L10-L11) for min/max sequence length and the length...
Also, to reproduce the ROUGE-L result, you need to sentence-split the summaries before running ROUGE.
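A minimal sketch of that preprocessing step, assuming a naive regex splitter is good enough for illustration (a real run would more likely use a proper sentence tokenizer). The helper name `split_sentences` is hypothetical. The output puts one sentence per line, which is the form the `rougeLsum` variant in the `rouge-score` package expects, since it treats newlines as sentence boundaries.

```python
import re


def split_sentences(summary: str) -> str:
    """Return the summary with one sentence per line.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    This is a rough heuristic, not a real sentence tokenizer.
    """
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    # Drop empty pieces so an empty summary yields an empty string.
    return "\n".join(s for s in sentences if s)


if __name__ == "__main__":
    summary = "The cat sat on the mat. It was warm! Nobody moved it."
    print(split_sentences(summary))
```

Both the reference and the generated summary would go through the same splitting before being scored.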
The AllenNLP ROUGE is fine for training, but you should really use the original package for calculating the final scores on the test set. I think the AllenNLP version operates...
This one is popular: https://github.com/google-research/google-research/tree/master/rouge. It's included in the `rouge-score` pip package (https://pypi.org/project/rouge-score/) and used by Huggingface (https://github.com/huggingface/datasets/blob/67574a8d74796bc065a8b9b49ec02f7b1200c172/metrics/rouge/rouge.py).
Yes, you need to use the length penalty of 2.0 via the [LengthNormalizedSequenceLogProbabilityScorer](https://github.com/allenai/allennlp/blob/39d7e5ae06551fe371d3e16f4d93162e55ec5dcc/allennlp/nn/beam_search.py#L491-L509). I think that should make the difference. I have never used the Google ROUGE package myself, but...
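To illustrate why the length penalty matters, here is a rough sketch of length-normalized scoring. It assumes the scorer divides a hypothesis's summed token log probability by `length ** length_penalty`; that is my reading of the linked code, so check it against the source. The function name `length_normalized_score` is hypothetical and not part of AllenNLP.

```python
from typing import Sequence


def length_normalized_score(
    token_log_probs: Sequence[float], length_penalty: float = 2.0
) -> float:
    """Score a hypothesis as sum(log probs) / length ** length_penalty.

    Assumption: this mirrors the normalization described above; it is an
    illustration, not the library's implementation.
    """
    if not token_log_probs:
        raise ValueError("hypothesis must contain at least one token")
    total = sum(token_log_probs)
    return total / (len(token_log_probs) ** length_penalty)


if __name__ == "__main__":
    short = [-0.5, -0.5]              # 2 tokens
    longer = [-0.5, -0.5, -0.5, -0.5]  # 4 tokens, same per-token log prob
    for penalty in (0.0, 1.0, 2.0):
        print(
            penalty,
            length_normalized_score(short, penalty),
            length_normalized_score(longer, penalty),
        )
```

Because log probabilities are negative, a penalty of 0.0 ranks the shorter hypothesis higher, 1.0 ties the two, and 2.0 ranks the longer one higher, so the setting shifts beam search toward longer summaries.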
I think the selection of the ID would need to be dataset-specific. There isn't really an "official" ordering of the CNN/DailyMail dataset as far as I am aware. Each instance...
I think using the `datasets` IDs when available would be helpful since many people use their dataset readers now. For example, [here](https://github.com/huggingface/datasets/tree/master/datasets/cnn_dailymail) they have unique IDs for the CNN/DailyMail dataset...
Can you be more specific about what is missing? I zipped the output from the aclpub2 tool. This is what I delivered to the publication chairs: https://drive.google.com/file/d/1AOKfrBeEIL1sViZWuAfbsfUw1actbFpg/view?pli=1
Thanks Juri for running the command! The last issue that I was made aware of was that papers.yml was incorrectly formatted. In the original delivery of the proceedings, I just...
Hi Yanjun, There are two versions of ROUGE, one which is a wrapper around the original Perl implementation and one which is written (by me) in Python. For the Perl...