Add METEOR metric
🚀 Feature
Add another NLP metric, METEOR (Lavie and Agarwal, 2007).
Motivation
METEOR is another metric used for machine translation evaluation, similar to BLEU; however, it demonstrates a higher correlation with human judgements of translation quality.
Pitch
Supporting the METEOR metric will likely require the nltk dependency. However, the nltk package is already needed for the ROUGE metric, so METEOR should not bring in any new dependency.
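For illustration, here is a minimal sketch of how the metric could build on the nltk dependency we already have for ROUGE. Treat it as a rough sketch rather than the final design: the exact input requirements of nltk's `meteor_score` differ slightly across nltk versions, and the averaging comment is just one possible way a metric wrapper could aggregate results.

```python
# Minimal sketch, assuming nltk is already installed (as it is for ROUGE).
# Note: nltk >= 3.6.5 expects pre-tokenized inputs, while older releases
# accept raw strings; the wordnet corpus is needed for synonym matching
# (some nltk versions additionally require the "omw-1.4" corpus).
import nltk
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

hypothesis = "the cat sat on the mat".split()
references = ["there is a cat on the mat".split()]

# Returns a float in [0, 1]; a Metric-style wrapper could, for example,
# accumulate these sentence-level scores and average them in compute().
score = meteor_score(references, hypothesis)
print(f"METEOR: {score:.4f}")
```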
Additional context
I will be happy to start working on this feature next week if you'd like it added :)
@stancld sounds great :]
@stancld do you want to take it on your own, or you can promote it on Slack #new_contributions and then supervise it :rabbit:
@Borda I've been working on that. :]
I've almost finished an implementation that resembles the behaviour of the METEOR metric from the nltk package. Once it's ready, I'll send a PR.
The aforementioned implementation is, however, based on the original paper (Lavie and Agarwal, 2007) and is thus somewhat dated.
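For reference (if I'm reading the paper right), the sentence-level scoring there combines unigram precision P and recall R, computed over exact, stem, and synonym matches, with a fragmentation penalty based on how many contiguous chunks the matched unigrams form:

```latex
F_{mean} = \frac{P \cdot R}{\alpha \cdot P + (1 - \alpha) \cdot R}, \qquad
Penalty = \gamma \cdot \left( \frac{\text{chunks}}{\text{matched unigrams}} \right)^{\beta}, \qquad
\text{METEOR} = (1 - Penalty) \cdot F_{mean}
```

The defaults used by nltk are α = 0.9, β = 3, γ = 0.5, which is the behaviour the current implementation aims to match.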
There is also a newer version of this metric, based on the paper by Denkowski and Lavie, 2014. I suggest doing this upgrade in a separate PR: the only implementation of the newer metric appears to be written in Java (see Meteor 1.5), except for this Python wrapper, which directly utilizes the Java implementation, so it may take some extra time to implement. (In general, I am trying to write the current implementation so that it can be adapted to newer versions.)