
Add BLiMP task

Open jumelet opened this issue 4 years ago • 0 comments

One thing I was unsure about is how to report model performance on individual subtasks. Within BLiMP it would be a bit odd to merge all accuracies into a single number, but I can imagine that, given how many different datasets are being considered, we don't necessarily want to split tasks into subtasks as well.

However, if we want that split to be present as well, I can easily add it. Can the `self.metrics` dictionary contain any kind of entry?
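To make the question concrete, here is a minimal sketch of what reporting both levels could look like. It assumes the metrics can be stored as a flat mapping from string keys to floats, which is exactly the open question; the function name `blimp_metrics`, the `accuracy/<subtask>` key scheme, and the toy results are hypothetical and only for illustration.

```python
from collections import defaultdict


def blimp_metrics(results):
    """Aggregate (subtask, is_correct) pairs into a flat metrics dict.

    Hypothetical helper: the key names are an assumption, not an
    existing convention of this repository.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subtask, is_correct in results:
        total[subtask] += 1
        correct[subtask] += int(is_correct)

    if not total:
        # No results: nothing to report, and avoids dividing by zero.
        return {}

    metrics = {}
    for subtask in sorted(total):
        # One entry per subtask, so the split stays visible.
        metrics[f"accuracy/{subtask}"] = correct[subtask] / total[subtask]

    # Macro-average: every subtask counts equally, regardless of its size.
    per_subtask = list(metrics.values())
    metrics["accuracy"] = sum(per_subtask) / len(per_subtask)
    return metrics


# Toy data for illustration only, not real model results.
results = [
    ("anaphor_gender_agreement", True),
    ("anaphor_gender_agreement", False),
    ("npi_present_1", True),
    ("npi_present_1", True),
]
metrics = blimp_metrics(results)
```

With this layout the single `accuracy` entry can be used wherever one number per task is wanted, while the per-subtask entries remain available for a finer breakdown.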

jumelet, Jan 24 '22 15:01