Sasha Luccioni comments

Results: 38 comments by Sasha Luccioni

Add missing metrics

We should probably look into GAN metrics as well, like Kernel Inception Distance (KID), Inception Score (IS) and Fréchet Inception Distance (FID) (maybe we should let people import them directly...
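For context on the last of these, FID is usually stated as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The symbols below are notation chosen here, not taken from the thread: $\mu_r, \Sigma_r$ are the mean and covariance of the real-image features, and $\mu_g, \Sigma_g$ those of the generated-image features.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values mean the two feature distributions are closer.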

Add missing metrics

How about RL metrics? e.g. https://analyticsindiamag.com/metrics-for-reinforcement-learning/

Computer vision metrics: [SSIM](https://en.wikipedia.org/wiki/Structural_similarity) and [PSNR](https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio). There are also various [object detection metrics](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/evaluation_protocols.md) implemented by TensorFlow.
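To give a sense of how small such a metric can be, here is a minimal sketch of PSNR in plain Python. It assumes images are passed as flat, equal-length sequences of pixel values with a known maximum (255 for 8-bit images); the function name `psnr` and its signature are illustrative, not an existing API in this repository.

```python
import math


def psnr(reference, prediction, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two flat pixel sequences.

    Illustrative helper: the name and signature are assumptions made here.
    """
    if len(reference) != len(prediction):
        raise ValueError("reference and prediction must have the same length")
    if len(reference) == 0:
        raise ValueError("inputs must be non-empty")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher is better; identical inputs return infinity because the error term is zero.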

Feature: integration standard libraries

We could also connect to [`GEM`](https://github.com/GEM-benchmark/GEM-metrics) as per @yjernite's [proposal](https://github.com/huggingface/evaluate/issues/10#issuecomment-1098485145) in another thread.

Feature: integration standard libraries

And to [Skimage](https://scikit-image.org/docs/dev/api/skimage.metrics.html)

Feature: integration standard libraries

And to [Torch Fidelity](https://github.com/toshas/torch-fidelity) -- for generative metrics

Feature: standardize inputs/outputs of metrics

As per our meeting today, we proposed a standardized structure for inputs, in dictionary form. An initial proposal for that structure could be: ``` { "references": , "predictions": ,...
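A rough sketch of what consuming such a dictionary could look like. It relies only on the two keys visible in the excerpt above (`"references"` and `"predictions"`); the rest of the proposal is cut off and is not guessed at here. The helper names `validate_inputs` and `accuracy`, and the choice to return a dictionary, are assumptions for illustration.

```python
def validate_inputs(inputs):
    """Check for the two keys shown in the proposal and that they line up.

    Hypothetical helper, not part of the library.
    """
    required = ("references", "predictions")
    missing = [key for key in required if key not in inputs]
    if missing:
        raise KeyError(f"missing required keys: {missing}")
    if len(inputs["references"]) != len(inputs["predictions"]):
        raise ValueError("references and predictions must have the same length")
    return inputs


def accuracy(inputs):
    """Toy metric that reads the standardized dict and returns a dict."""
    inputs = validate_inputs(inputs)
    refs, preds = inputs["references"], inputs["predictions"]
    if len(refs) == 0:
        raise ValueError("inputs must be non-empty")
    correct = sum(r == p for r, p in zip(refs, preds))
    return {"accuracy": correct / len(refs)}
```

With a shared shape like this, every metric can run the same validation before doing its own computation.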

Feature: standardize inputs/outputs of metrics

Sorry, indeed, I should have distinguished inputs and parameters :)

On Mon, Apr 11, 2022 at 6:00 AM Quentin Lhoest wrote:
> Just to distinguish the two cases: sources...

Feature: standardize inputs/outputs of metrics

Sure, that could work! I mean, having two separate versions of the code seems a bit redundant, but I agree that the way it's implemented now makes it stand out...

Feature: standardize inputs/outputs of metrics

The only metrics that violate the proposed format are in the *Edge* cases in my comment above -- all the other metrics that we currently have are compatible with it....