Sasha Luccioni

38 comments by Sasha Luccioni

We should probably look into GAN metrics as well, like Kernel Inception Distance (KID), Inception Score (IS) and Fréchet Inception Distance (FID) (maybe we should let people import them directly...

How about RL metrics? e.g. https://analyticsindiamag.com/metrics-for-reinforcement-learning/

Computer vision metrics: [SSIM](https://en.wikipedia.org/wiki/Structural_similarity) and [PSNR](https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio). There are also various [object detection metrics](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/evaluation_protocols.md) implemented by TensorFlow.
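For reference, PSNR is simple enough to sketch in a few lines. This is a minimal pure-Python illustration of the standard definition, 10 * log10(MAX^2 / MSE), and not the implementation of any particular library; the function name `psnr`, the flat-list input, and the default `max_value=255.0` (8-bit images) are assumptions made for the example.

```python
import math


def psnr(reference, prediction, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two flat pixel sequences.

    Hypothetical helper for illustration: `max_value` is the largest
    possible pixel value (255 assumed here for 8-bit images).
    """
    if len(reference) != len(prediction):
        raise ValueError("reference and prediction must have the same length")
    if not reference:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A real image would first be flattened to a single sequence of pixel values; array libraries do the same computation vectorized.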

We could also connect to [`GEM`](https://github.com/GEM-benchmark/GEM-metrics) as per @yjernite's [proposal](https://github.com/huggingface/evaluate/issues/10#issuecomment-1098485145) in another thread.

And to [Skimage](https://scikit-image.org/docs/dev/api/skimage.metrics.html)

And to [Torch Fidelity](https://github.com/toshas/torch-fidelity) -- for generative metrics

As per our meeting today, we proposed a standardized structure for inputs, in dictionary form. An initial proposal for that structure: ``` { "references": , "predictions": ,...
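To make the dictionary-form idea concrete, here is a small sketch of a metric that takes its inputs as a single dictionary. It uses only the two keys visible in the excerpt above (`"references"` and `"predictions"`); the function name `compute_accuracy`, the list-valued inputs, and the dictionary return value are assumptions for illustration, not the structure that was actually proposed.

```python
def compute_accuracy(inputs):
    """Hypothetical metric that reads its inputs from one dictionary.

    Assumes `inputs` has the keys "references" and "predictions",
    each mapping to a list of labels of the same length.
    """
    references = inputs["references"]
    predictions = inputs["predictions"]

    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("inputs must not be empty")

    # Count positions where the prediction matches the reference.
    correct = sum(r == p for r, p in zip(references, predictions))
    return {"accuracy": correct / len(references)}


result = compute_accuracy({
    "references": [0, 1, 1, 0],
    "predictions": [0, 1, 0, 0],
})
print(result)  # {'accuracy': 0.75}
```

Keeping every metric behind the same dictionary shape would let callers swap metrics without changing how inputs are passed.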

Sorry, indeed, I should have distinguished inputs and parameters :)
On Mon, Apr 11, 2022 at 6:00 AM Quentin Lhoest wrote:
> Just to distinguish the two cases: sources...

Sure, that could work! I mean, having two separate versions of the code seems a bit redundant, but I agree that the way it's implemented now makes it stand out...

The only metrics that violate the proposed format are in the *Edge* cases in my comment above -- all the other metrics that we currently have are compatible with it....