Separation test precision on OSX

Open • bmcfee opened this issue 2 years ago • 1 comment

Looks like we're hitting numerical precision issues again, sometimes, in the source separation tests on OSX. See https://github.com/craffel/mir_eval/actions/runs/8426164125/job/23106646394?pr=374

The tests passed previously (e.g., in #370), and since we're now seeding the RNG before every test function execution, the deviation must be coming from the underlying BLAS implementation and/or some interaction with the hardware it's running on.

I've been seeing similar weirdness in other packages (pescador, librosa) lately, and there are some strange things happening with openblas and xsimd that expose the non-associativity of floating point arithmetic in cases like this.
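To make the non-associativity point concrete, here is a minimal, standard-library-only sketch. It does not reproduce the openblas/xsimd behaviour itself; it only illustrates that regrouping or reordering the same additions can change the rounded result, which is the kind of variation a BLAS build may introduce when it vectorizes or reorders a reduction. The helper name `sequential_sum` and the synthetic data are assumptions made for illustration.

```python
import random

# Same three numbers, two groupings: the rounded results differ in the last bit.
a, b, c = 0.1, 0.2, 0.3
left_grouped = (a + b) + c    # 0.6000000000000001
right_grouped = a + (b + c)   # 0.6
grouping_differs = left_grouped != right_grouped


def sequential_sum(values):
    """Naive left-to-right accumulation (hypothetical helper for this sketch)."""
    total = 0.0
    for v in values:
        total += v
    return total


# Summing identical values in a different order is mathematically the same
# operation, but each ordering can accumulate a different rounding error.
rng = random.Random(0)  # fixed seed: the inputs are identical on every run
values = [rng.uniform(-1.0, 1.0) * 10.0 ** rng.randint(-8, 8) for _ in range(10000)]
forward = sequential_sum(values)
backward = sequential_sum(reversed(values))
order_gap = abs(forward - backward)

print(grouping_differs, order_gap)
```

Seeding fixes the inputs, not the order in which a numerical library accumulates them, which is consistent with seeing deviations even though the RNG is seeded.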

I propose "fixing" this by detecting the execution platform and raising the atol parameter on OSX deployments in test_separation.py. Since the separation metrics are in decibels, I don't think we should be too concerned about raising the tolerance from 0.01 dB to 0.05 dB, and keeping the stricter tolerance on better-behaved platforms should keep us safe.
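A rough sketch of what that platform-dependent tolerance could look like is below. This is not the actual contents of test_separation.py: the names `STRICT_ATOL`, `RELAXED_ATOL`, `separation_atol`, and `scores_match` are hypothetical, and `math.isclose` stands in for whatever comparison the test suite really uses. The only values taken from the proposal are the two tolerances, and the only platform fact relied on is that `sys.platform` is `"darwin"` on OSX.

```python
import math
import sys

# Tolerances in dB, taken from the proposal; the constant names are hypothetical.
STRICT_ATOL = 0.01
RELAXED_ATOL = 0.05


def separation_atol(platform=None):
    """Return the absolute tolerance to use for the given platform.

    Defaults to the running interpreter's platform; passing a string makes
    the choice easy to exercise from any machine.
    """
    platform = sys.platform if platform is None else platform
    return RELAXED_ATOL if platform == "darwin" else STRICT_ATOL


def scores_match(actual, expected, platform=None):
    """Compare two sequences of dB scores using a purely absolute tolerance."""
    atol = separation_atol(platform)
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=atol)
        for a, e in zip(actual, expected)
    )


# A 0.03 dB deviation is accepted on OSX but still rejected elsewhere.
print(scores_match([10.03], [10.0], platform="darwin"))  # True
print(scores_match([10.03], [10.0], platform="linux"))   # False
```

Keeping the platform as an argument means the stricter default stays in force everywhere except OSX, and both branches can be checked without an OSX runner.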

Meanwhile, I don't think separation test failures on OSX should be a blocker to merging unrelated PRs (e.g., #374).

bmcfee, Mar 26 '24 14:03

Punting this from the milestone since #382 renders it irrelevant.

bmcfee, Aug 16 '24 13:08