POT Getting Error while computing sinkhorn distance

I got the error below:

/lib/python2.7/site-packages/ot/bregman.py:347: RuntimeWarning: invalid value encountered in multiply
  Kp = (1 / a).reshape(-1, 1) * K
('Warning: numerical errors at iteration', 0)

Command: ot.sinkhorn(a=input_vector, b=output_vector, M=distance_matrix, reg=0.01, verbose=True)

Details:
input_vector.shape: (8342,) [sums up to 1]
output_vector.shape: (8342,) [sums up to 1]
distance_matrix.shape: (8342,8342) [Euclidean distance]

What could be the issue here? Please assist.

Also, ot.sinkhorn returns the OT matrix. How can we convert it to a single number that is the equivalent of a distance?

Jul 02 '18 09:07 manishbansal-fk

Hello,

My colleague and I are getting the same Sinkhorn warnings as above. They make the whole sinkhorn2 computation go to 0 whenever there are zeros in the arrays or when you choose a small regularizer.

There are two ways to fix it, as far as my colleague and I are aware:
a) add a small number to your vectors, say 1e+6
b) use http://pot.readthedocs.io/en/stable/all.html#ot.utils.clean_zeros and see #30 - you can write such a function yourself too.
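To make the cause and option b) concrete, here is a minimal NumPy-only sketch. It assumes the warning comes from an `inf * 0` product: a zero in `a` makes `1 / a` infinite, and a small `reg` makes entries of `K = exp(-M / reg)` underflow to exactly zero. `drop_zero_bins` is a hypothetical helper written for this illustration; it is meant to do the same job as `ot.utils.clean_zeros`, but it is not POT's code.

```python
import numpy as np

def drop_zero_bins(a, b, M):
    """Hypothetical helper: remove zero-mass bins from a and b,
    together with the matching rows and columns of M."""
    keep_a = a > 0
    keep_b = b > 0
    return a[keep_a], b[keep_b], M[np.ix_(keep_a, keep_b)]

a = np.array([0.5, 0.0, 0.5])          # histogram with an empty bin
b = np.array([0.25, 0.25, 0.5])
M = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0],
              [4.0, 1.0, 0.0]])
reg = 1e-3

K = np.exp(-M / reg)                    # off-diagonal entries underflow to 0.0
with np.errstate(divide="ignore", invalid="ignore"):
    Kp = (1 / a).reshape(-1, 1) * K     # second row: inf * 0 gives nan
has_nan = bool(np.isnan(Kp).any())

# Same expression after dropping the empty bin: 1 / a2 is finite everywhere.
a2, b2, M2 = drop_zero_bins(a, b, M)
Kp2 = (1 / a2).reshape(-1, 1) * np.exp(-M2 / reg)
still_nan = bool(np.isnan(Kp2).any())
```

Dropping the empty bins only removes the `nan`; with a `reg` this small the kernel still underflows, so the choice of regularization remains a separate question.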

Finding the right regularization term, however, is another point we're trying to figure out, since for a small value such as reg=1e-3 the distance is computed as zero. I am not sure if there's a recipe for this; maybe the literature has some pointers?

You can use ot.sinkhorn2() - this will return the regularized distance.
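For reference, the single number usually meant here is the transport cost: the sum of the elementwise product of the OT matrix and the cost matrix. As far as I understand, that is the quantity ot.sinkhorn2() reports. Below is a minimal NumPy sketch of it; `sinkhorn_plan` is an illustrative stand-in for `ot.sinkhorn`, not POT's implementation, and the toy inputs are made up.

```python
import numpy as np

def sinkhorn_plan(a, b, M, reg, n_iter=200):
    """Illustrative plain Sinkhorn loop returning the transport plan."""
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
M = np.array([[0.0, 1.0],
              [1.0, 0.0]])

G = sinkhorn_plan(a, b, M, reg=0.1)   # OT matrix, same shape as M
cost = float(np.sum(G * M))           # single number: <G, M>
```

With POT, the plan `G` would come from `ot.sinkhorn(a, b, M, reg)` and the same `np.sum(G * M)` applies.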

Jul 02 '18 11:07 patricieni

Hello @patricieni,

Thank you for your detailed answer, I was writing mine when yours appeared but you did a quicker and better job.

I think we should add an additional regularization parameter to sinkhorn that automatically adds a small value, or adds a small term during the algorithm, in order to avoid this kind of numerical problem. Still, even though the algorithm would no longer fail, convergence can be rather slow for small regularization.

Jul 02 '18 11:07 rflamary

Thanks @rflamary - we've had some issues with using Sinkhorn because of that, plus a lot of precision errors in Python.

I can submit a pull request for that extra parameter for Sinkhorn if you want? I was going to ask about submitting a pull request for implementing Greenkhorn as well, but that would be a bit of work.

Jul 03 '18 10:07 patricieni

Yes a PR would be nice.

Basically we just need to add a small eps value in the divisions around the following lines: https://github.com/rflamary/POT/blob/master/ot/bregman.py#L357

This eps (1e-16 by default) should be passed as a parameter of the sinkhorn_knopp function and should also exist as a parameter of the sinkhorn function (the entry point).

As a matter of fact, I just saw in the code that we implemented this trick in the sinkhorn_stabilized function, but with a fixed value of eps: https://github.com/rflamary/POT/blob/master/ot/bregman.py#L554. Maybe we should add eps as a parameter of this function as well.

Also don't forget to update the documentation of both functions if you add a parameter.

I don't like adding a value to the histograms inside sinkhorn, since it basically returns a false solution; the epsilon above is just a simple stabilization of the algorithm that works even when the histograms have 0 values.
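A rough sketch of the idea, in plain NumPy rather than the actual bregman.py code: the histograms are left untouched and eps only enters the denominators of the two scaling updates. The function name `sinkhorn_knopp_eps`, the exact placement of eps, and the toy data are assumptions made for illustration.

```python
import numpy as np

def sinkhorn_knopp_eps(a, b, M, reg, eps=1e-16, n_iter=200):
    """Sketch of Sinkhorn-Knopp with eps added to each denominator.
    The histograms a and b themselves are not modified."""
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u + eps)
        u = a / (K @ v + eps)
    return u[:, None] * K * v[None, :]

# Three points on a line; the middle bin is empty in both histograms
# and far enough away that its kernel entries underflow to zero.
x = np.array([0.0, 10.0, 0.1])
M = np.abs(x[:, None] - x[None, :])
a = np.array([0.5, 0.0, 0.5])
b = np.array([0.25, 0.0, 0.75])

G = sinkhorn_knopp_eps(a, b, M, reg=0.01)       # finite plan

# Same loop with eps=0: the empty bin produces 0 / 0 and nan spreads.
with np.errstate(divide="ignore", invalid="ignore"):
    G_plain = sinkhorn_knopp_eps(a, b, M, reg=0.01, eps=0.0)
```

In this toy case the eps version returns a plan whose marginals match `a` and `b`, with the empty bin carrying no mass, while the eps=0 run ends in `nan`.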

Greenkhorn would be nice obviously (we have been thinking about it also) but it is indeed more work and will require several tests in addition to all the coding and documentation.

Jul 03 '18 11:07 rflamary

Thanks @rflamary I will try and have a go at it in that case!

On a slightly different note, or perhaps related, I'm trying to run emd2 on MNIST and for some reason I have not managed to normalize properly (arrays never sum to 1.0) no matter what methods I use (sklearn.preprocessing, math.fsum(), np.sum()) resulting in problem infeasible, simplex errors when computing emd2. I'm using either float32 or float64.

More specifically, there seems to be a 1e-8 tolerance in the C++ LP solver from the ot library (I saw that while debugging). In Python we can work around it by adding or removing the extra difference so that the sums fall within the solver's tolerance, but on GPUs that does not work. This might be related to the single-precision arithmetic done on the GPU (I think). So the question is: how would you go about normalizing correctly to remove the error when solving emd2, and do you have any pointers for how to do it on GPU?
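One thing that may be worth checking on the CPU side, sketched below under the assumption that the infeasibility comes from the two histograms having slightly different total mass: cast to float64 before dividing by the sum, so the normalization itself is done in double precision. The random arrays stand in for flattened MNIST images and `to_histogram` is a hypothetical helper. Whether this carries over to the GPU case is not established here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for two flattened 28x28 images stored as float32.
img_a = rng.random(784).astype(np.float32)
img_b = rng.random(784).astype(np.float32)

def to_histogram(x):
    """Hypothetical helper: cast to float64 first, then normalize."""
    x = np.asarray(x, dtype=np.float64)
    return x / x.sum()

a = to_histogram(img_a)
b = to_histogram(img_b)

# Difference in total mass between the two histograms.
mass_gap = abs(a.sum() - b.sum())
```

The value to compare against the solver tolerance is `mass_gap`, not how close each sum is to exactly 1.0.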

Jul 04 '18 10:07 patricieni

Hi @patricieni, sorry for the late reply. Could you post a short piece of code that reproduces your error? Thanks in advance.

Sep 07 '18 13:09 ncourty

a) add a small number to your vectors, say 1e+6

😂

Jun 12 '19 17:06 jilljenn

Closing this issue because we implemented the solver option method='sinkhorn_log', which is numerically stable. See:
https://pythonot.github.io/all.html#ot.sinkhorn
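To give an idea of why a log-domain solver is stable, here is a minimal NumPy sketch of log-domain Sinkhorn iterations on the dual potentials. It is an illustration of the general technique, not the POT implementation; `sinkhorn_log_sketch` and `logsumexp` are names chosen for this example, and the sketch assumes strictly positive histograms.

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def sinkhorn_log_sketch(a, b, M, reg, n_iter=2000):
    """Log-domain Sinkhorn on potentials f, g (assumes a > 0 and b > 0).
    The plan is exp((f_i + g_j - M_ij) / reg); K = exp(-M / reg) is never formed."""
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros_like(a)
    g = np.zeros_like(b)
    for _ in range(n_iter):
        g = reg * (log_b - logsumexp((f[:, None] - M) / reg, axis=0))
        f = reg * (log_a - logsumexp((g[None, :] - M) / reg, axis=1))
    return np.exp((f[:, None] + g[None, :] - M) / reg)

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
M = np.array([[0.0, 1.0],
              [1.0, 0.0]])
reg = 1e-3

kernel_underflows = bool((np.exp(-M / reg) == 0).any())  # plain kernel has exact zeros
G = sinkhorn_log_sketch(a, b, M, reg)
cost = float(np.sum(G * M))
```

With POT itself, the corresponding call is `ot.sinkhorn(a, b, M, reg, method='sinkhorn_log')`. Note that the sketch needs many iterations at this small `reg`, in line with the earlier remark that convergence can be slow for small regularization.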

Jan 15 '24 12:01 rflamary