tensorly icon indicating copy to clipboard operation
tensorly copied to clipboard

Problematic init for PARAFAC2 with non-negativity only on A

Open MarieRoald opened this issue 4 years ago • 6 comments

One thing we did not consider when we implemented PARAFAC2 is that by reformulating it, so we have one B-matrix for each row of A instead of one B-matrix for each row of C, we change the order of the updates. Intuitively, this shouldn't have a big effect. However, PARAFAC2 has a special sign indeterminacy for the Bk-matrices, and it is common to resolve this via non-negativity constraints only on A (Helwig, N. Psychometrika, 2013). A consequence of this is that if one C and B-component "points in the opposite direction of the data", then the corresponding component in A should be all negative, which leads it to be all zero.

This is not a big problem for completely non-negative tensor decompositions since it is then unlikely (if not impossible?) that we observe this problem. However, PARAFAC2 is often fitted with non-negativity on only one or two modes, but never all three (with ALS at least), so this problem is more likely.

Whenever this problem occurs, then we get a cryptic SingularMatrixError, with no information that this is the result of a bad initialisation. Maybe we should have a better error message here (to show that it comes from a bad initialisation)?

I found two ways that seemed to remedy this problem. The first is to initialise the B-matrices orthogonal by initialising the B-matrix (which we multiply with Pk to obtain Bk) as an identity matrix. This is what the MATLAB implementation does, but I'm not sure if this will work in all cases. The second option was to have an option to specify the ordering of the modes we update in the nn_cp_hals function, since if we update A last, then the other factor matrices are already (probably) "aligned with the data". Then we can use this option within the parafac2 function to update the factor matrices in reverse order, and that way match the update order of the reference implementation.

MarieRoald avatar Jul 01 '21 17:07 MarieRoald

Nice catch @MarieRoald!

I can also think of another option: check at the first iteration if A is all-negative before clipping. If this is the case,

  • print a warning advising the user to user a better initialization (svd-based or even parafac)
  • revert the update of A, or do not clip for that first iteration

I do not like so much the idea of having the update order in the signature as 99.99% of people will not care about that and it will make the doc even harder to read to fix a problem that we can adress in other ways. Your first solution would work however, but I guess any nice initialization for the factors would avoid this issue as well!

cohenjer avatar Jul 06 '21 08:07 cohenjer

I lost track of this -- is it still ongoing?

JeanKossaifi avatar Dec 28 '22 18:12 JeanKossaifi

I can fix this easily with the second option, just wanted to check @MarieRoald's opinion.

cohenjer avatar Jan 03 '23 08:01 cohenjer

@MarieRoald is unfortunately sick, but my impression is that it is still a problem.

Also, I think this should be a problem with non-negative PARAFAC too if you have non-negativity on only some of the modes?

yngvem avatar Jan 03 '23 19:01 yngvem

Sorry to hear :/

Indeed we can in principle have a similar issue with any nonnegative algorithm. I think the nnCP algorithms are more robust to this problem:

  • Multiplicative Updates does not force true zeros
  • HALS updates one column at a time, so having a fully negative factor is very unlikely unless the data is entirely negative (in which case, why use nonnegative CP? the solution is 0 anyways).

Nevertheless, we run into a different problem: once a column is set to 0 in HALS, the corresponding component is 0 in the rest of the algorithm. Therefore we can be stuck in this saddle point.

A fix proposed in the literature, which is actually required to obtain convergence guarantees, is to instead impose factors to be $\geq \epsilon$ for some small $\epsilon$. This way the ``zero-locking'' phenomenon disappears, as well as division by 0 in Parafac2. This is referenced for instance in this book and discussed first in this publication, also this one. To obtain true zeroes back, we can post-process the factors after convergence.

I can make a tentative PR for this; it was one of the things I wanted to improve in HALS anyways to ensure convergence and improve numerical stability.

cohenjer avatar Jan 04 '23 08:01 cohenjer

(Arrived here because I have hit this SingularMatrixError.)

Maybe I am missing something, but could we simply transpose the projection tensor before running CP, and then reverse the factors order after? It should also be easy enough to modify the non-negative list to the new indices. I know this feels like a hack, but it solves the issue with little additional complexity, and doesn't modify CP in any way. I like the ordering of PARAFAC2 here because it makes all the list iterations simple in Python syntax.

(Improving HALS is also great, in addition to this.)

aarmey avatar Jan 24 '23 15:01 aarmey