lapack Fix sorcsd

Closes #695.

Revert "SORCSD2BY1: remove dead code" This reverts commit d245b4f6ef5ed18cff4ef53d75a96b49f259bc3a.

Revert "SORCSD: fix documentation on matrix dimensions" This reverts commit bdcd890a185482119c52dd7acd9a702f0cad782a.

Keep corrections in the documentation, which make sense.

Aug 05 '22 21:08 weslleyspereira

The double precision tests still show: 39 failing tests. See https://github.com/Reference-LAPACK/lapack/actions/runs/2806513051

My personal machine found 16 tests failing for the CS decomposition in double precision:

Command: "/usr/bin/cmake" "-DTEST=/home/weslleyp/storage/lapack/build_RelWithDebInfo/bin/xeigtstd" "-DINPUT=/home/weslleyp/storage/lapack/TESTING/csd.in" "-DOUTPUT=/home/weslleyp/storage/lapack/build_RelWithDebInfo/TESTING/dcsd.out" "-DINTDIR=." "-P" "/home/weslleyp/storage/lapack/TESTING/runtest.cmake"
Directory: /home/weslleyp/storage/lapack/build_RelWithDebInfo/TESTING
"LAPACK-xeigtstd_csd_in" start time: Aug 05 17:04 MDT
Output:
----------------------------------------------------------
Running: /home/weslleyp/storage/lapack/build_RelWithDebInfo/bin/xeigtstd
ARGS= OUTPUT_FILE;/home/weslleyp/storage/lapack/build_RelWithDebInfo/TESTING/dcsd.out;ERROR_FILE;/home/weslleyp/storage/lapack/build_RelWithDebInfo/TESTING/dcsd.out.err;INPUT_FILE;/home/weslleyp/storage/lapack/TESTING/csd.in
Test OUTPUT:

 Tests of the CS Decomposition routines

 LAPACK VERSION 3.*.1

 The following parameter values will be used:
    M:         0    10    10    10    10    21    24    30    22    32

    P:         0     4     4     0    10     9    10    20    12    12

    N:         0     0    10     4     4    15    12     8    20     8


 Relative machine underflow is taken to be    0.222507-307
 Relative machine overflow  is taken to be    0.179769+309
 Relative machine precision is taken to be    0.111022D-15

 Routines pass computational tests if test ratio is less than   30.00


 CSD routines passed the tests of the error exits (  8 tests done)

 CSD: CS Decomposition
 Matrix types: 
    1: Random orthogonal matrix (Haar measure)
    2: Nearly orthogonal matrix with uniformly distributed angles atan2( S, C ) in CS decomposition
    3: Random orthogonal matrix with clustered angles atan2( S, C ) in CS decomposition
 Test ratios: 
   2-by-2 CSD
    1: norm( U1' * X11 * V1 - C ) / ( max(  P,  Q) * max(norm(I-X'*X),EPS) )
    2: norm( U1' * X12 * V2-(-S)) / ( max(  P,M-Q) * max(norm(I-X'*X),EPS) )
    3: norm( U2' * X21 * V1 - S ) / ( max(M-P,  Q) * max(norm(I-X'*X),EPS) )
    4: norm( U2' * X22 * V2 - C ) / ( max(M-P,M-Q) * max(norm(I-X'*X),EPS) )
    5: norm( I - U1'*U1 ) / (   P   * EPS )
    6: norm( I - U2'*U2 ) / ( (M-P) * EPS )
    7: norm( I - V1'*V1 ) / (   Q   * EPS )
    8: norm( I - V2'*V2 ) / ( (M-Q) * EPS )
    9: principal angle ordering ( 0 or ULP )
   2-by-1 CSD
   10: norm( U1' * X11 * V1 - C ) / ( max(  P,  Q) * max(norm(I-X'*X),EPS) )
   11: norm( U2' * X21 * V1 - S ) / ( max(  M-P,Q) * max(norm(I-X'*X),EPS) )
   12: norm( I - U1'*U1 ) / (   P   * EPS )
   13: norm( I - U2'*U2 ) / ( (M-P) * EPS )
   14: norm( I - V1'*V1 ) / (   Q   * EPS )
   15: principal angle ordering ( 0 or ULP )
 M=  21 P=   9, Q=  15, type  2, test 10, ratio= 0.802656E+11
 M=  21 P=   9, Q=  15, type  2, test 11, ratio= 0.988228E+11
 M=  21 P=   9, Q=  15, type  4, test 11, ratio= 0.600480E+15
 M=  24 P=  10, Q=  12, type  4, test 10, ratio= 0.375300E+15
 M=  24 P=  10, Q=  12, type  4, test 11, ratio= 0.643371E+15
 M=  30 P=  20, Q=   8, type  2, test 10, ratio= 0.280538E+12
 M=  30 P=  20, Q=   8, type  2, test 11, ratio= 0.267436E+12
 M=  22 P=  12, Q=  20, type  1, test 10, ratio= 0.217202E+14
 M=  22 P=  12, Q=  20, type  1, test 11, ratio= 0.260562E+14
 M=  22 P=  12, Q=  20, type  4, test 11, ratio= 0.450360E+15
 M=  32 P=  12, Q=   8, type  1, test 10, ratio= 0.896914E+15
 M=  32 P=  12, Q=   8, type  1, test 11, ratio= 0.491797E+15
 M=  32 P=  12, Q=   8, type  3, test 10, ratio= 0.211857E+15
 M=  32 P=  12, Q=   8, type  3, test 11, ratio= 0.171219E+14
 M=  32 P=  12, Q=   8, type  4, test 10, ratio= 0.375300E+15
 M=  32 P=  12, Q=   8, type  4, test 11, ratio= 0.225180E+15
 CSD:     16 out of    600 tests failed to pass the threshold


 End of tests
 Total time used =         0.01 seconds


Test ERROR:

Test /home/weslleyp/storage/lapack/build_RelWithDebInfo/bin/xeigtstd returned 0

Any idea where they could be, @christoph-conrads ?

Maybe related to that: I found an inconsistency between single and double precision codes, and proposed deb3d85984335ac638876d8a7705542b594a299d as a fix. I had the same failing tests.

Aug 05 '22 23:08 weslleyspereira

my tests in #695 showed you'd need to restore dorbdb6.f to fix the 39 errors.

Aug 06 '22 06:08 martin-frbg

my tests in #697 showed you'd need to restore dorbdb6.f to fix the 39 errors.

Thanks, @martin-frbg! I completely bypassed your comment: https://github.com/Reference-LAPACK/lapack/issues/695#issuecomment-1203569363. I will apply those changes too.

Aug 06 '22 09:08 weslleyspereira

Seems what breaks the tests is the changes around line 266 (257 in the 3.10.1 version) where the conditional for the early RETURN was changed and elements set to zero that were previously left alone. (Probably the latter is the actual cause, still testing right now)

Aug 06 '22 10:08 martin-frbg

Yes minimal fix is just

       IF( NORM_NEW .LE. N * EPS * NORM ) THEN
-         DO IX = 1, 1 + (M1-1)*INCX1, INCX1
-           X1( IX ) = ZERO
-         END DO
-         DO IX = 1, 1 + (M2-1)*INCX2, INCX2
-           X2( IX ) = ZERO
-         END DO
          RETURN
       END IF

but I do not see any benefit from any of the other changes in dorbdb6.f - maybe I'd need to be a mathematician rather than a lowly computational chemist for that ?

Aug 06 '22 10:08 martin-frbg

@martin-frbg, your minimal fix makes my tests pass indeed. Thanks!

However, I understand that setting X1 and X2 to this code is important for dorbdb5. See:

https://github.com/Reference-LAPACK/lapack/blob/3381a0ec8249be684428421bd30ed388de93fc47/SRC/dorbdb5.f#L218-L226

The point here, as I understand, was to identify zero projections. If a zero projection is found, use a different projection:

https://github.com/Reference-LAPACK/lapack/blob/3381a0ec8249be684428421bd30ed388de93fc47/SRC/dorbdb5.f#L228-L264

I will let @christoph-conrads comment on that before making additional changes.

Aug 08 '22 15:08 weslleyspereira

The reason for modifying the xORBDB6 return value can be found in #634. xORBDB5 is calling xORBDB6 to determine if a vector x lies in the range of a matrix A or not. xORBDB6 is a procedure (a function with return type void in C). Hence, the only way to determine if x lies in ran(A) was to compute the norm of x after xORBDB6 returned and check how close it is to zero. xORBDB5 and xORBDB6 used different criteria for the zero check leading to a bug in xORCSD2BY1. To fix the issue and after consulting with the original author Brian D. Sutton, it was decided to set x explicitly to zero to save the caller from having to re-implement parts of the xORBDB6 logic.

Assuming setting x explicitly to zero is a problem, one could turn xORBDB6 into a function returning a boolean. Modifying the entire vector x and then computing its norm only to figure out the result of xORBDB6 is a tad bit too complicated anyway but this solution did not break the API.

Aug 08 '22 20:08 christoph-conrads

I have rebased with the current master ( that includes #702 ), and then all tests pass in my machine. I am now running the tests on Github Actions. Thanks @mjacobse for noticing the issue with the types!

Aug 10 '22 13:08 weslleyspereira

I will merge this PR since the failing tests in double precision were fixed by #702.

[Edit:] Actually, can someone review my changes so I can merge it? Thanks!

Aug 10 '22 23:08 weslleyspereira

@weslleyspereira Thanks for fixing the issues. The changes make sense and I support merging your PR in.

Sep 18 '22 15:09 angsch