_solve_adjoint_derivative_dense much slower than np.linalg.solve
diffcp is installed with OpenMP flags:
MARCH_NATIVE=1 OPENMP_FLAG="-fopenmp" pip install diffcp
It is at least 5 times slower than np.linalg.solve.
An Eigen dense solve should not be much slower than np.linalg.solve.
I'm reporting this here in case the performance can be improved.
We force Eigen to be single-threaded so that we can multithread diffcp itself. np.linalg.solve, on the other hand, is almost certainly multithreaded. That might explain the 5x difference (which is probably close to the number of cores you have).
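One way to check this explanation is to time np.linalg.solve with its BLAS backend restricted to one thread and compare that number against the diffcp timing. The snippet below is only a sketch: the helper name time_dense_solve, the matrix size, and the repeat count are made up for illustration, and which of the environment variables actually takes effect depends on the BLAS library your NumPy build is linked against.

```python
import os

# Assumption: these variables are read when the BLAS library is loaded,
# so they must be set before NumPy is imported for the first time.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import time

import numpy as np


def time_dense_solve(n=500, repeats=5, seed=0):
    """Hypothetical helper: best wall-clock time of np.linalg.solve on an n x n system."""
    rng = np.random.default_rng(seed)
    # Adding n * I makes the random matrix well conditioned.
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    b = rng.standard_normal(n)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        x = np.linalg.solve(A, b)
        best = min(best, time.perf_counter() - start)

    # Residual norm as a sanity check that the solve is correct.
    residual = np.linalg.norm(A @ x - b)
    return best, residual


if __name__ == "__main__":
    best, residual = time_dense_solve()
    print(f"best time: {best:.6f} s, residual: {residual:.2e}")
```

Running the script once as written and once with the environment-variable loop removed should show how much of the gap comes from threading alone, assuming the variables are honored by your backend.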
(Sent by email in reply to Zichao Yang's report of Fri, Aug 7, 2020, 7:51 PM; issue: https://github.com/cvxgrp/diffcp/issues/38)
It seems diffcp uses many cores in the backward pass even when batch_size = 1?