AppleAccelerate SVD is slower than libopenblas64

Open OliverDudgeon opened this issue 1 year ago • 0 comments

I was benchmarking this on my M4 Pro MacBook Pro (10 cores) and noticed that, although I get benefits for some BLAS operations (matrix multiplications etc.), SVD is a fair bit slower than with libopenblas64. Here are the results I get on my machine:

  • AppleAccelerate
In [1]: using LinearAlgebra, BenchmarkTools, AppleAccelerate

In [2]: A = rand(ComplexF32, 2000, 2000);

In [3]: @btime svd(A);
  1.548 s (21 allocations: 169.04 MiB)

  • libopenblas64 (10 BLAS threads)
In [1]: using LinearAlgebra, BenchmarkTools

In [2]: BLAS.get_num_threads()
Out[2]: 10

In [3]: A = rand(ComplexF32, 2000, 2000);

In [4]: @btime svd(A);
  841.862 ms (21 allocations: 169.04 MiB)
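
For anyone trying to reproduce this, here is a minimal sketch for confirming which backend a given session is actually dispatching to before timing anything. It assumes Julia 1.7 or later, where `BLAS.get_config()` is available; run it once in a plain session and once after `using AppleAccelerate`. The 200×200 matrix and the `@elapsed` timing are only a quick sanity check of my own choosing, not the benchmark above.

```julia
using LinearAlgebra

# List the libraries that libblastrampoline is currently forwarding to.
# In a plain session this should show the bundled OpenBLAS; after
# `using AppleAccelerate` the Accelerate framework should appear as well.
config = BLAS.get_config()
for lib in config.loaded_libs
    println(lib.libname, " (interface: ", lib.interface, ")")
end
println("BLAS threads reported: ", BLAS.get_num_threads())

# Small matrix so the check runs quickly; use 2000 to match the timings above.
A = rand(ComplexF32, 200, 200)
F = svd(A)              # first call also covers compilation
t = @elapsed svd(A)     # rough single-shot timing; @btime is better for real numbers
println("svd time: ", t, " s")
```

Printing the loaded libraries in both sessions makes it clear that the two timings really come from different LAPACK implementations rather than from a difference in setup.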

OliverDudgeon, Feb 21 '25 13:02