carlkl

Results 91 comments of carlkl

Can OpenBLAS be used somehow instead?

Unfortunately I have absolutely no idea how to use OpenBLAS inside pyccel. What you basically need is to link against libopenblas.dll with `-lopenblas`. But I have no clue how to...

@EmilyBourne, I can build OpenBLAS locally; this is not a problem. But the last time I looked at what's going on, I couldn't find where `-llapack` is set and how to...
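The flag swap discussed in the two comments above can be sketched as a small helper. This is only an illustration: `use_openblas` is a hypothetical function, not part of pyccel, and it assumes the OpenBLAS build bundles LAPACK so that a single `-lopenblas` can stand in for both `-lblas` and `-llapack`.

```python
def use_openblas(link_flags):
    """Return a copy of link_flags with -lblas/-llapack replaced by one -lopenblas.

    Hypothetical helper for illustration; assumes OpenBLAS was built with LAPACK.
    """
    replaced = {"-lblas", "-llapack"}
    result = []
    for flag in link_flags:
        if flag in replaced or flag == "-lopenblas":
            # Emit -lopenblas once, at the position of the first BLAS/LAPACK flag.
            if "-lopenblas" not in result:
                result.append("-lopenblas")
        else:
            result.append(flag)
    return result


if __name__ == "__main__":
    print(use_openblas(["-lm", "-lblas", "-llapack"]))
```

If the flag list contains no BLAS or LAPACK entry, the helper returns it unchanged rather than adding `-lopenblas` on its own.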

Unfortunately, the latest OpenBLAS develop https://github.com/xianyi/OpenBLAS/commit/406d9d64e97eb6bd83f7d9d55336272391e4126a together with a cherry-picked https://github.com/jeromerobert/OpenBLAS/commit/ee71dd3bf1480599e71c06064d8fd9d3f74f5a38 patch doesn't solve the gemv performance issue https://github.com/xianyi/OpenBLAS/issues/532. The library was built with MAX_STACK_ALLOC=2048. See https://github.com/winpython/winpython/issues/82#issuecomment-95347118

Something weird happens with OpenBLAS dgemv. Running @hiccup7's scipy code above with ONLY ONE thread in OpenBLAS gives about the same performance as scipy-MKL. The performance drops as more threads...

Here it is:
- platform windows, openblas develop https://github.com/xianyi/OpenBLAS/commit/406d9d64e97eb6bd83f7d9d55336272391e4126a together with a cherrypicked https://github.com/jeromerobert/OpenBLAS/commit/ee71dd3bf1480599e71c06064d8fd9d3f74f5a38 patch despite the name:
  - https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/openblas-fb02cb0_amd64.7z
- fortran ordering (C ordering is much slower)
- M...

About 4 according to the Task Manager. The MKL performance is not degraded if more than one thread is used. A solution might be to increase `GEMM_MULTITHREAD_THRESHOLD`. Was the default `4`...

With the latest develop from wernsaar (updated dgemv_n kernel for nehalem and haswell) I still have the same behaviour with and without threads (steered with the corresponding environment variables) @hiccup7's...
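For anyone wanting to repeat the with/without-threads comparison, here is a minimal sketch of steering the thread count through environment variables. The comment does not name the variables; `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` are assumed here because they are the documented controls for the two libraries. The helper names `env_with_blas_threads` and `run_with_threads` are hypothetical.

```python
import os
import subprocess
import sys


def env_with_blas_threads(n):
    """Copy the current environment and pin the BLAS thread count to n."""
    env = dict(os.environ)
    env["OPENBLAS_NUM_THREADS"] = str(n)  # read by OpenBLAS at load time
    env["MKL_NUM_THREADS"] = str(n)       # read by MKL at load time
    return env


def run_with_threads(n, code):
    """Run a Python snippet in a fresh interpreter with n BLAS threads."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env_with_blas_threads(n),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Placeholder snippet: swap in the benchmark under test.
    snippet = "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"
    for threads in (1, 4):
        print(threads, run_with_threads(threads, snippet))
```

A fresh interpreter is used for each run because the libraries read these variables when they are loaded, so changing them after the import typically has no effect.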

- Platform:
  - windows amd64
  - gcc with win32thread model
  - openblas: latest wernsaar develop
- Makefile.rule:
  - TARGET = HASWELL
  - DYNAMIC_ARCH = 0
  - CC = gcc -...

Python users? Be aware that MKL as included in numpy-MKL is free, but not for every use case. I'm not a lawyer, but I think you need to buy an MKL...