Cannot run program with openblas.dll built without DYNAMIC_ARCH option
Hello
When I compile with
make BINARY=64 TARGET=HASWELL DYNAMIC_ARCH=1 HOSTCC=g++ NUM_THREADS=64 CC=gcc FC=x86_64-w64-mingw32-gfortran USE_THREAD=1 USE_OPENMP=1 OPENBLAS_COMPLEX_STRUCT=1 -j8
everything is fine, but when I compile without DYNAMIC_ARCH=1 the library compiles, but during execution I get an 'error writing location' runtime error.
Environment: Win 10 64 bit, msys2 64bit, gcc.exe (Rev1, Built by MSYS2 project) 7.2.0
Processor (CpuZ output):
Number of cores: 4 (max 4)
Number of threads: 8 (max 8)
Name: Intel Core i7 4790
Codename: Haswell
Specification: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Package (platform ID): Socket 1150 LGA (0x1)
CPUID: 6.C.3
Extended CPUID: 6.3C
Core Stepping: C0
Technology: 22 nm
TDP Limit: 84.0 Watts
Tjmax: 100.0 °C
Core Speed: 998.7 MHz
Multiplier x Bus Speed: 10.0 x 99.9 MHz
Stock frequency: 3600 MHz
Instructions sets: MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, EM64T, VT-x, AES, AVX, AVX2,
Are gcc and gfortran the same version ? I suspect the only magic of using DYNAMIC_ARCH=1 here is that it suppresses running the "make tests" step as part of the build.
Yes, they are from the same distribution:
$ gfortran --version
GNU Fortran (Rev1, Built by MSYS2 project) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
Does it make sense to try the following? I'll check it out.
make BINARY=64 TARGET=HASWELL DYNAMIC_ARCH=1 HOSTCC=g++ NUM_THREADS=64 CC=gcc FC=x86_64-w64-mingw32-gfortran USE_THREAD=1 USE_OPENMP=1 OPENBLAS_COMPLEX_STRUCT=1 -j8 tests
It ran the tests without error. Now I have cleaned and am recompiling without DYNAMIC_ARCH.
Unfortunately I cannot check the Windows build procedure myself. Is there any particular reason why you pass OPENBLAS_COMPLEX_STRUCT=1 as a build option ? I believe this is an internal setting, you may want to try FORCE_OPENBLAS_COMPLEX_STRUCT=1 instead if you really need it ?
Yes. I'm using Visual Studio 2013 with Armadillo. If I don't define LAPACK_COMPLEX_STRUCTURE I get errors. I thought they were related.
Will check again
And one more question: will the -O3 option give more speedup?
The tests passed without 'DYNAMIC_ARCH=1'. I'm now building without OPENBLAS_COMPLEX_STRUCT=1.
When I try to run the program it loads libgomp-1.dll. Is that normal?
You told it to USE_OPENMP, so loading the GNU OMP implementation is correct.
I will try to build using gcc instead of MSVC tomorrow.
I mean the program (right now I am using the mingw-built library with MS Visual Studio).
should be HOSTCC=gcc CC=x86_64-w64-mingw32-gcc
I cannot use CC=x86_64-w64-mingw32-gcc because of this issue: https://github.com/xianyi/OpenBLAS/issues/339
The build tries to execute x86_64-w64-mingw32-ar, but there is no such file. Output:
$ which ar
/mingw64/bin/ar
So I did
make BINARY=64 TARGET=HASWELL HOSTCC=gcc NUM_THREADS=64 CC=gcc FC=x86_64-w64-mingw32-gfortran USE_THREAD=1 USE_OPENMP=1 FORCE_OPENBLAS_COMPLEX_STRUCT=1 -j8
Same thing: execution stops. Maybe another mingw64 build is needed? I am using msys2 mingw64.
Which mingw64 variant should I choose, posix-seh or win32-seh? I plan to link the library with MSVC 2013.
In the mingw64 environment the program compiles, links to OpenBLAS and runs OK, but not with MSVC.
If you don't have correct binutils it will never work.
The Fedora and Debian mingw cross compilers work like a charm. I don't think the original could be as broken as you say.
What do you mean by "correct binutils"? With mingw64 (msys2) OpenBLAS builds fine, and the program links and works. With the DYNAMIC_ARCH option everything is also OK: it links in Visual Studio and the program works. Did you mean that Visual Studio 2013 is not good enough for linking a DLL built with DYNAMIC_ARCH=0?
And one more question. I tried msys2 with mingw64 and separate mingw64 builds. There is no x86_64-w64-mingw32-ar.exe in the archives, but there are x86_64-w64-mingw32-gcc-ar.exe and ar.exe, and the two executables differ in size. Likewise there is no x86_64-w64-mingw32-ranlib.exe, but there are x86_64-w64-mingw32-gcc-ranlib.exe and ranlib.exe, which also differ in size. All of them are in the mingw64 /bin folder.
So which one should I choose, and what is the difference? (Both are in the 64-bit toolchain.)
Can you tell exactly where "execution stops" when you do not specify DYNAMIC_ARCH ? Is it during the build of OpenBLAS, or when you try to run your own program that is linked against OpenBLAS ? From my limited understanding of the mingw setup, the ar.exe and x86_64-w64-mingw32-gcc-ar.exe are basically the same thing, one probably a wrapper for the other just to satisfy different tool naming conventions for native or cross-compiling. I wonder if you really need the HOSTCC=gcc statement if you are compiling directly on Windows and not cross-compiling for Windows on a Linux system. Maybe at least the wrong names for ar.exe etc. come from the build system thinking you must be cross-compiling when you are not. Unfortunately I do not have a Windows system available for testing at the moment.
It happens when I run the program linked against OpenBLAS using MSVC 2013, as soon as the program tries to call LAPACKE_zgesv or arma::solve (for complex. use std::complex
With BINARY=64 make the DLL is produced, but the same problem occurs. I don't understand the magic :) I will use the build made with the DYNAMIC_ARCH=1 option.
Can you describe the 'same problem' you are referring to from the point of view of a debugger, or at least give some linker or runtime error message or the code affected? MSVC std::complex uses the same binary representation as GCC. Can you supplement your builds with the exact options passed to the build and the final report of the build when complete? You don't need OpenMP on Windows; the default Windows threads should be just fine.
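As an illustration of the point about std::complex layout, here is a minimal C++ sketch. It assumes a C-style struct of two doubles; the name complex_pair is hypothetical and only stands in for a struct-style complex type. It checks at compile time that both types have the same size and at run time that the real part comes first. It makes no OpenBLAS calls.

```cpp
#include <cassert>
#include <complex>
#include <cstring>

// Hypothetical stand-in for a struct-style complex number: real part first.
struct complex_pair {
    double real;
    double imag;
};

// std::complex<double> is required to be array-compatible with double[2]
// (C++11 and later), so these sizes should agree on MSVC and GCC alike.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "std::complex<double> is not two packed doubles");
static_assert(sizeof(complex_pair) == sizeof(std::complex<double>),
              "struct differs in size from std::complex<double>");

// Copies the raw bytes of z into the struct and checks that the fields
// line up as (real, imag).
bool matches_pair_layout(const std::complex<double>& z) {
    complex_pair p;
    std::memcpy(&p, &z, sizeof p);
    return p.real == z.real() && p.imag == z.imag();
}
```

If both static_assert lines compile and matches_pair_layout returns true under each compiler, a layout difference is unlikely to be the cause of the crash.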
I removed msys2 and did a fresh install with only the mingw64 64-bit toolchain. (Until now I was using msys2 with the SourceForge mingw-32 64-bit builds. I don't remember what mess I made :) )
Now things are a little better and some calls into OpenBLAS work: LAPACKE_zgetri and arma::det for a real matrix work, but calling Armadillo's solve gives 'exception writing location'. I'll post the outputs tomorrow.
Just found this: https://groups.google.com/forum/#!msg/openblas-users/lRzmnA8X1FU/gIi71oZABgAJ
I will check tomorrow whether the misconfiguration is on my OpenBLAS side or on Armadillo's side.
About OpenMP: do you mean that building without OpenMP and setting the maximum number of threads at compile time will enable winpthread in the mingw build?
I think winapi CreateThread is independent from POSIX APIs
With OpenMP disabled it still requires libwinpthread-1.dll. No matter what parameters I choose on the OpenBLAS side and on the Armadillo side (ARMA_BLAS_UNDERSCORE or ARMA_BLAS_CAPITALS or neither), execution is halted when calling 'zgesv' or 'getrf' (these are Armadillo's solve() and det() functions). See the call hierarchy on the left side of the PNG. Matrix-vector multiplication works. The cblas_dgemm call works.

The dependency on libwinpthread probably comes from a posix-seh or pthread-seh toolchain, from what I could find on the net it should go away if you use a win32-seh mingw. Do I understand correctly that it is now crashing even if you call the LAPACKE_zgesv function directly, with no Armadillo function involved ? (Perhaps we would need a backtrace from an OpenBLAS that was built with DEBUG=1 )
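For the direct LAPACKE_zgesv test suggested above, a tiny system with a known answer makes the result easy to judge. The following is a sketch in plain C++ that makes no OpenBLAS calls; the helper names fill_test_system and solve2x2_col_major are hypothetical. It builds a 2x2 complex system in column-major order (the layout LAPACK_COL_MAJOR expects) whose solution is x = (1, i), and solves it by Cramer's rule for comparison.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

typedef std::complex<double> cplx;

// Fills a 2x2 matrix A (column-major: a[0]=A00, a[1]=A10, a[2]=A01, a[3]=A11)
// and a right-hand side b chosen so that the exact solution is x = (1, i).
void fill_test_system(cplx a[4], cplx b[2]) {
    a[0] = cplx(1.0, 1.0);   // A00
    a[1] = cplx(3.0, 0.0);   // A10
    a[2] = cplx(2.0, 0.0);   // A01
    a[3] = cplx(4.0, -1.0);  // A11
    b[0] = cplx(1.0, 3.0);   // A00*1 + A01*i
    b[1] = cplx(4.0, 4.0);   // A10*1 + A11*i
}

// Solves A*x = b for a 2x2 column-major A by Cramer's rule.
// Only meant as a reference value for a tiny, well-conditioned system.
void solve2x2_col_major(const cplx a[4], const cplx b[2], cplx x[2]) {
    const cplx det = a[0] * a[3] - a[2] * a[1];
    assert(std::abs(det) > 0.0);
    x[0] = (b[0] * a[3] - a[2] * b[1]) / det;
    x[1] = (a[0] * b[1] - a[1] * b[0]) / det;
}
```

Passing the same a and b to LAPACKE_zgesv with n = 2, nrhs = 1 and lda = ldb = 2 should leave (1, i) in b if the library and the calling side agree on types and layout.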
Binary from sf.net could work too
Here is the build info:
make BINARY=64 CC=x86_64-w64-mingw32-gcc FC=x86_64-w64-mingw32-gfortran USE_THREAD=1 USE_OPENMP=0 INTERFACE64=1 FORCE_OPENBLAS_COMPLEX_STRUCT=1 NUM_THREADS=8 -j8
Program defines:
#define HAVE_LAPACK_CONFIG_H
//#define ARMA_DONT_USE_LAPACK
//#define LAPACK_COMPLEX_CUSTOM
//#define LAPACK_COMPLEX_STRUCTURE
//#define LAPACK_COMPLEX_CPP
#define LAPACK_COMPLEX_STRUCTURE
#define ARMA_DONT_USE_WRAPPER
#define ARMA_DONT_USE_CXX11
//#define ARMA_32BIT_WORD
#define ARMA_BLAS_LONG
#define ARMA_BLAS_LONG_LONG
#define NOMINMAX
Calls:
info = LAPACKE_zgesv(LAPACK_COL_MAJOR, (lapack_int)(Nm), 1, (lapack_complex_double*)(Aa.memptr()), (lapack_int)(Nm), (lapack_int*)(ipiva.memptr()), (lapack_complex_double*)(b.memptr()), Nm);
/*
LAPACKE_zgesv error ** On entry to ZGESV parameter number 4 had an illegal value
LAPACKE_zgetri error ** On entry to ZGETRI parameter number 6 had an illegal value
*/
ZGESV parameter number 4 illegal would mean LDA is smaller than N although you gave Nm for both, so possibly something wrong with the passing of the Aa.memptr argument. (Unless Nm happens to be zero at this point, which it probably should not be.)
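The argument checks behind that message can be sketched as follows. This is a minimal C++ restatement of the input validation at the start of reference LAPACK's ZGESV, not OpenBLAS code; the function name zgesv_first_illegal_arg is hypothetical. It returns the position of the first rejected argument, which is the number the error message reports.

```cpp
#include <algorithm>
#include <cassert>

// Restates the input checks of reference LAPACK ZGESV(N, NRHS, A, LDA,
// IPIV, B, LDB, INFO). Returns 0 if the sizes are acceptable, otherwise
// the 1-based position of the first illegal argument.
int zgesv_first_illegal_arg(long long n, long long nrhs,
                            long long lda, long long ldb) {
    const long long min_ld = std::max<long long>(1, n);
    if (n < 0)        return 1;  // N must be non-negative
    if (nrhs < 0)     return 2;  // NRHS must be non-negative
    if (lda < min_ld) return 4;  // LDA must be at least max(1, N)
    if (ldb < min_ld) return 7;  // LDB must be at least max(1, N)
    return 0;
}
```

With n = 0 and lda = 0 the function also returns 4, which matches the remark about Nm being zero; with n = lda = ldb = Nm > 0 it returns 0, so a report of parameter 4 suggests the values seen inside the library are not the ones written in the call.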