Can't compile for NVIDIA GTX 590
Hello,
I'm trying to benchmark hashkill on my GTX 590 and see how hard it is to write a plugin (are there any docs?), but I can't seem to compile with the 590. With an old NVIDIA card in the same machine it compiles fine... Any ideas?
Thanks!
The error is:
nvidia_ipb2_long: building for sm_10
nvidia_ipb2_long: flags = -cl-nv-arch sm_10 -DSM10
nvidia_ipb2_long: compilation for sm_10 successful (size = 562 KB)
hashkill clGetProgramInfo: CL_INVALID_VALUE
make[2]: *** [nvidia_ipb2_long] Error 1
make[2]: Leaving directory `/home/XXXX/Downloads/hashkill-master/src/kernels/compiler'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/XXXX/Downloads/hashkill-master/src'
make: *** [all-recursive] Error 1
No docs for now. The error is strange though. Can you please provide the nvidia driver version?
The CUDA driver is NVIDIA-Linux-x86_64-319.32.run and the toolkit is cuda-5.5.11_linux_64.run. We're on Ubuntu 13.04 64-bit.
Can you please list all the nvidia_ipb2_long_* kernel binaries that were successfully built in src/kernels? Did the sm_11 kernel fail, or was that some other kernel? (The 590 would be sm_20, I guess.)
Sorry for the delay, the computer wasn't available...
It seems we do not get any .ptx files at all. So sm_10 seems to be the very first one and it doesn't build?
I have the same problem here with my GeForce GTX 690 on my Gentoo system. Could it be a problem that I also have an AMD card installed (running fine with hashkill)? I am using nvidia-drivers-331.20, kernel 3.7.3-gentoo, 64-bit.
nvidia_ipb2_long: building for sm_10
nvidia_ipb2_long: flags = -cl-nv-arch sm_10 -DSM10
nvidia_ipb2_long: compilation for sm_10 successful (size = 562 KB)
hashkill clGetProgramInfo: CL_INVALID_VALUE
With some hacks I skipped sm_10 and got the same for sm_20 and sm_30:
nvidia_ipb2_long: building for sm_20
nvidia_ipb2_long: flags = -cl-nv-arch sm_20
nvidia_ipb2_long: compilation for sm_20 successful (size = 562 KB)
hashkill clGetProgramInfo: CL_INVALID_VALUE
On my other PC, which has only an old sm_10 (sm_11) NVIDIA card, it compiles like a charm.
...
else
{
    binaries[i]=NULL;
}
//err = _clGetProgramInfo(program, CL_PROGRAM_BINARIES, sizeof(char *), binaries, NULL );
//checkErr( "clGetProgramInfo", err );
...
If I comment out these lines near line 215 in nvidia-compiler.c, I get a lot of .ptx files. I will test whether they work tomorrow.
Any ideas? Thanks!
Looks like I need to get one of those 690s. I am pretty sure that commenting out that line will lead to junk data being saved in the .ptx files, but it's worth trying anyway.
Apparently the GTX 690 is a new architecture that is not enabled in the compiler yet. I will now patch it to support sm_35, though it's weird that older architectures are unsupported by the device...
OK, sm_35 build support added in the compiler. Can you check now?
Thanks!
I will test it now but compiling all the kernels takes time. I am not sure if that will fix the problem. I know that the hardware works well with older kernels (sm_10...).
... two hours later ...
Now I know that it does not fix the problem. It starts compiling sm_10 and fails.
I patched it so that it starts with sm_35, and it fails the same way. There must be another problem. Perhaps it is a problem with that PC (it runs Gentoo).
nvidia_ipb2_long: building for sm_35
nvidia_ipb2_long: flags = -cl-nv-arch sm_35
nvidia_ipb2_long: compilation for sm_35 successful (size = 562 KB)
hashkill clGetProgramInfo: CL_INVALID_VALUE
I have just tested your androidfde plugin (on AMD). It is very fast! I wrote a small piece of CUDA code for the same problem some weeks ago, but yours reaches more than double the speed on my AMD card compared to my NVIDIA card. So I still do not know how much faster the AMD card is. I only know that the NVIDIA card was more expensive :P
Are you planning to add more vendor-specific Android FDE plugins?
Samsung has a different way, a little more complex but known (there is a nice presentation: http://www.irongeek.com/i.php?page=videos/derbycon3/1202-what-s-common-in-oracle-and-samsung-they-tried-to-think-differently-about-crypto-laszlo-toth-ferenc-spala ).
HTC uses a longer key (the algorithm is unknown to me ;( ). I can provide sample dumps with known passwords.
Thanks for your time! I do not want to cause you trouble, but I am very interested in what my problem is ;)
Do you have any more ideas about what I can test?
Best regards,
Gerd
It might also be caused by an API change in the OpenCL runtime (which would be weird but possible). Which driver version are you using? Can you grep for CL_PROGRAM_BINARIES in the OpenCL headers to check what numerical value it is defined with?
AMD is generally faster than NVIDIA for integer arithmetic, which is crucial for crypto, so it's normal that your AMD card is better than your NV one, though of course that depends on the hardware model.
If I have enough information and test samples, I will add new algos. The problem is that my spare time is very limited now and I am sharing it between the GPGPU crypto stuff and my new SDR radio hobby :)
I got it to work :) - no not really :/ - I got a workaround!
I tried clcc to compile one kernel and it worked! So I think it is not a problem with my system ;)
("https://devtalk.nvidia.com/default/topic/483496/clcc-an-nvidia-opencl-command-line-compiler/")
./clcc "-cl-nv-arch sm_30 -cl-nv-cstd=CL1.1" nvidia_androidfde.cl nvidia_androidfde.ptx
lzma -z nvidia_androidfde.ptx
cp ../nvidia_androidfde.ptx.lzma /usr/local/share/hashkill/kernels/nvidia_androidfde__sm30.ptx
After starting hashkill ...
[hashkill] Loading kernel: /usr/local/share/hashkill/kernels/nvidia_androidfde__sm30.ptx
[hashkill] Loading kernel: /usr/local/share/hashkill/kernels/nvidia_androidfde__sm30.ptx
Progress: 0% Speed: 145K c/s (avg: 147K c/s) Cracked: 0 passwords
[hashkill] GPU1: 72K c/s [Temp]: 50C (GeForce GTX 690)
[hashkill] GPU2: 73K c/s [Temp]: 49C (GeForce GTX 690)
... OK, I see the AMD card really is faster and cheaper.
As for the OpenCL headers, I think this one is used?!
/usr/lib64/OpenCL/global/include/CL/cl.h
Or this one?
/opt/cuda/include/CL/cl.h
Both define the same CL_PROGRAM_BINARIES value:
#define CL_PROGRAM_BINARIES 0x1166
I am using nvidia-drivers-331.20.
I am sorry that I cannot help more, but I am new to OpenCL. I have only tried a little bit of CUDA ;)
Thanks again!
Greetings, Gerd
Morning!
I think I found the problem in nvidia-compiler.c but I am not sure.
err = _clGetProgramInfo(program, CL_PROGRAM_BINARIES, sizeof(char *), binaries, NULL );
I read that that function returns one binary per device. My NVIDIA card is a dual-GPU card, so I changed it to:
err = _clGetProgramInfo(program, CL_PROGRAM_BINARIES, sizeof(char *)*2, binaries, NULL );
and the compile error is gone :) - I have not tested the kernel yet!
Best regards,
Gerd
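Why the one-line change above helps can be sketched in C. For CL_PROGRAM_BINARIES, the OpenCL specification has clGetProgramInfo fill an array with one pointer per device associated with the program, and return CL_INVALID_VALUE when param_value_size is smaller than that array. A dual-GPU card such as the GTX 590 or 690 shows up as two devices, so sizeof(char *) would be too small if the program is built for both of them. The block below is only an illustration under those assumptions: mock_get_program_binaries and query_binaries are hypothetical stand-ins, not hashkill or OpenCL functions, and the mock models nothing but the size check, so it compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Status codes mirroring the values defined in CL/cl.h. */
#define MOCK_SUCCESS              0
#define MOCK_OUT_OF_HOST_MEMORY (-6)
#define MOCK_INVALID_VALUE     (-30)

/* Hypothetical stand-in for clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...).
 * It models only the size rule: the caller's pointer array must have room
 * for one pointer per device the program was built for. */
int mock_get_program_binaries(size_t num_devices, const size_t *binary_sizes,
                              size_t param_value_size, unsigned char **binaries)
{
    assert(num_devices > 0 && binary_sizes != NULL && binaries != NULL);
    if (param_value_size < num_devices * sizeof(unsigned char *))
        return MOCK_INVALID_VALUE;
    for (size_t i = 0; i < num_devices; i++) {
        if (binaries[i] != NULL)
            memset(binaries[i], 0xAB, binary_sizes[i]); /* fake payload */
    }
    return MOCK_SUCCESS;
}

/* Allocate one buffer per device, then request all binaries in one call,
 * passing param_value_size exactly as the caller supplies it. */
int query_binaries(size_t num_devices, const size_t *binary_sizes,
                   size_t param_value_size)
{
    int err = MOCK_SUCCESS;
    unsigned char **binaries = calloc(num_devices, sizeof *binaries);
    if (binaries == NULL)
        return MOCK_OUT_OF_HOST_MEMORY;
    for (size_t i = 0; i < num_devices; i++) {
        binaries[i] = malloc(binary_sizes[i]);
        if (binaries[i] == NULL)
            err = MOCK_OUT_OF_HOST_MEMORY;
    }
    if (err == MOCK_SUCCESS)
        err = mock_get_program_binaries(num_devices, binary_sizes,
                                        param_value_size, binaries);
    for (size_t i = 0; i < num_devices; i++)
        free(binaries[i]);
    free(binaries);
    return err;
}
```

With one device, passing sizeof(unsigned char *) is accepted by the mock; with two devices the same size is rejected, and num_devices * sizeof(unsigned char *) is accepted. In real code the device count would come from clGetProgramInfo with CL_PROGRAM_NUM_DEVICES and the per-device sizes from CL_PROGRAM_BINARY_SIZES, rather than from a hard-coded 2.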