Djip007
News?
Any news, updates, or a new version? Has the project been abandoned?
Hello, I tried to install rocm-5.3.2 for rhel8 (rocky8 in my case), but it looks like there is a permission error on some files: https://repo.radeon.com/rocm/rhel8/5.3.2/main/repodata/dbc2743caaab3e018c806bb7d5474028a85071c6dc866a45dae1021af9416e86-filelists.xml.gz returns 403 Forbidden (didn't test all files... it...
https://github.com/Mozilla-Ocho/llamafile/blob/a8124633ea9b5860712a954a7cb6d9dc4bd6b365/llamafile/sgemm_bss_avx512bf16.cpp#L33 Not 100% sure, but I think this code is wrong because x86 CPUs are little-endian... _mm512_loadu_ps loads fp32 values, which are 4 bytes each; it loads {b1,b2,b3,b4} and...
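The byte-order concern in the excerpt above can be checked on the host with a small scalar sketch. This is not the llamafile code: the helper names `bf16_to_fp32`, `high_half_of_lane`, and `host_is_little_endian` are hypothetical, and the sketch only shows which of two adjacent bf16 values ends up in the high 16 bits of a 32-bit lane when the pair is read as a single 4-byte value.

```cpp
#include <cstdint>
#include <cstring>

// Widen one raw bf16 value to fp32. bf16 is the top 16 bits of an IEEE fp32,
// so the widening is a 16-bit left shift of the raw bits.
float bf16_to_fp32(std::uint16_t raw) {
    std::uint32_t bits = static_cast<std::uint32_t>(raw) << 16;
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// Read two adjacent bf16 values the way a 4-byte load sees them and return
// the 16 bits that land in the high half of that 32-bit lane.
std::uint16_t high_half_of_lane(const std::uint16_t pair[2]) {
    std::uint32_t lane;
    std::memcpy(&lane, pair, sizeof lane);
    return static_cast<std::uint16_t>(lane >> 16);
}

// Detect the host byte order by looking at the first byte of a 16-bit 1.
bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host such as x86, the pair `{0x3F80, 0x4000}` (bf16 1.0 followed by bf16 2.0) yields `0x4000` from `high_half_of_lane`, so the second element in memory occupies the high half of the lane.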
A few ideas about the +4; it will probably not always work: - if you have had one of the 2 units replaced, it no longer works... (my case...) - 2nd unit......
Add the 2 changes needed for llamafile: - remove the UMA build option - use it in all cases if hipalloc failed with 'not have enough memory' Note: with linux kernel...
With an AMD APU (like my Ryzen 7940HX) it is possible to use "UMA" to extend VRAM. In my case I can't allocate more than 4 GB of VRAM (BIOS config)....
### Contact Details [email protected] ### What happened? Built the latest main branch on Fedora 40, Ryzen 7940HS. llamafile crashes with signal SIGILL, Illegal instruction. I tried gdb, but can't find how...
Wanted to do more, but this patch is simple. On non-AVX512 CPUs there are only 16 registers. The compiler does not reorder the madd ops in sgemm, so did...
You were not too far from good speed. But if you look at the BLIS paper, bloc_A needs to be kept in L2 cache; on x86 CPUs (Zen ...) there is...
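The idea of keeping a block of A resident in L2 can be sketched as a cache-blocked loop nest. This is a minimal illustration, not the code under review: the function names and the block sizes `MC` and `KC` are hypothetical and untuned, matrices are assumed row-major, and the packing step used by BLIS is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reference: C (m x n) += A (m x k) * B (k x n), all row-major.
void gemm_naive(const std::vector<float>& A, const std::vector<float>& B,
                std::vector<float>& C, std::size_t m, std::size_t n, std::size_t k) {
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t p = 0; p < k; ++p)
                C[i * n + j] += A[i * k + p] * B[p * n + j];
}

// Blocked variant: the mb x kb block of A selected by (ic, pc) is reused for
// every column of B before moving on. MC * KC * sizeof(float) should be
// chosen (assumption: by measurement) so that this block fits in L2.
void gemm_blocked(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, std::size_t m, std::size_t n, std::size_t k,
                  std::size_t MC = 64, std::size_t KC = 128) {
    for (std::size_t pc = 0; pc < k; pc += KC) {
        const std::size_t kb = std::min(KC, k - pc);
        for (std::size_t ic = 0; ic < m; ic += MC) {
            const std::size_t mb = std::min(MC, m - ic);
            for (std::size_t j = 0; j < n; ++j) {
                for (std::size_t i = 0; i < mb; ++i) {
                    float acc = 0.0f;
                    for (std::size_t p = 0; p < kb; ++p)
                        acc += A[(ic + i) * k + (pc + p)] * B[(pc + p) * n + j];
                    C[(ic + i) * n + j] += acc;
                }
            }
        }
    }
}
```

Both functions accumulate into `C`, so `C` should be zero-initialized before the call when a plain product is wanted. A real kernel would also pack the A block into contiguous memory and replace the inner loops with a register-tiled micro-kernel.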
- change dispatch strategy (thanks: https://github.com/ikawrakow/ik_llama.cpp/pull/71 ) - more cache friendly Some results: | cpu_info | model_filename | size | test | 0.8.17 t/s | PR t/s | | -------------------------------------------:...