montagetao

Results: 7 comments by montagetao

I found that after adding "-t" to /etc/memcached.conf and restarting the memcached service, there is a CPU bottleneck on NUMA node1 when loading the DB (R:W 0:1). I think it's the...

Enable the AVX512 and AMX instruction sets:

    cmake llama.cpp -B llama.cpp/build_amx \
        -DGGML_NATIVE=OFF \
        -DGGML_AVX512=ON \
        -DGGML_AVX512_BF16=ON \
        -DGGML_AVX512_VBMI=ON \
        -DGGML_AVX512_VNNI=ON \
        -DGGML_AMX_TILE=ON \
        -DGGML_AMX_INT8=ON \
        -DGGML_AMX_BF16=ON
    # Build the executables
    cmake --build llama.cpp/build_amx --config Release -j --clean-first

...
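Before configuring with the `GGML_AMX_*` and `GGML_AVX512_*` options, it may be worth confirming that the CPU and kernel actually advertise those features. The sketch below assumes a Linux host, where `/proc/cpuinfo` lists flags such as `amx_tile`, `amx_int8`, `amx_bf16` and `avx512_vnni`. The helper name `cpu_flags_matching` is made up for illustration and is not part of llama.cpp.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): print the AMX and AVX-512
# feature flags found in a cpuinfo-style file, one per line.
cpu_flags_matching() {
    file="${1:-/proc/cpuinfo}"   # defaults to the Linux cpuinfo path
    grep -m1 '^flags' "$file" | tr ' ' '\n' | grep -E '^(amx_|avx512)' | LC_ALL=C sort -u
}

# Only query the live system when the file is readable (Linux).
if [ -r /proc/cpuinfo ]; then
    cpu_flags_matching
fi
```

If no `amx_*` lines are printed, the AMX options are unlikely to help on that machine, whatever the build configuration says.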

> [@montagetao](https://github.com/montagetao)
>
> Thanks for the example command. Did you find that it improved performance of llama.cpp on your newer Intel Xeon with AMX extensions? I'm benchmarking some newer...

> Just profiled llama.cpp for DeepSeek R1 with Q4_K_M and found that it only uses AVX VNNI, not the AMX instructions.
>
> ![Image](https://github.com/user-attachments/assets/f1ebd579-0d62-402e-95c4-fdcf8d17814b)
>
> ![Image](https://github.com/user-attachments/assets/381a0af0-83bc-4464-aaae-c5a7e7a6602a)

Got it,...
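A quick static check can complement a profile like the one quoted above: count how many disassembled lines of the built binary contain an AMX mnemonic. This is only a sketch. `count_amx_insns` is a hypothetical helper, the binary path in the usage comment is an assumption about where the build places its output, and it needs an `objdump` recent enough to decode AMX. A non-zero count shows the instructions were compiled in, not that they run for a given quantization.

```shell
#!/bin/sh
# Hypothetical helper: read disassembly text on stdin and print the number
# of lines that contain an AMX instruction mnemonic as a whole word.
count_amx_insns() {
    grep -cwE 'ldtilecfg|sttilecfg|tilerelease|tilezero|tileloadd|tileloaddt1|tilestored|tdpbssd|tdpbsud|tdpbusd|tdpbuud|tdpbf16ps' || true
}

# Usage (the binary path is an assumption; the ggml code may instead live
# in a shared library inside the build directory):
#   objdump -d llama.cpp/build_amx/bin/llama-cli | count_amx_insns
```

The `|| true` keeps the exit status at zero when nothing matches, since `grep -c` still prints `0` in that case.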

> [@oldmikeyang](https://github.com/oldmikeyang) [@montagetao](https://github.com/montagetao)
>
> I believe at least some AMX extensions do work, but only for certain tensor quantizations ([int8 quants exist, but not sure how to run on...

Thanks for your reply. I have invited my team members, but they can't log in; it asks for a password.

Thank you, I will try later.