Reinforce-II
@Billzhong2022 could you provide these test results: 1. running on a single cluster, e.g. `taskset -c 0-3`, or something equivalent on Windows 2. use `-t 4`, and set environment...
> Hi LLAMA team, > > Any update? Do you use the macro `GGML_USE_OPENBLAS` for all your modified code in commit [780e24a](https://github.com/ggerganov/llama.cpp/commit/780e24a22eb595b705cbe8284771e9ceff1c4dd2)? > > @ReinForce-II use -t 4: The performance is...
> Hi @ReinForce-II, > > How about 1.? You can run `cmd /c start /b /affinity 0xf main.exe -t 4 ...` in PowerShell, **Answer:** The performance is very very...
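For reference, the `/affinity 0xf` mask in the command above selects the same logical CPUs as `taskset -c 0-3`: bit *i* of the mask stands for logical CPU *i*. A minimal sketch of that mapping follows. The helper names `mask_from_cpus` and `cpus_from_mask` are hypothetical, and whether CPUs 0-3 actually form a single cluster on a given device is an assumption to check against that device's topology.

```python
def mask_from_cpus(cpus):
    """Build an affinity bitmask with bit i set for each logical CPU i."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def cpus_from_mask(mask):
    """List the logical CPU indices whose bits are set in the mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


# CPUs 0-3, as in `taskset -c 0-3`
print(hex(mask_from_cpus(range(4))))
# The mask passed to `start /affinity 0xf`
print(cpus_from_mask(0xF))
```

The same helpers can be used to work out the mask for a different core range, for example if the cores of one cluster are not numbered 0-3.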
> Hi @ReinForce-II,
>
> But after reverting commit [780e24a](https://github.com/ggerganov/llama.cpp/commit/780e24a22eb595b705cbe8284771e9ceff1c4dd2), the performance is much better on platform https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-elite. How to explain it?
>
> Thanks!

It might have something...
> Hi @ReinForce-II,
>
> Please help debug and fix this issue.
>
> Thank you very much!

It would be a great help if you could kindly provide some...
Hi @Billzhong2022, please take a look at [4ae60ad8](https://github.com/ReinForce-II/llama.cpp/commit/4ae60ad811319710cca608e69031d95d78918f65). The commit is not specific to Snapdragon devices, but it might also alleviate your problem. I look forward to your feedback.
> @ReinForce-II It seems the performance issue arises because the function `ggml_compute_forward()` is called twice in `ggml_graph_compute_thread()`. It was called only once in the previous code. May we avoid...