Zijie Tian

5 comments by Zijie Tian

Same problem. Can someone explain why the "No optimal configure" message appears? Could you also briefly explain the principle behind autotuning?

I noticed that there are some Metal operators in the experimental submodule, but I still can't install them on macOS using USE_CPP=1. Is there any way to install these operators?

Unexpectedly **SLOW** performance on Apple M4 Max for Llama-3-8b-EfficientQAT-w2g128-GPTQ compared to AGX Orin. I use the following command to run your code on both the AGX Orin and the M4 Max: ``` ./build-arm64/bin/llama-cli -m /gguf/Llama-3-8b-EfficientQAT-w2g128-GPTQ-GGUF/llama-3-8b-w2g128.gguf -p...