Zijie Tian comments

Results: 5 comments by Zijie Tian

GaiaGPU: Sharing GPUs in Container Clouds

nice

[2023-12-04 11:52:08,378] [INFO] [autotuner.py:1110:run_after_tuning] No optimal DeepSpeed configuration found by autotuning.

Same problem. Can someone explain why the "No optimal DeepSpeed configuration found by autotuning" message appears? Could you also briefly explain the principle behind autotuning?

error: command 'g++' failed with exit status 1, maybe due to python version

Same problem. Does anyone have a solution?

is this only for linux?

I noticed that there are some Metal operators in the experimental submodule, but I still can't install them on macOS using USE_CPP=1. Is there any way to install these operators?

Introduce New Lookup-Table(LUT)-Based Matrix Multiplication Method (TMAC)

Unexpectedly **SLOW** performance on Apple M4 Max for Llama-3-8b-EfficientQAT-w2g128-GPTQ compared to AGX Orin. I used the following command to run your code on both the AGX Orin and the M4 Max: ``` ./build-arm64/bin/llama-cli -m /gguf/Llama-3-8b-EfficientQAT-w2g128-GPTQ-GGUF/llama-3-8b-w2g128.gguf -p...