ZiWei Yuan
Why is the ch1 I see now already fully implemented code?
https://github.com/KMSorSMS/OS-Spring-of-NOTE.git
> /root/ktransformers/csrc/balance_serve/build Could you enter the balance_serve directory, build it there, and check the output? You can `cd` into it, run `rm -rf build`, then `cmake -B build`, and finally...
> hello, I have the same problem and the installation failed, what can I do? @qiyuxinlin it seems the version of flash_infer being used is wrong (a mismatch)?
You are welcome to report new models that are supported in kt_kernel, with a snapshot as proof.
Do you have avx512 support? It seems our kernel targets amx/avx512, so you can run `lscpu` and check whether avx512 is available. We are, however, writing support for more general...
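As a rough companion to the `lscpu` check, here is a minimal Python sketch that reads the CPU feature flags on Linux and reports whether AVX2, AVX512, and AMX appear to be present. The function names `parse_cpu_flags` and `summarize_simd` are hypothetical helpers written for this illustration and are not part of kt_kernel. It assumes the usual Linux flag names (`avx2`, `avx512f`, `amx_tile`) and that `/proc/cpuinfo` is readable.

```python
from __future__ import annotations

from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect feature flags from /proc/cpuinfo-style (or lscpu-style) text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux labels the line "flags"; lscpu prints "Flags".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def summarize_simd(flags: set[str]) -> dict[str, bool]:
    """Map raw flags to the instruction sets discussed above (assumed flag names)."""
    return {
        "avx2": "avx2" in flags,
        "avx512": "avx512f" in flags,  # AVX-512 Foundation
        "amx": "amx_tile" in flags,    # AMX tile support
    }


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    print(summarize_simd(parse_cpu_flags(text)))
```

If `avx512` comes back `False`, that matches the situation described above where the amx/avx512 kernel would not apply to the machine.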
> > Do you have avx512 support? Seems our kernel is for amx/avx512. So you may use `lscpu` and check if avx512 is available. But we are writing support for...
I am working on fixing this. I have tested the following sets of options:

```
export CPUINFER_ENABLE_AMX=OFF
export CPUINFER_CPU_INSTRUCT=AVX512
```

```
export CPUINFER_CPU_INSTRUCT=AVX2
export CPUINFER_ENABLE_AMX=OFF
```

```
export CPUINFER_ENABLE_AMX=ON
```

```
export...
```
Please add some description to this PR.
> https://docs.vllm.ai/en/v0.8.3/getting_started/installation/cpu.html
>
> > Supported features
> > vLLM CPU backend supports the following vLLM features:
> > Tensor Parallel
> > Model Quantization (INT8 W8A8, AWQ, GPTQ)
> ...