AlbertGao

9 comments by AlbertGao

Alternatively, performing the following steps produces the same error:

cd PhotonLibOS
cmake -B build
cmake --build build -j 8

> Here is the step by step tutorial to run it: https://www.youtube.com/watch?v=Xui3_bA26LE and here is the written guide: https://github.com/Teachings/AIServerSetup/blob/main/06-DeepSeek-R1-0528/01-DeepSeek-R1-0528-KTransformers-Setup-Guide.md
>
> Note: I have been unable to run it on...

> Can you share your CUDA version, nvcc version, and step by step which commands you ran to build it? I can try to reproduce it and find a fix. [AG]...

I copied this command:

python ktransformers/server/main.py --architectures Qwen3MoeForCausalLM --model_path --gguf_path --optimize_config_path ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml --backend_type balance_serve

@Azure-Tang can you help check? Thank you very much.

> > [@Azure-Tang](https://github.com/Azure-Tang) can you help check? Thank you very much.
>
> Hi, I think you are using the `fp8` yaml, which needs to load special weights.
>
> ...

1. Ktransformers --model_path ./DeepSeekR10528-conf --gguf_path ./DS-R1-0528-IQ1_S --port 10002 --web True --max_new_tokens=3000 --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml
2. /DeepSeek-V3-Chat-fp8-linear-ggml-experts.yaml

Neither 1 nor 2 works; both report `ggml_type 18 not implemented`.
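For context: `ggml_type 18 not implemented` means the selected backend has no dequantization kernel for one of the quantization types stored in the GGUF file (the higher type IDs in ggml's enum are, as far as I know, the newer low-bit i-quants such as those in IQ1_S files). Before swapping yaml rules, a quick sanity check is to confirm which GGUF file is actually being loaded. Below is a minimal sketch based on the public GGUF header layout (4-byte magic `GGUF`, little-endian uint32 version, uint64 tensor count, uint64 metadata key/value count); the helper name `read_gguf_header` is mine, not part of ktransformers:

```python
import struct

def read_gguf_header(path):
    """Read the magic, version, and counts from a GGUF file header.

    Per the public GGUF spec, the file starts with:
      - 4-byte magic b"GGUF"
      - little-endian uint32 format version
      - little-endian uint64 tensor count
      - little-endian uint64 metadata key/value count
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    return version, tensor_count, kv_count
```

If the magic check fails, the `--gguf_path` is pointing at something that is not a GGUF file at all (e.g. safetensors shards), which would produce a very different failure than a missing quant kernel.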

@Azure-Tang The GGUF I am using is not [KVCache-ai/DeepSeek-V3-GGML-FP8-Hybrid](https://huggingface.co/KVCache-ai/DeepSeek-V3). I would need the corresponding 0528 version, i.e. a DeepSeek-R1-IQ1S-FP8 model in the model-00000-of-00061.safetensors format. Could you provide a Q1S-FP8 model for R1-0528 and R1-T2?

OK, I will try it myself. Have you tried the recent R1-T2? Does it improve performance?