mlmz

Results 19 comments of mlmz

Hello, could you share some logs from the FastAPI server? Judging by the error, the server is not responding.

![Image](https://github.com/user-attachments/assets/89dac707-c44f-4e38-a30a-7a277060659a)

See https://docs.sglang.ai/backend/server_arguments.html and try it.

Does it work without this parameter?

I see, this parameter works the way you describe. If the checkpoint is FP8, you should load it without specifying any arguments.

![Image](https://github.com/user-attachments/assets/656089ce-077e-467b-9d1f-05e19597fe4f)

https://docs.sglang.ai/references/quantization.html#online-quantization

How would you define "fully utilize"?

Thanks for raising this issue. @ByronHsu is working on PD-Disaggregation. @ByronHsu, could you take a look at this? Thanks.

We haven't found the root cause of your problem yet. You can try running some low-concurrency tasks (say, 8 concurrent) to warm up, and then increase the concurrency (say...
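The warm-up idea above can be sketched roughly as follows. This is only an illustration, not a tested reproduction of the issue: `fake_request` is a hypothetical stand-in for a real client call to the server, and the second stage size (32) is an assumed value, since the comment is cut off before naming one.

```python
import asyncio


async def fake_request(i: int) -> int:
    # Hypothetical stand-in for one request to the server;
    # replace with a real client call when trying this out.
    await asyncio.sleep(0.001)
    return i


async def run_stage(concurrency: int, num_requests: int, request_fn=fake_request) -> int:
    """Send num_requests requests with at most `concurrency` in flight.

    Returns the peak number of requests that were in flight at once.
    """
    sem = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def worker(i: int) -> None:
        nonlocal in_flight, peak
        async with sem:  # blocks once `concurrency` requests are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await request_fn(i)
            finally:
                in_flight -= 1

    await asyncio.gather(*(worker(i) for i in range(num_requests)))
    return peak


async def warm_up_then_ramp(stages=(8, 32), requests_per_stage=64):
    # Stage 1 (8 concurrent) is the warm-up; later stages raise the load.
    # The value 32 is an assumption made for this sketch.
    peaks = []
    for concurrency in stages:
        peaks.append(await run_stage(concurrency, requests_per_stage))
    return peaks


if __name__ == "__main__":
    print(asyncio.run(warm_up_then_ramp()))
```

Running the stages one after another, rather than starting at full load, makes it easier to see at which concurrency level the problem first appears.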

> ```
> 12 results - 4 files
>
> sglang • python/sglang/srt/model_executor/model_runner.py:
> ...
>
> sglang • python/sglang/srt/openai_api/adapter.py:
> 557 "min_new_tokens": request.min_tokens,
> 558: "thinking_budget": request.thinking_budget,
> 559...
> ```

Thank you for raising this issue. @ispobock @zhyncs, could you help look into it? Thanks.