mlmz
@zhc7 superbench is missing a task set: Digital Card Game.
Hello, could you share some logs from the FastAPI server? Judging by the error, the server did not respond.
Try the options documented at https://docs.sglang.ai/backend/server_arguments.html
Does it work without this parameter?
I see, this parameter works the way you describe. If the checkpoint is already fp8, you should load it without specifying any arguments. See https://docs.sglang.ai/references/quantization.html#online-quantization
How would you define "fully utilize"?
Thanks for raising this issue. @ByronHsu is working on PD-Disaggregation. @ByronHsu, could you take a look? Thanks.
We haven't found the root cause of your problem yet. You can try running some low-concurrency tasks (say 8 concurrent) to warm up, and then increase the concurrency (say...
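A minimal sketch of the warm-up-then-ramp idea, in case it helps with reproducing. Everything here is an assumption for illustration: `fake_request` is a hypothetical stand-in for a real call to the server, and the later stage sizes (32, 128) are arbitrary placeholders, not values recommended in this thread.

```python
import asyncio


async def fake_request(i: int) -> int:
    # Hypothetical stand-in for a real request to the server.
    await asyncio.sleep(0.001)
    return i


async def run_stage(concurrency: int, total: int, send=fake_request) -> list:
    # Cap the number of in-flight requests with a semaphore.
    sem = asyncio.Semaphore(concurrency)

    async def worker(i: int):
        async with sem:
            return await send(i)

    return await asyncio.gather(*(worker(i) for i in range(total)))


async def warm_up_then_ramp(stages=(8, 32, 128), requests_per_stage=64, send=fake_request) -> dict:
    # Run the low-concurrency stage first, then step the concurrency up.
    # Stage sizes after the first are illustrative assumptions.
    completed = {}
    for concurrency in stages:
        results = await run_stage(concurrency, requests_per_stage, send)
        completed[concurrency] = len(results)
    return completed


if __name__ == "__main__":
    print(asyncio.run(warm_up_then_ramp()))
```

To use it against a real server, swap `fake_request` for your own coroutine that sends one request, and note at which stage the problem first appears.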
> ```
> 12 results - 4 files
>
> sglang • python/sglang/srt/model_executor/model_runner.py:
> ...
>
> sglang • python/sglang/srt/openai_api/adapter.py:
> 557 "min_new_tokens": request.min_tokens,
> 558: "thinking_budget": request.thinking_budget,
> 559...
Thank you for raising this issue. @ispobock @zhyncs, could you help look into it? Thanks.