Fan
Hi! Thanks for reporting this issue. There are a few likely reasons for it: 1. I noticed you are using GPT-3.5-turbo-16k, which sometimes won't perfectly follow our system prompt....
Hi~ I also wonder whether there is a way to start a server with multiple GPUs, e.g., if I want to start the server using `llama-7b-chat`, can I simply set `tp-size=8`...
Thanks for your quick reply!!😊😊 I understand the inter-GPU communication cost now, and indeed the 7B model works just fine on a single GPU. So can I say that data parallel is...
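For readers landing on this thread: a minimal sketch of the two launch modes being discussed, assuming an SGLang-style `launch_server` entry point. The module path, model path, and flag names (`--tp-size`, `--dp-size`) are assumptions here and should be checked against your installed version:

```shell
# Tensor parallelism: shard ONE copy of the model across 8 GPUs.
# Useful when the model is too large for a single GPU; incurs
# inter-GPU communication on every forward pass.
python -m sglang.launch_server \
  --model-path meta-llama/Llama-2-7b-chat-hf \
  --tp-size 8

# Data parallelism: run 8 independent replicas, one per GPU.
# Raises request throughput for a model that already fits on
# a single GPU (as a 7B model typically does).
python -m sglang.launch_server \
  --model-path meta-llama/Llama-2-7b-chat-hf \
  --dp-size 8
```

For a 7B model that fits on one GPU, the data-parallel mode avoids the inter-GPU communication cost that tensor parallelism pays, which matches the reasoning in the reply above.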
Got it! Thanks!
Hi~ I really need this feature! One question: when I use the same tasks to evaluate models (with the same architecture but from different runs), will the evaluation samples (docs and...
That's cool! Thanks for your explanation!
Thank you for your kind words!😄 Since we haven't conducted experiments on multilingual data, I don't have a definitive answer, but I think ProX could work better given proper SFT...
Hi @mtasic85, thank you for your interest in ProX! We will try it on code data in the coming days; however, I can't confirm the exact timeline yet. Unlike our...
@mtasic85 If you are still interested in a large-scale, high-quality code dataset, you may find our new [MegaMath](https://huggingface.co/datasets/LLM360/MegaMath) dataset helpful, especially the [megamath-code](https://huggingface.co/datasets/LLM360/MegaMath/tree/main/megamath-code) subset. Although our primary goal is to...