Jonas Yang CN

Results 7 comments of Jonas Yang CN

Hi @Hukongtao, I am working on this. Please let me know which model you are using so that I can verify against it.

> > Hi @Hukongtao, I am working on this. Please let me know which model you are using so that I can verify against it.
>
> qwen

Got it!

Hi @Hukongtao, we are actively working on this, but it won't make the next release because its impact is larger than we expected. Please expect it will be...

> Hi, I'm also looking to disable the KV cache completely as my use-case requires only the first token generation.
>
> The only work-around so far has been to...

@aayush-sarvam, can you let us know the model name? I think you are using dtensor policy v1. We are focusing on v2 (which is built on top of NeMo Automodel)...

Hi @aayush-sarvam, do you mean you want to continue from a checkpoint? In general, we accept the HF model format; if HF is able to load it, NeMo RL dtensor should...

Great to know Automodel already supports FP8.