Yuhang He comments

Results: 4 comments by Yuhang He

Why does the paper say the kernel devolves to mean pooling when σ goes to infinity?

I have the same question as well
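One possible way to see it, as a minimal sketch: assume the kernel is a Gaussian (RBF) of the form exp(-(x - μ)² / (2σ²)). The paper is not identified here, so that form, the function name `kernel_weighted_average`, and the sample values are all assumptions for illustration. As σ grows, every weight approaches 1, so a kernel-weighted average approaches the plain mean; as σ shrinks, only values near μ contribute.

```python
import math


def kernel_weighted_average(values, mu, sigma):
    """Average of `values` weighted by an assumed Gaussian kernel centered at `mu`."""
    # Each weight is exp(-(v - mu)^2 / (2 * sigma^2)); this kernel form is an assumption.
    weights = [math.exp(-((v - mu) ** 2) / (2.0 * sigma ** 2)) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical similarity scores, chosen only for illustration.
values = [0.9, 0.1, -0.4, 0.5]
plain_mean = sum(values) / len(values)

# Very narrow kernel: only the value sitting at mu contributes.
narrow = kernel_weighted_average(values, mu=0.9, sigma=0.05)

# Very wide kernel: all weights are nearly 1, so the result approaches the mean.
wide = kernel_weighted_average(values, mu=0.9, sigma=1e6)

print(plain_mean, narrow, wide)
```

Under this assumed form, the choice of μ stops mattering once σ is large, because the weights become uniform regardless of where the kernel is centered.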

Is there a Docker environment for the RTX 6000 Pro Blackwell GPU?

Same problem. The program gets stuck at `ray::WorkerDict.ref_init_model`. I think it's related to vLLM?

Any plans to support Megatron for GRPO training?

Same request. I am looking forward to using Megatron to train LLMs with GRPO, since it would help me save GPU resources. I have seen that Verl added Megatron GRPO support....

OOM when training a 32B model with GRPO

> decrease `vllm_gpu_memory_utilization`
>
> btw
>
> ```
> --sleep_level 1
> --offload_model true
> --offload_optimizer true
> --gc_collect_after_offload true
> ```
>
> These options are intended for...