Mert Unsal comments

Results: 50 comments of Mert Unsal

IMDB with default root only loads half the data

Same issue on Windows 10!

Manual Integrate cannot do (1/cos(x)) dx

Yes - I agree with you that even if the function worked properly it does not achieve anything beyond integrating 1/cos(x). I think at the very least it should be...

Async pipeline in generate and compute score

We have already implemented this feature; please check the `reward_model.launch_reward_fn_async=True` argument.

Async pipeline in generate and compute score

Probably the best way to do this is to use the async chat scheduler and collect the reward results at the end. For batch rewards, we have implemented a batch reward manager, please check...
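The pattern described in this comment (start scoring each response as soon as it is generated, then collect all rewards at the end) can be sketched roughly with plain `asyncio`. This is only an illustration: `generate`, `compute_reward`, and `generate_and_score` are hypothetical stand-ins, not the project's actual scheduler or reward manager API.

```python
import asyncio


async def generate(prompt: str) -> str:
    # Hypothetical stand-in for a model generation call.
    await asyncio.sleep(0.01)
    return prompt.upper()


async def compute_reward(response: str) -> float:
    # Hypothetical stand-in for a reward function; here, just the length.
    await asyncio.sleep(0.01)
    return float(len(response))


async def generate_and_score(prompts):
    responses = []
    reward_tasks = []
    for prompt in prompts:
        response = await generate(prompt)
        responses.append(response)
        # Start scoring right away so it overlaps with the next generation.
        reward_tasks.append(asyncio.create_task(compute_reward(response)))
    # Collect all reward results at the end, in prompt order.
    rewards = await asyncio.gather(*reward_tasks)
    return list(zip(responses, rewards))


def run_demo():
    # Small driver so the sketch can be run synchronously.
    return asyncio.run(generate_and_score(["ab", "cde"]))
```

Because each reward task is scheduled with `asyncio.create_task`, it can make progress while the loop is awaiting the next generation; `asyncio.gather` then returns the results in the order the tasks were passed in.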

Training hangs when fine-tuning kimi-vl with multi-image data

Is this issue still there?

Qwen/Qwen3-Omni-30B-A3B-Instruct Fine Tuning

The speed seems to be extremely slow on 8xH100 with config (only difference is long context training)
```
PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
megatron sft \
    --load 'Qwen3-Omni-30B-A3B-Instruct-mcore' \
    ...
```

[BUG] `import deepspeed` crashes on `deepspeed==0.16.3` with `triton==3.2.0` on CPU machine

Finetuning Base Model

Training Qwen/Qwen2.5-Coder-32B-Instruct model OOM

Training Qwen/Qwen2.5-Coder-32B-Instruct model OOM