verl
Discrete profiling with MindSpeed doesn't work
System Info
NPU, MindSpeed + vllm-ascend
Information
- [ ] The official example scripts
- [x] My own modified scripts
Tasks
- [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
megatron_worker.py does not implement the DistProfiler protocol, and functions such as generate_sequences and update_actor lack the corresponding decorators. As a result, profiling fails when run in discrete mode. It is unclear whether GPU profiling is affected, but discrete-mode profiling definitely does not work on NPUs.
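To illustrate why the missing decorators matter, here is a minimal, self-contained sketch of the general pattern. It is not verl's code: `DiscreteProfiler`, `annotate`, and `ToyWorker` are hypothetical stand-ins, and the real DistProfiler API may look different. The assumption is that discrete mode captures each decorated call as its own range, so an undecorated method is never captured.

```python
import functools


class DiscreteProfiler:
    """Hypothetical stand-in for a profiler; it only records start/stop events."""

    def __init__(self, discrete):
        self.discrete = discrete
        self.captured = []

    def start(self, role):
        self.captured.append(("start", role))

    def stop(self, role):
        self.captured.append(("stop", role))


def annotate(role):
    """Hypothetical decorator: in discrete mode, profile each call on its own."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            profiler = getattr(self, "profiler", None)
            # Without a profiler, or outside discrete mode, just run the method.
            if profiler is None or not profiler.discrete:
                return func(self, *args, **kwargs)
            profiler.start(role)
            try:
                return func(self, *args, **kwargs)
            finally:
                # Stop the range even if the wrapped method raises.
                profiler.stop(role)

        return wrapper

    return decorator


class ToyWorker:
    """Toy worker, not the real Megatron worker."""

    def __init__(self, profiler):
        self.profiler = profiler

    @annotate("generate_sequences")
    def generate_sequences(self, prompts):
        return [p + "!" for p in prompts]

    # Deliberately undecorated, so discrete mode never captures this call.
    def update_actor(self, batch):
        return len(batch)
```

In this toy setup, calling both methods on a `ToyWorker` built with `DiscreteProfiler(discrete=True)` records events only for `generate_sequences`, which mirrors the symptom described above.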
A CI check that covers profiling would probably help prevent this kind of issue.
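One possible shape for such a check, sketched under assumptions: the profiling decorator would need to tag the functions it wraps with a marker attribute. The `_profiled_role` attribute, the `mark_profiled` decorator, and the `ToyWorker` class below are all hypothetical and do not come from verl. A test could then assert that every required worker method carries the marker.

```python
def mark_profiled(role):
    """Hypothetical decorator that only tags a function with its profiling role."""

    def decorator(func):
        func._profiled_role = role  # marker attribute assumed for this sketch
        return func

    return decorator


def missing_profiler_annotations(worker_cls, required_methods, marker="_profiled_role"):
    """Return the required method names that are absent or lack the marker."""
    missing = []
    for name in required_methods:
        method = getattr(worker_cls, name, None)
        if method is None or not hasattr(method, marker):
            missing.append(name)
    return missing


class ToyWorker:
    """Toy worker, not the real Megatron worker."""

    @mark_profiled("generate_sequences")
    def generate_sequences(self, prompts):
        return prompts

    # Not tagged, so the check should report it.
    def update_actor(self, batch):
        return batch
```

A CI test could call `missing_profiler_annotations` on each worker class and fail when the returned list is non-empty. This only checks that the decorators are present; it does not exercise a real profiler run on NPU or GPU.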
NPU MindSpeed support was introduced in https://github.com/volcengine/verl/pull/2707.
Expected behavior
Profiling in discrete mode should work with MindSpeed on NPUs.