DTensor policy v1 and v2 future plan
DTensor policy v1 and v2 currently have several open issues:
- v2 (NeMo Automodel) has problems supporting some diffusion models.
- Researchers value the merits of v1: it is fully HF-native and transparent.
There are also concerns from a software perspective:
- NeMo Automodel, which is built on top of FSDP, will be much more powerful and feature-rich.
- v1 and v2 contain a lot of duplicated code, which makes maintenance more difficult.
After many discussions with researchers, we reached the following conclusions:
- Keep v1 until researchers feel comfortable deprecating it.
- Extend Automodel to cover the features it currently lacks.
- Automodel will have an HF-native fallback mode, which will be supported within the v2 DTensor policy worker.
- Our refactoring work on the v2 policy worker will also be applied to v1, making the code cleaner on both sides.
- The feature sets of v1 and v2 will diverge in future releases, including EP, PP, HSDP, and CP + sequence packing via TE.