Jiang Jiwen
### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/omnisafe/issues) and [Discussions](https://github.com/PKU-Alignment/omnisafe/discussions) that this hasn't already been reported. (+1 or comment...
We plan to finetune an 11B model in MegatronLM; the model is sharded with tp=4, pp=16. We want to finetune the model in fp32 rather than fp16 or bf16. The...
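To frame the question, here is a minimal sketch, assuming Megatron-LM's usual launch flags (`--tensor-model-parallel-size`, `--pipeline-model-parallel-size`, `--fp16`, `--bf16`), of how fp32 training is normally expressed there, namely by passing neither precision flag; this is an illustration, not a verified recipe:

```python
# Sketch only: assumes Megatron-LM defaults to fp32 parameters/gradients when
# neither --fp16 nor --bf16 is supplied, with the parallel layout from the issue.
megatron_args = [
    "--tensor-model-parallel-size", "4",    # tp=4 as described above
    "--pipeline-model-parallel-size", "16",  # pp=16 as described above
    # intentionally no "--fp16" and no "--bf16" -> training stays in fp32
]
print(" ".join(megatron_args))
```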
I ran into an issue and want to split the embedding layer out of the transformer block so that it sits alone in a single pp stage, but I found that it has not...
Hi, I am wondering whether the hf adapter supports transformers versions > 4.47.0, because the signature of _flash_attention_forward has changed, so the parameter-list length differs from the one checked in https://github.com/zhuzilin/ring-flash-attention/blob/be3b01f5706f45245f9b6d78d6df231954b2ee64/ring_flash_attn/adapters/hf_adapter.py#L23
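For reference, a minimal sketch of how the installed signature could be probed before relying on a hard-coded parameter count; it assumes `_flash_attention_forward` is still importable from `transformers.modeling_flash_attention_utils` (adjust the import path if your transformers version differs):

```python
import inspect

import transformers
from transformers.modeling_flash_attention_utils import _flash_attention_forward

# Inspect the signature at runtime instead of assuming a fixed parameter count,
# so an adapter can tell whether the installed transformers version matches the
# layout it was written against.
param_names = list(inspect.signature(_flash_attention_forward).parameters)
print(f"transformers {transformers.__version__}: "
      f"_flash_attention_forward takes {len(param_names)} parameters")
print(param_names)
```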