hjc3613

Results 12 comments of hjc3613

I reprocessed the train/valid/test files with spaCy tokenization and retrained with OpenNMT-tf; now the results look normal. The only problem is that the ppl on the valid dataset...
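A minimal sketch of the preprocessing step described above (the file names and the `en_core_web_sm` model are illustrative assumptions, not from the original comment): tokenize each line with spaCy and write the tokens back out space-separated, which is the plain-text format OpenNMT-tf reads for its train/valid/test files.

```python
import spacy

# Load a spaCy pipeline; heavy components are disabled since only the
# tokenizer is needed here.
nlp = spacy.load("en_core_web_sm", disable=["parser", "ner", "lemmatizer"])

def retokenize(src_path: str, dst_path: str) -> None:
    """Rewrite one file with spaCy tokens separated by single spaces."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            doc = nlp(line.strip())
            dst.write(" ".join(tok.text for tok in doc) + "\n")

# Hypothetical file names for the three splits.
for split in ("train", "valid", "test"):
    retokenize(f"{split}.txt", f"{split}.tok.txt")
```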

> @hjc3613 sorry for the inconvenience, this feature is not well tested, that's why we didn't mention it much. If you are interested, would love to work with you and...

> > See #361

Thanks a lot!

> Does the error only occur under stage 3? Can stage 2 run?

I haven't run stage 2, because each card only has 12 GB of GPU memory and it would OOM; I have six 12 GB cards in total, so stage 3 seems to be the only option. I wanted to debug on the CPU, but adding --no_cuda has no effect and everything is still moved to the GPU. If it's convenient, could you tell me how to run on the CPU?
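One common workaround for this kind of situation (not proposed in the original thread, just a hedged sketch) is to hide all GPUs from the process before PyTorch initializes CUDA, so that even code paths that ignore `--no_cuda` cannot move tensors to the GPU:

```python
import os

# Must be set before torch/CUDA is initialized in this process.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

# With no visible devices, everything stays on the CPU.
print(torch.cuda.is_available())  # expected: False
```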

> > Does the error only occur under stage 3? Can stage 2 run?
>
> I haven't run stage 2, because each card only has 12 GB of GPU memory and it would OOM; I have six 12 GB cards in total, so stage 3 seems to be the only option. I wanted to debug on the CPU, but adding --no_cuda has no effect and everything is still moved to the GPU. If it's convenient, could you tell me how to run on the CPU?

I am using the Alpaca model, that is, the original LLaMA merged with chinese_llama_lora_plus and chinese_alpaca_lora_plus, and then fine-tuned on the merged weights. Is there anything wrong with that step, or should I merge only LLaMA with chinese_llama_lora_plus and fine-tune on that, without adding the Alpaca adapter?
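For reference, the merge order being asked about can be expressed with the PEFT API roughly as follows (a sketch under assumed placeholder paths, not the exact scripts used in the thread): first merge chinese_llama_lora_plus into the base LLaMA weights, then optionally merge chinese_alpaca_lora_plus on top before fine-tuning.

```python
from transformers import LlamaForCausalLM
from peft import PeftModel

# Placeholder paths; substitute the real checkpoints.
base = LlamaForCausalLM.from_pretrained("path/to/original-llama")

# Merge the Chinese LLaMA LoRA into the base weights.
model = PeftModel.from_pretrained(base, "path/to/chinese_llama_lora_plus")
model = model.merge_and_unload()

# Optionally also merge the Alpaca adapter (the step the comment questions).
model = PeftModel.from_pretrained(model, "path/to/chinese_alpaca_lora_plus")
model = model.merge_and_unload()

model.save_pretrained("path/to/merged-model")
```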

> I found some more information: the problem comes from using stage 3, and others have hit the same issue: https://github.com/huggingface/transformers/issues/22705, https://github.com/microsoft/DeepSpeed/issues/842, https://discuss.huggingface.co/t/deepspeed-zero3-does-not-work-with-diffusion-models-does-anyone-know-how-to-fix-this/36293. If you find a solution, please let me know; it would be much appreciated.

> Hi, @hjc3613, you can offload to nvme instead of cpu memory, please check out [nvme offload](https://www.deepspeed.ai/tutorials/zero/#offloading-to-cpu-and-nvme-with-zero-infinity).

Thanks for your reply. I have tested nvme offload, but it failed; the related...
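For context, a ZeRO-Infinity NVMe-offload setup like the one in the linked tutorial looks roughly like the config below (expressed here as a Python dict; the `nvme_path` and batch size are placeholder values, not taken from the original issue):

```python
# Sketch of a DeepSpeed config with optimizer and parameter offload to NVMe.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {
            "device": "nvme",
            "nvme_path": "/local_nvme",   # placeholder NVMe mount point
            "pin_memory": True,
        },
        "offload_param": {
            "device": "nvme",
            "nvme_path": "/local_nvme",
            "pin_memory": True,
        },
    },
    "train_micro_batch_size_per_gpu": 1,
}
```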

Pure bf16 is better than nvme offload, because it can keep all params, gradients, and Adam optimizer states in GPU memory and runs faster than any other offload method, as it...
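A hedged sketch of the "pure bf16, no offload" setup the comment prefers: the same ZeRO config, but with no `offload_optimizer`/`offload_param` sections, so everything stays on the GPU (values are illustrative):

```python
# DeepSpeed config sketch: bf16 training, ZeRO stage 3, no offload.
ds_config_bf16 = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        # no offload_param / offload_optimizer: params, gradients, and
        # optimizer states remain in GPU memory
    },
    "train_micro_batch_size_per_gpu": 1,
}
```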

Thanks a lot, I will give it a try.

> You can achieve that by setting `fp32_optimizer_states=False` in the initialization of `DeepSpeedCPUAdam`; this param was added to DeepSpeed in version 0.14.3.
>
> note: if you are using transformers...
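A minimal sketch of the setting quoted above, assuming DeepSpeed >= 0.14.3; the model here is a placeholder:

```python
import torch
from deepspeed.ops.adam import DeepSpeedCPUAdam

# Placeholder model standing in for the real one.
model = torch.nn.Linear(16, 16)

optimizer = DeepSpeedCPUAdam(
    model.parameters(),
    lr=1e-4,
    # Keep optimizer states in lower precision instead of fp32 copies,
    # reducing the CPU memory held by the optimizer.
    fp32_optimizer_states=False,
)
```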