mynewstart

Results: 27 comments of mynewstart

> 1. A low batch_size has some effect on training, but not a large one. If GPU memory is insufficient, you can try a smaller model.
> 2. You have only trained for 1000 steps; keep training until acc_mlm reaches 70 or even above 80 before fine-tuning on downstream tasks, which should give better results. MLM is a harder task than NSP; on a large corpus, acc_mlm generally does not exceed 90.
> 3. It is better to train directly on one large shuffled corpus; splitting it into several small corpora is unnecessary and may hurt performance, because the model will be biased toward the corpora it saw last.

Does the pretraining not support WWM? If I want to pretrain RoBERTa-wwm, does using the current version as-is affect the results? And if the NSP task is dropped, can the pretraining data be formatted as one sample per line?
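
For context on the WWM and data-format question, here is a minimal sketch, assuming HuggingFace transformers rather than this repository's own API, of an MLM-only (no NSP) data pipeline with whole word masking over a one-sample-per-line corpus; the model name and file path are placeholders.

```python
# Minimal sketch, assuming HuggingFace transformers: MLM-only pretraining data
# pipeline (no NSP) with whole word masking. Model name and file path are
# placeholders, not taken from the issue above.
from transformers import (
    BertTokenizerFast,
    DataCollatorForWholeWordMask,
    LineByLineTextDataset,
)

tokenizer = BertTokenizerFast.from_pretrained("hfl/chinese-roberta-wwm-ext")

# With NSP dropped, the corpus can simply be one sample per line.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="corpus.txt",  # hypothetical path
    block_size=128,
)

# Masks all sub-tokens of a word together instead of independent WordPiece pieces.
# For Chinese WWM, a word-segmentation reference (e.g. produced with LTP) is
# usually supplied as well; that step is omitted here.
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)
```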

Try using 'self_attn.o_proj' and 'mlp.down_proj'?
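
For what it's worth, a hedged sketch of how that suggestion could be expressed with PEFT's `LoraConfig` on a LLaMA/Mistral-style model; only the two module names come from the comment, and the model id and hyperparameters are illustrative.

```python
# Hedged sketch, assuming PEFT LoRA on a LLaMA/Mistral-style model; only the two
# suggested module names come from the comment above, everything else is illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")  # placeholder

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # PEFT matches these names against module-name suffixes, so the suggested
    # projections can be listed alongside the usual attention projections.
    target_modules=["q_proj", "k_proj", "v_proj", "self_attn.o_proj", "mlp.down_proj"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```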

My solution is to save checkpoints myself; alternatively, you can use [zero_to_fp32](https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/utils/zero_to_fp32.py).
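
As a hedged sketch of the zero_to_fp32 route (paths are placeholders): the helper consolidates a ZeRO-sharded checkpoint into a single fp32 state dict, and the same script can also be run from the command line inside the checkpoint directory, e.g. `python zero_to_fp32.py . pytorch_model.bin`.

```python
# Hedged sketch: consolidate a ZeRO-sharded DeepSpeed checkpoint into one fp32
# state dict. The checkpoint directory and output path are placeholders.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Directory written by DeepSpeed, containing the global_step*/ shards.
state_dict = get_fp32_state_dict_from_zero_checkpoint("output/checkpoint-1000")
torch.save(state_dict, "output/pytorch_model_fp32.bin")
```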

OK. Have you also had a similar experience, where the earlier DeepSpeed-Chat code could train larger models?

> > Changing that parameter fundamentally changes the model
>
> > same problem

It's similar to #4094

> > > 1. I modify the `num_experts_per_tok` to...

If I understand correctly, will the inference speed slow down, and will the model's performance deteriorate?

> > > > Changing that parameter fundamentally changes the model
> > > ...
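
For reference, a hedged sketch of what overriding `num_experts_per_tok` looks like for a Mixtral-style model in HuggingFace transformers; the model id and the override value are illustrative, not taken from the thread above.

```python
# Hedged sketch, assuming a Mixtral-style model in HuggingFace transformers;
# the model id and the override value are illustrative.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

config = AutoConfig.from_pretrained(model_id)
config.num_experts_per_tok = 1  # default is 2; route each token through a single expert

# Fewer active experts per token means less compute per forward pass, but the
# router weights were trained for top-2 routing, so output quality usually drops.
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```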

I can fully fine-tune Mixtral 8x7B Instruct with DeepSpeed ZeRO-3 on 2 A100-80GB instances; the code doesn't hang and runs smoothly. I didn't change anything except disabling the evaluation part to...
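
A hedged sketch of the kind of ZeRO stage-3 configuration used for a full fine-tune like this; all values are illustrative, not the exact settings of the run described above. When passed through the HuggingFace Trainer's `deepspeed` argument, the `auto` fields are filled in from `TrainingArguments`.

```python
# Hedged sketch of a ZeRO stage-3 config for full fine-tuning across multiple
# GPUs/nodes; values are illustrative, not the exact settings of the run above.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_bucket_size": "auto",
        "stage3_prefetch_bucket_size": "auto",
        "stage3_param_persistence_threshold": "auto",
        # Gather full 16-bit weights when saving, so the checkpoint is usable
        # without a separate zero_to_fp32 conversion step.
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}
```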

> save_mp_checkpoint_path=

Hi @RezaYazdaniAminabadi, thanks for your contribution. I used this script and ran into the following issue. My environment is deepspeed=0.12.3, transformers=4.34.0, torch=2.0.1, and the instance is a p4de. Could you help know...
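
For context, a hedged sketch of how `save_mp_checkpoint_path` is typically passed to `deepspeed.init_inference`; the model id, parallelism degree, and output path are placeholders rather than the exact script referenced above.

```python
# Hedged sketch: DeepSpeed inference with save_mp_checkpoint_path, so the
# tensor-parallel-sharded checkpoint is written out once and can be reloaded
# later without resharding. Model id, tp_size, and path are placeholders.
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
)

ds_engine = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": 2},
    dtype=torch.float16,
    replace_with_kernel_inject=True,
    save_mp_checkpoint_path="/tmp/sharded_checkpoint",
)
model = ds_engine.module  # injected model, ready for generation
```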