lyz
@zjersey Yep, got it. If I modify it myself, I should modify model.proto, encoder.cc.cu, decoder.cc.cu, and some kernel function files, then compile the source code. Is that right?
@zjersey @hexisyztem Yep, got it, thank you. 👍👍👍
@hexisyztem Thank you 👍. I will email you if I get confused about the details of the "lightseq" inference engine.
@yynil Hi, I followed your code for supporting ChatGLM training with colossalai. SFT and reward model training both work without problems, but when I run 'sh train_prompts.sh', an error is raised at line 160 of the script train_peft_prompts.py: '(actor, actor_optim), (critic, critic_optim) = strategy.prepare((actor, actor_optim), (critic, critic_optim))'. The specific error is: *** RuntimeError: torch.cat(): expected a non-empty list of Tensors. I'm not sure what is going on here and would appreciate some pointers.
@yynil
act_num, cri_num = 0, 0
for name, para in actor.named_parameters():
    if para.requires_grad:
        print(name)
        act_num += 1
for name, para in critic.named_parameters():
    if para.requires_grad:
        print(name)
        cri_num += 1
print(act_num, cri_num)
act_num is 0 and cri_num is 58. Both the actor and critic models are loaded. The word_embeddings weights of the actor and critic models are as follows: critic.model.base_model.model.transformer.layers[1].input_layernorm.weight '...
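The counting loop above can be wrapped in a small helper that fails early when a model has nothing to train. This is only a minimal sketch: it assumes the model exposes `named_parameters()` the way `torch.nn.Module` does, and `trainable_parameter_names`, `require_trainable`, `FakeParam`, and `FakeModel` are hypothetical names for illustration, not part of colossalai or peft. That an empty trainable set is what triggers the torch.cat() error is also an assumption, suggested by act_num being 0.

```python
def trainable_parameter_names(model):
    """Return the names of all parameters whose requires_grad flag is True."""
    return [name for name, param in model.named_parameters() if param.requires_grad]


def require_trainable(label, model):
    """Return the number of trainable parameters, or raise if there are none."""
    names = trainable_parameter_names(model)
    if not names:
        raise ValueError(f"{label} has no parameters with requires_grad=True")
    return len(names)


# Hypothetical stand-ins so the sketch runs without torch installed.
# A real torch.nn.Module can be passed to the helpers in the same way.
class FakeParam:
    def __init__(self, requires_grad):
        self.requires_grad = requires_grad


class FakeModel:
    def __init__(self, flags):
        # flags maps parameter name -> requires_grad
        self._flags = flags

    def named_parameters(self):
        for name, flag in self._flags.items():
            yield name, FakeParam(flag)


if __name__ == "__main__":
    critic = FakeModel({"layer.weight": False, "lora_A.weight": True})
    actor = FakeModel({"layer.weight": False})
    print(require_trainable("critic", critic))
    try:
        require_trainable("actor", actor)
    except ValueError as err:
        print(err)
```

Calling `require_trainable("actor", actor)` right before `strategy.prepare(...)` would turn the case reported here (no trainable actor parameters) into an explicit error message instead of a failure deeper in the stack.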
@JThh Sorry, I don't have the "slurm" and "openmpi" libs on my machines. I tested the demo on 2 machines (A and B), which can access each other over "ssh". here...
Yes, it was solved. (In reply to Jiatong (Julius), 04/24/2023 23:46)
OK, I will close it later. (In reply to Jiatong (Julius), 04/26/2023 18:12)
In my case, node01 cannot communicate with node02. (In reply to a message dated 08/10/2023 21:59)
@Cloopen-ReLiNK Hi, how did you end up solving this? I ran into the same problem.