delltower

Results: 10 comments of delltower

unsubscribe

> On 2020-02-05 13:36:00, "张大力" wrote: I ran into this problem too, though I was working in R. As long as you choose the half-width space as the delimiter when importing the word-vector text, the different words can be told apart. Thanks!

> Of course you can, if you have the data.

Hello, how can I run further pre-training with domain data?

> On top of the current Chinese LLaMA model, add in-domain data and run a second round of pre-training.

Thank you! Is there a training framework you would recommend?

> For pre-training, refer to run_clm.py in https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling; fine-tuning is done following [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).

Hello, the README says 20 GB of Chinese data were added for continued pre-training. Could you share the machine configuration and training time? I want to continue training in a vertical domain and would like a reference point.
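For reference, below is a minimal sketch of what a run_clm.py-style continued pre-training run on a domain corpus looks like. The model path, corpus file name, and every hyperparameter are illustrative placeholders, not the configuration used for the 20 GB Chinese pre-training mentioned in the README.

```python
# Minimal continued pre-training sketch in the spirit of run_clm.py.
# All paths and hyperparameters below are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, default_data_collator)

base_model = "path/to/chinese-llama"  # placeholder: merged Chinese LLaMA weights
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Plain-text domain corpus, one document per line (placeholder file name).
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

block_size = 512

def tokenize(batch):
    return tokenizer(batch["text"])

def group_texts(examples):
    # Concatenate everything, then cut into fixed-length blocks, as run_clm.py does.
    ids = sum(examples["input_ids"], [])
    total = (len(ids) // block_size) * block_size
    blocks = [ids[i:i + block_size] for i in range(0, total, block_size)]
    return {"input_ids": blocks, "labels": [b.copy() for b in blocks]}

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
lm_data = tokenized.map(group_texts, batched=True,
                        remove_columns=tokenized["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out-domain-pretrain",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        fp16=True,
        logging_steps=50,
    ),
    train_dataset=lm_data["train"],
    data_collator=default_data_collator,
)
trainer.train()
```

Grouping the tokenized corpus into fixed-length blocks mirrors what run_clm.py does and avoids padding entirely.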

The hardware settings: RAM: 251 GB; 4 GPUs: A100 × 40 GB each. Could you tell me how to debug this? ^_^

> The checkpoints are intermediate full models. You can use the adapter_model.bin once the training is completed.

How can I extract the adapter_model.bin file from the checkpoint folder?
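On the extraction question, here is a rough sketch of one way to pull the LoRA weights out of an intermediate full checkpoint. It assumes the checkpoint folder holds a combined state dict in pytorch_model.bin and that the LoRA parameters carry the usual lora_ naming; the path and key filter are assumptions and may need to be adapted to the real checkpoint layout.

```python
# Rough sketch: filter LoRA ("adapter") tensors out of an intermediate full
# checkpoint. The path and the "lora_" key filter are assumptions; adjust
# them to the actual checkpoint layout and parameter names.
import torch

checkpoint_file = "output/checkpoint-2000/pytorch_model.bin"  # placeholder path
state_dict = torch.load(checkpoint_file, map_location="cpu")

adapter_state = {k: v for k, v in state_dict.items() if "lora_" in k}
print(f"kept {len(adapter_state)} LoRA tensors out of {len(state_dict)}")

torch.save(adapter_state, "adapter_model.bin")
```

If the checkpoint was written by peft itself, loading it with peft's PeftModel.from_pretrained and calling save_pretrained also produces the adapter file without manual filtering.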

> I tried it and got this error:

    /workspace/work/LMFlow-main-baichuan/examples/finetune.py:66 in <module>
       63
       64
       65 if __name__ == '__main__':
    ❱  66     main()
       67

    /workspace/work/LMFlow-main-baichuan/examples/finetune.py:59 in main
       56     print("model", model_args)
       57     print("data", ...

debug:

    BaichuanModel(
      (embed_tokens): Embedding(64000, 5120, padding_idx=0)
      (layers): ModuleList(
        (0-39): 40 x BaichuanLayer(
          (self_attn): BaichuanAttention(
            (W_pack): MergedLinear(
              in_features=5120, out_features=15360, bias=False
              (lora_dropout): Dropout(p=0.1, inplace=False)
              (lora_A): Linear(in_features=5120, out_features=16, bias=False)
              (lora_B): Conv1d(16, 10240, kernel_size=(1,), ...
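For context on the printout above: the hidden size 5120, 40 layers, and 64,000-token vocabulary match Baichuan-13B, and the MergedLinear around W_pack looks like a loralib-style merged-QKV LoRA in which only part of the 3 × 5120 output (here 10240 = 2 × 5120) receives low-rank updates. Below is a hedged sketch of how LoRA is more commonly attached to W_pack with the current peft API; the model id and lora_alpha are assumptions, and peft's own printout would show lora.Linear modules rather than MergedLinear/Conv1d.

```python
# Hedged sketch: attaching LoRA to Baichuan's merged QKV projection (W_pack)
# with the current peft API. The model id and lora_alpha are assumptions;
# r=16 and dropout=0.1 mirror the printout above.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base",  # assumed model id (hidden size 5120, 40 layers)
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                       # matches lora_A out_features=16
    lora_alpha=32,              # illustrative value
    lora_dropout=0.1,           # matches Dropout(p=0.1) in the printout
    target_modules=["W_pack"],  # Baichuan's merged query/key/value projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```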

> With both my own data and public data, the loss does not go down.

Please post your training parameters and framework.
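As an illustration of the kind of settings worth posting (and double-checking) when the loss stays flat, here is a minimal sketch; the values are placeholders, not recommendations from any particular repository.

```python
# Illustrative settings to report/check when the loss does not decrease;
# the values below are placeholders, not recommended hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out-sft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    learning_rate=1e-4,          # a far-too-small (or zeroed-out) LR often shows up as a flat loss curve
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    fp16=True,                   # fp16 loss-scale overflows can also stall the loss
    logging_steps=10,            # log often enough to see a trend
)
print(args.learning_rate, args.lr_scheduler_type, args.fp16)
```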