delltower

Results: 10 comments of delltower

unsubscribe

> On 2020-02-05 13:36:00, "张大力" wrote: I ran into this problem too, though I was working in R. As long as you choose the half-width space as the delimiter when importing the word-vector text, the different words can be told apart. Thanks!

> Of course you can, if you have the data.

Hello, how can I run further pre-training with domain data?

> On top of the current Chinese LLaMA model, add in-domain data and run a second round of pre-training.

Thank you! Is there a training framework you would recommend?

> For pre-training, refer to run_clm.py in https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling; fine-tuning is done following [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).

Hello, the README says 20 GB of Chinese data were added for continued pre-training. Could you share the machine configuration and training time? I want to continue training in a vertical domain and would like a reference point.
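For reference, below is a minimal sketch of what a run_clm.py-style continued pre-training run on a domain corpus looks like. The model path, corpus file name, and every hyperparameter are illustrative placeholders, not the configuration used for the 20 GB Chinese pre-training mentioned in the README.

```python
# Minimal continued pre-training sketch in the spirit of run_clm.py.
# All paths and hyperparameters below are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, default_data_collator)

base_model = "path/to/chinese-llama"  # placeholder: merged Chinese LLaMA weights
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Plain-text domain corpus, one document per line (placeholder file name).
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

block_size = 512

def tokenize(batch):
    return tokenizer(batch["text"])

def group_texts(examples):
    # Concatenate everything, then cut into fixed-length blocks, as run_clm.py does.
    ids = sum(examples["input_ids"], [])
    total = (len(ids) // block_size) * block_size
    blocks = [ids[i:i + block_size] for i in range(0, total, block_size)]
    return {"input_ids": blocks, "labels": [b.copy() for b in blocks]}

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
lm_data = tokenized.map(group_texts, batched=True,
                        remove_columns=tokenized["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out-domain-pretrain",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        fp16=True,
        logging_steps=50,
    ),
    train_dataset=lm_data["train"],
    data_collator=default_data_collator,
)
trainer.train()
```

Grouping the tokenized corpus into fixed-length blocks mirrors what run_clm.py does and avoids padding entirely.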

The hardware settings: RAM: 251 GB; 4 GPUs: A100 × 40 GB each. Could you tell me how to debug this? ^_^

> The checkpoints are intermediate full models. You can use the adapter_model.bin once the training is completed.

How can I extract the adapter_model.bin file from the checkpoint folder?
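On the extraction question, here is a rough sketch of one way to pull the LoRA weights out of an intermediate full checkpoint. It assumes the checkpoint folder holds a combined state dict in pytorch_model.bin and that the LoRA parameters carry the usual lora_ naming; the path and key filter are assumptions and may need to be adapted to the real checkpoint layout.

```python
# Rough sketch: filter LoRA ("adapter") tensors out of an intermediate full
# checkpoint. The path and the "lora_" key filter are assumptions; adjust
# them to the actual checkpoint layout and parameter names.
import torch

checkpoint_file = "output/checkpoint-2000/pytorch_model.bin"  # placeholder path
state_dict = torch.load(checkpoint_file, map_location="cpu")

adapter_state = {k: v for k, v in state_dict.items() if "lora_" in k}
print(f"kept {len(adapter_state)} LoRA tensors out of {len(state_dict)}")

torch.save(adapter_state, "adapter_model.bin")
```

If the checkpoint was written by peft itself, loading it with peft's PeftModel.from_pretrained and calling save_pretrained also produces the adapter file without manual filtering.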

> I tried it and got this error:

    /workspace/work/LMFlow-main-baichuan/examples/finetune.py:66 in <module>
       63
       64
       65 if __name__ == '__main__':
    ❱  66     main()
       67

    /workspace/work/LMFlow-main-baichuan/examples/finetune.py:59 in main
       56     print("model", model_args)
       57     print("data", ...

debug:

    BaichuanModel(
      (embed_tokens): Embedding(64000, 5120, padding_idx=0)
      (layers): ModuleList(
        (0-39): 40 x BaichuanLayer(
          (self_attn): BaichuanAttention(
            (W_pack): MergedLinear(
              in_features=5120, out_features=15360, bias=False
              (lora_dropout): Dropout(p=0.1, inplace=False)
              (lora_A): Linear(in_features=5120, out_features=16, bias=False)
              (lora_B): Conv1d(16, 10240, kernel_size=(1,), ...
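For context on the printout above: the hidden size 5120, 40 layers, and 64,000-token vocabulary match Baichuan-13B, and the MergedLinear around W_pack looks like a loralib-style merged-QKV LoRA in which only part of the 3 × 5120 output (here 10240 = 2 × 5120) receives low-rank updates. Below is a hedged sketch of how LoRA is more commonly attached to W_pack with the current peft API; the model id and lora_alpha are assumptions, and peft's own printout would show lora.Linear modules rather than MergedLinear/Conv1d.

```python
# Hedged sketch: attaching LoRA to Baichuan's merged QKV projection (W_pack)
# with the current peft API. The model id and lora_alpha are assumptions;
# r=16 and dropout=0.1 mirror the printout above.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base",  # assumed model id (hidden size 5120, 40 layers)
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                       # matches lora_A out_features=16
    lora_alpha=32,              # illustrative value
    lora_dropout=0.1,           # matches Dropout(p=0.1) in the printout
    target_modules=["W_pack"],  # Baichuan's merged query/key/value projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```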

> With both my own data and public data, the loss does not go down.

Please post your training parameters and framework.
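As an illustration of the kind of settings worth posting (and double-checking) when the loss stays flat, here is a minimal sketch; the values are placeholders, not recommendations from any particular repository.

```python
# Illustrative settings to report/check when the loss does not decrease;
# the values below are placeholders, not recommended hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out-sft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    learning_rate=1e-4,          # a far-too-small (or zeroed-out) LR often shows up as a flat loss curve
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    fp16=True,                   # fp16 loss-scale overflows can also stall the loss
    logging_steps=10,            # log often enough to see a trend
)
print(args.learning_rate, args.lr_scheduler_type, args.fp16)
```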