archwolf118 comments

Results: 18 comments by archwolf118

training the model

@Vidhi-Pat Please help me with the same by providing the implementation [email protected] Thank you!

MPS cannot work in both the container and the host at the same time

I found the solution! The Docker container user must be the same as the host machine user. So you need to add "-u 1000:1000", like: `sudo CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 nvidia-cuda-mps-control -d`...
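As a rough illustration of the "-u" part of that fix, the sketch below builds a `docker run` command line that runs the container under the host user's UID:GID. The helper name `docker_run_as_host_user`, the image name `my-cuda-image`, and the `--gpus all` flag are assumptions for illustration and do not come from the comment; any other options an MPS setup may need are not covered here.

```python
import os
import shlex


def docker_run_as_host_user(image, command, uid=None, gid=None):
    """Build a `docker run` argv that runs the container as UID:GID.

    Hypothetical helper: only the `-u UID:GID` part reflects the comment above.
    """
    # Default to the current host user (os.getuid/os.getgid are POSIX-only).
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--gpus", "all",          # assumption: expose the GPUs to the container
        "-u", f"{uid}:{gid}",     # run as the same user as on the host
        image,
        *command,
    ]


if __name__ == "__main__":
    # "my-cuda-image" is a placeholder image name, not a real image.
    argv = docker_run_as_host_user("my-cuda-image", ["nvidia-smi"], uid=1000, gid=1000)
    print(shlex.join(argv))
```

Leaving `uid` and `gid` unset picks up the IDs of whoever runs the script, which avoids hard-coding 1000:1000 on machines where the host user has a different ID.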

How can you tell whether fine-tuning has actually taken effect?

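> Or is it that the model's answer style changes after fine-tuning? For example, if the answers in the fine-tuning data are short, does the fine-tuned model lean toward short answers?

I noticed this problem too. I suspect it is because maxlength is set to 320, while the original model's maximum input length is 2048.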

What are the eos_token_id and bos_token_id?

Same question: does fine-tuning need the same configuration?

Fine-tuning

I have the same question. Thank you for your reply.

Official LLaMA on HuggingFace anytime soon?

We are waiting for LLaMA on HuggingFace! Thank you!

It runs successfully and the UI loads. Do I need a proxy after that?

> Hello, I deployed this on a Linux server. Do you know how to enable Clash's service mode and TUN mode?

Same question, thanks!

It runs successfully and the UI loads. Do I need a proxy after that?

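I am on Ubuntu 20.04 with a 3080 GPU. First, make sure Clash can reach sites outside the firewall (there are plenty of guides online). I set the following in clash.yaml: ![image](https://user-images.githubusercontent.com/7391017/224874947-6e7326d2-fc73-4ea8-9727-bb5d291bf90f.png) After reconnecting Clash, the network went through. Because I do not have enough GPU memory, I only ran one image-captioning model. ![image](https://user-images.githubusercontent.com/7391017/224877920-9a7f6a96-928f-42b3-b660-c237a9a7568e.png)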

Is there any evaluation of the results after fine-tuning?

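@mymusise Hello! I got the model running successfully. I used a Chinese dataset of about 10,000 examples and added some other simple data, for example the question "What is your name?" with the answer "Alpaca". I fine-tuned for 5 epochs, but at inference time, when I ask "What is your name?", the model still answers "I am ChatGLM-6B". In your experience, how can some of the model's built-in knowledge be replaced with the knowledge we want? Thank you.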

Training finishes in about 5 hours, but the loss is always 0. Is that normal?

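Could it be a GPU issue? On a V100 I get a half-precision error, but on a 3090 it works fine, which is really strange. I suspect it is related to compute capability. Which GPU model are you using?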