Demon Boy
> Guys, is it the same as llama2? I have not tested it myself, but I saw the evaluation results on Meta's page, which are better and more advanced than llama2's, and I heard that...
> ```
> ~/repo/FastChat$ python -m fastchat.serve.model_worker --model-path ~/repo/models/Qwen-14B-Chat-Int4 --gptq-wbits 4 --gptq-groupsize 128 --model-names gpt-3.5-turbo
> 2023-09-28 14:36:05 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='~/repo/models/Qwen-14B-Chat-Int4', revision='main',...
> ```
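Because the worker above is registered under the alias `gpt-3.5-turbo`, an OpenAI-compatible client can talk to it once FastChat's controller and `openai_api_server` are also running. A minimal sketch follows; the port 8000 and the `EMPTY` API key are assumptions from FastChat's usual defaults, not from this log:

```python
# Minimal sketch, assuming `python -m fastchat.serve.controller` and
# `python -m fastchat.serve.openai_api_server --host localhost --port 8000`
# are running alongside the model worker shown above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the alias set via --model-names above
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```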
+1, hoping to see training and inference throughput in Tokens/s/p
> > > Same question. I'd like to know the format of the pretraining dataset. Also, I saw in the repo that the pretraining script's command line sets dataset to wiki_demo; does that mean the dataset is data/wiki_demo.txt? Below is what I saw
> >
> > Yes, the details are in dataset_info.json
>
> May I ask, is the configuration workflow for the continued-pretraining data format as follows:
> 1. ...
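For reference, a plain-text pretraining entry in data/dataset_info.json might look like the sketch below, modeled on the wiki_demo example mentioned above; the exact keys can vary across LLaMA-Factory versions, so check it against your copy of the file:

```json
{
  "wiki_demo": {
    "file_name": "wiki_demo.txt",
    "columns": {
      "prompt": "text"
    }
  }
}
```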
> @1737686924 So for your pretraining data, a sample only counts as one line if you manually press Enter? If a passage runs on without breaks, does it not count as a line? And must each line stay within the maximum token length of the model being trained? [Figure 1] [Figure 2] Figure 1 is a book converted to txt; can it be used for pretraining as-is? If not, how should it be processed? Can the pretraining data be formatted as in Figure 2?
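One way to handle a book-style txt is sketched below: merge hard-wrapped lines into paragraphs, emit one paragraph per line, and drop chunks over the model's token limit. This is an illustration, not an official preprocessing step; the model name, file paths, and the 2048 limit are placeholders:

```python
# Sketch: turn a hard-wrapped book .txt into one-sample-per-line
# pretraining data. Paths, model name, and MAX_TOKENS are placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-14B-Chat", trust_remote_code=True
)
MAX_TOKENS = 2048  # assumption: set to your model's context length

with open("book.txt", encoding="utf-8") as f:
    raw = f.read()

# Blank lines separate paragraphs; join the wrapped lines inside each one.
paragraphs = [p.replace("\n", "") for p in raw.split("\n\n") if p.strip()]

with open("book_pretrain.txt", "w", encoding="utf-8") as out:
    for p in paragraphs:
        if len(tokenizer.encode(p)) <= MAX_TOKENS:
            out.write(p + "\n")  # one training sample per line
```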
> That's normal, that's Huanhuan after all. You insult Huanhuan once and she's not even allowed to talk back? Hahahaha
> That's normal, that's Huanhuan after all. You insult Huanhuan once and she's not even allowed to talk back?

Amazing, it reaches human level in this respect.
We just held an Ascend + llamafactory exchange session, which covered some of the basics.
> Did the AMP check at the start pass or fail?

With AMP enabled, the check passed, but training then hit a gradient explosion, and the training accuracy has not improved.
> Can you try using `amp=False` and `half=False`?

I have tried that; there is still a problem with the loss, and the accuracy increased strangely,...
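For context, disabling AMP amounts to the following in plain PyTorch. This is a generic sketch of the mechanism, not this trainer's actual API; `use_amp` here is a local stand-in for the `amp=False` flag discussed above:

```python
# Generic PyTorch sketch: with use_amp=False, autocast and GradScaler
# become no-ops and the forward/backward pass runs in full FP32.
import torch
from torch import nn

use_amp = False  # stand-in for the amp=False setting above
device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(16, 2).to(device)      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(8, 16, device=device)    # placeholder batch
y = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=use_amp):
    loss = criterion(model(x), y)
scaler.scale(loss).backward()  # scale() is the identity with AMP off
scaler.step(optimizer)
scaler.update()
```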