Demon Boy

Results: 18 comments by Demon Boy

> Guys, is it the same as llama2? I have not tested it myself, but I saw the evaluation results Meta published, which are better and more advanced than llama2's, and I heard that...

> ```
> ~/repo/FastChat$ python -m fastchat.serve.model_worker --model-path ~/repo/models/Qwen-14B-Chat-Int4 --gptq-wbits 4 --gptq-groupsize 128 --model-names gpt-3.5-turbo
> 2023-09-28 14:36:05 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='~/repo/models/Qwen-14B-Chat-Int4', revision='main',...
> ```

+1, it would be great to have tokens/s/p (tokens per second per device) numbers for both training and inference.
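For context, the tokens/s/p figure requested here is straightforward to derive from the logged step time and batch shape. A minimal sketch (the function and variable names are illustrative, not from any framework):

```python
def tokens_per_sec_per_device(batch_size: int, seq_len: int,
                              grad_accum: int, num_devices: int,
                              step_time_s: float) -> float:
    """Throughput per device for one optimizer step.

    Tokens processed per optimizer step:
        batch_size * seq_len * grad_accum * num_devices
    Dividing by step time and device count yields tokens/s/p.
    """
    tokens_per_step = batch_size * seq_len * grad_accum * num_devices
    return tokens_per_step / step_time_s / num_devices

# Example: micro-batch 4, 2048-token sequences, 8 gradient-accumulation
# steps, 8 devices, 6.5 s per optimizer step.
rate = tokens_per_sec_per_device(4, 2048, 8, 8, 6.5)
```

For padded batches, subtracting pad tokens from `tokens_per_step` gives the effective rather than nominal throughput.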

> > > > Same question here; I would like to know the pretraining dataset format. Also, in the repo I saw that the pretraining script's command line sets dataset to wiki_demo; does that mean the dataset is data/wiki_demo.txt? Below is what I saw: (two screenshots; the image links have expired)
> > > Yes, the details are in dataset_info.json
> May I ask whether the configuration workflow for incremental pretraining data is as follows:
> 1....
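Since the answer above points at dataset_info.json, here is a hedged sketch of what a plain-text pretraining entry might look like. The field names follow the pattern used for the bundled wiki_demo dataset, but they are an assumption on my part; verify them against data/README.md in your LLaMA-Factory checkout before relying on them. `my_book` and `my_book.txt` are hypothetical placeholders:

```python
import json

# Hypothetical dataset_info.json entry for a custom plain-text
# pretraining corpus placed under the data/ directory.
entry = {
    "my_book": {
        "file_name": "my_book.txt",    # one training sample per line
        "columns": {"prompt": "text"}  # pretraining uses a single text column
    }
}
print(json.dumps(entry, indent=2, ensure_ascii=False))
```

The entry would then be merged into the existing dataset_info.json, and the dataset selected with `--dataset my_book` on the training command line (again, check the flag name against the repo's examples).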

> @1737686924 So pretraining data only counts as a line when it is manually broken with Enter? If a passage runs together, does it not count as a line, and does each line also have to stay under the model's maximum token length? Figure 1: ![image](https://github.com/hiyouga/LLaMA-Factory/assets/156283920/41cd6eb3-afe0-4fd8-ad21-5fe8c0404bf9) Figure 2: ![image](https://github.com/hiyouga/LLaMA-Factory/assets/156283920/f72eeb59-ac42-4d13-b71e-bca114cc190b) Figure 1 is a book converted to txt; can it be used for pretraining as-is? If not, how should it be processed? And can the pretraining data use the format shown in Figure 2?
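One common way to handle the book-as-txt case asked about above is to re-segment the raw text so that each output line stays under the model's token limit. A hedged sketch, using a whitespace word count as a stand-in tokenizer (for real use, swap `count_tokens` for the actual model tokenizer, e.g. something like `len(tokenizer(s)["input_ids"])`):

```python
import re

def split_text_to_lines(text, max_tokens,
                        count_tokens=lambda s: len(s.split())):
    """Greedily pack sentences into lines of at most max_tokens tokens.

    Sentences are split on Chinese or English end-of-sentence punctuation;
    a single sentence longer than max_tokens still becomes its own line,
    so over-long lines should be checked for separately.
    """
    sentences = [s for s in re.split(r"(?<=[。!?.!?])\s*", text) if s]
    lines, current = [], ""
    for sent in sentences:
        candidate = (current + " " + sent).strip() if current else sent
        if count_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                lines.append(current)
            current = sent
    if current:
        lines.append(current)
    return lines
```

Each returned line can then be written out as one sample of the pretraining txt file.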

> That's normal, that's Huanhuan! You throw an insult at Huanhuan and she's not allowed to talk back? Hahaha ![image](https://github.com/datawhalechina/self-llm/assets/156283920/bd105a51-4f6d-42c3-9bb7-a9103b11e26a)

> That's normal, that's Huanhuan! You throw an insult at Huanhuan and she's not allowed to talk back? Amazing that it reaches human level in this respect.

I just attended an Ascend + LLaMA-Factory exchange meeting that covered some of the basics. ![image](https://github.com/hiyouga/LLaMA-Factory/assets/156283920/0f392574-4153-4ea0-a52e-6751ea3772de)

> Did the AMP check at the start pass or fail? With AMP, the check passed, but training hit a gradient explosion; the training accuracy has not improved.

> Can you try using `amp=False` and `half=False`? ![image](https://github.com/user-attachments/assets/2ca3f1e1-dbd8-42db-85ad-67a7d2356348) I have tried it, and I found that there is a problem with the loss; the accuracy increased strangely, ...
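Beyond turning AMP off as suggested above, the usual mitigations for a gradient explosion are lowering the learning rate, using loss scaling, or clipping gradients by global norm. A framework-agnostic sketch of global-norm clipping in pure Python (mirroring what `torch.nn.utils.clip_grad_norm_` does; the nested-list gradient layout here is illustrative):

```python
import math

def clip_grad_norm(grads, max_norm):
    """Scale all gradients in place so their global L2 norm is at most max_norm.

    Returns the pre-clip norm, which is worth logging on every step:
    a sudden jump in this value is the signature of a gradient explosion.
    """
    total = math.sqrt(sum(g * g for group in grads for g in group))
    if total > max_norm:
        scale = max_norm / (total + 1e-6)  # epsilon guards against division issues
        for group in grads:
            for i in range(len(group)):
                group[i] *= scale
    return total
```

With real PyTorch AMP, the equivalent step is to unscale the gradients first and then call the built-in clipper before `optimizer.step()`.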