YY Lin comments

Results: 9 comments of YY Lin

Will it be possible that tfx.transform support NLP text processing directly?

> Hi, we have an [example](https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_utils.py#L122) for vocabulary in TFT

Yep, this is a vocabulary for String2ID. However, it's not enough for NLP's String2IDs encoding. I checked tensorflow transform and tensorflow...
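
One way to read the distinction drawn in this reply: a vocabulary lookup maps one whole string to one ID (String2ID), while NLP preprocessing has to turn a sentence into a sequence of IDs (String2IDs), which also requires tokenization, an out-of-vocabulary ID, and padding. The sketch below illustrates that reading in plain Python. It does not use the TFX or TensorFlow Transform APIs, and every name in it (`build_vocab`, `string_to_id`, `text_to_ids`, the `<pad>`/`<unk>` IDs, whitespace tokenization) is an assumption made for illustration.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # reserved IDs (an assumption of this sketch)


def build_vocab(texts):
    """Build a token -> ID map from whitespace-split, lowercased texts."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: i + 2 for i, tok in enumerate(ordered)}  # 0 and 1 are reserved


def string_to_id(value, vocab):
    """String2ID: treat the whole string as one categorical value."""
    return vocab.get(value, UNK_ID)


def text_to_ids(text, vocab, max_len):
    """String2IDs: tokenize, look up each token, then truncate and pad."""
    ids = [vocab.get(tok, UNK_ID) for tok in text.lower().split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


vocab = build_vocab(["the taxi was late", "the driver was kind"])
sentence = "The driver was early"
single_id = string_to_id(sentence, vocab)       # the whole sentence is not a vocab entry
id_sequence = text_to_ids(sentence, vocab, max_len=6)
print(single_id, id_sequence)
```

A single-value lookup collapses any unseen sentence to the unknown ID, whereas the sequence encoding keeps the known tokens and only marks `early` as unknown.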

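Please release a small-parameter version of ChatGLM that shares its tokenizer with ChatGLM-6B, so that PPO, the final step of RLHF, can be sped up as much as possible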

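Because the small-model version of ChatGLM still has not been released, among other reasons, I have paused development of ChatGLM support on my side for now.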

Add two features which supports training PPO in one graphic card for large model and ChatGLM-6B model support

> @yynil hello, if you want to support more models, can you add all models class in https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models? you can make a dir glm in https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/coati/models/glm I'll create another branch...

Add two features which supports training PPO in one graphic card for large model and ChatGLM-6B model support

Since ChatGLM is not willing to release a smaller model to the public for training a reward model, I'm suspending support for ChatGLM. My branch will then move to...

[FEATURE]: ChatGLM model support

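> Thanks a lot. Here are a few problems I ran into while running this code; they may all be related to my environment.
>
> * When loading the model, .half() is needed in 2 places
> * During reward model training, the loss becomes nan after the first batch; model.float() is needed for backward, then model.half() again for forward

You can go to my fork and check out the latest branch, Add_GLMChat. I refactored the code and added GLM's own (bs, 1, seq, seq) attention mask. During training, the critic sets use action to False by default, and the loss now decreases in a way that better matches GLM's style (it drops very quickly).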
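
As a rough illustration of what a (bs, 1, seq, seq) attention mask can look like, here is a minimal sketch of a prefix-LM style mask: every position may attend to the whole prompt (context), while positions after the prompt are causal. It is written in plain Python with nested lists so it runs without any framework. The function name `build_prefix_lm_mask`, the `context_lengths` argument, and the convention that `True` means "may attend" are assumptions of this sketch; the code in the Add_GLMChat branch and in ChatGLM itself may differ, including in the boolean convention.

```python
def build_prefix_lm_mask(context_lengths, seq_len):
    """Return a nested list of shape (bs, 1, seq_len, seq_len).

    mask[b][0][q][k] is True when query position q may attend to key position k.
    context_lengths[b] is the number of prompt tokens in sample b (hypothetical input).
    """
    batch = []
    for ctx in context_lengths:
        rows = []
        for q in range(seq_len):
            # Prompt positions (k < ctx) are visible to every query;
            # beyond the prompt, a query only sees positions up to itself.
            rows.append([k < ctx or k <= q for k in range(seq_len)])
        batch.append([rows])  # singleton dimension that broadcasts over heads
    return batch


mask = build_prefix_lm_mask(context_lengths=[2, 3], seq_len=4)
for row in mask[0][0]:
    print("".join("1" if visible else "0" for visible in row))
```

The singleton second dimension lets one mask per sample be shared across all attention heads.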

[FEATURE]: ChatGLM model support

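Yes. Otherwise tokenizing with sentencepiece is too slow to be acceptable.

On Apr 23, 2023, at 14:21, KevinFan0 wrote:

> Thanks a lot. Here are a few problems I ran into while running this code; they may all be related to my environment.
>
> * When loading the model, .half() is needed in 2 places
> * During reward model training, the loss becomes nan after the first batch; model.float() is needed for backward, then model.half() again for forward
>
> You can go to my fork and check out the latest branch, Add_GLMChat. I refactored the code and added GLM's own (bs, 1, seq, seq) attention mask. During training, the critic sets use action to False by default, and the loss now decreases in a way that better matches GLM's style (it drops very quickly).
>
> I looked at the latest branch and have a question about the dataset: to prepare the dataset, should I still run easy_dataset first, following the README in your main branch?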

[FEATURE]: ChatGLM model support

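It is just plain text, one line after another, that adds domain knowledge, for example textbooks.

On Apr 23, 2023, at 14:49, KevinFan0 wrote:

> Yes. Otherwise tokenizing with sentencepiece is too slow to be acceptable.
>
> On Apr 23, 2023, at 14:21, KevinFan0 wrote:
>
> Thanks a lot. Here are a few problems I ran into while running this code; they may all be related to my environment. When loading the model, .half() is needed in 2 places. During reward model training...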

[FEATURE]: ChatGLM model support

> That is because your actor and critic were not set to train, or were not loaded. Print the trainable parameters; the count should be 0.
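
The check suggested here can be sketched as follows. With PyTorch, the usual idiom is to sum `p.numel()` over `model.parameters()` for parameters whose `requires_grad` is true. To keep the example runnable without PyTorch, the sketch uses two stand-in classes, `FakeParam` and `FakeModel`, which are hypothetical and only mimic that interface; `count_trainable` is likewise an illustrative name, not something taken from the code under discussion.

```python
class FakeParam:
    """Hypothetical stand-in for a framework parameter (not a PyTorch class)."""

    def __init__(self, size, requires_grad=True):
        self.size = size
        self.requires_grad = requires_grad

    def numel(self):
        # Number of scalar elements in this parameter.
        return self.size


class FakeModel:
    """Hypothetical stand-in exposing a parameters() iterator."""

    def __init__(self, params):
        self._params = list(params)

    def parameters(self):
        return iter(self._params)


def count_trainable(model):
    """Count elements of all parameters that would receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


frozen_actor = FakeModel([FakeParam(1000, False), FakeParam(10, False)])
partly_trainable_critic = FakeModel([FakeParam(1000, False), FakeParam(10, True)])
print(count_trainable(frozen_actor))
print(count_trainable(partly_trainable_critic))
```

If the count for the actor or critic is 0, the optimizer has nothing to update, which is consistent with the diagnosis in the comment above.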

[FEATURE]: ChatGLM model support

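With or without the attention mask, ChatGLM's output is completely different.

> On May 6, 2023, at 10:02, zhangyuanscall wrote:
>
> > You can go to my fork and check out the latest branch, Add_GLMChat. I refactored the code and added GLM's own (bs, 1, seq, seq) attention mask. During training, the critic sets use action to False by default, and the loss now decreases in a way that better matches GLM's style (it drops very quickly).
>
> Is the (bs, 1, seq, seq) attention mask required? From what I can see, the official ChatGLM code does not need an attention mask.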