Peter Geng
Using "can" in this scenario seems a bit impolite, 🐶
Meanwhile, another open-source animator has been released; this one will copy it and then open-source it.
I don't think the problem is in this project's code (thanks for the project), but I can't figure out where the problem actually is.
Below are some params of the LoRA & Trainer setup:

```
MICRO_BATCH_SIZE = 4
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 7  # paper uses 3
LEARNING_RATE = 2e-5
...
```
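For reference, here is a minimal sketch of how constants like these are typically wired into a `peft` `LoraConfig` and a `transformers` `TrainingArguments`; the LoRA rank/alpha/dropout, target modules, and output directory below are illustrative assumptions, not values confirmed in this thread.

```python
# Minimal sketch with assumed values where noted; not this project's exact script.
from peft import LoraConfig
from transformers import TrainingArguments

MICRO_BATCH_SIZE = 4
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE  # 128 // 4 = 32
EPOCHS = 7
LEARNING_RATE = 2e-5

# LoRA adapter config; r/alpha/dropout/target_modules are assumed placeholders.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Trainer args: per-GPU micro-batches are accumulated so the effective
# batch size is MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS = 128.
training_args = TrainingArguments(
    output_dir="lora-out",  # assumed path
    per_device_train_batch_size=MICRO_BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION_STEPS,
    num_train_epochs=EPOCHS,
    learning_rate=LEARNING_RATE,
    logging_steps=10,
)
```

Because GRADIENT_ACCUMULATION_STEPS is derived from BATCH_SIZE, the effective batch size stays at 128 even if the per-GPU micro-batch is changed.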
@RG-sw With the default params but epochs set to 10 and the learning rate to 1e-3, it worked, but I don't think that's a good solution, as said above.
Same question here.
@ydli-ai For pretraining these different large models, does the GPU requirement scale linearly? For example, your 7B model takes 32xA100 for two days; for a 65B model (roughly 10x larger), would it take 32xA100 for 20 days, or 320xA100 for 2 days? Thanks.
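For what it's worth, a back-of-the-envelope sketch of the linear-scaling assumption in the question, assuming training FLOPs scale roughly linearly with parameter count at a fixed token budget and ignoring multi-node efficiency loss (real large runs usually scale worse than this):

```python
# Rough estimate only: assumes compute ~ parameter count at fixed data,
# and perfect scaling efficiency across GPUs.
base_params_b = 7      # 7B baseline model
base_gpus = 32         # 32x A100
base_days = 2

target_params_b = 65   # ~10x larger model
scale = target_params_b / base_params_b

gpu_days = base_gpus * base_days * scale           # total GPU-days needed
print(f"same 32 GPUs: ~{gpu_days / base_gpus:.0f} days")      # ~19 days
print(f"finish in 2 days: ~{gpu_days / base_days:.0f} GPUs")  # ~297 GPUs
```

Under this idealized assumption, both of the question's estimates are roughly right; in practice, communication overhead at larger node counts pushes the numbers higher.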
I have the same question about chapter 8: how was `data.ts` changed to support dynamic rendering? Thanks.