Rogue Knight
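Same here. I also noticed that his data has one more dimension, attr, than the original data. Do you know what this dimension represents? The original data provided in DCRNN doesn't seem to have it.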
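His batch is NoneType, so you need to check whether the Batch.from_data_list he used before is actually being called. That said, I think there would still be problems even with it added.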
me too!
We trained for 250 and 5500 epochs on qwen2.5-3b, but neither reflected R1's self-reflectiveness. The outputs from the training are shown below: Response (train 250 epoch): **reasoning** When Irene works...
> > We trained for 250 and 5500 epochs on qwen2.5-3b, but neither reflected R1's self-reflectiveness. The outputs from the training are shown below: > > Response (train 250 epoch):...
> > And how many steps did you do it for? You need at least 300 steps which we wrote about in our guide: https://docs.unsloth.ai/basics/reasoning-grpo > > Also our notebooks...