Charlie comments


Results: 6 comments by Charlie

Does not generate docstring with @dataclass decorator

Loss does not drop when using Liger Kernel at Qwen2.5

> @Arcmoon-Hu good to know that. I am aware of the transformer issue and will fix it ASAP

Is it fixed?

Loss does not drop when using Liger Kernel at Qwen2.5

> In my case, using Liger Kernel for Qwen 2.5 VL with sequence parallelism, the grad norm quickly goes to NaN

Hi! What framework supports Liger Kernel and sequence...

Some questions about training

> Have you solved it? I'm also very confused... As for the step count, adjusting accumulative_counts has hardly any effect on the total number of steps... very strange

+1

Some questions about training

Same problem here: accumulative_counts does not seem to take effect, and the actual number of steps matches the accumulative_counts=1 case.

[Feature]: Support tool calls for DeepSeek.

mark