Xiangchendong issues

Results: 7 issues of Xiangchendong

why do not support load from local dir but always download

Why is loading from a local directory not supported, instead of always downloading? May I open a pull request for this?

[Feature] performance problem

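### Is your feature request related to a problem? Please describe.

I really appreciate the work you have all done! I have a small question. I noticed that the README has a table of throughput and GPU memory usage in which BMTrain significantly outperforms DeepSpeed-Megatron, and I am curious where this optimization mainly comes from. With the same logic, why can BMTrain support a larger batch size and achieve higher throughput? Does the GPU configuration also play a role (would SXM machines erase this gap because of their high bandwidth)? I think an optimization like this is definitely top-tier systems-conference work. Would you be interested in analyzing the optimization points and writing them up as a paper for submission? I would very much like to use the BMTrain framework, but seeing only that it is good, without knowing why it is good, leaves me uneasy.

### Describe the solution you'd like

Same as above.

### Describe alternatives you've considered

_No response_...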

enhancement

Xiangchendong

why do not support load from local dir but always download

[Feature] performance problem

style train

Performance problem

[REQUEST] dynamic batch size with gradient accumulate

H100 attn kernel bug

fix h100 bench interface