Zhipeng Zhang
Thanks for the update. LGTM.
Let's first locate the problem. You can click into the ps logs to see the specific error — it's possible the ps (parameter server) crashed?
Sorry for the late reply. It does look like the ps crashed. Have you tried a smaller dataset (with fewer nodes)?
I have tested v1.0.7 and v2.0.4. The result is that neither of them supports attention masks:
- A: using flash attention with an attention mask
- B: not using...
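For reference, the attention-mask semantics being tested here can be sketched with a plain NumPy reference implementation (this is not the flash-attn kernel itself, just the masked-softmax behavior a mask-supporting kernel would have to match; the causal mask below is only an illustrative choice):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # q, k, v: (seq, dim); mask: (seq, seq) additive, 0 = keep, -inf = drop
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = scores + mask
    return softmax(scores) @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))

# illustrative causal mask: position i may only attend to positions <= i
mask = np.triu(np.full((4, 4), -np.inf), k=1)

out_masked = attention(q, k, v, mask)  # case A: with attention mask
out_plain = attention(q, k, v)         # case B: without mask
```

With the causal mask, row 0 can only attend to key 0, so its output equals `v[0]` exactly; the unmasked output generally differs.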
+1 — really awesome!
> > This looks interesting! How accurate is it?
>
> We randomly selected several parallel configurations and conducted "Memory Requirement" tests on the 7B llama2 model using a single...
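As a rough sanity check on what such a memory-requirement test should report at minimum, the weight memory of a 7B model can be estimated from the parameter count (assuming fp16/bf16 weights, 2 bytes per parameter; this is a back-of-envelope floor, excluding activations, KV cache, and optimizer state):

```python
# back-of-envelope weight memory for a 7B-parameter model
params = 7e9            # llama2-7B parameter count (approximate)
bytes_per_param = 2     # assumption: fp16/bf16 weights
weights_gib = params * bytes_per_param / 1024**3
# ~13 GiB for weights alone; a parallel configuration splits this
# across devices, e.g. tensor-parallel degree 4 -> ~3.3 GiB/device
per_device_gib = weights_gib / 4
```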