axlearn
Optimize performance with unroll=8
Enable unroll on GPU for better communication interleaving.
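
For illustration, a minimal sketch of what an unroll factor of 8 means, assuming the setting corresponds to the `unroll` argument of `jax.lax.scan`. The `layer` function, the sizes, and `run_stack` are hypothetical and not taken from axlearn. With `unroll=8`, eight iterations are inlined into each loop body, which gives the compiler a larger window in which to schedule work; the numerical result should match the rolled version.

```python
import jax
import jax.numpy as jnp

NUM_LAYERS = 16  # hypothetical layer count, divisible by the unroll factor
DIM = 4          # hypothetical hidden size


def layer(carry, w):
    # One stand-in "layer": a matmul followed by a nonlinearity.
    return jnp.tanh(carry @ w), None


def run_stack(x, weights, unroll):
    # `unroll` sets how many scan iterations are inlined per loop body.
    out, _ = jax.lax.scan(layer, x, weights, unroll=unroll)
    return out


weights = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (NUM_LAYERS, DIM, DIM))
x = jnp.ones((2, DIM))

# `unroll` changes the traced program, so it is marked static for jit.
jitted = jax.jit(run_stack, static_argnames="unroll")
rolled = jitted(x, weights, unroll=1)
unrolled = jitted(x, weights, unroll=8)
```

Whether a larger unroll factor actually helps depends on the model, the hardware, and the compiler's scheduling, so the value is best confirmed by profiling.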