WayXG
Results: 2 issues of WayXG
We need pip install packaging before pip install flash-attn
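A minimal sketch of the install order this issue title describes, assuming only what the title states: `packaging` should be present before `flash-attn` is installed. The helper name `install_commands` is hypothetical and used for illustration only. The commands are printed rather than executed, so nothing is installed by running the sketch.

```python
import importlib.util
import sys
from typing import List


def install_commands(packaging_present: bool) -> List[List[str]]:
    """Return pip commands in the order suggested by the issue title.

    Hypothetical helper: not part of flash-attn or pip.
    """
    pip = [sys.executable, "-m", "pip", "install"]
    commands = []
    if not packaging_present:
        # Per the issue title, `packaging` goes first.
        commands.append(pip + ["packaging"])
    commands.append(pip + ["flash-attn"])
    return commands


# True if `packaging` is already importable in this environment.
has_packaging = importlib.util.find_spec("packaging") is not None

for cmd in install_commands(has_packaging):
    # Printed only; to run a command, pass the list to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

In a shell, the same order would be `pip install packaging` followed by `pip install flash-attn`.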
Thanks for the great work! I have some questions about the training configuration. For the training batch size, I assume that we will collect rollout_batch_size = 1024 trajectories into the...