David Sidler
Running the allreduce8Read kernel across 64 vGPUs (in CPX mode) revealed synchronization bugs. The PR addresses them by:
- Synchronizing threads before signaling that the outputs (outChannels) are valid, to guarantee ordering...
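The "synchronize before signaling" idea can be sketched in CUDA as follows. This is a minimal illustration, not the PR's code: the kernel name `produceAndSignal`, the `readyFlag` variable, and the single-block launch are all hypothetical, and the real allreduce8Read kernel and its outChannels signaling are not reproduced here. Each thread fences its own writes, the block synchronizes, and only then does one thread publish the flag, so a consumer that observes the flag should also observe every output element.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s\n",                       \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: each thread writes one output element, then the block
// publishes a flag telling a consumer that the output is valid.
__global__ void produceAndSignal(int* out, volatile int* readyFlag, int n) {
    int i = threadIdx.x;
    if (i < n) {
        out[i] = 2 * i;
    }

    // Make this thread's write visible beyond the block before any signal.
    __threadfence_system();

    // Wait until every thread in the block has written and fenced.
    // Without this barrier, thread 0 could raise the flag while other
    // threads are still writing their elements.
    __syncthreads();

    if (threadIdx.x == 0) {
        *readyFlag = 1;  // signal: output is valid
    }
}

int main() {
    const int n = 256;  // one block, one element per thread (assumption)
    int* out = nullptr;
    int* flag = nullptr;
    CHECK(cudaMalloc(&out, n * sizeof(int)));
    CHECK(cudaMalloc(&flag, sizeof(int)));
    CHECK(cudaMemset(flag, 0, sizeof(int)));

    produceAndSignal<<<1, n>>>(out, flag, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    int hostOut[n];
    int hostFlag = 0;
    CHECK(cudaMemcpy(hostOut, out, n * sizeof(int), cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&hostFlag, flag, sizeof(int), cudaMemcpyDeviceToHost));

    // Verify that the flag was raised and every element was written.
    bool ok = (hostFlag == 1);
    for (int i = 0; i < n && ok; ++i) {
        ok = (hostOut[i] == 2 * i);
    }
    printf("flag=%d, outputs %s\n", hostFlag, ok ? "valid" : "INVALID");

    CHECK(cudaFree(out));
    CHECK(cudaFree(flag));
    return ok ? 0 : 1;
}
```

The host-side check here only confirms the final state after the kernel completes; it does not exercise the race itself, which would require a concurrent consumer polling the flag. HIP offers intrinsics with the same names (`__syncthreads`, `__threadfence_system`), so the same structure should carry over, though that port is an assumption and is not shown.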
noCI