David Sidler

Results: 1 issue by David Sidler

Running the allreduce8Read kernel across 64 vGPUs (in CPX mode) revealed synchronization bugs. The PR addresses them by:

- Synchronizing threads before signaling that the outputs (outChannels) are valid, to guarantee ordering...
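
The ordering idea described above (every thread finishes its writes before any thread signals that the output is valid) can be sketched on the host with plain C++ threads. This is not the PR's code: the function `run_once`, the counter `arrived`, and the flag `ready` are hypothetical names, the hand-rolled counter stands in for a block-wide barrier, and the atomic flag stands in for the channel signal.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical host-side analogue of "synchronize, then signal".
// Workers play the role of GPU threads, `out` the output buffer,
// `arrived` a barrier, and `ready` the "output is valid" signal.
int run_once(int num_workers) {
    if (num_workers <= 0) return 0;

    std::vector<int> out(num_workers, 0);
    std::atomic<int> arrived{0};
    std::atomic<bool> ready{false};
    int observed = 0;

    // Consumer: waits for the signal, then reads the whole buffer.
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        for (int v : out) observed += v;
    });

    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) {
        workers.emplace_back([&, i] {
            out[i] = i + 1;  // each worker writes its own slot
            // Arrive at the barrier; release publishes the write above.
            arrived.fetch_add(1, std::memory_order_release);
            if (i == 0) {
                // Only the signaling worker waits for everyone else.
                while (arrived.load(std::memory_order_acquire) != num_workers) {
                    std::this_thread::yield();
                }
                // All writes are now visible to this thread, so the
                // signal cannot overtake any of them.
                ready.store(true, std::memory_order_release);
            }
        });
    }

    for (auto& w : workers) w.join();
    consumer.join();
    return observed;
}
```

Calling `run_once(8)` returns 36, the sum of 1 through 8. If worker 0 set `ready` without first waiting on `arrived`, the consumer could read slots that other workers had not written yet, which is the kind of ordering bug the description refers to.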

noCI