lbann
Add distributed Scatter/Gather
- Enables distconv implementations of Scatter and Gather layers
- Implements NVSHMEM based RMA kernels for scatter/gather on DiHydrogen tensors
- Adds example applications in `applications/graph/DistConvGNN/synthetic` for benchmarking distributed Scatter, Gather, and GCN
- Adds unit tests
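
For reviewers less familiar with the two operations, here is a minimal single-process sketch of the semantics the distributed layers are meant to reproduce. It is plain Python, not the LBANN or DiHydrogen API: the helper names `gather` and `scatter`, the toy graph, and the choice to accumulate on duplicate indices are assumptions made for illustration only.

```python
def gather(values, indices):
    """Reference gather over rows: out[i] = values[indices[i]]."""
    return [list(values[i]) for i in indices]


def scatter(values, indices, num_rows):
    """Reference scatter over rows: out[indices[i]] += values[i].

    Assumption: rows that share a destination index are summed.
    """
    width = len(values[0])
    out = [[0.0] * width for _ in range(num_rows)]
    for row, idx in zip(values, indices):
        for j, v in enumerate(row):
            out[idx][j] += v
    return out


# Hypothetical toy graph: 3 nodes with 2 features each, 3 directed edges.
node_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
edge_sources = [0, 2, 2]
edge_targets = [1, 1, 0]

# Gather source-node features per edge, then scatter-add them onto targets.
messages = gather(node_features, edge_sources)
aggregated = scatter(messages, edge_targets, num_rows=3)
```

The gather-then-scatter pair is the aggregation step of a GCN-style layer, which is why the benchmarks cover Scatter, Gather, and GCN together. In the distributed case the rows named by `indices` may live on another rank, which is the access pattern the NVSHMEM RMA kernels are intended to serve.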
To do:
- [ ] Fix error in distconv identity layer causing mismatched mini-batch dimension