petals
Automatic test for failed backward
Hi! Can you still recall what problem you meant to report?
edit: talked to Dmitry, here's what he meant: we need a test that checks whether the model can still run the backward pass if one of the servers used for the forward pass has left the network (or failed).
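To make the scenario concrete, here's a rough toy sketch of what such a test could look like. It does not use the real petals API: `ToyServer`, `ToyClient` and the replica lookup are hypothetical stand-ins I made up for illustration, and each "block" is just a scalar multiply. The idea is only to show the shape of the test: run forward, kill one of the servers that served it, then check that backward still returns the right gradient by falling back to another server hosting the same block.

```python
class ToyServer:
    """Hypothetical stand-in for a server hosting one block: y = weight * x."""

    def __init__(self, weight):
        self.weight = weight
        self.alive = True

    def forward(self, x):
        if not self.alive:
            raise ConnectionError("server left the network")
        return self.weight * x

    def backward(self, x, grad_out):
        if not self.alive:
            raise ConnectionError("server left the network")
        # For y = weight * x, dy/dx = weight (x is unused for a linear block).
        return self.weight * grad_out


class ToyClient:
    """Hypothetical client: tries each replica of a block until one answers."""

    def __init__(self, replicas_per_block):
        self.replicas = replicas_per_block  # one list of servers per block

    def _call(self, block_idx, method, *args):
        for server in self.replicas[block_idx]:
            try:
                return getattr(server, method)(*args)
            except ConnectionError:
                continue  # this server is gone, try the next replica
        raise RuntimeError(f"no live server for block {block_idx}")

    def forward(self, x):
        inputs = []  # keep each block's input so backward can be replayed
        for i in range(len(self.replicas)):
            inputs.append(x)
            x = self._call(i, "forward", x)
        return x, inputs

    def backward(self, inputs, grad_out):
        for i in reversed(range(len(self.replicas))):
            grad_out = self._call(i, "backward", inputs[i], grad_out)
        return grad_out


def test_backward_survives_server_failure():
    primary = [ToyServer(2.0), ToyServer(3.0)]
    backup = [ToyServer(2.0), ToyServer(3.0)]
    client = ToyClient([[primary[0], backup[0]], [primary[1], backup[1]]])

    out, inputs = client.forward(1.5)
    assert out == 9.0  # 1.5 * 2.0 * 3.0

    # A server that took part in the forward pass leaves before backward.
    primary[1].alive = False

    grad = client.backward(inputs, 1.0)
    assert grad == 6.0  # d(out)/d(x) = 2.0 * 3.0
    return grad
```

A real test would presumably do the same thing against actual servers: start two servers for the same blocks, run forward, shut one down, and compare the gradients from backward against a local reference model.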