Grad strides do not match bucket view strides.
When I use DDP, I get this warning. Which part of my code causes it?
UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [256, 256, 1, 1], strides() = [256, 1, 256, 256]
bucket_view.sizes() = [256, 256, 1, 1], strides() = [256, 1, 1, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
@CreamyLong Same warning. Have you solved the problem?
Hi, have you solved it? I have the same problem.
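Not a confirmed fix, but the two stride lists in the warning can at least be decoded. The sketch below is plain Python (no PyTorch needed); the helper names `contiguous_strides` and `channels_last_strides` are made up for illustration. It checks that the bucket view strides match the default contiguous (NCHW) layout for sizes [256, 256, 1, 1], while the grad strides match what a channels_last (NHWC) layout would give for the same sizes. That suggests, without proving it, that the gradient of a 1x1 conv weight is coming back in channels_last order.

```python
def contiguous_strides(sizes):
    """Strides of a default row-major (NCHW) tensor with the given sizes."""
    strides = []
    step = 1
    for size in reversed(sizes):
        strides.append(step)
        step *= size
    return list(reversed(strides))


def channels_last_strides(sizes):
    """Strides of a 4-D tensor laid out in channels_last (NHWC) order."""
    n, c, h, w = sizes
    return [h * w * c, 1, w * c, c]


# Values copied from the warning message above.
sizes = [256, 256, 1, 1]
grad_strides = [256, 1, 256, 256]
bucket_strides = [256, 1, 1, 1]

# Compare each reported stride list against the two candidate layouts.
print("grad is channels_last:", grad_strides == channels_last_strides(sizes))
print("bucket view is contiguous:", bucket_strides == contiguous_strides(sizes))
```

Because the last two dimensions are both 1, the two layouts address the same memory for this tensor, which is consistent with the warning saying this is not an error. To see which parameters are affected in your own model, one option is to loop over `model.named_parameters()` after `backward()` and print the names where `p.grad.stride()` differs from `p.stride()`. If the model or its inputs are converted with `memory_format=torch.channels_last`, that is a plausible place to start looking.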