Results: 2 comments by Dan Yao

Because qloop's fwd and bwd use different layouts, we refactored dropout to decouple fwd and bwd. However, the refactored dropout adds a lot of overhead to fwd; there are...

I think we should combine backward and backward_dropout.