Jeff Wong
Thanks for your answer. The output shape of the original softmax loss tensor is `(batch_size, num_box)`, i.e., one scalar loss value per box. But in your implementation, the return of...
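For reference, a minimal sketch of how a per-box softmax (cross-entropy) loss ends up with shape `(batch_size, num_box)`; the tensor names and sizes here are illustrative assumptions, not taken from the implementation under discussion:

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes for illustration only.
batch_size, num_box, num_classes = 2, 4, 5

logits = torch.randn(batch_size, num_box, num_classes)      # per-box class scores
labels = torch.randint(0, num_classes, (batch_size, num_box))

# reduction="none" keeps one scalar loss per box, which is what gives
# the (batch_size, num_box) output shape described above.
per_box_loss = F.cross_entropy(
    logits.reshape(-1, num_classes),   # (batch_size * num_box, num_classes)
    labels.reshape(-1),                # (batch_size * num_box,)
    reduction="none",
).reshape(batch_size, num_box)

print(per_box_loss.shape)  # torch.Size([2, 4])
```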
> hi guys, do you only see this problem with multiple nodes? I get the same issue even on a single node with multiple processes (ranks). Any suggestions? same issue here, could...
@GyuminDev In the original paper, there is a hyper-parameter called the compression factor. This factor is used to reduce the number of feature maps in the tensor that is fed from a dense block into the following transition layer....
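A minimal sketch of how the compression factor is typically applied in a DenseNet transition layer (BN, ReLU, 1x1 conv, 2x2 average pool); the class and variable names are assumptions for illustration, not the actual code being discussed:

```python
import math
import torch
import torch.nn as nn

class Transition(nn.Module):
    """DenseNet-style transition layer with compression factor theta (0 < theta <= 1)."""

    def __init__(self, in_channels: int, theta: float = 0.5):
        super().__init__()
        # The 1x1 conv outputs floor(theta * in_channels) feature maps,
        # shrinking the tensor that the dense block feeds into this layer.
        out_channels = int(math.floor(theta * in_channels))
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.pool = nn.AvgPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        # Channel reduction via the 1x1 conv, then spatial downsampling.
        return self.pool(self.conv(torch.relu(self.bn(x))))

x = torch.randn(1, 256, 32, 32)
print(Transition(256, theta=0.5)(x).shape)  # torch.Size([1, 128, 16, 16])
```

With theta = 1 the transition layer keeps all feature maps; the paper's DenseNet-C/BC variants use theta = 0.5.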