Connor Anderson

8 comments of Connor Anderson

I'm also curious about this. What was the reasoning for switching from `exp` to `sigmoid`? Was it just to keep the result bounded?

Ah, I see. I wonder if you could use something like `ReLU(x) + 1`. Then your gradient would always be nice and strong, and the constant would prevent the divide...

Sure, I understand that. But the gradient of the `ReLU` is bounded (it's constant) as well. And it's a simpler function, without the vanishing gradient problem of the `sigmoid`. I...
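To make the comparison concrete, here is a minimal plain-Python sketch (not the project's actual code) of the gradients of the two candidate functions: the `sigmoid` gradient vanishes for large `|x|`, while `ReLU(x) + 1` keeps the output at or above 1 (a safe denominator) with a gradient that is a constant 1 on the active side.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative s * (1 - s): shrinks toward 0 as |x| grows
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_plus_one(x):
    # always >= 1, so dividing by it can never blow up
    return max(x, 0.0) + 1.0

def relu_plus_one_grad(x):
    # constant 1 wherever the unit is active, 0 otherwise
    return 1.0 if x > 0 else 0.0

for x in (-6.0, 0.0, 6.0):
    print(x, sigmoid_grad(x), relu_plus_one_grad(x))
```

At `x = 6` the sigmoid gradient is already below 0.01, whereas the `ReLU(x) + 1` gradient is still exactly 1, which is the "nice and strong" property discussed above.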

@wanglouis49 Thanks for your insight.

I'd like to, I just haven't had the time to get to it. I'm also open to other people trying it out and sharing results!

For anyone else encountering this problem, the issue occurs when the effective batch size is 1 (which happens for CUB when using 2 GPUs on the very last validation batch)....
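The comment doesn't name the failing layer, but batch-norm-style statistics are a common culprit here: with a single sample, the batch variance is exactly zero, so the normalization divides by zero. A toy sketch of the arithmetic (plain Python, not the actual model code):

```python
def batchnorm_1d(batch, eps=0.0):
    # simplified batch normalization over the batch dimension
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

print(batchnorm_1d([1.0, 3.0]))  # a batch of 2 normalizes fine
try:
    batchnorm_1d([2.0])          # a batch of 1 has zero variance
except ZeroDivisionError:
    print("effective batch size 1 -> zero variance -> division by zero")
```

This is why the problem only shows up on the very last validation batch: the dataset size modulo (num GPUs × batch size) can leave a single leftover sample on one device.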

`grid_sample()` requires coordinates normalized between -1 and 1: see [here](https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html#:~:text=normalized%20by%20the%20input%20spatial%20dimensions.%20Therefore%2C%20it%20should%20have%20most%20values%20in%20the%20range%20of%20%5B%2D1%2C%201%5D). This is likely the issue based on the information provided here. Any points that back-warp to values outside of [-1,...
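As a quick sanity check for this, pixel coordinates can be mapped into `grid_sample()`'s normalized range before warping. A minimal sketch, assuming the `align_corners=True` convention (pixel 0 maps to -1 and pixel `W-1` maps to +1); the function name here is just for illustration:

```python
def to_grid_sample_coords(x_pix, y_pix, width, height):
    # map pixel indices to grid_sample's normalized [-1, 1] range
    # (align_corners=True convention: 0 -> -1.0, width-1 -> +1.0)
    return (2.0 * x_pix / (width - 1) - 1.0,
            2.0 * y_pix / (height - 1) - 1.0)

print(to_grid_sample_coords(0, 0, 64, 64))    # (-1.0, -1.0), top-left
print(to_grid_sample_coords(63, 63, 64, 64))  # (1.0, 1.0), bottom-right
print(to_grid_sample_coords(80, 32, 64, 64))  # x component > 1: out of range
```

Any coordinate that lands outside [-1, 1] after this mapping is outside the source image, and what you get back depends on the `padding_mode` argument (zeros by default).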