M1ngXU

25 comments by M1ngXU

> * Tensor with 2-max-values in Max Last stores wrong gradients #111

According to your last comment in that issue, we just have to change the docstring?

> Another option is to make in/out height/width part of the Conv2D specification. It's a tooooon of generic parameters though... Also unclear how to verify that all the parameters are...

> > Another option is to make in/out height/width part of the Conv2D specification. It's a tooooon of generic parameters though... Also unclear how to verify that all the parameters...
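For reference, the idea quoted above can be sketched roughly as follows. This is only an illustration, not dfdx's actual API: `conv_out_dim`, `dims_match`, and `Conv2DSpec` are hypothetical names, and the sketch assumes no dilation and symmetric padding. It shows how many generic parameters the approach needs and one possible way to check them on stable Rust, through an associated constant.

```rust
/// Output size along one spatial axis of a 2D convolution
/// (assumption: no dilation, same padding on both sides).
/// Returns `None` if the stride or kernel is zero, or the kernel
/// does not fit inside the padded input.
pub const fn conv_out_dim(
    input: usize,
    kernel: usize,
    stride: usize,
    padding: usize,
) -> Option<usize> {
    let padded = input + 2 * padding;
    if stride == 0 || kernel == 0 || kernel > padded {
        return None;
    }
    Some((padded - kernel) / stride + 1)
}

/// Helper: does the computed size equal the declared one?
const fn dims_match(expected: Option<usize>, actual: usize) -> bool {
    match expected {
        Some(v) => v == actual,
        None => false,
    }
}

/// Hypothetical spec type that carries input AND output sizes as
/// const generics (kernel K, stride S, padding P).
pub struct Conv2DSpec<
    const K: usize,
    const S: usize,
    const P: usize,
    const IN_H: usize,
    const IN_W: usize,
    const OUT_H: usize,
    const OUT_W: usize,
>;

impl<
        const K: usize,
        const S: usize,
        const P: usize,
        const IN_H: usize,
        const IN_W: usize,
        const OUT_H: usize,
        const OUT_W: usize,
    > Conv2DSpec<K, S, P, IN_H, IN_W, OUT_H, OUT_W>
{
    /// True only if the declared output sizes agree with the formula.
    pub const VALID: bool = dims_match(conv_out_dim(IN_H, K, S, P), OUT_H)
        && dims_match(conv_out_dim(IN_W, K, S, P), OUT_W);
}
```

With this sketch, `Conv2DSpec::<3, 1, 0, 28, 28, 26, 26>::VALID` evaluates to `true`, while declaring 28x28 outputs for the same kernel gives `false`. Seven const parameters for a single layer is the verbosity the quoted comment worries about.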

> Downside of `convolution-rs` is that it uses ndarray and probably ends up allocating (which is why the reported benchmarks are so slow).
>
> My current plan is...

> Currently looking around for a good reference implementation for backward operation. I've gathered that "conv transpose" is part of it (both from pytorch ConvTranspose2d documentation, and convolution-rs also includes...

> The second line is somehow necessary to infer the `_`.

The second line is obviously necessary since the model is just a tuple with unrelated elements xD

> Was not expecting to get the equivalent of "LazyLinear" for free this easily!

What is "LazyLinear"?

> I think having a Linear module is much better than just using a tuple. I feel like if you specifically want no bias, then just use `MatMul` i think...

Already closed by #247, with [this](https://github.com/coreylowman/dfdx/blob/d79ca599d7b839ee9aa1577ef4f646625942ce07/examples/nightly-conv-net.rs) example

Why not mix cuDNN with custom CUDA kernels? Tensors in cuDNN are a descriptor plus the data; the data is just an allocation, like a CUDA slice (different representations, but...