Valérian Rey
It seems that Python 3.13 wheels have been available since the release of version 3.2.7.post2 (https://pypi.org/project/scs/3.2.7.post2/#files). I can now install scs in a Python 3.13 environment....
Hi! We've been experimenting a bit with [jax](https://github.com/jax-ml/jax) lately, but making a jax version would require quite a lot of work. In the long term, I would love to have...
Ok, then I'm not sure this PR is such a good idea. I'm gonna keep this open for now.
@PierreQuinton That's a pretty good prerequisite for #458
Hi Emile! Thanks a lot for all the work! We didn't know that torchjd didn't work for sparse tensors, and we never thought about the graph neural network use-case, so...
> Might be worth mentioning in the docs that custom autograd Functions can be a viable workaround for `vmap()`-compatible code though for users who want to have control over their...
This is big for architectures with very big linear layers, like AlexNet. For AlexNet, on cuda, with batch_dim=0, this leads to: - Double max batch size (from batch_size=19 to batch_size=38...
Also pretty big for Transformers (with the change I suggested to handle higher-order tensors). Times for forward + backward on WithTransformerLarge with BS=256, A=Mean on cuda:0. Reduced from 3.13...
This seems to break `NoFreeParam` (tiny errors) and `ModuleReuse` (large errors). Need to investigate that. For `ModuleReuse`, my guess is that it simply doesn't consider cross terms anymore, so it's...
@PierreQuinton I found a way to compute the gramian with autograd with no cross terms from module reuse / inter-module param reuse: 30fdc0078be5. Basically, the idea is to have a...
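For readers less familiar with the Gramian discussed in these comments, here is a minimal sketch of one well-known reason linear layers can be handled cheaply. It is an illustration under stated assumptions, not the code from the PR or from torchjd: for a single linear layer `y = W x`, the per-sample gradient with respect to `W` is the outer product of the output gradient and the input, so the Gramian entry for samples `i` and `j` factorizes as `(g_i · g_j) * (x_i · x_j)` and the per-sample weight gradients never need to be materialized. All function names below are hypothetical.

```python
# Hypothetical illustration (not torchjd code): Gramian of per-sample weight
# gradients for one linear layer y = W x, computed two ways.

def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def outer(g, x):
    """Outer product g x^T, i.e. the gradient of the loss w.r.t. W for one sample."""
    return [[gi * xj for xj in x] for gi in g]

def frobenius(a, b):
    """Frobenius inner product of two matrices given as lists of rows."""
    return sum(dot(row_a, row_b) for row_a, row_b in zip(a, b))

def gramian_explicit(inputs, grad_outputs):
    """Build every per-sample weight gradient, then take pairwise inner products."""
    grads = [outer(g, x) for g, x in zip(grad_outputs, inputs)]
    return [[frobenius(gi, gj) for gj in grads] for gi in grads]

def gramian_factorized(inputs, grad_outputs):
    """Same Gramian without forming any weight-shaped gradient."""
    n = len(inputs)
    return [
        [dot(grad_outputs[i], grad_outputs[j]) * dot(inputs[i], inputs[j])
         for j in range(n)]
        for i in range(n)
    ]

# Two samples, 3 input features, 2 output features (made-up values).
inputs = [[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]]
grad_outputs = [[1.0, -1.0], [2.0, 0.5]]

explicit = gramian_explicit(inputs, grad_outputs)
factorized = gramian_factorized(inputs, grad_outputs)
```

Both functions return the same 2x2 matrix here. Note that this factorization only covers a single use of the layer: if the same weight is applied twice in one forward pass, the true per-sample gradient is a sum of two outer products, and the inner product of two such sums contains cross terms that a per-usage computation would miss.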