singd
SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705)
Once KFAC is part of a [`curvlinops`](https://github.com/f-dangel/curvlinops/) release, we can try to remove all KFAC-related computations from this repository and use `curvlinops` instead. Currently this is not trivially possible,...
At the moment, all methods fall back to `.to_dense` and `.from_dense`.
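To illustrate what falling back to `.to_dense` and `.from_dense` means, here is a minimal sketch in NumPy. The class `DiagonalMatrix` and the method names `matmul_fallback` and `matmul_structured` are hypothetical and are not taken from this repository; only the `to_dense`/`from_dense` names come from the text above.

```python
import numpy as np


class DiagonalMatrix:
    """Hypothetical structured matrix that stores only its diagonal."""

    def __init__(self, diag):
        self.diag = np.asarray(diag, dtype=float)

    def to_dense(self):
        # Materialize the full square matrix.
        return np.diag(self.diag)

    @classmethod
    def from_dense(cls, dense):
        # Project a dense matrix onto the structure (keep the diagonal only).
        return cls(np.diag(dense))

    def matmul_fallback(self, other):
        # Generic fallback: densify both operands, multiply, re-project.
        return type(self).from_dense(self.to_dense() @ other.to_dense())

    def matmul_structured(self, other):
        # Specialized version: a product of diagonal matrices is element-wise.
        return type(self)(self.diag * other.diag)


A = DiagonalMatrix([1.0, 2.0, 3.0])
B = DiagonalMatrix([4.0, 5.0, 6.0])
fallback = A.matmul_fallback(B)
structured = A.matmul_structured(B)
```

Both paths give the same result here, but the fallback allocates dense matrices that the specialized version never needs.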
During one of our internal discussions, we realized that the code is starting to accumulate multiple re-scaling operations, which are required to avoid overflow/underflow when using `float16` (e.g. using the average...
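As a small illustration of why the order of re-scaling matters in `float16`, the following NumPy sketch compares summing before scaling with scaling before summing. The values and variable names are made up for the example and do not come from this repository.

```python
import numpy as np

# Four entries that are each representable in float16 (max finite value: 65504).
x = np.full(4, 40000.0, dtype=np.float16)
n = np.float16(x.size)

with np.errstate(over="ignore"):
    # Sum first, scale afterwards: the sum 160000 overflows to inf.
    sum_then_scale = x.sum(dtype=np.float16) / n

# Scale first, sum afterwards: every intermediate value stays within range.
scale_then_sum = (x / n).sum(dtype=np.float16)
```

Here `sum_then_scale` is infinite, while `scale_then_sum` recovers the average of the entries.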
The Kronecker approximation depends on the NN architecture. We should support important GNN layers such as the [`GCNConv` layer](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#convolutional-layers). A reference Kronecker approximation for the `GCNConv` layer can be found at...
We originally incorporated this function to support updates based on `K.T @ (a @ a.T) @ K` and `K.T @ (g @ g.T) @ K`, e.g. in the private ASDL implementation....
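For readers unfamiliar with the expression `K.T @ (a @ a.T) @ K`, the sketch below checks numerically that it equals the outer product of the projected quantity `K.T @ a`. The shapes (a square `K` and a matrix `a` with one column per sample) are assumptions made for illustration, not a statement about how this repository lays out its tensors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 3  # assumed: feature dimension and number of samples
K = rng.standard_normal((d, d))
a = rng.standard_normal((d, n))

# Direct evaluation forms the d x d outer product a @ a.T first.
direct = K.T @ (a @ a.T) @ K

# Equivalent evaluation: project once, then take the outer product.
Ka = K.T @ a
projected = Ka @ Ka.T
```

The same identity holds with `g` in place of `a`.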
We might be able to implement [`_extract_patches`](https://github.com/f-dangel/singd/blob/main/singd/optim/utils.py#L12-L40) more efficiently for some special cases, e.g.
- when `dilation=1`, we can look into using something like the [`im2col_2d`](https://github.com/kazukiosawa/asdl/blob/master/asdl/utils.py#L51-L71) function in ASDL (if...
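To make the `dilation=1` special case concrete, here is a rough NumPy sketch of patch extraction using `sliding_window_view`. It is neither the `_extract_patches` implementation from this repository nor ASDL's `im2col_2d`; the function name `extract_patches_dilation1` is hypothetical, and the sketch additionally assumes stride 1 and no padding. The output layout is also an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_patches_dilation1(x, kernel_size):
    """Extract patches from x of shape (batch, channels, height, width).

    Assumes dilation 1, stride 1 and no padding. Returns an array of shape
    (batch, out_h * out_w, channels * k1 * k2).
    """
    k1, k2 = kernel_size
    # Shape: (batch, channels, out_h, out_w, k1, k2); this is a view, no copy.
    windows = sliding_window_view(x, (k1, k2), axis=(2, 3))
    b, c, out_h, out_w = windows.shape[:4]
    # Move the spatial output axes in front of the channel axis, then flatten.
    windows = windows.transpose(0, 2, 3, 1, 4, 5)
    return windows.reshape(b, out_h * out_w, c * k1 * k2)


x = np.arange(2 * 3 * 4 * 4, dtype=float).reshape(2, 3, 4, 4)
patches = extract_patches_dilation1(x, (3, 3))
```

The windows are created as a strided view, so the only copy happens in the final `reshape`.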
For example:
- Making sure two structured matrices are of the same type before adding/matrix-multiplying/...
- Making sure two structured matrices are of the same dimension before adding/matrix-multiplying/...
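The two checks listed above could be sketched as follows. The classes `Diagonal` and `ScaledIdentity` and the helper `check_compatible` are hypothetical and only serve to show where such checks might sit; they are not this repository's API.

```python
class Diagonal:
    """Hypothetical structure: stores the diagonal of a square matrix."""

    def __init__(self, diag):
        self.diag = list(diag)

    @property
    def dim(self):
        return len(self.diag)

    def __add__(self, other):
        check_compatible(self, other)
        return Diagonal(x + y for x, y in zip(self.diag, other.diag))


class ScaledIdentity:
    """Hypothetical structure: stores c and dim for the matrix c * I."""

    def __init__(self, scale, dim):
        self.scale = scale
        self.dim = dim


def check_compatible(a, b):
    # Check 1: both operands must use the same structure.
    if type(a) is not type(b):
        raise TypeError(f"Type mismatch: {type(a).__name__} vs {type(b).__name__}")
    # Check 2: both operands must have the same dimension.
    if a.dim != b.dim:
        raise ValueError(f"Dimension mismatch: {a.dim} vs {b.dim}")


total = Diagonal([1.0, 2.0]) + Diagonal([3.0, 4.0])

# Collect the error raised for a type mismatch and for a dimension mismatch.
errors = []
for other in (ScaledIdentity(2.0, 2), Diagonal([1.0, 2.0, 3.0])):
    try:
        Diagonal([1.0, 2.0]) + other
    except (TypeError, ValueError) as exc:
        errors.append(type(exc).__name__)
```

Checking the type before the dimension keeps the error message meaningful when the two structures do not even share a notion of `dim`.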
There is already a template in `docs/examples/example_02_unsupported_parameters.py`. You can take a look at `docs/examples/example_03_param_groups.py` for inspiration.