Abhirath Anand
As things stand, only some layers in Flux have direct functional equivalents in NNlib - `maxpool` and `meanpool` do, for example, while the adaptive versions of those two layers don't....
In long `Chain`s in Metalhead, it is often the case that there are layers that can be reduced to `identity` - `Dropout(p = 0)` is a frequent occurrence, along with...
Following the discussions in https://github.com/FluxML/Metalhead.jl/pull/119, I realised that currently there is no way for the user to programmatically pass in weight initialisation strategies for layers in a `Chain`-like structure based...
Currently, the examples for [ConvMixer](https://github.com/FluxML/model-zoo/blob/dce9dd0d44567ee9dabccba9b92b931298b462ce/vision/convmixer_cifar10/convmixer.jl#L6) and [VGG](https://github.com/FluxML/model-zoo/blob/dce9dd0d44567ee9dabccba9b92b931298b462ce/vision/vgg_cifar10/vgg_cifar10.jl#L39) have the code for the models written inside them. Since Metalhead has cleaner versions of these models, would it make sense to replace...
When using the formatter for Metalhead.jl (https://github.com/FluxML/Metalhead.jl/pull/163) with the `sciml` style, it seems to want to keep the final two `end`s on the same line for testsets with a...
This is an implementation of [EfficientNetV2](https://arxiv.org/abs/2104.00298). There's clearly some code duplication between the EffNets and the MobileNets, but figuring out how to unify that API and its design is perhaps...
These changes will make it easier to port over pretrained weights for the models from PyTorch.

- [ ] Right now, GoogLeNet matches the implementation in the paper, which does...
Convolution and BatchNorm layers have been fused during inference in many models, most notably [LeViT](https://github.com/facebookresearch/LeViT). It would be a good idea to have this as an option in the `conv_norm`...
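As a rough illustration of what fusing means here, the sketch below folds BatchNorm's inference-time statistics into a convolution's weights and bias, so that a single convolution reproduces conv followed by norm. It is written in plain Julia and does not use Flux or Metalhead. The function name `fuse_conv_bn`, its argument names, and the default `ϵ` are hypothetical and chosen only for this example. It assumes the weight array uses the `(kw, kh, c_in, c_out)` layout and that the norm layer is in inference mode with fixed running statistics.

```julia
# Hypothetical helper, not part of Flux or Metalhead.
# Folds y = γ .* (conv(x) .- μ) ./ sqrt.(σ² .+ ϵ) .+ β into a single convolution.
# W is assumed to have the layout (kw, kh, c_in, c_out);
# b, γ, β, μ and σ² are vectors with one entry per output channel.
function fuse_conv_bn(W::AbstractArray{T,4}, b::AbstractVector{T},
                      γ::AbstractVector{T}, β::AbstractVector{T},
                      μ::AbstractVector{T}, σ²::AbstractVector{T};
                      ϵ::T = T(1e-5)) where {T<:AbstractFloat}
    scale = γ ./ sqrt.(σ² .+ ϵ)                 # one scale factor per output channel
    W_fused = W .* reshape(scale, 1, 1, 1, :)   # broadcast along the c_out axis
    b_fused = (b .- μ) .* scale .+ β            # shift the bias by the running mean, then rescale
    return W_fused, b_fused
end

# Sanity check on a 1×1 convolution with one input and two output channels,
# where the convolution reduces to a per-channel multiply-add.
W  = reshape([2.0, -1.0], 1, 1, 1, 2)
b  = [0.5, 0.0]
γ  = [1.5, 0.5]
β  = [0.1, -0.2]
μ  = [0.3, 0.1]
σ² = [4.0, 0.25]
x  = 3.0

W_fused, b_fused = fuse_conv_bn(W, b, γ, β, μ, σ²)

for c in 1:2
    conv_out = W[1, 1, 1, c] * x + b[c]
    unfused  = γ[c] * (conv_out - μ[c]) / sqrt(σ²[c] + 1e-5) + β[c]
    fused    = W_fused[1, 1, 1, c] * x + b_fused[c]
    @assert isapprox(unfused, fused; atol = 1e-10)
end
```

The fold is only valid at inference time: during training the norm uses batch statistics that change at every step, so the two layers have to stay separate.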
`DropBlock` is a type of regularisation that tries to replace dropout. The [original paper](https://arxiv.org/abs/1810.12890) describes it as best used with a linear scaling rate across blocks in a model, as...
The current documentation format is a little weird and Publish exposes a lot of private functions in the API reference as well. There are several steps that need to be...