Abhirath Anand

22 issues by Abhirath Anand

As things stand, only some layers in Flux have direct functional equivalents in NNlib - `maxpool` and `meanpool` do, for example, while the adaptive versions of those two layers don't...

enhancement
good first issue
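For illustration, an adaptive pooling functional can be expressed on top of NNlib's existing `meanpool` by deriving the window and stride from the target output size. This is a hedged sketch (the helper name `adaptive_meanpool` is assumed, and the spatial input size is assumed to be evenly divisible by the output size), not an existing NNlib function:

```julia
using NNlib

# Sketch (assumed name): adaptive mean pooling built on NNlib.meanpool.
# Assumes each spatial input dimension divides evenly by the output size.
function adaptive_meanpool(x, outsize::Tuple)
    insize = size(x)[1:length(outsize)]
    stride = insize .÷ outsize
    # window size chosen so the pooled output has exactly `outsize` elements
    k = insize .- (outsize .- 1) .* stride
    return meanpool(x, PoolDims(x, k; stride = stride))
end
```

This mirrors how `Flux.AdaptiveMeanPool` computes its window internally, so a functional version would mostly be a matter of moving this logic down into NNlib.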

In long `Chain`s in Metalhead, there are often layers that reduce to `identity` - `Dropout(p = 0)` is a frequent occurrence, along with...
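One possible shape for such a cleanup pass - a sketch with assumed helper names (`is_effectively_identity`, `prune_chain`), not an existing Flux or Metalhead API - filters the reducible layers out of a `Chain`:

```julia
using Flux

# Sketch (assumed names): detect layers that act as `identity` and drop them.
is_effectively_identity(l) = l === identity
is_effectively_identity(l::Dropout) = l.p == 0

# Rebuild the Chain without the no-op layers.
prune_chain(c::Chain) = Chain(filter(!is_effectively_identity, collect(c.layers))...)
```

The dispatch-based predicate makes it easy to extend to other reducible cases as they are identified.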

Following the discussions in https://github.com/FluxML/Metalhead.jl/pull/119, I realised that currently there is no way for the user to programmatically pass in weight initialisation strategies for layers in a `Chain`-like structure based...

enhancement

Currently, the examples for [ConvMixer](https://github.com/FluxML/model-zoo/blob/dce9dd0d44567ee9dabccba9b92b931298b462ce/vision/convmixer_cifar10/convmixer.jl#L6) and [VGG](https://github.com/FluxML/model-zoo/blob/dce9dd0d44567ee9dabccba9b92b931298b462ce/vision/vgg_cifar10/vgg_cifar10.jl#L39) have the code for the models written inside them. Since Metalhead has cleaner versions of these models, would it make sense to replace...

When using the formatter for Metalhead.jl (https://github.com/FluxML/Metalhead.jl/pull/163) with the `sciml` style, it seems to keep the final two `end`s on the same line for testsets with a...

bug

This is an implementation of [EfficientNetv2](https://arxiv.org/abs/2104.00298). There's clearly some code duplication between the EffNets and the MobileNets, but figuring out how to unify that API and its design is perhaps...

new-model

These changes will make it easier to port over pretrained weights for the models from PyTorch.
- [ ] Right now, GoogLeNet matches the implementation in the paper, which does...

good first issue
model-bug

Convolution and BatchNorm layers have been fused during inference in many models, most notably [LeViT](https://github.com/facebookresearch/LeViT). It would be a good idea to have this as an option in the `conv_norm`...

enhancement
layers
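For reference, the fusion folds the BatchNorm affine transform into the convolution's weights and bias, so inference needs only a single conv. Below is a hedged sketch (the helper name `fuse` is assumed and is not Metalhead's `conv_norm` API; it also assumes the activation lives on the BatchNorm, as in the usual conv-norm pattern):

```julia
using Flux

# Sketch (assumed name `fuse`): fold a BatchNorm into the preceding Conv.
# With scale = γ ./ sqrt.(σ² .+ ϵ), the fused layer computes
#   y = (W .* scale) * x .+ (β .+ (b .- μ) .* scale)
# Assumes c.σ == identity, with the activation carried by bn.λ.
function fuse(c::Conv, bn::BatchNorm)
    scale = bn.γ ./ sqrt.(bn.σ² .+ bn.ϵ)
    # broadcast `scale` over the output-channel (last) dimension of the weight
    W = c.weight .* reshape(scale, ntuple(_ -> 1, ndims(c.weight) - 1)..., :)
    # `c.bias` may be `false` (no bias); it broadcasts as zero
    b = bn.β .+ (c.bias .- bn.μ) .* scale
    return Conv(W, b, bn.λ; stride = c.stride, pad = c.pad, dilation = c.dilation)
end
```

Since the fusion only holds with the running statistics frozen, it would presumably be applied at inference time (after `testmode!`), which is also when LeViT performs it.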

`DropBlock` is a regularisation technique intended as a replacement for dropout. The [original paper](https://arxiv.org/abs/1810.12890) describes it as best used with a linear scaling rate across blocks in a model, as...

enhancement
layers
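The linear scaling rule from the paper is simple to express - a one-line sketch (the function name `dropblock_probs` is assumed), ramping the drop probability from near zero at the first block up to `p_max` at the last:

```julia
# Sketch (assumed name): linearly scaled DropBlock probabilities per block,
# so early blocks drop little and the final block drops at the full rate.
dropblock_probs(p_max, nblocks) = [p_max * i / nblocks for i in 1:nblocks]
```

A DropBlock layer in Metalhead could then take its per-block probability from such a schedule when the model is assembled.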

The current documentation format is somewhat inconsistent, and Publish exposes a lot of private functions in the API reference as well. There are several steps that need to be...

documentation