Error in computing Linear layer Mult-Adds
Describe the bug
When a linear layer has a multidimensional input and output (a shape with 3 or more dimensions), the computed mult-adds are incorrect.
To Reproduce
Add a line similar to the following when summarizing the model:
model_stats = summary(model, input_size, img_metas=img_metas, gt_semantic_seg=seg, depth=9, col_names=["input_size", "kernel_size", "output_size", "num_params", "mult_adds"])
Make sure the linear layer's input has more than two dimensions, as below.
| Layer (type:depth-idx) | Input Shape | Kernel Shape | Output Shape | Param # | Mult-Adds |
|---|---|---|---|---|---|
| Linear: 5-8 | [1, 22528, 64] | [64, 256] | [1, 22528, 256] | 16,640 | 16,640 |
Expected behavior
The number of mult-adds is listed as 16,640, but it should be 374,865,920 (= 22528 × (64 + 1) × 256, since the kernel is applied at every one of the 22528 positions along the middle dimension).
It appears line 161 of https://github.com/TylerYep/torchinfo/blob/main/torchinfo/layer_info.py fails to take into account that a linear layer applies its kernel at every input position, not just along a single output dimension.
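A minimal sketch of the expected computation (this is not torchinfo's actual code; `linear_mult_adds` is a hypothetical helper that mirrors the arithmetic above, counting the bias add the way the reported parameter count does):

```python
import math

# Sketch: mult-adds for nn.Linear should scale with every position the
# kernel is applied to, i.e. the product of all input dims except the
# last (feature) dimension.
def linear_mult_adds(input_shape, out_features):
    in_features = input_shape[-1]
    positions = math.prod(input_shape[:-1])  # batch and spatial dims
    # (in_features + 1) counts the bias add, matching the param count
    return positions * (in_features + 1) * out_features

# The layer above: [1, 22528, 64] -> [1, 22528, 256]
print(linear_mult_adds((1, 22528, 64), 256))  # 374865920
```

This reproduces the 374,865,920 figure, whereas counting only the weight and bias parameters gives the reported 16,640.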
Additional context
Just noticed this was not the correct number of FLOPs in a model using a linear layer such as this.
Did not realize this was markdown; my mistake on the formatting. Thanks for fixing it.
Thanks for reporting this issue. Any PR or help fixing this is much appreciated!
@jlclemon the 1 in the input_size is the batch_dim, if I'm not wrong, right? Also, it would be helpful if you could provide the model architecture and a gist of what you are trying to do (at least for a beginner like me). Ty
I'm experiencing a similar error. It seems that when calculating mult-adds for a torch.nn.Linear, only the first and last dimensions of the input tensor (batch size and feature dimension) are considered.
Environment
- System: Ubuntu 22.0 Docker image with GPU support
- Package version:
- pytorch 2.1.1
- torchinfo 1.8.0
Reproduce
```python
from torch.nn import Linear
from torchinfo import summary

bs, cin, cout = 5, 3, 8
model = Linear(cin, cout)

# 3-D input: (batch, positions, features)
in_size = (bs, 10, cin)
print(summary(model, input_size=in_size, col_names=["input_size", "output_size", "num_params", "mult_adds"]))

# 4-D input: extra spatial dimensions
in_size = (bs, 100, 100, cin)
print(summary(model, input_size=in_size, col_names=["input_size", "output_size", "num_params", "mult_adds"]))
```
Output:
============================================================================================================================================
Layer (type:depth-idx) Input Shape Output Shape Param # Mult-Adds
============================================================================================================================================
Linear [5, 10, 3] [5, 10, 8] 32 160
============================================================================================================================================
Total params: 32
Trainable params: 32
Non-trainable params: 0
Total mult-adds (M): 0.00
============================================================================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
============================================================================================================================================
============================================================================================================================================
Layer (type:depth-idx) Input Shape Output Shape Param # Mult-Adds
============================================================================================================================================
Linear [5, 100, 100, 3] [5, 100, 100, 8] 32 160
============================================================================================================================================
Total params: 32
Trainable params: 32
Non-trainable params: 0
Total mult-adds (M): 0.00
============================================================================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 3.20
Params size (MB): 0.00
Estimated Total Size (MB): 3.80
============================================================================================================================================
The Mult-Adds for both input sizes are reported as 160 $= 5\times(3+1)\times 8$, which is the multiply-accumulate count for an input of size (5, 1, 3); the intermediate dimensions are ignored.
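For comparison, the counts one would expect for the two inputs above can be worked out directly (a sketch; `expected_macs` is a hypothetical helper, with the bias add counted as in the formula above):

```python
import math

# Expected mult-adds grow with every extra input position, while
# torchinfo 1.8.0 reports a constant 160 for this Linear(3, 8) layer.
def expected_macs(in_size, cin, cout):
    positions = math.prod(in_size[:-1])  # all dims except features
    return positions * (cin + 1) * cout  # +1 for the bias add

print(expected_macs((5, 10, 3), 3, 8))        # 1600
print(expected_macs((5, 100, 100, 3), 3, 8))  # 1600000
```

Both results are multiples of the reported 160, scaled by the number of intermediate positions (10 and 10,000 respectively).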