
Error in computing Linear Layer Multiply adds

Open jlclemon opened this issue 4 years ago • 6 comments

Describe the bug

When a linear layer has a multidimensional input and output (a shape with 3 or more dimensions), the computed mult-adds are incorrect.

To Reproduce

Add a line similar to the following for your model:

model_stats = summary(model, input_size, img_metas=img_metas, gt_semantic_seg=seg, depth=9, col_names=["input_size", "kernel_size", "output_size", "num_params", "mult_adds"])

Make sure the linear layer's input has more than two dimensions, as below:

Layer (type:depth-idx)            Input Shape               Kernel Shape              Output Shape              Param #         Mult-Adds
           Linear: 5-8             [1, 22528, 64]            [64, 256]                 [1, 22528, 256]           16,640                    16,640


Expected behavior

Notice the number of mult-adds is listed as 16,640 but should be 374,865,920 (22,528 output positions × 16,640 parameters).

It appears line 161 of https://github.com/TylerYep/torchinfo/blob/main/torchinfo/layer_info.py fails to account for the fact that a linear layer applies its kernel at every position of the leading output dimensions, not just once.
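To make the expected figure concrete, here is a minimal sketch of how the mult-add count for `nn.Linear` could be derived from the shapes in the table above. This is an illustration of the counting convention the reporter uses (bias counted as one extra multiply-accumulate per output feature, matching torchinfo's param count), not torchinfo's actual implementation:

```python
import math

# Shapes taken from the summary table above.
output_shape = (1, 22528, 256)
in_features, out_features = 64, 256

# Weight + bias, matching the 16,640 params torchinfo reports.
param_count = in_features * out_features + out_features

# The kernel is applied once per position in the leading output
# dimensions, so the count must scale with all of them:
positions = math.prod(output_shape[:-1])   # 1 * 22528
expected_mult_adds = positions * param_count

print(expected_mult_adds)  # 374865920, not the reported 16640
```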


Additional context

I just noticed this was not the correct number of FLOPs for a model using a linear layer such as this.

jlclemon avatar Aug 24 '21 21:08 jlclemon

I did not realize this was markdown; my mistake on the formatting. Thanks for fixing it.

jlclemon avatar Aug 24 '21 21:08 jlclemon

Thanks for reporting this issue. Any PR or help fixing this is much appreciated!

TylerYep avatar Aug 28 '21 08:08 TylerYep

@jlclemon the 1 in the input_size is the batch dim, if I'm not wrong, right? Also, it would be helpful if you could provide the model architecture and a gist of what you are trying to do (at least for a beginner like me). Thanks!

notjedi avatar Feb 14 '22 12:02 notjedi

I'm experiencing a similar error. It seems that when calculating the mult-adds of a torch.nn.Linear, only the first and last dimensions of the input tensor (batch size and feature dimension) are considered.

Environment

  • System: Ubuntu 22.0 Docker image with GPU support
  • Package version:
    • pytorch 2.1.1
    • torchinfo 1.8.0

Reproduce

from torch.nn import Linear
from torchinfo import summary

bs, cin, cout = 5, 3, 8
model = Linear(cin, cout)

in_size = (bs, 10, cin)
print(summary(model, input_size=in_size, col_names=["input_size", "output_size", "num_params", "mult_adds"]))

in_size = (bs, 100, 100, cin)
print(summary(model, input_size=in_size, col_names=["input_size", "output_size", "num_params", "mult_adds"]))

Output:

============================================================================================================================================
Layer (type:depth-idx)                   Input Shape               Output Shape              Param #                   Mult-Adds
============================================================================================================================================
Linear                                   [5, 10, 3]                [5, 10, 8]                32                        160
============================================================================================================================================
Total params: 32
Trainable params: 32
Non-trainable params: 0
Total mult-adds (M): 0.00
============================================================================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
============================================================================================================================================
============================================================================================================================================
Layer (type:depth-idx)                   Input Shape               Output Shape              Param #                   Mult-Adds
============================================================================================================================================
Linear                                   [5, 100, 100, 3]          [5, 100, 100, 8]          32                        160
============================================================================================================================================
Total params: 32
Trainable params: 32
Non-trainable params: 0
Total mult-adds (M): 0.00
============================================================================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 3.20
Params size (MB): 0.00
Estimated Total Size (MB): 3.80
============================================================================================================================================

The mult-adds for both input sizes are reported as 160 = 5 × (3 + 1) × 8, i.e., the multiply-accumulate count for an input of size (5, 1, 3).
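For comparison, the expected counts for the two runs above can be sketched as follows. This assumes the same convention as the reported 160 (bias counted as one extra multiply-accumulate per output feature), and simply multiplies by every leading output dimension rather than the batch dimension alone:

```python
import math

cin, cout = 3, 8
params = cin * cout + cout                 # 32, as in the summaries above

for in_size in [(5, 10, cin), (5, 100, 100, cin)]:
    # The layer is applied at every position in the leading dims,
    # so the count scales with all of them, not just the batch size.
    positions = math.prod(in_size[:-1])
    print(in_size, positions * params)

# (5, 10, 3)       -> 1600      (torchinfo reports 160)
# (5, 100, 100, 3) -> 1600000   (torchinfo reports 160)
```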

MewmewWho avatar Jan 26 '24 08:01 MewmewWho