
AnalogConv2d incorrectly converts the input size

Open Zhaoxian-Wu opened this issue 1 year ago • 3 comments

Description

The input to the analog convolutional layer AnalogConv2d has the form (batch_size, in_channels, height, width) (e.g. (8, 1, 28, 28) for MNIST), so it should not accept a vectorized input like (8, 784). To demonstrate this, suppose we have defined:

rpu_config = FloatingPointRPUConfig()
model = AnalogConv2d(
    in_channels=1, out_channels=3, kernel_size=5, rpu_config=rpu_config
).to(DEVICE)

At first, the model correctly rejects the vectorized input:

images = torch.empty((8, 1, 28, 28)).to(DEVICE)
# incorrectly vectorize the input
images = images.view(images.shape[0], -1)
# an error is raised, as expected:
# IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
model(images)

However, after processing one unvectorized batch, the model unexpectedly accepts the vectorized input:


images = torch.empty((8, 1, 28, 28)).to(DEVICE)
# first process a normally shaped batch
model(images)
# incorrectly vectorize the input
images = images.view(images.shape[0], -1)
# no error is raised ?!
model(images)

Notably, this odd behaviour occurs only in the CUDA version. Moreover, the PyTorch counterpart Conv2d does not behave this way.

How to reproduce

A minimal working example follows:

import torch
from aihwkit.nn import AnalogConv2d, AnalogSequential
from torch.nn import Conv2d
from aihwkit.simulator.configs import FloatingPointRPUConfig

def run(DEVICE, is_analog):
    if is_analog:
        rpu_config = FloatingPointRPUConfig()
        model = AnalogConv2d(
            in_channels=1, out_channels=3, kernel_size=5, rpu_config=rpu_config
        ).to(DEVICE)
    else:
        model = Conv2d(
            in_channels=1, out_channels=3, kernel_size=5
        ).to(DEVICE)

    images = torch.empty((8, 1, 28, 28)).to(DEVICE)
    model(images)
    # incorrectly vectorize the input
    images = images.view(images.shape[0], -1)
    model(images)
    
# AnalogConv2D in aihwkit has this issue
run('cuda:0', is_analog=True)    # pass  (unexpected)
run('cpu', is_analog=True)       # error (as expected)


# Conv2D in pytorch doesn't
run('cuda:0', is_analog=False)   # error (as expected)
run('cpu', is_analog=False)      # error (as expected)

Other information

  • Pytorch version: 2.1.2
  • Package version: aihwkit-gpu 0.9.0
  • OS: Linux
  • Python version: 3.10.13
  • Conda version (or N/A) :23.11.0

Zhaoxian-Wu avatar Apr 08 '24 04:04 Zhaoxian-Wu

@Zhaoxian-Wu Thanks for raising this. This is an artifact of the indexed convolution: it can handle any input, including 2D, because it reshapes internally, so it does not have PyTorch's shape requirement, which can be restrictive in some cases. But it is true that we removed this flexibility for CPU and not for CUDA, so it is indeed inconsistent. Best to add a shape check in all cases.
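Such a shape check might look like the following sketch. This is a hypothetical helper (not part of the aihwkit API) written in plain Python over the shape tuple; it mirrors the (N, C, H, W) / (C, H, W) expectation that `torch.nn.Conv2d` enforces:

```python
def check_conv2d_input_shape(shape, in_channels):
    """Hypothetical validator mirroring torch.nn.Conv2d's expectations.

    Accepts (C, H, W) or (N, C, H, W) shapes; raises RuntimeError otherwise.
    """
    # conv2d inputs must be 3D (unbatched) or 4D (batched)
    if len(shape) not in (3, 4):
        raise RuntimeError(
            f"expected 3D or 4D input to conv2d, got {len(shape)}D input"
        )
    # the channel dimension is first (unbatched) or second (batched)
    channel_dim = 0 if len(shape) == 3 else 1
    if shape[channel_dim] != in_channels:
        raise RuntimeError(
            f"expected {in_channels} input channels, got {shape[channel_dim]}"
        )

# a properly shaped batch passes; a flattened one is rejected on any device
check_conv2d_input_shape((8, 1, 28, 28), in_channels=1)  # OK
try:
    check_conv2d_input_shape((8, 784), in_channels=1)
except RuntimeError as err:
    print("rejected:", err)
```

Running such a check in the forward pass on both CPU and CUDA would make the two backends reject the (8, 784) input consistently.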

maljoras avatar Apr 08 '24 05:04 maljoras

Thanks for the rapid reply. To be honest, I was surprised that it ran with an unintended reshaping. Code using AnalogConv2d may work in one file (if it sends (8, 1, 28, 28) before sending (8, 784)) yet fail in another (if the author forgets to send (8, 1, 28, 28) first). I think it would be safer to disable the automatic reshaping, or to gate it behind an explicit flag, to avoid such surprising runtime behaviour.

Zhaoxian-Wu avatar Apr 08 '24 16:04 Zhaoxian-Wu

@Zhaoxian-Wu, would you be interested in contributing this check to the code to have consistency across CPU and CUDA implementations?

kaoutar55 avatar May 08 '24 15:05 kaoutar55

btw, you should not use `empty`, as it does not initialize the underlying memory. Do you see the same behavior with `zeros`?
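For context, `torch.empty` only allocates memory, so its contents are arbitrary, while `torch.zeros` returns deterministic, initialized values. Only the shape matters for reproducing this bug, but initialized values make the repro deterministic. A quick illustration:

```python
import torch

# empty() only allocates; the values are whatever happened to be in memory
x = torch.empty(2, 3)
print(x.shape)  # shape is well-defined even though the values are not

# zeros() guarantees deterministic, initialized contents
y = torch.zeros(2, 3)
assert bool(torch.all(y == 0))
```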

maljoras avatar Aug 25 '24 22:08 maljoras

@Zhaoxian-Wu this issue has been resolved.

kaoutar55 avatar Sep 18 '24 15:09 kaoutar55