AnalogConv2d silently accepts and reshapes vectorized input (CUDA only)
Description
The input to the analog convolutional layer AnalogConv2d has the form (batch_size, in_channels, height, width) (e.g. (8, 1, 28, 28) for MNIST), and it should not accept vectorized input such as (8, 784). To demonstrate this, suppose we have defined:
rpu_config = FloatingPointRPUConfig()
model = AnalogConv2d(
    in_channels=1, out_channels=3, kernel_size=5, rpu_config=rpu_config
).to(DEVICE)
At first, it correctly rejects the vectorized input:
images = torch.empty((8, 1, 28, 28)).to(DEVICE)
# incorrectly vectorize the input
images = images.view(images.shape[0], -1)
# an error is raised, as expected:
# IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
model(images)
However, after processing one correctly shaped batch, the model unexpectedly accepts the vectorized input:
images = torch.empty((8, 1, 28, 28)).to(DEVICE)
# process a correctly shaped batch first
model(images)
# incorrectly vectorize the input
images = images.view(images.shape[0], -1)
# no error is raised ?!
model(images)
It is worth noting that this behaviour occurs only in the CUDA version. Moreover, the native PyTorch Conv2d does not behave this way.
How to reproduce
A minimal working example follows:
import torch
from torch.nn import Conv2d

from aihwkit.nn import AnalogConv2d
from aihwkit.simulator.configs import FloatingPointRPUConfig


def run(DEVICE, is_analog):
    if is_analog:
        rpu_config = FloatingPointRPUConfig()
        model = AnalogConv2d(
            in_channels=1, out_channels=3, kernel_size=5, rpu_config=rpu_config
        ).to(DEVICE)
    else:
        model = Conv2d(
            in_channels=1, out_channels=3, kernel_size=5
        ).to(DEVICE)
    images = torch.empty((8, 1, 28, 28)).to(DEVICE)
    model(images)
    # incorrectly vectorize the input
    images = images.view(images.shape[0], -1)
    model(images)


# AnalogConv2d in aihwkit has this issue
run('cuda:0', is_analog=True)   # passes (unexpected)
run('cpu', is_analog=True)      # error (as expected)
# Conv2d in pytorch doesn't
run('cuda:0', is_analog=False)  # error (as expected)
run('cpu', is_analog=False)     # error (as expected)
Other information
- PyTorch version: 2.1.2
- Package version: aihwkit-gpu 0.9.0
- OS: Linux
- Python version: 3.10.13
- Conda version (or N/A): 23.11.0
@Zhaoxian-Wu Thanks for raising this. This is an artifact of the indexed convolution: it can handle any input, including 2D, since it reshapes internally, and thus does not have PyTorch's shape requirement, which in some cases is restrictive. But it is true that we removed this flexibility for CPU and not for CUDA, so it is indeed inconsistent. Best to add a shape check in all cases.
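For anyone hitting this before a fix lands, a user-side guard along these lines would enforce the same check on both backends. This is a minimal sketch only; CheckedAnalogConv2d and its error message are illustrative, not part of the aihwkit API:

import torch
from aihwkit.nn import AnalogConv2d


class CheckedAnalogConv2d(AnalogConv2d):
    """Hypothetical wrapper mirroring torch.nn.Conv2d's shape requirement."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Conv2d expects (N, C, H, W) batched or (C, H, W) unbatched input.
        if x.dim() not in (3, 4):
            raise RuntimeError(
                f"expected 3D or 4D input, got {x.dim()}D input of shape {tuple(x.shape)}"
            )
        return super().forward(x)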
Thanks for the rapid reply. To be honest, I was surprised that it ran with an unintended reshape. Code using AnalogConv2d may work in one file (if it sends (8, 1, 28, 28) before sending (8, 784)) while failing in another (if one forgets to send (8, 1, 28, 28) before sending (8, 784)). I think it would be safer to disable the automatic reshaping, or to expose a flag indicating that auto-reshaping is enabled, to avoid hard-to-trace runtime errors.
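To illustrate the opt-in flag idea, here is one possible call-site shape. A sketch only: conv_input_guard, allow_auto_reshape, and image_shape are all hypothetical names, not existing aihwkit API:

import torch


def conv_input_guard(x, allow_auto_reshape=False, image_shape=(1, 28, 28)):
    """Hypothetical helper: only restore (N, C, H, W) from (N, C*H*W) on request."""
    if x.dim() == 2 and allow_auto_reshape:
        # Reshaping is explicit and opt-in, never silent.
        return x.view(x.shape[0], *image_shape)
    if x.dim() != 4:
        raise RuntimeError(f"expected 4D input, got shape {tuple(x.shape)}")
    return x

A caller would then write model(conv_input_guard(images, allow_auto_reshape=True)), making any reshape visible at the call site rather than happening implicitly inside the layer.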
@Zhaoxian-Wu, would you be interested in contributing this check to the code to have consistency across CPU and CUDA implementations?
btw, you should not use "empty" as this does not initialize the underlying memory. Do you see the same behavior with zeros?
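For reference, the allocation in the reproduction above would become the following; torch.zeros gives deterministic contents, whereas torch.empty leaves the memory uninitialized:

images = torch.zeros((8, 1, 28, 28), device=DEVICE)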
@Zhaoxian-Wu this issue has been resolved.