PyTorch model with flexible image size
🐞Describe the bug
Unable to make the input size flexible when converting from PyTorch. I think it's related to https://github.com/apple/coremltools/issues/890, https://github.com/apple/coremltools/issues/880 and https://github.com/apple/coremltools/issues/756. I extend those cases to an `ImageType` input and provide the simplest possible script to reproduce the whole process.
- Using `RangeDim` to make the input flexible, the way the documentation describes, just does not work at all! Conversion fails with:
File "/Users/user/miniconda3/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 1336, in _get_scales_from_output_size
scales_h = (output_size[0] + 1e-4) / float(input_shape[-2])
TypeError: unsupported operand type(s) for +: 'NoneType' and 'float'
- The very same network CAN work with a flexible input size if the sizes are enumerated OR an `MLMultiArray` input is used (see the sketch after the trace). So the model is correct, and the framework is able to execute it at any size, but only if the sizes are enumerated in advance. This leads me to think there is some flag or typo that makes the input range get ignored.
Trace:
```
  in predict
    return self.__proxy__.predict(data, useCPUOnly)
RuntimeError: {
    NSLocalizedDescription = "Error binding image input buffer input.";
}
```
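For the enumerated path, the sizes can also be declared directly at conversion time; a minimal sketch, assuming the coremltools 4.0 `ct.EnumeratedShapes` API and reusing the `traced.pt` produced by the script below (the size list and file names are just examples):

```python
import torch
import coremltools as ct

traced = torch.jit.load('traced.pt')
# enumerate the allowed input shapes up front; 'default' is the shape
# used when the caller doesn't request another one
shapes = ct.EnumeratedShapes(
    shapes=[[1, 3, s, s] for s in (128, 256, 512)],
    default=[1, 3, 256, 256])
model = ct.convert(
    traced,
    inputs=[ct.ImageType(name='input', shape=shapes)])
model.save('coreml_enumerated.mlmodel')
```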
To Reproduce
```python
def env_info():
    import sys
    print('python version:', sys.version)
    import torch
    print('PyTorch version:', torch.__version__)
    import coremltools
    print('coremltools version:', coremltools.__version__)


# make the simplest possible super-resolution model
def export():
    print('=' * 24, 'export', '=' * 24)
    import torch
    from torch import nn
    from torch.nn import functional as F

    class SuperNet(nn.Module):
        def __init__(self):
            super().__init__()

        def forward(self, x):
            return F.interpolate(x, scale_factor=(2, 2))

    model = SuperNet().eval()
    # the image size here doesn't matter, it can be anything
    traced = torch.jit.trace(model, torch.rand((1, 3, 1, 1)))
    torch.jit.save(traced, 'traced.pt')


INPUT_NODE = 'input'
OUTPUT_NODE = '30'
MODEL_DEFAULT_INPUT_SIZE = (256, 256)
IMAGE_SIZE = (512, 512)


def convert():
    global OUTPUT_NODE
    print('=' * 24, 'convert', '=' * 24)
    import torch
    import coremltools as ct
    from coremltools.models.neural_network import flexible_shape_utils
    import coremltools.proto.FeatureTypes_pb2 as ft
    from coremltools.models.neural_network.builder import NeuralNetworkBuilder

    traced = torch.jit.load('traced.pt')
    # the way from the documentation just throws:
    # TypeError: unsupported operand type(s) for +: 'NoneType' and 'float'
    # input_shape = ct.Shape(shape=[1, 3, ct.RangeDim(256, 1024, 256, 'w'), ct.RangeDim(256, 1024, 256, 'h')])
    # OK, let's convert with a fixed shape and add flexibility later
    input_shape = ct.Shape(shape=[1, 3, 256, 256])
    converted = ct.convert(
        traced,
        inputs=[ct.ImageType(name=INPUT_NODE, shape=input_shape)])
    # it doesn't matter how the spec is obtained, the result is the same = crash
    # converted.save('coreml.mlmodel')
    # spec = ct.utils.load_spec('coreml.mlmodel')
    spec = converted.get_spec()
    OUTPUT_NODE = spec.description.output[0].name
    # make the input flexible
    size_range = flexible_shape_utils.NeuralNetworkImageSizeRange()
    size_range.add_height_range((128, 512))  # also tried upper_bound=-1, doesn't work
    size_range.add_width_range((128, 512))
    flexible_shape_utils.update_image_size_range(spec, INPUT_NODE, size_range=size_range)
    # the only working way to get flexible sizes is to enumerate them all,
    # but that doesn't make much sense for a super-resolution task in our case
    # flexible_shape_utils.add_enumerated_image_sizes(spec, INPUT_NODE, sizes=[
    #     flexible_shape_utils.NeuralNetworkImageSize(x, x) for x in [128, 256, 512]])
    # make the output an image, with a flexible size as well
    feature = flexible_shape_utils._get_feature(spec, OUTPUT_NODE)
    feature.type.imageType.colorSpace = ft.ImageFeatureType.RGB
    # the output size can be specified, but it doesn't change anything
    # feature.type.imageType.width = 512
    # feature.type.imageType.height = 512
    size_range = flexible_shape_utils.NeuralNetworkImageSizeRange()
    size_range.add_height_range((256, 1024))
    size_range.add_width_range((256, 1024))
    flexible_shape_utils.update_image_size_range(spec, feature_name=OUTPUT_NODE, size_range=size_range)
    # print the model
    builder = NeuralNetworkBuilder(spec=spec)
    print('inputs')
    builder.inspect_input_features()
    print('model')
    builder.inspect_layers(verbose=True)
    print('outputs')
    builder.inspect_output_features()
    updated = ct.models.MLModel(spec)
    updated.save('coreml.mlmodel')


def test():
    print('=' * 24, 'test', '=' * 24)
    import coremltools as ct
    from PIL import Image
    import numpy as np

    print('input image size =', IMAGE_SIZE)
    arr = np.zeros([IMAGE_SIZE[0], IMAGE_SIZE[1], 3], dtype=np.uint8)
    img = Image.fromarray(arr)
    model = ct.models.MLModel('coreml.mlmodel')
    res = model.predict({INPUT_NODE: img})[OUTPUT_NODE]
    print('result image size =', res.size)


env_info()
export()
convert()
test()
```
System environment (please complete the following information):
- coremltools version: 4.0
- OS: macOS
- macOS: Big Sur 11.0 Beta (20A5395g)
- Xcode version: 12.0.1 (12A7300)
- How you installed Python: anaconda
- python version: 3.8.3
- PyTorch version: 1.6.0
This is a critical bug, because some models just do not make much sense with a fixed (or even several enumerated) size!
I will be happy to provide any additional info required. Please, any help/workaround is appreciated!
It doesn't work either when I leave the output node untouched (i.e. don't specify it as an image type and don't set a flexible size).
@aseemw I tested with a MultiArray input with a dynamic size – it works just perfectly (while keeping the output as an image):

```python
size_l = 128
size_u = 2048
flexible_shape_utils.set_multiarray_ndshape_range(spec, INPUT_NODE, [1, 3, size_l, size_l], [1, 3, size_u, size_u])
```

Here's the updated script: https://gist.github.com/gordinmitya/96a1b041bea18add4ec2e31907d11100
Any ideas?
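Expanded into a full conversion, the workaround looks roughly like this; a sketch only, reusing `traced.pt` from the repro script — the gist above is the complete version:

```python
import torch
import coremltools as ct
from coremltools.models.neural_network import flexible_shape_utils
import coremltools.proto.FeatureTypes_pb2 as ft

traced = torch.jit.load('traced.pt')
# convert with a TensorType (MLMultiArray) input instead of an ImageType
converted = ct.convert(
    traced,
    inputs=[ct.TensorType(name='input', shape=(1, 3, 256, 256))])
spec = converted.get_spec()
# allow any input between 128 and 2048 pixels per side
flexible_shape_utils.set_multiarray_ndshape_range(
    spec, 'input', lower_bounds=[1, 3, 128, 128], upper_bounds=[1, 3, 2048, 2048])
# keep the output as a flexible-size RGB image, as in the repro script
out_name = spec.description.output[0].name
feature = flexible_shape_utils._get_feature(spec, out_name)
feature.type.imageType.colorSpace = ft.ImageFeatureType.RGB
out_range = flexible_shape_utils.NeuralNetworkImageSizeRange()
out_range.add_height_range((256, 4096))
out_range.add_width_range((256, 4096))
flexible_shape_utils.update_image_size_range(spec, feature_name=out_name, size_range=out_range)
ct.models.MLModel(spec).save('coreml_multiarray.mlmodel')
```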
I have exactly the same problem, and I see that the Core ML runtime basically considers only the imageType width and height values and ignores the imageSizeRange. I also tried removing the imageType width and height values; the model is apparently exported successfully that way, but the runtime then fails to compile it.
These are the input/output specs for my model:
(name: "image"
type {
imageType {
width: 1024
height: 1024
colorSpace: RGB
imageSizeRange {
widthRange {
lowerBound: 256
upperBound: 1024
}
heightRange {
lowerBound: 256
upperBound: 1024
}
}
}
},
name: "460"
type {
imageType {
width: 1024
height: 1024
colorSpace: RGB
imageSizeRange {
widthRange {
lowerBound: 256
upperBound: 1024
}
heightRange {
lowerBound: 256
upperBound: 1024
}
}
}
})
As said, any image with a size different from 1024x1024 fails with a "coreml Error binding image input buffer image: -7" message.
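A quick way to see the failure boundary from Python; a hypothetical check, assuming the model with the spec above was saved as `model.mlmodel` and the input is named "image":

```python
import numpy as np
from PIL import Image
import coremltools as ct

model = ct.models.MLModel('model.mlmodel')  # hypothetical path to the model above
for side in (256, 512, 1024):
    img = Image.fromarray(np.zeros((side, side, 3), dtype=np.uint8))
    try:
        model.predict({'image': img})
        print(side, 'ok')  # per the report, only the default 1024 succeeds
    except RuntimeError as e:
        print(side, 'failed:', e)
```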
I appear to be having the same issue #992
I was able to work around the bug by using a MultiArray input and a flexible-shape image output (as you mention). I am converting my input UIImage to an MLMultiArray using Accelerate's vImageConvert_ARGB8888toPlanarF; the overhead of the conversion is quite low for my purposes.
Obviously not as ideal as flexible input images just working, but for me at least it is a working solution until these bugs are (hopefully) fixed, and at least I can move on for now.
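For checking the same MultiArray-input model from Python, the equivalent preprocessing is just an HWC→CHW transpose before predict; a hypothetical sketch with placeholder names, not the Swift vImage code described above:

```python
import numpy as np
from PIL import Image
import coremltools as ct

model = ct.models.MLModel('coreml_multiarray.mlmodel')  # hypothetical MultiArray-input model
img = Image.open('input.png').convert('RGB')            # hypothetical input file
# HWC uint8 -> 1x3xHxW float32: the planar layout the MultiArray input expects
arr = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None]
res = model.predict({'input': arr})
```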
@3DTOPO Thank you for the workaround with Accelerate! Do you think that is the way Core ML converts UIImage internally? And could you provide a small snippet showing how to use vImageConvert_ARGB8888toPlanarF in Swift?
You're welcome! Possibly. Please see the end of this thread for the solution: https://github.com/hollance/CoreMLHelpers/issues/5#issuecomment-726021906
Hi, apologies if I'm hijacking.
If you use enumerated shapes, does it work with the ANE, @gordinmitya? I got it working with enumerated shapes on a trivial/minimal network, and while CPU & GPU work fine, the ANE still refuses.
@alexrkr I haven't gotten that far; I need to get the model running at all first. So in your case the model works on the ANE when converted with a fixed size, but falls back to CPU/GPU when converted with enumerated sizes, right?
Exactly: when exported with enumerated sizes it works fine on CPU/GPU but not on the ANE. A fixed size works fine on all devices. I was hoping to find a reference in Apple's docs, or someone who has gotten the ANE to work with a flexible-input network; I'm assuming that's possible.
Mitya, did you find a solution? I also can't use a flexible input in my app; it returns "Error binding image input buffer image: -7".
Many thanks!
I'm trying to wrap up development of an update that I've spent 2 years working on. Is this glaring bug ever going to be addressed?
Otherwise I am facing shipping a product with a horrendous workaround for a feature that is supposed to be supported. I can't express how frustrating this issue is; this is one of the most critical toolchains for my app development.
> @aseemw I tested with a MultiArray input with a dynamic size – it works just perfectly (while keeping the output as an image):
>
> `size_l = 128; size_u = 2048; flexible_shape_utils.set_multiarray_ndshape_range(spec, INPUT_NODE, [1,3,size_l,size_l], [1,3,size_u,size_u])`
>
> Here's the updated script: https://gist.github.com/gordinmitya/96a1b041bea18add4ec2e31907d11100
>
> Any ideas?
Hi! I can't get this to work either. Is there any other hack?