ResizeTransform bug when dataset contains an RleMask annotation

Open jlwhelan28 opened this issue 1 year ago • 1 comments

Attempting to resize a dataset with RleMask annotation types fails.

The root cause seems to be that the "resized" RleMask is initialized from an image mask array here. The isinstance(ann, Mask) check also catches RleMask because RleMask inherits from Mask: issubclass(RleMask, Mask) == True. https://github.com/openvinotoolkit/datumaro/blob/2494e15dfa4db24ad5ea8ec10247f341b70f36f0/src/datumaro/plugins/transforms.py#L1109-L1111
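
The inheritance pitfall can be reproduced without Datumaro. The following is a minimal sketch using hypothetical stand-in classes (`Mask`, `RleMask`, and a simplified `wrap`), not Datumaro's actual implementation. It only assumes that the RLE class is constructed from `rle` and does not accept `image`.

```python
class Mask:
    """Stand-in for a dense mask annotation (hypothetical, simplified)."""

    def __init__(self, image):
        self.image = image

    def _init_kwargs(self):
        return {"image": self.image}

    def wrap(self, **kwargs):
        # Simplified analogue of attr.evolve: rebuild the object from its
        # constructor arguments, overridden by the requested changes.
        changes = self._init_kwargs()
        changes.update(kwargs)
        return type(self)(**changes)


class RleMask(Mask):
    """Stand-in for an RLE-encoded mask: built from `rle`, not `image`."""

    def __init__(self, rle):
        self.rle = rle

    def _init_kwargs(self):
        return {"rle": self.rle}


ann = RleMask(rle={"size": [4, 4], "counts": b"placeholder"})

# The subclass passes the base-class check, so it enters the Mask branch.
assert isinstance(ann, Mask)
assert issubclass(RleMask, Mask)

error_message = None
try:
    # The Mask branch passes `image=...`, which the subclass cannot accept.
    ann.wrap(image=[[0, 1], [1, 0]])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The point of the sketch is that `isinstance` cannot tell a base class from its subclasses, so any branch written for `Mask` must either work for `RleMask` too or be preceded by a more specific check.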

The test for this method does not include an RleMask in its sample dataset https://github.com/openvinotoolkit/datumaro/blob/2494e15dfa4db24ad5ea8ec10247f341b70f36f0/tests/unit/test_transforms.py#L904-L930

Reproduce with

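        import numpy as np
        import pycocotools.mask

        from datumaro.components.annotation import (
            Bbox, Label, Mask, Points, Polygon, PolyLine, RleMask,
        )
        from datumaro.components.dataset import Dataset
        from datumaro.components.dataset_base import DatasetItem
        from datumaro.components.media import Image

        small_dataset = Dataset.from_iterable(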
            [
                DatasetItem(
                    id=i,
                    media=Image.from_numpy(data=np.ones((4, 4)) * i),
                    annotations=[
                        Label(1),
                        Bbox(1, 1, 2, 2, label=2),
                        Polygon([1, 1, 1, 2, 2, 2, 2, 1], label=1),
                        PolyLine([1, 1, 1, 2, 2, 2, 2, 1], label=2),
                        Points([1, 1, 1, 2, 2, 2, 2, 1], label=2),
                        Mask(
                            np.array(
                                [
                                    [0, 0, 1, 1],
                                    [1, 0, 0, 1],
                                    [0, 1, 1, 0],
                                    [1, 1, 0, 0],
                                ]
                            )
                        ),
                        RleMask(pycocotools.mask.encode(
                            np.asfortranarray(np.array(
                                [
                                    [0, 0, 1, 1],
                                    [1, 0, 0, 1],
                                    [0, 1, 1, 0],
                                    [1, 1, 0, 0],
                                ]
                            ).astype(np.uint8))
                        )),
                    ],
                )
                for i in range(3)
            ],
            categories=["a", "b", "c"],
        )
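        # Transforms are applied lazily: the error surfaces once the items are
        # materialized, e.g. by repr() in an interactive session or by len().
        resized = small_dataset.transform("resize", width=100, height=100)
        len(resized)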

Full traceback

dset.transform("resize", width=100, height=100)
Out[86]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File .../lib/python3.10/site-packages/IPython/core/formatters.py:708, in PlainTextFormatter.__call__(self, obj)
    701 stream = StringIO()
    702 printer = pretty.RepresentationPrinter(stream, self.verbose,
    703     self.max_width, self.newline,
    704     max_seq_length=self.max_seq_length,
    705     singleton_pprinters=self.singleton_printers,
    706     type_pprinters=self.type_printers,
    707     deferred_pprinters=self.deferred_printers)
--> 708 printer.pretty(obj)
    709 printer.flush()
    710 return stream.getvalue()

File .../lib/python3.10/site-packages/IPython/lib/pretty.py:410, in RepresentationPrinter.pretty(self, obj)
    407                         return meth(obj, self, cycle)
    408                 if cls is not object \
    409                         and callable(cls.__dict__.get('__repr__')):
--> 410                     return _repr_pprint(obj, self, cycle)
    412     return _default_pprint(obj, self, cycle)
    413 finally:

File .../lib/python3.10/site-packages/IPython/lib/pretty.py:778, in _repr_pprint(obj, p, cycle)
    776 """A pprint that just redirects to the normal repr function."""
    777 # Find newlines and replace them with p.break_()
--> 778 output = repr(obj)
    779 lines = output.splitlines()
    780 with p.group():

File .../lib/python3.10/site-packages/datumaro/components/dataset.py:261, in Dataset.__repr__(self)
    257 def __repr__(self) -> str:
    258     separator = "\t"
    259     return (
    260         f"Dataset\n"
--> 261         f"\tsize={len(self._data)}\n"
    262         f"\tsource_path={self._source_path}\n"
    263         f"\tmedia_type={self.media_type()}\n"
    264         f"\tannotated_items_count={self.get_annotated_items()}\n"
    265         f"\tannotations_count={self.get_annotations()}\n"
    266         f"subsets\n"
    267         f"\t{separator.join(self.get_subset_info())}"
    268         f"infos\n"
    269         f"\t{separator.join(self.get_infos())}"
    270         f"categories\n"
    271         f"\t{separator.join(self.get_categories_info())}"
    272     )

File .../lib/python3.10/site-packages/datumaro/components/dataset_storage.py:369, in DatasetStorage.__len__(self)
    367 def __len__(self) -> int:
    368     if self._length is None:
--> 369         self.init_cache()
    370     return self._length

File .../lib/python3.10/site-packages/datumaro/components/dataset_storage.py:178, in DatasetStorage.init_cache(self)
    176 def init_cache(self) -> None:
    177     if not self.is_cache_initialized():
--> 178         for _ in self._iter_init_cache():
    179             pass

File .../lib/python3.10/site-packages/datumaro/components/dataset_storage.py:185, in DatasetStorage._iter_init_cache(self)
    181 def _iter_init_cache(self) -> Iterable[DatasetItem]:
    182     try:
    183         # Can't just return from the method, because it won't add exception handling
    184         # It covers cases when we save the null error handler in the source
--> 185         for item in self._iter_init_cache_unchecked():
    186             yield item
    187     except _ImportFail as e:

File .../lib/python3.10/site-packages/datumaro/components/dataset_storage.py:268, in DatasetStorage._iter_init_cache_unchecked(self)
    266 if transform and transform.is_local:
    267     old_id = (item.id, item.subset)
--> 268     item = transform.transform_item(item)
    270 item_id = (item.id, item.subset) if item else None
    272 if item_id in cache:

File .../lib/python3.10/site-packages/datumaro/components/dataset_storage.py:103, in _StackedTransform.transform_item(self, item)
    101     if item is None:
    102         break
--> 103     item = t.transform_item(item)
    104 return item

File .../lib/python3.10/site-packages/datumaro/plugins/transforms.py:1111, in ResizeTransform.transform_item(self, item)
   1109 elif isinstance(ann, Mask):
   1110     rescaled_mask = self._lazy_resize_mask(ann, new_size)
-> 1111     resized_annotations.append(ann.wrap(image=rescaled_mask))
   1112 elif isinstance(ann, (Caption, Label)):
   1113     resized_annotations.append(ann)

File .../lib/python3.10/site-packages/datumaro/components/annotation.py:104, in Annotation.wrap(self, **kwargs)
    102 def wrap(self, **kwargs):
    103     "Returns a modified copy of the object"
--> 104     return attr.evolve(self, **kwargs)

File .../lib/python3.10/site-packages/attr/_funcs.py:419, in evolve(*args, **changes)
    416     if init_name not in changes:
    417         changes[init_name] = getattr(inst, attr_name)
--> 419 return cls(**changes)

TypeError: RleMask.__init__() got an unexpected keyword argument 'image'

jlwhelan28, Mar 08 '24 14:03

@sooahleex, could you take a look?

wonjuleee, Mar 11 '24 00:03

Hi @jlwhelan28, apologies for the delayed response. I've reviewed this and confirmed that the TypeError: RleMask.__init__() got an unexpected keyword argument 'image' occurs when attempting to resize RleMask annotations, as you described. It happens because RleMask inherits from Mask, so the isinstance(ann, Mask) check treats RleMask annotations as plain Mask annotations. Thank you for the detailed analysis.

To fix this, I've changed the transform so that RleMask annotations are resized separately. https://github.com/openvinotoolkit/datumaro/blob/b29b8c38dcce7a1ca96224580b542667129ac95a/src/datumaro/plugins/transforms.py#L1073-L1080 Test code covering this case has also been added, so please feel free to refer to it if you wish to verify the fix. https://github.com/openvinotoolkit/datumaro/blob/b29b8c38dcce7a1ca96224580b542667129ac95a/tests/unit/test_transforms.py#L927-L940
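
One common way to handle a subclass separately is to test for it before its base class. The sketch below illustrates that ordering with hypothetical stand-in classes and a hypothetical `resize_annotation` helper. It is not the code from the linked commit, which should be consulted for the actual fix.

```python
class Mask:
    """Stand-in for a dense mask annotation (hypothetical)."""

    def __init__(self, image):
        self.image = image


class RleMask(Mask):
    """Stand-in for an RLE-encoded mask annotation (hypothetical)."""

    def __init__(self, rle):
        self.rle = rle


def resize_annotation(ann, new_size):
    # Check the more specific type first; otherwise the Mask branch would
    # also match RleMask instances.
    if isinstance(ann, RleMask):
        # A real implementation would decode, resize, and re-encode here.
        # This sketch only records which branch was taken.
        return ("rle", new_size)
    elif isinstance(ann, Mask):
        return ("dense", new_size)
    raise TypeError(f"Unsupported annotation type: {type(ann).__name__}")


rle_branch = resize_annotation(RleMask(rle={"size": [4, 4]}), (100, 100))
dense_branch = resize_annotation(Mask(image=[[0, 1], [1, 0]]), (100, 100))
print(rle_branch[0], dense_branch[0])
```

Ordering the checks from most to least specific keeps each branch free to use constructor arguments that only its own class accepts.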

This fix will be included in Datumaro 2.0, scheduled for release at the end of March. If you encounter any issues related to this in the future, please reopen the issue and let us know. Thank you for your interest.

sooahleex, Mar 20 '24 04:03