[Prototype] 4855 compose with lazy resampling

Open · wyli opened this issue 3 years ago · 4 comments

Fixes #4855

Description

The goal of this PR is to add a lazy_resample: bool option to Compose, so that consecutive spatial transforms can be applied with a single image resampling step. This has several benefits:

  • reduce the memory footprint
  • speed up the composed transform
  • reduce interpolation errors

The implementation mainly consists of three types of non-breaking API modifications (a minimal sketch follows the list):

  • ensure all spatial transforms have an option to skip the array resampling step while still performing the metadata update; this is achieved with a new base class (LazyTransform, with a class variable eager_mode: bool).
  • MetaTensor needs a mechanism to track lazily applied changes; this is achieved by adding a MetaTensor.evaluated flag and a spatial_shape property that tracks the latest expected output spatial shape.
  • the Compose transform's __call__ must be able to determine when to evaluate the fused transforms eagerly within a sequence of transforms.
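
A minimal sketch of how the first two points could fit together, using a hypothetical LazyFlip transform and a plain metadata dict rather than the PR's actual MetaTensor changes: the transform always updates the affine metadata, but only touches the array when eager_mode is True.

import torch

class LazyTransform:
    """Base class: subclasses skip array resampling when ``eager_mode`` is False."""
    eager_mode: bool = True

class LazyFlip(LazyTransform):
    """Illustrative transform: flip one spatial axis of a (C, H, W, D) image."""

    def __init__(self, spatial_axis: int = 0):
        self.spatial_axis = spatial_axis

    def __call__(self, img: torch.Tensor, meta: dict):
        size = img.shape[self.spatial_axis + 1]  # +1 skips the channel dimension
        flip = torch.eye(4, dtype=torch.float64)
        flip[self.spatial_axis, self.spatial_axis] = -1.0
        flip[self.spatial_axis, 3] = float(size - 1)  # flip about the axis in index space
        meta = dict(meta)
        # the metadata update always happens, so the expected output geometry stays tracked
        meta["affine"] = meta.get("affine", torch.eye(4, dtype=torch.float64)) @ flip
        if self.eager_mode:
            img = torch.flip(img, dims=[self.spatial_axis + 1])
            meta["evaluated"] = True
        else:
            # lazy: leave the array untouched and record the pending change
            meta.setdefault("pending", []).append(flip)
            meta["evaluated"] = False
        return img, meta

Compose could then keep eager_mode off across consecutive lazy transforms and trigger a single resampling when it reaches a non-lazy transform or the end of the sequence.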

Some examples of the improvement in composed transform quality, based on this sequence:

import numpy as np
import monai.transforms as mt

keys = ["image", "seg"]  # example keys for illustration; the original snippet leaves them undefined

xforms = [
    mt.LoadImageD(keys, ensure_channel_first=True),
    mt.Orientationd(keys, "RAS"),
    mt.SpacingD(keys, (1.5, 1.5, 1.5)),
    mt.CenterScaleCropD(keys, roi_scale=0.9),
    # mt.CropForegroundD(keys, source_key="seg", k_divisible=5),
    mt.RandRotateD(keys, prob=1.0, range_y=np.pi / 2, range_x=np.pi / 3),
    mt.RandSpatialCropD(keys, roi_size=(76, 87, 73)),
    mt.RandScaleCropD(keys, roi_scale=0.9),
    mt.Resized(keys, (30, 40, 60)),
    # mt.NormalizeIntensityd(keys),
    mt.ZoomD(keys, 1.3, keep_size=False),
    mt.FlipD(keys),
    mt.Rotate90D(keys),
    mt.RandAffined(keys),
    mt.ResizeWithPadOrCropd(keys, spatial_size=(32, 43, 54)),
    mt.DivisiblePadD(keys, k=3),
]
xform = mt.Compose(xforms, lazy_resample=True, mode='nearest')
xform.set_random_state(0)

before this PR (xform = mt.Compose(xforms)):

(screenshots omitted)

after this PR (xform = mt.Compose(xforms, lazy_resample=True, mode='nearest')):

(screenshots omitted)

after this PR (xform = mt.Compose(xforms, lazy_resample=True, mode=3, padding_mode="reflect")):

(screenshots omitted)

TODOs:

  • [ ] unit tests
  • [ ] skip resampling when the affine transform is close to the identity (see the sketch after this list)
  • [ ] lazy resample in the inverse
  • [ ] discuss and improve the API design
  • [ ] compatibility with the cache-based datasets
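
For the "close to the identity" item above, a possible check, assuming the pending changes have already been accumulated into a single square affine matrix (a sketch, not the PR's code):

import torch

def is_near_identity(affine: torch.Tensor, atol: float = 1e-6) -> bool:
    """Return True when resampling with ``affine`` would effectively be a no-op."""
    eye = torch.eye(affine.shape[-1], dtype=affine.dtype, device=affine.device)
    return bool(torch.allclose(affine, eye, atol=atol))

When this returns True, the resampling step (and its interpolation error) can be skipped and only the metadata needs updating.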

Status

Hold

Types of changes

  • [x] Non-breaking change (fix or new feature that would not break existing functionality).
  • [ ] Breaking change (fix or new feature that would cause existing functionality to change).
  • [ ] New tests added to cover the changes.
  • [ ] Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • [ ] Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • [ ] In-line docstrings updated.
  • [ ] Documentation updated, tested make html command in the docs/ folder.

wyli · Aug 15 '22, 10:08

Initial thoughts:

  • It certainly cuts out quite a lot of the reworking that I thought would be necessary.
  • The way you use the metatensor to do the heavy lifting for the feature certainly seems to work well.
  • I think that my points on Friday about Compose, and the interaction between the various things that might want to change the effective pipeline that gets run, still stand.
  • I think it is still worth looking at refactoring the transforms so that they give people a simple way to write new transforms following the same principle.
  • Do you intend for this to be possible if metatensors aren't used, or do we require MetaTensor as a prerequisite?

atbenmurray · Aug 15 '22, 11:08

Hi @atbenmurray ,

I think we are all aligned that eventually we will use MetaTensor as the unique data type; it can simplify the codebase and provide flexible information for future features. CC @ericspod @wyli, please feel free to comment if you have different thoughts.

Thanks.

Nic-Ma · Aug 18 '22, 01:08

Hi Nic. We could indeed assume MetaTensor for everything, and that makes life a lot simpler internally. The main issue I see with that is that people who want to write transforms have to retrofit their code with MetaTensors. We always stated that we wanted MONAI to be as close to vanilla PyTorch as possible and that extensions were additive. I think we have mechanisms that allow us to handle pending transforms independently of metatensors (they are WIP in #4922, in atmostonce/functional.py and atmostonce/array.py). Can we have that as a subtopic of the Friday meeting?

atbenmurray · Aug 19 '22, 14:08

I see two major points to address, as discussed on Wednesday:

Pending transform representation

As it stands, the metatensor has an evaluated flag. That flag indicates whether the transforms are pending or historical. If you have lazy transforms followed by a non-lazy transform followed by more lazy transforms:

La -> Lb -> Nc -> Ld -> Le

lazy transforms La and Lb will be applied twice: applying La and Lb at the point we reach the non-lazy transform Nc flips the flag to evaluated = True, and upon reaching Ld the flag is flipped back to False. At the end of Le a final apply must be done, but the transform history of the tensor is still La, Lb, Ld, Le (we can argue about Nc, but there are scenarios in which it does nothing to the metatensor, as it is not aware of these capabilities). A pending list, such as the one added to MetaTensor in #4922, distinguishes between applied transforms and pending transforms; all pending transforms become applied transforms on the metatensor once applied (a rough sketch of the idea is included below). There are other ways, but this is a simple and (IMHO) clean approach.
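
A rough sketch of what such a pending list could look like, using a hypothetical TrackedTensor rather than the actual #4922 MetaTensor code, and glossing over MONAI's coordinate conventions (affines are assumed to be 4x4 matrices in normalised grid coordinates and the image a float (C, D, H, W) volume):

import torch
import torch.nn.functional as F

class TrackedTensor:
    """Sketch: pending affines are recorded separately from applied ones."""

    def __init__(self, data: torch.Tensor):
        self.data = data        # (C, D, H, W) float volume
        self.applied = []       # transforms already baked into ``data``
        self.pending = []       # 4x4 affines recorded but not yet resampled

    def push_pending(self, affine: torch.Tensor):
        self.pending.append(affine)

    def apply_pending(self):
        if not self.pending:
            return self
        fused = torch.eye(4, dtype=self.data.dtype)
        for affine in self.pending:
            fused = affine.to(fused.dtype) @ fused
        # a single resampling step covers every pending transform
        theta = fused[:3].unsqueeze(0)  # (1, 3, 4) as expected by affine_grid
        grid = F.affine_grid(theta, [1, *self.data.shape], align_corners=False)
        self.data = F.grid_sample(self.data.unsqueeze(0), grid, align_corners=False)[0]
        self.applied.extend(self.pending)
        self.pending = []
        return self

apply_pending resamples exactly once, and the pending transforms then move into the applied history, so nothing is applied twice.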

atbenmurray · Aug 19 '22, 14:08

@atbenmurray I agree with your comment completely. I also think we need to stay as close to vanilla PyTorch as possible.

MetaTensor already deviates from PyTorch (it is a custom subclass of torch.Tensor), which introduces a steep learning curve for any new user of MONAI (to understand or modify it) and may lead to bugs such as https://github.com/Project-MONAI/MONAI/issues/5283

By putting more things into MetaTensor, it will deviate from PyTorch even further. We should consider an alternative way to compose spatial transforms, e.g. a class that fuses the transforms (composing the spatial grids, keeping track of intermediate state, and performing the interpolation only once at the end); see https://github.com/Project-MONAI/MONAI/issues/4855#issuecomment-1292585995 and the sketch below.
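
A rough sketch of the fusion idea (hypothetical helper, not an existing MONAI API): each spatial transform contributes a 4x4 affine, the affines are composed into one matrix, and interpolation happens once with the fused result.

import torch

def fuse_affines(affines: list[torch.Tensor]) -> torch.Tensor:
    """Compose 4x4 affines; later transforms act on the output of earlier ones."""
    fused = torch.eye(4)
    for affine in affines:
        fused = affine @ fused
    return fused

A single grid/resampling call with the fused matrix then replaces the per-transform interpolations, which is where the memory, speed, and accuracy gains come from.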

myron · Oct 26 '22, 20:10

At the moment, the implementation in #5420, containing the apply method that performs the accumulation of transforms and the lazy resampling, can work with metatensors or with standard tensors plus separate transform descriptions. However, given that transforms always convert to metatensor, additional work would be needed to support the use of plain tensors in preprocessing again, and that would be going back on the decision to use metatensors exclusively.

atbenmurray · Oct 28 '22, 14:10

https://github.com/Project-MONAI/MONAI/pull/5860 has a more recent implementation

wyli · Jan 17 '23, 00:01