
data preprocessing for training: resizing, cropping, 256, 286, 512?

Open lelikchern opened this issue 7 months ago • 0 comments

Could you kindly explain how you preprocess the data for training? I understand that you normalize the data, but do you also create crops and train on those crops? In the paper you say you use 512x512 images for training on Mayo2016. As far as I understand, 512x512 is the native resolution for Mayo2016, so does that mean you use the whole image as is?
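
For reference, here is how I am currently normalizing the slices; this is a minimal sketch of my own, and the HU window values are my assumption, not values taken from your repo, so please correct me if ASCON does it differently:

```python
import numpy as np

def normalize_hu(img, hu_min=-1024.0, hu_max=3072.0):
    """Clip a CT slice to a HU window and scale it to [0, 1].

    hu_min / hu_max are my own placeholder choices, not values
    from the ASCON repo -- please correct me if they differ.
    """
    img = np.clip(img, hu_min, hu_max)
    return (img - hu_min) / (hu_max - hu_min)
```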

For Mayo2020 you write about using 256x256 crops. Do I understand correctly that you use data_preprocessing.ipynb to create patches from the initial data (which, as far as I remember, is also 512x512 pixels natively)? You do that at least for testing, and from what I see in that notebook you also make sure the test patches do not contain a lot of empty space. Do you use this same strategy to create the 256x256 training patches as well? (See the sketch below for what I mean.)
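
In other words, is the training-patch creation roughly equivalent to something like the following? This is my own sketch, not code from your repo; the stride, the air threshold, and the minimum-tissue fraction are placeholder values I made up for illustration:

```python
import numpy as np

def extract_patches(img, patch=256, stride=256, air_thresh=0.05, min_tissue=0.25):
    """Slide a window over a 512x512 slice (already normalized to [0, 1])
    and keep only patches that are not mostly empty space.

    `air_thresh` (intensity below which a pixel counts as air) and
    `min_tissue` (minimum fraction of non-air pixels per patch) are
    placeholders -- I don't know the exact criterion the notebook uses.
    """
    patches = []
    h, w = img.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            p = img[y:y + patch, x:x + patch]
            if (p > air_thresh).mean() >= min_tissue:
                patches.append(p)
    return patches
```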

Finally, there is a part in the code where you seem to load and resize all images to 286x286 pixels (the load_size parameter in base_options.py), and then take a 256x256 crop from within the 286x286 image. Is my understanding correct?
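
That is, I read this as the usual resize-then-random-crop transform, something along these lines (my interpretation only; the sizes come from base_options.py, but the use of torchvision and the interpolation defaults are my assumptions):

```python
import torchvision.transforms as T

# My reading of load_size / crop_size: resize each slice to 286x286,
# then take a random 256x256 crop. Assumes the input is already a
# PIL image or a tensor; interpolation mode is left at the default.
transform = T.Compose([
    T.Resize((286, 286)),
    T.RandomCrop(256),
])
```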

The reason I'm asking is that I tried to train ASCON with the default options given on the home page of your project, using .npy files at 512x512 resolution, and I'm observing very slow convergence, almost no convergence at all. So I wonder whether I should have started training on 256x256 patches first, to get the network to "exploit inherent semantic information"? Does that make sense?

lelikchern · Jul 02 '25 17:07