Regression checkerboard
Hello, I am using Clay to do pixel-wise regression on water images. I have fine-tuned the model, and when I run predictions, I get a checkerboard pattern in the output. Any ideas about what is happening?
I have seen that the model outputs at half resolution and then interpolates the result back to the input size. Is it possible to get a prediction from the model at the same size as the input, without this interpolation?
Could this checkerboard pattern be related to the fact that I am using water images?
Thanks!!
@TaniaJG We conducted a pixel-wise regression for AGB, and the details are available in the Clay documentation. We didn't observe any checkerboard patterns during this process, but the results do appear somewhat pixelated because we upsampled the predictions from half the image size.
We can generate predictions at the original image resolution by adding another upsampling layer in the model's fusion step. I’m not certain if this issue is specific to water images, but it would be worth testing this hypothesis further.
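A minimal sketch of what that extra stage could look like (the channel counts and the single-channel regression head are assumptions for illustration, not Clay's actual fusion code):

```python
import torch
import torch.nn as nn

# Hypothetical extra 2x upsampling stage appended after the fusion step,
# so the half-resolution feature map is brought back to the input size.
extra_upsample = nn.Sequential(
    nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),  # 2x spatial upsample
    nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=1),  # project to a single regression channel
)

fused = torch.randn(1, 64, 112, 112)  # stand-in for the half-resolution fusion output
print(extra_upsample(fused).shape)    # torch.Size([1, 1, 224, 224])
```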
Thank you, Soumya, for your fast response. I added another layer where you pointed, to get the prediction at the original resolution, but this does not solve the problem. I will keep investigating what is happening here.
Thanks!!
Hi again!! I have been researching this issue, and I have not been able to solve it yet.
By the way, I tried fine-tuning with patch_size=16 in the SegmentEncoder() called by the Regressor(), and now I don't get the previous striped pattern, but the result looks pixelated (see figure). Is it OK to use patch sizes other than 8?
Another thing that comes to mind is that the model makes mini-patches of size 8x8, calculates the embeddings for each one, and then upsamples (with Conv) and downsamples (with MaxPool) them to build the FPN. Would it make sense to make the mini-patches at different sizes and calculate the embeddings for each size, to get a "more realistic" FPN? I mean, instead of upsampling/downsampling the embeddings computed at patch size 8x8, we would directly have embeddings at different sizes (8x8, 16x16, 32x32, ...), as in the sketch below. I would be grateful if you could clarify whether this is a reasonable approach or not, also taking into account memory consumption.
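To illustrate the idea (the band count and embedding dimension are made-up values, and this is not Clay's actual FPN code):

```python
import torch
import torch.nn as nn

# One patch-embedding conv per patch size (kernel = stride = patch size),
# so each pyramid level gets its own embeddings instead of resampled 8x8 ones.
image = torch.randn(1, 4, 224, 224)  # (B, C, H, W)
embed_dim = 128

pyramid = {}
for p in (8, 16, 32):
    patchify = nn.Conv2d(4, embed_dim, kernel_size=p, stride=p)
    pyramid[p] = patchify(image)  # (1, 128, 224 // p, 224 // p)

for p, feats in pyramid.items():
    print(p, tuple(feats.shape))
```

I realize each extra scale would add its own set of tokens to push through the encoder (the 8x8 level alone is 28x28 = 784 tokens for a 224x224 image), so memory would grow with the number of scales.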
Thanks in advance for your great project!!
I have encountered the same checkerboard pattern when fine-tuning a regression model to make canopy height predictions on 224x224 patches of 0.5 m RGB and NIR imagery.
Are there any recommendations on trying different feature maps or patch sizes for the Regressor()?
Any guidance would be appreciated.
Hi, I think I solved the problem. I corrected the standardization values given in metadata.yml by subtracting the mean from the std value. For example, for the S2 red band, the mean was 1552 and the std was 1888. However, this std value seemed weird; it looked like mean+std rather than the std itself. I computed 1888 - 1552 to get std = 336, and now the striped pattern is gone. I still get the typical checkerboard pattern, but that will probably be removed by replacing ConvTranspose2d with PixelShuffle, along the lines of the sketch below.
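For reference, roughly the kind of replacement I mean (a sketch with assumed channel counts, not the actual Clay decoder):

```python
import torch
import torch.nn as nn

# PixelShuffle-based 2x upsampling block, a common checkerboard-free
# alternative to ConvTranspose2d.
class PixelShuffleUp(nn.Module):
    def __init__(self, in_ch, out_ch, scale=2):
        super().__init__()
        # The conv produces scale**2 times the output channels ...
        self.conv = nn.Conv2d(in_ch, out_ch * scale**2, kernel_size=3, padding=1)
        # ... and PixelShuffle rearranges them into a (scale x scale) larger grid.
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

x = torch.randn(1, 64, 112, 112)
print(PixelShuffleUp(64, 32)(x).shape)  # torch.Size([1, 32, 224, 224])
```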
Could you please clarify why the std values in metadata.yml appear to have the mean added to them? Thanks!
Thanks, @TaniaJG for identifying the source of the checkerboard pattern.
For my example, I had to calculate the mean and std for my imagery. However, my statistics were calculated with the nodata value (0) included, which skewed them. After correcting the mean and std normalization values, the checkerboard pattern went away.
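For others hitting this, the fix was essentially to mask nodata before computing the statistics, something like this (the array shape and the nodata value of 0 are specific to my setup):

```python
import numpy as np

# Per-band mean/std that excludes nodata pixels instead of counting them as 0.
stack = np.random.randint(0, 4000, size=(16, 4, 224, 224)).astype("float32")
stack[stack == 0] = np.nan                     # mark nodata before computing stats
band_mean = np.nanmean(stack, axis=(0, 2, 3))  # one mean per band
band_std = np.nanstd(stack, axis=(0, 2, 3))    # one std per band
```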
All that remains now is to upsample the results to the full resolution.
Hiya folks! Interesting discussion in this thread! I am working on similar problems and tasks.
I was also able to get better predictions after applying @TaniaJG's suggestion to correct the mean and std values. However, some of the pixel-wise predictions are misplaced; could this be because I did not update the Sentinel-1 values?
Regarding height, I am also trying to get estimates for the UK. For that I am using GEDI RH98 data, with the same S1 and S2 as predictors, but here the estimates are very odd. Any suggestions, @kjtheron?
Thanks in advance for the help and guidance! :)
I have not fully resolved this issue altogether, but I can list some of the decisions I made to improve my results.
- Correctly setting a nodata value during preprocessing of the imagery.
- Calculating the mean and SD of the imagery used for fine-tuning.
- Data type and outlier values. I had uint16 imagery with some extreme outlier pixels. I masked these values to ensure higher-quality training patches (see the sketch after this list). I also converted to 8-bit, but the results were not great.
- Changing the loss function. I tried RMSE, MAE, MSE, and Huber.
I still need to upsample to the original resolution and perform another training run to assess the results, but with the suggestions above the results seemed good.
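A rough sketch of the outlier masking and the losses I compared (the percentile threshold and the use of 0 as nodata are choices from my own setup, not a general recommendation):

```python
import numpy as np
import torch.nn as nn

# Mask extreme uint16 outlier pixels before building training patches.
patch = np.random.randint(0, 65535, size=(4, 224, 224)).astype("uint16")
hi = np.percentile(patch, 99.5)  # treat the extreme upper tail as outliers
patch[patch > hi] = 0            # push outliers to the nodata value

# Loss functions I compared (RMSE taken as the square root of MSE at eval time).
losses = {"mse": nn.MSELoss(), "mae": nn.L1Loss(), "huber": nn.HuberLoss()}
```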
@kjtheron @TaniaJG thanks so much for sharing, this was really helpful! How did you handle updating the other bands (NIR, SWIR...) besides RGB? Using the method of subtracting the mean from the standard deviation of those bands gave a negative SD, so I'm wondering if there was something else you tried?
cc @MaceGrim @4242psherman4242