
Questions about 3D View Construction and Pixel Source Mapping

fattypiggy opened this issue on Feb 28 '25 · 6 comments

Hi @PetteriAimonen ,

First of all, thank you for your amazing work and for supporting open-source development! Your focus-stack project is really impressive, and I appreciate the effort you've put into it.

While using the code and reading the paper on the complex wavelet transform, I have a few questions:

  1. From my understanding, the final fused image does not take pixels directly from any single input image but is instead computed based on focus measures. Is this interpretation correct?

  2. How is the 3D view constructed? What metric or calculation determines the relative height in the visualization?

  3. If I want to determine which input image a particular pixel in the final image originates from, how can I achieve this? Which part of the code should I modify?

I would greatly appreciate any guidance you can provide! Thanks again for your contributions.

fattypiggy · Feb 28 '25 20:02

That's correct, the algorithm used for image merging uses wavelets as a sort of combined focus measure and pixel processing tool. This nicely handles blending between focus levels, but can result in halos, which are a bit of a problem.
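To illustrate the core selection rule, here is a simplified, real-valued sketch; the actual code works on complex wavelet coefficients and applies additional consistency filtering, so treat this as the idea only:

```cpp
#include <opencv2/core.hpp>
#include <vector>

// At every wavelet-domain position, keep the coefficient with the largest
// magnitude across the stack. Simplified real-valued stand-in for the
// complex-wavelet merge used by focus-stack.
cv::Mat fuse_coefficients(const std::vector<cv::Mat>& coeffs) // CV_32F mats
{
    cv::Mat fused = coeffs.at(0).clone();
    cv::Mat best_mag = cv::abs(fused);
    for (size_t i = 1; i < coeffs.size(); i++)
    {
        cv::Mat mag = cv::abs(coeffs[i]);
        cv::Mat mask = (mag > best_mag);   // positions where slice i is "sharper"
        coeffs[i].copyTo(fused, mask);
        mag.copyTo(best_mag, mask);
    }
    return fused; // inverse-transforming this yields the merged image
}
```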

A few years back I was contracted to add a depthmap feature. The wavelet depthmap itself is poorly suited for this, because the wavelets at different levels do not directly correspond to any single pixel. Instead, the depthmap and 3D view features work using an entirely separate algorithm and share only the alignment part.

In case you haven't checked it yet, there is a description of the algorithms here: https://github.com/PetteriAimonen/focus-stack/blob/master/docs/Algorithms.md

Direct stacking of images pixel-by-pixel based on the 3D view depthmap data is not currently implemented, and it wouldn't be very pretty.

You can get the wavelet depthmap by saving Task_Merge::depthmap(), but that will be in wavelet space and does not correspond to individual pixels.
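If you want to inspect it anyway, a hypothetical patch along these lines would write it out; I'm assuming here that the returned matrix is a single-channel cv::Mat, which you should verify against the code:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>

// Normalize the wavelet-space depthmap to 8 bits and save it for viewing.
// The single-channel cv::Mat assumption is illustrative, not verified.
void save_depthmap(const cv::Mat& depth)
{
    cv::Mat depth8;
    cv::normalize(depth, depth8, 0, 255, cv::NORM_MINMAX, CV_8U);
    cv::imwrite("depthmap_wavelet_space.png", depth8);
}
```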

PetteriAimonen · Mar 01 '25 07:03

Thank you so much for your response and the clarification. After studying your code and the paper, I fully understand your point: using the wavelet-level depth directly for 3D reconstruction in a Z-stack scenario is indeed not optimal.

Given that the images in the Z-stack are taken at a fixed spacing (e.g., 10 microns apart), I'm looking for an accurate way to create a 3D model of microscopic objects. My goal is to measure parameters like the perimeter with high precision, so a reliable 3D reconstruction method is very important for my project.
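For context, once a per-pixel depth index map exists, converting it to physical heights at this fixed spacing would be trivial; the hard part is getting a trustworthy index map in the first place. A minimal sketch (the names here are just illustrative):

```cpp
#include <opencv2/core.hpp>

// Convert a per-pixel slice-index map into physical heights, given the
// known Z step between slices (e.g. 10 microns).
cv::Mat indices_to_heights(const cv::Mat& depth_index, double z_step_um)
{
    cv::Mat height_um;
    depth_index.convertTo(height_um, CV_32F, z_step_um); // height = index * step
    return height_um;
}
```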

Do you have any suggestions or insights for generating an accurate 3D model from a Z-stack of microscopic images? Any advice or inspiration you can share would be greatly appreciated.

Thank you again for your help!

fattypiggy · Mar 04 '25 16:03

I'm not sure how good a result is possible from just a single image stack. If possible, you could consider structured light or stereoscopic techniques.

But for a single stack, this paper may be of use: Model-Based 2.5-D Deconvolution for Extended Depth of Field in Brightfield Microscopy. It is from the same authors as the wavelet method I'm using. It's supposed to provide higher quality results, but requires knowing the lens characteristics and the Z distance for each image. It's also very computationally expensive. There is demonstration software available, but it takes a long time to run.

PetteriAimonen · Mar 04 '25 18:03

Thanks for your reply. The paper you suggested seems to fit my project, but I need some time to dive into it. I have a series of Z-stack images where the distance between each image is known. I'm considering another approach: judging which pixels are in focus in each image and using those in-focus pixels as the structure. Do you think that's a reasonable way? Can I use the wavelet transform as a "pixel selector" to judge whether a pixel is in focus?

By the way, thanks for your patience and kindness.

fattypiggy · Mar 07 '25 17:03

A single pixel has no information on whether it is "in focus". The algorithm currently used for the depthmap output in task_focusmeasure.cc is based on the Sobel operator over a 3x3 area. In my trials it worked somewhat better than using the wavelet transform as the focus measure.
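Roughly, that kind of measure combined with a per-pixel argmax across the stack looks like the sketch below. The smoothing window and other details are my assumptions for illustration, not the exact parameters in task_focusmeasure.cc:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Per-pixel depth estimate: for each grayscale slice, compute a Sobel
// gradient magnitude, smooth it locally, and record which slice scored
// highest at each pixel. Window size (9x9) is an assumed parameter.
cv::Mat depth_from_stack(const std::vector<cv::Mat>& gray_slices)
{
    cv::Mat best_measure, depth_index;
    for (size_t i = 0; i < gray_slices.size(); i++)
    {
        cv::Mat gx, gy, measure;
        cv::Sobel(gray_slices[i], gx, CV_32F, 1, 0, 3); // 3x3 Sobel kernels
        cv::Sobel(gray_slices[i], gy, CV_32F, 0, 1, 3);
        cv::magnitude(gx, gy, measure);
        cv::boxFilter(measure, measure, -1, cv::Size(9, 9)); // local aggregation

        if (i == 0)
        {
            best_measure = measure;
            depth_index = cv::Mat::zeros(measure.size(), CV_8U); // <= 255 slices
        }
        else
        {
            cv::Mat mask = (measure > best_measure);
            depth_index.setTo(cv::Scalar((double)i), mask);
            measure.copyTo(best_measure, mask);
        }
    }
    return depth_index; // per-pixel index of the sharpest slice
}
```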

The main problem with focus estimation is that every image will have areas with practically no contrast (flat color) and areas with very sharp contrast. Halos from the high-contrast areas overlap the low-contrast areas, causing false results. The model-based method aims to avoid this by modeling how the lens spreads the defocused light.

In the paper linked above there is Figure 5, which compares the variance method (somewhat similar to, but not exactly the same as, the Sobel operator), CWT-EDF (complex wavelets), and their model-based method.

[Figure 5 from the paper: comparison of the variance, CWT-EDF, and model-based methods]

PetteriAimonen · Mar 07 '25 17:03

Thanks for your prompt reply. After two weeks of exploring, I still haven't found a good enough solution for my project, so I will give up exploring new approaches temporarily. For now, I will focus on 2D measurement and keep an eye on 3D methods.

fattypiggy · Mar 21 '25 20:03