Which is the truth: datashader VS matplotlib

Open sunshx-bioinfo opened this issue 1 year ago • 1 comments

Hi,

I plotted the cell boundaries with method='datashader' and with method='matplotlib'. Both display the cell distribution without overlap (13899 cells in total).

But the datashader plot looks sparser than the matplotlib plot (below):

I guess the matplotlib plot is the truth, is that right? The matplotlib method already seems to resolve the problem of overlapping cells.

Although the datashader method is faster than the matplotlib method, we should choose matplotlib for better accuracy, is that right?

Looking forward to your reply.

Dec 25 '24 07:12 sunshx-bioinfo

Hi, thanks for reaching out. You could try increasing the size of the datashader canvas, which is influenced by the figure size and the DPI (https://github.com/scverse/spatialdata-plot/blob/12f490d0d16c9f1cf48e5373d1a39c815b5b98ed/src/spatialdata_plot/pl/utils.py#L2042-L2043).

In this case (not too many cells and no overlapping cells), I would prefer the plot on the right. You could get a more similar plot by using a bigger canvas for datashader. With the right size you may be able to get a plot with better resolved cell boundaries that is still faster than matplotlib (keep in mind that the larger the canvas, the slower datashader will be).
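To get a feel for how much a bigger canvas helps, here is a rough back-of-the-envelope sketch. It assumes the canvas is simply the figure size in inches multiplied by the DPI, and that the cells fill the whole canvas evenly. Both are simplifications, and the helper names `canvas_pixels` and `pixels_per_cell` are made up for illustration; they are not part of spatialdata-plot or datashader.

```python
def canvas_pixels(figsize_inches, dpi):
    """Approximate pixel size of a figure: inches times dots per inch.

    Assumption: the rasterization canvas matches the figure's pixel size.
    """
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


def pixels_per_cell(figsize_inches, dpi, n_cells):
    """Average number of canvas pixels available to draw each cell.

    Assumption: cells are spread evenly over the full canvas.
    """
    width_px, height_px = canvas_pixels(figsize_inches, dpi)
    return (width_px * height_px) / n_cells


n_cells = 13899  # number of cells reported in this issue

# matplotlib's default figure: 6.4 x 4.8 inches at 100 DPI
default_budget = pixels_per_cell((6.4, 4.8), 100, n_cells)

# Doubling both the figure size and the DPI
larger_budget = pixels_per_cell((12.8, 9.6), 200, n_cells)

print(f"default canvas: {canvas_pixels((6.4, 4.8), 100)}, "
      f"{default_budget:.1f} px per cell")
print(f"larger canvas:  {canvas_pixels((12.8, 9.6), 200)}, "
      f"{larger_budget:.1f} px per cell")
```

With only a couple of dozen pixels per cell, a thin boundary can easily be drawn incompletely, which would fit the sparser look. In practice, one way to try a larger canvas would be to create the axes yourself with `plt.subplots(figsize=..., dpi=...)` and pass them to the plotting call, assuming your version of spatialdata-plot accepts an existing `ax`.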

Instead, when plotting a large number of cells (so dense that you can't see the boundaries, like here), or when plotting overlapping cells or overlapping points, the two plots would differ and datashader would be preferred (see examples of such cases here).

Jan 05 '25 22:01 LucaMarconato