Worse performance using datashader?
I wrote some benchmarks, available here https://github.com/scverse/spatialdata-plot/pull/295 (they can simply be run as tests), and I have noticed that the datashader performance is worse than the matplotlib-based one.
I think this may be due to the size of the canvas used by datashader, since in the MERFISH example here https://github.com/scverse/spatialdata-plot/pull/243 the performance was (as expected) better.
Therefore using a smaller default canvas size may fix the issue. @Sonja-Stockhaus could you please have a look into this?
Here are the results of a (single) run of the tests (the timings are consistent across multiple manual runs).
With the fix that I proposed for the performance bug here https://github.com/scverse/spatialdata-plot/issues/297, the performance gap is much bigger.
@Sonja-Stockhaus my "didn't-look-at-the-code" theory is that datashader generates too large of an image which then bypasses the rasterisation-downsampling logic. Wdyt?
Yep, datashader generates an image that is exactly the size of the extent (large extent = large image = long runtime). I'll think of something so that we can use a smaller canvas size and then maybe rasterize or so to bring it back to the original scale. Do we want a heuristic again to decide on the "smaller canvas size"?
I also noticed that with datashader the radius of the points, for example, is relative to the axes, which is not the case for matplotlib. So for a large extent you need extremely large point sizes to make them visible at all with datashader. That should be made consistent with matplotlib.
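To make the point-size mismatch concrete, here is a rough sketch of how a matplotlib scatter size could be translated into a radius in screen pixels, which is independent of the data extent. The function name `scatter_size_to_pixel_radius` is hypothetical and not part of spatialdata-plot; the only facts it relies on are that matplotlib's scatter `s` is given in points squared and that one point is 1/72 inch. How the resulting pixel radius is then applied on the datashader side (for example as the `px` argument of `datashader.transfer_functions.spread`, after rounding) is left open here.

```python
import math


def scatter_size_to_pixel_radius(s: float, dpi: float = 100.0) -> float:
    """Convert a matplotlib scatter size `s` (points squared) to a pixel radius.

    Hypothetical helper for illustration only, not spatialdata-plot code.
    """
    if s < 0 or dpi <= 0:
        raise ValueError("s must be non-negative and dpi must be positive")
    diameter_pt = math.sqrt(s)          # marker diameter in points
    radius_pt = diameter_pt / 2.0       # marker radius in points
    return radius_pt * dpi / 72.0       # 1 point = 1/72 inch


# Example: s=36 is a marker 6 points wide, i.e. a 3-point radius.
radius_px = scatter_size_to_pixel_radius(36, dpi=72)
```

Because the result depends only on `s` and the figure dpi, the same value of `s` would give the same on-screen size for a small and a very large extent.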
Thanks for the explanation. I would reuse the logic of _rasterize_if_necessary() or _multiscale_to_spatial_image() to take the dpi of the figure and the fig_size into consideration, since the extent could be extremely large, but in the end we are limited by the pixels available on screen/paper for plotting.
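A minimal sketch of such a heuristic, assuming the canvas should never exceed the number of pixels the figure can actually display (`fig_size` in inches times `dpi`). The function name `capped_canvas_size` and its defaults are made up for illustration; this is not the logic of `_rasterize_if_necessary()` or `_multiscale_to_spatial_image()`, only the same idea applied to the datashader canvas.

```python
def capped_canvas_size(
    x_extent: float,
    y_extent: float,
    fig_size: tuple[float, float] = (6.4, 4.8),  # inches (assumed default)
    dpi: float = 100.0,                          # assumed default
) -> tuple[int, int]:
    """Return (plot_width, plot_height) in pixels for a datashader canvas.

    Hypothetical helper: the canvas is as large as the extent when that fits
    into the figure, otherwise it is scaled down (keeping the aspect ratio)
    to the pixels available in the figure.
    """
    if x_extent <= 0 or y_extent <= 0:
        raise ValueError("extents must be positive")
    max_width_px = fig_size[0] * dpi
    max_height_px = fig_size[1] * dpi
    # Never upscale (factor <= 1); shrink by the more restrictive axis.
    scale = min(1.0, max_width_px / x_extent, max_height_px / y_extent)
    width = max(1, round(x_extent * scale))
    height = max(1, round(y_extent * scale))
    return width, height


# A 20000 x 10000 extent is reduced to the figure's pixel budget.
width, height = capped_canvas_size(20000, 10000)
# These could then be passed on, e.g. datashader.Canvas(plot_width=width,
# plot_height=height, x_range=..., y_range=...).
```

With a cap like this the runtime would scale with the figure resolution rather than with the extent, at the cost of having to map the smaller image back onto the original coordinate range when plotting.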
Btw, off-topic comment: when plotting Visium HD data as points/circles I noticed a moiré pattern due to the presence of a small rotation in the raw data. With datashader rasterization the moiré pattern disappears, which is great! So datashader could also have this nice use case beyond improved performance.