fastplotlib Unexpectedly high cpu/gpu use

Was playing around with your subplots example the other day to get a feel for the library (am very excited about reducing my matplotlib render times), and I ran into an unexpected issue. While the figure was active, and without any mouse or keyboard input, one of my cpu cores was fully loaded, and gpu use was around ~30%. Did a little digging, and over a ~25 second profiling run, it seems like fastplotlib/layouts/_plot_area.py:363(render) is calling the pygfx viewport render function almost 40,000 times. Is this expected/desired behavior? Based on my limited understanding of the current codebase, there is a render loop, which I assume is constantly redrawing the figure. This cpu/gpu use seems problematic to me, especially if the figure isn't being interacted with most of the time. I do not observe this behavior if I use snapshot().

My code, run in jupyterlab (using version 0.2.0):

import cProfile

import fastplotlib as fpl
import numpy as np

names = [['subplot']*8]*16
figure_grid = fpl.Figure(shape=(16,8), size=(1000,2000), names=names)
for subplot in figure_grid:
    data = np.random.rand(512, 512)
    subplot.add_image(data, name="rand-img")

pr = cProfile.Profile()
pr.enable()
figure_grid.show()
## wait ~25 seconds before running the next block
pr.disable()
pr.create_stats()
pr.print_stats(sort='tottime')

Top 10 calls by tottime

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1029298    4.403    0.000    5.023    0.000 _helpers.py:304(proxy_func)
      409    2.081    0.005    2.081    0.005 {method 'poll' of 'select.epoll' objects}
 20191744    1.979    0.000    2.226    0.000 _weakrefset.py:63(__iter__)
   526029    1.260    0.000    2.323    0.000 _api.py:101(_new_struct_p)
    39168    1.100    0.000    3.220    0.000 environment.py:402(check_inactive)
    39168    0.745    0.000    1.567    0.000 matrix.py:452(mat_orthographic)
   203812    0.734    0.000    0.734    0.000 {built-in method builtins.dir}
   201508    0.634    0.000    1.054    0.000 structs.py:18(<listcomp>)
    78489    0.628    0.000    5.812    0.000 _api.py:2202(begin_render_pass)
    39168    0.521    0.000   23.825    0.001 renderer.py:392(render)
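
For anyone who wants to reproduce a table like the one above, here is a minimal sketch that uses only the standard-library cProfile and pstats modules to print the ten most expensive entries by internal time. The busy_work function is a hypothetical stand-in for the code being profiled; it is not part of fastplotlib.

```python
import cProfile
import io
import pstats


def top_calls(profile: cProfile.Profile, limit: int = 10) -> str:
    """Return the `limit` most expensive entries by internal time as text."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile, stream=buffer)
    # "tottime" sorts by time spent inside each function, excluding callees.
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


def busy_work() -> int:
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

report = top_calls(profiler)
print(report)
```

Passing a limit to print_stats keeps the output short, which helps when the profile covers a render loop with thousands of distinct call sites.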

Jun 17 '24 18:06 n-garc

Hi, thanks for posting!

Is there a use case for 128 subplots? That's more than I anticipated. Anyway, it does seem like using subplots is slower than putting all those graphics in one subplot. For example, this is much more performant:

import fastplotlib as fpl
import numpy as np

fig = fpl.Figure()

for i in range(8):
    for j in range(16):
        fig[0, 0].add_image(np.random.rand(1000, 2000), offset=(i * (2000 + 500), j * (1000 + 500), 0))

While the figure was active, and without any mouse or keyboard input, one of my cpu cores was fully loaded, and gpu use was around ~30%.

I wonder if the cpu usage is related to this: https://github.com/pygfx/pygfx/issues/763, are you on pygfx@main? This isn't in the latest release yet.

Also, GPU usage can be misleading without the current wattage. If you're using an NVIDIA GPU, this is quite easy to measure.
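
One way to read the wattage on an NVIDIA GPU is to query nvidia-smi and compare power draw to the power limit. The sketch below assumes nvidia-smi is on the PATH and that the card reports numeric values for all three fields (some report "[N/A]", which this parser does not handle). The helper names parse_gpu_line and read_gpu_status are made up for illustration.

```python
import subprocess

# Fields understood by `nvidia-smi --query-gpu`.
QUERY_FIELDS = "power.draw,power.limit,utilization.gpu"


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line such as '45.32, 250.00, 30' into a dict."""
    power_draw, power_limit, utilization = (float(v) for v in line.split(","))
    return {
        "power_draw_w": power_draw,
        "power_limit_w": power_limit,
        "utilization_pct": utilization,
        # Fraction of the power budget in use; a low value alongside a
        # moderate utilization suggests the GPU is not being pushed hard.
        "power_fraction": power_draw / power_limit,
    }


def read_gpu_status(gpu_index: int = 0) -> dict:
    """Query nvidia-smi for one GPU (assumes nvidia-smi is installed)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            f"--query-gpu={QUERY_FIELDS}",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_line(result.stdout.strip().splitlines()[0])


# Parsing a made-up sample line, so this runs without a GPU present.
sample = parse_gpu_line("45.32, 250.00, 30")
print(sample)
```

On a machine with an NVIDIA GPU, calling read_gpu_status() while the figure is showing gives the live numbers.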

Did a little digging, and over a ~25 second profiling run, it seems like fastplotlib/layouts/_plot_area.py:363(render) is calling the pygfx viewport render function almost 40,000 times. Is this expected/desired behavior? Based on my limited understanding of the current codebase, there is a render loop, which I assume is constantly redrawing the figure. This cpu/gpu use seems problematic to me, especially if the figure isn't being interacted with most of the time.

I wonder if there's a way to not re-draw if nothing has changed, @almarklein any thoughts? Anyways the performance issue with 128 subplots (i.e. viewports) is probably something else, I would wait for https://github.com/pygfx/pygfx/issues/492 which is going to change how that works (I don't think this will change the fastplotlib API, but it will give an independent renderer to each subplot).

I do not observe this behavior if I use snapshot().

Yup, because a snapshot is a static PNG.

Jun 17 '24 23:06 kushalkolar

I'm coming from matplotlib land, and I have a couple of monster figures with that many subplots (definitely niche though). If fastplotlib doesn't require multiple subplots in order to add multiple images to the same figure, then I don't necessarily see a usecase. Just so I understand it better -- what is the point of subplots in fastplotlib? Is it to be able to have different controllers on different subplots?

I am on pygfx@main. https://github.com/pygfx/pygfx/issues/763 could very well be the culprit. I am on numpy 1.24, and my profiling from this morning points to the same methods called out in the thread over there. camera.update_projection_matrix() came up in the profiling, but was just outside the top 10 I put in my report. I believe it calls mat_orthographic though, which you can see in the top 10.

For completeness, I profiled your example as well. It also uses less cpu on my machine. Shown below are results from a ~37 second profile run. Calls to render are down to only 552 from 40,000. Corrected for the length of the run, it's about a 105x reduction. This is an odd number because it's very close to, but a little too far away from the number of subplots to draw any conclusions. I may be reading it wrong, but when I looked at the code this morning it seemed like each subplot calls render independently, in addition to the top-level figure. Not sure if it's helpful, but the jupyter busy/idle indicator goes nuts when a figure is showing (flips between busy and idle a couple dozen times a second).

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2936   23.875    0.008   23.875    0.008 {method 'poll' of 'select.epoll' objects}
   592364    2.570    0.000    3.248    0.000 _helpers.py:304(proxy_func)
991392/284832    0.767    0.000    1.069    0.000 _base.py:432(iter)
   141312    0.566    0.000    3.336    0.000 pipeline.py:631(draw)
   283176    0.501    0.000    1.818    0.000 _api.py:2040(set_bind_group)
      553    0.375    0.001    0.700    0.001 _jpg.py:54(_encode)
    10425    0.363    0.000    0.363    0.000 {method 'acquire' of '_thread.lock' objects}
      552    0.325    0.001    0.325    0.001 {method 'copy' of 'numpy.ndarray' objects}
   311658    0.321    0.000    0.542    0.000 api.py:242(new)
   592364    0.253    0.000    0.350    0.000 _helpers.py:247(release)
   592364    0.243    0.000    0.328    0.000 _helpers.py:243(capture)
1437857/1436699    0.238    0.000    0.249    0.000 {built-in method builtins.isinstance}
     1104    0.236    0.000    1.992    0.002 _base.py:420(traverse)
      553    0.232    0.000    8.413    0.015 renderer.py:392(render)
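
The arithmetic behind the ~105x figure can be checked with the numbers from the two profile tables. This is only a sketch: the run lengths (~25 s and ~37 s) are approximate, so the derived rates are approximate too. One thing that falls out is that 39168 divides evenly by 128, which would be consistent with one render call per subplot per frame.

```python
N_SUBPLOTS = 128  # 16 x 8 grid from the first example


def calls_per_second(n_calls: int, seconds: float) -> float:
    """Normalize a call count by the (approximate) length of the run."""
    return n_calls / seconds


# Render call counts and approximate run lengths from the two profiles.
grid_rate = calls_per_second(39_168, 25)   # 128 subplots
single_rate = calls_per_second(552, 37)    # everything in one subplot

reduction = grid_rate / single_rate

# If each frame renders every subplot once, the frame count is calls / subplots.
grid_frames, remainder = divmod(39_168, N_SUBPLOTS)
grid_fps = grid_frames / 25
single_fps = 552 / 37

print(f"reduction in render calls per second: {reduction:.1f}x")
print(f"grid: {grid_frames} frames (remainder {remainder}), ~{grid_fps:.1f} fps")
print(f"single subplot: ~{single_fps:.1f} fps")
```

Read this way, the frame rates of the two runs are similar, and the difference in render calls comes almost entirely from the number of subplots.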

I can check the gpu wattage tomorrow.

Jun 18 '24 00:06 n-garc

I'm coming from matplotlib land, and I have a couple of monster figures with that many subplots (definitely niche though).

Yea fastplotlib works very differently from matplotlib (I maybe should've chosen a different name, too late for that 😆). You can add multiple graphics of any type to a single subplot, use offsets if necessary.

If fastplotlib doesn't require multiple subplots in order to add multiple images to the same figure, then I don't necessarily see a usecase. Just so I understand it better -- what is the point of subplots in fastplotlib? Is it to be able to have different controllers on different subplots?

Yes each subplot has its own controller and camera, and you can also add different subplot-level animation functions to each subplot.

each subplot calls render independently, in addition to the top-level figure.

There's no rendering at the figure level. Figure.render() just calls subplot.render() for every subplot, so there will be n_subplot many render calls, and each subplot.render() essentially renders the viewport representing that subplot using the camera for that subplot. I'll see if Almar has any thoughts, but this will change once pygfx refactors the viewport and renderer.
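
As a rough illustration of that structure, here is a toy sketch. SketchFigure and SketchSubplot are hypothetical stand-ins, not the fastplotlib classes, and the real subplot.render() renders a pygfx viewport rather than incrementing a counter.

```python
class SketchSubplot:
    """Hypothetical subplot that only counts how often it is rendered."""

    def __init__(self) -> None:
        self.render_calls = 0

    def render(self) -> None:
        # The real method would render this subplot's viewport with its camera.
        self.render_calls += 1


class SketchFigure:
    """Hypothetical figure that does no rendering of its own."""

    def __init__(self, shape: tuple) -> None:
        rows, cols = shape
        self.subplots = [SketchSubplot() for _ in range(rows * cols)]

    def render(self) -> None:
        # One frame: delegate to every subplot in turn.
        for subplot in self.subplots:
            subplot.render()

    def total_render_calls(self) -> int:
        return sum(subplot.render_calls for subplot in self.subplots)


figure = SketchFigure(shape=(16, 8))
for _ in range(3):  # draw three frames
    figure.render()

total = figure.total_render_calls()
print(total)  # 3 frames x 128 subplots
```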

Not sure if it's helpful, but the jupyter busy/idle indicator goes nuts when a figure is showing (flips between busy and idle a couple dozen times a second).

As far as I understand, this indicates communication with jupyterlab. It's the remote frame buffer, and it will always be active when receiving frames.

I can check the gpu wattage tomorrow.

Thanks! I also get about 30% usage, but my GPU wattage is under 50% of max, so I don't think it's really pushing the GPU. It's something else.

Jun 18 '24 00:06 kushalkolar

This could be related to https://github.com/pygfx/pygfx/issues/763, but I would not be surprised if this is simply due to there being 128 subplots, i.e. 128 render calls 😅

I wonder if there's a way to not re-draw if nothing has changed

That would be a very obvious solution. Yes, most pygfx examples have code like this:

def animate():
    renderer.render(...)
    renderer.request_draw()  # <-- this schedules a new draw

Some examples request a new draw on every draw, so the scene is rendered perpetually. This is advantageous in some use cases, like simulations (though we plan to provide different support for that). For a plotting library, it makes sense to not do this. Instead, call renderer.request_draw() when you know that a redraw is needed (and when pygfx does not already schedule a draw by itself).
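
To make the difference concrete, here is a toy sketch of the two strategies. OnDemandDrawLoop is a hypothetical class written for this illustration; it does not use pygfx and only mimics the idea that a draw happens when one has been requested.

```python
class OnDemandDrawLoop:
    """Toy event loop: draws only when a redraw has been requested."""

    def __init__(self, draw_function) -> None:
        self._draw_function = draw_function
        self._draw_requested = False
        self.frames_drawn = 0

    def request_draw(self) -> None:
        # Several requests before the next tick collapse into a single draw.
        self._draw_requested = True

    def tick(self) -> None:
        """Called once per loop iteration; draws only if something asked."""
        if not self._draw_requested:
            return
        self._draw_requested = False
        self._draw_function()
        self.frames_drawn += 1


def simulate(perpetual: bool, n_ticks: int = 100, change_at=(10, 50)) -> int:
    """Count frames drawn when the data changes only at the ticks in `change_at`."""

    def draw() -> None:
        if perpetual:
            # Perpetual style: every draw schedules the next one.
            loop.request_draw()

    loop = OnDemandDrawLoop(draw)
    loop.request_draw()  # initial draw when the figure is first shown
    for tick in range(n_ticks):
        if tick in change_at:
            loop.request_draw()  # something changed, so a redraw is needed
        loop.tick()
    return loop.frames_drawn


perpetual_frames = simulate(perpetual=True)
on_demand_frames = simulate(perpetual=False)
print(perpetual_frames, on_demand_frames)
```

With the perpetual style every tick produces a frame. With the on-demand style only the initial draw and the two changes do.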

Jun 18 '24 10:06 almarklein

call renderer.request_draw() when you know that a redraw is needed (and when pygfx does not already schedule a draw by itself).

I think we might be able to get away with this, started a new issue #586

@n-garc I also started #585. If we do it, it won't be anytime soon, maybe in a few months at the earliest. In the meantime, I would just set offsets as in my previous example above.

Aug 07 '24 07:08 kushalkolar