devito icon indicating copy to clipboard operation
devito copied to clipboard

compiler: Cache `Dependence` instances

Open enwask opened this issue 7 months ago • 4 comments

Adds a @_memoized_instances decorator for caching instances returned by __new__. This decorator is applied to Relation (and indirectly, Dependence) for a somewhat substantial speedup in build times.

Tested locally (with Python 3.10 on dewback), this yields a pretty consistent performance improvement on an old profiling script with some modifications (removed inline calls to cProfile and ran it against the script up to the first Operator construction). This gist highlights performance of a couple hotspots as well as the top consumers from cProfile; overall, the operator in that script builds roughly 15% faster in most of my runs with cached Dependence instances. I'm seeing a similar improvement in most of the examples as well.

Waiting on #2631 to be merged as there are 3.10 type hints in a couple places.

enwask avatar Jun 16 '25 14:06 enwask

Isn't this an unbounded cache?

@FabioLuporini technically unbounded yes, but entries are evicted by WeakValueDictionary when the referred object stops existing, so cache entries are only stored while that instance would remain in memory anyway (after rereading, I think this is what you were getting at with evicting entries that are None).

I considered using a bounded LRU cache of strong references but then we're hardcoding a cache limit for something that should scale with the problem size

enwask avatar Jun 17 '25 08:06 enwask

so cache entries are only stored while that instance would remain in memory anyway

ah, I see!

How much memory growth have you observed with a big Operator (ideally something high order)?

You also need to rebase the branch on latest main or we won't be able to spin CI

FabioLuporini avatar Jun 17 '25 08:06 FabioLuporini

How much memory growth have you observed with a big Operator (ideally something high order)?

On that demo script from you I've been using, memory usage was within the noise. Haven't tried on anything huge (don't really have any real-world operators to test with that take up a lot of memory in building) but at least theoretically it should be basically nothing. Will try some more contrived higher-order cases later.

You also need to rebase the branch on latest main or we won't be able to spin CI

Yeah will rebase shortly—also want to make a small change to make it obvious that memoized_constructor is private. But I'm also waiting on Python 3.9 to be dropped, as the 3.9 workflows will fail right away on some unsupported type hints.

enwask avatar Jun 17 '25 08:06 enwask

Codecov Report

Attention: Patch coverage is 98.40637% with 4 lines in your changes missing coverage. Please review.

Project coverage is 87.66%. Comparing base (bca0b10) to head (0d8a927).

Files with missing lines Patch % Lines
devito/tools/abc.py 93.75% 1 Missing and 2 partials :warning:
devito/tools/memoization.py 97.05% 0 Missing and 1 partial :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2634      +/-   ##
==========================================
- Coverage   91.30%   87.66%   -3.65%     
==========================================
  Files         245      245              
  Lines       48752    48994     +242     
  Branches     4298     4307       +9     
==========================================
- Hits        44515    42952    -1563     
- Misses       3529     5311    +1782     
- Partials      708      731      +23     
Flag Coverage Δ
pytest-gpu-aomp-amdgpuX ?

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Jun 20 '25 11:06 codecov[bot]

does it make sense to add a test that ensures we're not leaking memory and/or memory consumption is bounded ? this is mostly me being paranoid

@FabioLuporini the memory the cache takes up is bounded by the number of instances of the object that exist. There is a test which checks that cache entries are evicted when their values go out of scope, so this is guaranteed—I can try to add a test specifically for memory consumption if you'd like, but imo that's just going to be a more heuristic test (is the cache using "too much" memory? how much is too much?) of something that is already tested concretely (i.e. that the cache memory is bounded by the actual instance memory, which in turn ensures we're not leaking memory)

enwask avatar Jun 30 '25 10:06 enwask