
Where does the code use "immediate eviction" and "fetched from L2 cache"?

Open ziyuhuang123 opened this issue 2 years ago • 2 comments

Hi! I find your repo very interesting and starred it without hesitation! I have also been learning about the L2 cache recently, so I wonder where the code uses "immediate eviction" and "fetched from L2 cache". I guess it is related to discard_memory or the L2 persistence API?

Thank you!!

By the way, you mentioned that you used ncu to profile and analyze it; I am also interested in how that is done. Maybe you could publish a paper at a top conference!

ziyuhuang123 • Jan 24 '24 15:01

Hi, the L2 cache is used implicitly whenever global memory is fetched; the immediate eviction cache policy for weight loads is defined here. The key is that we want to keep activations (which we need to load many times) in the L2 cache for reuse, but don't care about caching weights, as each weight is accessed exactly once.
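To make the reasoning above concrete, here is a minimal toy model in Python. It is not the repo's code and not how real L2 hardware behaves: it assumes a tiny fully associative LRU cache, and `ToyCache`, `activation_hits`, and all sizes are hypothetical names and numbers chosen for illustration. It only shows why marking single-use weight loads as "evict first" can keep repeatedly used activations resident.

```python
from collections import OrderedDict


class ToyCache:
    """Tiny fully associative cache; the front of the OrderedDict is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key, evict_first=False):
        if key in self.lines:
            self.hits += 1
            if not evict_first:
                self.lines.move_to_end(key)  # refresh: now most recently used
            return
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # drop the next eviction candidate
        self.lines[key] = None
        if evict_first:
            # Hinted line goes to the front, so it is the next one to be dropped.
            self.lines.move_to_end(key, last=False)


def activation_hits(evict_first_weights, n_act=6, n_steps=20,
                    capacity=8, weights_per_step=4):
    """Count cache hits on activations while fresh weights stream through."""
    cache = ToyCache(capacity)
    act_hits = 0
    weight_id = 0
    for _ in range(n_steps):
        # Activations: the same small set is re-read on every step.
        for a in range(n_act):
            before = cache.hits
            cache.access(("act", a))
            act_hits += cache.hits - before
        # Weights: every load touches a new address, used exactly once.
        for _ in range(weights_per_step):
            cache.access(("w", weight_id), evict_first=evict_first_weights)
            weight_id += 1
    return act_hits
```

With these toy parameters, `activation_hits(False)` returns 0 because the streaming weights keep pushing activations out under plain LRU, while `activation_hits(True)` returns 114: every activation read after the first step hits. Real caches are set-associative and use other replacement policies, so only the direction of the effect carries over, not the numbers.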

We are considering a write-up of this work; however, I am currently very busy, so this may take quite a while.

efrantar • Jan 25 '24 20:01

I see! Is it possible to make better use of the L2 cache? I know there is an API mentioned here, but I cannot find a way to use it well. I mean, maybe some random accesses will push the useful data out of L2? What do you think? Thanks!

ziyuhuang123 • Jan 26 '24 08:01