kernel_tuner icon indicating copy to clipboard operation
kernel_tuner copied to clipboard

Add NCUObserver

Open csbnw opened this issue 1 year ago • 11 comments

Add NCUObserver to measure performance metrics, like you would normally do using the command-line tool ncu.

This observer is an instance of the new PrologueObserver. The PrologueObserver runs before any of the normal benchmarking. In case of NCUObserver, this is because profiling will in most cases affect performance of the kernel and therefore any other measurements by the other observers.

This new code has a dependency on nvmetrics, a small library that uses NVIDIA CUPTI and nvperf. Kerneltuner will work fine even when this library is not found, the NCUObserver will simply print a warning and measure nothing.

csbnw avatar Apr 19 '24 11:04 csbnw

I'm getting a 404 now on https://git.astron.nl/RD/recruit/nvmetrics but I was able to access the page last Friday, what happened?

benvanwerkhoven avatar Apr 22 '24 08:04 benvanwerkhoven

I'm getting a 404 now on https://git.astron.nl/RD/recruit/nvmetrics but I was able to access the page last Friday, what happened?

There is something going on at the ASTRON GitLab server, I hope that they will restore it soon. For now, the nvmetrics repository indeed seems to be gone. :scream:

csbnw avatar Apr 22 '24 09:04 csbnw

Luckily I already cloned it on Friday. I just managed to get around the permission issues related to reading performance counters on WSL, about to test the NCUObserver :-)

benvanwerkhoven avatar Apr 22 '24 09:04 benvanwerkhoven

I have pushed nvmetrics to a new repository: https://github.com/nlesc-recruit/nvmetrics

csbnw avatar Apr 22 '24 11:04 csbnw

Luckily I already cloned it on Friday. I just managed to get around the permission issues related to reading performance counters on WSL, about to test the NCUObserver :-)

You may want to update to the latest version from here. I am not sure what version you downloaded, but I introduced a bug causing no metrics were being observered at all. This is now fixed.

csbnw avatar Apr 22 '24 11:04 csbnw

I think it would be cleaner and more consistent with the current code to filter out prologue observers from the list of observers to use during the default benchmark loop, just like we do with continuous observers on this line: https://github.com/KernelTuner/kernel_tuner/blob/add-ncu-observer/kernel_tuner/core.py#L348

Could move that to the init though to do it only once, since it won't change during the run.

benvanwerkhoven avatar Apr 24 '24 08:04 benvanwerkhoven

I'll make some changes to see that this would look like

benvanwerkhoven avatar Apr 24 '24 08:04 benvanwerkhoven

We had a small issue with the CI, but that's fixed now.

benvanwerkhoven avatar Apr 25 '24 07:04 benvanwerkhoven

I'll make some changes to see that this would look like

These changes indeed make the code more consistent with the existing code, but it also makes it harder to see what new functionality was introduced, just by looking at the diff for this branch. But to be honest, what matters most to me is that the NCUObserver works. (It does, of course!) I am not sure whether you have more refactoring in mind or need anything from my side before this PR is ready to merge.

csbnw avatar Apr 26 '24 07:04 csbnw

I think I'm happy with how the code turned out now indeed. I guess the prologue observers should be mentioned in the documentation somewhere, but other than that this pull is quite ready to be merged in my opinion indeed.

benvanwerkhoven avatar Apr 26 '24 12:04 benvanwerkhoven