Add Tegra Observer to control clocks on Jetson devices
I've created an experimental TegraObserver that can monitor and control the GPU core clock on a Tegra device, similar to how NVML can do this for regular NVIDIA GPUs. I thought it might be a nice addition to Kernel Tuner, hence this PR. Comments and discussion are welcome, of course. While editing core.py I noticed two duplicate imports and removed those as well.
The TegraObserver requires elevated permissions to write to the files that control the clocks, similar to how the NVMLObserver needs them when used with nvidia-smi.
I've tested it on a Jetson Nano and verified that the code to find the correct control files works on an Orin as well.
One issue I've noticed is that when kernel_tuner is interrupted with Ctrl+C, the __del__ method of TegraObserver does not seem to run, so the clocks are not set back to their initial values. I'm not sure how to fix that.
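One possible direction for the Ctrl+C problem, offered only as a sketch: Python does not guarantee that __del__ runs at interpreter exit, but handlers registered with atexit do run on normal shutdown, including after an unhandled KeyboardInterrupt. The ClockRestorer class, its read_clock/write_clock callables, and the frequency values below are all hypothetical and not part of this PR; an in-memory dict stands in for the clock control files.

```python
import atexit


class ClockRestorer:
    """Hypothetical helper, not part of the TegraObserver in this PR."""

    def __init__(self, read_clock, write_clock):
        self._write_clock = write_clock
        self._initial = read_clock()  # remember the clock at start-up
        self._restored = False
        # atexit handlers run on normal interpreter shutdown, including
        # after an unhandled KeyboardInterrupt (Ctrl+C).
        atexit.register(self.restore)

    def restore(self):
        # Idempotent: an explicit call followed by the atexit call is harmless.
        if not self._restored:
            self._write_clock(self._initial)
            self._restored = True


# Demo with an in-memory stand-in for the control file (arbitrary values).
state = {"clock": 900}
restorer = ClockRestorer(lambda: state["clock"],
                         lambda freq: state.update(clock=freq))
state["clock"] = 100   # tuning changes the clock
restorer.restore()     # would also run automatically at exit
```

Note that atexit handlers do not run if the process is killed by a signal Python does not handle, or exits through os._exit, so this would cover Ctrl+C but not every way of stopping the process.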
Edit: I wasn't sure how to add tests for this new observer. Perhaps just mocking the methods that read/write the clock frequencies?
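As a rough illustration of the mocking idea, the sketch below patches the read and write methods with unittest.mock so that no control file is touched. ClockObserverSketch and its method names (_read_clock, _write_clock, set_clock) are assumptions made for this example and are not the names used in the PR.

```python
from unittest import mock


class ClockObserverSketch:
    """Hypothetical stand-in; method names are assumptions, not the PR's API."""

    def __init__(self, path):
        self.path = path  # hypothetical control file path

    def _read_clock(self):
        with open(self.path) as f:
            return int(f.read().strip())

    def _write_clock(self, freq):
        with open(self.path, "w") as f:
            f.write(str(freq))

    def set_clock(self, freq):
        # Write the requested frequency, then read back what was applied.
        self._write_clock(freq)
        return self._read_clock()


def check_set_clock_with_mocks():
    obs = ClockObserverSketch("/nonexistent/control/file")
    # Replace both file-touching methods, so no real file is needed.
    with mock.patch.object(obs, "_read_clock", return_value=1000) as rd, \
         mock.patch.object(obs, "_write_clock") as wr:
        result = obs.set_clock(1000)
    wr.assert_called_once_with(1000)
    rd.assert_called_once_with()
    return result
```

A test like this only checks the observer's own logic; whether the control files are located correctly would still need to be verified on real hardware.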
I see that PR #242 changes how the clocks are set in NVML. Once that one is merged, I'll look at updating this PR to do it the same way for Tegra.
The NVMLObserver is covered by these two files:
- https://github.com/KernelTuner/kernel_tuner/blob/4dbcb66fad3561f7ee24d270b60d597e68fc4cdf/test/test_observers.py#L4
- https://github.com/KernelTuner/kernel_tuner/blob/4dbcb66fad3561f7ee24d270b60d597e68fc4cdf/examples/cuda/vector_add_observers.py#L9
Regarding the former, you would need something similar to @skip_if_no_pynvml. Maybe it's enough to check the Linux kernel release (uname -r), as NVIDIA appends -tegra to the kernel name? Regarding the latter, vector_add_observers could be renamed with an _nvml suffix, and a similar example could be added for this new observer.
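A minimal sketch of such a check, assuming that a "tegra" substring in the kernel release is a good enough signal. The names is_tegra_kernel and skip_if_not_tegra are made up for this example, and the release strings in the usage note are illustrative only. The standard library's unittest.skipUnless is used here to keep the snippet self-contained; in a pytest suite the same condition could be passed to pytest.mark.skipif.

```python
import platform
import unittest


def is_tegra_kernel(release=None):
    """Return True if the kernel release string contains 'tegra'."""
    if release is None:
        release = platform.release()  # on Linux, the same string as `uname -r`
    return "tegra" in release.lower()


# Hypothetical analogue of the skip_if_no_pynvml marker.
skip_if_not_tegra = unittest.skipUnless(is_tegra_kernel(),
                                        "requires a Tegra kernel")


@skip_if_not_tegra
class TestTegraOnly(unittest.TestCase):
    def test_runs_only_on_tegra(self):
        self.assertTrue(is_tegra_kernel())
```

For example, is_tegra_kernel("4.9.253-tegra") is true and is_tegra_kernel("5.15.0-generic") is false.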
Another suggestion is to add a way to check whether the device is actually a Tegra device.
So taking the NVMLObserver as an example:
nvmlobserver = NVMLObserver(["nvml_energy", "temperature"])
metrics = OrderedDict()
metrics["GFLOPS/W"] = lambda p: (size/1e9) / p["nvml_energy"]
Could become:
nvmlobserver = NVMLObserver(["nvml_energy", "temperature"])
metrics = OrderedDict()
if nvmlobserver.is_available():
    metrics["GFLOPS/W"] = lambda p: (size/1e9) / p["nvml_energy"]
This allows for more generic KernelTuner scripts.
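To illustrate how a generic script might use this, here is a small sketch. The is_available() method is the proposal from this comment and not an existing Kernel Tuner API; ObserverSketch, select_available and build_metrics are hypothetical names introduced for the example.

```python
from collections import OrderedDict


class ObserverSketch:
    """Hypothetical observer; is_available() is proposed, not an existing API."""

    def __init__(self, name, available):
        self.name = name
        self._available = available

    def is_available(self):
        return self._available


def select_available(candidates):
    # Keep only the observers that can actually run on this device.
    return [obs for obs in candidates if obs.is_available()]


def build_metrics(observer, size):
    # Register the energy metric only when the observer can provide it.
    metrics = OrderedDict()
    if observer.is_available():
        metrics["GFLOPS/W"] = lambda p: (size / 1e9) / p["nvml_energy"]
    return metrics


candidates = [ObserverSketch("nvml", False), ObserverSketch("tegra", True)]
usable = select_available(candidates)
```

With this pattern the same tuning script could list every observer it knows about and let each one report whether it applies to the device at hand.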
Hi @csbnw! That's a very good point about writing generic tuning scripts. If someone is interested in tuning on a range of devices, it could be annoying to instantiate different observers based on the device at hand. It also goes a bit against the design philosophy of Kernel Tuner, which tries to take care of 'boring' things automatically as much as possible. I'll have to think a bit about how we can achieve that, but it's a good point!
I'm going to merge this into the 'tegra_observer' branch on the main repo, to make it easier for the student working on this to make contributions.