Huamin Chen
Add an OpenPerfEvent API so that tables can be configured to capture perf event counters in BPF programs
`open_perf_event` is widely used in the bcc Python library; this PR brings it to the Go bindings.
**Describe the bug** After running for extended hours, the curr_cpu_time becomes zero and never recovers

```json
{
  "__name__": "node_energy_stat",
  "container": "kepler-exporter",
  "cpu_architecture": "Cascade Lake",
  "curr_cache_misses": "10019190",
  "curr_cpu_cycles": "4005839184",
  "curr_cpu_instructions": "5068047632",
  ...
```
Unit and integration tests
Currently, the power meter is read from hwmon. More hwmon data is needed, including CPU temperature, cooling, and other power draw.
Currently, the PID/CgroupID mapping is done through /sys. This works for cgroup v2 but not for cgroup v1. On cgroup v1 systems, /proc must be accessed to convert a PID/CgroupID to a Pod name.
There are multiple hard-coded constants. A startup configuration file is needed.
This task is to provide a developer guideline with:
1. Code of Conduct
2. Development environment
3. Architecture overview
There are two ways to associate a process with its container ID: - by reading the process's /proc/pid/cgroup and finding the container ID. This requires exposing the host's /proc filesystem...
**Is your feature request related to a problem? Please describe.** The current GPU power estimate is based on memory footprint. This model doesn't account for factors such as GPU instructions, temperature,...
**Is your feature request related to a problem? Please describe.** Pod-level energy reporting is useful on a per-node basis. At the cluster level, high-level aggregation is more...