Stephen Lien Harrell

Results 24 comments of Stephen Lien Harrell

really a data reduction problem only, sampling size

While the feature works, the way the website works makes xalt time out the entire machine page. This will need to be retested once https://github.com/TACC/tacc_stats/issues/54 is complete.

Ok, do this for all cfg imports

For CPU: need core affinity matched to job ID. For memory: need to find all memory usage from the primary job starter programmatically. Find the job starter, then get all child process memory:...

Regarding the approach above, we need to make sure we can capture detached processes.
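
A minimal sketch of the "find job starter, then sum child process memory" idea, not the tacc_stats implementation. It assumes a process table has already been collected as a dict mapping each PID to its parent PID and resident memory in kB (for example from /proc); the function name `descendant_rss` and the table layout are hypothetical. It also shows why detached processes are a concern: a process that has been reparented to PID 1 is no longer reachable by walking down from the job starter.

```python
from collections import defaultdict


def descendant_rss(procs, root_pid):
    """Sum resident memory (kB) of root_pid and every process below it.

    procs: dict mapping pid -> (ppid, rss_kb). This layout is an
    assumption made for the sketch, not a tacc_stats data structure.
    Returns (total_rss_kb, set_of_pids_counted).
    """
    # Build a parent -> children index once so the walk is linear.
    children = defaultdict(list)
    for pid, (ppid, _rss) in procs.items():
        children[ppid].append(pid)

    total = 0
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in procs:
            continue
        seen.add(pid)
        total += procs[pid][1]
        stack.extend(children[pid])
    return total, seen


# Hypothetical snapshot: 100 is the job starter, 101 and 102 are its
# descendants, and 200 was started by the job but detached, so its
# parent is now PID 1.
snapshot = {
    100: (1, 500),
    101: (100, 2000),
    102: (101, 300),
    200: (1, 9999),
}

total_kb, counted = descendant_rss(snapshot, 100)
print(total_kb, sorted(counted))
```

In this snapshot PID 200 is left out of the total, so a parent/child walk alone would undercount. Grouping by something that survives reparenting, such as a session ID or cgroup, is one possible alternative, though whether that fits this collector is untested here.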

Check whether we can get Redfish working to get a more complete picture.

Branch for this issue: https://github.com/TACC/tacc_stats/tree/dcgm_support

Using this document to see what metrics Cazes wants: https://github.com/NVIDIA/dcgm-exporter/blob/main/etc/dcp-metrics-included.csv (From Cazes:) From the PCI section, I’d like to keep track of bytes moved over the PCI bus.  We probably...
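
A rough sketch of how the PCI-related fields could be picked out of a metrics list in that CSV layout (one `field, type, help` row per line, with `#` lines as comments). The sample rows below are illustrative and not copied from the linked file, and `select_fields` is a hypothetical helper, so the field names should be checked against the real CSV before relying on them.

```python
import csv
import io

# Illustrative rows only; verify names against the real metrics CSV.
SAMPLE = """\
# Utilization
DCGM_FI_DEV_GPU_UTIL, gauge, GPU utilization (in %).

# PCIE
DCGM_FI_PROF_PCIE_TX_BYTES, counter, Bytes transmitted over PCIe.
DCGM_FI_PROF_PCIE_RX_BYTES, counter, Bytes received over PCIe.
"""


def select_fields(text, keyword):
    """Return field names (first CSV column) that contain keyword.

    Blank lines and lines starting with '#' are skipped.
    """
    selected = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row or row[0].lstrip().startswith("#"):
            continue
        field = row[0].strip()
        if keyword in field:
            selected.append(field)
    return selected


pci_fields = select_fields(SAMPLE, "PCIE")
print(pci_fields)
```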

Create new pre-made graphs and have a page to zoom in.