rohansonecha
Change-Id: I38dd936e4ad710e05e95d78f05b5e07585cb1660 ### What changes were proposed in this pull request? This pull request adds a new proto3 file based on the existing hive_metastore.thrift file. The proto3...
This PR adds support for remote k8s cluster GPU metrics, building on the recently merged in-cluster metrics. There is a new `/gpu-metrics` endpoint in the API server which a...
This PR installs and starts the node and DCGM exporters on all nodes provisioned on the major clouds (AWS, GCP, and Azure). This is a prerequisite for collecting node-level metrics via Prometheus....
Tested (run the relevant ones): - [x] Code formatting: install pre-commit (auto-check on commit) or `bash format.sh` - [x] Any manual or new tests for this PR (please specify below)...