Tapasya Patki
The `flux-resource` script currently doesn't check for missing arguments anywhere. For example, if you specify `flux resource cancel` instead of `flux resource cancel `, it doesn't return an error value...
Capturing a discussion with @dongahn: one of the missing pieces is to extend the locality-aware scheduler to also handle socket/core-level packing of resources in addition to node packing.
@DavidPoliakoff @zfrye-llnl and I are thinking of writing a Variorum connector for Kokkos (https://github.com/LLNL/variorum). Is there any documentation on how to write one (or which functions: initialize/finalize, push/pop, begin/end_parallel_for are...
This pull request introduces a vendor-neutral power monitoring plugin that reports node, CPU (socket), memory and GPU power using LLNL's Variorum library (https://variorum.readthedocs.io/).
Fixes #1146. ToDo:
- [x] test/docker: Add bookworm Dockerfile with PFA installation.
- [ ] Add `libperfflow_runtime` to fluxion CMake/build setup
- [ ] Annotate resource, qmanager, traverser functions and...
Use either Caliper or PerfFlowAspect to annotate the key Fluxion functions to gather scheduling overhead data.
Tested on Lassen with commit 50b21f on 5/2. Need to update version.
```
./variorum-print-power-example -v 0.5.0
```
Create a folder in the Variorum codebase with reference multi-node multi-GPU benchmarks. Add 3-4 benchmarks or workflows that have been annotated with Variorum for testing scalability and overhead, and also for...
In many places in Variorum, I see declarations made as `unsigned var`, without the datatype being explicitly specified. This defaults to `unsigned int`, and is functionally correct (`unsigned` is...
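To make the equivalence concrete, here is a minimal sketch (not taken from the Variorum sources) that uses C11 `_Generic` to check that a variable declared with bare `unsigned` has type `unsigned int`. The macro and function names are hypothetical and exist only for this illustration; a C11-capable compiler is assumed.

```c
#include <stddef.h>

/* Evaluates to 1 when the expression has type unsigned int, else 0.
 * Requires C11 for _Generic. Hypothetical helper, not a Variorum macro. */
#define IS_UNSIGNED_INT(x) _Generic((x), unsigned int: 1, default: 0)

/* Declares a variable with the bare specifier and reports whether the
 * compiler treats it as unsigned int. */
int bare_unsigned_is_unsigned_int(void)
{
    unsigned var = 0; /* bare specifier, as seen in the declarations described above */
    return IS_UNSIGNED_INT(var);
}

/* Declares a variable with the explicit spelling for comparison. */
int explicit_unsigned_int_is_unsigned_int(void)
{
    unsigned int var = 0; /* explicit spelling */
    return IS_UNSIGNED_INT(var);
}

/* Both spellings name the same type, so their sizes match as well. */
int spellings_have_same_size(void)
{
    return sizeof(unsigned) == sizeof(unsigned int);
}
```

All three functions return 1 on a conforming compiler, so switching to the explicit `unsigned int` spelling would be a readability change only, with no effect on behavior.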