Garima Jain
I recently joined the MLCommons group and have been looking into UNet3D's performance. It was mentioned during our discussions that UNet3D does not fully utilize bandwidth, potentially due to being...
When running `mlpstorage training run` with `--exec-type=docker` and `--num-accelerators=6`, only **one process** is executed, even though multiple accelerators are specified:

```
[OUTPUT] Running DLIO [Training & Checkpointing] with 1 process(es)...
```
I attempted to enable DFTracer profiling and hydra logging in the mlpstorage training workflow, but I was unable to locate the expected output trace/log files. Steps Taken: **Installed DFTracer:** `pip...