Emin Orhan
Please release the evaluation code as well. "Dirty code" is much better than no code. I'm finding it hard to parse the provided eval logs for ViT-L/16. Please just use...
I'm just curious what the `max_batch_size` argument does in `ModelArgs`: https://github.com/pytorch/torchtitan/blob/d2a4904f58accc683c17c66a360026cb3c8109af/torchtitan/models/llama/model.py#L32 A quick search suggests that it doesn't actually seem to be used anywhere else in the code base, so...
I'm trying to test this library on an HPC cluster with AMD MI250X GPUs, but I'm getting a weird, seemingly Triton-related error specifically when I...
I recently upgraded to a nightly PyTorch build (`2.7.0.dev20250221+rocm6.3`) and noticed that the DCP checkpoints saved with my prior build could no longer be loaded successfully. The issue seems to...