Binh Tang
### Summary of Changes

We add an option to convert weights into a new `dtype` while resharding FSDP checkpoints. This helps reduce checkpoint sizes and avoids issues under RAM constraints...
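
As a rough illustration of the `dtype` conversion described above, the following is a minimal sketch, not the code from this PR. It assumes a checkpoint's weights are available as a plain dictionary of PyTorch tensors; the helper name `cast_state_dict` is hypothetical.

```python
import torch


def cast_state_dict(state_dict, dtype=torch.float16):
    """Return a new dict with floating-point tensors cast to `dtype`.

    Hypothetical helper for illustration only. Integer tensors and
    non-tensor entries are passed through unchanged.
    """
    converted = {}
    for name, value in state_dict.items():
        if torch.is_tensor(value) and value.is_floating_point():
            converted[name] = value.to(dtype)
        else:
            converted[name] = value
    return converted
```

Casting from `torch.float32` to `torch.float16` halves the storage needed per element, which is where the reduction in checkpoint size would come from under these assumptions.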
### Summary of Changes

The existing script for resharding model parallel parts (i.e. `metaseq/scripts/reshard_model_parallel.py`) loads all checkpoint parts at once, which can cause out-of-memory (OOM) errors under RAM constraints, especially...
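
The summary above is truncated and does not say how the memory problem is addressed. One common way to avoid holding every part in RAM is to stream the parts one at a time. The sketch below shows that general pattern only, as an assumption: it uses `pickle` files as stand-ins for checkpoint parts, and the names `iter_parts` and `process_sequentially` are hypothetical.

```python
import pickle


def iter_parts(paths):
    """Yield checkpoint parts one at a time instead of loading them all."""
    for path in paths:
        with open(path, "rb") as f:
            yield pickle.load(f)


def process_sequentially(paths, handle):
    """Apply `handle` to each part in turn and return the number processed.

    Only one part is referenced by this function at any moment, so peak
    memory is bounded by the largest part plus whatever `handle` keeps.
    """
    count = 0
    for part in iter_parts(paths):
        handle(part)
        count += 1
    return count
```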
## Summary

Add a new script to collect latency results for OPT models during generation. While this script resembles the existing one, [`metaseq/scripts/generation_benchmarks.py`](https://github.com/facebookresearch/metaseq/pull/240), it is somewhat more general: besides...
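
For readers unfamiliar with latency collection, here is a minimal sketch of the general idea, not the script added in this PR. It times an arbitrary callable with the standard library; `measure_latency` and its parameters are hypothetical, and a real generation benchmark would pass a function that runs the model.

```python
import statistics
import time


def measure_latency(fn, warmup=1, runs=5):
    """Time `fn` over several runs and summarize the results in seconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
    }
```

Note that GPU work is typically asynchronous, so timing model generation this way would also need a device synchronization inside `fn`; that step is omitted here.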