Binh Tang

3 issues by Binh Tang

### Summary of Changes

We add an option to convert weights to a new `dtype` while resharding FSDP checkpoints. This helps reduce checkpoint sizes and avoid issues under RAM constraints...

cla signed
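The size reduction from casting weights to a narrower `dtype` can be illustrated with a minimal sketch. This is not the code from the pull request: it uses Python's standard-library `array` module as a stand-in for tensors, and the names `cast_shard`, `shard_nbytes`, and the example shard are hypothetical.

```python
from array import array


def cast_shard(shard, typecode):
    """Return a copy of `shard` with every weight array cast to `typecode`."""
    return {name: array(typecode, values) for name, values in shard.items()}


def shard_nbytes(shard):
    """Total number of bytes used by the raw weight data in `shard`."""
    return sum(a.itemsize * len(a) for a in shard.values())


# Hypothetical shard: two weight arrays stored as 64-bit floats ("d").
shard_fp64 = {
    "layer0.weight": array("d", [0.5, -1.25, 2.0, 0.0]),
    "layer0.bias": array("d", [0.125, -0.5]),
}

# Cast to 32-bit floats ("f"); each element now takes half the space.
shard_fp32 = cast_shard(shard_fp64, "f")

print(shard_nbytes(shard_fp64), shard_nbytes(shard_fp32))
```

In this sketch the element count is unchanged, so the stored size scales directly with the element width. A narrower type also loses precision, which is the trade-off any such conversion option accepts.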

### Summary of Changes

The existing script for resharding model-parallel parts (i.e., `metaseq/scripts/reshard_model_parallel.py`) loads all checkpoint parts at once, which can cause OOM issues under RAM constraints, especially...

cla signed
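One general way to avoid holding every checkpoint part in memory is to stream the parts one at a time. The sketch below shows only that pattern and is an assumption about the approach, not the resharding logic of the pull request. The helper names, the pickle file format, and the tiny demo parts are all hypothetical.

```python
import os
import pickle
import tempfile


def iter_parts(paths):
    """Yield one deserialized checkpoint part at a time."""
    for path in paths:
        with open(path, "rb") as f:
            yield pickle.load(f)


def total_param_count(paths):
    """Aggregate over all parts while holding only one part in memory."""
    total = 0
    for part in iter_parts(paths):
        total += sum(len(values) for values in part.values())
    return total


# Demo: write two tiny hypothetical parts to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, part in enumerate([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0, 5.0]}]):
        path = os.path.join(tmp, f"part-{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(part, f)
        paths.append(path)
    n_params = total_param_count(paths)

print(n_params)
```

Because `iter_parts` is a generator, each part becomes eligible for garbage collection as soon as the loop moves on, so peak memory is bounded by the largest single part and not by the sum of all parts.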

## Summary

Add a new script to collect latency results for OPT models during generation. While this script resembles the existing one, [`metaseq/scripts/generation_benchmarks.py`](https://github.com/facebookresearch/metaseq/pull/240), it is somewhat more general in that, besides...

cla signed
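Collecting latency results usually comes down to timing repeated calls after a few warm-up runs. The following is a minimal sketch of that idea, not the script from the pull request. `measure_latency` and `fake_generate` are hypothetical names, and `fake_generate` is a CPU-only stand-in for a real generation call.

```python
import statistics
import time


def measure_latency(fn, n_warmup=2, n_runs=5):
    """Time `fn` over `n_runs` calls, after `n_warmup` untimed calls."""
    for _ in range(n_warmup):
        fn()
    latencies = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return {
        "runs": n_runs,
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
    }


def fake_generate():
    # Hypothetical stand-in for a model's generation call.
    return sum(i * i for i in range(10_000))


stats = measure_latency(fake_generate)
print(stats)
```

The warm-up calls keep one-time costs such as caching or lazy initialization out of the reported numbers. When timing work that runs asynchronously on a GPU, the device would also need to be synchronized before each clock read, which this sketch does not cover.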