torchtune
PyTorch native post-training library
With all the growing activity and focus on multimodal models, is this library restricted to tuning text-only LLMs? Do we plan to have vision or, more generally, multimodal...
MPS support
#### Context - For testing purposes, it can be useful to run directly on a local Mac. #### Changelog - Checks support for BF16 on the MPS device. - Added...
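A minimal sketch of what such a check could look like (the helper name and error handling here are assumptions, not necessarily what the PR implements):

```python
import torch

def check_mps_bf16(device: torch.device) -> None:
    # Hypothetical helper: there is no torch.backends query for bf16 on MPS,
    # so the simplest check is attempting a tiny bf16 allocation on the device.
    if device.type != "mps":
        return
    if not torch.backends.mps.is_available():
        raise RuntimeError("MPS was requested but is not available on this machine.")
    try:
        torch.zeros(1, dtype=torch.bfloat16, device=device)
    except RuntimeError as e:
        raise RuntimeError("bf16 is not supported on this MPS device.") from e
```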
#### Context - Create a LoRA fine-tune for the Gemma model. #### Changelog - ... #### Test plan - .... It can work with `apply_lora_to_mlp = True, apply_lora_to_output = False`, but not...
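For illustration, a hedged sketch of how the combination reported to work might be exercised; the builder name and defaults are assumed to mirror torchtune's other LoRA builders rather than taken from this PR:

```python
# Assumed builder name and signature, modeled on torchtune's existing LoRA builders
# (e.g. lora_llama2_7b); check the actual Gemma module for the real API.
from torchtune.models.gemma import lora_gemma_2b

model = lora_gemma_2b(
    lora_attn_modules=["q_proj", "v_proj"],
    apply_lora_to_mlp=True,      # combination reported to work in the test plan
    apply_lora_to_output=False,
    lora_rank=8,
    lora_alpha=16,
)
```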
The idea is: - Show a progress bar with the actual total count - Log and report the same steps on the progress bar - Count a training step...
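A rough sketch of the intended behavior, assuming tqdm and generic recipe names (none of these identifiers come from the actual recipe):

```python
from tqdm import tqdm

def train_epoch(model, dataloader, optimizer, gradient_accumulation_steps: int = 1):
    # The bar's total is the number of optimizer steps, not raw batches,
    # so the displayed count matches what is logged.
    steps_per_epoch = len(dataloader) // gradient_accumulation_steps
    pbar = tqdm(total=steps_per_epoch, desc="training")
    for idx, batch in enumerate(dataloader):
        loss = model(batch)
        (loss / gradient_accumulation_steps).backward()
        if (idx + 1) % gradient_accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            pbar.update(1)  # one tick per optimizer step, same step count as the logger
            pbar.set_description(f"loss: {loss.item():.4f}")
    pbar.close()
```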
#### Context - As per title #### Changelog - Builder function + config #### Test plan - Trained for one epoch with the following loss - Training speed ...
#### Context This PR updates activation checkpointing (AC) to support selective layer and selective op activation checkpointing. It preserves the previously available options of full or None. This is controlled...
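As a hedged illustration of the "selective layer" flavor (the `model.layers` attribute and the wrapper used here are assumptions, not the PR's actual mechanism):

```python
import torch.nn as nn
from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    checkpoint_wrapper,
)

def apply_selective_layer_ac(model: nn.Module, ac_every_n_layers: int = 2) -> None:
    # Wrap only every Nth transformer block, rather than all of them ("full")
    # or none ("None"). Assumes the blocks live in a ModuleList at model.layers.
    for idx in range(len(model.layers)):
        if idx % ac_every_n_layers == 0:
            model.layers[idx] = checkpoint_wrapper(model.layers[idx])
```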
Get the torchtune version during the build and add it so that it appears in the resulting HTML dropdown
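One plausible way to do this in a Sphinx `conf.py` (a sketch only; the hook the docs build actually uses may differ):

```python
# Sketch for docs/source/conf.py: read the installed package version at build time
# so the theme's version dropdown can display it.
from importlib.metadata import version as pkg_version

release = pkg_version("torchtune")          # full version string, e.g. "0.1.1"
version = ".".join(release.split(".")[:2])  # short "X.Y" form for the dropdown
html_context = {"version": version}         # picked up by themes that render a version switcher
```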
**This has not been extensively tested (only Mistral 7B) and is more of a proposal!** This change does the following: - Create the model on the meta device - Load the...
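A minimal sketch of the meta-device flow being proposed, assuming a generic model builder and an already-loaded state dict:

```python
import torch
from torch import nn

def build_on_meta_and_load(model_builder, state_dict: dict) -> nn.Module:
    # Construct the model on the meta device so no real parameter memory is
    # allocated, then materialize the weights directly from the checkpoint.
    with torch.device("meta"):
        model = model_builder()
    model.load_state_dict(state_dict, assign=True)  # assign=True replaces the meta tensors
    return model
```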
The API enforces that the wrapping policy be just a set of modules, which is sufficient for a few use cases, but the underlying API offers more generality in terms...
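For context, a hedged sketch of the two levels of generality in PyTorch's FSDP wrapping API: a plain set of module types versus an arbitrary callable policy (the module type used here is only illustrative):

```python
import functools
import torch.nn as nn
from torch.distributed.fsdp.wrap import ModuleWrapPolicy, lambda_auto_wrap_policy

# What a set-of-modules policy expresses: wrap purely by module type.
set_policy = ModuleWrapPolicy({nn.TransformerEncoderLayer})

# What the underlying API also allows: any predicate over modules,
# e.g. decisions based on names, parameter counts, or trainability.
def _should_wrap(module: nn.Module) -> bool:
    return isinstance(module, nn.TransformerEncoderLayer)

callable_policy = functools.partial(lambda_auto_wrap_policy, lambda_fn=_should_wrap)
```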
## Context On a single device, our current Llama7B full fine-tune recipe either OOMs with the ```AdamW``` optimizer or takes > 55GB with ```SGD```. Given the importance of single device...
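One hedged sketch of the kind of optimizer swap that can close that gap on a single device (bitsandbytes' paged 8-bit AdamW; illustrative only, not necessarily the approach this recipe ends up taking):

```python
import bitsandbytes as bnb
import torch.nn as nn

def build_low_memory_optimizer(model: nn.Module, lr: float = 2e-5):
    # Paged 8-bit AdamW keeps optimizer state far smaller than full-precision AdamW,
    # which is a dominant memory cost in a single-device full fine-tune.
    return bnb.optim.PagedAdamW8bit(model.parameters(), lr=lr)
```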