Rafi Ayub
## Context The config system has gone through a couple of iterations. The earliest had a recipe params abstraction, which was a convenient place to collect, document, and validate all...
We strongly encourage users to use publicly exposed imports, i.e., `from torchtune.datasets import alpaca_dataset` instead of `from torchtune.datasets._alpaca import alpaca_dataset`. However, for developers it is not clear which one to...
We should introduce a debug mode at the CLI level that automatically runs a config on CPU, without distributed training, for just a small number of steps/epochs. This is really...
For handling directory paths, we use a mix of `os` and `pathlib` across the codebase. `os` leads to some clunky code; an example is: ``` ROOT_DIR: str =...
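To make the contrast concrete, here is a minimal sketch that resolves the same directory with `os.path` and with `pathlib`. The paths (`repo/pkg/module.py`, `configs`) are hypothetical placeholders chosen for illustration, not paths from the project's codebase.

```python
import os
from pathlib import Path

# Hypothetical layout: a file two directories below a project root.
this_file = os.path.join("repo", "pkg", "module.py")

# os.path: nested calls to walk up two levels, then join a child directory.
root_os = os.path.dirname(os.path.dirname(this_file))
configs_os = os.path.join(root_os, "configs")

# pathlib: the same steps with attribute access and the `/` operator.
root_pl = Path(this_file).parent.parent
configs_pl = root_pl / "configs"

# Both approaches point at the same location.
assert Path(configs_os) == configs_pl
```

The `pathlib` version reads left to right and returns `Path` objects, which most standard-library functions that take a filesystem path also accept.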
After #406 is merged and removes params dataclasses, default values need to be relocated to library components. Comb through all library components and introduce default values where it makes sense,...
Once the story for config testing/validation becomes clearer (see #373), we should explain it in the configs deep dive tutorial, but only if there's anything a user needs to...
Using `/tmp` to store temporary outputs such as model checkpoints, tokenizers, logs, etc., is not good practice because `/tmp` is often cleared and is shared across users in a remote environment...
## Context A tutorial on how to set up chat data with Llama3, with a discussion of how prompt templates work, especially with the new special tokens. #### Changelog - ... #### Test...
#### Context What is the purpose of this PR? Is it to - [x] add a new feature - [ ] fix a bug - [x] update tests and/or documentation...
### The Problem Packing multiple samples within a single context window means the model may accidentally attend to other samples it should not attend to. If there are sequences that...
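One common way to address the problem described above is a block-diagonal causal mask, in which each position may attend only to earlier positions from its own sample. The sketch below is a pure-Python illustration of that idea under stated assumptions: the function name `packed_causal_mask` and the list-of-lengths input are hypothetical and are not taken from the PR itself.

```python
from typing import List


def packed_causal_mask(seq_lens: List[int]) -> List[List[bool]]:
    """Build a block-diagonal causal mask for samples packed into one window.

    seq_lens holds the length of each packed sample, in order (assumed input).
    mask[i][j] is True when position i is allowed to attend to position j.
    """
    # Label every position with the index of the sample it belongs to.
    sample_ids = [idx for idx, n in enumerate(seq_lens) for _ in range(n)]
    total = len(sample_ids)
    # Position i may attend to j only if both are in the same sample
    # and j is not in the future.
    return [
        [sample_ids[i] == sample_ids[j] and j <= i for j in range(total)]
        for i in range(total)
    ]


# Two samples of lengths 2 and 3 packed into a window of 5 positions.
mask = packed_causal_mask([2, 3])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# ..x..
# ..xx.
# ..xxx
```

In a tensor library the same boolean pattern would typically be built once per packed window and passed to the attention call as its mask.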