Rafi Ayub
Hi @walidbet18, you can load your local text file dataset by specifying `source="text"` and `data_files="fichier.txt"` in any of our dataset classes or builders. ``` # if you're using chat data...
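For reference, `source` and `data_files` map onto Hugging Face `datasets.load_dataset` arguments, so a quick way to sanity-check that your file loads outside of torchtune looks roughly like this (a minimal sketch, assuming `fichier.txt` is a local plain-text file with one sample per line):
```python
from datasets import load_dataset

# "text" is the generic Hugging Face text builder: each line of the file
# becomes one example under a "text" column.
ds = load_dataset("text", data_files="fichier.txt", split="train")
print(ds[0]["text"])  # first line of fichier.txt
```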
Ah yes, you need to specify it in the config; sorry I wasn't clear earlier. Make sure you set up your dataset component like so: ``` dataset: _component_: torchtune.datasets.instruct_dataset source: text...
If you're using unstructured text, then you might need to use a different dataset class. I am planning to open a PR soon to add this to enable fine-tuning /...
Hey @BedirT, are you installing torchtune via git clone, and have you recently pulled from main? I was able to run the 70B_lora config with the following commands: ``` tune download...
Why not create a dedicated cache folder and set it as an environment variable on install?
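To sketch what I mean (a rough example with a hypothetical `TORCHTUNE_CACHE_DIR` variable, not something torchtune does today):
```python
import os
from pathlib import Path

# Hypothetical env var; fall back to a folder under the user's home directory.
default_cache = Path.home() / ".cache" / "torchtune"
cache_dir = Path(os.environ.get("TORCHTUNE_CACHE_DIR", default_cache))
cache_dir.mkdir(parents=True, exist_ok=True)

# Downloads / checkpoints would then be written under cache_dir.
print(f"Using cache directory: {cache_dir}")
```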
Do you also mind sharing what your peak allocated memory was for both the LoRA and QLoRA runs? And to be extra sure, do you mind making sure the checkpoints...
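In case it's useful, peak memory can be read straight from PyTorch after the run; a small sketch assuming a single CUDA device:
```python
import torch

# Reset the counter before the step(s) you want to measure.
torch.cuda.reset_peak_memory_stats()

# ... run the fine-tuning step(s) here ...

# Report peak allocated memory in GiB.
peak_gib = torch.cuda.max_memory_allocated() / (1024 ** 3)
print(f"Peak memory allocated: {peak_gib:.2f} GiB")
```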
Confirmed that this works as expected with optimizer_in_bwd=True and gradient accumulation off, and with optimizer_in_bwd=False and gradient accumulation on:
`tune run full_finetune_single_device --config llama2/7B_full_low_memory optimizer_in_bwd=False gradient_accumulation_steps=5 max_steps_per_epoch=10`
`tune run full_finetune_single_device --config...
Closing this as it was addressed in #831.
Hi, thanks for the request. Multimodal support is in our medium-term plan. In the short term, up until release, we are focused on supporting text-based LLM fine-tuning.
Thanks for raising this. No plans at the moment; we will reassess this after release and see if it's a strong community need.