cod3r0k
# ❗ Issue: Help Needed to Use `finetune_t3_preprocessed.py` Step by Step

Hi, I'm trying to fine-tune a model using your codebase and would appreciate step-by-step guidance. Here's what I'm doing...
I prepared my data and ran `preprocessed_data.py`, but it takes a long time; I think it gets stuck! @havok2-htwo I have 120 CPU cores and a V100. ``` 2025-07-05 00:23:37,643 -...
I am looking for a step-by-step guide to do that. I don't see any comments about tokenization tricks, configurations, or detailed guidelines.
Great, @nshmyrev! I'm looking forward to your exciting updates about your newest model. Thanks!
Any update? Does it officially work well? @C00reNUT
Do we have any update? @ShaanveerS
Did you test it? > https://github.com/alisson-anjos/chatterbox-finetune
Dear @stlohrey, Could you please share some insights on how to prepare data and structure it appropriately? Additionally, we'd appreciate it if you could explain how to prepare a tokenizer...
What concerns should I be aware of with a UTF-8-based language such as Arabic, and with its data preparation? @stlohrey
Great, can you help me more? What is CT2? @MahmoudAshraf97