CALM
Dataset size effect on fine-tuning
Hi, thanks for sharing the implementations. I have a question about whether the size of my own credit-related dataset would affect the quality of fine-tuning the Llama model. It is a tabular dataset with many features (about 2,000) and roughly 200,000 records in total, each with a binary target label (0 = non-defaulter, 1 = defaulter).
Are there any suggestions or limitations I should consider with data like this when deciding whether fine-tuning is a good option for my project?
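
For concreteness, here is a minimal sketch of how I currently imagine serializing one tabular record into an instruction-style prompt/completion pair for fine-tuning. The column names and values are hypothetical placeholders, not my real features:

```python
# Sketch: turn one tabular row (feature dict + binary label) into a text pair
# suitable for instruction-style fine-tuning. Column names and values are
# hypothetical placeholders for illustration only.

def record_to_prompt(record: dict, label: int) -> dict:
    """Serialize one row and its binary label into a prompt/completion pair."""
    feature_text = ", ".join(f"{name} = {value}" for name, value in record.items())
    return {
        "prompt": (
            "Given the following credit application features, predict whether "
            "the applicant will default (0 = non-defaulter, 1 = defaulter).\n"
            f"Features: {feature_text}\n"
            "Answer:"
        ),
        "completion": str(label),
    }

# Hypothetical example record; my real data has about 2,000 such features.
example = record_to_prompt(
    {"age": 42, "monthly_income": 5200, "open_credit_lines": 3},
    label=0,
)
print(example["prompt"])
print(example["completion"])
```

In particular, I am unsure whether serializing ~2,000 features per record like this stays within a reasonable context length, and whether 200,000 samples is enough (or perhaps more than needed) for this kind of fine-tuning.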
Thanks