
dataset size effect on fine tuning

Open shadow1999k opened this issue 1 year ago • 0 comments

Hi, thanks for sharing the implementations. I have a question about whether the size of my own credit-related dataset would affect the quality of fine-tuning the Llama model. Imagine a table-formatted dataset with many features (about 2,000) but only about 200,000 total records, each with a binary target label (0 = non-defaulter, 1 = defaulter).
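For context, since Llama consumes text rather than tables, a dataset like the one described would typically be serialized row by row into prompt/completion pairs before fine-tuning. Here is a minimal, hypothetical sketch of that step; the feature names, values, and prompt wording are all made-up illustrations, not part of this repository:

```python
# Hypothetical sketch: turn one tabular credit record into an
# instruction-style training example for LLM fine-tuning.
# Feature names and the prompt template are invented for illustration.

def row_to_example(row: dict, label: int) -> dict:
    """Serialize one record (feature dict + binary label) into text."""
    feature_text = ", ".join(f"{name} = {value}" for name, value in row.items())
    return {
        "prompt": f"Credit record: {feature_text}. Will this applicant default?",
        "completion": "defaulter" if label == 1 else "non-defaulter",
    }

example = row_to_example({"income": 52000, "num_open_accounts": 4}, label=0)
print(example["prompt"])
print(example["completion"])  # non-defaulter
```

With roughly 2,000 features per row, a serialization like this may exceed the model's context window, which is one practical limitation worth weighing when deciding whether fine-tuning fits the project.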

Are there any suggestions or limitations related to such data that I might consider when deciding whether fine-tuning might be a good option for my project?

Thanks

shadow1999k — Dec 23 '24 09:12