FastChat
FastChat copied to clipboard
Add preprocess exception handler for trainer.
Add preprocess exception handling for 3 type of errors:
- size of source is 0
-
chatgptorbingis not in roles. - The order of
humanandassistantis incorrect.
Thanks for the contribution. Instead of ignoring these warnings, can we use functions like this to clean up the dataset before training? This ensures the quality of the training data.
https://github.com/lm-sys/FastChat/blob/73ea04dec7832de68783a68a424d374c85e3a29d/fastchat/data/split_long_conversation.py#L78-L94
@merrymercy OK, your guide is more useful than this patch. I'll close this issue.