Retrieval-based-Voice-Conversion-WebUI
How hard would it be to use human-flagged audio files to train the D and G models?
I want to include this in my process: pass the "good" files as the standard input directory ("in dir"), but also supply a "bad dir". These are TTS generations that humans have listened to and flagged as approved or rejected. It seems like this should boost both the discriminator and the generator.
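To sketch what I have in mind (hedged: this is a generic least-squares-GAN discriminator loss in PyTorch, not RVC's actual training code, and `rejected_weight` is a knob I made up, not an existing RVC option), the idea is to treat the human-rejected audio as extra negative examples for the discriminator alongside the generator's own outputs:

```python
import torch


def d_loss_with_flagged(d_real, d_fake, d_rejected, rejected_weight=0.5):
    """LSGAN-style discriminator loss extended with human-flagged negatives.

    d_real:     discriminator scores on human-approved ("good dir") audio
    d_fake:     scores on the current generator's outputs
    d_rejected: scores on human-rejected ("bad dir") TTS audio
    rejected_weight: hypothetical weight controlling how strongly the
        flagged negatives influence D (an assumption for illustration)
    """
    loss_real = torch.mean((d_real - 1.0) ** 2)  # pull approved audio toward 1
    loss_fake = torch.mean(d_fake ** 2)          # push generator output toward 0
    loss_rej = torch.mean(d_rejected ** 2)       # push flagged audio toward 0
    return loss_real + loss_fake + rejected_weight * loss_rej
```

The generator's loss would stay unchanged; only the discriminator sees the "bad dir" batch each step, so D learns what humans rejected and (indirectly) pushes G away from it.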
Any pointers are appreciated.