FastSpeech2
pitch/energy corpus normalization
Hi, is the pitch/energy normalized across the whole corpus rather than per speaker? Would per-speaker normalization work better?
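
To make the distinction concrete, here is a minimal sketch of the two options as I understand them. It is not taken from the repository: the function names, speaker IDs, and pitch values are all made up for illustration, and it assumes plain z-score normalization over voiced frames only.

```python
import numpy as np

def corpus_normalize(values_by_speaker):
    """Z-score every speaker's values with one mean/std computed over the whole corpus."""
    all_values = np.concatenate(list(values_by_speaker.values()))
    mean, std = all_values.mean(), all_values.std()
    return {spk: (v - mean) / std for spk, v in values_by_speaker.items()}

def speaker_normalize(values_by_speaker):
    """Z-score each speaker's values with that speaker's own mean/std."""
    return {spk: (v - v.mean()) / v.std() for spk, v in values_by_speaker.items()}

# Hypothetical per-speaker pitch values in Hz (voiced frames only, made-up numbers).
pitch = {
    "spk_low": np.array([100.0, 110.0, 120.0]),
    "spk_high": np.array([200.0, 220.0, 240.0]),
}

corpus_norm = corpus_normalize(pitch)
speaker_norm = speaker_normalize(pitch)

for spk in pitch:
    print(spk, "corpus:", corpus_norm[spk].round(2), "speaker:", speaker_norm[spk].round(2))
```

With corpus-level statistics, the low-pitched speaker ends up entirely below zero and the high-pitched speaker entirely above it, so the normalized value still carries speaker identity. With per-speaker statistics, each speaker is centered at zero and the value only describes variation relative to that speaker's own range.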