TransformerTTS
Word timestamps
Is it possible to get the timestamp of each word in the output? Does the model adhere exactly to the target durations, or is the final output close to but not exactly equal to them?
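To make the question concrete, here is a minimal sketch of how word timestamps might be derived if the per-token durations (in mel frames) were available and the output matched them exactly. Everything in it is an assumption for illustration: the function name `word_timestamps`, the default `hop_length` and `sample_rate` values, and the use of a space as the word separator are hypothetical and not taken from the TransformerTTS code.

```python
def word_timestamps(tokens, durations, hop_length=256, sample_rate=22050, sep=" "):
    """Group per-token frame durations into (word, start_sec, end_sec) tuples.

    tokens    -- sequence of input symbols (characters or phonemes)
    durations -- number of mel frames assigned to each token (same length)
    hop_length, sample_rate -- assumed audio settings, used to convert frames to seconds
    """
    if len(tokens) != len(durations):
        raise ValueError("tokens and durations must have the same length")

    sec_per_frame = hop_length / sample_rate
    result = []
    frame = 0        # running frame index
    word = ""        # word currently being accumulated
    start = 0        # frame at which the current word began

    for tok, dur in zip(tokens, durations):
        if tok == sep:
            # A separator closes the current word; its own frames count as silence.
            if word:
                result.append((word, start * sec_per_frame, frame * sec_per_frame))
                word = ""
            frame += dur
            continue
        if not word:
            start = frame
        word += tok
        frame += dur

    # Flush the last word if the sequence does not end with a separator.
    if word:
        result.append((word, start * sec_per_frame, frame * sec_per_frame))
    return result


if __name__ == "__main__":
    # Hypothetical durations, one per character of "hi yo".
    for w, s, e in word_timestamps(list("hi yo"), [2, 3, 1, 4, 5]):
        print(f"{w}: {s:.3f}s to {e:.3f}s")
```

These timestamps would only be accurate if the synthesized audio follows the durations frame for frame, which is the second part of the question.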