Xuesong Yang
I am using this backend for G2P-like conversion, but I do not know where to find a complete list of the phone set for each language. Could anyone...
Signed-off-by: Xuesong Yang # What does this PR do ? Add a one line overview of what this PR aims to accomplish. **Collection**: [Note which collection this PR will affect]...
The link (https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-python-basics-and-customization-with-ssml.html) shared in https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-basics-customize-ssml.html#note is not valid anymore.
[zh] WARNING:NeMo-text-processing:Failed text: 免除GOOGLE在一桩诽谤官司中的法律责任。Key: integer_part Value: None
Received a warning message when normalizing text. Could you please explain what the message indicates? **Reproducible code**: ```python from nemo_text_processing.text_normalization.normalize import Normalizer text_normalizer = Normalizer(lang="zh", input_case="cased", overwrite_cache=True, cache_dir=str("cache_dir")) text_normalizer_call_kwargs = {"punct_pre_process":...
Previously reported to the Phonemizer repo: https://github.com/bootphon/phonemizer/issues/142, where I was advised to redirect it here. `/1/` appears in the phonetic transcription for German. Is this expected? ``` $ echo "aneinander"...
* TensorBoard and wandb save hparam config files with slightly different YAML structures. This fix addresses blockers when feeding in `*.ckpt` and `*/wandb/latest-run/files/config.yaml`.
* Simplified loading ASR models by...
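To illustrate the kind of structural difference described above, here is a minimal sketch, not the code from this PR. It assumes that a wandb `config.yaml` wraps each hyperparameter as `{desc, value}` and adds the bookkeeping keys `wandb_version` and `_wandb`, while a TensorBoard-style hparams file stores plain values. The helper name `unwrap_wandb_config` and the sample keys are hypothetical, and the inputs are already-parsed dicts so that no YAML library is needed.

```python
from typing import Any, Dict

# Bookkeeping keys assumed to appear only in wandb config files.
WANDB_META_KEYS = ("wandb_version", "_wandb")


def unwrap_wandb_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a plain hparam dict from either layout (hypothetical helper)."""
    plain: Dict[str, Any] = {}
    for key, entry in raw.items():
        if key in WANDB_META_KEYS:
            continue
        # A wandb-style entry is assumed to look like {"desc": ..., "value": ...}.
        is_wrapped = (
            isinstance(entry, dict)
            and "value" in entry
            and set(entry) <= {"desc", "value"}
        )
        plain[key] = entry["value"] if is_wrapped else entry
    return plain


# Hypothetical example data in the two layouts.
tensorboard_style = {"sample_rate": 22050, "model": {"n_layers": 6}}
wandb_style = {
    "wandb_version": 1,
    "_wandb": {"desc": None, "value": {"cli_version": "0.0.0"}},
    "sample_rate": {"desc": None, "value": 22050},
    "model": {"desc": None, "value": {"n_layers": 6}},
}

normalized = unwrap_wandb_config(wandb_style)
```

Running both layouts through the same normalizer lets downstream loading code handle a single structure regardless of which logger wrote the file.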
- Rename the codes fields in cuts to `target_codes` and `context_codes`.
- Remove `codec_model_name` from the YAML config because the codec name is now in the `input_cfg.yaml` file.
- Fixed bugs that...
### Speed Up Codec Model Inference
Made slight changes to the dependency utility function `mask_sequence_tensor`. We observed at least a 3x speedup. This change is cherry-picked from part of PR...
The Lhotse dataloader applies dynamic batching based on duration buckets, causing the number of consumed samples at each training step to vary. Tracking the number of consumed samples enables fairer...
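As a rough illustration of why the count has to be tracked explicitly, the sketch below accumulates the number of samples seen after each step when batch sizes vary. It is a minimal example under stated assumptions, not the code from this PR: the function name `track_consumed_samples` and the toy batches are hypothetical, and each batch is modeled as a plain sequence of samples.

```python
from typing import Iterable, List, Sequence


def track_consumed_samples(batches: Iterable[Sequence[object]]) -> List[int]:
    """Return the cumulative number of consumed samples after each step."""
    consumed = 0
    history: List[int] = []
    for batch in batches:
        # With dynamic batching the batch size differs per step,
        # so step_index * batch_size would give the wrong count.
        consumed += len(batch)
        history.append(consumed)
    return history


# Toy batches of varying size, as duration-based bucketing might produce.
batches = [["a", "b", "c", "d"], ["e", "f"], ["g", "h", "i"]]
history = track_consumed_samples(batches)
print(history)
```

Plotting metrics against this cumulative count instead of the step index puts runs with different bucket settings on a common axis.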