Xuesong Yang

Results 10 issues of Xuesong Yang

I am using this backend to perform a G2P-like conversion, but I do not know where to find a complete list of the phone set for each language. Could anyone...

Signed-off-by: Xuesong Yang # What does this PR do ? Add a one line overview of what this PR aims to accomplish. **Collection**: [Note which collection this PR will affect]...

fix

# What does this PR do ? **Collection**: [Note which collection this PR will affect] # Changelog - Add specific line by line info of high level changes in this...

fix

The link (https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-python-basics-and-customization-with-ssml.html) shared in https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-basics-customize-ssml.html#note is not valid anymore.

Received a warning message when normalizing text. Could you please explain what the message indicates? **Reproducible code**: ```python from nemo_text_processing.text_normalization.normalize import Normalizer text_normalizer = Normalizer(lang="zh", input_case="cased", overwrite_cache=True, cache_dir=str("cache_dir")) text_normalizer_call_kwargs = {"punct_pre_process":...

bug
Stale

Previously reported to the Phonemizer repo: https://github.com/bootphon/phonemizer/issues/142, where I was advised to redirect it here. There is a `/1/` appearing in the phonetic transcription for the German language. Is it expected? ``` $ echo "aneinander"...

* tensorboard and wandb save hparam config files with slightly different YAML structure. This fix will address blockers when feeding in `*.ckpt` and `*/wandb/latest-run/files/config.yaml`. * simplified loading ASR models by...

TTS
Run CICD

- change the code names in cuts to `target_codes` and `context_codes`. - remove `codec_model_name` from the yaml config because we've added the codec name to the `input_cfg.yaml` file. - fixed bugs that...

TTS
common
Run CICD

### Speed Up Codec Model Inference Made slight changes to the dependency utility function `mask_sequence_tensor`. We observed at least a 3x speedup. This change is cherry-picked from part of PR...

TTS
common
Run CICD

The Lhotse dataloader applies dynamic batching based on duration buckets, causing the number of consumed samples at each training step to vary. Tracking the number of consumed samples enables fairer...

TTS
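
The last item above notes that dynamic batching by duration makes the number of samples per training step vary. A minimal sketch of why a running count of consumed samples is then more informative than the step count, under stated assumptions: `bucket_batches` is a hypothetical greedy stand-in for a duration-capped sampler (it is not Lhotse's actual sampler), and `ConsumedSamplesTracker` is an illustrative name, not a class from the PR.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Sequence


def bucket_batches(durations: Sequence[float], max_batch_duration: float) -> Iterator[List[float]]:
    """Hypothetical greedy batching: close a batch before it would exceed the duration cap."""
    batch: List[float] = []
    total = 0.0
    for duration in durations:
        if batch and total + duration > max_batch_duration:
            yield batch
            batch, total = [], 0.0
        batch.append(duration)
        total += duration
    if batch:
        yield batch


@dataclass
class ConsumedSamplesTracker:
    """Accumulates the number of samples seen across variable-size batches."""

    consumed_samples: int = 0
    history: List[int] = field(default_factory=list)

    def update(self, batch_size: int) -> int:
        if batch_size < 0:
            raise ValueError("batch_size must be non-negative")
        self.consumed_samples += batch_size
        # One entry per training step, so curves can be plotted against samples, not steps.
        self.history.append(self.consumed_samples)
        return self.consumed_samples


def track(durations: Sequence[float], max_batch_duration: float) -> ConsumedSamplesTracker:
    """Run the toy sampler and record the cumulative sample count after each step."""
    tracker = ConsumedSamplesTracker()
    for batch in bucket_batches(durations, max_batch_duration):
        tracker.update(len(batch))
    return tracker


if __name__ == "__main__":
    tracker = track([1, 1, 1, 1, 8, 8, 4, 4], max_batch_duration=8)
    print(tracker.history)
```

In this toy run, short utterances are packed four to a batch while long ones are batched alone, so equal step counts in two runs need not mean equal amounts of data seen.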