Xuesong Yang issues

Results: 10 issues of Xuesong Yang

Where/How to get a full list of phone set used for each language?

I am using this backend for G2P-like conversion, but I do not know where to find a complete list of the phone set for each language. Could anyone...

[TTS] fix broken tutorial for MixerTTS.

Signed-off-by: Xuesong Yang

# What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

**Collection**: [Note which collection this PR will affect]...

fix

[TTS] fix broken tutorial for MixerTTS.

# What does this PR do ?

**Collection**: [Note which collection this PR will affect]

# Changelog

- Add specific line by line info of high level changes in this...

fix

broken link for tts tutorials.

The link (https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-python-basics-and-customization-with-ssml.html) shared in https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/tts-basics-customize-ssml.html#note is no longer valid.

[zh] WARNING:NeMo-text-processing:Failed text: 免除GOOGLE在一桩诽谤官司中的法律责任。Key: integer_part Value: None

I received a warning message when normalizing text. Could you please explain what the message indicates? **Reproducible code**:

```python
from nemo_text_processing.text_normalization.normalize import Normalizer

text_normalizer = Normalizer(lang="zh", input_case="cased", overwrite_cache=True, cache_dir=str("cache_dir"))
text_normalizer_call_kwargs = {"punct_pre_process":...
```

bug

Stale

maybe wrong IPA symbol for a German word "aneinander"

Previously reported to the Phonemizer repo: https://github.com/bootphon/phonemizer/issues/142, where I was advised to redirect it here. `/1/` appears in the phonetic transcription for German. Is this expected?

```
$ echo "aneinander"...
```

[magpietts][eval][bugfix] fixed infer and eval scripts and supported loading wandb hparam config.

* TensorBoard and wandb save hparam config files with slightly different YAML structures. This fix addresses blockers when feeding in `*.ckpt` and `*/wandb/latest-run/files/config.yaml`.
* simplified loading ASR models by...

TTS

Run CICD

[magpietts][lhotse_v2] make model training recipe adapt to the latest v2 datasets.

- change the code names in cuts to `target_codes` and `context_codes`.
- remove `codec_model_name` from the yaml config because the codec name is now specified in the `input_cfg.yaml` file.
- fixed bugs that...

TTS

common

Run CICD

[magpietts][lhotse_v2] Adding scripts of converting NeMo manifests to Lhotse Shars, and speedup improvements for codec model inference.

### Speed Up Codec Model Inference

Made slight changes to the dependency utility function `mask_sequence_tensor`. We observed at least a 3x speedup. This change is cherry-picked from part of PR...

TTS

common

Run CICD

[magpietts] added logging of consumed samples at each training step.

The Lhotse dataloader applies dynamic batching based on duration buckets, causing the number of consumed samples at each training step to vary. Tracking the number of consumed samples enables fairer...

TTS
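
The consumed-samples idea in the last entry can be illustrated with a minimal sketch. It is not the PR's implementation: `ConsumedSamplesTracker` and the batch sizes below are hypothetical, and the sketch only assumes that each training step can report the size of its batch.

```python
from dataclasses import dataclass


@dataclass
class ConsumedSamplesTracker:
    """Running count of samples seen so far (hypothetical helper, not a NeMo or Lhotse API)."""

    consumed_samples: int = 0

    def update(self, batch_size: int) -> int:
        # With dynamic batching the batch size changes from step to step,
        # so add the actual size instead of multiplying the step index by a constant.
        self.consumed_samples += batch_size
        return self.consumed_samples


tracker = ConsumedSamplesTracker()
# Illustrative, made-up batch sizes for three consecutive training steps.
batch_sizes = [12, 7, 20]
history = [tracker.update(n) for n in batch_sizes]
```

Logging the running total at each step lets runs be compared by samples seen rather than by step count, which differs in meaning when batch sizes vary.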