Bohong Chen
If I use librosa.feature.melspectrogram to extract a mel spectrogram from the raw wav, the result is quite different from using extract_mel_spectrogram_for_tts in datapipeline.py, even with the same settings and the same wav...
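A common cause of this kind of mismatch (beyond n_fft/hop_length/window settings) is the log scaling: librosa's `power_to_db` works in decibels on a power spectrogram, while many TTS pipelines take the natural log of the *amplitude* mel. The two differ only by a constant factor, but compared naively the values look completely different. A minimal NumPy sketch of the relationship, using made-up mel energies (the `power_mel` values here are hypothetical, not from any real wav):

```python
import numpy as np

# Hypothetical mel-band energies on a power scale (illustrative only).
power_mel = np.array([1e-4, 1e-2, 1.0, 4.0])

# librosa-style scaling: power_to_db computes 10 * log10(S / ref).
db_mel = 10.0 * np.log10(np.maximum(power_mel, 1e-10))

# Typical TTS-pipeline scaling: natural log of the amplitude
# (sqrt of power), usually with a small floor for numerical safety.
log_amp_mel = np.log(np.maximum(np.sqrt(power_mel), 1e-5))

# Identity: 10*log10(p) == (20 / ln 10) * ln(sqrt(p)),
# so the two representations differ by a constant factor ~8.686.
scale = 20.0 / np.log(10.0)
```

So if one pipeline stores dB-power mels and the other stores natural-log-amplitude mels, every value is off by roughly a factor of 8.7, which is worth ruling out before comparing filterbank parameters.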
Hello, what great work! I have a question from retraining the RVQ-VAE that confuses me quite a lot: why do we need to recalculate the std at this point? https://github.com/EricGuo5513/momask-codes/blob/d4d34120ea01e3fda09944c3c030e8befe8c19e2/data/t2m_dataset.py#L41-L68
小彭老师 (Teacher Peng), I have completed Assignment 1.
Hello, thanks for sharing the code! In DiffuseStyleGesture, the model uses only one audio feature, WavLM. But when extracting the WavLM feature from the raw wav, the...
Hello, what a nice job! Do you have a plan to release your code for us to try out and reference?
Using [streaming](https://github.com/huggingface/parler-tts/blob/8e465f1b5fcd223478e07175cb40494d19ffbe17/INFERENCE.md?plain=1#L158) is very appealing to me. ``` for (sampling_rate, audio_chunk) in generate(text, description, chunk_size_in_s): # You can do everything that you need with the chunk now # For...
# Fix a bug in examples/mmlu.ipynb ## Description: If we use `gpt-4o` or `gpt-4o-mini` in place of `gpt-3.5-turbo` for evaluation in https://github.com/openai/evals/blob/234bcde34b5951233681455faeb92baaaef97573/examples/mmlu.ipynb#L126 it will raise an error as below: ``` [2024-08-25 16:11:24,231]...
I'm trying to use it to get each word's timestamp, but my results are bad and I'm not sure what I did wrong. I use `whisper-large-v3`, openai-whisper 20231117, whisper-timestamped 1.15.4. I...