Bohong Chen
If I use librosa.feature.melspectrogram to extract a mel spectrogram from the raw wav, the result is quite different from using extract_mel_spectrogram_for_tts in datapipeline.py, even with the same settings and the same wav...
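A common cause of this kind of mismatch (beyond n_fft/hop_length/window settings) is the log scaling: librosa's `power_to_db` works in decibels on a power spectrogram, while many TTS pipelines take the natural log of the *amplitude* mel. The two differ only by a constant factor, but compared naively the values look completely different. A minimal NumPy sketch of the relationship, using made-up mel energies (the `power_mel` values here are hypothetical, not from any real wav):

```python
import numpy as np

# Hypothetical mel-band energies on a power scale (illustrative only).
power_mel = np.array([1e-4, 1e-2, 1.0, 4.0])

# librosa-style scaling: power_to_db computes 10 * log10(S / ref).
db_mel = 10.0 * np.log10(np.maximum(power_mel, 1e-10))

# Typical TTS-pipeline scaling: natural log of the amplitude
# (sqrt of power), usually with a small floor for numerical safety.
log_amp_mel = np.log(np.maximum(np.sqrt(power_mel), 1e-5))

# Identity: 10*log10(p) == (20 / ln 10) * ln(sqrt(p)),
# so the two representations differ by a constant factor ~8.686.
scale = 20.0 / np.log(10.0)
```

So if one pipeline stores dB-power mels and the other stores natural-log-amplitude mels, every value is off by roughly a factor of 8.7, which is worth ruling out before comparing filterbank parameters.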
Hello, what great work! I have a question from retraining the RVQ-VAE that confuses me quite a lot: why do we need to recalculate the std at this point? https://github.com/EricGuo5513/momask-codes/blob/d4d34120ea01e3fda09944c3c030e8befe8c19e2/data/t2m_dataset.py#L41-L68
小彭老师 (Teacher Peng), I have completed Assignment 1.
Hello, thanks for sharing the code! In DiffuseStyleGesture, the model uses only one audio feature, WavLM. But when extracting the WavLM feature from the raw wav, the...
Hello, what a nice job! Do you have a plan to release your code for us to try out and reference?
Using [streaming](https://github.com/huggingface/parler-tts/blob/8e465f1b5fcd223478e07175cb40494d19ffbe17/INFERENCE.md?plain=1#L158) is very appealing to me. ``` for (sampling_rate, audio_chunk) in generate(text, description, chunk_size_in_s): # You can do everything that you need with the chunk now # For...
# Fix a bug in examples/mmlu.ipynb ## Description: If we use `gpt-4o` or `gpt-4o-mini` in place of `gpt-3.5-turbo` for evaluation in https://github.com/openai/evals/blob/234bcde34b5951233681455faeb92baaaef97573/examples/mmlu.ipynb#L126 it will raise an error as below: ``` [2024-08-25 16:11:24,231]...
I'm trying to use it to get each word's timestamp, but my results are bad and I'm not sure what I did wrong. I use `whisper-large-v3`, openai-whisper 20231117, whisper-timestamped 1.15.4. I...