BancoLin comments

Results: 11 comments of BancoLin

When using -pc output in the terminal, some Chinese characters cannot be displayed normally

```
const char * text = whisper_full_get_token_text(ctx, i, j);
...
printf("%s%s%s%s", speaker.c_str(), k_colors[col].c_str(), text, "\033[0m");
```

The issue stems from the possibility that the token `text` may not adhere to...
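The comment above is cut off, so the following is only a sketch under one assumption: that the problem is a token's bytes ending in the middle of a multi-byte UTF-8 character, so a color escape printed after every token lands inside the character. The helper names `utf8_incomplete_tail`, `Utf8Buffer`, and `colorize_tokens` are hypothetical and are not part of whisper.cpp. The idea is to hold back any unfinished trailing bytes and emit them only once the next token completes the character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of bytes at the end of `s` that belong to an unfinished UTF-8
// sequence (0 if `s` ends on a character boundary or is malformed).
static size_t utf8_incomplete_tail(const std::string & s) {
    const size_t n = s.size();
    // A UTF-8 character is at most 4 bytes, so the lead byte of the last
    // character is found within the last 4 bytes.
    for (size_t back = 1; back <= 4 && back <= n; ++back) {
        const unsigned char c = static_cast<unsigned char>(s[n - back]);
        if ((c & 0xC0) == 0x80) {
            continue; // continuation byte, keep searching for the lead byte
        }
        size_t need = 1; // ASCII, or an invalid lead byte treated as one byte
        if      ((c & 0xE0) == 0xC0) need = 2;
        else if ((c & 0xF0) == 0xE0) need = 3;
        else if ((c & 0xF8) == 0xF0) need = 4;
        return need > back ? back : 0;
    }
    return 0;
}

// Accumulates token bytes and releases only complete UTF-8 text.
struct Utf8Buffer {
    std::string pending;

    std::string push(const std::string & token_text) {
        pending += token_text;
        const size_t tail = utf8_incomplete_tail(pending);
        assert(tail <= pending.size());
        std::string ready = pending.substr(0, pending.size() - tail);
        pending.erase(0, ready.size());
        return ready;
    }
};

// Wraps each run of complete characters in an ANSI color and a reset code.
std::string colorize_tokens(const std::vector<std::string> & tokens,
                            const std::string & color) {
    Utf8Buffer buf;
    std::string out;
    for (const auto & t : tokens) {
        const std::string ready = buf.push(t);
        if (!ready.empty()) {
            out += color + ready + "\033[0m";
        }
    }
    return out;
}
```

One consequence of this design is that a character split across two tokens takes the color of the token that completes it. Any bytes still left in `pending` after the last token are dropped in this sketch and would need a separate policy.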

Chinese

In most cases the model works well on Chinese, because English and Chinese share many phonetic properties.

How do you resample to 16000?

Use ffmpeg to downsample audio files.

Training GPU requirements

With the default settings you need a GPU with at least 24 GB of memory.

RuntimeError

> RuntimeError: stft requires the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release.

My solution is to downgrade PyTorch to 1.10...

SSNR is 10.4

Reporting my results:

- Used ffmpeg to downsample the source data from 48 kHz to 16 kHz.
- Used the pretrained model in the best_ckpt/ folder.
- Ran the evaluation on the CPU.

In...

How to convert the model to rnnoise_data_little.c

@AotYan please see my comment in #246

How to convert the model to rnnoise_data_little.c

> [@BancoLin](https://github.com/BancoLin) Thank you for your advice. I have another question: in my training script, `sparsify_start`, `sparsify_stop`, `sparsify_interval`, and `sparsify_exponent` remain unchanged, while the epochs are `200` and...

how to train the "little" model

@battlefor The rnnoise training script already covers the "little" model training steps (model sparsification), but your training data must be large enough to trigger them. If I recall correctly, the little model...

how to train the "little" model

> > the little model training steps start at iteration 2500 and stop at iteration 8000
> >
> > the `iteration`, do you mean in one epoch or the whole training...
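To see why the amount of training data matters for the quoted 2500 and 8000 figures, here is a rough arithmetic sketch. It assumes, without confirmation from the excerpt above, that the iteration counter advances once per batch and is never reset between epochs, and that a partial last batch still counts as a step. The function names, the dataset sizes, and the batch size of 128 used in the usage note are hypothetical; only the 200 epochs and the start and stop values come from the thread.

```cpp
#include <cstdint>

// Total optimizer steps for a run, assuming one step per batch and a
// global step counter that keeps counting across epochs (an assumption).
std::int64_t total_steps(std::int64_t num_sequences,
                         std::int64_t batch_size,
                         std::int64_t epochs) {
    // Ceiling division: a partial last batch is counted as one step.
    const std::int64_t steps_per_epoch = (num_sequences + batch_size - 1) / batch_size;
    return steps_per_epoch * epochs;
}

enum class SparsifyPhase { NotStarted, InProgress, Finished };

// Where a run of `steps` iterations ends relative to the sparsification
// window. The defaults are the start/stop values quoted in the thread.
SparsifyPhase phase_at_end(std::int64_t steps,
                           std::int64_t start = 2500,
                           std::int64_t stop = 8000) {
    if (steps < start) return SparsifyPhase::NotStarted;
    if (steps < stop)  return SparsifyPhase::InProgress;
    return SparsifyPhase::Finished;
}
```

Under these assumptions, 1,000 sequences at batch size 128 for 200 epochs give 8 steps per epoch and 1,600 steps in total, so sparsification would never start. 10,000 sequences give 79 steps per epoch and 15,800 in total, which passes the stop iteration.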