nodejs-whisper

How can I get words from Hindi or other-language audio?

Open dayamangukiya97 opened this issue 1 year ago • 2 comments

Is it possible to get words from Hindi audio as well? If yes, how?

dayamangukiya97 avatar Jul 30 '24 11:07 dayamangukiya97

Yes, you'll have to use one of the models not ending in .en (the .en models are English-only).

For example, autoDownloadModelName: 'medium'.

If you've already downloaded another model, first go into the models folder under node_modules/nodejs-whisper/cpp/models and remove that model. This allows the new model to be downloaded automatically.
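Removing the old model can also be scripted. The following is only a minimal sketch: `removeDownloadedModel` is a hypothetical helper, not part of nodejs-whisper, and it assumes model files follow the whisper.cpp naming pattern `ggml-<model>.bin`. Check the actual file names in your models folder before relying on it.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of nodejs-whisper).
// Assumption: model files are named "ggml-<model>.bin", e.g. "ggml-base.en.bin".
// Returns true if a file was removed, false if no such file existed.
function removeDownloadedModel(modelsDir, modelName) {
  const modelFile = path.join(modelsDir, `ggml-${modelName}.bin`);
  if (!fs.existsSync(modelFile)) {
    return false;
  }
  fs.unlinkSync(modelFile);
  return true;
}

// Example usage, with the folder mentioned above (adjust to your install):
// removeDownloadedModel('node_modules/nodejs-whisper/cpp/models', 'base.en');
```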

Whisper can auto-detect the spoken language. If you already know the language of the audio, you can provide it in the options to make transcription more efficient.
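As a rough sketch of the two points above (a model not ending in .en, plus an optional language hint), the options could be assembled like this. `isMultilingualModel`, `buildOptions`, and `transcribe` are hypothetical helper names; the option keys `modelName`, `autoDownloadModelName`, and `whisperOptions.language` are the ones that appear in the code posted later in this thread.

```javascript
// Hypothetical helper: .en models are English-only, so reject them here.
function isMultilingualModel(modelName) {
  return !modelName.endsWith('.en');
}

// Builds an options object for nodewhisper. If `language` is omitted,
// no language key is set and detection is left to Whisper.
function buildOptions(modelName, language) {
  if (!isMultilingualModel(modelName)) {
    throw new Error(`Model "${modelName}" is English-only; use one not ending in .en`);
  }
  return {
    modelName,
    autoDownloadModelName: modelName,
    whisperOptions: {
      translateToEnglish: false,
      ...(language ? { language } : {}),
    },
  };
}

// Loads nodejs-whisper lazily so the helpers above can be used without it.
async function transcribe(filePath, language) {
  const { nodewhisper } = require('nodejs-whisper');
  return nodewhisper(filePath, buildOptions('medium', language));
}

// Example usage: transcribe('/path/to/audio.mp3', 'hi');
```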

SimonRosengren avatar Jul 31 '24 06:07 SimonRosengren

@ChetanXpro or @SimonRosengren OK, thanks, it works for me! But I have one more question. I used the "medium" model and also tried other models for Hindi audio, but the transcription is not accurate: some Hindi words are missing, a couple of lines are missing, words are less accurate when the voice is female, the last lines are missing, and when a stretch of the audio is silent or only music, it outputs unnecessary words.

Is there any way to specify how clear the audio must be, set a loudness detection threshold, or do something else to solve this problem and get accurate text from a Hindi vocal audio file?

Thanks in advance!

Here is my code:

const filePath = path.resolve(__dirname, 'DilKaBhawar.mp3');

await nodewhisper(filePath, {
  modelName: 'medium', // Downloaded model's name
  autoDownloadModelName: 'medium', // (optional) Auto-download a model if the model is not present
  verbose: true,
  removeWavFileAfterTranscription: true,
  withCuda: true, // (optional) Use CUDA for faster processing
  whisperOptions: {
    outputInText: false, // Get output result in a text file
    outputInVtt: false, // Get output result in a VTT file
    outputInSrt: true, // Get output result in an SRT file
    outputInCsv: false, // Get output result in a CSV file
    translateToEnglish: false, // Translate from source language to English
    language: 'hi', // Source language
    timestamps_length: 500, // Amount of dialogue per timestamp pair
    wordTimestamps: true, // Word-level timestamps
    splitOnWord: false, // Split on word rather than on token
  },
});

dayamangukiya97 avatar Aug 03 '24 11:08 dayamangukiya97

@dayamangukiya97 This project is just a wrapper around whisper.cpp, so accuracy and performance issues like these need to be raised in the whisper.cpp repository: https://github.com/ggerganov/whisper.cpp

ChetanXpro avatar Dec 13 '24 14:12 ChetanXpro