hotword detection for a new language
Hi
Is it possible to train this model for a language with a different alphabet than English, such as Persian?
Thanks,
Yes, you can do that. Please go through the training file.
Thanks. I executed training.ipynb, but I ran into an error:
No file or directory found at /content/drive/MyDrive/Siamese/modelCheckpoints_old/model-8-01-0.96.h5
I think I need some pre-trained models, but I could not find them in your GitHub repo. Could you upload them to the repo so everyone can access them?
My other question is:
I found that there are lots of English single-word audio files in the "dataset_format_fixed" directory. Do I need a new single-word audio dataset to train for a new language, or can I use the model trained on your English dataset and customize it for my hotwords, which use a completely different alphabet and letters, such as these in Arabic:
آ ب ث د ر ز م س ش ح ض
Thanks in advance
For your first question: training again with Arabic words will give better performance than going with the English pre-trained model, since the audio window frame will be different (guessing this, since Arabic words tend to be longer than 1 sec).
Do I need a new single-word audio dataset to train for a new language? Yes, if you want high accuracy. Our model gives its best accuracy on words shorter than 1.5 sec.
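As a rough illustration (not the actual preprocessing in training.ipynb), trimming or padding every clip to a fixed 1.5 s window is a quick way to sanity-check a new-language dataset before training. This sketch assumes 16 kHz mono WAV files, and the folder name is just a placeholder:

```python
import glob
import numpy as np
import soundfile as sf

SR = 16000                  # assumed sample rate
TARGET_LEN = int(1.5 * SR)  # 1.5 s window, as suggested above

def fix_length(path):
    audio, sr = sf.read(path)
    if audio.ndim > 1:                # down-mix stereo to mono
        audio = audio.mean(axis=1)
    if sr != SR:
        raise ValueError(f"{path}: expected {SR} Hz, got {sr} Hz")
    if len(audio) >= TARGET_LEN:      # trim longer clips
        return audio[:TARGET_LEN]
    pad = TARGET_LEN - len(audio)     # zero-pad shorter clips
    return np.pad(audio, (0, pad))

for wav in glob.glob("arabic_words/*.wav"):   # hypothetical dataset folder
    sf.write(wav, fix_length(wav), SR)
```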
Like @aman-17 pointed out, it can be better to train the model from scratch, as there is little to no similarity in pronunciation between Arabic and English.
Secondly, a more polished version of the code with PyTorch and a ResNet is currently in the works. Will share it soon, so stay tuned!
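Until that version lands, here is a rough sketch of the general idea only, not the actual upcoming code: a Siamese embedding network with a ResNet-18 backbone over log-mel spectrograms, where both branches share weights and similarity is measured with cosine distance.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class HotwordEmbedder(nn.Module):
    """Maps a (1, n_mels, time) log-mel spectrogram to a unit-length embedding."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        backbone = models.resnet18(weights=None)   # no pretrained weights
        # Spectrograms have a single channel, so replace the RGB stem.
        backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        emb = self.backbone(x)
        return nn.functional.normalize(emb, dim=-1)  # cosine-friendly embeddings

# Siamese use: the same network embeds both clips; train with a contrastive/triplet loss.
model = HotwordEmbedder()
a = torch.randn(4, 1, 64, 151)   # batch of log-mel spectrograms (~1.5 s at 10 ms hops)
b = torch.randn(4, 1, 64, 151)
similarity = (model(a) * model(b)).sum(dim=-1)   # values in [-1, 1]
```

Because the two branches share weights, a wakeword never has to appear in the training set; at run time you only compare the embedding of live audio against embeddings of a few reference recordings.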
The new model is out, can you test it with Arabic and let us know? The newer model has only been trained on English words, but its performance is way better than the old one.
We will share the training code for the newer model soon as well.
I am trying to do this as well: I want to create a wakeword in an African language, and while reading through your paper I came across the use of Siamese networks. So, can I upload audio data for any language, as long as each clip is within roughly 1.5 seconds, and the model will map new input close to the embeddings I created from the few audio samples?
Also, do you mind sharing a link to the code for the new model?
The repo already has the new model as the default one. Ideally it should work with any wakeword out there too; like you mentioned, that's precisely what the model does.
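To make that concrete, here is a minimal enrollment-and-matching sketch of the few-shot idea. The `embed()` function is a stand-in for whatever the repo's model actually exposes, and the 0.7 threshold is only an illustrative placeholder you would have to tune:

```python
import numpy as np

def enroll(embed, reference_clips):
    """Average the embeddings of a handful of reference recordings of the wakeword."""
    refs = np.stack([embed(clip) for clip in reference_clips])
    centroid = refs.mean(axis=0)
    return centroid / np.linalg.norm(centroid)

def is_wakeword(embed, centroid, clip, threshold=0.7):
    """Score an incoming ~1.5 s window by cosine similarity to the enrolled centroid."""
    e = embed(clip)
    score = float(np.dot(e / np.linalg.norm(e), centroid))
    return score >= threshold, score
```

With something like this, enrollment needs only a few recordings of the wakeword in the target language, and every incoming window is scored against the stored centroid.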