[BUG] Unknown error for "No files were aligned, this likely indicates serious problems with the aligner."
I had tried the same file with an older version on a Mac, but somehow it just doesn't work with the latest version on Windows 10.
Debugging checklist
[Y] Have you updated to latest MFA version?
[Y] Have you tried rerunning the command with the --clean flag?
Describe the issue A clear and concise description of what the bug is.
For Reproducing your issue Please fill out the following:
- Corpus structure
  - What language is the corpus in? Cantonese
  - How many files/speakers? 1
  - Are you using lab files or TextGrid files for input? no
- Dictionary
  - Are you using a dictionary from MFA? If so, which one? no
  - If it's a custom dictionary, what is the phoneset? cantonese_pronunciation.txt
- Acoustic model
  - If you're using an acoustic model, is it one downloaded through MFA? If so, which one? no
  - If it's a model you've trained, what data was it trained on? no
  - It was found online, from the SPICE corpus: https://dataverse.scholarsportal.info/dataset.xhtml?persistentId=doi:10.5683/SP2/MJOXP3
Log file
Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).
validate_pretrained.log
Desktop (please complete the following information):
- OS: Windows
- Version Windows 10
- Any other details about the setup (Cloud, Docker, etc)
Additional context Add any other context about the problem here.
Perhaps this is due to too many OOV words; you can check the oov_list.
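If it helps, here is a rough sketch for counting OOV words yourself. This is not MFA's own OOV logic: it assumes the transcript is plain text with whitespace-separated words, that each dictionary line starts with the word followed by its pronunciation, and that no normalization is needed. The function names `load_dictionary_words` and `count_oovs` are made up for this example.

```python
from collections import Counter
from typing import Iterable, Set


def load_dictionary_words(lines: Iterable[str]) -> Set[str]:
    """Collect the first whitespace-separated field of each non-empty line."""
    return {line.split()[0] for line in lines if line.strip()}


def count_oovs(transcript_lines: Iterable[str], vocab: Set[str]) -> Counter:
    """Count transcript tokens that do not appear in the dictionary."""
    counts: Counter = Counter()
    for line in transcript_lines:
        for token in line.split():
            if token not in vocab:
                counts[token] += 1
    return counts


if __name__ == "__main__":
    # Tiny in-memory example; in practice, read the lines from your own files.
    vocab = load_dictionary_words(["nei5 n ei5", "hou2 h ou2"])
    oovs = count_oovs(["nei5 hou2 maa3", "maa3 aa3"], vocab)
    for word, n in oovs.most_common():
        print(word, n)
```

If most of the transcript ends up in this count, the dictionary and the transcript probably do not use the same word forms.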
I have now found the cause of this error in my alignment run: a clitic marker was mistakenly included in dictionary.txt. As a result, L.fst, which is the input of compile-train-graphs-fsts.cc, was not generated correctly. So it may be worth checking your dictionary. @mmcauliffe, maybe you could add a checker program to make sure that none of the markers specified in config.yaml can be part of the dictionary.
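In the meantime, something like the following could serve as a manual check. It is only a sketch under a few assumptions: the marker list is a placeholder that you should replace with the markers from your own config.yaml, each marker is a single character, each dictionary line starts with the word, and "included in the dictionary" is taken to mean an entry whose word consists only of marker characters. `find_marker_entries` is a made-up name, not part of MFA.

```python
from typing import Iterable, List, Tuple

# Placeholder markers (assumption); copy the real ones from your config.yaml.
ASSUMED_MARKERS = ("'", "-")


def find_marker_entries(
    lines: Iterable[str], markers: Iterable[str]
) -> List[Tuple[int, str]]:
    """Return (line number, word) for entries made up only of marker characters."""
    marker_chars = set("".join(markers))
    flagged = []
    for lineno, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        if all(ch in marker_chars for ch in word):
            flagged.append((lineno, word))
    return flagged


if __name__ == "__main__":
    # Tiny in-memory example; in practice, read the lines from dictionary.txt.
    sample = ["nei5 n ei5", "' t", "hou2 h ou2"]
    for lineno, word in find_marker_entries(sample, ASSUMED_MARKERS):
        print(f"line {lineno}: entry {word!r} consists only of marker characters")
```

Words that merely contain a marker are left alone here, since those can be legitimate entries; only entries that are nothing but markers get flagged.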