hunalign
Sizes differing too much
Hi,
I'm trying to align an Arabic-English text corpus with the following command: src/hunalign/hunalign -text -hand=hand_align_file dict.txt ar.txt en.txt. I don't have a dictionary, so dict.txt is an empty text file. When I run it, I get this error:
Reading dictionary...
20 hungarian sentences read.
0 english sentences read.
Sizes differing too much. Ignoring files to avoid a rare loop bug.
Am I missing something? Why aren't the English sentences being read? Do I need to provide language codes when working with a language other than Hungarian?