How to train the translation model?
Hi,
I use the following command for model training.
mkdir /tmp/nmt_model
python -m nmt.nmt \
    --src=mn --tgt=zh \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/test \
    --test_prefix=/tmp/nmt_data/test \
    --out_dir=/tmp/nmt_model \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu
However, it fails with the following error message:
ValueError: vocab_file '/tmp/nmt_data/vocab.mn' does not exist.
I am new to nmt. Which command generates the vocabulary files for the source and target languages?
Looking forward to your advice or answers.
Best regards,
yapingzhao
Use this script:
build_vocab.zip
with the following command:
python path_to_script/build_vocab.py --data=path_to_corpus/corpus_name --save_vocab=save_path/vocab_file_name --size=50000
You can change the vocab size to anything you want; the script will create a vocab from the n most common words in the corpus and add the
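
In case the attachment is unavailable, here is a minimal sketch of what such a vocabulary builder might look like. It is not the build_vocab.py from the zip above. The function name `build_vocab`, its parameters, and the choice to put `<unk>`, `<s>` and `</s>` at the top of the file are assumptions made for illustration; as far as I know, nmt expects those three tokens first, but please check this against the nmt README. The sketch splits on whitespace, so Chinese text would need to be word-segmented beforehand.

```python
from collections import Counter

# Special tokens written first; assumed to match what nmt expects.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(corpus_path, vocab_path, size=50000):
    """Write the `size` most frequent whitespace-separated tokens, one per line."""
    counts = Counter()
    with open(corpus_path, encoding="utf-8") as corpus:
        for line in corpus:
            counts.update(line.split())

    # Avoid listing a special token twice if it also occurs in the corpus.
    for token in SPECIAL_TOKENS:
        counts.pop(token, None)

    words = [word for word, _ in counts.most_common(size)]
    vocab = SPECIAL_TOKENS + words

    with open(vocab_path, "w", encoding="utf-8") as out:
        out.write("\n".join(vocab) + "\n")
    return vocab


# Hypothetical usage for the setup in the question (one vocab file per language):
#   build_vocab("/tmp/nmt_data/train.mn", "/tmp/nmt_data/vocab.mn")
#   build_vocab("/tmp/nmt_data/train.zh", "/tmp/nmt_data/vocab.zh")
```

With --vocab_prefix=/tmp/nmt_data/vocab and --src=mn --tgt=zh, the error message in the question indicates that nmt looks for /tmp/nmt_data/vocab.mn, and by the same pattern presumably /tmp/nmt_data/vocab.zh, so both files need to exist before training.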
Thank you very much!
Hi @yapingzhao, were you able to solve this error? I am getting an AssertionError when I use build_vocab.py to generate the vocab.