How to train the translation model?

Open yapingzhao opened this issue 7 years ago • 3 comments

Hi, I use the following command for model training:

mkdir /tmp/nmt_model
python -m nmt.nmt \
    --src=mn --tgt=zh \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/test \
    --test_prefix=/tmp/nmt_data/test \
    --out_dir=/tmp/nmt_model \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu

However, it fails with this error message:

ValueError: vocab_file '/tmp/nmt_data/vocab.mn' does not exist.

I am new to nmt and do not know which command generates the source-language and target-language vocabularies. Looking forward to your advice or answers.

Best regards,

yapingzhao

yapingzhao • Apr 10 '18 07:04

Use this script: build_vocab.zip, with the following command:

python path_to_script/build_vocab.py --data=path_to_corpus/corpus_name --save_vocab=save_path/vocab_file_name --size=50000

You can change the vocab size to anything you want. It will create a vocab with the n most common words in the corpus and add the tags.
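
The contents of build_vocab.zip are not shown in this thread, so the following is only a minimal sketch of what such a script might do, not the attached script itself. It assumes the corpus is already whitespace-tokenized, and that the tags are `<unk>`, `<s>`, and `</s>`, which nmt expects as the first three vocabulary entries. The function names `build_vocab` and `write_vocab` are hypothetical.

```python
from collections import Counter

# Assumed special tokens: nmt expects these as the first three entries.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(lines, size):
    """Return the special tokens followed by the `size` most frequent words."""
    counts = Counter()
    for line in lines:
        # Assumes words are separated by whitespace.
        counts.update(line.split())
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return SPECIAL_TOKENS + [word for word, _ in ranked[:size]]


def write_vocab(corpus_path, vocab_path, size=50000):
    """Read a tokenized corpus and write the vocabulary, one token per line."""
    with open(corpus_path, encoding="utf-8") as corpus:
        vocab = build_vocab(corpus, size)
    with open(vocab_path, "w", encoding="utf-8") as out:
        out.write("\n".join(vocab) + "\n")
    return vocab
```

With the flags from the question, nmt looks for /tmp/nmt_data/vocab.mn and /tmp/nmt_data/vocab.zh, so the function would be called once per language, for example `write_vocab("/tmp/nmt_data/train.mn", "/tmp/nmt_data/vocab.mn")`.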

ptamas88 • Apr 10 '18 08:04

Thank you very much!

yapingzhao • Apr 10 '18 09:04

Hi @yapingzhao, were you able to solve this error? I am getting an AssertionError when I use build_vocab.py to generate the vocab.

kadlaon • Dec 06 '18 12:12
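
The thread does not show where the AssertionError is raised, so its cause is unknown. One thing that may be worth ruling out is an empty or very short vocabulary file: as far as I recall, nmt's vocabulary check asserts that the file holds at least three entries. The helper below is a hypothetical sketch, not part of nmt or of build_vocab.py. It reports how many entries a vocabulary file has and whether it starts with the special tokens `<unk>`, `<s>`, `</s>`.

```python
# Assumed special tokens: nmt expects these as the first three entries.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def inspect_vocab(vocab_path):
    """Return (entry_count, starts_with_special_tokens) for a vocab file."""
    with open(vocab_path, encoding="utf-8") as handle:
        # One token per line; blank lines are skipped.
        entries = [line.strip() for line in handle if line.strip()]
    return len(entries), entries[:3] == SPECIAL_TOKENS
```

A count below three, for instance from an empty corpus path passed to `--data`, would be a plausible thing to fix first; any other result points to a different cause.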