Toan Nguyen

Results 21 comments of Toan Nguyen

@xbelonogov I think @TIXFeniks refers to the special token '▁' that merges subwords, not the underscore '_'.

@xbelonogov I think '▁' should not be a token on its own but should always be attached to another token to indicate that it's a subword, no?

I'm not 100% clear about how BPE is implemented in YTTM but let's take [subword-nmt](https://github.com/rsennrich/subword-nmt) as an example. In subword-nmt, the word separator character (usually space " ") is not...
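To make the two marker conventions concrete, here is a minimal sketch. It assumes the SentencePiece-style convention in which '▁' (U+2581, not the ASCII underscore) stands in for the space before a word, and the subword-nmt convention in which a trailing `@@` means "glue this piece to the next one". The function names and the example pieces are hypothetical and only for illustration; this is not how YTTM or subword-nmt implement detokenization internally.

```python
# Illustrative only: two common ways subword tokenizers mark word boundaries.
# All names below are hypothetical helpers, not part of any library.

WORD_START = "\u2581"  # '▁' (U+2581), distinct from the ASCII underscore '_'


def detok_boundary_prefix(pieces):
    """Join pieces where '▁' stands in for the space before a word."""
    return "".join(pieces).replace(WORD_START, " ").strip()


def detok_continuation_suffix(pieces, marker="@@"):
    """Join pieces where a trailing marker means 'glue to the next piece'."""
    text = " ".join(pieces).replace(marker + " ", "")
    # Drop a dangling marker on the very last piece, if any.
    if text.endswith(marker):
        text = text[: -len(marker)]
    return text


# Made-up segmentations of the same two words under each convention.
pieces_a = [WORD_START + "new", "er", WORD_START + "model", "s"]
pieces_b = ["new@@", "er", "model@@", "s"]

print(detok_boundary_prefix(pieces_a))
print(detok_continuation_suffix(pieces_b))
```

In both conventions the marker travels with a neighbouring piece, so under these assumptions a bare '▁' or a bare `@@` as a standalone token would carry no subword of its own.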

I tested with En-Fr, using the same data as in the paper, and got 26.57 (the paper reported 26.75 after 5 days) after 5 days 8 hours on a GPU (7-80k iterations/day)....

@ad26kt No, I used this blocks-examples/machine-translation code. I didn't change anything in the configuration, except for using the En-Fr data as detailed in Bahdanau's paper.

@ad26kt I just use that script as a reference. You can take the data here http://www-lium.univ-lemans.fr/~schwenk/cslm_joint_paper/ (noted in Bahdanau's paper), preprocess the data as the authors mention in the paper (only...

A temporary fix for this could be:

``` python
def load_model_from_bleu_file(self, model, bleu_file_path):
    assert os.path.exists(bleu_file_path)
    with closing(numpy.load(bleu_file_path)) as source:
        param_values = {}
        print source.items()
        for name, parameter in source.items():
            if...
```

I believe @orhanf also made a patch. Basically just like your code.