Kelly Marchisio issues

Results: 5 issues by Kelly Marchisio

no evaluate.py script

The README says, "In addition, you can also evaluate the model in the same settings as in our paper using the evaluate.py script." -- note that there is...

How can I get deterministic results from map_embeddings.py?

I'm running the command below, but my results are not deterministic. Is there a way to make them deterministic? (I'm debugging the pipeline, so I would like a fully reproducible run.) ```...

Excessive memory usage during ensemble decoding

We're ensemble decoding with 4 models using this command: ``` $MARIAN/build/marian-decoder \ -c $configs \ -m $models -d $GPU \ --mini-batch 16 --maxi-batch 100 --maxi-batch-sort src -w 6500 \ --n-best...

build_dictionary nwords unexpected behavior when real # words < requested nwords

## 🐛 Bug When we specify nwords=N, in tasks.build_dictionary [here](https://github.com/facebookresearch/fairseq/blob/b4001184f49ed0e20d619b54bb3d43088fabf990/fairseq/tasks/fairseq_task.py#L97), if nwords is less than self.symbols, the resulting dictionary is size len(self.symbols) (code [snippet](https://github.com/facebookresearch/fairseq/blob/b4001184f49ed0e20d619b54bb3d43088fabf990/fairseq/data/dictionary.py#L174)). Perhaps this is intended behavior, but...

bug

needs triage

Unexpected behavior with sampling of repeated character sequence.

I added some sequences of repeated characters as user defined tokens to a Unigram model. Now when tokenizing with sampling, I get unexpected behavior as I increase the nbest size....

bug