Jayasimha T

Results 4 issues of Jayasimha T

https://github.com/IBM/pytorch-seq2seq/blob/f146087a9a271e9b50f46561e090324764b081fb/seq2seq/models/TopKDecoder.py#L83 I think teacher_forcing should not be present in beam decoding, since ground-truth tokens are not known during inference.

Hi, can you please help me understand why 'track_running_stats' is set to False? https://github.com/tristandeleu/pytorch-maml/blob/c7d994a3e9900d3d6790dbe921cd63abbc6589d4/maml/model.py#L12

Steps to reproduce: 1. Create a new dataset using the create_hf_dataset.py script. 2. In the config, point to your fine-tuned model and the new dataset. **We are using the XLMR model.** Running torchrun --nproc_per_node=1...

**Describe the bug** The generations from the Hugging Face model (LlamaForCausalLM) and HookedTransformer are...

bug
complexity-high