Shengyang Sun
Results: 2 issues of Shengyang Sun
**Describe the bug** In `GPTSFTChatDataset`, if the first prompt is longer than `max_seq_length`, all following turns are truncated away, and the `loss_mask` for the example becomes all `False`. https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/nlp/data/language_modeling/megatron/gpt_sft_chat_dataset.py#L359 This...
bug
stale
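The failure mode described in this issue can be illustrated with a minimal sketch. This is not NeMo's implementation: `build_loss_mask` and `has_trainable_tokens` are hypothetical helpers, and the sketch assumes a per-token mask that is `True` only on assistant tokens and is truncated from the right.

```python
from typing import List


def build_loss_mask(turn_lengths: List[int],
                    is_assistant: List[bool],
                    max_seq_length: int) -> List[bool]:
    """Hypothetical helper: build a per-token loss mask, then truncate it.

    Tokens from assistant turns are marked True (they contribute to the loss);
    prompt tokens are marked False. Truncation keeps the first
    max_seq_length tokens, which is the assumption made in this sketch.
    """
    mask: List[bool] = []
    for length, assistant in zip(turn_lengths, is_assistant):
        mask.extend([assistant] * length)
    return mask[:max_seq_length]


def has_trainable_tokens(mask: List[bool]) -> bool:
    """Hypothetical check: does any token still contribute to the loss?"""
    return any(mask)


# First prompt (10 tokens) alone exceeds max_seq_length (8),
# so the 4-token assistant reply is cut off entirely.
bad_mask = build_loss_mask([10, 4], [False, True], max_seq_length=8)

# A shorter prompt leaves the assistant reply inside the window.
ok_mask = build_loss_mask([3, 4], [False, True], max_seq_length=8)
```

Under these assumptions, `has_trainable_tokens(bad_mask)` is `False`, so the example would contribute nothing to the loss; a dataset could use such a check to skip or flag these examples.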
# What does this PR do ? The original implementation of the prompt template in `gpt_sft_chat_dataset.py` uses a fixed system token, `System`. This PR enables reading the system token...
NLP
Run CICD