starcoder
Is mask_user_labels in chat/dialogues.py buggy?
def mask_user_labels(tokenizer, dialogue_template, labels):
    """Masks the user turns of a dialogue from the loss"""
    user_token_id = tokenizer.convert_tokens_to_ids(dialogue_template.user_token)
    assistant_token_id = tokenizer.convert_tokens_to_ids(dialogue_template.assistant_token)
    for idx, label_id in enumerate(labels):
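The labels parameter I pass in is a two-dimensional list (one list of token ids per example), so comparing each element against the user token id never matches and the user turns are not masked.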
Hi. mask_user_labels is designed to take labels as a flat list of token ids for a single example, so you should not pass a 2D list to this function. Apply it to your examples one at a time instead, or modify it to suit your needs.
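
For anyone hitting the same problem, here is a minimal sketch of the "one example at a time" approach. It is not the repository's code: the loop body of mask_user_labels is not shown above, so mask_user_labels_1d is a hypothetical stand-in that takes the two token ids directly rather than a tokenizer and template, and mask_user_labels_batch is a hypothetical wrapper. IGNORE_INDEX = -100 is assumed because that is the default ignore_index of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # assumption: default ignore_index of PyTorch's cross-entropy loss


def mask_user_labels_1d(labels, user_token_id, assistant_token_id):
    """Hypothetical stand-in: mask every user turn in a flat list of token ids, in place.

    A user turn is assumed to run from a user token up to (but not including)
    the next assistant token.
    """
    masking = False
    for idx, label_id in enumerate(labels):
        if label_id == user_token_id:
            masking = True
        elif label_id == assistant_token_id:
            masking = False
        if masking:
            labels[idx] = IGNORE_INDEX
    return labels


def mask_user_labels_batch(batch_labels, user_token_id, assistant_token_id):
    """Hypothetical wrapper: apply the 1D masking to each example of a 2D list."""
    for labels in batch_labels:
        mask_user_labels_1d(labels, user_token_id, assistant_token_id)
    return batch_labels


# Toy ids: 1 stands for the user token, 2 for the assistant token.
batch = [
    [1, 10, 11, 2, 20, 21],
    [1, 12, 2, 22, 1, 13, 2, 23],
]
masked = mask_user_labels_batch(batch, user_token_id=1, assistant_token_id=2)
```

With the real function, the same idea would be a loop over the batch that calls mask_user_labels(tokenizer, dialogue_template, labels) once per example.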