starcoder
Some concerns about "mask_user_labels"
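- In chat/dialogues.py:239, should `while labels[current_idx] != assistant_token_id and current_idx < len(labels):` be `while current_idx < len(labels) and labels[current_idx] != assistant_token_id:`? As written, labels[current_idx] is read before the bounds check, so the loop can raise an IndexError when a user turn runs to the end of the sequence.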
- In chat/train.py:204, should `mask_user_labels(tokenizer, dialogue_template, labels)` be called once per sequence instead, i.e. `for _ in labels: mask_user_labels(tokenizer, dialogue_template, _)`?
Otherwise, it seems that mask_user_labels (1) has a bug of its own and (2) is not being called correctly?
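
To make both points concrete, here is a minimal sketch, not the repository's code. It assumes each sequence of labels is a plain Python list, uses -100 as the ignore value (the PyTorch cross-entropy default), and introduces a hypothetical helper `mask_user_span` with made-up token ids. It shows the bounds check placed first in the loop condition and the masking applied one sequence at a time.

```python
IGNORE_INDEX = -100  # assumed ignore value (PyTorch cross-entropy default)


def mask_user_span(labels, user_token_id, assistant_token_id):
    """Hypothetical helper: mask every user turn in ONE sequence, in place."""
    for idx, label_id in enumerate(labels):
        if label_id == user_token_id:
            current_idx = idx
            # Bounds check first: `and` short-circuits, so labels[current_idx]
            # is never evaluated once current_idx == len(labels).
            while current_idx < len(labels) and labels[current_idx] != assistant_token_id:
                labels[current_idx] = IGNORE_INDEX
                current_idx += 1
    return labels


USER, ASSISTANT = 1, 2  # hypothetical token ids, for illustration only

batch = [
    [USER, 10, 11, ASSISTANT, 20, 21],  # user turn followed by a reply
    [USER, 12, 13],                     # truncated: user turn reaches the end
]

# Apply the mask per sequence rather than passing the whole batch at once.
for row in batch:
    mask_user_span(row, USER, ASSISTANT)
```

With the two conditions in the original order, the second sequence would index one past the end of the list before the length check is reached.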