Cannot run the HF example code on any of the three CodeLlama-Python-hf models
I understand this might be a Hugging Face-related problem, but I could not find the answer anywhere, so I am asking here for help.
On Hugging Face there is an example code snippet for the CodeLlama model:
from transformers import LlamaForCausalLM, CodeLlamaTokenizer

tokenizer = CodeLlamaTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = LlamaForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
generated_ids = model.generate(input_ids, max_new_tokens=128)
filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(PROMPT.replace("<FILL_ME>", filling))
And the output looks like:
def remove_non_ascii(s: str) -> str:
    """ Remove non-ASCII characters from a string.

    Args:
        s: The string to remove non-ASCII characters from.

    Returns:
        The string with non-ASCII characters removed.
    """
    result = ""
    for c in s:
        if ord(c) < 128:
            result += c
    return result
This works fine with all of the original CodeLlama models and the CodeLlama-Instruct models. However, all three CodeLlama-Python models print a flood of "Assertion srcIndex < srcSelectDimSize failed" errors and fail to finish running.
The second strange thing is that if I delete the <FILL_ME> part of the PROMPT when using a CodeLlama-Python model, the errors no longer appear, but there is still no output.
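For what it's worth, this CUDA assertion usually means some input token id is out of range for the model's embedding table, i.e. the tokenizer emitted an id that is >= the checkpoint's vocabulary size. A minimal sanity check before calling generate() can confirm this; the concrete numbers below are stand-ins, and in practice you would compare against model.config.vocab_size:

```python
# Sketch: detect token ids that fall outside the model's embedding table.
# vocab_size is hypothetical here; in real code read model.config.vocab_size.
vocab_size = 32000

# Suppose the tokenizer produced these ids; the last two stand in for
# infill special tokens that a smaller-vocabulary checkpoint does not have.
input_ids = [1, 4321, 9876, 32007, 32009]

out_of_range = [t for t in input_ids if t >= vocab_size]
if out_of_range:
    print(f"token ids out of range for this model: {out_of_range}")
```

If the list is non-empty, generate() would index past the embedding table on GPU, which surfaces exactly as "srcIndex < srcSelectDimSize" device-side asserts.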
So my questions are:

- Why do these "Assertion srcIndex < srcSelectDimSize failed" errors happen on CodeLlama-Python, and why is there still no output after I delete the <FILL_ME> from the PROMPT? From my point of view, CodeLlama-Python is just tuned further on Python tasks, so it should not be fundamentally different from the original CodeLlama and CodeLlama-Instruct.

- Why does the README on the Hugging Face page say CodeLlama-Python cannot do infilling? Why would additional training on Python tasks make the model unable to infill? Is the problem in my first question related to this?
Thank you so much for your precious time.
As far as I know, CodeLlama-Python is not meant for infilling. Please refer to its documentation.
This model is not fine-tuned on an infilling dataset; it is fine-tuned only on a next-token-prediction objective.
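To connect this to the errors above: for infill-capable checkpoints, the tokenizer splits the prompt at the fill marker into a prefix and a suffix and wraps them with special prefix/suffix/middle tokens, and the model then generates the "middle". A checkpoint trained only on next-token prediction was never trained with those special tokens or that objective. A rough sketch of the splitting step (the <PRE>/<SUF>/<MID> strings here are illustrative placeholders; the real tokenizer handles this internally with its own special tokens):

```python
# Sketch of how an infill prompt is assembled from a <FILL_ME> marker.
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''

# Split the prompt into everything before and after the marker.
prefix, marker, suffix = PROMPT.partition("<FILL_ME>")
assert marker, "no fill marker found in the prompt"

# Prefix-suffix-middle ordering: the model is asked to produce the
# missing middle after the <MID> token.
infill_prompt = f"<PRE> {prefix} <SUF>{suffix} <MID>"
```

A model without this training simply has no way to interpret the layout, which is why removing <FILL_ME> stops the crash but still yields no useful completion for the truncated prompt.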