Cannot run the HF example code on any of the three CodeLlama-Python-hf models
I understand this might be a Hugging Face-related problem, but I could not find the answer anywhere, so I am asking here for help.
On Hugging Face there is an example code snippet for the CodeLlama model:
from transformers import LlamaForCausalLM, CodeLlamaTokenizer

tokenizer = CodeLlamaTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = LlamaForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
generated_ids = model.generate(input_ids, max_new_tokens=128)
filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(PROMPT.replace("<FILL_ME>", filling))
And the output looks like:
def remove_non_ascii(s: str) -> str:
    """ Remove non-ASCII characters from a string.

    Args:
        s: The string to remove non-ASCII characters from.

    Returns:
        The string with non-ASCII characters removed.
    """
    result = ""
    for c in s:
        if ord(c) < 128:
            result += c
    return result
This works fine with all of the original CodeLlama models and the CodeLlama-Instruct models. However, all three CodeLlama-Python models print a flood of "Assertion srcIndex < srcSelectDimSize failed" errors and fail to finish running.
The second strange thing is that if I delete the <FILL_ME> part of the PROMPT when using a CodeLlama-Python model, the errors no longer appear, but there is still no output.
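For what it's worth, this CUDA assertion usually means some input token id is out of range for the model's embedding table, i.e. the tokenizer emitted an id that is >= the checkpoint's vocabulary size. A minimal sanity check before calling generate() can confirm this; the concrete numbers below are stand-ins, and in practice you would compare against model.config.vocab_size:

```python
# Sketch: detect token ids that fall outside the model's embedding table.
# vocab_size is hypothetical here; in real code read model.config.vocab_size.
vocab_size = 32000

# Suppose the tokenizer produced these ids; the last two stand in for
# infill special tokens that a smaller-vocabulary checkpoint does not have.
input_ids = [1, 4321, 9876, 32007, 32009]

out_of_range = [t for t in input_ids if t >= vocab_size]
if out_of_range:
    print(f"token ids out of range for this model: {out_of_range}")
```

If the list is non-empty, generate() would index past the embedding table on GPU, which surfaces exactly as "srcIndex < srcSelectDimSize" device-side asserts.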
So my questions are:

- Why do these "Assertion srcIndex < srcSelectDimSize failed" errors happen on CodeLlama-Python, and why is there still no output after I delete the <FILL_ME> from the PROMPT? From my point of view, CodeLlama-Python is just tuned further on Python tasks, so it should not be fundamentally different from the original CodeLlama and CodeLlama-Instruct.

- Why does the README on the Hugging Face page say CodeLlama-Python cannot do infilling? Why would additional training on Python tasks make the model unable to infill? Is the problem in my first question related to this?
Thank you so much for your precious time.
As far as I know, CodeLlama-Python is not meant for infilling. Please refer to its documentation.
This model is not fine-tuned on an infilling dataset; it is fine-tuned only on a next-token-prediction objective.
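To connect this to the errors above: for infill-capable checkpoints, the tokenizer splits the prompt at the fill marker into a prefix and a suffix and wraps them with special prefix/suffix/middle tokens, and the model then generates the "middle". A checkpoint trained only on next-token prediction was never trained with those special tokens or that objective. A rough sketch of the splitting step (the <PRE>/<SUF>/<MID> strings here are illustrative placeholders; the real tokenizer handles this internally with its own special tokens):

```python
# Sketch of how an infill prompt is assembled from a <FILL_ME> marker.
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''

# Split the prompt into everything before and after the marker.
prefix, marker, suffix = PROMPT.partition("<FILL_ME>")
assert marker, "no fill marker found in the prompt"

# Prefix-suffix-middle ordering: the model is asked to produce the
# missing middle after the <MID> token.
infill_prompt = f"<PRE> {prefix} <SUF>{suffix} <MID>"
```

A model without this training simply has no way to interpret the layout, which is why removing <FILL_ME> stops the crash but still yields no useful completion for the truncated prompt.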