Brandon Lockaby
is this correct, or in-progress in v4.41-release?

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "NousResearch/Meta-Llama-3-8B-GGUF"
filename = "Meta-Llama-3-8B-Q4_K_M.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)

print(model)
```

```
...
```
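If the failure is only in resolving the tokenizer from the GGUF repo, a possible workaround (a sketch, assuming the original FP16 repo NousResearch/Meta-Llama-3-8B ships a tokenizer compatible with this quant) is to pull the tokenizer from there and only the weights from the GGUF:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumption: the FP16 repo's tokenizer matches the GGUF's vocab.
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Meta-Llama-3-8B-GGUF",
    gguf_file="Meta-Llama-3-8B-Q4_K_M.gguf",
)
```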
@younesbelkada Same error. Created and loaded from this repo: https://huggingface.co/brandonglockaby/Meta-Llama-3-8B-Q4_K_M-GGUF I should point out that the previous attempts used GGUFs that work correctly with current releases of llama.cpp and llama-cpp-python.
@younesbelkada

```
pip install --upgrade --force-reinstall git+https://github.com/huggingface/transformers
Successfully uninstalled transformers-4.41.2
```

Same error related to the tokenizer filename, produced with the updated repo from gguf-my-repo as well as a gguf from my...
I tried to use kv_overrides to change the EOS token here, as there doesn't seem to be any other way, but that didn't do the trick.
```python
from llama_cpp import Llama

llm = Llama(
    model_path="/home/axyo/dev/LLM/models/Meta-Llama-3-8B-Instruct-GGUF-v2/Meta-Llama-3-8B-Instruct-v2.Q5_0.gguf",
    n_gpu_layers=-1,
    seed=8,
    n_ctx=4096,
    logits_all=True,
)

prompt = """<|start_header_id|>user<|end_header_id|>

What is a dog?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

A dog, also known as Canis lupus familiaris, is...
```
top_k=1 or temperature=0 or other variations of these parameters don't make a difference; EOS causes generation to come back blank with no logprobs.
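To make the "blank with no logprobs" symptom concrete, this is roughly how I'm inspecting the completion (field names per llama-cpp-python's OpenAI-style response; the calling parameters here are just illustrative):

```python
output = llm(prompt, max_tokens=16, temperature=0, logprobs=10)
choice = output["choices"][0]

print(repr(choice["text"]))                # comes back empty
print(choice["logprobs"]["top_logprobs"])  # comes back empty too
print(choice["finish_reason"])             # "stop", i.e. EOS fired immediately
```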
Exasperatingly, trying to change the EOS token doesn't work:

```python
llm = Llama(
    model_path="/home/axyo/dev/LLM/models/Meta-Llama-3-8B-Instruct-GGUF-v2/Meta-Llama-3-8B-Instruct-v2.Q5_0.gguf",
    n_gpu_layers=-1,
    seed=8,
    n_ctx=4096,
    logits_all=True,
    kv_overrides={"tokenizer.ggml.eos_token_id": 0},
)
```
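A quick way to check whether the override actually registered (a sketch; token_eos() is llama-cpp-python's accessor for the loaded model's EOS id):

```python
# Prints 0 if the kv_override took effect, 128001 if it was ignored.
print(llm.token_eos())
```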
```python
output = llm(
    prompt,
    echo=False,
    logprobs=100,
    max_tokens=1,
    repeat_penalty=1.0,  # disable penalties
    temperature=0,
    logit_bias={128001: -1000},
)
```

Biasing the EOS id down with logit_bias doesn't work either, idk 🤷
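For what it's worth, a heavier-handed sketch that sidesteps logit_bias entirely by masking EOS in a custom logits processor (using llama-cpp-python's logits_processor hook; each processor receives the token ids so far and the raw scores array):

```python
import numpy as np
from llama_cpp import LogitsProcessorList

EOS_ID = 128001  # the id being biased above

def suppress_eos(input_ids, scores):
    # Force the EOS logit to -inf so it can never be sampled.
    scores[EOS_ID] = -np.inf
    return scores

output = llm(
    prompt,
    max_tokens=1,
    logprobs=100,
    temperature=0,
    logits_processor=LogitsProcessorList([suppress_eos]),
)
```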
Affects: 0.3.15
```python
from llama_cpp import Llama

llm = Llama(
    model_path="/home/axyo/dev/LLM/models/gpt-oss-20b-F16.gguf",
    n_gpu_layers=16,
    seed=8,
    n_ctx=4096,
    logits_all=True,
)

prompt = """<|start|>system<|message|>You are ChatGPT trained by OpenAI.
Knowledge cutoff: 2024-06
Reasoning: low
<|end|><|start|>user<|message|>Reply with only...
```