Bruce D'Ambrosio
Does this answer my question (issue #48) as well? Any ideas on a timeline?
That would be GREAT; I haven't had much luck. I do have an LLM-compatible server with access between encode and generate, and streaming access between generate and decode, if...
me too. sigh
I'm getting RuntimeError: shape '[1, 34, 64, 128]' is invalid for input of size 34816 for 70B chat; 7B and 13B load fine.
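The numbers in that error line up with a grouped-query-attention mismatch. A minimal arithmetic sketch, assuming llama-2-70b's config uses num_attention_heads=64, num_key_value_heads=8, and head_dim=128 (7B/13B don't use GQA, which would explain why they load fine while 70B doesn't):

# Assumed 70B config values: 64 attention heads, 8 key/value heads, head_dim 128.
seq_len, n_heads, n_kv_heads, head_dim = 34, 64, 8, 128

expected = 1 * seq_len * n_heads * head_dim    # 278528: what the reshape to [1, 34, 64, 128] wants
actual = 1 * seq_len * n_kv_heads * head_dim   # 34816: what a GQA key/value projection yields

print(expected, actual)  # 278528 34816 -- matches the error message

If that's right, the installed transformers simply predates GQA support, which points back at the version question below.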
Hmm, I installed 4.31.0, but I saw 4.31.0.dev0 in the config file; guess I'll try that.
There is no 4.31.0.dev0 available:
python -m pip install --upgrade transformers==4.31.0.dev0
ERROR: Could not find a version that satisfies the requirement transformers==4.31.0.dev0 (from versions: 0.1, 2.0.0, 2.1.0, 2.1.1, 2.2.0, ...)
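For what it's worth, .dev0 builds are not published to PyPI; the usual way to get one is to install transformers directly from the GitHub main branch:

python -m pip install --upgrade git+https://github.com/huggingface/transformers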
import torch
import transformers
from transformers import (
    AutoTokenizer,
    BitsAndBytesConfig,
    AutoModelForCausalLM,
)
from alphawave_pyexts import serverUtils as sv

model_name = '/home/bruce/Downloads/llama/llama-2-70b-chat'
print(f"Loading {model_name}")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,
    device_map="auto",
    trust_remote_code=True,
)
...
I pass the pipeline to my utility server code that works with literally dozens of other models, including llama-2-7B/13B. I'll add a standalone test (sketched below) just to make sure that isn't...
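Something along these lines should isolate the server code; a minimal sketch, with a placeholder prompt and token budget, assuming the same local model path as above:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = '/home/bruce/Downloads/llama/llama-2-70b-chat'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,
    device_map="auto",
)

# Tiny generation with no server code in the loop, to see if load/generate alone fails.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))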
Loading the re-downloaded model now...
Same error. Checked pytorch too; latest version. My conda env has lots of stuff, maybe I'll try a fresh one... Ubuntu 22.04, up to date, btw. Python 3.11.3.
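Before rebuilding the env, it might be worth printing the versions the interpreter actually resolves, to rule out a stale install shadowing the upgrade:

import sys
import torch
import transformers

# Confirm the interpreter sees the versions pip claims to have installed.
print('python', sys.version)
print('torch', torch.__version__)
print('transformers', transformers.__version__)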