Results: 13 comments by Logan Lebanoff

Has anyone tested loading 65B with `accelerate` across multiple GPUs?

See here: https://github.com/facebookresearch/llama/issues/84#issuecomment-1456285764

I got it working by following the instructions in this repo: https://github.com/zsc/llama_infer. It uses huggingface's `transformers` and `accelerate` to load the model. Since it no longer needs torchrun, you can...
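
For reference, here is a minimal sketch of that `transformers` + `accelerate` loading pattern. The checkpoint path, dtype, and prompt are placeholder assumptions, not taken from that repo; the key part is `device_map="auto"`, which lets `accelerate` shard the weights across all visible GPUs without a torchrun launcher.

```python
# Minimal sketch, assuming a LLaMA checkpoint already converted to the
# Hugging Face format at "path/to/llama-65b-hf" (placeholder path).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-65b-hf"  # placeholder: your converted checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path)

# device_map="auto" hands layer placement to accelerate, which shards the
# model across every visible GPU (spilling to CPU if it still doesn't fit),
# so no torchrun launcher is needed.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```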

I ran into this issue as well with torch==2.0. Uninstalling it and reinstalling torch==1.13.1 seemed to fix the issue.

The error went away for me when running on GPU.

CUDA 11.7. Also, I used conda to install pytorch with CUDA (`conda install pytorch=1.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia`).

```
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright...
```

> > Is this feature in the Chat UI product roadmap?
>
> Yes! We're working on something similar right now 😄

Excited about the plugin support! Any update on...

Here's what fixed it for me https://github.com/huggingface/chat-ui/issues/1169#issuecomment-2173309506

Any progress on this? I'm also interested in hooking up retrieval to the UI.

Here's what fixed the `Controller is already closed` issue for me. Maybe it will work for you too, though I was not using DeepInfra: https://github.com/huggingface/chat-ui/issues/1169#issuecomment-2173309506