Andrei
@jmtatsch @MillionthOdin16 thank you! I still have a few questions on the best way to implement this; I'd appreciate any input. The basic features would allow you to: - Specify a...
Implemented in #931
To confirm, can you check out and test this llama.cpp commit with OPENBLAS (this is what v0.1.32 is based on), and also compare it against the latest llama.cpp? https://github.com/ggerganov/llama.cpp/tree/684da25926e5c505f725b4f10b5485b218fa1fc7
@Bloob-beep but without the chain, i.e. calling `llm(prompt)` directly, you don't get this error? Very strange.
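This is roughly the comparison I have in mind; a minimal sketch, assuming the chain here is a LangChain `LLMChain` wrapping the `LlamaCpp` wrapper (the model path and prompt are just placeholders):

```python
from llama_cpp import Llama

# Direct call, which I'd expect to work without the error.
llm = Llama(model_path="./models/ggml-model-q4_0.bin")
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])

# Same prompt routed through a chain (assumed setup), which is where the error shows up.
from langchain.llms import LlamaCpp
from langchain import PromptTemplate, LLMChain

chain_llm = LlamaCpp(model_path="./models/ggml-model-q4_0.bin")
prompt = PromptTemplate(template="Q: {question} A:", input_variables=["question"])
chain = LLMChain(prompt=prompt, llm=chain_llm)
print(chain.run("Name the planets in the solar system."))
```

If the direct call succeeds and only the chained call fails, that would point at the wrapper rather than the bindings.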
@Niek do you mind moving this to the build release workflow?
@Niek if possible, can we include @jmtatsch's nvidia-docker container example in this PR as well? The ability to `docker pull` and run a GPU-accelerated container would be very helpful.
@Niek finally got a chance to merge this, great work! We now have a Docker image. @jmtatsch if you're still interested, it would be awesome to get that cuBLAS-based image,...
@oobabooga thanks, I'll add the option like that. The biggest recent performance improvement has been the OpenBLAS / cuBLAS / CLBlast support added to llama.cpp; those can be enabled by...
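If you want to double-check which backend your build actually picked up, something like this should do it (assuming a llama-cpp-python version that exposes the low-level `llama_print_system_info` binding):

```python
# Print llama.cpp's system info string; a BLAS-accelerated build should report "BLAS = 1".
import llama_cpp

print(llama_cpp.llama_print_system_info().decode("utf-8"))
```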
@oobabooga merged in the changes for the GPU offloading and tested this out; it works well for me. Made a small change to the cache capacity parsing to default to bytes...
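For reference, this is roughly how I exercised the offload path; a sketch assuming a cuBLAS-enabled build and a local GGML model file (the path and layer count are placeholders):

```python
from llama_cpp import Llama

# Offload a number of transformer layers to the GPU; the remaining layers stay on the CPU.
llm = Llama(
    model_path="./models/ggml-model-q4_0.bin",
    n_gpu_layers=32,
)
out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```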
**EDIT**: It works, but I also needed to add the `-DCMAKE_CXX_FLAGS=-fPIC` and `-DCMAKE_C_FLAGS=-fPIC` flags to avoid the error below. @thomasantony thanks, I wasn't aware of those flags; however, this doesn't seem...