popsiclexu
/remove-lifecycle stale
@limbaniharsh For vLLM v0, you can replace the existing `lmcache_connector.py` in vLLM with the patched version from LMCache, located at `docker/patch/lmcache_connector.py`. In my environment, I used the following command to apply...
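The actual command got truncated above, so here is only a minimal sketch of one way to swap the file in. The target path inside the vLLM install is an assumption and may differ between versions, so check where `lmcache_connector.py` actually lives in your install first:

```python
# Hypothetical sketch, not the original command: copy LMCache's patched
# connector over the one bundled with an installed vLLM v0.
# The target path below is an assumption; verify it for your vLLM version.
import shutil
from pathlib import Path

import vllm

target = (
    Path(vllm.__file__).parent
    / "distributed" / "kv_transfer" / "kv_connector" / "lmcache_connector.py"
)
shutil.copy("docker/patch/lmcache_connector.py", target)
print(f"Replaced {target}")
```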
> Thank you for your PR—great work! We need to discuss further how we'll support RAG as an AI Provider.
>
> For example, what types of authentication will we...
@arbreezy The custom backend lets users implement more complex, tailored functionality. It also lets them integrate any vector database or embedding model in their own code (see the sketch below), offering greater...
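To illustrate the idea only (this is a hypothetical shape, not k8sgpt's actual interface), a custom backend boils down to a small contract the user implements with whatever embedder and vector store they prefer:

```python
# Hypothetical illustration of the "custom backend" idea; the names and
# signatures here are invented for this sketch, not k8sgpt's real API.
from typing import Protocol


class CustomRAGBackend(Protocol):
    def embed(self, text: str) -> list[float]:
        """Turn text into a vector using any embedding model."""
        ...

    def query(self, question: str, top_k: int = 4) -> list[str]:
        """Retrieve the top_k most relevant documents from any vector store."""
        ...


class MyBackend:
    """Stub showing where a user's own embedder and vector DB would plug in."""

    def embed(self, text: str) -> list[float]:
        return [float(len(text))]  # placeholder embedding

    def query(self, question: str, top_k: int = 4) -> list[str]:
        return [f"doc about {question}"][:top_k]  # placeholder retrieval
```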
Thank you for your suggestion! I’ve added a description of the OpenAPI schema to the tutorial documentation. https://github.com/k8sgpt-ai/docs/pull/135 @AlexsJones
@abindg https://github.com/k8sgpt-ai/k8sgpt-operator/pull/666
> > Support PagedAttention-based attention backends in vLLM, e.g. xformers, rocm_flash (if ROCm devices are supported)
> >
> > issue #680
>
> Just a quick comment. I think the vllm...
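For context, recent vLLM releases let you force a specific attention backend through the `VLLM_ATTENTION_BACKEND` environment variable (values such as `XFORMERS` or `ROCM_FLASH`); a minimal sketch, assuming a build where that variable is honored and xformers is installed:

```python
# Force a specific attention backend before vLLM auto-selects one.
# VLLM_ATTENTION_BACKEND is read at engine construction; XFORMERS assumes
# an xformers-capable install (ROCM_FLASH would be the choice on AMD GPUs).
import os

os.environ["VLLM_ATTENTION_BACKEND"] = "XFORMERS"

from vllm import LLM

llm = LLM(model="facebook/opt-125m")
print(llm.generate(["Hello"])[0].outputs[0].text)
```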