Traian Rebedea
You cannot load a 70B model on a T4 with 16 GB VRAM. Some guidance on VRAM size vs. model size (for Llama 3.1, but it is similar for other models)...
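The arithmetic behind this can be sketched as follows (the function name is illustrative, and the figures are back-of-the-envelope: weights only, ignoring KV cache and activations, which add more on top):

```python
def weights_vram_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Approximate VRAM needed just for the model weights, in GB.

    1 billion params at 1 byte each is roughly 1 GB.
    """
    return n_params_billion * bytes_per_param

# A 70B model at fp16 (2 bytes/param) needs ~140 GB for weights alone,
# far beyond a T4's 16 GB; even 4-bit quantization (~0.5 bytes/param)
# still needs ~35 GB.
print(weights_vram_gb(70, 2.0))   # 140.0
print(weights_vram_gb(70, 0.5))   # 35.0
print(weights_vram_gb(8, 2.0))    # 16.0 -> an 8B fp16 model barely fits a T4
```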
Hi @aqx95 , There might be a bug in how LangChain uses vLLM backends and chat completions, e.g. https://github.com/langchain-ai/langchain/issues/29323 However, using an openai backend and a vLLM server as `openai_api_base` or...
`max_tokens` is a read-only property in `langchain-google-vertexai 2.0.13`; not sure why there is no setter. See the discussion linked above. The solution should be to also add a setter in `langchain-google-vertexai`...
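For context, a minimal sketch of the read-only property pattern and the proposed fix of adding a setter; the class and default values here are illustrative, not the actual `langchain-google-vertexai` implementation:

```python
class ChatModel:
    """Illustrative stand-in for a chat model class with a guarded attribute."""

    def __init__(self, max_tokens: int = 1024):
        self._max_tokens = max_tokens

    @property
    def max_tokens(self) -> int:
        # With only this getter defined, `model.max_tokens = 256`
        # raises AttributeError: the property is read-only.
        return self._max_tokens

    @max_tokens.setter
    def max_tokens(self, value: int) -> None:
        # Adding a setter makes plain attribute assignment work again.
        self._max_tokens = value


model = ChatModel()
model.max_tokens = 256   # works once the setter exists
print(model.max_tokens)  # 256
```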
Hi @abhishekpandey3 , Can you share some details, e.g. a sample config and the dialogue rails you are using? It is difficult to debug or test anything without some additional...
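For reference, a minimal sketch of the kind of config and dialogue rail that makes an issue reproducible; the model name and flow contents below are placeholders, not a recommended setup:

```yaml
# config.yml (sketch)
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo
```

```colang
# rails.co (sketch)
define user ask about politics
  "What do you think about the elections?"

define flow politics
  user ask about politics
  bot refuse to answer politics
```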
Do you have a `tmp` subdirectory in the directory where you have the dataset you are "indexing" / using as a reference for checking for duplication? If you...