
Results 9 comments of Berat Çimen

LangChain integration is easier than it looks. You can register `vllm.LLM` as a custom LLM. [Doc link](https://python.langchain.com/docs/modules/model_io/models/llms/how_to/custom_llm)
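To illustrate the shape of that integration, here is a minimal sketch of LangChain's custom-LLM pattern (implement `_call` and `_llm_type`). The real imports from `langchain` and `vllm` are shown only as comments so the sketch stays self-contained; the `generate_fn` hook is a hypothetical stand-in for `vllm.LLM.generate` so it runs without a GPU:

```python
# Sketch of wrapping vLLM as a LangChain custom LLM.
# Real integration would use:
#   from langchain.llms.base import LLM      # subclassing target
#   from vllm import LLM as VLLM, SamplingParams

class VLLMWrapper:
    """Duck-typed stand-in for a LangChain custom LLM.

    In a real integration you subclass `langchain.llms.base.LLM`
    and implement `_call` and `_llm_type` exactly as below.
    """

    def __init__(self, generate_fn):
        # `generate_fn` is a hypothetical hook standing in for
        # vllm.LLM.generate, so this sketch runs anywhere.
        self._generate = generate_fn

    @property
    def _llm_type(self) -> str:
        # LangChain uses this string to identify the backend.
        return "vllm"

    def _call(self, prompt: str, stop=None) -> str:
        # With real vLLM this would be roughly:
        #   outputs = self.model.generate([prompt], SamplingParams())
        #   return outputs[0].outputs[0].text
        return self._generate(prompt)
```

A real subclass would additionally inherit from `langchain.llms.base.LLM`, but the two members above are the whole required surface.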

GPT-J and GPT-NeoX have similar architectures, so a hacky workaround may be possible for now.

> Are you wanting `load_in_8bit` from HF, or would you consider the AWQ/GPTQ support sufficient?

@hmellor cloud compute costs add up for quantizing models to AWQ and GPTQ, so having...

@hmellor do models quantized with BnB and uploaded to the Hub work with vLLM?

Same issue on `OS Build 22631.3296`

Is there any news about this issue?

Thank you both @charliermarsh and @zanieb! Setting the env variable `UV_CACHE_DIR = "E:\uv\cache"` worked! Here are the terminal outputs:

```bash
[08:21:02] berat /e/Projects/python/uv_test>uv --help
An extremely fast Python package manager....
```
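For anyone else hitting this, the fix can be reproduced like so. `UV_CACHE_DIR` is a real uv environment variable; the `E:/uv/cache` path is simply the one from the comment above (bash / Git Bash syntax):

```shell
# Redirect uv's cache to a drive with more space.
# uv reads UV_CACHE_DIR on every invocation; with uv installed,
# `uv cache dir` will echo the configured location back.
export UV_CACHE_DIR="E:/uv/cache"

echo "uv cache now at: $UV_CACHE_DIR"
```

Putting the `export` in your shell profile makes the change permanent.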

Thanks @Krobys for the PR. Absolute legend!

Hey @nbonamy, I have a similar issue with the OpenAI Agents SDK and OpenRouter. Can you give me a hand by telling me how you implemented the fix?