
[Feature Request]: Local LLM and embedding

Open CoderJackZhu opened this issue 1 year ago • 6 comments

Is your feature request related to a problem? Please describe.

Currently, local models and local embeddings are not supported. When will they be supported?

Describe the solution you'd like

I'd like to use a local LLM instead of GPT-4, which is costly and unaffordable for me. I'd appreciate an option to run the LLM locally.

Additional context

No response

CoderJackZhu avatar Jul 12 '24 08:07 CoderJackZhu

Read this: https://github.com/microsoft/graphrag/issues/374, or see the article on my WeChat official account: "Hands-on with Microsoft's new-generation RAG: does GraphRAG's powerful global understanding crush naive RAG?"

KylinMountain avatar Jul 12 '24 08:07 KylinMountain

Can DeepSeek be used directly in China?

JackyYangPassion avatar Jul 12 '24 09:07 JackyYangPassion

> Can DeepSeek be used directly in China?

If it is compatible with the OpenAI SDK, it should be fine. Qwen, Moonshot, and Groq, for example, are all compatible with the OpenAI SDK.
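
For anyone unsure what "OpenAI-SDK compatible" means in practice, here is a minimal sketch using the official `openai` Python package against such an endpoint. This is an illustration, not an official graphrag config: the `base_url` and model name below are Groq's as an example, so substitute your provider's values.

```python
# Minimal sketch: any provider that speaks the OpenAI API can be reached by
# pointing the official openai client at its endpoint. The base_url and model
# below are Groq's as an example; swap in your provider's values.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PROVIDER_API_KEY",            # key issued by the provider, not OpenAI
    base_url="https://api.groq.com/openai/v1",  # any OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="llama3-70b-8192",  # whatever chat model the provider serves
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```

graphrag talks to the model through this same SDK, which is why setting api_base in settings.yaml to one of these endpoints generally works.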

KylinMountain avatar Jul 12 '24 09:07 KylinMountain

See here: https://youtu.be/XiLEZzm7yCk

win4r avatar Jul 12 '24 12:07 win4r

Tested this repo and it works well.

https://github.com/TheAiSingularity/graphrag-local-ollama

*Note: I'm doing research on large medical docs, and the default max tokens was too large, so I had to reduce the max token size to 4000 in the embeddings section of settings.yaml:

```yaml
embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic_embed_text
    api_base: http://localhost:11434/api
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    max_retries: 1
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
  batch_size: 2 # the number of documents to send in a single request
  batch_max_tokens: 4000 # the maximum number of tokens to send in a single request
  # target: required # or optional
```
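
If the indexing run still fails on embeddings, it's worth sanity-checking the Ollama endpoint directly before another long pipeline run. A rough sketch of such a check (my addition, not from the repo above; it assumes Ollama is running on its default port with the model pulled, where the model tag Ollama itself uses is typically nomic-embed-text):

```python
# Rough sanity check: confirm the local Ollama embedding endpoint that the
# settings.yaml above points at actually returns vectors.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello world"},
)
resp.raise_for_status()
embedding = resp.json()["embedding"]
print(f"got an embedding of dimension {len(embedding)}")
```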

btcmonte avatar Jul 14 '24 15:07 btcmonte

Local search with embeddings from Ollama now works. You can read the full guide here: https://medium.com/@karthik.codex/microsofts-graphrag-autogen-ollama-chainlit-fully-local-free-multi-agent-rag-superbot-61ad3759f06f Here is the link to the repo: https://github.com/karthik-codex/autogen_graphRAG

karthik-codex avatar Jul 18 '24 15:07 karthik-codex

This repo solves the problem: https://github.com/severian42/GraphRAG-Ollama-UI

CoderJackZhu avatar Jul 19 '24 13:07 CoderJackZhu