Bruce

Results: 2 issues by Bruce

I need to adjust the default token limit for my Large Language Model (LLM). Currently, I’m using Ollama with the Mistral model and have created two clients—one using the Ollama...
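For the question above, Ollama lets a client override its default token limits per request through the `options` field of the REST API (`num_ctx` for the context window, `num_predict` for the generation cap). A minimal sketch, assuming a local Ollama server with the Mistral model pulled; the values here are illustrative, and the payload is only constructed, not sent:

```python
import json

# Per-request override of Ollama's default token limits for the Mistral model.
# num_ctx sets the context window; num_predict caps the number of generated
# tokens. (Sketch only: the request body is built but not sent.)
payload = {
    "model": "mistral",
    "prompt": "Summarize the Ollama options API.",
    "options": {
        "num_ctx": 8192,      # context window in tokens (assumed value)
        "num_predict": 1024,  # max tokens to generate (assumed value)
    },
    "stream": False,
}

# This JSON body would be POSTed to http://localhost:11434/api/generate
body = json.dumps(payload)
print(body)
```

The same `options` dict can also be passed through the official Python client's `chat`/`generate` calls, so both clients mentioned above can share it.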

I see that Ollama already supports bge-m3, and bge-m3 can generate sparse vectors. Is there a way to generate sparse embeddings through Ollama?
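For context on the request above: bge-m3's sparse output is a lexical-weights representation, i.e. a mapping from token ids to non-negative weights, and relevance between two such embeddings is a sparse dot product over shared tokens. A minimal illustrative sketch of that data shape and scoring (the token ids and weights below are made up, not real bge-m3 output):

```python
# Sparse embeddings as token-id -> weight dicts (illustrative values only,
# not actual bge-m3 output). Relevance is the sum of weight products over
# tokens present in both embeddings, i.e. a sparse dot product.
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    # Iterate over the smaller dict for efficiency.
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    return sum(w * large[t] for t, w in small.items() if t in large)

query = {101: 0.8, 2043: 0.5, 7099: 0.3}  # hypothetical token weights
doc = {101: 0.6, 7099: 0.9, 5120: 0.2}

score = sparse_dot(query, doc)  # 0.8*0.6 + 0.3*0.9 = 0.75
print(round(score, 2))
```

Only shared tokens (101 and 7099 here) contribute to the score, which is what makes sparse retrieval behave like a learned lexical match.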

feature request