Error: VectorFromInput was called without vectorizer
Description
I'm encountering an error while using Verba. When I submit a query, I receive an error message indicating that VectorFromInput was called without vectorizer. Here are the details:
Is this a bug or a feature?
- [x] Bug
- [ ] Feature
Steps to Reproduce
1. Clone the repository from GitHub.
2. Install dependencies using `pip install -r requirements.txt`.
3. Start the application using `verba start`.
4. Submit a query through the API (e.g., "what's the most valuable item in Minecraft?").
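For step 4, here is a minimal sketch of submitting the query programmatically; the port (8000) and the "query" field in the JSON body are assumptions based on a default local install and may differ between Verba versions:

```python
# Hypothetical sketch of step 4: POST a query to a locally running Verba.
# The port and payload shape are assumptions, not confirmed from the Verba source.
import requests

resp = requests.post(
    "http://localhost:8000/api/query",
    json={"query": "what's the most valuable item in Minecraft?"},
    timeout=30,
)
print(resp.status_code)
print(resp.json())
```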
Observed Behavior: The following error message is logged:

INFO: 127.0.0.1:57421 - "POST /api/suggestions HTTP/1.1" 200 OK
✔ Received query: what's the most valuable item in Minecraft ?
✘ The query retriever result in the window retriever contains an error:
({'locations': [{'column': 6, 'line': 1}], 'message': 'get vector input from modules provider: VectorFromInput was called without vectorizer', 'path': ['Get', 'VERBA_Chunk_OLLAMA']})
ℹ No data found for VERBA_Chunk_OLLAMA.
ℹ Retrieved Context of 0 tokens
✔ Successfully processed query: what's the most valuable item in Minecraft ? in 0.01s
Additional context
Environment:
- OS: macOS (latest version)
- Python: 3.11
- Verba: 1.0.4
- Relevant dependencies: weaviate-client==3.23.1, openai==0.27.9, fastapi==0.102.0, etc.
Can confirm I'm getting the same error: "No Chunks Available" in the UI, and the following in the Docker logs:
2024-07-12 09:51:08 verba-1 | ✔ Received query: Hi
2024-07-12 09:51:08 verba-1 | ✘ The query retriever result in the window retriever contains an error:
2024-07-12 09:51:08 verba-1 | ({'locations': [{'column': 6, 'line': 1}], 'message': 'get vector input from
2024-07-12 09:51:08 verba-1 | modules provider: VectorFromInput was called without vectorizer', 'path':
2024-07-12 09:51:08 verba-1 | ['Get', 'VERBA_Chunk_OLLAMA']})
2024-07-12 09:51:08 verba-1 | ℹ No data found for VERBA_Chunk_OLLAMA.
2024-07-12 09:51:08 verba-1 | ℹ Retrieved Context of 0 tokens
2024-07-12 09:51:08 verba-1 | ✔ Succesfully processed query: Hi in 0.06s
I'm running llama3 locally and can access it at "http://host.docker.internal:11434", so Ollama itself is not the issue.
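A quick way to confirm that from inside the Verba container is to hit Ollama's /api/tags endpoint, which lists the locally pulled models (substitute http://localhost:11434 when running it directly on the host):

```python
# List the models the Ollama server actually has installed.
import requests

tags = requests.get("http://host.docker.internal:11434/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])
# Expect to see the generation model (e.g. "llama3:latest") and, if configured,
# an embedding model such as "mxbai-embed-large:latest" in this list.
```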
Thanks for the issue! I see you're using Ollama. Which model are you using, and did you make sure you've installed it?
I have all models installed. I will go back and check, but I think I followed the instructions to the letter.
Interesting, let me try to reproduce this
Did you ingest data before querying?
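One way to check is to count the objects in the collection the retriever queries (VERBA_Chunk_OLLAMA); a sketch using the v3 weaviate-client from the reporter's dependency list, with the Weaviate URL adjusted to your setup:

```python
# Count ingested chunks directly in Weaviate; a count of 0 would be consistent
# with the "No data found for VERBA_Chunk_OLLAMA" log line above.
import weaviate

client = weaviate.Client("http://localhost:8080")  # adjust to your Weaviate URL
result = client.query.aggregate("VERBA_Chunk_OLLAMA").with_meta_count().do()
print(result)
```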
I get the same error. Here is what I did:
- Installed with Docker.
- Created the .env from the example.
- Commented everything out and uncommented the Ollama lines (left as default; I have Ollama installed with llama3 on the same port).
- Changed the port from 8080 to 8081, as 8080 was already in use.
- Everything installed correctly and reported healthy.
- Opened Verba and set the RAG pipeline to: OllamaEmbedder -> Window Retriever -> Ollama Generator.
- Uploaded a 5-page PDF with only text.
- Clicked chat and asked who the author is.
- Got the error: "No Chunks Available".
- Error in the Docker logs:

✔ Received query: Who is the author of the the report
2024-07-23 21:05:48 ✘ The query retriever result in the window retriever contains an error:
2024-07-23 21:05:48 ({'locations': [{'column': 6, 'line': 1}], 'message': 'get vector input from
2024-07-23 21:05:48 modules provider: VectorFromInput was called without vectorizer', 'path':
2024-07-23 21:05:48 ['Get', 'VERBA_Chunk_OLLAMA']})
2024-07-23 21:05:48 ℹ No data found for VERBA_Chunk_OLLAMA.
2024-07-23 21:05:48 ℹ Retrieved Context of 0 tokens
2024-07-23 21:05:48 ✔ Succesfully processed query: Who is the author of the report
I am on Windows 11, Docker Engine v24.0.2, with Python 3.10 on the PC.
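Since this pipeline uses the Ollama embedder, it may also be worth checking that the embedding endpoint returns a vector for the configured model at all; a small sketch, assuming Ollama on its default port:

```python
# Ask Ollama for an embedding with the model Verba is configured to use.
# An error or an empty vector here would be consistent with chunks ending up
# unsearchable and the retriever reporting no data.
import os
import requests

model = os.environ.get("OLLAMA_EMBED_MODEL", "llama3")  # whatever your setup uses
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": model, "prompt": "hello world"},
    timeout=30,
)
vector = resp.json().get("embedding", [])
print(model, "->", len(vector), "dimensions")
```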
Same here, for Llama3.1 through Ollama. The UI shows "No Chunks Available", and the logs show the same error as mentioned by others:
verba-1 | ✔ Received query: hello
verba-1 | ✘ The query retriever result in the window retriever contains an error:
verba-1 | ({'locations': [{'column': 6, 'line': 1}], 'message': 'get vector input from
verba-1 | modules provider: VectorFromInput was called without vectorizer', 'path':
verba-1 | ['Get', 'VERBA_Chunk_OLLAMA']})
verba-1 | ℹ No data found for VERBA_Chunk_OLLAMA.
verba-1 | ℹ Retrieved Context of 0 tokens
verba-1 | ✔ Succesfully processed query: hello in 0.04s
I was able to get this to work by using mxbai-embed-large for embedding. Should be as simple as 'ollama pull mxbai-embed-large'.
> I was able to get this to work by using mxbai-embed-large for embedding. Should be as simple as 'ollama pull mxbai-embed-large'.
I made both models available, "llama3.1:8b" and "mxbai-embed-large:latest". I am able to import a small PDF file too (I got an error with bigger files, though), but the chat is not working at all, with or without documents.
Okay, I understood the issue: it probably requires the exact model name in the env variables. In my case these are OLLAMA_MODEL=llama3.1:8b and OLLAMA_EMBED_MODEL=mxbai-embed-large:latest.
One out-of-context question: I see it only searches the documents and doesn't fall back to a general answer from llama. For example, if I chat with "hello" and the documents don't cover it, I get no output in the chat. Is that expected behaviour?
> Okay, I understood the issue: it probably requires the exact model name in the env variables. In my case these are OLLAMA_MODEL=llama3.1:8b and OLLAMA_EMBED_MODEL=mxbai-embed-large:latest.
This works for me! Thank you!
I am getting this issue too, with Ollama and Verba in Docker on Windows. Same errors as others are getting; it's very frustrating to see the "No Chunks Available" message. Is there any other way to check the setup to trace where the error is originating?
✘ The query retriever result in the window retriever contains an error:
2024-08-11 14:55:20 verba-1 | ({'locations': [{'column': 6, 'line': 1}], 'message': 'get vector input from
2024-08-11 14:55:20 verba-1 | modules provider: VectorFromInput was called without vectorizer', 'path':
2024-08-11 14:55:20 verba-1 | ['Get', 'VERBA_Chunk_OLLAMA']})
2024-08-11 14:55:20 verba-1 | ℹ No data found for VERBA_Chunk_OLLAMA.
2024-08-11 14:55:20 verba-1 | ℹ Retrieved Context of 0 tokens
2024-08-11 14:55:20 verba-1 | ✔ Succesfully processed query: hello in 0.02s
here is my .env file (sorry for the big text, it's doing it automatically):

# URL-TO-YOUR-WEAVIATE-CLUSTER
WEAVIATE_URL_VERBA=localhost:8080

# API-KEY-OF-YOUR-WEAVIATE-CLUSTER
WEAVIATE_API_KEY_VERBA=

# YOUR-OPENAI-KEY
OPENAI_API_KEY=

# YOUR-BASE-URL
OPENAI_BASE_URL=http://0.0.0.0:8000

# YOUR-COHERE-KEY
COHERE_API_KEY=

# YOUR-UNSTRUCTURED-KEY
UNSTRUCTURED_API_KEY=
UNSTRUCTURED_API_URL=

# YOUR-GITHUB-TOKEN
GITHUB_TOKEN=
GITLAB_TOKEN=

OLLAMA_URL=http://localhost:3000
OLLAMA_MODEL=llama3.1:8b
OLLAMA_EMBED_MODEL=mxbai-embed-large:latest

# GOOGLE ENVIRONMENT VARIABLES
GOOGLE_APPLICATION_CREDENTIALS=
GOOGLE_CLOUD_PROJECT=
GOOGLE_API_KEY=
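One way to trace where the error originates is to check the values in this .env against the running Ollama instance; a rough sketch (note that Ollama's default port is 11434, so OLLAMA_URL=http://localhost:3000 is worth double-checking unless Ollama was deliberately bound to port 3000):

```python
# Compare the configured Ollama URL and model tags against what the server reports.
import os
import requests

ollama_url = os.environ.get("OLLAMA_URL", "http://localhost:11434")
wanted = {os.environ.get("OLLAMA_MODEL", ""), os.environ.get("OLLAMA_EMBED_MODEL", "")}

installed = {
    m["name"]
    for m in requests.get(f"{ollama_url}/api/tags", timeout=10).json().get("models", [])
}
print("installed:", installed)
print("missing:", {w for w in wanted if w and w not in installed})
```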
Hi @thomashacker
I am using OpenAI. My chunks get retrieved, but in the UI I am not getting any answer; it only says "generating response".

INFO: Application startup complete.
INFO: 127.0.0.1:56128 - "POST /api/suggestions HTTP/1.1" 200 OK
✔ Received query: What is the patient age?
ℹ Retrieved Context of 1034 tokens
✔ Succesfully processed query: What is the patient age? in 1.94s
INFO: 127.0.0.1:56128 - "POST /api/query HTTP/1.1" 200 OK
ℹ Document ID received: 52d05839-c8f0-4cde-b4df-ddf9987e18ec
✔ Succesfully retrieved document: 52d05839-c8f0-4cde-b4df-ddf9987e18ec
INFO: 127.0.0.1:56128 - "POST /api/get_document HTTP/1.1" 200 OK
hey all,
got the same problem... always showing "No Chunks Available"
did anyone find a fitting solution? The solution quoted below isn't working for me :/
thanks for your help already...
> Okay, I understood the issue: it probably requires the exact model name in the env variables. In my case these are OLLAMA_MODEL=llama3.1:8b and OLLAMA_EMBED_MODEL=mxbai-embed-large:latest.
I experienced the same issue when following the same steps as OP.
Here are the config variables that worked for me (while following OP's steps). I added these directly in the docker-compose.yml file:
- WEAVIATE_URL_VERBA=http://weaviate:8080
- OLLAMA_URL=http://host.docker.internal:11434
- OLLAMA_MODEL=llama3.1:8b
- OLLAMA_EMBED_MODEL=mxbai-embed-large
> hey all, got the same problem... always showing "No Chunks Available". Did anyone find a fitting solution?
Go to goldenverba/components/embedding/OllamaEmbedder.py and change this function (change json_data.get("embedding", []) to json_data.get("embeddings", [])):
def vectorize_chunk(self, chunk) -> list[float]:
    try:
        embeddings = []
        embedding_url = self.url + "/api/embeddings"
        data = {"model": self.model, "prompt": chunk}
        response = requests.post(embedding_url, json=data)
        json_data = json.loads(response.text)
        embeddings = json_data.get("embedding", [])
        return embeddings
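For context, Ollama's /api/embeddings endpoint has historically returned the vector under a singular "embedding" key, while the newer /api/embed endpoint returns a plural "embeddings" list, so which key appears seems to depend on the Ollama version and endpoint in use. A defensive sketch (a hypothetical helper, not the project's actual code) that accepts either shape:

```python
import json
import requests

def vectorize_chunk_either_key(url: str, model: str, chunk: str) -> list[float]:
    """Hypothetical helper: accept both the singular and plural response keys."""
    response = requests.post(f"{url}/api/embeddings", json={"model": model, "prompt": chunk})
    json_data = json.loads(response.text)
    if "embedding" in json_data:  # classic /api/embeddings response shape
        return json_data["embedding"]
    embeddings = json_data.get("embeddings", [])  # shape used by the newer /api/embed
    return embeddings[0] if embeddings else []
```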