[BUG] Cannot create 'Knowledge': Failed to create or get collection
Description
I am following the documentation here: https://docs.crewai.com/concepts/knowledge#what-is-knowledge
Steps to Reproduce
crew = Crew(
    tasks=[task],
    process=Process.sequential,
    verbose=True,
    knowledge_sources=[
        PDFKnowledgeSource(
            file_paths=[local_path],
            chunk_size=1000,  # Chunk size for processing
            chunk_overlap=100,  # Overlap between chunks to preserve context
            metadata={
                'source': 'Information about the clinic',
                'description': 'Document with the clinic processes and procedures'
            }
        )
    ],
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "dimensions": 256
        }
    }
)
When I create a Crew or Agent with the knowledge_sources parameter and then run the agent, I get the following error:
File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
self._set_knowledge()
File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
self._knowledge = Knowledge(
^^^^^^^^^^
File "<path>crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
self.storage.initialize_knowledge_storage()
File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection
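The traceback above ends in a generic Exception, so the original cause is not visible. A minimal sketch of how one might print the full exception chain when reproducing this, assuming the underlying error is still attached to the raised exception (not confirmed for crewai). The function fake_initialize_storage is a hypothetical stand-in, not crewai code:

```python
def exception_chain(exc: BaseException) -> list:
    """Return exc followed by every exception it was raised from or during."""
    chain = []
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        chain.append(current)
        seen.add(id(current))
        # __cause__ is set by "raise ... from ..."; __context__ is set
        # implicitly when an exception is raised inside an except block.
        current = current.__cause__ or current.__context__
    return chain


def fake_initialize_storage() -> None:
    """Hypothetical stand-in for a storage layer that re-raises a generic error."""
    try:
        raise ValueError("underlying storage error (made up for this sketch)")
    except ValueError:
        raise Exception("Failed to create or get collection")


if __name__ == "__main__":
    try:
        fake_initialize_storage()
    except Exception as exc:
        for depth, err in enumerate(exception_chain(exc)):
            print(f"{depth}: {type(err).__name__}: {err}")
```

Wrapping the real crew construction in the same try/except would show whether a more specific error sits underneath the generic message.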
Expected behavior
That the knowledge_source was created and could be accessed by the agent or Crew.
Screenshots/Code snippets
See above.
Operating System
macOS Monterey
Python Version
3.11
crewAI Version
0.100.1
crewAI Tools Version
None
Virtual Environment
Venv
Evidence
File "
Possible Solution
I don't know.
Additional context
Same error: https://github.com/crewAIInc/crewAI/issues/1859#issuecomment-2601030610
I believe this has been fixed. Can you try updating your crewai version? See PR #2055.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Hi @AlemaoEc, did this work for you? Could you check? Thanks.
Hi! It didn't work. I migrated the solution to another library.
Same issue here. When I put the knowledge_sources inside the agent, it gives me the same error, but when I put the knowledge_sources inside the crew, I get no error.
Hi @GabrielBoninUnity, can you confirm you are using the latest version of crewai?
Also, if possible, please share your code and I will try to reproduce the bug. Thanks.
Hey @Vidit-Ostwal, it looks like I got it fixed like this:
import os

from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task
from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

@CrewBase
class newsletter:
    """newsletter crew"""

    agents_config = "config/agents.yaml"
    tasks_config = "config/tasks.yaml"

    GOOGLE_CLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT")
    GOOGLE_CLOUD_LOCATION = os.getenv("GOOGLE_CLOUD_LOCATION")
    CLAUDE_MODEL = os.getenv("CLAUDE_MODEL")

    # Set up LLM using CrewAI's built-in Vertex AI integration
    llm = LLM(
        model=f"vertex_ai/{CLAUDE_MODEL}",
        vertex_ai_project=GOOGLE_CLOUD_PROJECT,
        vertex_ai_location=GOOGLE_CLOUD_LOCATION,
        temperature=0.7,
        credentials=None  # Uses default Google Cloud credentials
    )

    csv_source = CSVKnowledgeSource(
        file_paths=["data.csv"],
    )

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config["researcher"],
            verbose=True,
            llm=self.llm,
            knowledge_sources=[self.csv_source],
        )

    @task
    def research_task(self) -> Task:
        return Task(
            config=self.tasks_config["research_task"],
            output_file="outputs/research_results.md"
        )

    @crew
    def crew(self) -> Crew:
        """Creates the newsletter crew"""
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=True,
            llm=self.llm
        )
Hi @GabrielBoninUnity, can you explain what exactly you changed?
I added the llm code like this:
    llm = LLM(
        model=f"vertex_ai/{CLAUDE_MODEL}",
        vertex_ai_project=GOOGLE_CLOUD_PROJECT,
        vertex_ai_location=GOOGLE_CLOUD_LOCATION,
        temperature=0.7,
        credentials=None  # Uses default Google Cloud credentials
    )
and also passed it to the agent, like this:
    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config["researcher"],
            verbose=True,
            llm=self.llm,
            knowledge_sources=[self.csv_source],
        )
I am having the same issue. @GabrielBoninUnity, I don't understand what you did to make it work. Here is a snippet from my code:
llm = LLM(
    model="gpt-4o-mini",
    temperature=0.8,  # creativity, 0 is precise
    max_tokens=150,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    stop=["END"],
    seed=42
)

def __init__(self, role: str = None, name: str = None, interviewing_time: float = 0.0):
    self.agents_config
    self.tasks_config
    self.role = role
    self.name = name
    self.interviewing_time = interviewing_time
    self.company_docs
    self.llm

@agent
def cv_fit_assessor(self) -> Agent:
    return Agent(
        config=self.agents_config["cv_fit_assessor"],
        verbose=True,
        llm=self.llm,
        tools=[process_notifier_tool],
        knowledge_sources=[self.company_docs],
    )
If I remove knowledge_sources = [self.company_docs] from the Agent, the Agent does not seem to retrieve the data from the company docs. If I leave it there, I get this error: "An error occurred while running the crew: Failed to create or get collection". I have already upgraded to the latest crewai version.
Hey @ramonabordea, could you provide the entire crew.py code? It's a bit hard for me to help with only this code snippet.
Hey! I have a simple crew that gives me this issue:
from crewai import Agent, Task, LLM, Process, Crew
from crewai.project import CrewBase, agent, task, crew
from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource

@CrewBase
class bug():
    """bug crew"""

    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    some_source = PDFKnowledgeSource(
        file_paths=["aRandomPdf.pdf"]
    )

    llm = LLM(
        model="gpt-4o-mini",
        temperature=0.8,
        max_tokens=150,
        top_p=0.9,
        frequency_penalty=0.1,
        presence_penalty=0.1,
        stop=["END"],
        seed=123
    )

    @agent
    def some_random_agent(self) -> Agent:
        return Agent(
            config=self.agents_config["some_random_agent"],
            knowledge_sources=[self.some_source]
        )

    @task
    def some_random_task(self) -> Task:
        return Task(
            config=self.tasks_config["some_random_task"],
            output_file="Joke.md"
        )

    @crew
    def crew(self) -> Crew:
        """creates the bug crew"""
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            llm=self.llm,
            process=Process.sequential
        )

if __name__ == "__main__":
    print("start test")
    crew_inst = bug()
    crew_inst.crew().kickoff()
    print("end, no exception")
And here is the agents.yaml file, where it looks like the role is the culprit:
some_random_agent:
  alias: "jokester"
  role: >
    This agent's role is to break the knowledge sources in the crew AI bug report, this should do it
  goal: >
    You have to tell a funny joke.
  backstory: >
    You are an AI agent that must entertain developers.
  process: >
    Write a joke in the output file Joke.md
  expected_output: >
    A joke in file Joke.md
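If the role really is the culprit, one possible explanation is that the collection name is derived from the role and the vector store rejects it. This is only a guess: nothing in this thread confirms how the name is built or which rules apply. The sketch below assumes ChromaDB-style naming limits (3 to 63 characters, letters, digits, '.', '_' and '-' only, starting and ending with a letter or digit), and role_to_name is a hypothetical derivation used purely for illustration:

```python
import re

# Assumed ChromaDB-style constraints; not confirmed anywhere in this thread.
MIN_LEN, MAX_LEN = 3, 63
ALLOWED = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$")


def role_to_name(role: str) -> str:
    """Hypothetical derivation: trim the role and replace spaces with underscores."""
    return role.strip().replace(" ", "_")


def name_problems(name: str) -> list:
    """List the assumed naming rules that the given name violates."""
    problems = []
    if not MIN_LEN <= len(name) <= MAX_LEN:
        problems.append(f"length {len(name)} is outside {MIN_LEN}-{MAX_LEN}")
    if not ALLOWED.match(name):
        problems.append(
            "must start and end with a letter or digit and contain only "
            "letters, digits, '.', '_' or '-'"
        )
    if ".." in name:
        problems.append("contains two consecutive dots")
    return problems


ROLE = (
    "This agent's role is to break the knowledge sources in the "
    "crew AI bug report, this should do it"
)

if __name__ == "__main__":
    for candidate in (ROLE, "jokester"):
        issues = name_problems(role_to_name(candidate))
        print(candidate[:30], "->", issues or "ok")
```

Under these assumptions the long role fails on both length and characters (apostrophe, comma), while a short role such as "jokester" passes, which would match the behaviour described above.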
Could this issue be reopened? The agent documentation doesn't give any hint that this should be a problem: https://docs.crewai.com/guides/agents/crafting-effective-agents The examples there use pretty verbose text.
@lucasgomide can we re-open this, please? I will try to replicate it.
Hey @Vidit-Ostwal! Did you manage to replicate this issue?
@mariaS210, can you confirm whether you still face the same issue with the latest crewai version, 0.117.0?
@mariaS210 I was not able to reproduce this using your crew. I created a random task.yaml file:
some_random_task:
  description: >
    This is a random task
  expected_output: >
    A random output
  agent: some_random_agent
Here is the output
$ python issue-2055.py
> CropBox missing from /Page, defaulting to MediaBox
> CropBox missing from /Page, defaulting to MediaBox
> start test
> end, no exception
Would you mind sharing the crewai version and also any logs or errors you’re encountering?
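For anyone answering the version question, here is a small sketch that prints the interpreter version and the installed crewai packages from inside the same virtual environment. It uses only the standard library; the package names "crewai" and "crewai-tools" are assumed to be the distribution names:

```python
import platform
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("crewai", "crewai-tools")) -> dict:
    """Collect the details usually requested in a bug report."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = installed_version(name) or "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it with the interpreter from the project's .venv avoids mismatches between the reported Python version and the one that actually raised the error.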
This issue was closed because it has been stalled for 5 days with no activity.