[BUG] Cannot create 'Knowledge': Failed to create or get collection

Open AlemaoEc opened this issue 11 months ago • 18 comments

Description

I am following the documentation here: https://docs.crewai.com/concepts/knowledge#what-is-knowledge

Steps to Reproduce

crew = Crew(
    tasks=[task],
    process=Process.sequential,
    verbose=True,
    knowledge_sources=[
        PDFKnowledgeSource(
            file_paths=[local_path],
            chunk_size=1000,  # Chunk size used for processing
            chunk_overlap=100,  # Overlap between chunks to preserve context
            metadata={
                'source': 'Information about the clinic',
                'description': 'Document describing clinic processes and procedures'
            }
        )
    ],
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "dimensions": 256
        }
    }
)

When I create a Crew or Agent with the knowledge_sources parameter and then run the agent, I get the following error:

  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
    self._set_knowledge()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
    self._knowledge = Knowledge(
                      ^^^^^^^^^^
  File "<path>crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
    self.storage.initialize_knowledge_storage()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
    raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection
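
The final frame only shows the generic message, so the underlying reason is not visible in this traceback. Below is a minimal sketch of one way to surface it, assuming the generic exception is raised while a lower-level error is being handled (in that case Python keeps the original error in `__context__`). The `initialize_knowledge_storage` function here is a hypothetical stand-in, not crewAI's code, and `root_cause` is a helper name invented for illustration.

```python
def initialize_knowledge_storage():
    """Hypothetical stand-in that mimics a wrapper hiding a lower-level error."""
    try:
        # Pretend the vector store rejected the request for some reason.
        raise ValueError("invalid collection name")
    except Exception:
        # Raising inside `except` keeps the original error in __context__.
        raise Exception("Failed to create or get collection")


def root_cause(exc: BaseException) -> BaseException:
    """Follow __cause__ / __context__ links back to the first exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        previous = exc.__cause__ or exc.__context__
        if previous is None:
            break
        exc = previous
    return exc


try:
    initialize_knowledge_storage()
except Exception as err:
    cause = root_cause(err)
    print(f"{type(cause).__name__}: {cause}")  # prints "ValueError: invalid collection name"
```

Wrapping the failing `Crew(...)` or `Agent(...)` construction in the same `try`/`except` and printing `root_cause(err)` may show what the storage layer actually rejected, if the library chains its exceptions this way.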

Expected behavior

The knowledge source should be created and be accessible to the Agent or Crew.

Screenshots/Code snippets

See above.

Operating System

macOS Monterey

Python Version

3.11

crewAI Version

0.100.1

crewAI Tools Version

None

Virtual Environment

Venv

Evidence

  File "/crewai_support_agent/.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
    self._set_knowledge()
  File "/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
    self._knowledge = Knowledge(
                      ^^^^^^^^^^
  File "crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
    self.storage.initialize_knowledge_storage()
  File "/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
    raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection

Possible Solution

I don't know.

Additional context

Same error: https://github.com/crewAIInc/crewAI/issues/1859#issuecomment-2601030610

AlemaoEc avatar Feb 07 '25 13:02 AlemaoEc

I guess this has been fixed. Can you try updating your crewAI version? See PR #2055.

Vidit-Ostwal avatar Feb 07 '25 18:02 Vidit-Ostwal

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Mar 10 '25 12:03 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Mar 16 '25 12:03 github-actions[bot]

Hi @AlemaoEc, did this work for you? Could you check? Thanks.

Vidit-Ostwal avatar Mar 16 '25 12:03 Vidit-Ostwal

Hi @AlemaoEc, did this work for you? Could you check? Thanks.

Hi! It didn't work. I migrated the solution to another library.

AlemaoEc avatar Mar 17 '25 19:03 AlemaoEc

Same issue here. When I put the knowledge_sources inside the agent, it gives me the same error, but when I put the knowledge_sources inside the crew, I get no error.

GabrielBoninUnity avatar Mar 20 '25 20:03 GabrielBoninUnity

Hi @GabrielBoninUnity, can you confirm you are using the latest version of crewai?

Also, if possible, please share your code and I will try to reproduce the bug. Thanks.

Vidit-Ostwal avatar Mar 21 '25 02:03 Vidit-Ostwal

Hey @Vidit-Ostwal, it looks like I got it fixed like this:

import os

from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task
from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

@CrewBase
class newsletter:
    """newsletter crew"""

    agents_config = "config/agents.yaml"
    tasks_config = "config/tasks.yaml"

    GOOGLE_CLOUD_PROJECT = os.getenv("GOOGLE_CLOUD_PROJECT")
    GOOGLE_CLOUD_LOCATION = os.getenv("GOOGLE_CLOUD_LOCATION")
    CLAUDE_MODEL = os.getenv("CLAUDE_MODEL")

    # Set up LLM using CrewAI's built-in Vertex AI integration
    llm = LLM(
        model=f"vertex_ai/{CLAUDE_MODEL}",
        vertex_ai_project=GOOGLE_CLOUD_PROJECT,
        vertex_ai_location=GOOGLE_CLOUD_LOCATION,
        temperature=0.7,
        credentials=None  # Uses default Google Cloud credentials
    )

    csv_source = CSVKnowledgeSource(
        file_paths=["data.csv"],
    )

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config["researcher"],
            verbose=True,
            llm=self.llm,
            knowledge_sources=[self.csv_source],
        )

    @task
    def research_task(self) -> Task:
        return Task(
            config=self.tasks_config["research_task"],
            output_file="outputs/research_results.md"
        )

    @crew
    def crew(self) -> Crew:
        """Creates the newsletter crew"""

        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=True,
            llm=self.llm
        )

GabrielBoninUnity avatar Mar 21 '25 14:03 GabrielBoninUnity

Hi @GabrielBoninUnity, can you explain what exactly you changed?

Vidit-Ostwal avatar Mar 21 '25 15:03 Vidit-Ostwal

I added the llm code like this:

llm = LLM(
    model=f"vertex_ai/{CLAUDE_MODEL}",
    vertex_ai_project=GOOGLE_CLOUD_PROJECT,
    vertex_ai_location=GOOGLE_CLOUD_LOCATION,
    temperature=0.7,
    credentials=None  # Uses default Google Cloud credentials
)

and also tagged the agent with it, like this:

@agent
def researcher(self) -> Agent:
    return Agent(
        config=self.agents_config["researcher"],
        verbose=True,
        llm=self.llm,
        knowledge_sources=[self.csv_source],
    )

GabrielBoninUnity avatar Mar 21 '25 15:03 GabrielBoninUnity

I am having the same issue. @GabrielBoninUnity, I don't understand what you did to make it work. Here is a snippet from my code:

llm = LLM(
    model="gpt-4o-mini",
    temperature=0.8,  # creativity; 0 is precise
    max_tokens=150,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    stop=["END"],
    seed=42
)

def __init__(self, role:str = None, name:str = None, interviewing_time:float = 0.0):
	self.agents_config 
	self.tasks_config   
	self.role = role
	self.name = name
	self.interviewing_time = interviewing_time
	self.company_docs 
	self.llm
	


@agent
def cv_fit_assessor(self) -> Agent:
	return Agent(
		config=self.agents_config["cv_fit_assessor"],
		verbose=True,
		llm=self.llm,
		tools=[process_notifier_tool],
		knowledge_sources = [self.company_docs],
	)

If I remove knowledge_sources = [self.company_docs] from the Agent, it seems that the Agent does not retrieve the data from the company docs. If I leave it there, I get this error: An error occurred while running the crew: Failed to create or get collection. I have already upgraded to the latest crewAI version.

ramonabordea avatar Mar 26 '25 13:03 ramonabordea

Hey @ramonabordea, could you provide the entire crew.py code? It's a bit hard for me to help you out with only this code snippet.

GabrielBoninUnity avatar Mar 26 '25 16:03 GabrielBoninUnity

Hey! I have a simple crew that gives me this issue:

from crewai import Agent, Task, LLM, Process, Crew
from crewai.project import CrewBase, agent, task, crew
from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource


@CrewBase
class bug():
    """bug crew"""
    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'
    some_source = PDFKnowledgeSource(
        file_paths=["aRandomPdf.pdf"]
    )

    llm = LLM(
        model="gpt-4o-mini",
        temperature=0.8,
        max_tokens=150,
        top_p=0.9,
        frequency_penalty=0.1,
        presence_penalty=0.1,
        stop=["END"],
        seed=123
    )

    @agent
    def some_random_agent(self) -> Agent:
        return Agent(
            config=self.agents_config["some_random_agent"],
            knowledge_sources=[self.some_source]
        )

    @task
    def some_random_task(self) -> Task:
        return Task(
            config=self.tasks_config["some_random_task"],
            output_file="Joke.md"
        )
    
    @crew
    def crew(self) -> Crew:
        """creates the bug crew"""
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            llm=self.llm,
            process=Process.sequential
        )

if __name__ == "__main__":
    print("start test")
    crew_inst = bug()
    crew_inst.crew().kickoff()
    print("end, no exception")

and the agents.yaml file where it looks like the role is the culprit:

some_random_agent:
  alias: "jokester"

  role: >
    This agent's role is to break the knowledge sources in the crew AI bug report, this should do it
  
  goal: >
    You have to tell a funny joke.
  backstory: >
    You are an AI agent that must entertain developers.
  process: >
    Write a joke in the output file Joke.md
  expected_output: >
    A joke in file Joke.md

Could this issue be reopened? The agent documentation (https://docs.crewai.com/guides/agents/crafting-effective-agents) gives no hint that this should be a problem, and the examples there use pretty verbose text.
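
One way to test the suspicion about the role is to check whether the role text would be acceptable as a collection name. The sketch below rests on two assumptions that the thread does not confirm: that the role text ends up in the collection name, and that the backing vector store enforces ChromaDB-style naming rules (3 to 63 characters, alphanumeric at both ends, only alphanumerics, underscores, hyphens, or dots in between, no consecutive dots). Both function names are hypothetical helpers, not part of crewAI.

```python
import re

# Assumed rules, modeled on ChromaDB-style collection naming constraints:
# 3-63 characters, alphanumeric at both ends, and only alphanumerics,
# underscores, hyphens, or dots in between (no consecutive dots).
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{1,61}[A-Za-z0-9]")


def is_valid_collection_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed naming rules."""
    return bool(_NAME_RE.fullmatch(name)) and ".." not in name


def sanitize_collection_name(role: str, max_len: int = 63) -> str:
    """Hypothetical helper: turn free-form role text into a safe name."""
    # Replace every run of disallowed characters with one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", role.strip())
    # Trim to the length limit, then strip non-alphanumeric edges.
    cleaned = cleaned[:max_len].strip("_-")
    # Pad very short results so the minimum length is met.
    return cleaned if len(cleaned) >= 3 else (cleaned + "xxx")[:3]


# The folded YAML scalar (`>`) leaves a trailing newline on the role text.
role = (
    "This agent's role is to break the knowledge sources "
    "in the crew AI bug report, this should do it\n"
)
print(is_valid_collection_name(role))  # prints "False"
print(is_valid_collection_name(sanitize_collection_name(role)))  # prints "True"
```

If the role from agents.yaml fails this check while a short role such as the alias passes, that would be consistent with the role being the trigger, though it would not prove it.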

mariaS210 avatar Mar 28 '25 20:03 mariaS210

@lucasgomide, can we reopen this, please? I will try to replicate it.

Vidit-Ostwal avatar Mar 28 '25 20:03 Vidit-Ostwal

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Apr 30 '25 12:04 github-actions[bot]

Hey @Vidit-Ostwal ! Did you manage to replicate this issue?

mariaS210 avatar Apr 30 '25 12:04 mariaS210

@mariaS210, can you confirm whether you still face the same issue with the latest crewai version, 0.117.0?

Vidit-Ostwal avatar Apr 30 '25 12:04 Vidit-Ostwal

@mariaS210 I was not able to reproduce that using your crew. I created a random tasks.yaml file:

some_random_task:
  description: >
    This is a random task
  expected_output: >
    A random output
  agent: some_random_agent

Here is the output

$ python issue-2055.py
> CropBox missing from /Page, defaulting to MediaBox
> CropBox missing from /Page, defaulting to MediaBox
> start test
> end, no exception

Would you mind sharing the crewai version and also any logs or errors you’re encountering?

lucasgomide avatar Apr 30 '25 18:04 lucasgomide

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Jun 01 '25 12:06 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Jun 06 '25 12:06 github-actions[bot]