crewAI icon indicating copy to clipboard operation
crewAI copied to clipboard

I am facing an issue with PDFSearchTool using Azure OpenAI model

Open buntys2010 opened this issue 1 year ago • 13 comments

Hi,

When i am using below code

config=dict(
llm=dict(
provider="azure_openai", # or google, openai, anthropic, llama2, ...
config=dict(
    model ="gpt-35-turbo-16k",
deployment_name="vanilla-gpt-35-turbo-16k",
api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
),
),
embedder=dict(
provider="azure_openai", # or openai, ollama, ...
config=dict(
model="text-embedding-3-small",
deployment_name="",
api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
),
),
)
)

I am getting error

     69     "OPENAI_API_BASE"
     70 )
     71 values["openai_api_version"] = values["openai_api_version"] or os.getenv(
     72     "OPENAI_API_VERSION", default="2023-05-15"
     73 )
     74 values["openai_api_type"] = get_from_dict_or_env(
     75     values, "openai_api_type", "OPENAI_API_TYPE", default="azure"
     76 )

KeyError: 'openai_api_base'```

I have already set this env variable

os.environ['AZURE_OPENAI_ENDPOINT'] ='base url'

buntys2010 avatar Jul 23 '24 12:07 buntys2010

how are you using the dict with pdfsearchtool

theCyberTech avatar Jul 23 '24 13:07 theCyberTech

Sorry i didn't get your question, i got this reference from this link https://github.com/crewAIInc/crewAI/issues/541

Here is a complete code

config=dict(
llm=dict(
provider="azure_openai", # or google, openai, anthropic, llama2, ...
config=dict(
    model ="gpt-35-turbo-16k",
deployment_name="---",
api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
),
),
embedder=dict(
provider="azure_openai", # or openai, ollama, ...
config=dict(
model="text-embedding-3-small",
deployment_name="---",
api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
),
),
)
)```
How do i use PDFSearchTool with Azure?

buntys2010 avatar Jul 23 '24 13:07 buntys2010

config = dict( llm=dict( provider="azure_openai", config=dict( model="gpt-35-turbo-16k", deployment_name="---", api_key=os.environ.get("AZURE_OPENAI_API_KEY"), ), ), embedder=dict( provider="azure_openai", config=dict( model="text-embedding-3-small", deployment_name="---", api_key=os.environ.get("AZURE_OPENAI_API_KEY"), ), ), )

Is your deployment name intentionally 3 dashes?

theCyberTech avatar Jul 23 '24 14:07 theCyberTech

Also please confirm what version of crewai & crew-tools you are using

theCyberTech avatar Jul 23 '24 14:07 theCyberTech

No i have removed the correct deployment name, my versions are as below:

CrewAI - 0.36.1 CrewAI tools - 0.4.26

buntys2010 avatar Jul 23 '24 15:07 buntys2010

Hello @buntys2010 , I was able to get around this with the following change in embed chain > embedder > azure_openai.py

#from langchain_community.embeddings import AzureOpenAIEmbeddings
from langchain_openai import AzureOpenAIEmbeddings

supernitin avatar Jul 23 '24 17:07 supernitin

Can you upgrade to latest version

crewai == 0.41.1

theCyberTech avatar Jul 24 '24 00:07 theCyberTech

Hello @buntys2010 , I was able to get around this with the following change in embed chain > embedder > azure_openai.py

#from langchain_community.embeddings import AzureOpenAIEmbeddings
from langchain_openai import AzureOpenAIEmbeddings

This works for me! thanks.

buntys2010 avatar Jul 24 '24 05:07 buntys2010

Can you upgrade to latest version

crewai == 0.41.1

Upgrading to 0.41.1 still doesnt work

buntys2010 avatar Jul 24 '24 05:07 buntys2010

Forgive the detail on this but I've been battling this issue for days now...

When we install the latest versions of packages:

(with python 3.10.13)
crewai==0.41.1
crewai-tools==0.4.26
langchain-community==0.2.10

the below test code fails to run, using azure open ai service as the embedder:

import os
from crewai_tools import TXTSearchTool

os.environ["AZURE_OPENAI_DEPLOYMENT"] = "my-valid-deployment-name"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://my-valid-endpoint.openai.azure.com/"
os.environ["AZURE_OPENAI_KEY"] = os.environ.get("AZURE_OPENAI_KEY_VALUE")
os.environ["OPENAI_API_KEY"] = os.environ.get("AZURE_OPENAI_KEY_VALUE")

AZURE_EMBEDDING_MODEL_NAME = 'my-azure-embedding-deployment-model-name'
AZURE_EMBEDDING_DEPLOYMENT_NAME = 'my-azure-embedding-deployment-name'

read_tool = TXTSearchTool(
    txt="/workspaces/AI-Agents/crewai/mydetails.txt",
    config= {
        "llm": {
            "provider": "azure_openai",
            "config": {
                "api_key": os.environ.get("AZURE_OPENAI_KEY"),
                "model": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
                "deployment_name": os.environ.get("AZURE_OPENAI_DEPLOYMENT")
            }
        },
        "embedder": {
            "provider": "azure_openai",
            "config": {
                "model": AZURE_EMBEDDING_MODEL_NAME,
                "deployment_name": AZURE_EMBEDDING_DEPLOYMENT_NAME
            }
        }
    }
)

and it fails with the following error:

  File "/home/codespace/.python/current/lib/python3.10/site-packages/langchain_community/embeddings/azure_openai.py", line 68, in validate_environment
    values["openai_api_base"] = values["openai_api_base"] or os.getenv(
KeyError: 'openai_api_base'

It was last working with crewai==0.35 but the move to any higher version of crewai breaks Azure Embedders due to langchain_community downstream package as crewai pyproject.toml: was updated to go from langchain = "^0.1.10" to langchain = ">0.2,<=0.3".

So again, this is kinda not directly crewai code but langchain_community - if i update to the latest crewai version, but keep the working ancient langchain version, it works (i.e. crewai==0.41.1 crewai-tools==0.4.26 langchain-community==0.0.38). But if i update langchain to anything like 0.2.10, it breaks. I found 7 months ago, the langchain community seem to have put in some "validation method" which hard looks for values["openai_api_base"] then blows up as it isn't passed in the crew ai tools set up.

Even trying to add the "openai_api_base" parameter to the TXTSearchTool.config.embedder code in crewai errors with "As of openai>=1.0.0, Azure endpoints should be specified via the azure_endpoint param not openai_api_base (or alias base_url). (type=value_error)".

How we fix this as this is affecting anyone using crewai tools with custom AzureOpenAI embeddings?

@supernitin is right that if we manually hack the embed chain > embedder > azure_openai.py file changing from langchain_community.embeddings import AzureOpenAIEmbeddings to from langchain_openai import AzureOpenAIEmbeddings it works. But this is a not a sustainable solution as it will break on any future updates to the langchain_community package.

I think someone (@joaomdmoura / @lorenzejay ?) needs to go through all the crewai packages and find where langchain_community.embeddings import AzureOpenAIEmbeddings is referenced and update it to use the langchain_openai throughout the codebase?

(finally thanks for reading and crewAI is great!)

MrSimonC avatar Jul 26 '24 22:07 MrSimonC

Hi Crew AI Team,

I just wanted to follow up on the issue I raised earlier about the compatibility problem between Crew AI and langchain-community. After digging deeper, I found that the root cause of the issue lies in the embedchain package, which is used by crewai-tools.

To recap, the issue is related to the deprecated AzureOpenAIEmbeddings class in langchain-community, which is causing the error when using Azure Open AI service as the embedder. The dependency chain that leads to this issue is:

Crew 0.41.1 -> Crewaitools 0.4.26 -> depends on embedchain = "^0.1.114" -> langchain = ">0.2,<=0.3" and langchain-community = "^0.2.6"

I've raised an issue with the embedchain package team, and you can track the progress here: https://github.com/mem0ai/mem0/issues/1597 I've asked that they remove the decremented reference and update to the working replacement.

If they do do the work, and then merge, if they release within the same minor version e.g. 0.1.x, then, in theory, the crew AI tools, using "^0.1.114" as the dependency, should in theory automatically update and things should start working without any code-change.

MrSimonC avatar Jul 27 '24 21:07 MrSimonC

#1597 has been closed now and I can confirm that (at least) Azure Embeddings / Azure LLMs work again on:

  • crewai==0.41.1
  • crewai-tools==0.4.26
  • langchain-community==0.2.11

MrSimonC avatar Aug 11 '24 14:08 MrSimonC

@supernitin thank you for sharing that solution. I'm now running into a similar problem. Now the agent is looking for an OpenAI API Key. How did you instruct your agent to use Auzre OpenAI API? Thanks.

ValidationError: 1 validation error for ChatOpenAI root Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

ecocarlisle avatar Aug 20 '24 00:08 ecocarlisle

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Dec 12 '24 06:12 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Dec 18 '24 12:12 github-actions[bot]

For those searching how to solve using PDFSearchTool with Azure OpenAI, you are probably having problems with conflicting environmental variables. The solution that worked for me was this:

# Check and remove conflicting environment variable
if "OPENAI_API_BASE" in os.environ:
    del os.environ["OPENAI_API_BASE"]

if "OPENAI_API_KEY" in os.environ:
    del os.environ["OPENAI_API_KEY"]

savesantos avatar Mar 24 '25 22:03 savesantos