Generating KB embeddings does not work if AZURE_OPENAI_API_KEY is set
Generating the KB embeddings with
python dataset_generation/generate_kb_embeddings.py --dataset_path datasets/enron.json --output_path datasets --model_name text-embedding-3-small
while the environment variable AZURE_OPENAI_API_KEY is set to a valid value causes the following error:
File "/home/fokus/miniforge3/envs/kblam/lib/python3.13/site-packages/openai/_base_client.py", line 919, in request return self._request( ~~~~~~~~~~~~~^ cast_to=cast_to, ^^^^^^^^^^^^^^^^ ...<3 lines>... retries_taken=retries_taken, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ) ^ File "/home/fokus/miniforge3/envs/kblam/lib/python3.13/site-packages/openai/_base_client.py", line 1023, in _request raise self._make_status_error_from_response(err.response) from None openai.AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (urn:ms.scopedToken or urn:ms.faceSessionToken), or have expired.'}
Since I am new to Azure, I first thought this behavior was caused by wrong permissions in Azure. After two days of frustrating trial and error I finally found out that the problem is caused in src/kblam/gpt_session.py, line 45:
azure_ad_token_provider=token_provider,
where token_provider is obtained from
def _get_credential(self, lib_name: str = "azure_openai") -> DeviceCodeCredential:
This does not take the environment variable into account. Simply removing line 45 lets the openai library apply its own rules and pick up the credentials from the environment variables. If one wants to keep caching credentials, it would be better to modify _get_credential so that the openai rules are checked first. But if the credentials live in environment variables there is no need to cache them at all.
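A minimal sketch of what I mean (assuming the client is built roughly the way gpt_session.py does it; build_client, endpoint and api_version here are placeholders, not the actual names in the repo):

```python
import os

from openai import AzureOpenAI
from azure.identity import DeviceCodeCredential, get_bearer_token_provider


def build_client(endpoint: str, api_version: str) -> AzureOpenAI:
    # Prefer the API key from the environment; the openai lib reads
    # AZURE_OPENAI_API_KEY itself, so no token provider (and no credential
    # caching) is needed in that case.
    if os.getenv("AZURE_OPENAI_API_KEY"):
        return AzureOpenAI(azure_endpoint=endpoint, api_version=api_version)

    # No key in the environment: fall back to the existing interactive
    # device-code login.
    token_provider = get_bearer_token_provider(
        DeviceCodeCredential(), "https://cognitiveservices.azure.com/.default"
    )
    return AzureOpenAI(
        azure_endpoint=endpoint,
        api_version=api_version,
        azure_ad_token_provider=token_provider,
    )
```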
Additionally, hard-coding the api-version on line 23 of src/kblam/gpt_session.py isn't a good idea, since it has already changed ...
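Something along these lines would avoid the hard-coded value (OPENAI_API_VERSION is the variable the openai lib already honours when api_version is omitted; the default shown is only an example):

```python
import os

# Let the environment override the pinned default instead of hard-coding
# the api-version in the source.
API_VERSION = os.getenv("OPENAI_API_VERSION", "2024-02-01")
```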