AttributeError: 'NoneType' object has no attribute 'strip' when using AzureOpenAI client

Open GiannisKav opened this issue 9 months ago • 5 comments

Have you searched existing issues? 🔎

  • [x] I have searched and found no existing issues

Describe the bug

BERTopic Error with Content Filtering in AzureOpenAI

Description

When using BERTopic with AzureOpenAI, I'm encountering an error when content filtering is triggered. The error occurs in _openai.py where it tries to strip a None value, failing to properly handle the case where OpenAI returns a content-filtered response.

Error

AttributeError: 'NoneType' object has no attribute 'strip'

Traceback

<bertopic._bertopic.BERTopic object at 0x8cf54a68230>
File "/azureml-envs/azureml_5493e10031d137133cf6acf55ecbab07/lib/python3.12/site-packages/bertopic/representation/_openai.py", line 234, in extract_topics
label = response.choices[0].message.content.strip().replace("topic: ", "")
└ [Choice(finish_reason='content_filter', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, rol...
└ ChatCompletion(id='chatcmpl-BTxsrB8nuD2wpjvlGY4BEKxeTXf1K', choices=[Choice(finish_reason='content_filter', index=0, logprobs...
AttributeError: 'NoneType' object has no attribute 'strip'

Suggested Fix

Update the condition to also check if content is not None:

if response and hasattr(response.choices[0].message, "content") and response.choices[0].message.content is not None:
    label = response.choices[0].message.content.strip().replace("topic: ", "")
else:
    label = "No label returned"

Reproduction

BERTopic Version

0.17.0

GiannisKav avatar May 06 '25 08:05 GiannisKav

Thanks for sharing this. This should indeed be updated. Note that your solution wouldn't work because it assumes that .content exists so the check would raise an error if it doesn't. Instead, maybe something like this:

label = "No label returned"
if response and hasattr(response.choices[0].message, "content"):
    if response.choices[0].message.content is not None:
        label = response.choices[0].message.content.strip().replace("topic: ", "")

Not the prettiest though. It feels like this could be solved more elegantly...
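One possibly tidier variant, offered only as a sketch: `getattr` with a default folds the "attribute missing" and "attribute is None" cases into a single check. The helper name `extract_label` and its `fallback` parameter are hypothetical and not part of BERTopic; the stand-in response objects are built with `SimpleNamespace` purely for illustration.

```python
from types import SimpleNamespace


def extract_label(response, fallback="No label returned"):
    """Hypothetical helper (not BERTopic's API): return a cleaned label or a fallback."""
    if not response or not response.choices:
        return fallback
    # getattr with a default covers both a missing .content and content=None.
    content = getattr(response.choices[0].message, "content", None)
    if content is None:
        return fallback
    return content.strip().replace("topic: ", "")


# Stand-in objects that mimic only the attributes used above.
filtered = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content=None))])
normal = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content="topic: solar energy\n"))])

print(extract_label(filtered))  # No label returned
print(extract_label(normal))    # solar energy
```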

MaartenGr avatar May 12 '25 10:05 MaartenGr

Actually, what I said is working due to Python's short-circuit evaluation of logical expressions.

Python evaluates the conditions from left to right and stops as soon as any condition is False. If hasattr(response.choices[0].message, "content") returns False, Python won't evaluate the third condition, so it won't try to access the non-existent .content attribute.
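A small self-contained check of that behaviour, using the same condition as in the suggested fix. The `fake_response` helper and the `SimpleNamespace` stand-ins are illustrative assumptions, not real OpenAI response objects.

```python
from types import SimpleNamespace


def extract_label(response):
    # Same three-part condition as in the suggested fix. `and` stops at the
    # first falsy operand, so .content is only read when hasattr() is True.
    if response and hasattr(response.choices[0].message, "content") and response.choices[0].message.content is not None:
        return response.choices[0].message.content.strip().replace("topic: ", "")
    return "No label returned"


def fake_response(message):
    """Illustrative stand-in for a chat completion with a single choice."""
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


no_attr = fake_response(SimpleNamespace())               # message has no .content at all
filtered = fake_response(SimpleNamespace(content=None))  # content-filtered response
normal = fake_response(SimpleNamespace(content="topic: solar energy"))

print(extract_label(no_attr))   # No label returned (no AttributeError raised)
print(extract_label(filtered))  # No label returned
print(extract_label(normal))    # solar energy
```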

In terms of 'elegance', another solution could be to check the finish_reason directly.
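A rough sketch of what that could look like, assuming the `finish_reason='content_filter'` value shown in the traceback above. The function name `label_from_choice` is hypothetical, and the `SimpleNamespace` objects only imitate the fields used here; this has not been tested against a real AzureOpenAI client.

```python
from types import SimpleNamespace


def label_from_choice(choice, fallback="No label returned"):
    """Hypothetical helper: skip choices that were stopped by the content filter."""
    if getattr(choice, "finish_reason", None) == "content_filter":
        return fallback
    content = choice.message.content
    # Still guard against None, since content may be empty for other reasons.
    if content is None:
        return fallback
    return content.strip().replace("topic: ", "")


# Stand-ins mirroring the Choice objects from the traceback.
blocked = SimpleNamespace(finish_reason="content_filter", message=SimpleNamespace(content=None))
ok = SimpleNamespace(finish_reason="stop", message=SimpleNamespace(content="topic: wind turbines"))

print(label_from_choice(blocked))  # No label returned
print(label_from_choice(ok))       # wind turbines
```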

GiannisKav avatar May 12 '25 13:05 GiannisKav

> Actually, what I said is working due to Python's short-circuit evaluation of logical expressions.

Oh right, good catch. I must have looked at it from the perspective of other languages, my bad!

> In terms of 'elegance', another solution could be to check the finish_reason directly.

That might be interesting to explore. I remember there being open PRs for that which never got any updates. It feels like OpenAI's filters are being triggered more often, considering I get more issues about that here.

MaartenGr avatar May 12 '25 13:05 MaartenGr

When is this going to be fixed?

beckmarc avatar Aug 08 '25 09:08 beckmarc

@beckmarc I'm currently swamped with other engagements so I don't have a timeline for this unfortunately. If someone has the time to open up a PR and do some testing, that would be greatly appreciated!

MaartenGr avatar Aug 08 '25 10:08 MaartenGr