
NullPointerException when using Azure OpenAI and streaming

Open berjanjonker opened this issue 10 months ago • 6 comments

Bug description While using the Azure OpenAI chat client, streaming doesn't work; it fails with the NullPointerException below. When I subscribe to the .content() Flux and print it, I noticed that the last received token is null.

2025-04-10 17:38:22.611 [http-nio-8080-exec-7] ERROR o.a.c.c.C.[.[.[.[dispatcherServlet].log - Servlet.service() for servlet [dispatcherServlet] threw exception
java.lang.NullPointerException: Cannot invoke "com.azure.ai.openai.models.ChatResponseMessage.getToolCalls()" because "responseMessage" is null
	at org.springframework.ai.azure.openai.AzureOpenAiChatModel.buildGeneration(AzureOpenAiChatModel.java:498)
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Assembly trace from producer [reactor.core.publisher.FluxMapFuseable] :
	reactor.core.publisher.Flux.map(Flux.java:6588)
	org.springframework.ai.azure.openai.AzureOpenAiChatModel.lambda$internalStream$13(AzureOpenAiChatModel.java:381)

Environment Spring-AI 1.0.0-M6 Chat Model: Azure OpenAI

Steps to reproduce chatClient.prompt().user("How are you?").stream().content().doOnEach(data -> System.out.println(data.get()));

// output: How can I assist you today? null

Expected behavior AzureOpenAiChatModel should be null-safe, or null values should be filtered out.

Minimal Complete Reproducible example See above. When I switch to another vendor like Anthropic, the result is as expected (without a null at the end of the stream).

berjanjonker avatar Apr 10 '25 16:04 berjanjonker

I got the same result with the OpenAI module.

dev-jonghoonpark avatar Apr 16 '25 09:04 dev-jonghoonpark

I have found that this issue is not related to Spring AI.

doOnEach handles multiple signal types. In the provided code, the onComplete signal is emitted as the final step after all items have been delivered. Since onComplete carries no data, signal.get() returns null.

Using doOnNext instead of doOnEach will resolve the issue.
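The distinction can be sketched with plain Reactor, independent of Spring AI (the token values below are illustrative):

```java
import reactor.core.publisher.Flux;

public class DoOnEachDemo {
    public static void main(String[] args) {
        Flux<String> tokens = Flux.just("How", " can", " I", " assist");

        // doOnEach receives every Signal: each onNext plus the final onComplete.
        // Signal.get() returns the value for onNext, but null for onComplete.
        tokens.doOnEach(signal -> System.out.println(signal.get())).blockLast();
        // prints the four tokens, then "null" for the onComplete signal

        // doOnNext only sees emitted values, so there is no trailing null.
        tokens.doOnNext(System.out::println).blockLast();
    }
}
```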

dev-jonghoonpark avatar Apr 17 '25 02:04 dev-jonghoonpark

> I have found that this issue is not related to Spring AI.
>
> doOnEach handles multiple events. In the provided code, it calls the onComplete event as the final step after all tasks are completed. Since there is no data in the onComplete event, it results in null.
>
> Using doOnNext instead of doOnEach will resolve the issue.

I am a beginner, so please forgive me if there are any mistakes in what I say. Here is my opinion: I think the null pointer is caused by AzureOpenAiChatModel not processing the ChatResponse properly, so it cannot be avoided on the consumer side, whether you use doOnNext or doOnEach.

ReloadingPeace avatar Apr 17 '25 05:04 ReloadingPeace

> I have found that this issue is not related to Spring AI. doOnEach handles multiple events. In the provided code, it calls the onComplete event as the final step after all tasks are completed. Since there is no data in the onComplete event, it results in null. Using doOnNext instead of doOnEach will resolve the issue.
>
> I am a beginner, please forgive me if there are any mistakes in what I said. Here is my opinion: I think the null pointer caused by the Azure OpenAiChatModel not processing ChatResponse properly cannot be avoided when obtaining results, whether it is doOnNext or doOnEach

I agree. I created a PR to make the processing of chat responses more robust.
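The shape of such a guard can be sketched as follows. The stand-in types below are illustrative, not the real com.azure.ai.openai.models classes, and the helper name is hypothetical; the idea is simply to treat a null response message as "no tool calls" instead of dereferencing it:

```java
import java.util.List;

// Illustrative stand-ins for the Azure SDK types involved in the NPE.
class ToolCall {}
class ChatResponseMessage {
    List<ToolCall> getToolCalls() { return List.of(); }
}

public class NullSafeGeneration {
    // Before: responseMessage.getToolCalls() throws an NPE when a streamed
    // chunk carries no message. After: a null message yields an empty list.
    static List<ToolCall> safeToolCalls(ChatResponseMessage responseMessage) {
        return responseMessage == null ? List.of() : responseMessage.getToolCalls();
    }

    public static void main(String[] args) {
        System.out.println(safeToolCalls(null).size());                       // 0 instead of NPE
        System.out.println(safeToolCalls(new ChatResponseMessage()).size());  // 0
    }
}
```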

berjanjonker avatar Apr 17 '25 20:04 berjanjonker

thanks so much! will review.

markpollack avatar Apr 18 '25 15:04 markpollack

I am testing with M7. The snippet to reproduce this error

chatClient.prompt().user("How are you?").stream().content().doOnEach(data -> System.out.println(data.get())); 

Passes with M7.

Also the test in the PR - https://github.com/spring-projects/spring-ai/pull/2789 passes without the fix in the code.

I'm not sure what is going on here, though I suppose the extra checks in the PR don't hurt.

Thoughts?

markpollack avatar Apr 21 '25 19:04 markpollack

Thanks for testing @markpollack! I did some testing with different setups in Azure and I found the cause: the content-filtering option asynchronous-filter. As you can see in the example below, with the asynchronous filter enabled the last data message is null/empty and the content-filter result is sent at a later moment.

data: {"id":"","object":"","created":0,"model":"","prompt_annotations":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":"Color"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" is"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" a"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null} 

data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":506,"start_offset":44,"end_offset":571}}],"usage":null} 

data: [DONE]

You can reproduce this if you set a custom content filter on your model: Azure AI Foundry -> Safety+Security -> Create content filter -> Output Filter -> Streaming mode (Preview)
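The failure mode in the SSE trace above is that the first and last chunks carry an empty choices array (they only hold content-filter results), so any code that assumes a message is present blows up. A minimal sketch of the defensive idea, using a simplified stand-in record rather than the real Azure chunk type:

```java
import java.util.List;

// Simplified stand-in for a streamed completion chunk; the real type lives
// in com.azure.ai.openai.models. Here a choice is just its delta text.
record Chunk(String id, List<String> choices) {}

public class FilterEmptyChunks {
    // Skip chunks that carry no choices before mapping them to generations.
    static long usableChunks(List<Chunk> chunks) {
        return chunks.stream().filter(c -> !c.choices().isEmpty()).count();
    }

    public static void main(String[] args) {
        List<Chunk> stream = List.of(
                new Chunk("", List.of()),                  // async content-filter prompt annotation
                new Chunk("chatcmpl-1", List.of("Color")),
                new Chunk("chatcmpl-1", List.of(" is")),
                new Chunk("", List.of()));                 // trailing content-filter result

        System.out.println(usableChunks(stream)); // 2
    }
}
```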

berjanjonker avatar Apr 22 '25 19:04 berjanjonker

Thanks so much for this, it is now merged for M8

markpollack avatar Apr 30 '25 13:04 markpollack