Python: Azure AI service model invoke fails with no_model_name error code in SK Python
Code snippet
import os
import asyncio
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatPromptExecutionSettings
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from dotenv import load_dotenv

load_dotenv("../.env")

llm = AzureAIInferenceChatCompletion(
    endpoint=os.getenv("AZUREAI_INFERENCE_ENDPOINT"),  # "https://{myresource}.services.ai.azure.com/models"
    api_key=os.getenv("AZUREAI_ENDPOINT_KEY"),
    ai_model_id="phi-4",
    service_id="phi-4",
)

# execution_settings = AzureAIInferenceChatPromptExecutionSettings(
#     max_tokens=100,
#     temperature=0.5,
#     top_p=0.9,
#     service_id="phi-4",
#     model_id="phi-4",
#     # extra_parameters={...},  # model-specific parameters
# )
execution_settings = PromptExecutionSettings(model="phi-4")


async def main():
    chat_history = ChatHistory(messages=[{"role": "user", "content": "Hello"}])
    response = await llm.get_chat_message_content(
        chat_history,
        execution_settings,
        headers={"x-ms-model-mesh-model-name": "phi-4"},
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
Error details
Traceback (most recent call last):
  File "/afh/projects/aiproj01-ea1be05d-2ca9-4751-94bd-64549ebf820f/shared/Users/pupanda/ai-foundation-models/ai-inference/phi/phi4-sk.py", line 41, in <module>
    asyncio.run(main())
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/afh/projects/aiproj01-ea1be05d-2ca9-4751-94bd-64549ebf820f/shared/Users/pupanda/ai-foundation-models/ai-inference/phi/phi4-sk.py", line 34, in main
    response = await llm.get_chat_message_content(chat_history,
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/semantic_kernel/connectors/ai/chat_completion_client_base.py", line 197, in get_chat_message_content
    results = await self.get_chat_message_contents(chat_history=chat_history, settings=settings, **kwargs)
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/semantic_kernel/connectors/ai/chat_completion_client_base.py", line 142, in get_chat_message_contents
    return await self._inner_get_chat_message_contents(chat_history, settings)
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/semantic_kernel/connectors/ai/azure_ai_inference/services/azure_ai_inference_chat_completion.py", line 127, in _inner_get_chat_message_contents
    response: ChatCompletions = await self.client.complete(
  File "/anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/ai/inference/aio/_patch.py", line 670, in complete
    raise HttpResponseError(response=response)
azure.core.exceptions.HttpResponseError: (no_model_name) No model specified in request. Please provide a model name in the request body or as a x-ms-model-mesh-model-name header.
Code: no_model_name
Message: No model specified in request. Please provide a model name in the request body or as a x-ms-model-mesh-model-name header.
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fb32ae6c850>
Unclosed connector
connections: ['deque([(<aiohttp.client_proto.ResponseHandler object at 0x7fb322838640>, 2558.922774265)])']
connector: <aiohttp.connector.TCPConnector object at 0x7fb32ae6c9a0>
This no_model_name error comes from azure-ai-inference when it is called through SK: the SK wrapper does not pass the model value through to the ChatCompletionsClient, so when client.complete() is invoked the service reports that the model name is missing.
It looks like a bug. Does anyone have any insights on whether this can be addressed from the client side?
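(The only client-side workaround I can think of - unverified - is to pre-build the async azure.ai.inference client with the routing header set as a default and hand it to the connector via its client argument, roughly like the sketch below. I have not confirmed that the connector keeps a pre-built client untouched, so treat this as an assumption.)

# Unverified workaround sketch: pre-build the async inference client with a
# default x-ms-model-mesh-model-name header and pass it to the SK connector.
import os

from azure.ai.inference.aio import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

client = ChatCompletionsClient(
    endpoint=os.getenv("AZUREAI_INFERENCE_ENDPOINT"),
    credential=AzureKeyCredential(os.getenv("AZUREAI_ENDPOINT_KEY")),
    headers={"x-ms-model-mesh-model-name": "phi-4"},  # assumption: azure-core applies these headers to every request
)
llm = AzureAIInferenceChatCompletion(ai_model_id="phi-4", client=client)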
@TaoChenOSU any thoughts on this?
Hi @PurnaChandraPanda,
The ai_model_id doesn't point the connector to the specified model. In the case of AzureAIInference, ai_model_id is a parameter users can use to identify which model the connector was created for.
For Azure AI Inference, your endpoint should point directly to your model. It should look something like this: https://Phi-4-xxxxx.southcentralus.models.ai.azure.com if you're using serverless compute.
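A minimal sketch of what that serverless setup would look like (placeholder endpoint and key; here ai_model_id is only a label and is not used for routing):

# Sketch for a serverless (model-as-a-service) deployment: the endpoint itself
# serves exactly one model, so no model name needs to travel in the request.
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

llm = AzureAIInferenceChatCompletion(
    ai_model_id="phi-4",  # identification only for this deployment type
    api_key="<serverless-endpoint-key>",  # placeholder
    endpoint="https://Phi-4-xxxxx.southcentralus.models.ai.azure.com",  # placeholder
)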
Hello @TaoChenOSU, @moonbox3
Thank you for the response.
I am using the "Azure AI model inference endpoint" in AI Foundry - https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/inference?tabs=python#using-the-routing-capability-in-the-azure-ai-model-inference-endpoint. According to that guide, the base_url looks like https://{myresource}.services.ai.azure.com/models.
Could you please try the same on your end? You should see the same error I do.
For ai_model_id, I just passed the model name, which is phi-4 in my case. Should I pass a different value instead?
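For reference, the linked doc exercises the same routing endpoint with the plain azure-ai-inference client by passing the model name per request; a rough sketch of that call (not the SK path):

# Rough sketch of calling the Azure AI model inference (routing) endpoint
# directly with azure-ai-inference, passing the model name per request.
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.getenv("AZUREAI_INFERENCE_ENDPOINT"),  # https://{myresource}.services.ai.azure.com/models
    credential=AzureKeyCredential(os.getenv("AZUREAI_ENDPOINT_KEY")),
)
response = client.complete(
    messages=[UserMessage(content="Hello")],
    model="phi-4",  # this is what routes the request to the phi-4 deployment
)
print(response.choices[0].message.content)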
Hello,
I'm experiencing the same issue as @PurnaChandraPanda. I am following the instructions in the URL mentioned above; in my case I am using DeepSeek-R1.
Has there been any update on this?
thanks!
Hi, I'm also having this issue with the AI Inference SDK.
I am pulling the model endpoint directly from Azure AI Foundry using "Get Endpoint". Ideally this should be the endpoint I need for the connector, rather than a different URL.
I tried an older version of the Azure Inference SDK (azure-ai-inference==1.0.0b5) without any luck.
The inference endpoint structure that @TaoChenOSU suggested looks different now, as @PurnaChandraPanda pointed out.
When using AI Foundry - https://
I can't seem to find where the "https://model-name.region.models.ai.azure.com/" endpoint comes from.
When I try to use that structure I get: Content: {"status": "Serverless endpoint not found"}.
Hi,
There are multiple model deployment methods on AI Foundry.
I was referring to the model-as-a-service deployment type (a.k.a. serverless endpoint). You can read more about it here. This deployment type doesn't require a model id in the request because the endpoint itself serves only one model.
What @PurnaChandraPanda is using is a relatively new deployment type. In fact, it's still a preview feature on AI Foundry. It allows developers to link an Azure AI Service resource to an AI Foundry project. Once linked, future model deployments will be accessible from the AI service resource via an endpoint and a model id: https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/inference?tabs=python#using-the-routing-capability-in-the-azure-ai-model-inference-endpoint.
This PR addresses the issue: https://github.com/microsoft/semantic-kernel/pull/10427
When using a serverless endpoint, the ai_model_id is not used. When using an AI service endpoint, the ai_model_id will be used to route the request to the specified model.
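So after that change, a sketch of the intended setup against the AI service (routing) endpoint would be something like this (same env vars as the original snippet):

# Sketch (semantic-kernel >= 1.21): against the Azure AI Services routing
# endpoint, ai_model_id is what selects the deployed model.
import os

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

llm = AzureAIInferenceChatCompletion(
    ai_model_id="phi-4",  # routes the request to the phi-4 deployment
    api_key=os.getenv("AZUREAI_ENDPOINT_KEY"),
    endpoint=os.getenv("AZUREAI_INFERENCE_ENDPOINT"),  # https://{myresource}.services.ai.azure.com/models
    service_id="phi-4",
)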
@TaoChenOSU - this was supposed to be fixed with the 1.21 release? Any updates? This is still broken.
Yes, it has been fixed with the 1.21 release. Are you still seeing issues?
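If it helps, a quick way to confirm which version is actually installed in your environment (the fix needs 1.21 or later):

# Check the installed semantic-kernel version; the routing fix requires >= 1.21.
from importlib.metadata import version

print(version("semantic-kernel"))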
hey @TaoChenOSU,
I'm encountering the same issue. Let me share some insights from my side; I hope you can suggest a solution.
# Imports inferred from the snippet; SKSimpleManagerAgent is my own wrapper (not shown here).
from typing import NoReturn
from uuid import uuid4

from semantic_kernel import Kernel
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

MODEL_NAME = "gpt-4o"
OAI_API_KEY = "<redacted>"
OAI_ENDPOINT = "<redacted>"
QUERY_FOR_SUMMARIZER = "Extract the main action items from the meeting minutes"
QUERY_FOR_CODE_OPTIMIZER = """
Review this recursive Fibonacci implementation and suggest an iterative alternative
"""


def build_manager() -> SKSimpleManagerAgent:
    kernel = Kernel()
    kernel.add_service(
        service=AzureAIInferenceChatCompletion(
            ai_model_id=MODEL_NAME,
            api_key=OAI_API_KEY,
            endpoint=OAI_ENDPOINT,
        )
    )
    agents = [
        ChatCompletionAgent(
            description="Condensed summarizer",
            id=uuid4().hex,
            instructions="Return key facts only.",
            kernel=kernel,
            name="summarizer-agent",
        ),
        ChatCompletionAgent(
            description="Python optimizer",
            id=uuid4().hex,
            instructions="Optimize code for performance/readability.",
            kernel=kernel,
            name="code-optimizer-agent",
        ),
    ]
    # This is a wrapper for creating a `GroupChatAgent` as follows:
    # self._manager = AgentGroupChat(
    #     agents=self._subordinate_agents,
    #     selection_strategy=self._selection_strategy,
    #     termination_strategy=self._termination_strategy,
    # )
    return SKSimpleManagerAgent(
        model_name=MODEL_NAME,
        api_key=OAI_API_KEY,
        endpoint=OAI_ENDPOINT,
        subordinate_agents=agents,
        enable_langfuse=False,
        enable_telemetry=False,
    )


async def main() -> NoReturn:
    """
    async def select_agent(self, message: str) -> str:
        # Query the question
        await self._manager.add_chat_message(
            message=ChatMessageContent(role=AuthorRole.USER, content=message)
        )
        # Use a selection strategy to pick the agent
        selected = await self._selection_strategy.select_agent(
            agents=self._subordinate_agents, history=self._manager.history
        )
        return selected.name
    """
    mgr = build_manager()
    for q in (QUERY_FOR_SUMMARIZER, QUERY_FOR_CODE_OPTIMIZER):
        await mgr.select_agent(message=q)
Observed:
INFO  Telemetry inactive. Using standard AzureAIInferenceChatCompletion.
INFO  Initialized SKManagerAgentBase with 2 agents.
DEBUG Selected Agent: summarizer-agent
DEBUG Selected Agent: code-optimizer-agent
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x...>
Unclosed connector
connections: ['deque([(<aiohttp.client_proto.ResponseHandler object at 0x...>, ...)])']
connector: <aiohttp.connector.TCPConnector object at 0x...>
Environment:
- python 3.12
- azure-ai-inference 1.0.0b9
- semantic-kernel 1.23.0
I'm facing a similar issue with a much simpler, bare-bones implementation. I just wanted to hook up AzureOpenAI with my Azure deployment. The code is fairly simple.
import os

from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI  # imports inferred from the snippet

load_dotenv()

azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azure_key = os.getenv("AZURE_OPENAI_API_KEY")
azure_deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT")
azure_api_version = os.getenv("AZURE_OPENAI_API_VERSION")
print(azure_endpoint, azure_key, azure_deployment_name, azure_api_version)

if not all([azure_endpoint, azure_key, azure_deployment_name, azure_api_version]):
    raise EnvironmentError("Env vars not set correctly.")

llm = AzureChatOpenAI(
    temperature=0.2,
    azure_deployment=azure_deployment_name,
    api_version=azure_api_version,
)
Upon execution, I am getting the following error -
Error processing user prompt: Error code: 400 - {'error': {'code': 'no_model_name', 'message': 'No model specified in request. Please provide a model name in the request body or as a x-ms-model-mesh-model-name header.', 'details': 'No model specified in request. Please provide a model name in the request body or as a x-ms-model-mesh-model-name header.'}}
This is completely unexpected, as the LangChain docs don't even mention a model_name parameter.
What's even more baffling is that when I pass the model_name parameter, like this:
llm = AzureChatOpenAI(
    temperature=0.2,
    azure_deployment=azure_deployment_name,
    api_version=azure_api_version,
    model_name="gpt-4o",
)
The Python LSP/linter tells me this is not a recognized parameter, but apparently AzureChatOpenAI still receives it and spits out the following error:
Error processing user prompt: Error code: 400 - {'error': {'code': 'unknown_model', 'message': 'Unknown model: gpt-4o', 'details': 'Unknown model: gpt-4o'}}
This is weird because in my Azure deployment I can clearly see that the model is gpt-4o.
Any ideas on how to move forward? I do not wish to use other providers like Groq, etc.
My system:
- Python 3.12
- LangChain 0.3.25
- langchain_openai 0.3.17
Any help or guidance would be appreciated.
Thanks
@mehulambastha
Use "model" key instead of "model_name". It will work.
llm_config = {
    "deployment_name": self.settings.azure_openai_chat_deployment,
    "api_key": self.settings.azure_openai_api_key,
    "azure_endpoint": self.settings.azure_openai_endpoint,
    "api_version": self.settings.azure_openai_api_version,
    "temperature": 0.1,
    "max_tokens": 4000,
    "model": self.settings.azure_openai_chat_deployment,
}
print(llm_config)
return AzureChatOpenAI(**llm_config)
Hi @mehulambastha, @yashness -- this is an issue in the Semantic Kernel repo, so I'm wondering why we've brought in issues related to LangChain. Can you please file the issue in the appropriate repo and keep this one scoped to Semantic Kernel? Thank you.
Hi @anu43!
Besides the unclosed session warning, are there other errors that you observed?
No, I haven't seen any other errors, @TaoChenOSU. I only encounter this with AzureAIInferenceChatCompletion.
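For what it's worth, the warning usually just means the underlying aiohttp session was never closed before the event loop shut down. One possible way to avoid it - an unverified sketch, assuming the connector accepts a pre-built client via its client argument - is to own the async inference client yourself and close it when you're done:

# Unverified sketch: own the async inference client so it can be closed
# explicitly, which should avoid the "Unclosed client session" warning.
import asyncio
import os

from azure.ai.inference.aio import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion


async def main():
    client = ChatCompletionsClient(
        endpoint=os.getenv("AZUREAI_INFERENCE_ENDPOINT"),
        credential=AzureKeyCredential(os.getenv("AZUREAI_ENDPOINT_KEY")),
    )
    try:
        llm = AzureAIInferenceChatCompletion(ai_model_id="gpt-4o", client=client)
        # ... build agents / run the group chat as in the snippet above ...
    finally:
        await client.close()  # closes the aiohttp session before the loop exits


asyncio.run(main())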