semantic-kernel .Net: Python: Realtime API

Milestones

[x] ADR reviewed and decision agreed: #10355
[x] Preview implementation for Python: #10127
[x] Getting started samples
[x] Learn site documentation updated
[x] Blog post

Description

Integrate a real-time audio API within the Semantic Kernel to allow seamless interaction with OpenAI GPT, Gemini, and Anthropic models. This API will enable applications to send live audio streams for processing and receive responses in real time, facilitating enhanced conversational agents and multimedia applications. Contact center scenarios are a particular target.

Scenarios

As a developer, I can integrate the real-time audio API into my SK application and interact with it through voice commands, including plugin and filter use.
As a developer, I can configure the audio input and output settings for optimal real-time performance.
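
To make the second scenario concrete, here is a minimal sketch of the arithmetic behind audio settings for raw PCM streaming. `AudioSettings` and its field names are hypothetical and are not part of Semantic Kernel; the defaults (24 kHz, mono, 16-bit) are an assumption chosen to match a common realtime PCM format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical container for raw PCM stream settings (not an SK type)."""

    sample_rate_hz: int = 24000   # assumed default: 24 kHz
    channels: int = 1             # assumed default: mono
    sample_width_bytes: int = 2   # 16-bit samples
    chunk_ms: int = 20            # duration of audio sent per message

    def bytes_per_chunk(self) -> int:
        # bytes = samples per second * channels * bytes per sample * seconds
        return (
            self.sample_rate_hz
            * self.channels
            * self.sample_width_bytes
            * self.chunk_ms
            // 1000
        )

    def chunks_per_second(self) -> float:
        # How many messages per second the client would send.
        return 1000 / self.chunk_ms


if __name__ == "__main__":
    settings = AudioSettings()
    print(settings.bytes_per_chunk(), settings.chunks_per_second())
```

Smaller chunks lower latency but increase the number of messages per second, so the chunk duration is the main knob when tuning for real-time performance.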

Requirements:

API Development:

Design API endpoints capable of receiving live audio streams and connecting them to realtime audio endpoints.
Enable the API to support multiple audio formats (e.g., PCM, WAV, MP3).
Allow customization of kernel parameters (e.g., temperature, response length).
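
As an illustration of the audio-format requirement, the following sketch uses only the Python standard library to split a 16-bit mono WAV into base64-encoded PCM chunks. The event shape (`input_audio_buffer.append` with an `audio` field) follows the OpenAI Realtime API client events; the function names `wav_to_append_events` and `make_silent_wav` are hypothetical helpers, not part of Semantic Kernel, and the 100 ms chunk size is an assumption.

```python
import base64
import io
import json
import wave
from typing import Iterator


def wav_to_append_events(wav_bytes: bytes, chunk_ms: int = 100) -> Iterator[str]:
    """Yield JSON `input_audio_buffer.append` events for a 16-bit mono WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        frames_per_chunk = wav.getframerate() * chunk_ms // 1000
        while True:
            pcm = wav.readframes(frames_per_chunk)
            if not pcm:
                break
            yield json.dumps(
                {
                    "type": "input_audio_buffer.append",
                    "audio": base64.b64encode(pcm).decode("ascii"),
                }
            )


def make_silent_wav(seconds: float, sample_rate: int = 24000) -> bytes:
    """Build an in-memory WAV of silence, used here only as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


if __name__ == "__main__":
    for event in wav_to_append_events(make_silent_wav(0.25)):
        print(len(event))
```

Compressed formats such as MP3 would need decoding to PCM first, which is outside what this sketch covers.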

Integration Support:

Provide seamless integration capabilities with the Semantic Kernel's existing features like plugins and filters.
Ensure compatibility with major programming languages and frameworks (e.g., Python, C#, Java).

Documentation and Samples:

Provide comprehensive API documentation, including usage guidelines, parameter descriptions, and example use cases.
Create sample projects demonstrating integration with OpenAI, Gemini, and Anthropic models.

Jan 06 '25 14:01 markwallace-microsoft

Would love to see the .NET Realtime API available for use!

Jul 01 '25 12:07 Chryogenic

@eavanvalkenburg and @markwallace-microsoft: Are you planning to support the new real-time API and model which have just been announced?

Links:

https://platform.openai.com/docs/guides/realtime-websocket
https://openai.com/index/introducing-gpt-realtime/

There have been some changes which must be adopted by SK:

New API endpoint
New websocket session events
New objects in body
New model name

Sep 02 '25 07:09 marvinbuss

I am waiting for this one as well!

Sep 02 '25 07:09 r4hulp

+1 to the new gpt-realtime model. Compatibility or extensibility to integrate with the new Azure Voice Live API (whose event interface is similar to the OpenAI Realtime API) would also be desirable, as per https://github.com/microsoft/semantic-kernel/issues/12291

Sep 03 '25 11:09 jjgriff93