
feat: Implement TTS streaming and improve OpenRouter compatibility

Open cculbreath opened this issue 3 months ago • 0 comments

Summary

This pull request introduces streaming audio for the Text-to-Speech (TTS) API and adds reasoning and streamOptions to the Chat Completion parameters. These revisions were made mostly through prompt engineering with Claude and OpenAI Codex. I've done my best to verify that they are cleanly implemented and make sense, and I have tested them in my application.

Key Changes

  • Streaming Audio for Text-to-Speech:

    • A new createStreamingSpeech method has been added to the OpenAIService protocol, allowing for real-time audio data streaming.
    • The AudioSpeechParameters now include a stream parameter to enable this functionality.
    • A new AudioSpeechChunkObject has been introduced to represent individual chunks of the audio stream.
    • The underlying networking components (AsyncHTTPClientAdapter and URLSessionHTTPClientAdapter) have been updated to handle byte streams for audio data, in addition to the existing line-based streams for text.
    • New tests have been added in AudioStreamingTests.swift to validate the streaming audio functionality.
  • Chat Completion Enhancements:

    • The ChatCompletionParameters now support reasoning and streamOptions, providing more control over the model's output and compatibility with OpenRouter's API.
    • New tests in ChatCompletionParametersTests.swift ensure these parameters are correctly encoded.
  • Improved Error Handling:

    • Error messages for unsuccessful API responses now include the response body, which improves OpenRouter compatibility.
  • Documentation:

    • The README.md has been updated to reflect these new features and provide usage examples.
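For reviewers who want to see what the new chat parameters look like on the wire, here is a minimal sketch of the encoding. The type names (SketchChatParameters, ReasoningOptions, StreamOptions) and the model string are hypothetical stand-ins for illustration, not the library's actual types. The JSON keys (reasoning, effort, stream_options, include_usage) are assumed from the OpenRouter and OpenAI request formats.

```swift
import Foundation

// Hypothetical stand-ins for illustration only; these are NOT SwiftOpenAI's types.
struct ReasoningOptions: Encodable {
    var effort: String?
    var maxTokens: Int?

    enum CodingKeys: String, CodingKey {
        case effort
        case maxTokens = "max_tokens"
    }
}

struct StreamOptions: Encodable {
    var includeUsage: Bool

    enum CodingKeys: String, CodingKey {
        case includeUsage = "include_usage"
    }
}

struct SketchChatParameters: Encodable {
    var model: String
    var stream: Bool
    var reasoning: ReasoningOptions?
    var streamOptions: StreamOptions?

    enum CodingKeys: String, CodingKey {
        case model, stream, reasoning
        case streamOptions = "stream_options"
    }
}

/// Encodes a value to a JSON string with sorted keys so the output is stable.
func encodeToJSON<T: Encodable>(_ value: T) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(value)
    return String(decoding: data, as: UTF8.self)
}

// Optional fields left as nil are omitted from the payload by the
// synthesized Encodable conformance.
let params = SketchChatParameters(
    model: "example-model",
    stream: true,
    reasoning: ReasoningOptions(effort: "high", maxTokens: nil),
    streamOptions: StreamOptions(includeUsage: true)
)
let json = try encodeToJSON(params)
print(json)
```

Keeping both fields optional means requests that do not set them encode exactly as before, which is the property the encoding tests are meant to protect.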

These changes enhance the library's capabilities for applications requiring real-time audio generation.
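To illustrate the byte-stream handling in general terms, the sketch below groups an asynchronous sequence of bytes into fixed-size chunks. It is not the implementation in this PR: SketchAudioChunk, audioChunks(from:chunkSize:), and the 4096-byte default are hypothetical names and values chosen for the example.

```swift
import Foundation

// Hypothetical chunk type for illustration; not the library's AudioSpeechChunkObject.
struct SketchAudioChunk {
    let data: Data
    let index: Int
}

/// Groups a byte-level async sequence into chunks of up to `chunkSize` bytes.
/// The final chunk may be shorter than `chunkSize`.
func audioChunks<Bytes: AsyncSequence & Sendable>(
    from bytes: Bytes,
    chunkSize: Int = 4096
) -> AsyncThrowingStream<SketchAudioChunk, Error> where Bytes.Element == UInt8 {
    AsyncThrowingStream { continuation in
        let task = Task {
            do {
                var buffer = Data()
                buffer.reserveCapacity(chunkSize)
                var index = 0
                for try await byte in bytes {
                    buffer.append(byte)
                    if buffer.count >= chunkSize {
                        // Emit a full chunk and start filling the next one.
                        continuation.yield(SketchAudioChunk(data: buffer, index: index))
                        index += 1
                        buffer.removeAll(keepingCapacity: true)
                    }
                }
                // Flush whatever is left once the source ends.
                if !buffer.isEmpty {
                    continuation.yield(SketchAudioChunk(data: buffer, index: index))
                }
                continuation.finish()
            } catch {
                // Propagate source errors to the consumer.
                continuation.finish(throwing: error)
            }
        }
        // Stop reading from the source if the consumer stops iterating.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

Any byte-level sequence should work as input, for example the URLSession.AsyncBytes value returned by URLSession's bytes(for:). Appending one byte at a time keeps the sketch short; a production path would likely read larger buffers.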

cculbreath, Oct 11 '25 01:10