semantic-kernel .Net: Add support for image quality, style, and detail level

Context:

Text to image {Azure}OpenAI connectors: AzureOpenAI SDK v2 allows the specification of image style (Vivid & Natural) and image quality (High & Standard) when generating an image from text:

internal async Task<string> GenerateImageAsync(
    string prompt,
    int width,
    int height,
    CancellationToken cancellationToken)
{
    ...

    var imageOptions = new ImageGenerationOptions()
    {
        Size = size,
        ResponseFormat = GeneratedImageFormat.Uri,
        Quality = GeneratedImageQuality.High, // It's not supported by SK yet.
        Style = GeneratedImageStyle.Vivid, // It's not supported by SK yet.
    };

    ...
}

Chat completion {Azure}OpenAI connectors: AzureOpenAI SDK v2 allows the specification of image detail level (Low, High, and Auto) when sending an image to the model:

private static ChatMessageContentPart GetImageContentItem(ImageContent imageContent)
{
   ...
   return ChatMessageContentPart.CreateImageMessageContentPart(
         BinaryData.FromBytes(data),
         imageContent.MimeType,
         ImageChatMessageContentPartDetail.Auto); // It's not supported by SK yet.

   ...
}

ToDo: Consider extending the SK public API surface to support the options mentioned above.

Notes: Decide whether the support should be shipped as part of the Azure OpenAI v2 migration initiative or afterward.

Jul 03 '24 16:07 SergeyMenshykh

Yes, please prioritize this. This is something we need for our projects. Specifically for chat completion.

Jul 05 '24 07:07 AdriaanLarcai

Hello,

I was going to use quality and style for my hobby project. Then I thought maybe I could contribute.

I have a draft PR up into 'feature-connectors-openai' and it would be great to know if I am moving in the right direction: https://github.com/microsoft/semantic-kernel/pull/8064

Thank you for your time!

Aug 11 '24 20:08 aghimir3

@aghimir3 Your approach is good, but ideally we would update our ITextToImage abstractions to accept an ExecutionSettings that can carry all of those special details when requesting an image generation.

I'm currently working on this PR to resolve this issue as part of it.

#8068
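
As a rough illustration of the execution-settings idea described above, the following is a minimal, self-contained sketch. Every type and member name in it (ImageSettingsSketch, ImageQualitySketch, ImageStyleSketch, ImageRequestSketch.BuildOptions) is hypothetical and invented for this example; none of them is the Semantic Kernel or OpenAI SDK API, and the design in #8068 may look quite different. The enum members only mirror the quality (High, Standard) and style (Vivid, Natural) values named in the issue description.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical enums mirroring the values named in the issue description.
public enum ImageQualitySketch { Standard, High }
public enum ImageStyleSketch { Natural, Vivid }

// Hypothetical settings bag: optional values the caller may or may not set.
public sealed class ImageSettingsSketch
{
    public ImageQualitySketch? Quality { get; init; }
    public ImageStyleSketch? Style { get; init; }
}

public static class ImageRequestSketch
{
    // Builds a plain key/value view of the request options.
    // Quality and style are included only when the caller supplied them,
    // so callers that pass no settings keep the previous behavior.
    public static Dictionary<string, string> BuildOptions(
        int width,
        int height,
        ImageSettingsSketch? settings = null)
    {
        var options = new Dictionary<string, string>
        {
            ["size"] = $"{width}x{height}",
        };

        if (settings?.Quality is { } quality)
        {
            options["quality"] = quality.ToString();
        }

        if (settings?.Style is { } style)
        {
            options["style"] = style.ToString();
        }

        return options;
    }
}
```

With this shape, a call such as ImageRequestSketch.BuildOptions(1024, 1024, new ImageSettingsSketch { Quality = ImageQualitySketch.High }) would carry the quality value, while BuildOptions(1024, 1024) would produce only the size. Keeping the settings optional is one way to extend the abstraction without breaking existing callers.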

Aug 12 '24 11:08 rogerbarreto

Thank you for the response!

I noticed that the GetImageContentsAsync method in OpenAITextToImageService.cs hasn't been implemented yet. I'd be thrilled to contribute to it if you're open to that! Or, if you could guide me to any other .NET tasks related to OpenAI that need attention, I'd be more than happy to help out.

Aug 12 '24 17:08 aghimir3

Hi @RogerBarreto ,

I found major differences between #8068 and #7471 in the direction we are going.

These files are different:

https://github.com/RogerBarreto/semantic-kernel/blob/1f567f38f23288c108a791bed5be9070ca3c285d/dotnet/src/Connectors/Connectors.OpenAI/TextToImage/OpenAITextToImageService.cs
https://github.com/microsoft/semantic-kernel/blob/feature-connectors-openai/dotnet/src/Connectors/Connectors.OpenAI/Services/OpenAITextToImageService.cs

Could you please advise on what is the right way to continue with this implementation? Thank you for your time!

Aug 13 '24 02:08 aghimir3

These changes are not part of the OpenAI V2 feature; they introduce a new abstraction update for all connectors.

The feature-connectors-openai branch is the newer version, whereas this PR brings the abstraction implementation to the current main code base. For this PR, consider using the V1 version.

As soon as the feature branch merges to main, we will update the code to the new pattern.

Aug 13 '24 08:08 rogerbarreto

Hi @RogerBarreto ,

Thanks for merging my changes into #8068.

Since the OpenAI V2 version is still missing support for quality and style, can I branch off of feature-connectors-openai and work on it?

Thank you for your time!

Aug 14 '24 17:08 aghimir3