
OpenAI function produces unescaped double-quotes inside JSON String value, causing DecodingError

Open jcmourey opened this issue 2 years ago • 7 comments

Describe the bug
When a function call returns JSON, decoding sometimes fails with a DecodingError. The failure is caused by unescaped double quotes inside a JSON string value.

To Reproduce
Steps to reproduce the behavior: The JSON string that causes the error:
"""
{
  "question": "Which deep-sea creature is known for its ability to create its own light through bioluminescence?",
  "question_summary": "Deep-sea creature that produces its own light",
  "correct_answer": "Deep-sea anglerfish",
  "incorrect_answer_1": "Lantern shark",
  "incorrect_answer_2": "Glowing jellyfish",
  "incorrect_answer_3": "Bioluminescent crab",
  "humorous_answer": "Lightbulb shrimp",
  "humorous_failure_reaction": "Close, but not quite as bright!",
  "explanation": "The Deep-sea anglerfish is a fascinating creature that has a unique adaptation for finding food in the deep sea. It has a long glowing lure that dangles in front of its mouth to attract prey. This lure is bioluminescent and acts like a "fishing rod" to lure unsuspecting prey."
}
"""
The error occurs at "fishing rod" because the double quotes are unescaped.

Expected behavior
Double quotes inside JSON string values should be escaped.
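To make the expected behavior concrete, here is a minimal sketch that feeds JSONDecoder a shortened version of the "explanation" value, once with the inner quotes unescaped and once with them escaped as \". The Answer type and the shortened text are assumptions for illustration; only the "explanation" key and the "fishing rod" phrase come from the report.

```swift
import Foundation

// Hypothetical type for illustration; only the "explanation" key comes from the report.
struct Answer: Codable {
    let explanation: String
}

// Raw string literals (#"..."#) keep quotes and backslashes exactly as written.
let unescaped = #"{"explanation": "acts like a "fishing rod" to lure prey"}"#
let escaped = #"{"explanation": "acts like a \"fishing rod\" to lure prey"}"#

let decoder = JSONDecoder()

// The unescaped payload is not valid JSON, so decode throws and try? yields nil.
let bad = try? decoder.decode(Answer.self, from: Data(unescaped.utf8))

// The escaped payload decodes, and the inner quotes come back as part of the string.
let good = try? decoder.decode(Answer.self, from: Data(escaped.utf8))

print(bad == nil)
print(good?.explanation ?? "decoding failed")
```

The decoder itself is behaving correctly in both cases; the difference is entirely in how the string value was produced.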

Screenshots
dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Badly formed object around line 1, column 757." UserInfo={NSDebugDescription=Badly formed object around line 1, column 757., NSJSONSerializationErrorIndex=757})))
The error is followed by the same JSON string shown under "To Reproduce".

jcmourey avatar Aug 03 '23 10:08 jcmourey

I'm getting a similar error here, reproducible for me with this code and these prompts:

let query = ChatQuery(model: .gpt3_5Turbo, messages: [
    .init(role: .system, content: "You are Librarian-GPT. You know everything about the books, and you use answers with a bunch of random quotation marks everywhere."),
    .init(role: .user, content: "Who wrote Harry Potter? Use an answer with a bunch of quotation marks scattered around. Use double quotes which appear both singly and in pairs.")
])
        
for try await result in openAI.chatsStream(query: query) {
    print(result)
}

Resulting in this:

OpenAITests.swift:133: error: -[OpenAITests.OpenAITests testChatsStream] : failed: caught error: "dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unterminated string around line 1, column 101." UserInfo={NSDebugDescription=Unterminated string around line 1, column 101., NSJSONSerializationErrorIndex=101})))"

brytonsf avatar Sep 13 '23 20:09 brytonsf

I'd guess we could fix the problem in StreamingSession with a custom JSON decoder that tolerates arbitrarily formatted body text? GPT isn't guaranteed to return valid JSON, so with no intermediate step that turns the GPT response into a valid JSON string, it makes sense that decoding fails.
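One way to picture a more tolerant decoding step is a wrapper that tries strict decoding first and, on failure, hands back the raw text together with the error instead of throwing. This is only a sketch under assumptions: LenientResult, decodeLeniently, and the Reply type are hypothetical names, and nothing here reflects how StreamingSession is actually implemented.

```swift
import Foundation

/// Outcome of a decode attempt that never throws (hypothetical type for illustration).
enum LenientResult<Value> {
    case decoded(Value)
    case undecodable(raw: String, error: Error)
}

/// Tries strict JSON decoding first and falls back to returning the raw text.
func decodeLeniently<Value: Decodable>(_ type: Value.Type, from text: String) -> LenientResult<Value> {
    do {
        return .decoded(try JSONDecoder().decode(type, from: Data(text.utf8)))
    } catch {
        return .undecodable(raw: text, error: error)
    }
}

// Hypothetical payload shape for illustration.
struct Reply: Decodable {
    let author: String
}

let results = [
    decodeLeniently(Reply.self, from: #"{"author": "J. K. Rowling"}"#),
    decodeLeniently(Reply.self, from: #"{"author": "J. K. "Rowling"}"#),
]

for result in results {
    switch result {
    case .decoded(let reply):
        print("decoded:", reply.author)
    case .undecodable(let raw, _):
        print("kept raw text:", raw)
    }
}
```

This does not repair the malformed JSON; it only keeps one bad payload from aborting the whole stream, leaving the caller free to log or post-process the raw text.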

brytonsf avatar Sep 13 '23 21:09 brytonsf

@brytonsf, @jcmourey,

Did you guys get function calling to work with streaming?

I attached a FunctionArguments to the Message object:

public struct Message: Codable, Hashable, Identifiable {
    public enum MessageType: String, Codable, Hashable {
        case function
        case message
    }
    
    public var id: String
    public var role: Chat.Role
    public var content: String
    public var createdAt: Date
    public var type: MessageType
    public var functionArguments: FunctionArguments?
}

public struct FunctionArguments: Codable, Hashable {
    public let location: String?
}

I am parsing the function calling arguments like this:

                        // Fetch function arguments and assign to message variable
                        var functionArguments: FunctionArguments?
                        if let data = functionCallArguments.data(using: .utf8) {
                            let decoder = JSONDecoder()
                            if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
                                print("arguments.location is " + (arguments.location ?? ""))
                                functionArguments = arguments
                            }
                        }

This correctly prints the arguments:

arguments.location is San Francisco

But then "location" is not displayed in the ChatBubbleView, probably because streaming is not complete when the view is rendered:

Text(message.functionArguments?.location ?? "location placeholder")

Maybe I am doing something wrong? Is there another way to fetch the arguments? How did you guys get this working?

alelordelo avatar Sep 15 '23 08:09 alelordelo

I got functions to work for me with streaming. I use a "for try await" loop:

var results = ""

for try await result in client.chatsStream(query: query) {
  guard !Task.isCancelled else { return }
  guard let argument = result.choices.first?.delta.functionCall?.arguments else { continue }
  
  results.append(argument)

I concatenate the results then process them manually. When I have something that looks like a complete JSON object, e.g. a completed element from a JSON array, identified by surrounding matching { }, I use JSONDecoder() on it.

 let leftover = try await processPartialJson(results) // extracts and decodes completed JSON objects, returns the leftover incomplete JSON
 guard let leftover else { return }

 results = leftover
}
 

jcmourey avatar Sep 15 '23 15:09 jcmourey

awesome! @jcmourey

I am trying to adapt your code to the Chat Demo, but it's hard without processPartialJson and the other parts of the code...

Are you using the chat demo also? If yes, mind sharing your view model?

alelordelo avatar Sep 15 '23 16:09 alelordelo

My processPartialJson is too specific for my use case, would be incomprehensible. You need to think through when you have something out of the stream string that could be JSON-decoded, extract it, decode it with JSONDecoder, remove it from the string and keep going.
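As a rough illustration of the approach described above (not jcmourey's actual processPartialJson, which was not shared), the sketch below scans the accumulated stream text for complete top-level { } objects, skipping braces that appear inside string values, and returns whatever is left unfinished. The helper name extractCompleteObjects, the sample chunks, and the one-field FunctionArguments are assumptions for illustration.

```swift
import Foundation

/// Splits `buffer` into complete top-level JSON objects and the unfinished remainder.
/// Hypothetical helper, not part of the OpenAI package.
func extractCompleteObjects(from buffer: String) -> (objects: [String], leftover: String) {
    var objects: [String] = []
    var depth = 0
    var inString = false
    var escaped = false
    var start: String.Index? = nil
    var consumedUpTo = buffer.startIndex

    for index in buffer.indices {
        let character = buffer[index]
        if inString {
            // Inside a string, braces are plain text; track escapes so \" does not end it.
            if escaped {
                escaped = false
            } else if character == "\\" {
                escaped = true
            } else if character == "\"" {
                inString = false
            }
            continue
        }
        switch character {
        case "\"":
            inString = true
        case "{":
            if depth == 0 { start = index }
            depth += 1
        case "}":
            guard depth > 0 else { continue }
            depth -= 1
            if depth == 0, let objectStart = start {
                objects.append(String(buffer[objectStart...index]))
                consumedUpTo = buffer.index(after: index)
                start = nil
            }
        default:
            break
        }
    }
    return (objects, String(buffer[consumedUpTo...]))
}

// Simplified stand-in for the arguments type used in this thread.
struct FunctionArguments: Codable {
    let location: String?
}

// Simulated stream deltas; in the thread these would come from chatsStream.
let chunks = [
    #"[{"location": "San Fr"#,
    #"ancisco"}, {"location": "a "#,
    #"{brace}"}, {"loc"#,
]

var buffer = ""
var locations: [String] = []
for chunk in chunks {
    buffer += chunk
    let (objects, leftover) = extractCompleteObjects(from: buffer)
    for object in objects {
        if let decoded = try? JSONDecoder().decode(FunctionArguments.self, from: Data(object.utf8)) {
            locations.append(decoded.location ?? "")
        }
    }
    buffer = leftover // keep only the incomplete tail for the next delta
}
print(locations)
```

The scanner only finds object boundaries; it still relies on JSONDecoder to validate each extracted object, so a payload with unescaped quotes would be skipped by the try? rather than repaired.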

jcmourey avatar Sep 15 '23 17:09 jcmourey

@jcmourey @brytonsf @Krivoblotsky , this is what I did that kinda works... The arguments are printed to the console, but the issue is that the view doesn't get updated so I am trying to figure out the best approach....

1- I added a FunctionArguments to the Message:

import Foundation
import OpenAI
import SwiftUI

public struct Message: Codable, Hashable, Identifiable {
    public enum MessageType: String, Codable, Hashable {
        case function
        case message
    }
    
    public var id: String
    public var role: Chat.Role // Assuming Chat.Role is Hashable
    public var content: String
    public var createdAt: Date
    public var type: MessageType
    public var functionArguments: FunctionArguments?
}

public struct FunctionArguments: Codable, Hashable {
    public let location: String?
    public let date: String?
}

2- Then I fetch the arguments like this:

                        // Fetch function arguments and assign to message variable
                        var functionArguments: FunctionArguments?
                        if let data = functionCallArguments.data(using: .utf8) {
                            let decoder = JSONDecoder()
                            if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
                                print("arguments.location is " + (arguments.location ?? ""))
                                print("arguments.date is " + (arguments.date ?? ""))
                                functionArguments = arguments
                            }
                        }
                        

Full view model below:

import Foundation
import Combine
import OpenAI

/// ViewModel for the chat screen
public final class ChatViewModel: ObservableObject {
    public var openAIClient: OpenAIProtocol

    @Published var conversations: [Conversation] = [] // List of conversations
    @Published var conversationErrors: [Conversation.ID: Error] = [:] // Errors keyed by conversation
    @Published var selectedConversationID: Conversation.ID? // ID of the selected conversation


    var selectedConversation: Conversation? {
        selectedConversationID.flatMap { id in
            conversations.first { $0.id == id }
        }
    }

    // Publisher for the selected chat
    var selectedConversationPublisher: AnyPublisher<Conversation?, Never> {
        $selectedConversationID.receive(on: RunLoop.main).map { id in
            self.conversations.first(where: { $0.id == id })
        }
        .eraseToAnyPublisher()
    }

    public init(openAIClient: OpenAIProtocol) {
        self.openAIClient = openAIClient
    }

    // Start a chat
    func createConversation() {
        let conversation = Conversation(id: UUID().uuidString, messages: [])
        conversations.append(conversation)
    }
    
    // Select a chat
    func selectConversation(_ conversationId: Conversation.ID?) {
        selectedConversationID = conversationId
    }
    
    // Delete a chat
    func deleteConversation(_ conversationId: Conversation.ID) {
        conversations.removeAll(where: { $0.id == conversationId })
    }
    
    // Send a message and request a completion
    @MainActor
    func sendMessage(
        _ message: Message,
        conversationId: Conversation.ID,
        model: Model
    ) async {
        guard let conversationIndex = conversations.firstIndex(where: { $0.id == conversationId }) else {
            return
        }
        conversations[conversationIndex].messages.append(message)

        await completeChat(
            conversationId: conversationId,
            model: model
        )
    }
    
    @MainActor
        func completeChat(
            conversationId: Conversation.ID,
            model: Model
        ) async {
            guard let conversation = conversations.first(where: { $0.id == conversationId }) else {
                return
            }
                    
            conversationErrors[conversationId] = nil

            do {
                guard let conversationIndex = conversations.firstIndex(where: { $0.id == conversationId }) else {
                    return
                }

                let weatherFunction = ChatFunctionDeclaration(
                    name: "getWeatherData",
                    description: "Get the current weather in a given location and date",
                    parameters: .init(
                      type: .object,
                      properties: [
                        "location": .init(type: .string, description: "The city and state, e.g. San Francisco, CA"),
                        "date": .init(type: .string, description: "The date to get the weather for")

                      ],
                      required: ["location"]
                    )
                )
                
        

                let functions = [weatherFunction]
                
                let chatsStream: AsyncThrowingStream<ChatStreamResult, Error> = openAIClient.chatsStream(
                    query: ChatQuery(
                        model: model,
                        messages: conversation.messages.map { message in
                            Chat(role: message.role, content: message.content)
                        },
                        functions: functions
                    )
                )

                var functionCallName = ""
                var functionCallArguments = ""
                for try await partialChatResult in chatsStream {
                    for choice in partialChatResult.choices {
                        let existingMessages = conversations[conversationIndex].messages
                        var messageType: Message.MessageType = .message // Default to message
                        // Function calls are also streamed, so we need to accumulate.
                        if let functionCallDelta = choice.delta.functionCall {
                            if let nameDelta = functionCallDelta.name {
                              functionCallName += nameDelta
                            }
                            if let argumentsDelta = functionCallDelta.arguments {
                              functionCallArguments += argumentsDelta
                                print("Debug: argumentsDelta is : \(String(describing: argumentsDelta))")

                            }
                            print("Debug: functionCallDelta is : \(String(describing: functionCallDelta))")
                            

                            print("Debug: functionCallDelta.name is : \(String(describing: functionCallDelta.name))")
                            
                            print("Debug: functionCallDelta.arguments is : \(String(describing: functionCallDelta.arguments))")

                        }
                        
                        //trying to get stream of 
                        let funcDelta = choice.delta.functionCall?.arguments ?? ""
                        print("Debug: funcDelta is : \(funcDelta)")


                        var messageText = choice.delta.content ?? ""
                        if let finishReason = choice.finishReason {
                            if finishReason == "function_call" {
                                messageType = .function
                                messageText += "Function call: name=\(functionCallName) arguments=\(functionCallArguments)"
                            }
                        }
                        print(functionCallArguments)
                        print("Debug: Entire choice object: \(choice)")
                        print("Debug: Entire delta object: \(choice.delta)")

                

                        // Fetch function arguments and assign to message variable
                        var functionArguments: FunctionArguments?
                        if let data = functionCallArguments.data(using: .utf8) {
                            let decoder = JSONDecoder()
                            if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
                                print("arguments.location is " + (arguments.location ?? ""))
                                print("arguments.date is " + (arguments.date ?? ""))
                                functionArguments = arguments
                            }
                        }
                        print("Debug: functionArguments object: \(String(describing: functionArguments))")
                        let message = Message(
                            id: partialChatResult.id,
                            role: choice.delta.role ?? .assistant,
                            content: messageText,
                            createdAt: Date(timeIntervalSince1970: TimeInterval(partialChatResult.created)),
                            type: messageType,
                            functionArguments: functionArguments
                        )

                        
                        // Debug: Print the entire message object
                        print("Debug: Created Message: \(message)")
                        
                        if let existingMessageIndex = existingMessages.firstIndex(where: { $0.id == partialChatResult.id }) {
                            // Meld into previous message
                            let previousMessage = existingMessages[existingMessageIndex]
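                            let combinedMessage = Message(
                                id: message.id, // id stays the same for different deltas
                                role: message.role,
                                content: previousMessage.content + message.content,
                                createdAt: message.createdAt,
                                type: messageType,
                                // Carry the decoded arguments over; omitting this parameter resets them to nil
                                functionArguments: functionArguments ?? previousMessage.functionArguments
                            )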
                            conversations[conversationIndex].messages[existingMessageIndex] = combinedMessage
                        } else {
                            conversations[conversationIndex].messages.append(message)
                        }
                    }
                }
                
            } catch {
                conversationErrors[conversationId] = error
            }
        }
}
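Since the view reads message.functionArguments, one thing worth checking is that the merge step keeps the decoded arguments when later deltas arrive without them. The sketch below illustrates that idea with trimmed stand-in types; the merge helper and the sample deltas are hypothetical and not part of the Chat Demo.

```swift
import Foundation

// Simplified stand-ins for the types in the post; fields trimmed for brevity.
struct FunctionArguments: Codable, Hashable {
    let location: String?
    let date: String?
}

struct Message: Hashable, Identifiable {
    var id: String
    var content: String
    var functionArguments: FunctionArguments? = nil
}

/// Merges a streamed delta into the message accumulated so far (hypothetical helper).
func merge(_ delta: Message, into previous: Message) -> Message {
    Message(
        id: previous.id,
        content: previous.content + delta.content,
        // Prefer freshly decoded arguments, otherwise keep what was decoded earlier.
        functionArguments: delta.functionArguments ?? previous.functionArguments
    )
}

// Simulated deltas that share one id, as in the streaming loop above.
let deltas = [
    Message(id: "chat-1", content: ""),
    Message(id: "chat-1", content: "",
            functionArguments: FunctionArguments(location: "San Francisco", date: nil)),
    Message(id: "chat-1", content: "Function call finished"),
]

var messages: [Message] = []
for delta in deltas {
    if let index = messages.firstIndex(where: { $0.id == delta.id }) {
        messages[index] = merge(delta, into: messages[index])
    } else {
        messages.append(delta)
    }
}

print(messages[0].functionArguments?.location ?? "location placeholder")
```

With this merge, the last delta carries no arguments of its own, yet the stored message still exposes the location decoded from the earlier delta.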

alelordelo avatar Sep 22 '23 14:09 alelordelo