OpenAI function produces unescaped double-quotes inside JSON String value, causing DecodingError
Describe the bug When using function calling, a JSON decoding error is sometimes triggered. The error happens because of unescaped double quotes inside a JSON string value.
To Reproduce Steps to reproduce the behavior: The JSON string that causes the error:
"""
{
  "question": "Which deep-sea creature is known for its ability to create its own light through bioluminescence?",
  "question_summary": "Deep-sea creature that produces its own light",
  "correct_answer": "Deep-sea anglerfish",
  "incorrect_answer_1": "Lantern shark",
  "incorrect_answer_2": "Glowing jellyfish",
  "incorrect_answer_3": "Bioluminescent crab",
  "humorous_answer": "Lightbulb shrimp",
  "humorous_failure_reaction": "Close, but not quite as bright!",
  "explanation": "The Deep-sea anglerfish is a fascinating creature that has a unique adaptation for finding food in the deep sea. It has a long glowing lure that dangles in front of its mouth to attract prey. This lure is bioluminescent and acts like a "fishing rod" to lure unsuspecting prey."
}
"""
The error happens at "fishing rod" because the inner double quotes are unescaped.
Expected behavior Double quotes inside JSON string values should be escaped.
Screenshots dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Badly formed object around line 1, column 757." UserInfo={NSDebugDescription=Badly formed object around line 1, column 757., NSJSONSerializationErrorIndex=757}))): { "question": "Which deep-sea creature is known for its ability to create its own light through bioluminescence?", "question_summary": "Deep-sea creature that produces its own light", "correct_answer": "Deep-sea anglerfish", "incorrect_answer_1": "Lantern shark", "incorrect_answer_2": "Glowing jellyfish", "incorrect_answer_3": "Bioluminescent crab", "humorous_answer": "Lightbulb shrimp", "humorous_failure_reaction": "Close, but not quite as bright!", "explanation": "The Deep-sea anglerfish is a fascinating creature that has a unique adaptation for finding food in the deep sea. It has a long glowing lure that dangles in front of its mouth to attract prey. This lure is bioluminescent and acts like a "fishing rod" to lure unsuspecting prey." }
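A minimal standalone reproduction of the decode failure (Quiz here is just a stand-in Decodable type, not something from the library):

import Foundation

struct Quiz: Decodable { let explanation: String }

// Raw strings keep the quotes exactly as the model returned them.
let broken  = #"{ "explanation": "acts like a "fishing rod" to lure unsuspecting prey." }"#
let escaped = #"{ "explanation": "acts like a \"fishing rod\" to lure unsuspecting prey." }"#

do {
    _ = try JSONDecoder().decode(Quiz.self, from: Data(broken.utf8))
} catch {
    print(error) // DecodingError.dataCorrupted: "The given data was not valid JSON."
}
_ = try? JSONDecoder().decode(Quiz.self, from: Data(escaped.utf8)) // decodes without error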
I'm getting a similar error here, reproducible for me with this code/these prompts:
let query = ChatQuery(model: .gpt3_5Turbo, messages: [
.init(role: .system, content: "You are Librarian-GPT. You know everything about the books, and you use answers with a bunch of random quotation marks everywhere."),
.init(role: .user, content: "Who wrote Harry Potter? Use an answer with a bunch of quotation marks scattered around. Use double quotes which appear both singly and in pairs.")
])
for try await result in openAI.chatsStream(query: query) {
print(result)
}
Resulting in this:
OpenAITests.swift:133: error: -[OpenAITests.OpenAITests testChatsStream] : failed: caught error: "dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unterminated string around line 1, column 101." UserInfo={NSDebugDescription=Unterminated string around line 1, column 101., NSJSONSerializationErrorIndex=101})))"
I'd guess we could fix the problem here in StreamingSession with a custom JSON decoder (or a pre-processing step) that can tolerate badly formatted body text. GPT isn't guaranteed to give back valid JSON itself, so with no intermediate parsing to turn the GPT response into a valid JSON string, it makes sense that decoding fails.
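As a very rough illustration of what that pre-pass could look like (just a sketch; escapingUnescapedQuotes is a hypothetical helper, and the look-ahead heuristic below certainly won't cover every case):

import Foundation

// Heuristic repair: a quote inside a string is treated as closing it only when
// the next non-whitespace character is structural (':', ',', '}', ']').
// Anything else is assumed to be an unescaped quote inside the value.
func escapingUnescapedQuotes(in raw: String) -> String {
    let closers: Set<Character> = [":", ",", "}", "]"]
    let chars = Array(raw)
    var output = ""
    var insideString = false
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if insideString, c == "\\", i + 1 < chars.count {
            // Preserve existing escape sequences as-is.
            output.append(c)
            output.append(chars[i + 1])
            i += 2
            continue
        }
        if c == "\"" {
            if !insideString {
                insideString = true
                output.append(c)
            } else {
                // Look ahead: does this quote really terminate the string?
                var j = i + 1
                while j < chars.count, chars[j].isWhitespace { j += 1 }
                if j == chars.count || closers.contains(chars[j]) {
                    insideString = false
                    output.append(c)
                } else {
                    output.append("\\\"") // interior quote: escape it instead
                }
            }
        } else {
            output.append(c)
        }
        i += 1
    }
    return output
}

The idea is to treat a quote as closing a string only when the next non-whitespace character is structural; anything else gets escaped before the text reaches JSONDecoder.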
@brytonsf, @jcmourey,
Did you guys get function calling to work with streaming?
I attached a FunctionArguments to the Message object:
public struct Message: Codable, Hashable, Identifiable {
public enum MessageType: String, Codable, Hashable {
case function
case message
}
public var id: String
public var role: Chat.Role
public var content: String
public var createdAt: Date
public var type: MessageType
public var functionArguments: FunctionArguments?
}
public struct FunctionArguments: Codable, Hashable {
public let location: String?
}
I am parsing the function calling arguments like this:
// Fetch function arguments and assign to message variable
var functionArguments: FunctionArguments?
if let data = functionCallArguments.data(using: .utf8) {
let decoder = JSONDecoder()
if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
print("arguments.location is " + (arguments.location?? ""))
functionArguments = arguments
}
This correctly prints the arguments:
arguments.location is San Francisco
But then "location" is not displayed on the ChatBubbleView( ), probably because the streaming is not complete when the view is displayed
Text(message.functionArguments?.location ?? "location placeholder")
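I think part of it is that the accumulated arguments string only becomes valid JSON once the final delta arrives, so the try? decode above returns nil for every earlier chunk. A rough illustration, using the FunctionArguments struct above:

import Foundation

let partial  = #"{"location": "San Fra"#          // what has arrived mid-stream
let complete = #"{"location": "San Francisco"}"#  // the final accumulated string

let decoder = JSONDecoder()
let midStream = try? decoder.decode(FunctionArguments.self, from: Data(partial.utf8))  // nil
let atTheEnd  = try? decoder.decode(FunctionArguments.self, from: Data(complete.utf8)) // location == "San Francisco"
print(midStream as Any, atTheEnd as Any)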
Maybe I am doing something wrong? Is there another way to fetch the arguments? How did you guys get this working?
I got functions to work for me with streaming. I use a "for try await" loop:
var results = ""
for try await result in client.chatsStream(query: query) {
guard !Task.isCancelled else { return }
guard let argument = result.choices.first?.delta.functionCall?.arguments else { continue }
results.append(argument)
I concatenate the results and then process them manually: when I have something that looks like a complete JSON object, e.g. a completed element from a JSON array, identified by matching surrounding { }, I use JSONDecoder() on it.
let leftover = try await processPartialJson(results) // extracts and decodes completed JSON objects and returns the leftover, incomplete JSON
guard let leftover else { return }
results = leftover
}
awesome! @jcmourey
I am trying to adapt your code to the Chat Demo, but it's hard without the processPartialJson and other parts of the code...
Are you using the chat demo also? If yes, mind sharing your view model?
My processPartialJson is too specific to my use case; it would be incomprehensible. You need to work out when the accumulated stream string contains something that could be JSON-decoded, extract it, decode it with JSONDecoder, remove it from the string, and keep going.
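The general shape is something like this (a generic sketch rather than my actual processPartialJson; Element is whatever Decodable type you expect, and the escape handling is simplified):

import Foundation

// Scans the accumulated stream buffer, decodes every complete top-level
// { ... } object it finds, and returns whatever is left over for the next pass.
func extractCompleteObjects<Element: Decodable>(
    from buffer: String,
    as type: Element.Type
) throws -> (decoded: [Element], leftover: String) {
    var decoded: [Element] = []
    var depth = 0
    var insideString = false
    var previous: Character? = nil
    var objectStart: String.Index? = nil
    var consumedUpTo = buffer.startIndex

    for index in buffer.indices {
        let char = buffer[index]
        if insideString {
            // Simplified escape handling: treats \" as escaped.
            if char == "\"" && previous != "\\" { insideString = false }
        } else {
            switch char {
            case "\"":
                insideString = true
            case "{":
                if depth == 0 { objectStart = index }
                depth += 1
            case "}":
                if depth > 0 {
                    depth -= 1
                    if depth == 0, let start = objectStart {
                        let slice = buffer[start...index]
                        decoded.append(try JSONDecoder().decode(Element.self, from: Data(slice.utf8)))
                        consumedUpTo = buffer.index(after: index)
                        objectStart = nil
                    }
                }
            default:
                break
            }
        }
        previous = char
    }
    return (decoded, String(buffer[consumedUpTo...]))
}

In the streaming loop you call it on results after each append, handle the decoded elements, and keep the leftover as the new results.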
@jcmourey @brytonsf @Krivoblotsky , this is what I did that kind of works... The arguments are printed to the console, but the view doesn't get updated, so I am trying to figure out the best approach...
1- I added a FunctionArguments to the Message:
import Foundation
import OpenAI
import SwiftUI
public struct Message: Codable, Hashable, Identifiable {
public enum MessageType: String, Codable, Hashable {
case function
case message
}
public var id: String
public var role: Chat.Role // Assuming Chat.Role is Hashable
public var content: String
public var createdAt: Date
public var type: MessageType
public var functionArguments: FunctionArguments?
}
public struct FunctionArguments: Codable, Hashable {
public let location: String?
public let date: String?
}
2- Then I fetch the arguments like this:
// Fetch function arguments and assign to message variable
var functionArguments: FunctionArguments?
if let data = functionCallArguments.data(using: .utf8) {
let decoder = JSONDecoder()
if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
print("arguments.name is " + (arguments.location ?? ""))
print("arguments.font is " + (arguments.date ?? ""))
functionArguments = arguments
}
}
Full view model below:
import Foundation
import Combine
import OpenAI
/// View model for the chat screen
public final class ChatViewModel: ObservableObject {
public var openAIClient: OpenAIProtocol
@Published var conversations: [Conversation] = [] // list of conversations
@Published var conversationErrors: [Conversation.ID: Error] = [:] // conversation errors by conversation ID
@Published var selectedConversationID: Conversation.ID? // ID of the currently selected conversation
var selectedConversation: Conversation? {selectedConversationID.flatMap { id in
conversations.first { $0.id == id }
}
}
// Publisher for the currently selected conversation
var selectedConversationPublisher: AnyPublisher<Conversation?, Never> {
$selectedConversationID.receive(on: RunLoop.main).map { id in
self.conversations.first(where: { $0.id == id })
}
.eraseToAnyPublisher()
}
public init(openAIClient: OpenAIProtocol) {
self.openAIClient = openAIClient
}
// Create a new conversation
func createConversation() {
let conversation = Conversation(id: UUID().uuidString, messages: [])
conversations.append(conversation)
}
// Select a conversation
func selectConversation(_ conversationId: Conversation.ID?) {
selectedConversationID = conversationId
}
// Delete a conversation
func deleteConversation(_ conversationId: Conversation.ID) {
conversations.removeAll(where: { $0.id == conversationId })
}
// Send a message in a conversation
@MainActor
func sendMessage(
_ message: Message,
conversationId: Conversation.ID,
model: Model
) async {
guard let conversationIndex = conversations.firstIndex(where: { $0.id == conversationId }) else {
return
}
conversations[conversationIndex].messages.append(message)
await completeChat(
conversationId: conversationId,
model: model
)
}
@MainActor
func completeChat(
conversationId: Conversation.ID,
model: Model
) async {
guard let conversation = conversations.first(where: { $0.id == conversationId }) else {
return
}
conversationErrors[conversationId] = nil
do {
guard let conversationIndex = conversations.firstIndex(where: { $0.id == conversationId }) else {
return
}
let weatherFunction = ChatFunctionDeclaration(
name: "getWeatherData",
description: "Get the current weather in a given location and date",
parameters: .init(
type: .object,
properties: [
"location": .init(type: .string, description: "The city and state, e.g. San Francisco, CA"),
"date": .init(type: .string, description: "The city and state, e.g. San Francisco, CA")
],
required: ["location"]
)
)
let functions = [weatherFunction]
let chatsStream: AsyncThrowingStream<ChatStreamResult, Error> = openAIClient.chatsStream(
query: ChatQuery(
model: model,
messages: conversation.messages.map { message in
Chat(role: message.role, content: message.content)
},
functions: functions
)
)
var functionCallName = ""
var functionCallArguments = ""
for try await partialChatResult in chatsStream {
for choice in partialChatResult.choices {
let existingMessages = conversations[conversationIndex].messages
var messageType: Message.MessageType = .message // Default to message
// Function calls are also streamed, so we need to accumulate.
if let functionCallDelta = choice.delta.functionCall {
if let nameDelta = functionCallDelta.name {
functionCallName += nameDelta
}
if let argumentsDelta = functionCallDelta.arguments {
functionCallArguments += argumentsDelta
print("Debug: argumentsDelta is : \(String(describing: argumentsDelta))")
}
print("Debug: functionCallDelta is : \(String(describing: functionCallDelta))")
print("Debug: functionCallDelta.name is : \(String(describing: functionCallDelta.name))")
print("Debug: ffunctionCallDelta.arguments is : \(String(describing: functionCallDelta.arguments))")
}
// Trying to get the stream of function call argument deltas
let funcDelta = choice.delta.functionCall?.arguments ?? ""
print("Debug: funcDelta is : \(funcDelta)")
var messageText = choice.delta.content ?? ""
if let finishReason = choice.finishReason {
if finishReason == "function_call" {
messageType = .function
messageText += "Function call: name=\(functionCallName) arguments=\(functionCallArguments)"
}
}
print(functionCallArguments)
print("Debug: Entire choice object: \(choice)")
print("Debug: Entire delta object: \(choice.delta)")
// Fetch function arguments and assign to message variable
var functionArguments: FunctionArguments?
if let data = functionCallArguments.data(using: .utf8) {
let decoder = JSONDecoder()
if let arguments = try? decoder.decode(FunctionArguments.self, from: data) {
print("arguments.name is " + (arguments.location ?? ""))
print("arguments.font is " + (arguments.date ?? ""))
functionArguments = arguments
}
}
print("Debug: functionArguments object: \(String(describing: functionArguments))")
let message = Message(
id: partialChatResult.id,
role: choice.delta.role ?? .assistant,
content: messageText,
createdAt: Date(timeIntervalSince1970: TimeInterval(partialChatResult.created)),
type: messageType,
functionArguments: functionArguments
)
// Debug: Print the entire message object
print("Debug: Created Message: \(message)")
if let existingMessageIndex = existingMessages.firstIndex(where: { $0.id == partialChatResult.id }) {
// Meld into previous message
let previousMessage = existingMessages[existingMessageIndex]
let combinedMessage = Message(
id: message.id, // id stays the same for different deltas
role: message.role,
content: previousMessage.content + message.content,
createdAt: message.createdAt,
type: messageType,
functionArguments: functionArguments ?? previousMessage.functionArguments // keep arguments parsed from earlier deltas
)
conversations[conversationIndex].messages[existingMessageIndex] = combinedMessage
} else {
conversations[conversationIndex].messages.append(message)
}
}
}
} catch {
conversationErrors[conversationId] = error
}
}
}