
Structured Output - JSON object

MorphSeur opened this issue 1 year ago • 8 comments

Is your feature request related to a problem? Please describe.

Hello! - Is there a way to enforce Open Interpreter (or LiteLLM) to output a JSON object?

Describe the solution you'd like

Integrate a JSON object output mode into Open Interpreter.

Describe alternatives you've considered

No response

Additional context

No response

MorphSeur avatar Oct 02 '24 09:10 MorphSeur

Works great for me. I just asked it "can you analyse the page at example.com and output a json object with the count of all html elements and an example text for each if applicable" and voilà: JSON output. Model: GPT-4o.

bobemoe avatar Oct 23 '24 11:10 bobemoe

Hello @bobemoe,

Thanks for your reply.

However, replicating the query you provided, I got the following:

Here is the JSON object containing the count and example text of each HTML element found on the YouTube homepage:

```json
{
    "html": {
        "count": 1,
        "example_text": "YouTubePr\u00e9sentationPresseDroit"
    },
    "head": {
        "count": 1,
        "example_text": "YouTube"
    },
    "script": {
        "count": 42,
        "example_text": "window.WIZ_global_data = {\"MUE"
    },
    "meta": {
        "count": 7,
        "example_text": ""
    },
    "link": {
        "count": 17,
        "example_text": ""
    },
    "title": {
        "count": 1,
        "example_text": "YouTube"
    },
    "style": {
        "count": 5,
        "example_text": "body{padding:0;margin:0;overfl"
    },
    "body": {
        "count": 1,
        "example_text": "Pr\u00e9sentationPresseDroits d'aut"
    },
    "iframe": {
        "count": 1,
        "example_text": ""
    },
    "ytd-app": {
        "count": 1,
        "example_text": "Pr\u00e9sentationPresseDroits d'aut"
    },
    "ytd-masthead": {
        "count": 1,
        "example_text": ""
    },
    "div": {
        "count": 291,
        "example_text": ""
    },
    "input": {
        "count": 1,
        "example_text": ""
    },
    "svg": {
        "count": 3,
        "example_text": ""
    },
    "g": {
        "count": 5,
        "example_text": ""
    },
    "path": {
        "count": 21,
        "example_text": ""
    },
    "a": {
        "count": 15,
        "example_text": ""
    },
    "defs": {
        "count": 2,
        "example_text": ""
    },
    "clippath": {
        "count": 2,
        "example_text": ""
    },
    "rect": {
        "count": 2,
        "example_text": ""
    },
    "span": {
        "count": 1,
        "example_text": ""
    }
}
```

This JSON object provides a count of each HTML element and an example text snippet (up to 30 characters) for each element type found on the YouTube homepage. If you need any further analysis or actions, please let me know!

Query used:

query = "can you analyse the page at youtube.com and output a json object with the count of all html elements and an example text for each if applicable"

As you can see, the output of Open Interpreter is Markdown containing a summary, with the requested JSON object wrapped in a ```json code fence.

What I was asking for is a way to make the output of Open Interpreter be only a JSON object.

MorphSeur avatar Oct 23 '24 12:10 MorphSeur

OK, I see. Other than changing the prompt slightly:

> from now on can you output your entire answers in json format including the answer to this. your output must not contain text outside of the json response. use python to structure it so no syntax errors creep in.

```json
{
    "message": "Sure, I will output all responses in JSON format.",
    "status": "success"
}
```

However, I think that if you used the Python API you could construct the output JSON programmatically.

Can you give an example of what you would like to prompt and what output you expect? Do you want it as a JSON file? Are you using the interpreter CLI or the Python API?

bobemoe avatar Oct 23 '24 12:10 bobemoe

It can be done by adding details to the prompt. The issue is that sometimes it works fine, and sometimes it requires regular expressions to extract the JSON.
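For reference, here is a minimal sketch of that extraction fallback, assuming the reply wraps the object in a `json` Markdown fence. The function and variable names are illustrative, not part of Open Interpreter or LiteLLM:

```python
import json
import re

# Hedged sketch: pull a JSON object out of a Markdown reply that wraps it
# in a ```json fence; fall back to parsing the whole reply otherwise.
def extract_json(markdown_reply: str) -> dict:
    match = re.search(r"```json\s*(\{.*?\})\s*```", markdown_reply, re.DOTALL)
    payload = match.group(1) if match else markdown_reply
    return json.loads(payload)

reply = 'Here is the result:\n```json\n{"div": {"count": 291}}\n```\nDone!'
print(extract_json(reply)["div"]["count"])  # 291
```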

To answer your questions: I need only a JSON object as output, and I am using the Python API.

MorphSeur avatar Oct 23 '24 12:10 MorphSeur

I am not using the Python API. Can you show your code, an example query, and how you would like the output formatted? Can you not build the JSON object from the output yourself? I think I'm out of suggestions, really, other than you showing a detailed example of what you have now and exactly what you would like to achieve.

bobemoe avatar Oct 23 '24 12:10 bobemoe

Here is the code:

```python
from interpreter import interpreter
import litellm

def open_interpreter(model, llm_prompt):
    interpreter.llm.model = model
    interpreter.llm.temperature = 0.05
    interpreter.verbose = False
    interpreter.computer.verbose = False
    interpreter.auto_run = True
    litellm.drop_params = True
    response = interpreter.chat({"role": "user", "type": "message", "content": llm_prompt})
    interpreter.messages = []
    return response

model = "azure/gpt4o-default"  # GPT-4o
# model = "azure/gpt-4"
query = "can you analyse the page at youtube.com and output a json object with the count of all html elements and an example text for each if applicable"

response = open_interpreter(model, query)
print(response[-1]["content"])
```

MorphSeur avatar Oct 23 '24 12:10 MorphSeur

I'm not sure that demonstrates your use case very well, as you've used my guess at a query, so I'm still not quite sure exactly what kind of data you are asking it to generate. However, keeping to the example, I found the best thing to do was to ask it to use Python to create a JSON file and then load that in:

```python
from interpreter import interpreter
import litellm
import json

def open_interpreter(model, llm_prompt):
    interpreter.llm.model = model
    interpreter.llm.temperature = 0.05
    interpreter.verbose = False
    interpreter.computer.verbose = False
    interpreter.auto_run = True
    litellm.drop_params = True
    response = interpreter.chat({"role": "user", "type": "message", "content": llm_prompt})
    interpreter.messages = []
    return response

model = "gpt-4o"  # GPT-4o

query = "use python script to analyse the page at youtube.com and output a json object with the count of all html elements and an example text for each if applicable. save the json to a file called out.json. beautifulsoup, requests and json are already install no need to install anything"

response = open_interpreter(model, query)
# print(response[-1]["content"])

# Open the JSON file the model was asked to write
with open("out.json", "r") as file:
    data = json.load(file)

# Now 'data' contains the parsed JSON data as a Python dictionary
print("Results as json:")
print(data)
```

> The issue is sometimes it works fine, sometimes it requires REGEX formulas to extract the JSON.

I think that is the nature of the beast: you are always going to get a slightly different result each time. My method above, asking it to generate a JSON file with Python, should ensure the JSON is always readable, but the structure within may vary, making it hard to parse for whatever purpose you want. You can of course ask it to adhere to a specific format. It is very good at following examples, so you could include a sample in your prompt. That does mean you'll need to give some specific examples and expected formats.
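A hedged sketch of that "include a sample in your prompt" idea: embed the exact shape you expect in the prompt, then validate what comes back before using it. The keys, prompt wording, and helper name here are illustrative assumptions:

```python
import json

# Illustrative: the per-tag shape we show the model and then check for.
EXPECTED_KEYS = {"count", "example_text"}

prompt = (
    "analyse the page and save out.json where every value matches exactly "
    'this shape: {"count": 0, "example_text": ""}'
)

def matches_expected_shape(data: dict) -> bool:
    # Every tag entry must carry exactly the keys shown in the prompt sample.
    return all(set(entry) == EXPECTED_KEYS for entry in data.values())

sample = json.loads('{"div": {"count": 291, "example_text": ""}}')
print(matches_expected_shape(sample))  # True
```

If validation fails, you can re-prompt with the error rather than falling back to regex surgery on the reply.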

bobemoe avatar Oct 23 '24 17:10 bobemoe

Thanks for your reply! Fully understood! I even use a prompt that clearly explains which JSON object to return, with its key names. I was only replicating the example you shared, since you mentioned that you got a JSON object as output.

The ultimate goal is to have only a JSON object as the output of Open Interpreter and to avoid using regular expressions to parse the output.

LangChain and the OpenAI API, for example, allow setting an argument that forces the LLM to output only a JSON object.
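That argument is OpenAI's "JSON mode" (`response_format={"type": "json_object"}`), which LiteLLM also passes through for models that support it. A hedged sketch follows; the call itself is commented out because it needs credentials, and the model name is an assumption carried over from the code above:

```python
import json

# import litellm
#
# response = litellm.completion(
#     model="azure/gpt4o-default",  # assumption: this deployment supports JSON mode
#     response_format={"type": "json_object"},
#     messages=[{"role": "user", "content": "Return the element counts as a JSON object."}],
# )
# content = response.choices[0].message.content

# With JSON mode the reply content is a parseable JSON object, so no
# regex extraction is needed:
content = '{"div": {"count": 291, "example_text": ""}}'  # stand-in for a real reply
data = json.loads(content)
print(data["div"]["count"])  # 291
```

Note this constrains the raw LLM reply; whether Open Interpreter exposes it through its own chat loop is exactly what this issue is asking about.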

MorphSeur avatar Oct 23 '24 17:10 MorphSeur

Prompting is currently the best way to achieve this but we're exploring more possibilities for structured output!

MikeBirdTech avatar Nov 04 '24 14:11 MikeBirdTech

Thanks @MikeBirdTech and @bobemoe ! Looking forward to this feature.

MorphSeur avatar Nov 04 '24 14:11 MorphSeur