
Focused/broad multiple file understanding for chat and generation?

Open djpecot opened this issue 3 years ago • 1 comments

I'm curious how Cursor feeds data into the AI and whether it is able to "see" connections across different files. For example, I'm working on a code base with a structure like this:

- data
    - dataset_cities.csv
    - dataset_restaraunts.csv
    - dataset_random.csv
- src
    - thing.py
    - thing2.py
    - utils
        - thing3.py
- modules
    - openai
- main.py 

Say, for example, that I want to chat about the current file, thing.py, and have whatever it imports at the head of the file (i.e. openai, utils.thing3, etc.) pulled into the “context” of a code generation or chat task. Based on what I’m reading, it seems like the current version only accesses the “active” file that I have open.
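To make the idea concrete, here is a rough sketch of the kind of import-following I mean. It is not how Cursor works (I don't know what it does internally); `local_import_paths` is a hypothetical helper that parses a file's imports and keeps only the ones that resolve to files inside the project, so third-party packages like openai are skipped unless they live in the repo.

```python
import ast
from pathlib import Path


def local_import_paths(source: str, project_root: Path) -> list[Path]:
    """Return project files that `source` imports (hypothetical helper)."""
    module_names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            module_names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from utils import thing3` may refer to utils itself or utils/thing3.py
            module_names.append(node.module)
            module_names.extend(f"{node.module}.{alias.name}" for alias in node.names)

    found = []
    for name in module_names:
        relative = Path(*name.split("."))
        # A module is either a single file or a package with an __init__.py
        for candidate in (relative.with_suffix(".py"), relative / "__init__.py"):
            path = project_root / candidate
            if path.is_file() and path not in found:
                found.append(path)
    return found
```

Calling this with the text of thing.py and the repo root would return the path to utils/thing3.py if thing.py imports it, and the contents of those files could then be added to the prompt alongside the active file.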

Readme just says Chat: ChatGPT-style interface that understands your **current** file

Also looked over this issue but this seems more like a cross-file Ctrl + Shift + F.

Similar to this issue

If this isn’t part of the logic, is there a way to manually “force” other code context into a chat? This seems like a really important feature, since it would give the AI a higher-level understanding of all the files in a given code base (or let it focus on specific functions within a file or module).

I peeked in the package.json but didn't detect any open-source AI modules, so not sure how the actual AI logic is processed ;)

Example: using Ctrl + K to generate code, I gave it this prompt in main.py:

# Load the rest of the files from the data folder as pandas DataFrames

To which it responded:

data_files = ['file1.csv', 'file2.csv', 'file3.csv']  # Replace with your actual file names
data_frames = [pd.read_csv(f'data/{file}') for file in data_files]

This is actually a pretty good output, except that my underlying csv files were not named file1, file2, etc.

Now if I change the directory name from data to some_data_folder, it hallucinates a response:

import pandas as pd

# Load the rest of the files from the data folder as pandas DataFrames
import os

data_folder = "data"
data_files = [f for f in os.listdir(data_folder) if f.endswith('.csv')]

dataframes = {}
for file in data_files:
    file_path = os.path.join(data_folder, file)
    dataframes[file] = pd.read_csv(file_path)

Surprisingly, it DID build the file names dynamically this time, but it missed the actual directory name.

It seems like the underlying LLM is "unaware" of context outside of the current file?

djpecot, Apr 06 '23 20:04

In older versions of Cursor, the AI could only see the currently focused file and knew if you had a selection in that file.

Since v0.2.0, Cursor has been able to insert code snippets from any file into the chat so you can ask questions about them. To do this, hit CTRL+L (or your OS-specific command) to reach the chat window. Highlight some code you would like to give the AI as context, click into the chat window as if you are about to type a question, and a button named "+ Insert Selection" will appear. When you click it, that code is sent with your query and the AI can use it as context for your question.

This has limitations, as only a certain amount of code can reasonably be sent to the AI, but it should help with many requests. Regarding the AI's visibility of folder structure, I'd have to refer to a developer on how that works behind the scenes, but from my work with it, it doesn't seem to be able to see the folder structure loaded in the IDE.
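As a rough illustration of that size limit, and of what giving the AI the folder structure could look like, here is a minimal sketch. It is not Cursor's actual logic, which I haven't seen; `build_context` and the `max_chars` budget are assumptions made up for the example.

```python
import os


def build_context(snippets: dict[str, str], root: str, max_chars: int = 6000) -> str:
    """Join a file listing and labelled snippets, stopping at a size budget (hypothetical)."""
    listing = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            listing.append(os.path.relpath(os.path.join(folder, name), root))
    # The listing goes first so the model can see real file and folder names
    parts = ["Files in project:\n" + "\n".join(sorted(listing))]
    used = len(parts[0])
    for path, code in snippets.items():
        block = f"\n\n# {path}\n{code}"
        if used + len(block) > max_chars:
            break  # budget exhausted: remaining snippets are dropped
        parts.append(block)
        used += len(block)
    return "".join(parts)
```

With a listing like this in the prompt, a request such as "load the files from the data folder" would at least have the real directory and csv names available. The listing is always included here, so on a very large project it could exceed the budget on its own and would need trimming too.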

Will list it as a potential enhancement pending review from the Cursor developers.

danperks, Apr 10 '23 10:04