
Autocomplete - Git diff is issued at each keystroke - no cache is implemented

Open ferenci84 opened this issue 11 months ago • 22 comments

Relevant environment info

- OS: MacOS
- Continue version: 0.9.264, 0.9.261
- IDE version: 1.97.1
- config.json:

  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b-base",
    "completionOptions": {
      "maxTokens": 100
    }
  }

Description

The git diff command is issued on every keystroke. No cache is applied to this potentially long-running operation.

My IDE and my whole system became unresponsive; basically every action that required a file operation was delayed. Under the hood, git also used the 'rg' command. The problem disappeared when I did either of these:

  • Reloaded the VS code window
  • Restarted Extensions

When I disabled the Continue extension, the problem didn't come back. When I re-enabled it, the problem returned immediately, even after a fresh system restart.

An interesting detail is that if I run 'git diff' or 'git diff --cached' manually on the same repo, it responds quickly. There may be a locking mechanism or race condition that causes problems when two instances of this git command are issued at the same time.

A few screenshots:

Image Image

I believe it's caused by this code:

// core/autocomplete/snippets/getAllSnippets.ts
const getDiffSnippets = async (
  ide: IDE,
): Promise<AutocompleteDiffSnippet[]> => {
  let diff: string[] = [];
  try {
    diff = await ide.getDiff(true);
  } catch (e) {
    console.error("Error getting diff for autocomplete", e);
  }

  return diff.map((item) => {
    return {
      content: item,
      type: AutocompleteSnippetType.Diff,
    };
  });
};

I checked that it's called from getAllSnippets, which is called directly from provideInlineCompletionItems, and no cache prevents any of those calls from being made on every invocation of provideInlineCompletionItems.
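A minimal sketch of the kind of cache that is missing here, assuming a simple time-based policy. The helper name `cachedAsync` and its parameters are hypothetical and not part of Continue; the idea is only that concurrent callers share one in-flight call and later callers reuse its result for a while.

```typescript
// Hypothetical helper (not Continue's API): wraps an expensive async call so
// that callers within `ttlMs` share one promise instead of re-running it.
function cachedAsync<T>(
  fn: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let pending: Promise<T> | undefined;
  let fetchedAt = 0;

  return () => {
    // Reuse the stored promise (settled or still in flight) while it is fresh.
    if (pending !== undefined && now() - fetchedAt < ttlMs) {
      return pending;
    }
    fetchedAt = now();
    const current = fn();
    pending = current;
    // Forget failed results so the next call retries instead of caching an error.
    current.catch(() => {
      if (pending === current) {
        pending = undefined;
      }
    });
    return current;
  };
}
```

Under these assumptions, something like `cachedAsync(() => ide.getDiff(true), 10_000)` would spawn at most one git diff per 10 seconds no matter how many keystrokes trigger a completion.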

To reproduce

Open the Git log to see that each keystroke triggers a git diff.

On my system, a large or unusually structured git repository probably made this command slow, and the many overlapping invocations made the file system unresponsive. I'm not sure how to reproduce that part, but the core problem is apparent just from looking at the Git log.

Additional note

Image

This is another, less problematic project where git diff does not have such a huge delay, so we can see what happens. The red rectangle marks where I pressed save. The commands before and after it were triggered by autocomplete. I think the save action, where the rectangle is, is where git diff should be issued (although, IMHO, the whole 'git diff' step could be dropped in favour of the last-visited-ranges and last-edited-ranges that are already implemented for autocomplete).

ferenci84 avatar Feb 13 '25 11:02 ferenci84

Can confirm. The behaviour started when upgrading to 0.0.88 of the IntelliJ-Plugin, constantly crashing the IDE after a few seconds (I type pretty fast)

Isfirs avatar Feb 14 '25 07:02 Isfirs

I provided a fix for VS Code in this PR: https://github.com/continuedev/continue/pull/4161

Please note that if there is a problem with IntelliJ, an equivalent solution should be provided for that too; until then, the issue shouldn't be closed.

ferenci84 avatar Feb 14 '25 15:02 ferenci84

I later found that the core problem, my system's resources being saturated in that specific repository, was not solved by caching the diff. I traced the problem and found that this part of getAllSnippets causes it:

Image

Investigating further, I found that too many concurrent calls are made to the "goto provider":

Image

In the screenshots below you can see the logging I added and the output I got:

Image Image

After I implemented sequential processing to avoid too many concurrent operations, I also found that the cache size is too small (see counter):

Image

You can also see what is special about the project I was working in: it uses typeorm, which has a long type definition file (tens of thousands of lines). This also means the problem can probably be reproduced by adding typeorm and importing it in the current file.

After I changed to sequential processing and increased the cache size from 50 to 500 (it stores only the file name and line numbers), the problem was mostly solved. The first completion still takes more than 2 minutes of processing, but it no longer blocks the system. The CPU shows the same level of load, so the concurrent processing I removed appears to have been unnecessary; the difference is that other necessary operations are not blocked while it runs. Subsequent completions run in a fraction of a second.

I have also disabled this processing entirely, since after debugging everything I found that ideSnippets is not used by the code that collects and consolidates all the snippets.
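The two changes described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual Continue code: `runSequentially` and `LruCache` are hypothetical names, and least-recently-used eviction is assumed for the cache.

```typescript
// Hypothetical helper: awaits each task before starting the next, instead of
// firing every request at once with Promise.all.
async function runSequentially<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    results.push(await worker(item));
  }
  return results;
}

// Minimal LRU cache built on Map's insertion order (assumed eviction policy).
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }
}
```

With an LRU policy, a working set larger than the capacity that is visited in the same order on every completion gets no hits at all, which would be consistent with a small cache helping very little on a file with many symbols.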

ferenci84 avatar Feb 15 '25 12:02 ferenci84

@ferenci84 thank you for the excellent write-up and PR. I just merged it and it will be in the next pre-release. Until it's released and verified to solve the problem, I'll leave this issue open.

sestinj avatar Feb 17 '25 05:02 sestinj

I downloaded and installed 1.0.0 and the problem is still in effect. While typing, I can watch a lot of git processes spawning in my task manager.

Isfirs avatar Feb 26 '25 09:02 Isfirs

Image

Can confirm that this problem happens: git child processes spawn as soon as I start typing.

Windows 10, Continue 1.0.1, PhpStorm 2024.3.2.1

  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "model": "qwen2.5-coder:1.5b",
    "provider": "ollama"
  },

5tr1k3r avatar Feb 27 '25 07:02 5tr1k3r

Can confirm the bug is still alive in Continue plugin 1.02 (Feb. 28, 2025). Since it still crashes the IntelliJ IDE after approx. 50 spawned Git processes, the plugin has been unusable for 2 weeks now. Is there any chance to adapt the fix from ferenci84 above for the JetBrains IDE so it makes it into the next version, 1.03? Thanks in advance...

Image

hdtvnase2k avatar Feb 28 '25 12:02 hdtvnase2k

@hdtvnase2k Right, the solution needs to be adapted. The cache is implemented only for VS Code; this is the fallback for IDEs where it's not present:

const currentTimestamp = ide.getLastFileSaveTimestamp
  ? ide.getLastFileSaveTimestamp()
  : Math.floor(Date.now() / 10000) * 10000; // Defaults to update once in every 10 seconds
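To illustrate how such a timestamp can act as a cache key, here is a rough sketch. Only `getLastFileSaveTimestamp`, `getDiff(true)` and the 10-second rounding come from the thread; the `IdeLike` interface, the parameter name `includeUnstaged` and the function names are assumptions made for the example.

```typescript
// Assumed IDE shape for illustration; the real interface may differ.
interface IdeLike {
  getDiff(includeUnstaged: boolean): Promise<string[]>;
  getLastFileSaveTimestamp?: () => number;
}

const BUCKET_MS = 10_000;

// The key changes on every save if the IDE reports saves; otherwise it
// changes once per 10-second bucket.
function diffCacheKey(ide: IdeLike, now: number = Date.now()): number {
  return ide.getLastFileSaveTimestamp
    ? ide.getLastFileSaveTimestamp()
    : Math.floor(now / BUCKET_MS) * BUCKET_MS;
}

let cachedKey: number | undefined;
let cachedDiff: Promise<string[]> | undefined;

// Re-runs the diff only when the key has changed since the last call.
function getDiffCached(ide: IdeLike, now: number = Date.now()): Promise<string[]> {
  const key = diffCacheKey(ide, now);
  if (cachedDiff === undefined || key !== cachedKey) {
    cachedKey = key;
    cachedDiff = ide.getDiff(true);
  }
  return cachedDiff;
}
```

In this sketch, an IDE without a save timestamp still runs git diff at most once per 10-second window, regardless of typing speed.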

ferenci84 avatar Feb 28 '25 18:02 ferenci84

I can confirm the same issue in IntelliJ IDEA Ultimate (2024.3.4).

GoLand (2024.3.4) seems unaffected by this issue, or at least does not crash.

EDIT: It seems (I'm not 100% sure) to depend on the repository configuration: authentication issues, missing configuration (e.g. user.email, username), or a non-existent remote repository.

lyoneel avatar Mar 03 '25 02:03 lyoneel

Can confirm the bug is still alive in Continue plugin 1.02 (Feb. 28, 2025). Since it still crashes the IntelliJ IDE after approx. 50 spawned Git processes, the plugin has been unusable for 2 weeks now.

For me, PyCharm locks up until I terminate the git processes. The processes don't seem to terminate on their own, so the lock-up is inevitable. I don't even have a git repo for the files I'm currently working on. I'm just experimenting in a single Python script, so I haven't added git.

Update: I can confirm the git sub-processes stay around because no git repo was initialized. Something about the plugin causes git to linger in its absence. The command line for each of these processes is git diff.

- Windows 10 22H2 (OS Build 19045.5608)
- PyCharm 2024.3.4 (Community Edition)
- Continue Dev Plugin v1.0.2
- git version 2.48.1.windows.1 installed via scoop.sh

infomaniac50 avatar Mar 20 '25 22:03 infomaniac50

I have made a PR for IntelliJ that adds caching for git diff: https://github.com/continuedev/continue/pull/4753

I hope it resolves the issue.

ferenci84 avatar Mar 21 '25 14:03 ferenci84

While at it, I noticed another, possibly related performance issue. Even when there is no user action, getOpenFiles and getWorkspaceDirs are called every few seconds: Image

The pattern is apparent from the log: getOpenFiles and getWorkspaceDirs are called every 2 seconds. In addition, getWorkspaceDirs is called every 5 seconds.

It's highly possible that the same happens in VS Code.

ferenci84 avatar Mar 21 '25 14:03 ferenci84

Proposed Solution for This Issue:

  1. Initialize a cache using git diff (working dir vs. staged).
  2. Attach a DocumentListener to the active editor. For modified files: Run git diff <file> and refresh the cache.
  3. Expose a function (e.g., getLatestDiffWithCache) for autocomplete features to consume the cached diffs.

Key Optimization: Instead of repeatedly scanning the entire working directory (especially costly for mid/large-scale projects), we focus only on the file being actively edited, so that code changes are tracked with minimal overhead.
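The three steps above could be sketched as follows. Apart from `getLatestDiffWithCache`, which is named in the proposal, every name here is hypothetical; the diff command is injected as a function so the sketch does not depend on any particular IDE or git binding.

```typescript
// Hypothetical: runs something like `git diff <file>` and resolves to its output.
type DiffRunner = (filePath: string) => Promise<string>;

class FileDiffCache {
  private readonly diffs = new Map<string, string>();
  private readonly runDiff: DiffRunner;

  constructor(runDiff: DiffRunner) {
    this.runDiff = runDiff;
  }

  // Step 1: seed the cache once, one file at a time, from the files that
  // currently differ (how that list is obtained is left to the caller).
  async initialize(changedFiles: readonly string[]): Promise<void> {
    for (const file of changedFiles) {
      await this.onFileModified(file);
    }
  }

  // Step 2: meant to be called from a document listener, for the edited file only.
  async onFileModified(filePath: string): Promise<void> {
    const diff = await this.runDiff(filePath);
    if (diff.length === 0) {
      this.diffs.delete(filePath); // file no longer differs
    } else {
      this.diffs.set(filePath, diff);
    }
  }

  // Step 3: autocomplete reads cached diffs and never spawns git itself.
  getLatestDiffWithCache(): string[] {
    return Array.from(this.diffs.values());
  }
}
```

In practice the listener in step 2 would probably also need debouncing, since a document listener can fire on every keystroke, which is the very pattern this issue is about.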

However, I don't feel compelled to implement this. What puzzles me most is why we'd use git diff for autocomplete when it's usually filtered out. I have to admit that in most scenarios, this data is useless. For example, when the context request has a 1024 token limit, excessive git diff data would just get filtered out anyway. I don't understand why we'd repeatedly execute such a redundant operation in a high-performance scenario. Is this git diff data actually beneficial for autocomplete? Does anyone have better insights on this?

lkk214 avatar Mar 28 '25 19:03 lkk214

@lkk214 I agree with you. I made the simplest possible implementation, but I hope the whole git diff stuff will be taken out from the autocomplete in the future (or it will be optional). I experimented with different autocomplete settings with and without git diff, and it works better without.

ferenci84 avatar Mar 28 '25 22:03 ferenci84

@sestinj Would you mind sharing your opinion on this?

ferenci84 avatar Mar 28 '25 22:03 ferenci84

a) thanks for the PR and I'll review that first thing tomorrow
b) yes, caching for IntelliJ would be an important addition
c) I'm open to adjusting our use of git diff in autocomplete. If you have some examples or other evidence of different performance I'd be curious to see what you're seeing. We'd probably run an experiment quickly and then make a decision
d) the openFiles request is done to refresh the @ files dropdown, rather than subscribing to file open/close events. That would be another valid approach, though the current approach shouldn't be a performance concern since listing open files is cheap

sestinj avatar Mar 30 '25 07:03 sestinj

@hdtvnase2k Right, the solution needs to be adapted. The cache is implemented only for VS Code; this is the fallback for IDEs where it's not present:

const currentTimestamp = ide.getLastFileSaveTimestamp
  ? ide.getLastFileSaveTimestamp()
  : Math.floor(Date.now() / 10000) * 10000; // Defaults to update once in every 10 seconds

Hello again, and thank you for your suggestion @ferenci84. Since the bug of unlimited spawning git processes still persists even in version 1.0.10 (updated on 3rd of April 2025), could you perhaps assist in implementing a fix for the IntelliJ IDEA version of this plugin so it can be contributed to a future version? This would be great. Thank you in advance.

hdtvnase2k avatar Apr 17 '25 11:04 hdtvnase2k

This bug still exists and is affecting us on IJ 2024.3.4.1 on a large Java project. As soon as there are any number of git diffs, Continue becomes unusable until disabled. I created #5335, but I suspect it's a duplicate of this.

@sestinj , is there any fix on the horizon for this?

bdavj avatar Apr 25 '25 09:04 bdavj

Key Optimization: Instead of repeatedly scanning the entire working directory (especially costly for mid/large-scale projects), we focus only on the file being actively edited, so that code changes are tracked with minimal overhead.

Hey, as mentioned, medium and large projects will definitely be affected, and large diffs will not provide any help for autocomplete.

lkk214 avatar Apr 25 '25 09:04 lkk214

@sestinj This issue seems to cause serious problems, especially for people with large Java projects. Generating diffs in those projects can get really slow and can make the extension unusable.

Honestly, do you really think including the diff has a substantial advantage over just tracking recently visited ranges and recently edited ranges? Would anyone miss it? If not, it could be taken out, or at least made configurable.

On my side, I keep it off, and I get really good completions.

ferenci84 avatar Apr 25 '25 09:04 ferenci84

@ferenci84 - how do you disable this? We don't have the diff context provider and are seeing the huge slowdowns too, even with the latest Continue version.

bdavj avatar Apr 25 '25 10:04 bdavj

@bdavj Not sure if it's working with your version but I'm using this config:

"tabAutocompleteOptions": {
    "maxPromptTokens": 1024,
    "debounceDelay": 500,
    "maxSuffixPercentage": 0.2,
    "prefixPercentage": 0.3,
    "experimental_includeClipboard": false,
    "experimental_includeRecentlyVisitedRanges": true,
    "experimental_includeRecentlyEditedRanges": false,
    "experimental_includeDiff": false,
    "useImports": true
  }

I think they exist only in json configuration.

ferenci84 avatar Apr 26 '25 05:04 ferenci84

They do only exist in the JSON config it seems. I've just done a tactical build with the diff context provider replaced with [], which makes it a lot faster.

bdavj avatar Apr 28 '25 12:04 bdavj

Thank you for your suggestion @ferenci84

@bdavj Not sure if it's working with your version but I'm using this config:

"tabAutocompleteOptions": {
    "maxPromptTokens": 1024,
    "debounceDelay": 500,
    "maxSuffixPercentage": 0.2,
    "prefixPercentage": 0.3,
    "experimental_includeClipboard": false,
    "experimental_includeRecentlyVisitedRanges": true,
    "experimental_includeRecentlyEditedRanges": false,
    "experimental_includeDiff": false,
    "useImports": true
  }

I think they exist only in json configuration.

Unfortunately this does not work for our project when using v1.0.14 JetBrains (24.04.2025) or v1.0.15 (28.04.2025). The fix suggested by @bdavj

They do only exist in the JSON config it seems. I've just done a tactical build with the diff context provider replaced with [], which makes it a lot faster.

does not work for us either.

Our IDEs freeze after 5-10 minutes of work when using the plugin.

Here again is our friendly request: could we possibly implement a fix for the problem with the help of @ferenci84, so that it can be included in the next version? Could you please help us with that, @ferenci84?

Thank you in advance.

hdtvnase2k avatar Apr 29 '25 05:04 hdtvnase2k

Same problem here.

- Continue: 1.0.15
- PyCharm: 2025.1
- OS: Windows 10

Rybo-W avatar May 08 '25 00:05 Rybo-W

Thanks in advance to everyone who has already suggested solutions for this bug. In particular, thanks to @lkk214 who had already outlined the solution:

  • Initialize a cache using git diff (working dir vs. staged).
  • Attach a DocumentListener to the active editor. For modified files: Run git diff <file> and refresh the cache.
  • Expose a function (e.g., getLatestDiffWithCache) for autocomplete features to consume the cached diffs.

And also special thanks to @ferenci84 for the concrete proposal:

I agree with you. I made the simplest possible implementation, but I hope the whole git diff stuff will be taken out from the autocomplete in the future (or it will be optional). I experimented with different autocomplete settings with and without git diff, and it works better without.

Thanks also to @sestinj, who has already given feedback on @lkk214's implementation proposal:

a) thanks for the PR and I'll review that first thing tomorrow b) yes, caching for IntelliJ would be an important addition c) I'm open to adjusting our use of git diff in autocomplete.

We would really like to use the plugin in our project, but even in the current version 1.0.16 our IDE freezes after about 5-10 minutes and the IDE process has to be terminated manually in the task manager.

We would like to implement a fix for this bug, but we would definitely need help from one of you: @lkk214 @ferenci84 @sestinj

Since we can't contact you directly via GitHub, we hereby leave our contact e-mail: [email protected] We would be very happy about any support.

Thank you in advance and keep up the good work.

hdtvnase2k avatar May 13 '25 11:05 hdtvnase2k

This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.

github-actions[bot] avatar Aug 12 '25 02:08 github-actions[bot]

This issue was closed because it wasn't updated for 10 days after being marked stale. If it's still important, please reopen + comment and we'll gladly take another look!

github-actions[bot] avatar Aug 23 '25 02:08 github-actions[bot]