Implement server-side completion sorting and filtering
At the moment, we rely entirely on client-side sorting of completions. That's bad, and we should implement server side sorting. The big steps here are:
- Design the API for composable completion item ordering
- Implement the base version of this API
- Hack LSP into accepting serve-provided completion order
- Polish specific completion routines until the sort order is acceptable
Design
Each completion item should have a CompletionScore object. Scores are partially comparable. Score should record, precisely why we think this variant may be better than average and should return facts like "do the types align exactly?", "are we looking for a non-static function here", etc. I think it should be a bunch of bools. It currently is
pub enum CompletionScore { TypeMatch, TypeAndNameMatch }
I don't think that's the right repr.
After scores, we should implement fuzzy matching. We should look at the identifier being completed, and comput, for each CompletionItem, how good the match is, throwing away the completions which are not relevant.
Then, we should sort the final list, taking into account both the fuzzy score and CompletionScore (does it need a better name? MatchBits?)
Implementation
Fuzzy scoring needs to be implemented from scratch I think. Scores are partially there
LSP Hacking
LSP actually insists that the client does sorting&filtering, so we need to hack around that, see https://github.com/microsoft/language-server-protocol/issues/898 for details. I believe we need to:
- set
isIncompletetotrue - set sort text to
format!("{04}", i)whereiis our final order
Polishing
We produce a ton of garbage completions at the moment, so we should take care to actually design and use the appropriate scores.
cc @SomeoneToIgnore
I was thinking on making a few benchmarks and maybe a few more path-handling improvements before even starting to think about this myself.
Should we also close https://github.com/rust-analyzer/rust-analyzer/issues/4544 ?
#7945 refactors some code, renaming CompletionScore to Relevance
Thanks for writing this up! I'll give this a shot :+1:
It seems #7945 doesn't change that we don't sort type+name matches above just type matches. In the short term, is there a downside to taking the approach I proposed in #7904, which is to set the sortText for all items to some value [1, 3]?
I think long term we plan to be setting the value for sortText for all items anyway. So I think my question is partly, can this work be rolled out to users in stages, where stage 1 is the sorting and stage 2 is the filtering.
In stage 1 we could be building out lots of Relevance criteria, and it would be nice if during stage 1 we were setting sortText in such a way that users could see the improved sorting right away.
Then stage 2 would be filtering, where we'd do fuzzy matching and have to deal with the LSP Hacking you've described above (I'm still reading through the github issue linked there). Basically I can see the fuzzy matching and lsp hacking bringing on a whole other set of challenges, and it would be nice if we could avoid blocking the benefits of stage 1 behind the release of stage 2.
Does this seem feasible?
Yeah, I think that might make sense. I haven’t looked closely into that PR, but, if it works better in vs code than what we have today, I think we should merge.
On Tuesday, 9 March 2021, Josh Mcguigan [email protected] wrote:
Thanks for writing this up! I'll give this a shot 👍
It seems #7945 https://github.com/rust-analyzer/rust-analyzer/pull/7945 doesn't change that we don't sort type+name matches above just type matches. In the short term, is there a downside to taking the approach I proposed in #7904 https://github.com/rust-analyzer/rust-analyzer/pull/7904, which is to set the sortText for all items to some value [1, 3]?
I think long term we plan to be setting the value for sortText for all items anyway. So I think my question is partly, can this work be rolled out to users in stages, where stage 1 is the sorting and stage 2 is the filtering.
In stage 1 we could be building out lots of Relevance criteria, and it would be nice if during stage 1 we were setting sortText in such a way that users could see the improved sorting right away.
Then stage 2 would be filtering, where we'd do fuzzy matching and have to deal with the LSP Hacking you've described above (I'm still reading through the github issue linked there). Basically I can see the fuzzy matching and lsp hacking bringing on a whole other set of challenges, and it would be nice if we could avoid blocking the benefits of stage 1 behind the release of stage 2.
Does this seem feasible?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/rust-analyzer/rust-analyzer/issues/7935#issuecomment-794429704, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANB3M43I3FAJYOPHFSXEKDTCZ3AXANCNFSM4Y3RWYNQ .
I've updated #7904 (rebased since #7945 merged), which will fix the current sorting so type+name matches are above just type matches. I believe it will also serve as a reasonable base for continued work on building out improved relevance scoring.
See #7686 for an example of less than ideal sorting.
Each completion item should have a CompletionScore object
clangd has this: https://clangd.llvm.org/extensions.html#code-completion-scores
Triage: We now have a CompletionRelevance struct, compute a score from that, and set the LSP sortText accordingly. Is there anything left to do or can this be closed?
I think the crucial think we still don't do is the fuzzy matching of what the user has typed with the identifier we want to complete. That is, I belive current scorring completley ignores edit distance (double check me). Makes sense to open a separate issue for that maybe?
https://github.com/rust-lang/rust-analyzer/pull/13124 shows an interesting use-case for this: sometimes (bindgen mostly) the amount of symbols we want to complete is humongous, which actually creates performance problems. So, ideally, we include a pre-filtering step: not only we want to sort & cull completions after we build the whole list, but, even while we are producing completions, we should keep only the 1000s best according to fuzzy matching (computing just the label for fuzzy matching is much cheaper than rendering the whole completion item).
Closing this in favor of https://github.com/rust-lang/rust-analyzer/issues/14158 and https://github.com/rust-lang/rust-analyzer/issues/14159