Code action to organize imports
Organize Imports Diagnostic and Code Action
This PR adds a diagnostic and a code action to organize imports in the current Zig file. The code action is available in the command palette as organize @import.
It is designed to be fairly customizable, so that we can arrive at a acceptable sorting style, like different grouping or case sensitivity. All decisions, especially those around diagnostics are very much open to critique as I do not have experience with the LSP.
Closes #1637, continues work from #881.
Design
An import is a top level declaration of a variable that uses the @import builtin.
A top level declaration that uses import variables (like const print = std.debug.print;) is not considered an import and stays in place.
The sorting splits imports into three separated groups: builtin packages, packages and project files.
The builtin packages are sorted in the order: std, builtin, build_options.
The sorting inside other groups is done by comparing the declaration names (the x in const x = @import("file.zig");).
The feature is tested both in unit tests and in neovim. The unit tests required a new helper function.
Diagnostic
The diagnostic is enabled under the warn_style zls config option, which is disabled by default.
The diagnostic's range is the entire line of the import which would be moved by organizing imports.
That means one diagnostic per wrongly placed import line.
The diagnostic does not consider the correctness of separators between imports.
Code Action
The code action is available based on the diagnostic.
This means that users with the warn_style option disabled will not be offered the code action.
The code action always sorts. The previous PR had optimization using isSorted to avoid sorting when it was not necessary.
This, however, did not work well in some corner cases.
The documentation comments (///) are moved with the import they are attached to.
It also respects Top-Level Doc Comments (//!), which must appear before any expression. This is handled using the commments token location.
This feature required running generateCodeAction for each diagnostic provided in request.
Previously, it ran the function only on diagnostics from getAstCheckDiagnostics.
Considerations and notes
- Top-level declarations (that includes imports) are order-independent (Source and tested)
-
zig fmtandtextDocument/formattingdoes not interfere with the sorting. - Only root declarations are considered for sorting (no import inside functions).
- Will not work with
@import(variable);(we don't do comptime) - Resolving one diagnostic will resolve all diagnostics in the file (since all imports are moved).
- I removed the part dealing with intersection with action's range (
builder.rangefield which no longer exists). If it was relevant, I can study it further. - The code action works without
warn_styleas seen in tests. It is just that user cannot call it without the diagnostic. -
std.sort.sortdoes not exist anymore. I pickedstd.sort.heapas a replacement. My understanding is that we do not need a stable sort here, but feel free to correct me and suggest a different sort from thestdlibrary.
Unresolved questions
- [ ] Allow for
const print = std.debug.print;to be sorted as well in some way? - [ ] How to treat
@import("root")? (Currently a package, which does not seem right) - [ ] The
getSortSlicefunction ported from previous PR seems weird. I would just doreturn self.nameand not look in the import path. Any reasoning behind this? - [ ] What about grouping by directory for the file imports? Perhaps if there are > 1 file imports from the same directory, they could be in a group of their own.
- [ ] I think
DiagnosticKindcould be parsed from the diagnostic code, which would be cleaner and would easily allow more versions of the diagnostic message. Or use thedatafield of the diagnostic message. - [ ] What about
@embedFile? - [ ] How to implement ignoring the diagnostic?
Ast.zig, for example, imports extra stuff near end of the file for tests.
I'm not sure about this being behind warn_style, the style guide makes no mention of the order or location of imports.
A problem for having a "blessed" location in the file for imports, is a few projects (including zig itself) are actually moving to have all imports at the bottom of files. Due to feeling like imports at the top are usually just noise to be scrolled past to get to the meat of the file.
Also I can't get the code action to apply in vscode, neither as a quick fix nor in the command palette.
Interesting stuff! I agree, I just wanted the diagnostic to not be on by default (people have preferences and it could be annoying). So two solutions that come to mind are 1) creating an entirely new config flag or 2) scrapping the diagnostic and having the code action just available whenever.
I will take a look at the problem you mention. It is supposed to appear when having the cursor on the diagnostic.
Thank you for taking the time to revive #881.The detailed an thorough description of this PR is very helpful. I will try to answer/address some of the points that you have mentioned.
- How to treat
@import("root")? (Currently a package, which does not seem right)
There are three "builtin packages":
-
std -
builtin -
root
build_options is not one of them since it added through the build.zig. I could understand adding it to the other "builtin packages" since its a very common convention.
std.sort.sortdoes not exist anymore.
This function has been moved to std.mem.sort.
- Will not work with
@import(variable);(we don't do comptime)
Zig requires that the first parameter of @import is a string literal so this is not an issue.
- The
getSortSlicefunction ported from previous PR seems weird. I would just doreturn self.nameand not look in the import path. Any reasoning behind this?
IIRC, this function was added to make it easier to control which string/name is used to do sorting. #881 sorted imported files like const foo = @import("bar/baz.zig") by the filename (baz.zig) instead of foo. It could become useful if we decode to change the logic later on.
- I removed the part dealing with intersection with action's range (
builder.rangefield which no longer exists). If it was relevant, I can study it further.
This was added as an optimization. The given range tells us for which part of the document the code actions should be computed. If the imports are not visible then there is no need to compute the code action. It not necessary to bring this back to get this PR merged.
The documentation comments (
///) are moved with the import they are attached to.
What about non doc-comments (//)? One TODO task I had in #881 was "preserve comments". This was especially tricky for non doc-comments. The currently implementation moves them all after the imports. I could foresee a situation where the users want to keep them between the imports for some reason. I am not sure what the best approach would be. We can leave this as a followup task.
Also I can't get the code action to apply in vscode, neither as a quick fix nor in the command palette.
I have the same problem. Using the "Source Action..." command in VS Code is the only way I was able to run the code actions.
I would also prefer to not have this feature rely on warn_style. I plan on removing this option at some point to replace it with a linter. It is possible to have code actions without reporting a diagnostic. With #1093, editors that support it can run source.organizeImports code actions automatically on save.
- Allow for
const print = std.debug.print;to be sorted as well in some way?
There are two possibles I could imagine:
Move alias to the end of each section
const std = @import("std");
const Ast = std.zig.Ast;
const Server = std.zig.Server;
const lsp = @import("lsp");
const AnyTransport = lsp.AnyTransport;
const types = lsp.types;
const DocumentScope = @import("../DocumentScope.zig");
const Declaration = DocumentScope.Declaration;
const Scope = DocumentScope.Scope;
Move alias after all imports
const std = @import("std");
const lsp = @import("lsp");
const DocumentScope = @import("../DocumentScope.zig");
// Not sure if we should sort by variable name or "path". I think path is more reasonable
const Ast = std.zig.Ast;
const Server = std.zig.Server;
const AnyTransport = lsp.AnyTransport;
const types = lsp.types;
const Declaration = DocumentScope.Declaration;
const Scope = DocumentScope.Scope;
I would personally prefer the first option but I am open to suggestions.
- What about
@embedFile?
Maybe move them into their own section after all @import("file.zig") imports.
I am happy to say that I made progress on the VS Code situation.
First of all, the import organizing does work with editor.codeActionsOnSave setting:
https://github.com/user-attachments/assets/a11379e6-0e01-42d6-84c4-90cd90cd40c7
But as you pointed out, the Organize Imports does not show up in the Command pallet and the shortcut does not work either. So I investigated with other LSPs. Rust-analyzer did not work either. But Typescript LSP did.
After looking around the code, I found the difference in registering codeActionProvider.
This is the change:
--- a/src/Server.zig
+++ b/src/Server.zig
@@ -615,7 +615,9 @@ fn initializeHandler(server: *Server, arena: std.mem.Allocator, request: types.I
},
.documentHighlightProvider = .{ .bool = true },
.hoverProvider = .{ .bool = true },
- .codeActionProvider = .{ .bool = true },
+ .codeActionProvider = .{ .CodeActionOptions = .{ .codeActionKinds = &.{
+ .@"source.organizeImports",
+ } } },
.declarationProvider = .{ .bool = true },
.definitionProvider = .{ .bool = true },
.typeDefinitionProvider = .{ .bool = true },
And it's there.
I suppose it has to be explicitly listed as supported. So how do we implement this? Simply create an array of all the code actions we support?
I suppose it has to be explicitly listed as supported. So how do we implement this? Simply create an array of all the code actions we support?
I don't think there is any alternative to just hard-coding the supported code action kinds. I also noticed that Sublime Text LSP requires source.fixAll to be explicitly listed so that their lsp_code_actions_on_save setting works.
Moving aliases to the right group has been implemented. I went for grouping based on the parent import:
const abc = @import("abc.zig");
const abc_related = abc.related;
const abc_related2 = abc.related2;
const xyz = @import("xyz.zig");
const xyz_related = xyz.related;
I am interested in what you think could be improved
Thanks for collaborating and taking the time to teach me more about Zig!
The followup tasks would need some benchmarking (never optimize blindly). Do you have a recommended microbenchmark setup for Zig?
The followup tasks would need some benchmarking (never optimize blindly). Do you have a recommended microbenchmark setup for Zig?
This post on ziggit.dev may be useful to you. My usually workflow involves perf, hotspot and poop. I created a separate branch at techatrix/code-action-organize-imports-bench which features a benchmark and some optimizations. I chose to create a standalone executable instead of opting for some microbenchmarking library. It creates a synthetic zig file with various configuration option. I chose to test on files with thousands of imports to stress test the implementation and to make any startup time of the benchmark negligible in comparison.
If you wish to use this as an opportunity to learn more about optimizing code then you can just use my benchmarking program and try to optimize it yourself. I would still suggest to use f27d726035d1084b9cc8ec32f659ed534dd45ed1 because it just performs a more efficient version of calling offsets.locToRange in a loop on the same document. The changes to getImportsDecls are featured in the next commit.
Here is a showcase of the optimized version performed poop:
Show Results
poop ./zig-out/bin/bench2-safe-old ./zig-out/bin/bench2-safe-new
Benchmark 1 (3 runs): ./zig-out/bin/bench2-safe-old
measurement mean ± σ min … max outliers delta
wall_time 17.1s ± 130ms 17.0s … 17.3s 0 ( 0%) 0%
peak_rss 50.1MB ± 56.8KB 50.1MB … 50.2MB 0 ( 0%) 0%
cpu_cycles 65.4G ± 241M 65.3G … 65.7G 0 ( 0%) 0%
instructions 109G ± 42.5K 109G … 109G 0 ( 0%) 0%
cache_references 70.9M ± 2.14M 69.1M … 73.3M 0 ( 0%) 0%
cache_misses 389K ± 116K 257K … 477K 0 ( 0%) 0%
branch_misses 129M ± 38.3K 129M … 129M 0 ( 0%) 0%
Benchmark 2 (4 runs): ./zig-out/bin/bench2-safe-new
measurement mean ± σ min … max outliers delta
wall_time 1.31s ± 2.75ms 1.31s … 1.32s 0 ( 0%) ⚡- 92.3% ± 0.9%
peak_rss 108MB ± 74.6KB 108MB … 108MB 0 ( 0%) 💩+115.2% ± 0.3%
cpu_cycles 4.83G ± 8.43M 4.82G … 4.84G 0 ( 0%) ⚡- 92.6% ± 0.5%
instructions 16.4G ± 105K 16.4G … 16.4G 0 ( 0%) ⚡- 84.9% ± 0.0%
cache_references 16.4M ± 3.69M 13.9M … 21.8M 0 ( 0%) ⚡- 76.9% ± 8.7%
cache_misses 143K ± 47.9K 110K … 213K 0 ( 0%) ⚡- 63.2% ± 41.5%
branch_misses 4.45M ± 14.7K 4.43M … 4.47M 0 ( 0%) ⚡- 96.6% ± 0.0%
The speedup is less extreme on smaller (more realistic) files.
Great success!
Bench - reproduced results
❯ poop zig-out/bin/bench-orig zig-out/bin/bench-techatrix
Benchmark 1 (3 runs): zig-out/bin/bench-orig
measurement mean ± σ min … max outliers delta
wall_time 95.0s ± 2.43s 93.4s … 97.8s 0 ( 0%) 0%
peak_rss 53.1MB ± 135KB 53.0MB … 53.2MB 0 ( 0%) 0%
cpu_cycles 425G ± 1.60G 424G … 427G 0 ( 0%) 0%
instructions 1.01T ± 246K 1.01T … 1.01T 0 ( 0%) 0%
cache_references 1.58G ± 35.9M 1.55G … 1.62G 0 ( 0%) 0%
cache_misses 43.3M ± 5.08M 38.1M … 48.2M 0 ( 0%) 0%
branch_misses 141M ± 42.0K 141M … 141M 0 ( 0%) 0%
Benchmark 2 (3 runs): zig-out/bin/bench-techatrix
measurement mean ± σ min … max outliers delta
wall_time 43.2s ± 189ms 43.0s … 43.4s 0 ( 0%) ⚡- 54.5% ± 4.1%
peak_rss 111MB ± 154KB 111MB … 111MB 0 ( 0%) 💩+108.8% ± 0.6%
cpu_cycles 192G ± 1.01G 191G … 193G 0 ( 0%) ⚡- 54.7% ± 0.7%
instructions 478G ± 228K 478G … 478G 0 ( 0%) ⚡- 52.7% ± 0.0%
cache_references 540M ± 36.6M 514M … 582M 0 ( 0%) ⚡- 65.8% ± 5.2%
cache_misses 15.7M ± 1.01M 14.7M … 16.7M 0 ( 0%) ⚡- 63.8% ± 19.1%
branch_misses 11.5M ± 68.8K 11.5M … 11.6M 0 ( 0%) ⚡- 91.8% ± 0.1%
I noticed from the profiler that the lookupSymbolGlobal was taking >90% of total time.
I tried to cache known decls by string name, it got faster by about 30%, and +10% memory consumption.
Then I circumvented the lookupSymbolGlobal entirely by getting the root scope manually.
This should be okay, since we only care about decls in root scope.
This yielded another 94% improvement for a total improvement of 97%!
Bench - direct root scope lookup (improvement over the first optimization)
Benchmark 1 (3 runs): bench-techatrix
measurement mean ± σ min … max outliers delta
wall_time 42.7s ± 604ms 42.1s … 43.3s 0 ( 0%) 0%
peak_rss 111MB ± 18.8KB 111MB … 111MB 0 ( 0%) 0%
cpu_cycles 190G ± 1.55G 189G … 191G 0 ( 0%) 0%
instructions 478G ± 855K 478G … 478G 0 ( 0%) 0%
cache_references 552M ± 29.6M 530M … 585M 0 ( 0%) 0%
cache_misses 19.6M ± 947K 18.5M … 20.4M 0 ( 0%) 0%
branch_misses 11.6M ± 48.9K 11.5M … 11.6M 0 ( 0%) 0%
Benchmark 2 (3 runs): bench-root-scope-no-strings
measurement mean ± σ min … max outliers delta
wall_time 2.56s ± 23.3ms 2.54s … 2.59s 0 ( 0%) ⚡- 94.0% ± 2.3%
peak_rss 111MB ± 75.9KB 111MB … 111MB 0 ( 0%) - 0.1% ± 0.1%
cpu_cycles 9.39G ± 26.2M 9.37G … 9.42G 0 ( 0%) ⚡- 95.0% ± 1.3%
instructions 27.0G ± 413K 27.0G … 27.0G 0 ( 0%) ⚡- 94.4% ± 0.0%
cache_references 107M ± 571K 106M … 107M 0 ( 0%) ⚡- 80.6% ± 8.6%
cache_misses 5.33M ± 64.1K 5.28M … 5.40M 0 ( 0%) ⚡- 72.7% ± 7.8%
branch_misses 10.6M ± 33.6K 10.5M … 10.6M 0 ( 0%) ⚡- 8.8% ± 0.8%
Bench - total difference
Benchmark 1 (3 runs): bench-before
measurement mean ± σ min … max outliers delta
wall_time 94.9s ± 292ms 94.6s … 95.2s 0 ( 0%) 0%
peak_rss 53.2MB ± 119KB 53.1MB … 53.3MB 0 ( 0%) 0%
cpu_cycles 427G ± 1.03G 426G … 428G 0 ( 0%) 0%
instructions 1.01T ± 639K 1.01T … 1.01T 0 ( 0%) 0%
cache_references 1.57G ± 28.1M 1.54G … 1.59G 0 ( 0%) 0%
cache_misses 46.0M ± 4.86M 42.0M … 51.4M 0 ( 0%) 0%
branch_misses 141M ± 47.9K 141M … 141M 0 ( 0%) 0%
Benchmark 2 (3 runs): bench-final
measurement mean ± σ min … max outliers delta
wall_time 2.58s ± 2.60ms 2.57s … 2.58s 0 ( 0%) ⚡- 97.3% ± 0.5%
peak_rss 111MB ± 108KB 110MB … 111MB 0 ( 0%) 💩+107.9% ± 0.5%
cpu_cycles 9.41G ± 9.30M 9.40G … 9.42G 0 ( 0%) ⚡- 97.8% ± 0.4%
instructions 27.0G ± 162K 27.0G … 27.0G 0 ( 0%) ⚡- 97.3% ± 0.0%
cache_references 107M ± 1.19M 107M … 109M 0 ( 0%) ⚡- 93.2% ± 2.9%
cache_misses 5.21M ± 91.3K 5.12M … 5.30M 0 ( 0%) ⚡- 88.7% ± 16.9%
branch_misses 10.5M ± 11.1K 10.5M … 10.5M 0 ( 0%) ⚡- 92.5% ± 0.1%
At this point we are more or less IO bound.
I should have mentioned that the configuration options in the benchmarking program are meant to be used with a Release Build (zig build bench -Doptimize=ReleaseSafe). Waiting a few minutes for benchmark results can be quite annoying.
Anyway, the removal of lookupSymbolGlobal made for a nice improvement!