Panics on non-consecutive block and vreg IDs
Hi, currently it is possible for two panics to happen when a function's SSA is being validated:
- If the block numbering does not start at `0` or is non-consecutive.
- If the vreg numbering does not start at `0` or is non-consecutive.
I ran into it after trying to use regalloc2's allocators in my toy/educational compiler project... I ended up with a hacked-up clif file, something like:
```
test run
test compile
target x86_64

function %branch_diverge(i64, i64) -> i64 system_v {
block1(v0: i64, v1: i64):
    brif v0, block2, block2

block2:
    v6 = iconst.i64 1
    v11 = iadd v6, v1
    return v11
}
; run: %branch_diverge(0, 0) == 1
```
Which worked fine in my register allocator, as it uses hashmaps, but when trying to run regalloc2's allocators I got panics. Running `clif-util test` from Cranelift passes on this file; it probably relabels all IDs before passing them to the register allocator?
I'm not sure if non-consecutive IDs violate an unwritten rule about SSA notation, but I didn't find anything in the documentation about this. If it is not allowed, ideally the SSA validation should catch it and report an error, such that users can easily understand what the problem is.
The first one happens in the second argument of the validate_ssa call. Specifically here:
https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/postorder.rs#L38
in the case where a block's index is outside of `num_blocks`.
The other one happens here: https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/ssa.rs#L19 in the case where the vreg IDs are not consecutive.
A standalone minimal example that demonstrates the two problems can be found here.
Backtraces from that example
```
thread 'main' panicked at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/postorder.rs:38:17:
index out of bounds: the len is 2 but the index is 2
stack backtrace:
0: rust_begin_unwind
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/std/src/panicking.rs:692:5
1: core::panicking::panic_fmt
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:75:14
2: core::panicking::panic_bounds_check
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:273:5
3: regalloc2::postorder::calculate
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/postorder.rs:38:17
4: regalloc2::cfg::CFGInfo::init
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/cfg.rs:53:9
5: regalloc2::cfg::CFGInfo::new
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/cfg.rs:46:9
6: regalloc2::fastalloc::run
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/fastalloc/mod.rs:1290:29
7: regalloc2::run
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/lib.rs:1600:13
8: regalloc_panic::main
at ./grus_codegen/examples/regalloc_panic.rs:173:21
9: core::ops::function::FnOnce::call_once
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
```
```
thread 'main' panicked at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/ssa.rs:19:26:
index out of bounds: the len is 4 but the index is 6
stack backtrace:
0: rust_begin_unwind
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/std/src/panicking.rs:692:5
1: core::panicking::panic_fmt
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:75:14
2: core::panicking::panic_bounds_check
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:273:5
3: <usize as core::slice::index::SliceIndex<[T]>>::index
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/index.rs:274:10
4: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/index.rs:16:9
5: <alloc::vec::Vec<T,A> as core::ops::index::Index<I>>::index
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:3361:9
6: regalloc2::ssa::validate_ssa::{{closure}}
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/ssa.rs:19:26
7: regalloc2::ssa::validate_ssa
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/ssa.rs:33:21
8: regalloc2::fastalloc::run
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/fastalloc/mod.rs:1290:9
9: regalloc2::run
at /home/ivor/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/regalloc2-0.11.1/src/lib.rs:1600:13
10: regalloc_panic::main
at ./grus_codegen/examples/regalloc_panic.rs:192:22
11: core::ops::function::FnOnce::call_once
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
```
Commits are split out:
- c536d39210aced8de6d5bdf757f64191452fe3bb adds checks that mitigate the panics.
- a3f088231e7a0d4d859709d80c3588f3e7fbb45a adds documentation to the `RegAllocError` enum, and a note in `Function` that states the requirement on the indices.
- cf441a96feb03dd5ccd6b65b7df972669f085774 adds some drive-by fixes to the documentation: it gave warnings about `[alloc_s]` not being an existing token and about `Operand::fixed` not existing (I think that should be `OperandConstraint::FixedReg` now?); lastly, `target/` in the `.gitignore` doesn't match a symlink (I always use one to a ramdisk).
With these changes, the minimal non-working example mentioned above runs cleanly and reports `Err` as a return from the run method for both allocators.
Hi @iwanders -- thanks for the report.
Actually I think the issue is a bit different, and much simpler, than this PR implies: regalloc2 requires accurate information about the range of block IDs, instruction IDs, and VReg IDs, because it pre-allocates structures indexed by these IDs. If you provide inaccurate information, the code will panic.
Non-contiguous ranges, on the other hand, are perfectly fine; the allocator does not assume that the index spaces are fully filled.
The `num_vregs`, `num_blocks` and `num_insts` methods are perhaps ambiguously named: in the case of a sparse index space, these are not really the number of used indices, but the total size of the index space. (In other words, max index plus one.) What I had meant by the names was exactly that -- "number of indices" rather than "number of things attached to these indices".
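As an illustration of that convention (a sketch, not code from regalloc2 or this thread): with a sparse set of IDs, the value these methods should report is the size of the index space, not the count of IDs in use.

```
// Sketch: the index-space size for a sparse set of IDs is the highest
// ID plus one, not the number of IDs actually in use.
fn index_space_size(ids: &[usize]) -> usize {
    ids.iter().copied().max().map_or(0, |max| max + 1)
}

fn main() {
    // Like the CLIF example above, which uses block1 and block2 but no
    // block0: two blocks in use, but an index space of size 3.
    assert_eq!(index_space_size(&[1, 2]), 3);
}
```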
So I don't think this PR is the right solution, but we should properly document this, and we should validate that all IDs are in range so we get a clearer error. Would you be willing to do that?
> because it pre-allocates structures indexed by these IDs. If you provide inaccurate information, the code will panic. The `num_vregs`, `num_blocks` and `num_insts` methods are perhaps ambiguously named: in the case of a sparse index space, these are not really the number of used indices, but the total size of the index space.
Yeah, I think this is the main thing... the docs for those functions currently state that they should return the number of blocks, not the index of the highest block plus one. So to make the code work, the only logical conclusion for me was to require the indices to be sequential and start at zero; otherwise they don't all end up with an index below `num_blocks`. Anyways, I get what you mean :+1:
> So I don't think this PR is the right solution, but we should properly document this, and we should validate that all IDs are in range so we get a clearer error. Would you be willing to do that?
Yeah, no problem, I'll update it tomorrow. The fix (c536d39210aced8de6d5bdf757f64191452fe3bb) only needs a rewording of the trace on line 20; the actual checks only check whether an ID is in bounds of the allocated container. As for the docs of the `num_blocks` function: we can just have the documentation explain that it's the number of elements in the pre-allocated containers internally, so the user should pass the highest block ID plus one, not the actual number of blocks used in the function. That explanation makes it clear why it is named like this.
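For reference, a minimal sketch of what such an in-bounds check could look like (my illustration, not the commit's code; reusing `RegAllocError::SSA` as the error variant is an assumption):

```
use regalloc2::{Block, Function, RegAllocError};

// Sketch: verify that every operand's vreg index lies inside the
// declared index space, since regalloc2 indexes pre-allocated tables
// with these raw IDs. Assumes every index in 0..num_blocks() is a
// valid block to query.
fn check_vregs_in_bounds<F: Function>(f: &F) -> Result<(), RegAllocError> {
    for b in 0..f.num_blocks() {
        for inst in f.block_insns(Block::new(b)).iter() {
            for op in f.inst_operands(inst) {
                if op.vreg().vreg() >= f.num_vregs() {
                    return Err(RegAllocError::SSA(op.vreg(), inst));
                }
            }
        }
    }
    Ok(())
}
```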
Okay, I pushed a commit that rewords the documentation and makes the checks validate against the container length.
> Non-contiguous ranges, on the other hand, are perfectly fine; the allocator does not assume that the index spaces are fully filled.
About this though, there's a loop over the possible block indices here:
https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/cfg.rs#L79-L83
And the signature for `block_insns` does not return an optional:
https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/lib.rs#L1138
The naive solution in that case is to return an empty range, from instruction 0 to instruction 0, but that results in a debug assert: https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/index.rs#L149
So, to handle non-consecutive blocks, one would have to do something like this:
```
fn block_insns(&self, block: regalloc2::Block) -> InstRange {
    println!("block: {block:?}");
    if let Some(r) = self.blocks.get(&block) {
        *r
    } else {
        println!("Don't have this block, returning a dummy range");
        InstRange::new(Inst::new(0), Inst::new(1))
    }
}
```
which is less than intuitive as a workaround? Edit: actually, I think this makes the data in the instruction -> block lookup in the CFGInfo incorrect, because it will assign the last block to that instruction range?
I wonder if, instead of `0..f.num_blocks()`, we should do a traversal of the blocks through the entry block and block successors?
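Something like this, perhaps (a sketch under the assumption of some `regalloc2::Function` implementation `f`, not tested code):

```
use regalloc2::{Block, Function};

// Sketch: visit only the blocks reachable from the entry block via
// successor edges, instead of iterating over 0..f.num_blocks(), so
// unused indices in a sparse block space are never touched.
fn reachable_blocks<F: Function>(f: &F) -> Vec<Block> {
    let mut visited = vec![false; f.num_blocks()];
    let mut stack = vec![f.entry_block()];
    let mut order = Vec::new();
    while let Some(block) = stack.pop() {
        if visited[block.index()] {
            continue;
        }
        visited[block.index()] = true;
        order.push(block);
        for &succ in f.block_succs(block) {
            stack.push(succ);
        }
    }
    order
}
```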
Doesn't that assert in `InstRange::new()` permit a range of `0..0`, as it's a `<=` comparison?
I suppose you're finding a few places where a contiguous-indices assumption has leaked in, which makes me think that maybe the most prudent solution is just to check that property. It's certainly what Cranelift provides in its input, so that will be the path of least resistance for other users as well.
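For what it's worth, a sketch of such a contiguity check (an illustration, not the merged code):

```
// Sketch: contiguous IDs starting at 0 means the used indices are
// exactly 0..n after sorting, so any gap or offset is detected.
fn ids_are_contiguous(mut ids: Vec<usize>) -> bool {
    ids.sort_unstable();
    ids.iter().copied().eq(0..ids.len())
}

fn main() {
    assert!(ids_are_contiguous(vec![0, 1, 2]));
    // block1/block2 from the CLIF example: not contiguous from 0.
    assert!(!ids_are_contiguous(vec![1, 2]));
}
```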
> Doesn't that assert in `InstRange::new()` permit a range of `0..0`, as it's a `<=` comparison?
Sorry, yes, we hit the assert one function below that one, on line 155:
https://github.com/bytecodealliance/regalloc2/blob/925df1b4674435a9322e21912926a68749517861/src/index.rs#L155
Which is called from the CFGInfo creation here.
Backtrace
```
thread 'main' panicked at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/index.rs:155:9:
assertion failed: self.len() > 0
stack backtrace:
0: rust_begin_unwind
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/std/src/panicking.rs:692:5
1: core::panicking::panic_fmt
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:75:14
2: core::panicking::panic
at /rustc/4d91de4e48198da2e33413efdcd9cd2cc0c46688/library/core/src/panicking.rs:145:5
3: regalloc2::index::InstRange::first
at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/index.rs:155:9
4: regalloc2::cfg::CFGInfo::init
at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/cfg.rs:84:60
5: regalloc2::cfg::CFGInfo::new
at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/cfg.rs:46:9
6: regalloc2::fastalloc::run
at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/fastalloc/mod.rs:1290:29
7: regalloc2::run
at /home/ivor/Documents/Code/rust/bytecodealliance/regalloc2/src/lib.rs:1614:13
```

> It's certainly what Cranelift provides in its input, so that will be the path of least resistance for other users as well.
Hmm, that sure is what it feels like. I only really ran into this because I manually hacked up a clif file and ended up with non-consecutive numbers; I'm not sure if anyone else ever ran into this situation. I think it's fine to require them to be consecutive, as long as there's a clear indication that this is required and validated.
Sorry, I'm a bit confused now. You wrote in the initial report
> Running `clif-util test` from Cranelift passes on this file; it probably relabels all IDs before passing them to the register allocator?
But now you're saying
> I only really ran into this because I manually hacked up a clif file and ended up with non-consecutive numbers,
I've confirmed that your CLIF compiles fine with Cranelift, and indeed it does so because Cranelift allocates into new ID spaces for machine-level blocks, instructions and vregs (it's a different IR layer).
I want to make sure I have the precise details of your report right: are you somehow reaching this with CLIF ("because I manually hacked up a clif file")?
> I want to make sure I have the precise details of your report right: are you somehow reaching this with CLIF ("because I manually hacked up a clif file")?
Sorry for the confusion! Running clif-util does not result in any problems; the problems merely stem from the IDs I pass to regalloc2 through my implementation of the regalloc2::Function trait. I do not have these problems when using the clif-util binary from the cranelift repository.
When I mentioned 'clif file' I meant anything that ends in .clif. In my project I was keeping the IDs as they are returned by the clif_reader crate and effectively passing those IDs directly to regalloc2. I started with a file from filetests/isa/x64 and modified it in a way that made the indices non-consecutive. Then I read that with the reader and ended up passing the non-consecutive IDs to regalloc2. Hope this clarifies what I meant and how I got here.
@cfallin, friendly ping on this; I'm also happy to change this PR to just documentation changes if you'd prefer that.
I forgot to state this in my previous message, but the original PR description did include a minimal non-working example that shows the original problem.
Sorry, this fell off my radar -- yes, given that a few consecutive-ID assumptions have slipped in, let's document it formally and check it. Happy to review a PR for that (which I guess is close to what you had originally?).
And another small thing: compiling the fuzzer gave a warning in the pipeline.
There's now an unused result here; I think we can just ignore it with `let _ =`, given that the indices will always be correctly generated and we can't really propagate the `Err` anyway?
Edit: I ignored the result in 06850ff5b7d3ee40da961155749ee049eb74445f; happy to change it into an `expect` or something if you'd prefer that.
Looks good, thanks!