Missing optimization in Wasm for `x < 0 || x > constant`
The attached .wasm file has this code at offset 95044:
95044| block $label3
95046| block $label2
95048| local.get $var6
95050| i64.const 0
95052| i64.lt_s ;; $var6 < 0 (signed)
95053| br_if $label2 ;; Jump to error branch
95055| local.get $var6
95057| i64.const 255
95060| i64.gt_s ;; $var6 > 255 (signed)
95061| i32.eqz ;; Negate the condition
95062| br_if $label3 ;; Jump to OK branch
95064| end $label2 ;; Error branch
95065| call $FormatException
95068| call $Error._throwWithCurrentStackTrace
95071| unreachable
95072| end $label3 ;; OK branch
This is a direct translation of Dart code:
if (value < 0 || value > 255) throw ...
When optimized with wasm-opt 765c61445 (current main branch) using flags
--all-features
--closed-world
--traps-never-happen
--type-unfinalizing
-Os
--type-ssa
--gufa
-Os
--type-merging
-Os
--type-finalizing
wasm-opt generates this:
local.tee $var0
i64.const 0
i64.ge_s ;; var0 >= 0 (signed)
local.get $var0
i64.const 255
i64.le_s ;; var0 =< 255 (signed)
i32.and
if ;; OK branch
...
else ;; Error branch
...
end
Here var0 >= 0 && var0 <= 255 part could be optimized to a single unsigned comparison
local.get $var0
i64.const 255
i64.le_u
if
... ;; OK branch
else
... ;; Error branch
end
Interestingly on a different program with the exact same Dart expression, wasm-opt generates different instructions, but with the same missing optimization:
local.get $var9
i64.const 0
i64.lt_s ;; $var9 < 0 (signed)
br_if $label4
local.get $var9
i64.const 255
i64.gt_s ;; $var9 > 255 (signed)
i32.eqz
br_if $label5
For this second program, an older wasm-opt (d844d2e77) generates code with a more obvious missing optimization:
local.tee $var5
i64.const 0
i64.ge_u ;; var5 >= 0 (unsigned)
local.get $var5
i64.const 255
i64.le_u ;; var5 <= 255 (unsigned)
i32.and
i32.eqz
Here ge_u against 0 will always evaluate to 1, so the code can be simplified to the same optimized code as above.
Makes sense, those two-part comparisons look like they should be optimized. However in general we know we are missing a great deal of such patterns, and our plan has been to call out to LLVM for this in the long term. If you think these particular patterns are common and urgent to get in then I can look into it, but if this is more of a general issue then maybe instead it would make sense for us to re-prioritize the LLVM work. What do you think?
OTOH the specific pattern of x >= 0 (unsigned) looks like it is already implemented. I don't see (i64.ge_u (X) (i64.const 0)) in the output of wasm-opt test.unopt.wasm -O3 -all -tnh --closed-world, and I also verified with
(module
(func $test (export "test") (param $x i32) (result i32)
(i32.ge_u
(local.get $x)
(i32.const 0)
)
)
)
$ bin/wasm-opt a.wat -O3 --print
(module
(type $0 (func (param i32) (result i32)))
(export "test" (func $test))
(func $test (param $0 i32) (result i32)
(i32.const 1)
)
)
Is it possible you need to run another cycle of optimizations, if you still see that somehow?
If you think these particular patterns are common and urgent to get in then I can look into it, but if this is more of a general issue then maybe instead it would make sense for us to re-prioritize the LLVM work. What do you think?
Would the idea be binaryen calling out to LLVM? What would that do to compile-times of binaryen?
(Somewhat offtopic) where we could also benefit:
Dart has only one integer type, namely int. It's a 64-bit int wrap-around type. So the code we emit is all 64-bit integer arithmetic & bitwise operations. Though very often such 64-bit int code could be done in 32-bit arithmetic.
The conversions between 32-bit and 64-bit come up especially when e.g. indexing arrays, getting length of arrays, ... So for example a loop over a wasm array will be using a i64 loop variable that is incrementing until end of array (which we get via array.len; i64.extend_i32_u) and the body of the loop will use i32.wrap_i64; array.get to index - where the loop induction variable could just be i32 the entire time (avoiding all i64.extend_i32_u, i32.wrap_i64).
@mkustermann
Would the idea be binaryen calling out to LLVM? What would that do to compile-times of binaryen?
Yes, that's the idea. We'd be using LLVM as a library so I don't think our compile times would be affected much (but users that want this optional feature would need to have a build of LLVM, either download it or build it).
the code we emit is all 64-bit integer arithmetic & bitwise operations. Though very often such 64-bit int code could be done in 32-bit arithmetic.
Interesting, that does sound like an opportunity to optimize. Perhaps open a new issue specifically for that? If you can provide some complete code samples that show the situation that would be helpful too.