Quantity::new syntax is not zero-cost on integer base units
Given the following test code, I expect all three functions to generate the same ASM. However, with the base types i32 and i16 (the two I have tested), the third function generates thousands more ASM instructions than the first two (which are identical).
pub fn test_i32(a: i32, b: i32) -> i32 {
    let power_1 = a;
    let power_2 = b;
    power_1 - power_2
}

pub fn test_struct_i32(a: i32, b: i32) -> i32 {
    let power_1 = uom::si::i32::Power {
        dimension: core::marker::PhantomData,
        units: core::marker::PhantomData,
        value: a
    };
    let power_2 = uom::si::i32::Power {
        dimension: core::marker::PhantomData,
        units: core::marker::PhantomData,
        value: b
    };
    (power_1 - power_2).value
}

pub fn test_new_i32(a: i32, b: i32) -> i32 {
    let power_1 = uom::si::i32::Power::new::<uom::si::power::watt>(a);
    let power_2 = uom::si::i32::Power::new::<uom::si::power::watt>(b);
    (power_1 - power_2).value
}
These three functions produce the same output asm when using the f32 base type.
Cargo.toml
[dependencies.uom]
version = "0.33.0"
default-features = false
features = [
    "i16", "f32", "i32",
    "si"
]
$ rustc --version
rustc 1.62.1 (e092d0b6b 2022-07-16)
My guess is that the to_base call in new() isn't being optimized out, even when the given unit matches the base unit. Non-floating-point underlying storage types haven't gotten the same attention that floating-point types have. See #261, where I would like to add tests to ensure zero-cost code gets generated. At the time that issue was created, code generation for f32 worked, but I never tested other types.
https://github.com/iliekturtles/uom/blob/10d9679193775aae04987338a31014524bd5674d/src/quantity.rs#L208-L223
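To make the hypothesis concrete, here is a minimal sketch of the shape of the problem. It is a simplified model, not uom's actual implementation: the `Unit` trait, its `NUMER`/`DENOM` constants, the `Watt`/`Kilowatt` types, and this `Power` struct and `to_base` function are all hypothetical stand-ins. It only illustrates that when the unit passed to `new` is the base unit, the conversion is the identity, so `new` and a struct literal should be equivalent.

```rust
/// Hypothetical unit description: an integer ratio to the base unit.
/// This is an assumption for illustration, not uom's real trait.
pub trait Unit {
    const NUMER: i32;
    const DENOM: i32;
}

/// Hypothetical base unit (ratio 1/1).
pub struct Watt;
/// Hypothetical non-base unit (ratio 1000/1).
pub struct Kilowatt;

impl Unit for Watt {
    const NUMER: i32 = 1;
    const DENOM: i32 = 1;
}

impl Unit for Kilowatt {
    const NUMER: i32 = 1000;
    const DENOM: i32 = 1;
}

/// Simplified quantity that stores its value in the base unit.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Power {
    pub value: i32,
}

impl Power {
    /// Converts `v` from unit `U` to the base unit before storing it.
    pub fn new<U: Unit>(v: i32) -> Self {
        Power { value: to_base::<U>(v) }
    }
}

/// Stand-in conversion: scale by the unit's ratio to the base unit.
/// For the base unit this is `v * 1 / 1`, i.e. the identity.
pub fn to_base<U: Unit>(v: i32) -> i32 {
    v * U::NUMER / U::DENOM
}
```

In this model, `Power::new::<Watt>(a)` and `Power { value: a }` hold the same value, so a compiler that sees through the conversion can emit the same code for both. The report above suggests that the real generic conversion path for integer storage types is not being reduced this way.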
I think the same thing might also explain other behaviour that I saw (which I didn't log as an issue because I wasn't entirely confident it wasn't expected in that case):
When the autoconvert feature is enabled, Eq implementations on u16 units were also resulting in significant cost. In my application I didn't actually need autoconvert, so I didn't dwell on it, but from a brief peek into the source I suspect to_base is the culprit there too.
Hmm, I have just run into the same thing. I use uom for embedded calculations on an ESP32. I am using i32 as the base type, and Quantity<...>::new was very slow compared to using raw i32. I was using opt-level = 3, but it seems that to_base was not optimized out by the compiler.