f64::round doesn't work properly on arm-unknown-linux-gnueabi

Open vklachkov opened this issue 1 year ago • 5 comments

I'm cross-compiling for the target arm-unknown-linux-gnueabi, and I found that f64::round() returns incorrect results. I tried this code:

fn main() {
    let value: f64 = 15.44;
    dbg!(value);
    dbg!(value.round());
}

I expected to see:

[bug/src/main.rs:3:5] value = 15.44
[bug/src/main.rs:4:5] value.round() = 15.0

But instead, in the debug build I see:

[bug/src/main.rs:3:5] value = 15.44
[bug/src/main.rs:4:5] value.round() = 0.0

In the release build the output is usually correct, but sometimes f64::round() returns 0 or 1 instead of the correctly rounded value. I can't reliably reproduce this bug in the release build.

I build this code in a Docker container. See more details under the spoiler.

Build details

I build the code with these commands:

docker build -f Dockerfile -t build_image .

docker run \
  -v $HOME/.cargo/registry:/root/.cargo/registry \
  -v $(pwd):/src \
  -t build_image

Dockerfile:

FROM debian:11-slim

# Install Rust
RUN apt-get update && apt-get install -y curl
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

# Install tools
RUN apt-get update && apt-get install -y \
    crossbuild-essential-armel

# Install target for ARMv6
RUN rustup target add arm-unknown-linux-gnueabi

# Build
RUN mkdir -p /src
WORKDIR /src

# Remove --release for debug build
CMD cargo build --release --target=arm-unknown-linux-gnueabi

Cargo.toml:

[package]
name = "bug"
version = "0.1.0"
edition = "2021"

[dependencies]

[profile.release]
opt-level = 3
strip = "debuginfo"
panic = "abort"

.cargo/config.toml:

[target.arm-unknown-linux-gnueabi]
linker = "arm-linux-gnueabi-gcc"
rustflags = ["-L", "/usr/lib/arm-linux-gnueabi"]

binaries.tar.gz source.tar.gz

Meta

rustc --version --verbose:

rustc 1.76.0 (07dca489a 2024-02-04)
binary: rustc
commit-hash: 07dca489ac2d933c78d3c5158e3f43beefeb02ce
commit-date: 2024-02-04
host: x86_64-unknown-linux-gnu
release: 1.76.0
LLVM version: 17.0.6

vklachkov avatar Mar 10 '24 14:03 vklachkov

With -Copt-level=0, round looks like this:

        push    {r11, lr}
        mov     r11, sp
        sub     sp, sp, #8
        bl      round
        str     r1, [sp, #4]
        str     r0, [sp]
        ldr     r0, [sp]
        ldr     r1, [sp, #4]
        mov     sp, r11
        pop     {r11, pc}

With optimizations enabled:

        push    {r11, lr}
        bl      round
        pop     {r11, pc}

I'm guessing that this is an LLVM backend-specific issue, but I don't have any 32-bit ARM on hand to test these.

saethlin avatar Mar 11 '24 23:03 saethlin
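Both listings end in a plain `bl round`, so the call resolves to whatever `round` symbol the linker finds, which is normally the one from the C math library. Below is a minimal sketch, assuming the target's libm is linked as usual, that calls that symbol directly so its result can be compared with `f64::round`. `libm_round` is a hypothetical helper name, not something from the report. The `extern "C"` block is written for edition 2021, as in the Cargo.toml above; edition 2024 requires `unsafe extern "C"` instead.

```rust
// Direct binding to the C math library's `round`, i.e. the symbol that
// the `bl round` instruction in the listings above branches to.
// Assumption: libm is linked, which is the default for std programs on Linux.
extern "C" {
    fn round(x: f64) -> f64;
}

/// Hypothetical helper: calls C `round` without going through `f64::round`.
fn libm_round(x: f64) -> f64 {
    // SAFETY: C `round` takes a double by value, returns a double,
    // and has no other preconditions.
    unsafe { round(x) }
}
```

Printing `libm_round(15.44)` next to `15.44_f64.round()` on the device would show whether the two paths agree. What a mismatch or a match would imply is not established anywhere in this thread.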

Realizing that the problem is more likely related to compiling the float formatting machinery. We've for sure found miscompiles in there before.

saethlin avatar Mar 11 '24 23:03 saethlin

@saethlin

With -Copt-level=0, round looks like this

I also looked at the assembly, but I still couldn't work out where round is defined or what the root cause of the issue is.

I don't have any 32-bit ARM on hand to test these

I'm reproducing this issue on a Raspberry Pi Zero, but I think this bug can also be reproduced in QEMU.

Realizing that the problem is more likely related to compiling the float formatting machinery. We've for sure found miscompiles in there before.

Literally on the same day, while dealing with f64::round, I encountered another bug: https://github.com/rust-lang/rust/issues/46950. It is another broken floating-point operation that can also be worked around with a naive implementation.

Can I help with this bug?

vklachkov avatar Mar 12 '24 07:03 vklachkov
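For illustration, a "naive implementation" workaround of the kind mentioned above might look like the sketch below. It is an assumption about what such a workaround could be, not code from the thread, and `naive_round` is a hypothetical name. It rounds half away from zero like `f64::round`, but it returns +0.0 where the standard function would return -0.0 (for example for -0.4). Whether it actually avoids the problem on the affected target is untested.

```rust
/// Hypothetical stand-in for `f64::round` that avoids calling libm's `round`.
/// Rounds half away from zero. Unlike `f64::round`, it does not preserve
/// the sign of zero (e.g. -0.4 gives +0.0 instead of -0.0).
fn naive_round(x: f64) -> f64 {
    // 2^52: from this magnitude on, every finite f64 is already an integer.
    const TWO_POW_52: f64 = 4_503_599_627_370_496.0;

    // NaN, infinities and very large values are returned unchanged.
    if x.is_nan() || x >= TWO_POW_52 || x <= -TWO_POW_52 {
        return x;
    }

    // The float-to-int cast truncates toward zero; |x| < 2^52 fits in i64.
    let truncated = x as i64 as f64;
    let frac = x - truncated;

    if frac >= 0.5 {
        truncated + 1.0
    } else if frac <= -0.5 {
        truncated - 1.0
    } else {
        truncated
    }
}
```

Comparing the fractional part against 0.5 instead of computing `(x + 0.5).floor()` keeps inputs just below one half, such as 0.49999999999999994, from being rounded up.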

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

apiraino avatar Mar 12 '24 10:03 apiraino

Can I help with this bug?

Maybe. Usually the best thing to do here is to minimize the reproducer. The fact that your reproducer uses the standard library formatting is a bit vexing, because if that's where the code pattern is that's being miscompiled, you have a lot of code to start from.

There are other formatting crates, so I'd see if this reproduces with one of them. It's pretty easy to find a handful of them by searching crates.io. Then I'd download the source and start minimizing it as much as you can while retaining the "different behavior in debug and release on 32-bit ARM" property.

saethlin avatar Mar 12 '24 21:03 saethlin
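One way to take formatting out of the picture entirely while minimizing is to report the result through the process exit code instead of printing it. The sketch below assumes that approach; `round_check` is a hypothetical helper, and the 15.44 and 15.0 values in the usage note come from the original report.

```rust
use std::hint::black_box;

/// Hypothetical helper: returns 0 when `f64::round` yields exactly
/// `expected` for `input`, and 1 otherwise. No formatting code is involved.
fn round_check(input: f64, expected: f64) -> i32 {
    // black_box keeps the compiler from evaluating the call at compile time,
    // so the rounding happens at run time.
    let rounded = black_box(input).round();

    // Compare raw bit patterns so the check is a plain integer comparison.
    if rounded.to_bits() == expected.to_bits() {
        0
    } else {
        1
    }
}
```

A `main` that only calls `std::process::exit(round_check(15.44, 15.0))` can then be run on the device and checked with `echo $?`.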

@saethlin

I decided not to try to make a minimal repro, because I was afraid that any code of my own could introduce new bugs. Instead, I went looking for a very simple solution where it would be clearly visible that the problem is in f64::round().

I abandoned std formatting and used ryu and libc::write instead. I know for sure that ryu and libc::write work flawlessly: they pass all their tests, and errors in these libraries would have been noticed long ago.

As a result, I wrote this repro:

Code
fn main() {
  // Debug fail, release ok:
  // print_float(15.44f64);
  // print_float(15.44f64.round());

  // Debug fail, release fail:
  // print_float(std::hint::black_box(get_float()));
  // print_float(std::hint::black_box(get_float()).round());
}

#[inline(never)]
fn get_float() -> f64 {
  15.44
}

#[inline(never)]
fn print_float(value: f64) {
  let mut buffer = ryu::Buffer::new();
  let printed = buffer.format(value);

  print_str(printed);
  print_str("\n");
}

#[inline(always)]
fn print_str(s: &str) {
  unsafe { libc::write(1, s.as_ptr() as _, s.len()) };
}

The difference between the first pair of calls and the second pair is that in the first case Rust optimizes the f64::round call away, while in the second it cannot. When the call is not optimized away, I see that “bl round” returns an invalid value.

To be even more confident, I abandoned ryu altogether and simply printed the bytes of the rounded float:

No float format code
fn main() {
  print_float(std::hint::black_box(get_float()));
  print_float(std::hint::black_box(get_float()).round());
}

#[inline(never)]
fn get_float() -> f64 {
  15.44
}

#[inline(never)]
fn print_float(value: f64) {
  let printed = format!("{:X?}", value.to_ne_bytes());
  print_str(&printed);
  print_str("\n");
}

#[inline(always)]
fn print_str(s: &str) {
  unsafe { libc::write(1, s.as_ptr() as _, s.len()) };
}

And got zeroes:

[E1, 7A, 14, AE, 47, E1, 2E, 40]
[0, 0, 0, 0, 0, 0, 0, 0]

If necessary, I will post this code on GitHub and attach binaries.

I don't know what the problem is with f64::round, but I see that std format has nothing to do with it.

vklachkov avatar Mar 14 '24 20:03 vklachkov
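The two byte arrays printed above can be decoded on any machine to confirm what they represent. This is a minimal sketch, assuming the bytes are in little-endian order (the report prints them with to_ne_bytes on arm-unknown-linux-gnueabi, which is a little-endian target). `decode_le` and `expected_rounded_bytes` are hypothetical helper names.

```rust
/// Hypothetical helper: reinterprets eight little-endian bytes as an f64.
fn decode_le(bytes: [u8; 8]) -> f64 {
    f64::from_le_bytes(bytes)
}

/// Hypothetical helper: the bytes a correct `15.44_f64.round()` (that is,
/// 15.0) would have on a little-endian target.
fn expected_rounded_bytes() -> [u8; 8] {
    15.0_f64.to_le_bytes()
}
```

Decoding the first printed line gives 15.44 and the second gives 0.0. A correct rounding would have printed [0, 0, 0, 0, 0, 0, 2E, 40].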