Add cfg(no_128_bit) to core: removes u128/i128 formatting
This is the u128/i128 equivalent of #86048, which removed floating point formatting from the core library on cfg(no_fp_fmt_parse).
The purpose of this option is to permit linking with a custom compiler-builtins which does not support 128-bit integers. (Or no compiler-builtins at all, if some other source of memcpy etc. is available, although this patch alone is not necessarily sufficient for that). This is a situation that the Linux kernel is in; see https://github.com/torvalds/linux/blob/2df76606db9de579bc96725981db4e8daa281993/rust/compiler_builtins.rs#L51-L63 which is a custom compiler-builtins which just panics on everything. (NB that I am not involved in the Linux kernel efforts myself.)
<i128 as Debug>::fmt is defined to output i128, rather than panic as #86048 does on floats. No other formatting is supported.
Test for this is by generating a static library linked to core but not to compiler-builtins and verifying that it does not have any dependencies on 128-bit compiler-rt functions with objdump. Using 128-bit division (etc.) is not forbidden by this patch so user code may still cause calls to these functions; it is up to users to decide what they want to do in this case, as Rust can't know exactly which operations on which architectures will require compiler-builtins. Users will see a linker error if they try operations which do, so I don't believe that to be a soundness issue.
Linked issues: https://github.com/Rust-for-Linux/linux/issues/514.
r? @scottmcm
(rust-highfive has picked a reviewer for you, use r? to override)
I believe u128 is also used in some of the Duration methods before casting to floats. Maybe these methods should be removed too?
I believe u128 is also used in some of the Duration methods before casting to floats. Maybe these methods should be removed too?
I considered trying to rip it out more thoroughly and produce a core library which cannot use these functions, but that would be quite hard to maintain and test, not least because the operations which compile to these functions are LLVM’s choice and platform-specific. E.g. there doesn’t appear to be one for addition. My understanding is that usage in an inline or generic method is OK because that doesn’t end up in the finished binaries unless used, so the formatting and parsing code (which isn’t inlined) is what matters.
The job x86_64-gnu-llvm-13 failed! Check out the build log: (web) (plain)
Click to see the possible cause of the failure (guessed by this bot)
Some tests failed in compiletest suite=run-make-fulldeps mode=run-make host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu
i...................................i...ii...............F
failures:
---- [run-make] src/test/run-make-fulldeps/core-no-128-bit stdout ----
error: make failed
status: exit status: 2
command: "make"
--- stdout -------------------------------
--- stdout -------------------------------
LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit --edition=2021 --crate-type=rlib --crate-name=core ../../../../library/core/src/lib.rs --cfg no_128_bit
LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit --edition=2021 --crate-type=staticlib --crate-name=demo --sysroot=. -C panic=abort lib.rs
# Expect that objdump succeeds and grep fails. The grep pattern is for names like __multi3.
# There is no pipefail on dash so echo a string that will fail the test if objump fails.
(objdump -t /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit/libdemo.a || echo __objdumpfailedti) | (! grep -w '__[a-z]\+ti[0-9]\?')
0000000000000000 *UND* 0000000000000000 __muloti4
0000000000000000 *UND* 0000000000000000 __muloti4
--- stderr -------------------------------
make: *** [Makefile:8: all] Error 1
------------------------------------------
This seems like a policy decision to me, so I'm going to flip it over to r? @joshtriplett to have a team discussion.
I've tried measuring the impact of this PR onto the generated libdemo.a file in the added test. For that I've manually applied this PR to my local rustc checkout, and ran the command:
./x.py test --stage 1 src/test/run-make-fulldeps/core-no-128-bit
To get the state before the PR, I did:
git checkout HEAD~2 library
To turn on optimizations, I added -O to the makefile. This gives the following table (units in kb):
| What | Size Before | Size After | Savings |
|---|---|---|---|
| du | 4316 | 4236 | 80 |
| strip + du | 3052 | 2996 | 56 |
| -O + du | 2124 | 2084 | 40 |
| -O + strip + du | 1696 | 1664 | 32 |
IIRC kernels usually get deployed without debuginfo, and are usually compiled with optimizations, so the last line is probably the most relevant. 32 kilobytes of savings correspond to about 1.9% relative to the 1696 kb.
I've also tried out the already existing floating point number cfg, this time with this patch applied, but not with the flag to turn off 128 bit numbers enabled.
| What | Size Before | Size After | Savings |
|---|---|---|---|
| du | 4316 | 3352 | 964 |
| strip + du | 3052 | 2360 | 692 |
| -O + du | 2120 | 1708 | 412 |
| -O + strip + du | 1696 | 1356 | 340 |
So float parsing/formatting has about 340 kb of savings, or 20% of the 1696 binary.
It makes total sense that floats take up more space because their formatting code is way more complicated.
As the second difference between floats and 128 bit numbers, floats require the FPU to be around so are an absolute no-no in kernel space. For 128 bit numbers, they shouldn't use any special hardware so should in theory be able to run in the kernel... the reason to avoid them is just the impact on the binary size I think?
Not a kernel person, but it seems that the kernel already uses GCC's __uint128_t compiler extension. Its adoption has also been attributed with speedups. Although due to ABI issues (#54341), doing similar things as that commit in Rust might not be the best idea at the current point in time.
I think this PR makes sense, even if the impact is rather small compared to turning off float formatting, and 128 bit numbers are "just" integers, so don't require any special hardware outside of CPU, registers and memory.
floats require the FPU to be around so are an absolute no-no in kernel space
Nit: floats can be used where available, but it is still good to be able to remove them when not used (see below).
For 128 bit numbers, they shouldn't use any special hardware so should in theory be able to run in the kernel... the reason to avoid them is just the impact on the binary size I think?
In general, everything that can reasonably be removed should be so (conditionally depending on the kernel configuration, in many cases), but this does not mean the features are never used by any configuration.
In addition, even if the kernel does not use something in any config today, that itself may change over time, e.g. see The road to Zettalinux by willy.
When I made the original list long ago, I picked a few "big" things that I could think of, but really, the more that we can configure in/out, the better (just like the kernel itself).
I imagined other use cases (i.e. outside the Linux kernel) would also appreciate these options too. For instance, in embedded, constrained devices, every saved text page is welcome, and could open Rust for more use cases. Security-wise, it is also a good idea to avoid dead code. And perhaps in some safety-critical systems it may facilitate the process.
For some of these use cases it may be possible to strip things out after compilation, but probably not for all of them, at least as-is/easily (e.g. in the kernel we support external modules, so we export core, but we would need a way to mark which parts may be stripped or not, both from objects and rmetas). And even if it is possible, you lose second order benefits like compilation speed etc.
So all effort towards what I called "modularization of core" elsewhere would be welcome by several projects, I think.
Tests have moved from src/test to tests, please rebase on top of new master, if this PR is going any further. Post @rustbot ready once the rebase is complete.
Thanks!
@rustbot author
The job x86_64-gnu-llvm-13 failed! Check out the build log: (web) (plain)
Click to see the possible cause of the failure (guessed by this bot)
Some tests failed in compiletest suite=run-make-fulldeps mode=run-make host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu
ii...................................i...ii...............F
failures:
---- [run-make] tests/run-make-fulldeps/core-no-128-bit stdout ----
error: make failed
status: exit status: 2
status: exit status: 2
command: cd "/checkout/tests/run-make-fulldeps/core-no-128-bit" && AR="ar" CC="cc -ffunction-sections -fdata-sections -fPIC -m64" CXX="c++ -ffunction-sections -fdata-sections -fPIC -m64" HOST_RPATH_DIR="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" LD_LIB_PATH_ENVVAR="LD_LIBRARY_PATH" LLVM_BIN_DIR="/usr/lib/llvm-13/bin" LLVM_COMPONENTS="aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets amdgpu amdgpuasmparser amdgpucodegen amdgpudesc amdgpudisassembler amdgpuinfo amdgpuutils analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo cfguard codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver dwarflinker dwp engine executionengine extensions filecheck frontendopenacc frontendopenmp fuzzmutate globalisel hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interfacestub interpreter ipo irreader jitlink lanai lanaiasmparser lanaicodegen lanaidesc lanaidisassembler lanaiinfo libdriver lineeditor linker lto m68k m68kasmparser m68kcodegen m68kdesc m68kdisassembler m68kinfo mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcjit orcshared orctargetprocess passes perfjitevents powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo webassemblyutils windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info xcore xcorecodegen xcoredesc xcoredisassembler xcoreinfo xray" LLVM_FILECHECK="/usr/lib/llvm-13/bin/FileCheck" NODE="/usr/bin/node" PYTHON="/usr/bin/python3" RUSTC="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" RUSTDOC="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustdoc" RUST_BUILD_STAGE="stage2-x86_64-unknown-linux-gnu" RUST_DEMANGLER="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2-tools-bin/rust-demangler" S="/checkout" TARGET="x86_64-unknown-linux-gnu" TARGET_RPATH_DIR="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" TMPDIR="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit" "make"
--- stdout -------------------------------
LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit --edition=2021 --crate-type=rlib --crate-name=core ../../../library/core/src/lib.rs --cfg no_128_bit
LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit --edition=2021 --crate-type=staticlib --crate-name=demo --sysroot=. -C panic=abort lib.rs
# Expect that objdump succeeds and grep fails. The grep pattern is for names like __multi3.
# There is no pipefail on dash so echo a string that will fail the test if objump fails.
(objdump -t /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make-fulldeps/core-no-128-bit/core-no-128-bit/libdemo.a || echo __objdumpfailedti) | (! grep -w '__[a-z]\+ti[0-9]\?')
0000000000000000 *UND* 0000000000000000 __muloti4
0000000000000000 *UND* 0000000000000000 __muloti4
--- stderr -------------------------------
make: *** [Makefile:8: all] Error 1
------------------------------------------
@rustbot ready
Hey! It looks like you've submitted a new PR for the library teams!
If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.
Examples of T-libs-api changes:
- Stabilizing library features
- Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
- Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
- Changing public documentation in ways that create new stability guarantees
- Changing observable runtime behavior of library APIs
:umbrella: The latest upstream changes (presumably #108286) made this pull request unmergeable. Please resolve the merge conflicts.
This seems like a policy decision to me so I'm going to flip it over to... have a team discussion.
Ah, so this is waiting on a team decision? I'll add the label.
We discussed this in the libs meeting today and we are not inclined the add more opt-out configuration flags. The current set is already questionable and doesn't have a path to specialization.
While I can appreciate the concerns about binary size for the Linux kernel, this is fundamentally a problem caused by the way Linux does dynamic linking of modules, which prevents the linker from eliminating unused symbols. As such, this flag is not useful for the majority of embedded projects which use static linking.
The existing no_fp_fmt_parse flag also has no path to stabilization, but we're willing to keep it for now considering the large impact this has on Linux (~300KB of code). This no_128_bit only saves ~30KB, and if this is still significant for a particular kernel build then it might be worth considering disabling module support so that the linker can eliminate unused symbols.
@rfcbot fcp close
Team member @Amanieu has proposed to close this. The next step is review by the rest of the tagged team members:
- [x] @Amanieu
- [x] @cuviper
- [ ] @joshtriplett
- [x] @m-ou-se
- [x] @the8472
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
:bell: This is now entering its final comment period, as per the review above. :bell:
The final comment period, with a disposition to close, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.