[AVR] Optimize 'call' to 'rcall' for short programs
As a possible optimization, we could look into using rcall instead of call instructions when the target is close enough. For example, here: https://godbolt.org/z/rEz9j71dq (apparently avr-gcc doesn't do this optimization).
int foo(int a, int b) {
    return a + b;
}
int bar(int a, int b) {
    return foo(a, b) + 3;
}
If -ffunction-sections is not used, rcall is both smaller (2 bytes instead of 4) and faster (one cycle fewer) than call.
We need to investigate whether this can be done by linker relaxation.
What's more, we can check whether other relaxation optimizations can be done in lld for AVR.
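To make "close enough" concrete, here is a minimal sketch of the range check a relaxation pass would need. It relies on the rcall encoding from the AVR instruction set: a signed 12-bit word displacement k, with the target computed as PC + k + 1. The function name rcall_reachable and the use of byte addresses are assumptions for illustration only; this is not code from lld or binutils, and it ignores the fact that shrinking earlier instructions moves later addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an lld or binutils API).
 * Returns true if an rcall placed at byte address call_addr can reach
 * target_addr. RCALL holds a signed 12-bit word displacement k,
 * -2048 <= k <= 2047, and the CPU jumps to PC + k + 1 (in words).
 */
bool rcall_reachable(uint32_t call_addr, uint32_t target_addr)
{
    /* AVR instructions are 16-bit aligned, so byte addresses are even. */
    assert(call_addr % 2 == 0 && target_addr % 2 == 0);

    /* Word displacement relative to the instruction after the rcall. */
    int32_t k = ((int32_t)target_addr - (int32_t)call_addr) / 2 - 1;

    return k >= -2048 && k <= 2047;
}
```

With this check, a call at 0x0100 to a function at 0x0200 would qualify for rcall, while a target more than 4 KB away would have to stay a call.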
apparently avr-gcc doesn't do this optimization
Link with -mrelax, which performs other optimizations, too. Note that some sections, such as .vectors or .jumptables, must not be optimized.
Also, if you are relaxing, the assembler must not relax by itself; all relaxations must be postponed until link time.
Sure. Thanks.
It seems impossible to do this call -> rcall transform with clang + gnu-avr-ld.
What's the problem? All that has to be done is to pass -mrelax to ld.
Sure, my mistake. clang's AVR driver does not handle -mrelax/-mno-relax properly.
fixed by: https://reviews.llvm.org/D144617 https://reviews.llvm.org/D144620