Undefined weak symbol's address
Section 5.6.1 (https://github.com/ARM-software/abi-aa/blob/844a79fd4c77252a11342709e3b27b2c9f590cf1/aaelf64/aaelf64.rst#weak-references) says the following.
During linking, the symbol value of an undefined weak reference is:
- Zero if the relocation type is absolute
- The address of the place if the relocation type is pc-relative.
It looks like the second bullet point doesn't match the linker's actual behavior. Calling an unresolved weak symbol on AArch64 is expected to be equivalent to nop and just increments the program counter to the next instruction. Here is an excerpt from the real object file as an example.
$ aarch64-linux-gnu-objdump -dr /usr/aarch64-linux-gnu/lib/crti.o
0000000000000000 <call_weak_fn>:
0: 90000000 adrp x0, 0 <__gmon_start__>
0: R_AARCH64_ADR_GOT_PAGE __gmon_start__
4: f9400000 ldr x0, [x0]
4: R_AARCH64_LD64_GOT_LO12_NC __gmon_start__
8: b4000040 cbz x0, 10 <call_weak_fn+0x10>
c: 14000000 b 0 <__gmon_start__>
c: R_AARCH64_JUMP26 __gmon_start__
10: d65f03c0 ret
If __gmon_start__ is undefined in the above code, b would fall into an infinite loop instead of behaving as nop if the address of __gmon_start__ is evaluated to the address of the relocated place. It needs to be evaluated to PC+4 instead.
It looks like it's a spec bug.
I think this is a case of when we decide when a relocation type is PC relative. The ADRP is a PC-relative instruction, but as R_AARCH64_ADR_GOT_PAGE calculates the offset to the GOT entry for __gmon_start__ and not __gmon_start__ itself. The test for PC-relative is applied to the dynamic relocation R_AARCH64_GLOB_DAT to __gmon_start__, which will return 0 if __gmon_start__ is an undefined weak.
Hope that makes sense. I think it is possible the wording could be tidied up to make it clearer, although I can't immediately think of any right now.
well a reasonable interpretation of "the relocation type is pc-relative" is that the relocation operation has "P" in it.
the operation of R_AARCH64_ADR_GOT_PAGE is "Page(G(GDAT(S+A)))-Page(P)" so it is pc-relative and thus in this expression the symbol value "S" is specified to be S=P for undef weak symbols, which is not what we want.
i think this can be resolved by changing the wording of the condition to "has P but does not have G", or by changing GDAT(S+A) to GDAT(sym+A) (i.e. the symbol address S is not evaluated in an expression with GDAT)
@nsz-arm Thanks, that's exactly why the control reached the b instruction. After linking, the above code is transformed to the following piece of code:
0000000000237a44 <call_weak_fn>:
237a44: 90000000 adrp x0, 237000 <errstring.1+0xf7a0>
237a48: 91291000 add x0, x0, #0xa44
237a4c: b4000040 cbz x0, 237a54 <call_weak_fn+0x10>
237a50: 14000000 b 237a50 <call_weak_fn+0xc>
237a54: d65f03c0 ret
The linker relaxed the GOT-loading adrp+ldr to the load immediate adrp+adr instruction pair because the weak symbol's address is PC-relative link-time constant in my test case (my binary was position-dependent.) Apparently, it handled call_weak_fn's address as S, not 0, because R_AARCH64_ADR_GOT_PAGE is defined as PC-relative (i.e. "Page(G(GDAT(S+A)))-Page(P)").
relaxing adrp+ldr to adrp+add is not valid for undef weak symbols, but i think you could relax to mov+nop (or mov+mov if the registers are different).
(btw this is an issue in ld.so and static pie early start code where it is assumed that hidden symbols can be accessed without dynamic relocation, but that's not true for hidden weakref symbols, which is the common case when dealing with linker generated symbols that not all linkers emit.. i wish the spec required adrp to relax to mov x,0 for undef weak instead of doing adrp x,0, then using adrp+add for weakref would be valid)
Relaxing adrp+ldr to adrp+add is valid for undef weak symbols if we are creating a position-dependent executable, no? The offset from a (non-relocatable) location to address 0 is a link-time constant in that case.
ah yes, adrp can work in pde (assuming the adrp location is below 4G which is the case usually and in your example)
Thanks for the suggestion on the changes we could make to the docs.
Several areas where we could improve:
- What happens when dynamic linking, i.e. the weak reference may be unresolved at static link time but it could be provided by the dynamic linker. This may be better off in the sysvabi64 document as the answer might be platform dependent.
- We have https://github.com/ARM-software/abi-aa/issues/217 which shows up a problem with how we do non 0 addends for GOT relative relocations. The spec while self consistent is difficult implement in at least lld and ld.bfd.
- How to describe weak references in the context of GOT relative relocations. I'm tending towards nsz-arm's suggestion of altering the description of GDAT, but it looks like some thought would need to go into the text, and ideally we could combine it with fixing 217.
Regarding the second point, I strongly feel we should update this psABI doc to match the reality instead of the other way around. As far as I know, only Alpha and MIPS psABIs allow addends inside a GOT entry. Alpha is already irrelevant, and MIPS seems to be going the same route. There's no reason to diverge from other psABIs at this point.
In addition to that, allowing addends inside GOT would result in a GOT size bloat because we need to create a separate GOT entry for each addend for the same symbol. It is better to just read a symbol value from GOT and add an addend afterward, and that's precisely what the current code does.