[Bug]: Android NDK r27 rc2 miscompiles code with indirect gotos
Description
Android NDK r27 rc 2 produces invalid code for any target architecture when compiling with any nonzero optimization level.
Below is a minimized/stripped sample that shows the problem (when targeting x86_64 with -O1):
extern int printf(const char *fmt, ...);
int main() {
void* bytecode[2];
bytecode[0] = &&VM__OP_1;
bytecode[1] = &&VM__TERMINATE;
int state = 0;
int index = 0;
while (1) {
switch (state) {
case 0:
goto *bytecode[index];
case 1:
// NOTE: THIS IS ONLY REACHABLE VIA INDIRECT GOTOS
VM__OP_1:
state = 2;
break;
case 2:
printf("OP_1:(instruction=%d)\n", index);
index++;
goto *bytecode[index];
}
}
VM__TERMINATE:
printf("TERMINATE:(instruction=%d)\n", index);
return 0;
}
Link to github project: https://github.com/SanjaLV/ndk-bug-reports/tree/main/r27_rc2
Prerequisites:
- Linux/macOS machine
- ANDROID_HOME env variable that points to the Android SDK root.
- ndk;26.3.11579264 / ndk;27.0.11902837 installed with SDK manager.
How to reproduce (invalid code):
- Run make local and observe correct behavior with the system compiler.
- Run make r26 and observe correct behavior when compiling with NDK r26d.
- Run make r27 and observe incorrect program behavior.
- Run optnone and observe correct behavior with the O0 optimization level.
Correct execution should yield the following output:
OP_1:(instruction=0)
TERMINATE:(instruction=1)
Incorrect NDK r27 execution results in the following output:
TERMINATE:(instruction=0)
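The optnone step above turns off optimization for the whole build. A narrower variant, sketched here under assumptions, is to mark only the function containing the indirect gotos with clang's optnone function attribute. The macro name VM_NO_OPT and the function name run_two_step_vm are hypothetical, and this has not been verified against the affected r27 toolchain. The function mirrors the control flow of the sample above and returns the final index instead of printing it, so a correct build returns 1 and the miscompiled path would return 0.

```c
#include <assert.h>

/* Hypothetical helper macro: disable optimization for one function on clang.
   Other compilers get an empty definition. */
#if defined(__clang__)
#define VM_NO_OPT __attribute__((optnone))
#else
#define VM_NO_OPT
#endif

/* Same shape as the reduced sample: one opcode reached only through an
   indirect goto, then a terminate label. Returns the final value of index. */
VM_NO_OPT int run_two_step_vm(void) {
    void *bytecode[2];
    bytecode[0] = &&vm_op_1;      /* labels-as-values, a GNU C extension */
    bytecode[1] = &&vm_terminate;
    int state = 0;
    int index = 0;
    while (1) {
        switch (state) {
        case 0:
            goto *bytecode[index];
        case 1:
        vm_op_1:                  /* only reachable via the indirect goto */
            state = 2;
            break;
        case 2:
            index++;
            goto *bytecode[index];
        }
    }
vm_terminate:
    assert(index >= 0 && index < 2);
    return index;
}
```

Calling run_two_step_vm() and comparing the result against 1 gives a check that does not depend on parsing printed output.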
Context:
We originally discovered that upgrading the NDK from r26d to r27 r1/rc2 broke a state-machine-like bytecode interpreter. After some investigation, we found that the bug appears if and only if we enable indirect goto optimizations.
Feel free to ask for more information.
Many thanks, Aleksandrs
Upstream bug
No response
Commit to cherry-pick
No response
Affected versions
r27
Canary version
No response
Host OS
Linux
Host OS version
Ubuntu 22.04
Affected ABIs
armeabi-v7a, arm64-v8a, x86, x86_64
The miscompilation is obvious if you inspect the generated LLVM IR:
27.0.11902837/toolchains/llvm/prebuilt/linux-x86_64/bin/clang -O1 test.c -S -emit-llvm && cat test.ll
...
; Function Attrs: nofree nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #0 {
%1 = tail call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef 0)
ret i32 0
}
...
Can't repro on upstream clang https://godbolt.org/z/d9bGnhhv6 so the fix is likely already in upstream.
The bug is in SimplifyCFGPass
Before
define dso_local noundef i32 @main() #0 {
%1 = alloca [2 x ptr], align 16
store ptr blockaddress(@main, %7), ptr %1, align 16, !tbaa !5
%2 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 1
store ptr blockaddress(@main, %15), ptr %2, align 8, !tbaa !5
br label %3
3: ; preds = %0, %12
%4 = phi i32 [ 0, %0 ], [ %13, %12 ]
%5 = phi i32 [ 0, %0 ], [ %14, %12 ]
switch i32 %4, label %12 [
i32 0, label %6
i32 1, label %7
i32 2, label %9
]
6: ; preds = %3
br label %17
7: ; preds = %3, %17
%8 = phi i32 [ %18, %17 ], [ %5, %3 ]
br label %12
9: ; preds = %3
%10 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 noundef %5)
%11 = add nsw i32 %5, 1
br label %17
12: ; preds = %3, %7
%13 = phi i32 [ %4, %3 ], [ 2, %7 ]
%14 = phi i32 [ %5, %3 ], [ %8, %7 ]
br label %3, !llvm.loop !9
15: ; preds = %17
%16 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef %
ret i32 0
17: ; preds = %9, %6
%18 = phi i32 [ %11, %9 ], [ %5, %6 ]
%19 = sext i32 %18 to i64
%20 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 %19
%21 = load ptr, ptr %20, align 8, !tbaa !5
indirectbr ptr %21, [label %7, label %15]
}
After
; *** IR Dump After SimplifyCFGPass on main ***
; Function Attrs: mustprogress norecurse uwtable
define dso_local noundef i32 @main() #0 {
%1 = alloca [2 x ptr], align 16
store ptr blockaddress(@main, %6), ptr %1, align 16, !tbaa !5
%2 = getelementptr inbounds [2 x ptr], ptr %1, i64 0, i64 1
store ptr blockaddress(@main, %12), ptr %2, align 8, !tbaa !5
br label %3
3: ; preds = %0, %10
%4 = phi i32 [ 0, %0 ], [ %11, %10 ]
%5 = phi i32 [ 0, %0 ], [ %5, %10 ]
switch i32 %4, label %10 [
i32 0, label %12
i32 1, label %6
i32 2, label %7
]
6: ; preds = %3
br label %10
7: ; preds = %3
%8 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 noundef %5)
%9 = add nsw i32 %5, 1
br label %12
10: ; preds = %3, %6
%11 = phi i32 [ %4, %3 ], [ 2, %6 ]
br label %3, !llvm.loop !9
12: ; preds = %7, %3
%13 = phi i32 [ %9, %7 ], [ %5, %3 ]
%14 = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str.1, i32 noundef %
ret i32 0
}
https://r.android.com/3211575
Do we need a separate fix for r28? That's currently on clang-r530567, and I think the plan was to keep on that one.
The fix is already included in r28 (revision number for the fix is r523414).
Great, thanks for confirming.
By the way, the patch I have uploaded 'hides' the issue. The real issue is still in LLVM trunk. It could take a while to land a fix, but I'm preparing a patch for review.
I see. Hidden well enough that it's not something a user would encounter any more, or just way less common?
So the change in https://r.android.com/3211575 'updates the CFG' in a way that the later (buggy) pass can't see the problematic pattern anymore. Most likely the issue can't be reproduced in the new compiler but the current situation isn't ideal.
Right. So, fixed as far as any NDK developer would be able to tell, but there's room for improvement upstream. I'll mark it fixed here once we merge the fix into the NDK then.
I believe https://github.com/llvm/llvm-project/commit/fc6bdb8549842613da51b9d570b29e27cc709f69 is the patch causing the bug. I'm preparing a fix but reverting this should also work.
https://r.android.com/3218272 reverts the above mentioned patch.
Hey, just wanted to update you that r27b indeed fixes the minimal repro that was reported here, but sadly it still miscompiles the non-reduced code.
This is not a blocker for us, since we have disabled indirect goto optimizations for all r27 minor NDK builds, and any future major NDK release will have the proper fix.
I don't think it is worth re-opening the issue, because nobody else reported the same problem.
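One way such a switch-off can be structured, sketched here as an assumption rather than as the reporter's actual code, is a compile-time flag that selects between indirect-goto dispatch and a plain switch loop. The flag VM_USE_COMPUTED_GOTO, the opcodes, and vm_run are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; not taken from the reporter's interpreter. */
enum vm_op { VM_OP_INC, VM_OP_TERMINATE };

/* Hypothetical build flag: set to 1 to use indirect-goto dispatch on
   toolchains believed to be unaffected; 0 selects the switch fallback. */
#ifndef VM_USE_COMPUTED_GOTO
#define VM_USE_COMPUTED_GOTO 0
#endif

/* Executes the program and returns how many VM_OP_INC instructions ran.
   The program must end with VM_OP_TERMINATE. */
int vm_run(const enum vm_op *code, size_t len) {
    size_t pc = 0;
    int count = 0;
    assert(len > 0 && code[len - 1] == VM_OP_TERMINATE);
#if VM_USE_COMPUTED_GOTO
    /* Labels-as-values (GNU C extension): jump straight to each handler. */
    static void *const targets[] = { &&op_inc, &&op_terminate };
    goto *targets[code[pc]];
op_inc:
    count++;
    pc++;
    goto *targets[code[pc]];
op_terminate:
    return count;
#else
    /* Portable fallback: ordinary switch dispatch, no indirect gotos. */
    for (;;) {
        switch (code[pc]) {
        case VM_OP_INC:
            count++;
            pc++;
            break;
        case VM_OP_TERMINATE:
            return count;
        default:
            return -1; /* unknown opcode */
        }
    }
#endif
}
```

Both paths are meant to produce the same result for well-formed programs, so the flag can be flipped per toolchain version without touching the handlers' logic.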
Aditya has https://github.com/llvm/llvm-project/pull/103688 to fix this more generally in upstream.
> I don't think it is worth re-opening the issue, because nobody else reported the same problem.
We'll reopen in case we're able to opportunistically pick up the fix, but ack, we won't make a big deal out of it :)
Landed in upstream.
@appujee Can you backport to llvm-toolchain, llvm-r530567 and llvm-r522817 branches?
Hey, the original (non-reduced) problem still occurs with r28 beta 1 (28.0.12433566).
Just wanted to make sure this is a known thing. Wasn't r28 beta 1 supposed to ship with the cherry-picked fix, or are there other problems in the clang compiler?
No, that's not expected. Thanks for the heads up. @pirama-arumuga-nainar FYI
I think the fix hadn't landed when we picked the toolchain for r28 beta 1, so it doesn't include the cherry-pick r.android.com/3267834. The final release will include the fix.