Android: 64-bit ARM apk crashes during startup.
Describe the bug: Android builds for 64-bit ARM crash on startup with the following error message:
04-24 05:40:48.296 9815 9840 D simpleservo: thread 'Script(1,1)' panicked at /home/mukilan/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/8603cbf/mozjs-sys/src/jsval.rs:217:5:
04-24 05:40:48.296 9815 9840 D simpleservo: assertion `left == right` failed
04-24 05:40:48.296 9815 9840 D simpleservo: left: 12970366926827028480
04-24 05:40:48.296 9815 9840 D simpleservo: right: 0
To Reproduce:
- ./mach build --target=aarch64-linux-android
- With arm64 based device connected to host machine and developer mode enabled: ./mach install --target=aarch64-linux-android
- ./mach run --android
More info: I was able to generate a backtrace by setting RUST_BACKTRACE=full in the JNI port's init method.
04-27 13:13:27.969 22275 22314 D simpleservo: thread 'Script(1,1)' panicked at /home/mukilan/.cargo/git/checkouts/mozjs-fa11ffc7d4f1cc2d/8603cbf/mozjs-sys/src/jsval.rs:217:5:
04-27 13:13:27.969 22275 22314 D simpleservo: assertion `left == right` failed
04-27 13:13:27.969 22275 22314 D simpleservo: left: 12970366926827028480
04-27 13:13:27.969 22275 22314 D simpleservo: right: 0
04-27 13:13:27.969 22275 22314 D simpleservo: stack backtrace:
04-27 13:13:27.969 22275 22305 D simpleservo: simpleservo::simpleservo: done perform_updates
04-27 13:13:27.969 22275 22332 D simpleservo: simpleservo: wakeup
04-27 13:13:27.970 22275 22305 D simpleservo: simpleservo: performUpdates
04-27 13:13:27.970 22275 22305 D simpleservo: simpleservo::simpleservo: perform_updates
04-27 13:13:27.971 22275 22305 D simpleservo: simpleservo::simpleservo: done perform_updates
04-27 13:13:27.974 22275 22314 D simpleservo: 0: 0x72e68eef4c - <unknown>
04-27 13:13:27.974 22275 22314 D simpleservo: 1: 0x72e691ad08 - <unknown>
04-27 13:13:27.974 22275 22314 D simpleservo: 2: 0x72e68ea1b4 - <unknown>
04-27 13:13:27.974 22275 22314 D simpleservo: 3: 0x72e68eed78 - <unknown>
04-27 13:13:27.974 22275 22314 D simpleservo: 4: 0x72e68f0438 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 5: 0x72e68f0030 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 6: 0x72e68f0c10 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 7: 0x72e68f09c8 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 8: 0x72e68ef414 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 9: 0x72e68f0720 - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 10: 0x72e691807c - <unknown>
04-27 13:13:27.975 22275 22314 D simpleservo: 11: 0x72e6918408 - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 12: 0x72e62ad140 - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 13: 0x72e1893e1c - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 14: 0x72e219fff4 - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 15: 0x72e20ea27c - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 16: 0x72e1c5c6ec - <unknown>
04-27 13:13:27.976 22275 22314 D simpleservo: 17: 0x72e13eb5b8 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 18: 0x72e233f3b4 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 19: 0x72e1c579e8 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 20: 0x72e13e5740 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 21: 0x72e1031c24 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 22: 0x72e100dff8 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 23: 0x72e1c46190 - <unknown>
04-27 13:13:27.977 22275 22314 D simpleservo: 24: 0x72e1c3cf54 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 25: 0x72e1c61be0 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 26: 0x72e1c4cff8 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 27: 0x72e13e83b8 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 28: 0x72e13e8568 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 29: 0x72e1c4b4dc - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 30: 0x72e1c48ae4 - <unknown>
04-27 13:13:27.978 22275 22314 D simpleservo: 31: 0x72e13dffb0 - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 32: 0x72e113108c - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 33: 0x72e13dfde0 - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 34: 0x72e24baab4 - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 35: 0x72e120ae74 - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 36: 0x72e1a70d10 - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 37: 0x72e156f19c - <unknown>
04-27 13:13:27.979 22275 22314 D simpleservo: 38: 0x72e1578628 - <unknown>
04-27 13:13:27.980 22275 22314 D simpleservo: 39: 0x72e156eb00 - <unknown>
04-27 13:13:27.980 22275 22314 D simpleservo: 40: 0x72e1209e10 - <unknown>
04-27 13:13:27.980 22275 22314 D simpleservo: 41: 0x72e0e2df90 - <unknown>
04-27 13:13:27.980 22275 22314 D simpleservo: 42: 0x72e68f4c14 - <unknown>
04-27 13:13:27.982 22275 22314 D simpleservo: 43: 0x75e8b4388c - <unknown>
04-27 13:13:27.982 22275 22314 D simpleservo: 44: 0x75e8ae3d0c - <unknown>
I decoded the stack trace using addr2line and nm, and I think this is due to the Tagged Pointers feature, which is enabled on 64-bit ARM systems for apps that target Android SDK 30 and above.
The Tagged Pointers feature sets the uppermost byte of a 64-bit pointer to a non-zero tag value for identifying memory safety issues. However, SpiderMonkey assumes all user-mode pointers are only 48 bits wide on 64-bit systems when encoding a user-mode pointer into a JS::Value. A similar assertion is present in mozjs as well.
Servo in particular stores the pointers to the Rust-allocated DOM structs as Private JS::Value in the reserved slot of the reflected JSObject. Since the addresses of these Rust-allocated DOM structs are tagged, i.e. the upper byte is non-zero, the assertions in both SpiderMonkey and mozjs are violated.
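To make the failing check concrete, here is a minimal sketch in Rust. It assumes the assertion at jsval.rs:217 masks the upper 16 bits of the pointer and compares them to zero; the helper names and the example address are hypothetical, not taken from mozjs. The tag value 0xB4 is inferred from the logged `left` value, since 12970366926827028480 is 0xB4 << 56.

```rust
/// Mask covering bits 48..=63, i.e. everything outside a 48-bit address space.
/// Assumption: this is what the failing assertion checks.
const UPPER_16_MASK: u64 = 0xFFFF_0000_0000_0000;

/// Returns the bits of `addr` that lie above the low 48 bits.
fn bits_above_48(addr: u64) -> u64 {
    addr & UPPER_16_MASK
}

/// Sets the top byte of `addr` to `tag`, mimicking what a tagging
/// allocator does to the heap pointers it hands out (hypothetical helper).
fn tag_pointer(addr: u64, tag: u8) -> u64 {
    (addr & 0x00FF_FFFF_FFFF_FFFF) | ((tag as u64) << 56)
}

fn main() {
    // Hypothetical untagged heap address that fits in 48 bits.
    let untagged: u64 = 0x0000_0072_e000_1000;
    // The same address with tag 0xB4 in the top byte.
    let tagged = tag_pointer(untagged, 0xB4);

    // An untagged pointer passes the 48-bit check.
    assert_eq!(bits_above_48(untagged), 0);

    // A tagged pointer leaves exactly the tag byte behind, which matches
    // the `left` value in the panic message above.
    assert_eq!(bits_above_48(tagged), 12970366926827028480);

    println!("untagged: {:#018x}", untagged);
    println!("tagged:   {:#018x}", tagged);
}
```

Under these assumptions, any pointer returned by an allocator that tags the top byte would trip the check, regardless of the actual address in the low 48 bits.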
I'm not sure how Gecko deals with this case. Either they have MTE disabled (which is possible using the AndroidManifest.xml, but this option will go away), or Gecko doesn't rely on private user-mode pointers the way Servo does, or it uses a custom allocator for non-JS objects.
Looks like Gecko doesn't run into this issue only because it uses jemalloc instead of the system allocator by default.
Disabling jemalloc (ac_add_options --disable-jemalloc) does make Gecko crash on 64-bit ARM with a similar assertion failure (though in a wasm code path):
05-10 11:13:11.389 4655 4655 F DEBUG : Revision: '0'
05-10 11:13:11.389 4655 4655 F DEBUG : ABI: 'arm64'
05-10 11:13:11.390 4655 4655 F DEBUG : Timestamp: 2024-05-10 11:13:11+0530
05-10 11:13:11.390 4655 4655 F DEBUG : pid: 4617, tid: 4644, name: Gecko >>> org.mozilla.geckoview_example <<<
05-10 11:13:11.390 4655 4655 F DEBUG : uid: 10526
05-10 11:13:11.390 4655 4655 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
05-10 11:13:11.390 4655 4655 F DEBUG : Cause: null pointer dereference
05-10 11:13:11.390 4655 4655 F DEBUG : Abort message: '[4617] Assertion failure: (w >> TypeDefBits) == 0, at /home/mukilan/dev/mozilla-unified/js/src/wasm/WasmValType.h:98
Patching Servo to use jemalloc on Android also fixes the crash in Servo, so I guess that might be the permanent solution here.