sof icon indicating copy to clipboard operation
sof copied to clipboard

rimage: segfault when parsing the wrong key

Open teburd opened this issue 4 years ago • 7 comments

I mistakenly put public rather than private key as the key argument and got a confusing segfault in place of a nice error message. @marc-hb mentioned this should probably be ticketed, so here it is!

teburd avatar Jan 11 '22 20:01 teburd

Indeed, instant reproduction with rimage commit 32052add9968:

PRIVATE_KEY_OPTION='-DRIMAGE_PRIVATE_KEY=/home/me/SOF/sof/keys/otc_public_key_3k.pem' ./scripts/xtensa-build-all.sh tgl

Firmware completing manifest v2.5
 meta: completing ADSP manifest
 meta: limit is 0x63ac0
 auth: completing authentication manifest
 cse: ri_css_v2_5_hdr_create completing CSS manifest
 css: set build date to 1970:00:01
 cse: completing CSE V2.5 manifest
 cse: cse checksum 5739b738
Firmware file size 0x68000 page count 98
 rimage_read_key: read key '/home/mherber2/SOF/sof/keys/otc_public_key_3k.pem'
Segmentation fault (core dumped)
make[2]: *** [src/arch/xtensa/CMakeFiles/run_rimage.dir/build.make:70: src/arch/xtensa/CMakeFiles/run_rimage] Error 139

marc-hb avatar Jan 11 '22 22:01 marc-hb

make -C build_tgl_gcc VERBOSE=1

cd build_tgl_gcc/src/arch/xtensa &&
 ../../../rimage_ep/build/rimage -o sof-tgl.ri -c /home/me/SOF/sof/rimage/config/tgl.toml \
-k /home/me/SOF/sof/keys/otc_public_key_3k.pem -i 3 -f 1.2 -b 1 -e bootloader-tgl sof-tgl

With valgrind:

Firmware file size 0x68000 page count 98
 rimage_read_key: read key '/home/me/SOF/sof/keys/otc_public_key_3k.pem'
==79892== Invalid read of size 8
==79892==    at 0x4A0B4A2: RSA_check_key_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==79892==    by 0x10CC34: rimage_check_key (pkcs1_5.c:103)
==79892==    by 0x10D6F9: pkcs_v1_5_sign_man_v2_5 (pkcs1_5.c:599)
==79892==    by 0x10DAA9: ri_manifest_sign_v2_5 (pkcs1_5.c:681)
==79892==    by 0x11115E: man_write_fw_v2_5 (manifest.c:1312)
==79892==    by 0x114773: main (rimage.c:209)
==79892==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==79892== 
==79892== 
==79892== Process terminating with default action of signal 11 (SIGSEGV)
==79892==  Access not within mapped region at address 0x30
==79892==    at 0x4A0B4A2: RSA_check_key_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==79892==    by 0x10CC34: rimage_check_key (pkcs1_5.c:103)
==79892==    by 0x10D6F9: pkcs_v1_5_sign_man_v2_5 (pkcs1_5.c:599)
==79892==    by 0x10DAA9: ri_manifest_sign_v2_5 (pkcs1_5.c:681)
==79892==    by 0x11115E: man_write_fw_v2_5 (manifest.c:1312)
==79892==    by 0x114773: main (rimage.c:209)
==79892==  If you believe this happened as a result of a stack
==79892==  overflow in your program's main thread (unlikely but
==79892==  possible), you can try to increase the size of the
==79892==  main thread stack using the --main-stacksize= flag.
==79892==  The main thread stack size used in this run was 8388608.

@lrgirdwo , the crash with TGL can be reproduced as early as when thesofproject/rimage#33 added CAVS2.5 for the first time (with a cherry-pick of size config 9e50a02f1b0d otherwise there's not enough space - and no crash) Warning: a number of commits in thesofproject/rimage#33 don't compile.

marc-hb avatar Jan 11 '22 23:01 marc-hb

So let's avoid TGL because it requires brand new rimage code. Same reproduction with APL and much older rimage code, going as far back as September 2020's a94e12d7d653 when .toml files were added for the first time (I was too lazy to try to backtrack SOF too)

PRIVATE_KEY_OPTION='-DRIMAGE_PRIVATE_KEY=/home/me/SOF/sof/keys/otc_public_key_3k.pem' ./scripts/xtensa-build-all.sh apl

...
Firmware completing manifest v1.8
 meta: completing ADSP manifest
 meta: limit is 0x38b80
 auth: completing authentication manifest
 cse: completing CSS manifest
 css: set build date to 2022:00:11
 cse: completing CSE V1.8 manifest
Firmware file size 0x3d000 page count 55
 pkcs: signing with key /home/me/SOF/sof/keys/otc_public_key_3k.pem
==166931== Invalid read of size 8
==166931==    at 0x4A0B4A2: RSA_check_key_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==166931==    by 0x10C8BC: pkcs_v1_5_sign_man_v1_8 (pkcs1_5.c:202)
==166931==    by 0x10CC02: ri_manifest_sign_v1_8 (pkcs1_5.c:257)
==166931==    by 0x10ED52: man_write_fw_v1_8 (manifest.c:924)
==166931==    by 0x1121EB: main (rimage.c:226)
==166931==  Address 0x30 is not stack'd, malloc'd or (recently) free'd
==166931== 
==166931== 
==166931== Process terminating with default action of signal 11 (SIGSEGV)
==166931==  Access not within mapped region at address 0x30
==166931==    at 0x4A0B4A2: RSA_check_key_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==166931==    by 0x10C8BC: pkcs_v1_5_sign_man_v1_8 (pkcs1_5.c:202)
==166931==    by 0x10CC02: ri_manifest_sign_v1_8 (pkcs1_5.c:257)
==166931==    by 0x10ED52: man_write_fw_v1_8 (manifest.c:924)
==166931==    by 0x1121EB: main (rimage.c:226)
==166931==  If you believe this happened as a result of a stack
==166931==  overflow in your program's main thread (unlikely but
==166931==  possible), you can try to increase the size of the
==166931==  main thread stack using the --main-stacksize= flag.
==166931==  The main thread stack size used in this run was 8388608.


marc-hb avatar Jan 11 '22 23:01 marc-hb

Here's another fun one discovered by @softwarecki:

./scripts/xtensa-build-all.sh tgl apl

[100%] Built target run_smex
 pkcs: RSA private key is valid.
 pkcs: digest for manifest is b89ecf28afef0bb9ecee207ef8ab6ad7c20b10bf5fe23d3a361928cb92e5a8d8
*** stack smashing detected ***: terminated
[100%] Built target copy_dictionaries
Aborted (core dumped)
make[3]: *** [src/arch/xtensa/CMakeFiles/run_rimage.dir/build.make:70: src/arch/xtensa/CMakeFiles/run_rimage] Error 134

That's because the TGL key "leaks" to later platforms. ./scripts/xtensa-build-all.sh apl tgl works fine.

Reproduced with sof commit 443b21de4b3f

marc-hb avatar Feb 16 '22 05:02 marc-hb

@marc-hb please assign me this issue i would like to work on this.

aryansharma6827 avatar Feb 29 '24 08:02 aryansharma6827

@aryansharma6827 assigned - thanks !

lgirdwood avatar Feb 29 '24 13:02 lgirdwood

Thanks @aryansharma6827 ! Note you don't need to wait to be "officially" assigned to start working. Adding a comment is enough. fter 2 years without activity you can safely assume no one else is working on this :-)

Also note A LOT of rimage has changed in the meantime. As the first thing please compare today's version with the versions in the report. Make sure they still reproduce the same. Either way please comment and share what the new status is.

The main rimage branch is now in the main sof repo, it's not in the separate rimage repo any more.

marc-hb avatar Feb 29 '24 15:02 marc-hb