OpenSK icon indicating copy to clipboard operation
OpenSK copied to clipboard

Misuse of const in sha256 module

Open jwnrt opened this issue 1 year ago • 7 comments

Expected Behavior

https://github.com/google/OpenSK/blob/develop/libraries/crypto/src/sha256.rs#L28

BUSY is used for locking the sha256 module between Sha256::new() an Sha256::finalize() calls.

Actual Behavior

const in Rust is similar to #define in C in that its value is copied into its use-sites. Each place BUSY is used in this module is an independent Cell and not the same one.

I think static BUSY: AtomicBool would work here.

jwnrt avatar Jun 22 '24 12:06 jwnrt

This bug can be test by added these lines anywhere:

let s1 = Sha::<E>::new();
let s2 = Sha::<E>::new();
let mut out1 = [0; 32];
let mut out2 = [0; 32];
s1.finalize(&mut out1);
s2.finalize(&mut out2);
if out1 != out2 {
    return vec![Ctap2StatusCode::CTAP2_ERR_VENDOR_INTERNAL_ERROR as u8];
}

The above should not be allowed.

Do I understand that the proposed solution is this?

diff --git a/libraries/crypto/src/sha256.rs b/libraries/crypto/src/sha256.rs
index 09e32e0..75a5418 100644
--- a/libraries/crypto/src/sha256.rs
+++ b/libraries/crypto/src/sha256.rs
@@ -15,8 +15,8 @@
 use super::{Hash256, HashBlockSize64Bytes};
 use arrayref::{array_mut_ref, array_ref};
 use byteorder::{BigEndian, ByteOrder};
-use core::cell::Cell;
 use core::num::Wrapping;
+use core::sync::atomic::{AtomicBool, Ordering};
 use zeroize::Zeroize;
 
 const BLOCK_SIZE: usize = 64;
@@ -25,7 +25,7 @@ const BLOCK_SIZE: usize = 64;
 // sha256 in parallel. (Note that almost all usage of Sha256 is through Hash256::hash which is
 // statically correct. There's only 2 low-level usages in the `hmac::hmac_256` and those are
 // sequential.) This variable tracks whether `new` was called but `finalize` wasn't called yet.
-const BUSY: Cell<bool> = Cell::new(false);
+static BUSY: AtomicBool = AtomicBool::new(false);
 
 pub struct Sha256 {
     state: [Wrapping<u32>; 8],
@@ -46,7 +46,7 @@ impl Drop for Sha256 {
 
 impl Hash256 for Sha256 {
     fn new() -> Self {
-        assert!(!BUSY.replace(true));
+        assert!(!BUSY.swap(true, Ordering::SeqCst));
         Sha256 {
             state: Sha256::H,
             block: [0; BLOCK_SIZE],
@@ -112,7 +112,7 @@ impl Hash256 for Sha256 {
         for i in 0..8 {
             BigEndian::write_u32(array_mut_ref![output, 4 * i, 4], self.state[i].0);
         }
-        BUSY.set(false);
+        BUSY.store(false, Ordering::SeqCst);
     }
 }

It has a problem though: Tests start failing. Not because any individual test doesn't work, i.e., this passes: cargo test --features std test_hkdf_256_matches However, running cargo test --features std can see the above test failing.

kaczmarczyck avatar Jun 26 '24 14:06 kaczmarczyck

If the problem is that tests run in parallel, we could either use cargo test --features=std -- --test-threads=1 but this is going to slow everything down. Or we could try to isolate the tests that need to run sequentially by matching on their names:

cargo test --features=std -- --skip=uses_sha_256
cargo test --features=std -- --test-threads=1 uses_sha_256

Also regarding the atomic ordering, we can probably use Relaxed because we are not protecting any memory in the "critical section" as far as I'm aware.

ia0 avatar Jun 26 '24 14:06 ia0

Do I understand that the proposed solution is this?

Yes, that’s what I was thinking.

Is OpenSK aiming for any targets without atomics? OpenTitan (for example) doesn’t have atomics but I couldn’t think of an alternative solution.

jwnrt avatar Jun 26 '24 15:06 jwnrt

Is OpenSK aiming for any targets without atomics? OpenTitan (for example) doesn’t have atomics but I couldn’t think of an alternative solution.

It is sometimes possible to use portable-atomic for such targets. Not sure about OpenTitan, but if it's single-core then it's possible to disable interrupts to simulate atomic operations (loads and stores can sometimes avoid critical sections depending on the architecture).

ia0 avatar Jun 26 '24 15:06 ia0

Our pragmatic solution for this problem in one case: Use a static mut instead.

In this case, BUSY is added to prevent implementation errors. I'd assume you'd trigger the assert eventually even without Atomic guarantees if you compute two SHAs in parallel.

kaczmarczyck avatar Jun 28 '24 09:06 kaczmarczyck

I don't think it's easy to list all tests that use SHA256. Also increases maintenance. What we could do is disable the BUSY assert with a compiler flag, i.e., #[cfg(test)]. It would mean that you'd only notice your mistake after testing on hardware. But maybe we want to be more specific anyway: Some crypto implementation could be fine with parallel SHA256 computation. It's just that we want to make sure our library doesn't do that, to support hardware that can't provide that.

kaczmarczyck avatar Jun 28 '24 09:06 kaczmarczyck

If the intent of this check is to make sure the library doesn't do parallel hashes, then we shouldn't disable it in tests. We sadly need to use --test-threads=1 or (less coverage but faster) use a thread local BUSY variable.

ia0 avatar Jun 28 '24 10:06 ia0

We agreed to not use an Atomic, but use a Mutex based implementation that allows to still run tests in parallel. Outside of tests, we will remove this BUSY check.

Reasoning is: We want to make sure OpenSK doesn't call SHA256 in parallel, so all hardware crypto implementations can use OpenSK. We don't want to prevent people from using parallel SHA256 in their own code if they know they can support it on their hardware. Therefore, the check will move inside libraries/opensk and apply in tests only to assert this property there.

PR coming soon.

kaczmarczyck avatar Jul 03 '24 12:07 kaczmarczyck