Misuse of const in sha256 module
Expected Behavior
https://github.com/google/OpenSK/blob/develop/libraries/crypto/src/sha256.rs#L28
BUSY is used for locking the sha256 module between Sha256::new() an Sha256::finalize() calls.
Actual Behavior
const in Rust is similar to #define in C in that its value is copied into its use-sites.
Each place BUSY is used in this module is an independent Cell and not the same one.
I think static BUSY: AtomicBool would work here.
This bug can be test by added these lines anywhere:
let s1 = Sha::<E>::new();
let s2 = Sha::<E>::new();
let mut out1 = [0; 32];
let mut out2 = [0; 32];
s1.finalize(&mut out1);
s2.finalize(&mut out2);
if out1 != out2 {
return vec![Ctap2StatusCode::CTAP2_ERR_VENDOR_INTERNAL_ERROR as u8];
}
The above should not be allowed.
Do I understand that the proposed solution is this?
diff --git a/libraries/crypto/src/sha256.rs b/libraries/crypto/src/sha256.rs
index 09e32e0..75a5418 100644
--- a/libraries/crypto/src/sha256.rs
+++ b/libraries/crypto/src/sha256.rs
@@ -15,8 +15,8 @@
use super::{Hash256, HashBlockSize64Bytes};
use arrayref::{array_mut_ref, array_ref};
use byteorder::{BigEndian, ByteOrder};
-use core::cell::Cell;
use core::num::Wrapping;
+use core::sync::atomic::{AtomicBool, Ordering};
use zeroize::Zeroize;
const BLOCK_SIZE: usize = 64;
@@ -25,7 +25,7 @@ const BLOCK_SIZE: usize = 64;
// sha256 in parallel. (Note that almost all usage of Sha256 is through Hash256::hash which is
// statically correct. There's only 2 low-level usages in the `hmac::hmac_256` and those are
// sequential.) This variable tracks whether `new` was called but `finalize` wasn't called yet.
-const BUSY: Cell<bool> = Cell::new(false);
+static BUSY: AtomicBool = AtomicBool::new(false);
pub struct Sha256 {
state: [Wrapping<u32>; 8],
@@ -46,7 +46,7 @@ impl Drop for Sha256 {
impl Hash256 for Sha256 {
fn new() -> Self {
- assert!(!BUSY.replace(true));
+ assert!(!BUSY.swap(true, Ordering::SeqCst));
Sha256 {
state: Sha256::H,
block: [0; BLOCK_SIZE],
@@ -112,7 +112,7 @@ impl Hash256 for Sha256 {
for i in 0..8 {
BigEndian::write_u32(array_mut_ref![output, 4 * i, 4], self.state[i].0);
}
- BUSY.set(false);
+ BUSY.store(false, Ordering::SeqCst);
}
}
It has a problem though: Tests start failing. Not because any individual test doesn't work, i.e., this passes:
cargo test --features std test_hkdf_256_matches
However, running cargo test --features std can see the above test failing.
If the problem is that tests run in parallel, we could either use cargo test --features=std -- --test-threads=1 but this is going to slow everything down. Or we could try to isolate the tests that need to run sequentially by matching on their names:
cargo test --features=std -- --skip=uses_sha_256
cargo test --features=std -- --test-threads=1 uses_sha_256
Also regarding the atomic ordering, we can probably use Relaxed because we are not protecting any memory in the "critical section" as far as I'm aware.
Do I understand that the proposed solution is this?
Yes, that’s what I was thinking.
Is OpenSK aiming for any targets without atomics? OpenTitan (for example) doesn’t have atomics but I couldn’t think of an alternative solution.
Is OpenSK aiming for any targets without atomics? OpenTitan (for example) doesn’t have atomics but I couldn’t think of an alternative solution.
It is sometimes possible to use portable-atomic for such targets. Not sure about OpenTitan, but if it's single-core then it's possible to disable interrupts to simulate atomic operations (loads and stores can sometimes avoid critical sections depending on the architecture).
Our pragmatic solution for this problem in one case: Use a static mut instead.
In this case, BUSY is added to prevent implementation errors. I'd assume you'd trigger the assert eventually even without Atomic guarantees if you compute two SHAs in parallel.
I don't think it's easy to list all tests that use SHA256. Also increases maintenance. What we could do is disable the BUSY assert with a compiler flag, i.e., #[cfg(test)]. It would mean that you'd only notice your mistake after testing on hardware. But maybe we want to be more specific anyway: Some crypto implementation could be fine with parallel SHA256 computation. It's just that we want to make sure our library doesn't do that, to support hardware that can't provide that.
If the intent of this check is to make sure the library doesn't do parallel hashes, then we shouldn't disable it in tests. We sadly need to use --test-threads=1 or (less coverage but faster) use a thread local BUSY variable.
We agreed to not use an Atomic, but use a Mutex based implementation that allows to still run tests in parallel. Outside of tests, we will remove this BUSY check.
Reasoning is: We want to make sure OpenSK doesn't call SHA256 in parallel, so all hardware crypto implementations can use OpenSK. We don't want to prevent people from using parallel SHA256 in their own code if they know they can support it on their hardware. Therefore, the check will move inside libraries/opensk and apply in tests only to assert this property there.
PR coming soon.