BouncyCastle vulnerable to a timing variant of Bleichenbacher called Marvin Attack
(As I've sent to the security mailing list three emails, last one over 2 weeks ago, and haven't received anything but automated reply that the email was received, I'm making the issue public here. What follows after the question is a verbatim copy of the last email.)
Question: Should I ask for CVE assignment for it?
I've tested both PKCS#1 v1.5 and OAEP API and verified that both provide a clear side-channel signal useful in Bleichenbacher-like attacks.
Since my previous emails have received nothing but an automated reply for over a week, I'll be making this issue in BouncyCastle public in two weeks (on 6th of November) if that doesn't change.
I've tested bcprov-jdk18on-17.jar on OpenJDK 1.8.0_372 from ArchLinux on an AMD Ryzen 5 5600X CPU. I did not perform any special tuning to make the noise smaller.
I've used the test harnesses in attachment to run the test data generated by the step2-alt.sh and step2-oaep-alt.sh from https://github.com/tomato42/marvin-toolkit/ (though I did test just 2048 bit keys, not the 1024 and 4096 generated by them too). Both of them accept 3 parameters: a path to a RSA PKCS#8 DER key, path to a file with concatenation of RSA ciphertexts (they both assume those are 256 bytes long), and an output csv file where timing of the processing will be written. See the examples/pyca-cryptography in the above repo on how to process that csv file further.
For both padding modes the Friedman test p-value is smaller than what a double precision floating point number can represent (i.e. it's reported as 0). That's in spite of using just 10k and 100k sample size for pkcs1 and oaep test respectively.
Which means the timing signal is extremely strong and definitely exploitable over the network.
Looking more closely at the results, it's clear that the PKCS#1 interface leaks timing data through raising of an exception, and the OAEP interface leaks timing data through the bit size of the decrypted value (EM in PKCS#1 nomenclature). Though those are very imprecise results, additional side-channels may be present.
See the Marvin paper for suggestions how to fix this: https://people.redhat.com/~hkario/marvin/ A bit more information is also present in the draft specifying the RSA implicit rejection mechanism I'm working to standardise: https://github.com/tomato42/marvin-ietf
See step2.py for explanation of the meaning of the probes, the legend for pkcs1.png is as follows:
ID,Name
0,header_only
1,no_header_with_payload_48
2,no_padding_48
3,no_structure
4,signature_padding_0
5,valid_0
6,valid_48
7,valid_192
8,valid_246
9,valid_repeated_byte_payload_246_1
10,valid_repeated_byte_payload_246_255
11,zero_byte_in_padding_48_4
for oaep.png is as follows:
ID,Name
0,no_padding_48
1,no_structure
2,signature_padding_0
3,too_short_payload_0_1
4,too_short_payload_0_3
5,too_short_payload_0_7
6,too_short_payload_0_15
7,valid_0
8,valid_repeated_byte_payload_245_0
pairwise sign test and Wilcoxon signed rank test results are in the csv files: oaep.csv pkcs1.csv
Apologies about the delay in responding, we only check the address every couple of weeks and it gets more than its fair share of spam. We've found your original emails though with the attachment. Taking a look.
We've done an initial pass.
If you're up for it, we're interested in following this up by looking at what our TLS API is doing.
While the figures around the examples are interesting, the issues related to exception throwing on javax.crypto.Cipher are well known, and it's a function of the JCE API rather than the BC provider, which in this case is simply following the documented API.
The BCJSSE provider does attempt to work around things already, but it's quite likely with better analysis we could find a way of doing that better as well. We will try and put something together for collecting timing data using your tools and be back in touch.
It's possible to fix it by not raising an exception, see the implicit rejection (a.k.a. Marvin workaround) documented in the Marvin paper, implemented by OpenSSL, NSS, and proposed in https://datatracker.ietf.org/doc/draft-kario-rsa-guidance/ (I'll be presenting it for CFRG adoption on IETF 118 this Thursday)
For testing TLS you can use either the regular script or the one that expects implicit rejection
Yes, we arrived at that conclusion a while ago - I think with the JSSE we're probably going to need to provide a custom "something" to do it though. Our JSSE provider is used with other JCE providers as well so we'll still need to make sure we can still accommodate them as well. First, collect some numbers though, can't flatten a curve we haven't measured.
Good luck with the presentation.
I think with the JSSE we're probably going to need to provide a custom "something" to do it though
I'm not super familiar with how provider infrastructure in Java works, but isn't it the case that you can either provide a way to do raw RSA operations (i.e. the RSADP() primitive from PKCS#1), or full operations (i.e the complete PKCS#1 v1.5 decryption)?
(I'm genuinely asking. If there are interfaces where implicit rejection won't work, I'd really like to know about them.)
So we think, with a "little bending of the rules" we can do this. We can provide raw RSA, but we need to stay mindful of the fact that we're also trying to make it possible for people to certify things, which really rules out raw RSA as a solution, with FIPS and Common Criteria providing developer access to raw RSA is frowned upon, the expectation is that you provide access to full operations only, so as attractive as it is, raw RSA isn't really going to cut it.
We're pretty sure we can still take advantage of implicit rejection though - for example the Java cipher class has an unwrap() method on it which takes a string for the algorithm of the key you want back. @peterdettman pointed out yesterday we could have some special algorithm names which could be used to flag that implicit rejection is acceptable and that exception throwing is not expected, so something like
Cipher rsaCipher = Cipher.getInstance("RSA/NONE/PKCS1Encoding", "BC");
rsaCipher.init(Cipher.UNWRAP_MODE, rsaPrivateKey);
Key secret = rsaCipher.unwrap(encapsulation, "TlsPremasterSecret", Cipher.SECRET_KEY);
Could be used to flag to the provider that "I don't want to see an exception", alternately there's also:
Cipher rsaCipher = Cipher.getInstance("RSA/ImplicitRejection/PKCS1Encoding", "BC");
which could be used to flag the "I don't want to see an exception" mode.
We'll probably need to try a few things and see what works best. Sometimes what looks great in theory doesn't always work well in practice... but there's options.
For FIPS, the PKCS#1 v1.5 will not be allowed at all after end of this year, which is like 6 weeks. So I'm not entirely sure how it's relevant...
For implicit rejection to actually help users of the library, it needs to be the default behaviour. The legacy depadding may be available for debugging, but it should be documented as inherently insecure.
Also, it's not just about TLS, it affects everything that can use encryption with PKCS#1 v1.5 padding: CMS, JWT, XMLencryption, etc.
Yes, although keep in mind it is PKCS#1.5 encryption they're disallowing and as it's been turned into a soft transition for FIPS 140-2 modules, I've a feeling there may be a bit of a delay on even that being realized fully.
Fully aware it's not just applicable to TLS, have to start somewhere though.
When will you assign a CVE to this issue?
CVE's are not free. We will, and do, raise them when appropriate however the issues with the Cipher API in Java are well documented, and while the numbers are interesting, we haven't come up with anything solid. Our TLS API already incorporates some counter measures and the numbers above are not for our TLS API. We have not found anything which has required the allocation of a CVE.
That's not to say we're not interested in moving to more of the code base making use of what's documented in the RFC that's currently in draft, we big fans of dealing with possible issues early, however that's not even a standard at the moment, and while worthy of investigation is clearly not something we can deploy.
And no, not ever throwing an exception is not an option, we could add a new implementation to do this, but people have being using javax.crypto.Cipher for over 20 years now and there are millions of lines of code relying on this behavior. For all those people the suggested fix is likely to cause other security issues as at that point it will appear the Cipher API has worked but it will be doing so in an application which is not designed to cope with the change in behavior.
CVEs are also the best tool we have to inform users that they are using vulnerable software. And while I do empathise with having to deal with bad APIs, attackers don't really care why a particular leakage is there, just if it is there.
On a slightly different note, I'll be at the first half of IETF 119 Brisbane, 2024. Is the RFC draft going to be presented again there as a lead up to a vote?
I don't think so. There was a general consensus for the draft adoption by the CFRG at IETF 118. I don't think it will be ready for WGLC by IETF 119, so I'm not sure which vote you mean... If I'll attend IETF 119, it will be remotely.
the numbers above are not for our TLS API.
The offer to run the test against pure BouncyCastle TLS stack still stands. Provide instructions similar to what I have for other implementations, e.g. go, and I'll run the test against it.
Sure, we'll put something together. You'll just need to give us a few days, we're flat out dealing with certification work at the moment.
ping?
@tomato2 Attached is a copy of the test program I used for the non-JSSE TLS stack. It's based on your original test script, except using the decryption code from https://github.com/bcgit/bc-java/blob/a4e57f2019867c7ee86508fc5b136d73fd4922d9/tls/src/main/java/org/bouncycastle/tls/crypto/impl/bc/BcDefaultTlsCredentialedDecryptor.java#L108 .
(For reference, the JCE version used in BCJSSE is here: https://github.com/bcgit/bc-java/blob/a4e57f2019867c7ee86508fc5b136d73fd4922d9/tls/src/main/java/org/bouncycastle/tls/crypto/impl/jcajce/JceDefaultTlsCredentialedDecryptor.java#L101).
Yes, that API is vulnerable too, though the side-channel it has is smaller.
After collecting 1M measurements per probe on the same AMD Ryzen 5 5600X I got the following result:
Sign test mean p-value: 0.3052, median p-value: 0.1888, min p-value: 8.685e-54
Friedman test (chisquare approximation) for all samples
p-value: 3.6703623211321894e-88
Worst pair: 1(no_header_with_payload_48), 2(no_padding_48)
Mean of differences: -9.25545e-08s, 95% CI: -1.43365e-07s, -4.362431e-08s (±4.987e-08s)
Median of differences: -1.31000e-07s, 95% CI: -1.50000e-07s, -1.200000e-07s (±1.500e-08s)
Trimmed mean (5%) of differences: -1.25431e-07s, 95% CI: -1.51653e-07s, -1.005266e-07s (±2.556e-08s)
Trimmed mean (25%) of differences: -1.33397e-07s, 95% CI: -1.52282e-07s, -1.161970e-07s (±1.804e-08s)
Trimmed mean (45%) of differences: -1.34690e-07s, 95% CI: -1.51574e-07s, -1.185917e-07s (±1.649e-08s)
Trimean of differences: -1.31000e-07s, 95% CI: -1.55000e-07s, -1.155000e-07s (±1.975e-08s)
Layperson explanation: Definite side-channel detected, implementation is VULNERABLE
legend:
0,header_only
1,no_header_with_payload_48
2,no_padding_48
3,no_structure
4,signature_padding_8
5,valid_0
6,valid_48
7,valid_192
8,valid_246
9,valid_repeated_byte_payload_246_1
10,valid_repeated_byte_payload_246_255
11,zero_byte_in_padding_48_4
pairwise test results: report.csv.zip
This result suggest that the issue is in the numerical library, just like OAEP result above.
Thanks, @tomato42 . Could you confirm which version of the library that last testing was done with? We released version 1.77 in November where we did some quick improvements on this issue, so it would be great to have comparative results between 1.76 and 1.77.
ran it with bcprov-jdk18on-177.jar and bctls-jdk18on-177.jar
side note: Oracle has released new updates, CVE-2024-20952 is the same vulnerability in Java.
That's interesting, I'd even go as far as to say good news. I'm not sure if this change will affect the timing results you got or not - I think we'll need to try it on an updated JVM.
I haven't looked at the fix in detail, just got a notification that they fixed it
any progress?
Hi @tomato42,
First a word about SunJSSE and their recent change. While SunJSSE does use Cipher "RSA/ECB/PKCS1Padding" for decrypting the TLS Pre-Master-Secret, it has special code for doing so in that context. The change mentioned above appears targeted at TLS usage only (specifically avoiding an exception throw on an internal codepath), so in case there is any confusion it doesn't seem to be a general fix for timing side-channels in Cipher "RSA/ECB/PKCS1Padding". Presumably it does improve the situation for TLS, which is all good.
The special code I referred to above is that SunJSSE has JDK-specific codepaths. Firstly it requires the passing of a TlsRsaPremasterSecretParameterSpec to the cipher init call (which is also used in UNWRAP_MODE instead of DECRYPT_MODE). This allows it to use a specialized internal implementation tailored to the TLS use case (i.e. not throwing exceptions, and incorporating the RSA version check and the 48 byte random value fallback). As is increasingly common however, TlsRsaPremasterSecretParameterSpec is only allowed for "internal JDK use". Secondly, there is explicit code in the case of any non-Oracle provider to fall back to using Cipher "RSA/ECB/PKCS1Padding" in decrypt mode, with the usual result that it has to catch exceptions for failure. This means that if SunJSSE is used with any other cryptographic provider (than an Oracle one) for the RSA decryption, it is in the same boat as the BCJSSE implementation.
The approach of having a TLS-specific parameterization is IMO good, but it would be great if we (BC) would be able to interoperate with it via some actual publicly accessible variant/version of TlsRsaPremasterSecretParameterSpec.
So, onto the situation in BCJSSE. We have basically two problems to solve.
The first is the Cipher API one (timing signal of thrown exceptions) and we think probably we should replicate the parameter spec trick described above, unfortunately not interoperably as explained, but at least we would have safety when BCJSSE is used with BC, in similar fashion to SunJSSE being safe with JDK.
The second is that we still have an internal timing channel coming from the conversion of a final BigInteger result to a variable-sized byte array. This happens at the end of the blinded RSA decryption. BigInteger has some native code backing for modPow that makes it difficult to give up the performance, but as I recall @tomato42 describing somewhere it's possible to tweak just the final modular multiplication of the de-blinding so that we convert to a fixed-length array without timing dependent on the plaintext. Unfortunately, other parts of this low-level code expect variable-length arrays for RSA values (and this code is heavily locked in by backward compatibility issues). So to address this we basically need to make a self-contained implementation of TLS RSA Pre-Master-Secret decryption, duplicating the bits and pieces currently spread across several RSA and PKCS1 implementation classes, and ironing out the remaining timing channel principally by being able to deal with fixed-length arrays.
Now that I've written all that out, I think I have time to do that second part next week, and we can then use it directly in the non-JSSE TLS stack and get some measurements. The parameter spec solution for BCJSSE would follow by just forwarding to the same implementation in case a specific parameter spec is provided.
The problem with TLS-specific parametrisation is that it is TLS specific. It doesn't help if you use the same API to implement CMS, XMLenc, JWE, or anything similar.
I have some time now so I'll take a closer look at the OpenJDK fix, but on first glance it doesn't look too good...
Unfortunately, other parts of this low-level code expect variable-length arrays for RSA values (and this code is heavily locked in by backward compatibility issues). So to address this we basically need to make a self-contained implementation of TLS RSA Pre-Master-Secret decryption, duplicating the bits and pieces currently spread across several RSA and PKCS1 implementation classes, and ironing out the remaining timing channel principally by being able to deal with fixed-length arrays.
That sounds to me like you need to change the code that does RSA-PKCS#1v1.5, RSA-OAEP, and RSASVE to use a private API for RSA private key operation that returns a byte string instead of an arbitrary length integer.
Signing can still use the old API as the result isn't secret, so leakage that depends on it is fine.
@tomato42 We've added a custom implementation of TLS RSA PreMasterSecret decryption (org.bouncycastle.crypto.tls.TlsRsaKeyExchange#decryptPreMasterSecret, now used by the non-JSSE TLS implementation.
It's available for testing in a new beta build: https://downloads.bouncycastle.org/betas/ (in the bcprov jar).
For timing tests, we can now just collect data for:
byte[] M = TlsRsaKeyExchange.decryptPreMasterSecret(ciphertext, rsaPrivateKey, 0x0303, secureRandom);
Note that this is not a generic RSA//PKCS1 decryption; it validates the expected plaintext length of 48, and also checks the passed version bytes (0x0303 here) match. It may therefore be ideal to test against a ciphertext corpus tailored to the TLS case.
Edit: I guess I should say that this new code no longer has any timing dependency on the decrypted RSA value (including its length). All variable-time BigInteger -> (fixed-size) byte[] conversions are done on blinded or public values.
All variable-time BigInteger -> (fixed-size) byte[] conversions are done on blinded (...) values.
How do you convert the result of the RSADP() operation to byte[] without deblinding it first?
The blinded RSA operation is here: https://github.com/bcgit/bc-java/blob/4ff20f7b2a8aa0341a3e8a82026394b688fec4c2/core/src/main/java/org/bouncycastle/crypto/tls/TlsRsaKeyExchange.java#L176-L197 .
The blinded result is "almost unblinded" using:
BigInteger offsetResult = unblind.add(ONE).multiply(blindedResult).mod(modulus);
where the add(ONE) means that offsetResult will still need to have blindedResult subtracted from it (a final deblinding step) after conversion to byte arrays.