codeql icon indicating copy to clipboard operation
codeql copied to clipboard

Java: Promote HashWithoutSalt query

Open joefarebrother opened this issue 3 years ago • 9 comments

joefarebrother avatar Mar 23 '22 17:03 joefarebrother

Is there anything else blocking the merge of this query, other than https://github.com/github/codeql/pull/8541#discussion_r834217496?

Needs a docs review maybe?

atorralba avatar May 26 '22 11:05 atorralba

There's a performance issue on jdk. The unusual dataflow logic and workarounds probably deserve a look at by somebody like @aschackmull too.

joefarebrother avatar May 26 '22 12:05 joefarebrother

@aschackmull could you take a look at the dataflow logic in this query, and see if something can be improved?

atorralba avatar Aug 08 '22 07:08 atorralba

@aschackmull could you take a look at the dataflow logic in this query, and see if something can be improved?

There is certainly some very unusual flow going on here with a lot of ad-hoc moving parts, so I'm not too surprised that this has potential for poor performance. My gut feeling is that we'll need to aim for something less ad-hoc to get performance under control, but I'd need to dig a bit deeper to say more. Also, there are lots of parts in here that could have poor performance: have you identified the pain points?

aschackmull avatar Aug 09 '22 11:08 aschackmull

have you identified the pain points?

@joefarebrother do you remember where the issue was in JDK regarding performance?

atorralba avatar Aug 09 '22 13:08 atorralba

Seems to be some memory pressure (lots of instances of "pausing/unpausing evaluation") within some of the dataflow stages.

Currently trying removing various aspects of the query to see if any impact performance.

joefarebrother avatar Aug 10 '22 09:08 joefarebrother

Seems to be some memory pressure (lots of instances of "pausing/unpausing evaluation") within some of the dataflow stages.

That typically just means that the flow you're trying to calculate is huge - possibly due to a too ad-hoc flow config. The query involves 3 different configurations, I guess step 1 is to identify which of these are problematic.

aschackmull avatar Aug 10 '22 09:08 aschackmull

Looks like the main culprit is in adding all field reads as flow steps; restricting it to just field reads containing a MessageDigest improves the performance.

joefarebrother avatar Aug 10 '22 10:08 joefarebrother

Now it no longer times out and takes multiple hours, but DCA still indicates a 20% slowdown on JDK (and takes ~320 seconds locally)

joefarebrother avatar Aug 11 '22 12:08 joefarebrother

Apologies for butting in; I came across this PR by chance. While storing unsalted password hashes is poor security, using typical message digests for password hashing is in itself fundamentally insecure. Message digests are designed for integrity and optimized for speedy evaluation, while dedicated password hashing functions are optimized for the opposite to prevent brute-force attacks. Please consider scrapping this query and replacing it with a (much simpler) query testing for use of message digests for password hashing. That would annoy people who have to interoperate with legacy systems (Excel, etc.), but overall I think it's the right thing to do. Thanks.

vlkl-sap avatar Aug 28 '22 14:08 vlkl-sap

QHelp previews:

java/ql/src/Security/CWE/CWE-759/HashWithoutSalt.qhelp

Use of a hash function without a salt

In cryptography, a salt is a randomly generated value to be appended to a password before it is hashed and stored in a database.

Without a salt, if an attacker were to steal the password database, they would be able to see which users share passwords, as well as more easily perform a dictionary attack using pre-computed hashes.

Recommendation

Use a long random salt of at least 32 bytes, then use the combination of password and salt to hash a password or password phrase.

Additionally, it is recommended to use an algorithm better suited to password hashing, such as PBKDF2, rather than use message digest algorithms such as SHA-256, in order to further reduce the possibility of dictionary attacks.

Example

The following example shows three ways of hashing. In the 'BAD' case, no salt is provided. In the `BETTER` case, a salt is provided, and SHA-256 is used. In the 'GOOD' case, a salt is provided, and PBKDF2 is used.

public class HashWithoutSalt {
	// BAD - Hash without a salt.
	public byte[] getSHA256Hash(String password) throws NoSuchAlgorithmException {
		MessageDigest md = MessageDigest.getInstance("SHA-256");
		return md.digest(password.getBytes());
	}

	// BETTER - Hash with a salt using SHA256.
	public byte[] getSHA256Hash(String password, byte[] salt) throws NoSuchAlgorithmException {
		MessageDigest md = MessageDigest.getInstance("SHA-256");
		md.update(salt);
		return md.digest(password.getBytes());
	}

	// GOOD - Hash with a salt using PBKDF2.
	public byte[] getPBKDF2Hash(String password, byte[] salt) throws NoSuchAlgorithmException {
		KeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 65536, 128);
		SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
		return f.generateSecret(spec).getEncoded();
	}
}

References

github-actions[bot] avatar Oct 11 '22 16:10 github-actions[bot]

Hi @vlkl-sap, thank you for your feedback and apologies for the delayed response.

We have decided to keep the current query in place, as it covers a known CWE and the work is already done. We plan to improve the coverage as well as the query help to recommend the use of more secure hashing algorithms.

Seperately, we will also look at improving our coverage of hashing algorithm issues. We will have to investigate the best way of implemeting this to minimize the noisiness of alerts.

joefarebrother avatar Oct 18 '22 12:10 joefarebrother