feat: Implement BLAKE2X XOF (Blake2Xb and Blake2Xs)
This PR introduces an implementation of Blake2X – inspired from, but greatly extends on #677. This contributes to #1.
| Category | Lines |
|---|---|
| Test Vector Files | 6148 |
| Unit Tests | 762 |
| Dependency Management | 124 |
| Implementation | 631 |
This is a complete implementation of the Blake2X extensible-output function (XOF) for both its 64-bit (Blake2Xb) and 32-bit (Blake2Xs) variants. This new functionality is gated behind the blake2x cargo feature flag.
1. Implementation and Design
The implementation follows the design principles of the existing crate, utilizing a macro-driven approach to provide generic logic for both Blake2Xb and Blake2Xs, which minimizes code duplication.
-
Core Logic (
macros.rs): A new macro,blake2x_impl!, has been introduced. -
Algorithm (
blake2x.rs): The implementation follows the two-phase process specified by the Blake2X algorithm:-
Root Hash Generation: It first computes a root hash (
H₀) using the underlyingBlake2bVarCoreorBlake2sVarCore. Crucially, it incorporates the total desired output length (xof_len) into the parameter block during this phase. This ensures that outputs of different lengths are cryptographically distinct, a key security feature of Blake2X. -
Output Expansion: The
finalize_xof()method returns aReaderstruct. This reader generates the final hash output incrementally. For each block of output requested, it computes a new hash by feeding the root hashH₀into the base BLAKE2 function, but with a unique parameter block for each "expansion node" (differentiated by an incrementingnode_offset). This logic is encapsulated in theexpand_nodehelper function.
-
Root Hash Generation: It first computes a root hash (
-
Public API: New public-facing structs
Blake2xb,Blake2xs, and their correspondingReadertypes are exposed.
2. Testing and Validation
The implementation is supported by a comprehensive test suite in tests/blake2x.rs that validates correctness through two primary resources: official test vectors and a reference implementation.
-
Test Vectors:
- The
tests/data/directory now includesblake2xb-kat.jsonandblake2xs-kat.json. These are the test vectors sourced directly from the BLAKE2 RFC repository. - The tests parse these JSON files and validate the implementation against a wide range of input and output sizes for both keyed and unkeyed hashing.
- The
-
Reference Implementation (
b2rs):- To further ensure correctness, the test suite now uses the
b2rscrate as a reference implementation. This crate is maintained on GitHub by Jean-Philippe Aumasson (@veorq), one of the original authors of the BLAKE2 algorithm. - The tests use
b2rsin two ways:-
Fuzz-like comparison: Random inputs and output lengths are generated, and the output of our implementation is compared against the output of
b2rs. - ~~Internal State Validation: The tests go a step further by using
b2rsto verify intermediate values of the Blake2X computation, such as the value of the root hash (H₀) and the output of the first expansion node. This confirms that our internal parameter block construction and hashing logic are correct, not just the final result.~~ Edit: removed in 4d3debf264a45da9e33d52645eb6ee9963336f66
-
Fuzz-like comparison: Random inputs and output lengths are generated, and the output of our implementation is compared against the output of
- To further ensure correctness, the test suite now uses the
All tests, including functional checks for progressive reads and constructor behavior, are passing.
Edit: the last 2 commits of the PR respectively fix a typo-detection CI false positive, and unrelated clippy warnings.
@newpavlov Happy to add a CHANGELOG line as part of this PR, or remove the b2rs unit tests, or anything else you might think needs adapting.
Hey @newpavlov, are there any blockers for this PR to get reviewed and merged? Can we help in any way?
@LesnyRumcajs needs a rebase
@tarcieri I've had to remove the testing of now-private structs in 4d3debf as part of the rebase, which favors respecting the visibility modifiers that are in main. I could reinstate the test and change those modifiers, lmk.
Will be great to get this merged! Thanks for the effort here.
Thanks for rebasing.
This is a very large PR which makes it difficult to review.
Another issue is we'd like to replace the entire BLAKE2 implementation: #228
@tarcieri The splits present in the top of the PR description should make clear most of the PR is test vector files, which are extracted directly from the RFC repo.
Specifically the test vector files are the result of running jq '[.[] | select(.hash == "blake2xb")] blake2-kat.json on the file https://github.com/BLAKE2/BLAKE2/blob/master/testvectors/blake2-kat.json (for blake2xb-kat.json) and running jq '[.[] | select(.hash == "blake2xs")]' on that same file (for blake2/tests/data/blake2xs/blake2xs-kat.json).
Please let me know if you would like me to programmatically download those test vector files from that repo upon testing, thereby removing them here (it could look like that +172/-6149) — I haven't because considering the contents of blake2/tests/data, committing those test vectors seems to be the convention.
Re: implementation replacement, the blake2x implementation in the current form of the PR now uses only the public API of blake2 (this is the point I address in this comment). Consequently, it should survive a switch to some new implementation of blake2 without requiring code changes.