hashes icon indicating copy to clipboard operation
hashes copied to clipboard

feat: Implement BLAKE2X XOF (Blake2Xb and Blake2Xs)

Open huitseeker opened this issue 7 months ago • 7 comments

This PR introduces an implementation of Blake2X – inspired from, but greatly extends on #677. This contributes to #1.

Category Lines
Test Vector Files 6148
Unit Tests 762
Dependency Management 124
Implementation 631

This is a complete implementation of the Blake2X extensible-output function (XOF) for both its 64-bit (Blake2Xb) and 32-bit (Blake2Xs) variants. This new functionality is gated behind the blake2x cargo feature flag.

1. Implementation and Design

The implementation follows the design principles of the existing crate, utilizing a macro-driven approach to provide generic logic for both Blake2Xb and Blake2Xs, which minimizes code duplication.

  • Core Logic (macros.rs): A new macro, blake2x_impl!, has been introduced.

  • Algorithm (blake2x.rs): The implementation follows the two-phase process specified by the Blake2X algorithm:

    1. Root Hash Generation: It first computes a root hash (H₀) using the underlying Blake2bVarCore or Blake2sVarCore. Crucially, it incorporates the total desired output length (xof_len) into the parameter block during this phase. This ensures that outputs of different lengths are cryptographically distinct, a key security feature of Blake2X.
    2. Output Expansion: The finalize_xof() method returns a Reader struct. This reader generates the final hash output incrementally. For each block of output requested, it computes a new hash by feeding the root hash H₀ into the base BLAKE2 function, but with a unique parameter block for each "expansion node" (differentiated by an incrementing node_offset). This logic is encapsulated in the expand_node helper function.
  • Public API: New public-facing structs Blake2xb, Blake2xs, and their corresponding Reader types are exposed.

2. Testing and Validation

The implementation is supported by a comprehensive test suite in tests/blake2x.rs that validates correctness through two primary resources: official test vectors and a reference implementation.

  • Test Vectors:

    • The tests/data/ directory now includes blake2xb-kat.json and blake2xs-kat.json. These are the test vectors sourced directly from the BLAKE2 RFC repository.
    • The tests parse these JSON files and validate the implementation against a wide range of input and output sizes for both keyed and unkeyed hashing.
  • Reference Implementation (b2rs):

    • To further ensure correctness, the test suite now uses the b2rs crate as a reference implementation. This crate is maintained on GitHub by Jean-Philippe Aumasson (@veorq), one of the original authors of the BLAKE2 algorithm.
    • The tests use b2rs in two ways:
      1. Fuzz-like comparison: Random inputs and output lengths are generated, and the output of our implementation is compared against the output of b2rs.
      2. ~~Internal State Validation: The tests go a step further by using b2rs to verify intermediate values of the Blake2X computation, such as the value of the root hash (H₀) and the output of the first expansion node. This confirms that our internal parameter block construction and hashing logic are correct, not just the final result.~~ Edit: removed in 4d3debf264a45da9e33d52645eb6ee9963336f66

All tests, including functional checks for progressive reads and constructor behavior, are passing.


Edit: the last 2 commits of the PR respectively fix a typo-detection CI false positive, and unrelated clippy warnings.

huitseeker avatar Jul 15 '25 19:07 huitseeker

@newpavlov Happy to add a CHANGELOG line as part of this PR, or remove the b2rs unit tests, or anything else you might think needs adapting.

huitseeker avatar Jul 17 '25 12:07 huitseeker

Hey @newpavlov, are there any blockers for this PR to get reviewed and merged? Can we help in any way?

LesnyRumcajs avatar Sep 15 '25 08:09 LesnyRumcajs

@LesnyRumcajs needs a rebase

tarcieri avatar Sep 19 '25 01:09 tarcieri

@tarcieri I've had to remove the testing of now-private structs in 4d3debf as part of the rebase, which favors respecting the visibility modifiers that are in main. I could reinstate the test and change those modifiers, lmk.

huitseeker avatar Sep 19 '25 02:09 huitseeker

Will be great to get this merged! Thanks for the effort here.

phutchins avatar Sep 19 '25 12:09 phutchins

Thanks for rebasing.

This is a very large PR which makes it difficult to review.

Another issue is we'd like to replace the entire BLAKE2 implementation: #228

tarcieri avatar Sep 23 '25 22:09 tarcieri

@tarcieri The splits present in the top of the PR description should make clear most of the PR is test vector files, which are extracted directly from the RFC repo.

Specifically the test vector files are the result of running jq '[.[] | select(.hash == "blake2xb")] blake2-kat.json on the file https://github.com/BLAKE2/BLAKE2/blob/master/testvectors/blake2-kat.json (for blake2xb-kat.json) and running jq '[.[] | select(.hash == "blake2xs")]' on that same file (for blake2/tests/data/blake2xs/blake2xs-kat.json).

Please let me know if you would like me to programmatically download those test vector files from that repo upon testing, thereby removing them here (it could look like that +172/-6149) — I haven't because considering the contents of blake2/tests/data, committing those test vectors seems to be the convention.

Re: implementation replacement, the blake2x implementation in the current form of the PR now uses only the public API of blake2 (this is the point I address in this comment). Consequently, it should survive a switch to some new implementation of blake2 without requiring code changes.

huitseeker avatar Sep 24 '25 00:09 huitseeker