hashes icon indicating copy to clipboard operation
hashes copied to clipboard

Kupyna: implemented hashing function

Open AnarchistHoneybun opened this issue 1 year ago • 9 comments

function spec

Scrapped previous PR since it ran into a lot of problems. I'm aware that #601 exists, creating this one to check against the repo's test suite as I iterate on my implementation, and potentially make it the main pr for kupyna if the other one becomes inactive.

Currently the implementation does everything the paper describes (testing leaves a bit to be desired) so step one is to clean up any linter complaints, and then proceed to making it no-std compatible.

AnarchistHoneybun avatar Oct 11 '24 16:10 AnarchistHoneybun

@newpavlov could you give me some idea on why the tests are failing right now? I don't think I've touched the Cargo.toml let alone the version of digest; checked and it's the same version as that present in some other toml's. Any idea why this could be happening?

AnarchistHoneybun avatar Oct 11 '24 16:10 AnarchistHoneybun

Might be a bit unrelated but when I try cargo run on my local it gives me this error:

error: failed to select a version for `digest`.
    ... required by package `kupyna v0.1.0 (/home/vrin/kupyna_hashes_contrib/kupyna)`
versions that meet the requirements `=0.11.0-pre.9` are: 0.11.0-pre.9

all possible versions conflict with previously selected packages.

  previously selected package `digest v0.11.0-pre.8`
    ... which satisfies dependency `digest = "=0.11.0-pre.8"` of package `ascon-hash v0.3.0-pre (/home/vrin/kupyna_hashes_contrib/ascon-hash)`

failed to select a version for `digest` which could resolve this conflict

now this worked fine before updating the toml. Post toml update the CI is fine, but the main function is failing. Is this expected behaviour?

AnarchistHoneybun avatar Oct 11 '24 18:10 AnarchistHoneybun

Try to rebase your branch to master.

newpavlov avatar Oct 11 '24 20:10 newpavlov

I don't have a lot (read any) experience with rebasing, sadly. Could you explain a bit more?

AnarchistHoneybun avatar Oct 12 '24 04:10 AnarchistHoneybun

You should be able to rebase to this repos master using something like this (it accounts for the fact that you've forked from a fork):

git remote add upstream [email protected]:RustCrypto/hashes.git
git pull upstream
git checkout main_repo_contrib
git rebase upstream/master
git push --force

Note that I do not guarantee correctness of these commands since I wrote them from memory.

newpavlov avatar Oct 12 '24 04:10 newpavlov

rebased and made the hex formatting changes. I was wondering if I should also format the mds and s-box matrices the same way, but i think it'd make me put hex-literal as a proper dependency instead of a dev dependency. Any particular way you want me to approach this?

AnarchistHoneybun avatar Oct 13 '24 20:10 AnarchistHoneybun

In the previous PR I was told to remove all heap allocations from the code. This is the code snippet that was flagged:

fn matrix_to_block(matrix: Matrix) -> Vec<u8> {
    let mut block = vec![0u8; ROWS * COLS];

This was when ROWS and COLS were constants, and the hash code length was fixed on 512. This has been made dynamic since, and looks something like this now:

fn matrix_to_block(matrix: Matrix) -> Vec<u8> {
    let rows = matrix.len();
    let cols = matrix[0].len();

    let mut block = vec![0u8; rows * cols];

where the matrix type doesn't have a predefined size. Is there a way to go around the heap allocations in this?

AnarchistHoneybun avatar Oct 14 '24 10:10 AnarchistHoneybun

It's probably worth to make the output size a type parameter (i.e. Kupyna<OutputSize: ArraySize>) and calculate ROWS and COLUMNS from it. Alternatively, you could allocate the biggest block size on stack and then slice it to a smaller block, it certainly will be more efficient than using heap allocations.

newpavlov avatar Oct 14 '24 11:10 newpavlov

Could you explain the first approach a bit?

AnarchistHoneybun avatar Oct 15 '24 21:10 AnarchistHoneybun

I took a look at the spec and I think that the best option will be to follow the groestl crate here, i.e. to define two structs KupynaShortVarCore and KupynaLongVarCore implementing the VariableOutputCore trait. You then can define type aliases for Kupyna256, Kupyna384, and Kupyna512 by using the appropriate wrappers. The compress function then can be implemented separately for l equal to 512 and 1024.

newpavlov avatar Oct 23 '24 03:10 newpavlov

The CI failure is fixed in #637, so you need to rebase to master in addition to the requested changes.

newpavlov avatar Jan 03 '25 05:01 newpavlov

yeah I'll do that asap. bit of an issue with the code where it's failing hashing tests for the smaller state size tests even though it passes all individual tests (padding, blocking, t_xor and t_plus transforms) and it has troubled me to no end. going through the paper again to see what I have got wrong.

AnarchistHoneybun avatar Jan 03 '25 05:01 AnarchistHoneybun

hello @newpavlov could I get some guidance on how to emulate the padding mechanism of groestl? Like I have all other details ironed out more or less, figured out that aforementioned issue as well, but can't seem to get this

AnarchistHoneybun avatar Jan 11 '25 18:01 AnarchistHoneybun

What do you mean by "emulate the padding mechanism of groestl"? IIUC Kupyna uses a slightly different padding scheme (see Section 5 of the paper), i.e. instead of a 64 bit counter it uses 96 bits. You can take a look at how len64_padding_le is implemented in the block-buffer crate and implement the 96 bit padding similarly.

newpavlov avatar Jan 13 '25 11:01 newpavlov

hey @newpavlov , I figured out how it's doing the padding and think i got it down enough to work for kupyna. There's a slight issue where kupyna works for non standard lengths (see test vectors appendix where one vector is of a length that doesn't fill up a full byte). Are there any guidelines as to how to let the user pass a custom length for a given message? earlier I thought I'd just assume the message was up till the last high bit, but it's possible the user intended some zeroes to be part of the message too?

here's a temp repo with some groestl code that I've been modifying https://github.com/AnarchistHoneybun/upl/blob/main/src/lib.rs to hack on the problem

AnarchistHoneybun avatar Feb 19 '25 06:02 AnarchistHoneybun

Right now our policy is that we do not support bit-level updates, so you can ignore those vectors. See https://github.com/RustCrypto/traits/issues/917 for the previous discussion.

newpavlov avatar Feb 19 '25 19:02 newpavlov

hey @newpavlov , I think it's almost all groestl-ish now. I still have to move it to no_std and fix a small bug but like, any other large changes you'd wish to see?

AnarchistHoneybun avatar Feb 26 '25 17:02 AnarchistHoneybun

I will try to take a look at the code in the following weeks. Meanwhile, could you bump edition to 2024 and MSRV to 1.85 similarly to the other crates in this repo? It's also worth to rebase to master.

newpavlov avatar Feb 26 '25 20:02 newpavlov

Thanks @newpavlov , I've opened https://github.com/AnarchistHoneybun/kupyna_hashes_contrib/pull/5 against the PR'd branch separately to address most of your comments. There are still a few failed cases on a cargo test to address, but for the most part it's looking very groestley and hopefully ready to join the RustCrypto hashes family soon.

side note: I built a GitHub codespace to run this in which I have working nicely. Is it worth including a devcontainer.json in the main branch so anybody can fork then run it in a codespace, or would we rather avoid needing to manage a config file and leave it .gitignored?

jkoudys avatar Mar 01 '25 08:03 jkoudys

The 256-bit version's passing all its tests. The 512 has a few failures on larger message sizes, so I'm guessing there's still some error with the new padding code that's been implemented. I believe this was working correctly when it was a standalone lib. It's only struggling on larger data size.

running 6 tests
test kup512_n0 ... ok
test kup512_n1024 ... FAILED
test kup512_n1536 ... FAILED
test kup512_n2048 ... FAILED
test kup512_n512 ... ok
test kup512_n8 ... ok

jkoudys avatar Mar 01 '25 10:03 jkoudys

@newpavlov I'm pushing these in on https://github.com/AnarchistHoneybun/kupyna_hashes_contrib/pull/7

Do you have any examples of non-byte-aligned bit streams for hashing? You can take a 1-bit, 510-bit, etc. message hash in kupyna, and the C reference app does it. We have some tests that I'm trying to convert from the C version to do one of the bitwise hashes but I can't figure out how the shorter length gets passed down into the hasher.

Is it totally fine to just make this version support only byte-bound messages? I'm sure it's useful even without being bitwise. I assume that could go in on a future PR.

jkoudys avatar Mar 15 '25 14:03 jkoudys

IIRC most hash functions theoretically support processing messages with non-byte lengths. But it's a problematic capability to properly support on modern general purpose computers and it has an extremely low practical value, so cryptographic libraries generally just ignore it.

Is it totally fine to just make this version support only byte-bound messages?

Yes, see the traits issue I linked above.

newpavlov avatar Mar 15 '25 18:03 newpavlov

Hey @newpavlov could you also do a review and lmk if there's anything left out? I wanted to finalize this and then discuss the possibility of working on something larger like this (I remember yescrypt coming up in discussion) for GSOC if possible. Thank you!

AnarchistHoneybun avatar Mar 15 '25 18:03 AnarchistHoneybun

@AnarchistHoneybun I have the style updates, and the new unit tests, waiting on https://github.com/AnarchistHoneybun/kupyna_hashes_contrib/pull/7 . That PR's going into the same branch AnarchistHoneybun:main_repo_contrib for this PR to RustCrypto, so you'll just need to accept that PR.

jkoudys avatar Mar 16 '25 03:03 jkoudys

@AnarchistHoneybun If you don't plan to work on the comments above in the near future, I guess we could merge this PR in its current form and leave the improvements for later PRs.

newpavlov avatar Mar 23 '25 00:03 newpavlov

@AnarchistHoneybun If you don't plan to work on the comments above in the near future, I guess we could merge this PR in its current form and leave the improvements for later PRs.

I'm sorry I've just been stuck with some college assignments, planned to finish these improvements latest by this weekend. Merging it now seems fine as we could atleast have the function in the repo and then work on the mentioned details, but I'm fine with waiting too.

AnarchistHoneybun avatar Mar 23 '25 02:03 AnarchistHoneybun

How about merge at this level, then I can branch directly and finish the remaining?

jkoudys avatar Mar 23 '25 03:03 jkoudys