Add support for ByteBuffer next to byte[]
Added support for ByteBuffer next to byte[].
The unit tests do not seem to run when doing mvn verify, and I'm unsure why. Because of this, I did not create a unit test.
I see your point and support the idea in general. But I'm not exactly fond of introducing a function object for access abstraction. I appreciate the abstraction, but it also introduces a performance penalty for the most common use scenario. I'll need to profile it a bit and weigh things.
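To make the discussion concrete: the "function object for access abstraction" idea could look something like the sketch below. This is not the PR's actual code; the accessor signature is hypothetical, and CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF) is used as a stand-in algorithm. The point is that one CRC loop can serve both byte[] and ByteBuffer by reading bytes through a function, which is exactly where the per-byte call overhead being profiled comes from.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.IntUnaryOperator;

public class AbstractedCrc {
    // One CRC loop reads bytes through an accessor function instead of
    // indexing a byte[] directly, so byte[] and ByteBuffer share the code.
    // CRC-16/CCITT-FALSE is a stand-in for the library's real parameters.
    public static int crc16(IntUnaryOperator byteAt, int length) {
        int crc = 0xFFFF;
        for (int pos = 0; pos < length; pos++) {
            crc ^= (byteAt.applyAsInt(pos) & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        byte[] arr = "123456789".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.wrap(arr);
        int fromArray = crc16(i -> arr[i], arr.length);     // byte[] accessor
        int fromBuffer = crc16(buf::get, buf.remaining());  // ByteBuffer accessor
        System.out.printf("0x%04X 0x%04X%n", fromArray, fromBuffer); // both 0x29B1
    }
}
```

Both call sites funnel into the same loop, which is the flexibility win; the open question in this thread is what the extra indirection costs per byte.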
I also see the issue with the tests. Yes, that is weird. I usually ran them from the IDE, and they still work from the IDE, but somehow they don't run from mvn. I'll need to investigate it.
I'm afraid it will take a bit of time for me to take care of it all.
Thanks for your response. I see your point. I'll wait and see what profiling will tell.
If there's something I can do, let me know.
Can I get more details on your use case?
Just a progress update. I did some profiling and results are theoretically interesting, but not very good.
I can't measure a single call unless the data is very large, which is not representative.
For a small number of repetitions (e.g. 1000), your implementation is slower by 25-30%, which is too high to be accepted. In this case, you are much better off just copying the bytes to a pre-allocated buffer (reused between calculations) and then using the old functions to calculate the CRC. And that is true for both the static method and the table-based implementation.
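The "copy to a pre-allocated buffer" approach described above can be sketched roughly as follows. This is an illustration, not the library's code: the scratch size is an assumed upper bound, and crc16Ccitt is a stand-in for whatever byte[]-based routine the library exposes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ScratchCopyCrc {
    // Scratch array reused between calculations; 1024 bytes is an assumed
    // upper bound on message size for this sketch.
    private static final byte[] SCRATCH = new byte[1024];

    // Copy the buffer's remaining bytes into the reused scratch array and
    // run the existing byte[]-based routine over them. duplicate() leaves
    // the caller's position/limit untouched.
    public static int crcOf(ByteBuffer buf) {
        int len = buf.remaining();
        buf.duplicate().get(SCRATCH, 0, len);
        return crc16Ccitt(SCRATCH, 0, len);
    }

    // Stand-in for the library's byte[]-based CRC (CRC-16/CCITT-FALSE here).
    public static int crc16Ccitt(byte[] data, int off, int len) {
        int crc = 0xFFFF;
        for (int i = off; i < off + len; i++) {
            crc ^= (data[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        ByteBuffer msg = ByteBuffer.wrap("123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.printf("0x%04X%n", crcOf(msg)); // prints 0x29B1
    }
}
```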
For a large number of repetitions, JVM optimizations do improve things. Interestingly enough, the JVM seems to optimise the ByteBuffer version better: it comes within a 1% penalty of the original byte[] version for very short data (like 100 bytes) and a very large number of repetitions (like millions). But the penalty stays at 5-10% for larger chunks, and the abstracted byte[] version remains about 10% slower even for a very large number of repetitions.

But the main issue is that a high number of repetitions is not very representative for this call anyway, as the table-based version will be much faster in that case (typically, you are better off with the table-based implementation if you calculate CRCs over more than 0.5K of data over the run time of your program). The table-based implementation looks better, and your ByteBuffer implementation eventually starts beating the pre-allocated buffer approach. But by that time all variants are within 1% of the old byte[]-based call, which remains the fastest.
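The static-vs-table trade-off described above comes from the table version paying a one-time cost (building a 256-entry lookup table) to replace eight shift/xor steps per byte with a single lookup, which is why it wins once the total data processed crosses some threshold. A generic sketch (again using CRC-16/CCITT-FALSE as a stand-in, not the library's actual table code):

```java
public class TableCrc {
    // One-time cost: 256-entry lookup table for poly 0x1021, built once
    // per CRC parameter set.
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int crc = n << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
            TABLE[n] = crc;
        }
    }

    // Per byte: one table lookup instead of eight shift/xor steps, which is
    // why this pays off once enough total data flows through it.
    public static int crc16(byte[] data) {
        int crc = 0xFFFF;
        for (byte b : data) {
            crc = ((crc << 8) ^ TABLE[((crc >>> 8) ^ (b & 0xFF)) & 0xFF]) & 0xFFFF;
        }
        return crc;
    }

    public static void main(String[] args) {
        System.out.printf("0x%04X%n", crc16("123456789".getBytes())); // prints 0x29B1
    }
}
```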
So, sorry, I will take a bit more time to think about it. I understand the benefits of supporting ByteBuffer in the API, and I like your functional abstraction. But the penalty is too high and the benefits are contrived and negligibly small, so it is a no-go as it is. What I am thinking, though, is keeping my implementation for byte[]-based calculations and adding your abstracted methods next to it (for ByteBuffer and anyone else's generic use). This would mean code duplication. But on the other hand, it would add a new API (and flexibility) with no performance degradation for the old code. So, I'm currently weighing theoretical bad vs semi-theoretical good.
Thanks for your time looking into profiling this. Very thorough! Is there a blog post on your method and the results somewhere? It would make an interesting read!
Can I get more details on your use case?
Well, I'm mocking a USB device. For this I use a ByteBuffer, because I need support for both big and little endianness and for writing short/int/long values. After writing the data, I need to compute and append the CRC. As I already have my data in a ByteBuffer, the easiest thing for me would be to pass that on to your framework.
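For context, that workflow looks roughly like this. The command id and payload values are made up, and crc16 is again a stand-in CRC-16/CCITT-FALSE, reading the buffer with absolute get(int) so the write position stays untouched:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UsbFrame {
    // Stand-in CRC over an absolute range of the buffer; absolute get(int)
    // reads do not disturb the buffer's position.
    public static int crc16(ByteBuffer buf, int from, int to) {
        int crc = 0xFFFF;
        for (int i = from; i < to; i++) {
            crc ^= (buf.get(i) & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        ByteBuffer msg = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);
        msg.putShort((short) 0x0102); // made-up command id
        msg.putInt(42);               // made-up payload word
        int crc = crc16(msg, 0, msg.position());
        msg.putShort((short) crc);    // append CRC in the frame's byte order
        msg.flip();                   // frame ready: 2 + 4 + 2 = 8 bytes
        System.out.println(msg.remaining()); // prints 8
    }
}
```

A ByteBuffer-accepting CRC call slots directly into the step between writing the payload and appending the checksum, which is why the byte[]-only API forces the copy-to-scratch detour.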
...you are much better off just copying bytes to a pre-allocated buffer...
That's what I'm currently doing. It's good enough for my use case - I don't need the speed per se.
My PR was just an attempt to see if you would consider this approach, because it would be a better fit for my use case. But like I said, I can do without ByteBuffer support and use byte[]. So again, thanks for looking into this.
No, I didn't write a blog post. I copied your functions, appending 2 to their names, and also made my own ByteBuffer implementation by simply duplicating my original code and replacing byte[] with ByteBuffer (I appended 3 to that name). Then I just made some simple benchmarking code and played with the parameters. Not very scientific, and you would need a few runs to confirm what you see, but in my tests the trends I described were very repeatable/stable. And then it all boils down to how long your app/device lives and how many messages it processes in a single run (or from power-on to power-off in the case of a device).
I've attached both files in case you want to try them yourself (e.g. adjust them to resemble your use case better).
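The attached files aren't reproduced here, but the ad-hoc methodology described (duplicate the variants, warm up, time repetitions) can be sketched like this. Everything here is illustrative: the variant implementations stand in for the renamed copies, repetition counts are examples, and as noted above you'd need several runs to trust the numbers.

```java
import java.nio.ByteBuffer;
import java.util.Random;

public class CrcBench {
    // Stand-in CRC-16/CCITT-FALSE over a byte[] (the "original" variant).
    public static int crc16(byte[] d) {
        int crc = 0xFFFF;
        for (byte b : d) {
            crc ^= (b & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }

    // Same algorithm reading a ByteBuffer via absolute gets (candidate variant).
    public static int crc16(ByteBuffer b) {
        int crc = 0xFFFF;
        for (int i = b.position(); i < b.limit(); i++) {
            crc ^= (b.get(i) & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }

    // Warm up so the JIT compiles the body, then time the repetitions.
    public static long time(Runnable body, int reps) {
        for (int i = 0; i < 10_000; i++) body.run();
        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) body.run();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];   // the "very short data" case from the thread
        new Random(42).nextBytes(data);
        ByteBuffer buf = ByteBuffer.wrap(data);
        long tArr = time(() -> crc16(data), 100_000);
        long tBuf = time(() -> crc16(buf), 100_000);
        System.out.printf("byte[]: %d ns, ByteBuffer: %d ns%n", tArr, tBuf);
    }
}
```

A harness like JMH would control for JIT and dead-code elimination more rigorously, but as the comment says, stable trends across repeated runs are the useful signal here.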
Cool - thanks. Alright, let's close this one and stick to byte[] then, right?
I'm actually leaning towards adding your methods next to mine. They have their merits. E.g. some people may have all sorts of buffers, streams or other data types and might be perfectly OK with a performance hit if it means they don't need to manage pre-allocated byte arrays and copy data. So I'm actually deciding whether I want to add just your methods next to mine, a wrapper class to better support a pre-allocated buffer, or both. I just need to think twice before adding anything to the public API of a library, so I don't like rushing such decisions.
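For the "wrapper class to better support a pre-allocated buffer" option, one possible shape is an instance that owns a growable scratch array and delegates to the fast byte[] path. This is purely a hypothetical sketch of that API direction; the class name, method, and stand-in CRC are all invented here:

```java
import java.nio.ByteBuffer;

public class CrcScratch {
    private byte[] scratch = new byte[256]; // grows on demand, reused afterwards

    // Copies the buffer's remaining bytes into the reused scratch array and
    // delegates to the byte[]-based routine; the caller's position is preserved.
    public int calculate(ByteBuffer buf) {
        int len = buf.remaining();
        if (scratch.length < len) {
            scratch = new byte[Math.max(len, scratch.length * 2)];
        }
        buf.duplicate().get(scratch, 0, len);
        return crc16(scratch, len);
    }

    // Stand-in byte[] CRC (CRC-16/CCITT-FALSE).
    private static int crc16(byte[] d, int len) {
        int crc = 0xFFFF;
        for (int i = 0; i < len; i++) {
            crc ^= (d[i] & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                crc = (((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1)) & 0xFFFF;
            }
        }
        return crc;
    }
}
```

Callers keep the convenience of passing a ByteBuffer while the hot loop stays on the unmodified byte[] implementation; the cost of the copy is the trade-off being weighed in this thread.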
Thanks @snksoft !