
More performant digit check and its extraction

Open xtonik opened this issue 2 years ago • 4 comments

The project currently uses shared helper methods for digit checks that are reused in many places. They use the trick (char) (c - '0') < 10, but it is slower than the classic character range test '0' <= c && c <= '9'. To reproduce, check out 8294dd3 and run, e.g., the benchmark JmhJavaFloatFromByteArray.java for the value "3.141592". The difference should be about 5% compared with a run of the same benchmark on the previous commit 0a5ca6a.
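
For readers who want to see the two checks side by side, here is a minimal sketch. The class and method names are hypothetical and not taken from FastDoubleParser; it only demonstrates that both variants accept exactly the same inputs for char and byte, and says nothing about which one is faster.

```java
// Hypothetical sketch; names are illustrative, not FastDoubleParser's API.
final class DigitCheckSketch {
    /** Subtraction trick: values below '0' wrap around to large chars, so one comparison suffices. */
    static boolean isDigitBySubtraction(char c) {
        return (char) (c - '0') < 10;
    }

    /** Classic range test: two comparisons, no arithmetic. */
    static boolean isDigitByRange(char c) {
        return '0' <= c && c <= '9';
    }

    /** Byte variant of the subtraction trick; negative bytes also wrap to large chars. */
    static boolean isDigitBySubtraction(byte c) {
        return (char) (c - '0') < 10;
    }

    /** Byte variant of the range test. */
    static boolean isDigitByRange(byte c) {
        return '0' <= c && c <= '9';
    }

    public static void main(String[] args) {
        // Exhaustively compare both variants over every char value.
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (isDigitBySubtraction(c) != isDigitByRange(c)) {
                throw new AssertionError("char mismatch at " + i);
            }
        }
        // Exhaustively compare both variants over every byte value.
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (isDigitBySubtraction(b) != isDigitByRange(b)) {
                throw new AssertionError("byte mismatch at " + i);
            }
        }
        System.out.println("both variants agree");
    }
}
```

Since the two variants are functionally equivalent, the choice between them is purely a question of measured performance on the target platform.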

Beyond that, the trick evaluates the expression c - '0', which is evaluated again afterwards when accumulating digits into the significand. If this expression is hoisted out and its result reused, parsing is up to 10% faster on the main branch; run the same benchmark after checking out 584feeb.
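
The idea of reusing the subtraction can be sketched roughly as follows. This is an illustration under assumptions, not the code from 584feeb: the class, the method names and the simplified loop are hypothetical, and overflow of the accumulated value is deliberately ignored.

```java
// Hypothetical sketch; not the actual FastDoubleParser code.
final class DigitAccumulationSketch {
    /** Shared helper in the style of the existing digit check. */
    static boolean isDigit(byte c) {
        return (char) (c - '0') < 10;
    }

    /** Variant A: the helper computes c - '0', then the loop body computes it again. */
    static long accumulateWithHelper(byte[] str, int from, int to) {
        long significand = 0;
        int i = from;
        while (i < to && isDigit(str[i])) {
            significand = 10 * significand + (str[i] - '0');
            i++;
        }
        return significand;
    }

    /** Variant B: c - '0' is computed once and reused for both the check and the accumulation. */
    static long accumulateInlined(byte[] str, int from, int to) {
        long significand = 0;
        for (int i = from; i < to; i++) {
            int digit = (char) (str[i] - '0'); // non-digits wrap to a value >= 10
            if (digit >= 10) {
                break;
            }
            significand = 10 * significand + digit;
        }
        return significand;
    }
}
```

Both variants stop at the first non-digit and return the same value; variant B simply avoids the helper call where the difference c - '0' is needed anyway. Whether a JIT compiler already removes the redundancy in variant A depends on the platform.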

Note that there are more than 80 uses of isDigit(), so it is spread widely across the whole project.

The change introduces a lot of duplicated code, but since the project inherently provides nearly the same functionality for various input data types, the duplication is unavoidable.

I can take care of the change if nobody else has time to implement it.

xtonik avatar Jun 17 '23 09:06 xtonik

On which hardware did you see a performance difference? I tried the proposed changes on two different Intel CPUs and did not measure any difference.

Here are my results for JmhJavaFloatFromByteArray:


 * FastDoubleSwar.isDigits() with return '0' <= c && c <= '9';
 * Benchmark                              (str)  Mode  Cnt    Score   Error  Units
 * JmhJavaFloatFromByteArray.m                0  avgt    4    4.613 ± 0.399  ns/op
 * JmhJavaFloatFromByteArray.m              365  avgt    4    8.795 ± 0.330  ns/op
 * JmhJavaFloatFromByteArray.m             10.1  avgt    4   12.456 ± 0.290  ns/op
 * JmhJavaFloatFromByteArray.m        3.1415927  avgt    4   16.972 ± 2.539  ns/op
 * JmhJavaFloatFromByteArray.m    1.6162552E-35  avgt    4   22.124 ± 0.188  ns/op
 * JmhJavaFloatFromByteArray.m  0x1.57bd4ep-116  avgt    4  365.855 ± 6.610  ns/op
 *
 * FastDoubleSwar.isDigits() with return (char) (c - '0') < 10;
 * Benchmark                              (str)  Mode  Cnt    Score   Error  Units
 * JmhJavaFloatFromByteArray.m                0  avgt    4    4.684 ± 0.568  ns/op
 * JmhJavaFloatFromByteArray.m              365  avgt    4    9.394 ± 0.108  ns/op
 * JmhJavaFloatFromByteArray.m             10.1  avgt    4   12.212 ± 0.090  ns/op
 * JmhJavaFloatFromByteArray.m        3.1415927  avgt    4   16.951 ± 2.436  ns/op
 * JmhJavaFloatFromByteArray.m    1.6162552E-35  avgt    4   23.245 ± 0.235  ns/op
 * JmhJavaFloatFromByteArray.m  0x1.57bd4ep-116  avgt    4  363.210 ± 4.485  ns/op
 

I have also tried implementing the proposed inlining of method isDigit() in commit add318a. The changes made no difference in the measured performance.

Here is an example with file mesh.txt, with the proposed inlining of method isDigit():

CPU: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
OS: Linux, 5.15.0-1039-azure, 2 processors available
VM: Java 20, OpenJDK 64-Bit Server VM, Azul Systems, Inc., 20.0.1+9
-XX:CompileCommand=inline,java/lang/String.charAt
Parsing numbers in file /home/runner/work/FastDoubleParser/FastDoubleParser/fastdoubleparserdemo/data/mesh.txt
...

Measurement results:
JavaDoubleParser byte[]     :   354.51 MB/s (+/- 1.8 %)    48.29 Mfloat/s      20.71 ns/f     2.63 speedup

And here is an example with file mesh.txt, without the proposed inlining of method isDigit():

JavaDoubleParser byte[]     :   360.09 MB/s (+/- 2.2 %)    49.05 Mfloat/s      20.39 ns/f     2.74 speedup

wrandelshofer avatar Jun 17 '23 15:06 wrandelshofer

I use "Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz" running on Debian 11.

One thing is (char) (c - '0') < 10 vs. '0' <= c && c <= '9': as we see, it is probably platform dependent, and it is hard to say which is better.

But the second issue is reusing the result of the expression x - '0' outside the method digit() where possible. My suggestion is not to use the method in those cases at all.

xtonik avatar Jun 21 '23 12:06 xtonik

Yes, you are right: within the parser the change is unnoticeable. I have created a separate benchmark, which gives me clear results in favour of the conditional variant. But this does not necessarily mean that it is more performant; there can be some pitfall, as there often is.

I have also added other improvements related to the issue, though not directly, in commits ec92d2d, f17b7a0 and ce2139c. As I see it now, they should rather go into separate merge requests.

xtonik avatar Jun 21 '23 12:06 xtonik

I have found the reason: "the example of how not to use isDigit() at all" had been included. It is rolled back by the last commit caee9b2, and a small difference in the results of JmhJavaFloatFromByteArray now appears. So my suggestion is:

  1. Replace (char) (c - '0') < 10 with '0' <= c && c <= '9' in method isDigit() for both the char and byte variants.
  2. Wherever the expression x - '0' is then evaluated again, avoid using this method completely, in the same manner as the changes rolled back in caee9b2.

Please don't forget to have a look at the optimized versions of methods isEightZeroes(), tryToParseUpTo7Digits() and isEightDigits().

xtonik avatar Jun 21 '23 13:06 xtonik

I gave it another try. Inlining of the isDigit() method has no performance effect on an Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz, but it does improve the performance on an Apple M2 Max. So, I decided to include your proposed fix. Thank you very much! 😀

https://github.com/wrandelshofer/FastDoubleParser/commit/dbdce023c6a349b6a721b8eeb2c8d7b3b26146f4

wrandelshofer avatar May 25 '24 14:05 wrandelshofer