simplecpp

added `TokenList::Stream` class to wrap `std::istream` usage and implemented alternative C I/O version

Open firewave opened this issue 3 years ago • 2 comments

Reading the file via `std::istream` incurs considerable overhead from `std::istream::sentry` and related machinery. Using C I/O instead reduces the "total Ir" count by about 10%.

Testing with -q -Ilib/ -D__GNUC__ lib/valueflow.cpp:

Clang 13 271,013,439 -> 269,398,283 -> 244,353,123
GCC 11 -> ->

The intermediate value shows the improvement if we only change how the actual input file is read. The last value shows the result if we also do it for all the includes, which does not require you to use the new interface function. So even if you do not change the application which uses this, it would still result in a sizable improvement.

firewave, Mar 07 '22 12:03

This still needs the unit tests adjusted to cover both implementations, and the implementation needs to be made properly selectable in the standalone binary.

This can possibly be improved further by adding buffering to FileStream.
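A minimal sketch of what such buffering might look like, assuming a fixed-size buffer filled with `fread()`. The class name `BufferedFileStream` and the 4096-byte buffer size are made up for illustration; since `FILE*` is already buffered by the C library, whether this actually helps would have to be measured.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical buffered reader (illustrative, not the simplecpp implementation).
class BufferedFileStream {
public:
    explicit BufferedFileStream(const std::string &path)
        : file_(std::fopen(path.c_str(), "rb")), pos_(0), size_(0), havePrev_(false) {}
    ~BufferedFileStream() { if (file_) std::fclose(file_); }
    BufferedFileStream(const BufferedFileStream &) = delete;
    BufferedFileStream &operator=(const BufferedFileStream &) = delete;

    // Returns the next byte (0..255) or EOF.
    int get() {
        if (pos_ == size_ && !fill()) return EOF;
        return buf_[pos_++];
    }
    // Returns the next byte without consuming it, or EOF.
    int peek() {
        if (pos_ == size_ && !fill()) return EOF;
        return buf_[pos_];
    }
    // Steps back one byte; slot 0 holds the last byte of the previous
    // chunk so that this also works right after a refill.
    void unget() {
        if (pos_ > 1 || (pos_ == 1 && havePrev_)) --pos_;
    }

private:
    // Reads the next chunk into buf_[1..]; returns false if nothing was read.
    bool fill() {
        if (!file_) return false;
        if (size_ > 1 || havePrev_) {
            buf_[0] = buf_[size_ - 1];  // keep the last consumed byte for unget()
            havePrev_ = true;
        }
        const std::size_t n = std::fread(buf_ + 1, 1, sizeof(buf_) - 1, file_);
        pos_ = 1;
        size_ = 1 + n;
        return n > 0;
    }

    std::FILE *file_;
    unsigned char buf_[4096];
    std::size_t pos_;
    std::size_t size_;
    bool havePrev_;
};
```

The point of the design is that `get()` and `peek()` become plain array accesses in the common case, with one `fread()` call per chunk instead of one library call per byte.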

We could also add a MemStream version which works on a memory buffer. That would cover the use case of just passing a std::istringstream (i.e. a string) and would allow deprecating/removing the old API with std::istream.
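As a rough illustration of the MemStream idea, a non-owning reader over a byte buffer could look like the sketch below. The interface (`get`, `peek`, `unget`, `good`) is an assumption chosen to mirror `std::istream`, not an existing simplecpp API.

```cpp
#include <cstddef>
#include <cstdio>  // for EOF
#include <string>

// Hypothetical in-memory reader. It does not own the buffer, so the
// caller has to keep the underlying data alive while reading.
class MemStream {
public:
    MemStream(const unsigned char *data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}
    explicit MemStream(const std::string &s)
        : data_(reinterpret_cast<const unsigned char *>(s.data())),
          size_(s.size()), pos_(0) {}

    // Returns the next byte (0..255) or EOF at the end of the buffer.
    int get() { return pos_ < size_ ? data_[pos_++] : EOF; }
    // Returns the next byte without consuming it, or EOF.
    int peek() const { return pos_ < size_ ? data_[pos_] : EOF; }
    // Steps back one byte if there is one to step back to.
    void unget() { if (pos_ > 0) --pos_; }
    // True while unread bytes remain.
    bool good() const { return pos_ < size_; }

private:
    const unsigned char *data_;
    std::size_t size_;
    std::size_t pos_;
};
```

A caller that currently wraps a string in `std::istringstream` could pass the string (or its data pointer and size) directly instead.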

firewave, Mar 07 '22 12:03

~Somehow using fgetc() seems to already interpret the characters causing 0x00 to be omitted when read even though the file is open in "binary" mode. Using fread() (which is supposed to be faster) fixes this but it is much slower.~

Turns out it's just a bug in my code with unget() and multi-byte characters.
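The thread does not say what the actual bug was, so the following is only a general illustration of the pitfall: `fgetc()` on a stream opened with "rb" does return 0x00 bytes, but `ungetc()` only guarantees one byte of pushback, so undoing the read of a two-byte unit needs something else. The helpers `get16` and `unget16` are hypothetical names, and big-endian order is assumed.

```cpp
#include <cstdio>

// Reads one 16-bit big-endian unit; returns EOF if fewer than two bytes remain.
int get16(std::FILE *f) {
    const int hi = std::fgetc(f);
    if (hi == EOF) return EOF;
    const int lo = std::fgetc(f);
    if (lo == EOF) return EOF;
    return (hi << 8) | lo;
}

// Undoes the last successful get16(). Two ungetc() calls are not
// guaranteed to work, so this seeks back two bytes instead.
bool unget16(std::FILE *f) {
    return std::fseek(f, -2L, SEEK_CUR) == 0;
}
```

For a file containing the bytes `00 41 00 42`, `get16()` yields `'A'` and then `'B'`, and calling `unget16()` after the first read makes the next `get16()` return `'A'` again.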

firewave, Apr 19 '22 07:04

I backed out the test changes and will merge them into #261 so they don't get lost.

firewave, Oct 27 '22 11:10

Testing with -q -Ilib/ -D__GNUC__ lib/valueflow.cpp:

current -> std::istream input -> FILE* input
Clang 14 263,400,261 -> 255,772,481 -> 241,915,419
GCC 12 266,098,949 -> 261,699,461 -> 247,682,262

Testing with -q -Ilib/ -D__GNUC__ lib/tokenize.cpp:

current -> std::istream input -> FILE* input
Clang 14 300,502,647 -> 294,632,802 -> 275,871,698
GCC 12 304,202,739 -> 301,878,425 -> 282,870,756

firewave, Oct 27 '22 11:10