parquet-java icon indicating copy to clipboard operation
parquet-java copied to clipboard

Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

Open theosib-amazon opened this issue 3 years ago • 7 comments

I broke up https://github.com/apache/parquet-mr/pull/953 into more digestible pieces. This new PR is the lowest level set of changes. By themselves, these additions to ByteBufferInputStream don't yield much improvement, so future PRs will include modifications to other source files that take advantage of this new functionality.

The complete set of changes (including subsequent PRs) is for performance optimization. In benchmarking with Trino, we find query performance to improve from 5% to 15%, depending on the query, and that includes all the I/O time from S3.

All of LittleEndianDataInputStream functionality is moved into ByteBufferInputStream, without changing any pre-existing interfaces or functionality. These changes yield the following benefits:

  • Elimination of extra layers of abstraction and method call overhead
  • Enable the use of intrinsics for readInt, readLong, etc.
  • Availability of faster access methods like readFully and skipFully, without the need for helper functions

This PR also marks LittleEndianDataInputStream as deprecated.

Context: I've been working on improving Parquet reading performance in Trino, mostly by profiling while running performance benchmarks and TPCDS queries. This PR is a subset of the changes I made that have more than doubled the performance of a lot of TPCDS queries (wall clock time, including the S3 access time). If you are kind enough to accept these changes, I look forward to offering further performance improvements.

theosib-amazon avatar Apr 25 '22 21:04 theosib-amazon

I added some tests for this new code.

theosib-amazon avatar Apr 26 '22 21:04 theosib-amazon

Is the '5% to 15%' gain from this change or along with other changes? If it is later, can you share the point to other changes? Like to see the overall changes before committing.

shangxinli avatar May 16 '22 00:05 shangxinli

That improvement comes from a larget set of changes. I have a design doc that goes over all those changes plus some more that make it possible to get even more performance improvements.

https://docs.google.com/document/d/1fBGpF_LgtfaeHnPD5CFEIpA2Ga_lTITmFdFIcO9Af-g/edit?usp=sharing

theosib-amazon avatar May 16 '22 15:05 theosib-amazon

@sunchao Can you have a review?

shangxinli avatar Jul 24 '22 21:07 shangxinli

Is this mostly a refactoring PR? I also don't see LittleEndianDataInputStream being marked as deprecated.

sunchao avatar Jul 25 '22 22:07 sunchao

Is this mostly a refactoring PR? I also don't see LittleEndianDataInputStream being marked as deprecated.

I initially marked LittleEndianDataInputStream as deprecated. But I seem to recall that I was advised by Ryan Blue to do that in a later PR.

theosib-amazon avatar Jul 26 '22 14:07 theosib-amazon

@theosib-amazon Thanks again for your contribution! I see the comments are generally around duplicating code, refactoring, and making code maintainable. If you have a measurement of improvement on this change alone, it would help the reviewers

shangxinli avatar Dec 03 '22 19:12 shangxinli