[#1163] improvement(netty): return the direct byte buffer when getting localfile data by using netty
What changes were proposed in this pull request?
Currently, the FileSegmentManagedBuffer will return the heap bytebuffer when reading data from localfile. It will be better to using the direct byte buffer
Why are the changes needed?
Fix: # (issue)
Does this PR introduce any user-facing change?
No.
try to resolve https://github.com/apache/incubator-uniffle/issues/1163
Could you help describe more in description ? @pegasas
Codecov Report
Merging #1171 (3964b6a) into master (523cc4f) will increase coverage by
1.12%. Report is 7 commits behind head on master. The diff coverage is0.00%.
@@ Coverage Diff @@
## master #1171 +/- ##
============================================
+ Coverage 53.61% 54.74% +1.12%
- Complexity 2575 2582 +7
============================================
Files 391 371 -20
Lines 22366 20047 -2319
Branches 1876 1875 -1
============================================
- Hits 11992 10975 -1017
+ Misses 9664 8436 -1228
+ Partials 710 636 -74
| Files Changed | Coverage Δ | |
|---|---|---|
| ...niffle/server/netty/ShuffleServerNettyHandler.java | 1.47% <0.00%> (+<0.01%) |
:arrow_up: |
... and 29 files with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
We know the managedBuffer is FileSegmentManagedBuffer . To keep performance, we should ensure the FileChannel is used, which is invoked by convertToNetty method. Right?
Sure. I will fix this and add some description.
@pegasas In fact, this line of code in getShuffleData#transferTo method channel.write(buffer.convertToNetty()) will use zero-copy api, we should keep using FileSegmentManagedBuffer instead of NettyManagedBuffer.
@pegasas In fact, this line of code in getShuffleData#transferTo method
channel.write(buffer.convertToNetty())will use zero-copy api, we should keep usingFileSegmentManagedBufferinstead ofNettyManagedBuffer.
It looks like we indeed use zero copy when in MessageEncoder#write,
Perhaps we may replace https://github.com/apache/incubator-uniffle/blob/master/common/src/main/java/org/apache/uniffle/common/netty/buffer/FileSegmentManagedBuffer.java#L64 into direct byte buffer?
It seems it use heap byte buffer in https://github.com/apache/incubator-uniffle/blob/master/internal-client/src/main/java/org/apache/uniffle/client/impl/grpc/ShuffleServerGrpcNettyClient.java#L305-L306
Is there any thinking when we first design here? Will there be any risk?
@zuston @leixm
@pegasas In fact, this line of code in getShuffleData#transferTo method
channel.write(buffer.convertToNetty())will use zero-copy api, we should keep usingFileSegmentManagedBufferinstead ofNettyManagedBuffer.It looks like we indeed use zero copy when in
MessageEncoder#write,Perhaps we may replace https://github.com/apache/incubator-uniffle/blob/master/common/src/main/java/org/apache/uniffle/common/netty/buffer/FileSegmentManagedBuffer.java#L64 into direct byte buffer?
It seems it use heap byte buffer in https://github.com/apache/incubator-uniffle/blob/master/internal-client/src/main/java/org/apache/uniffle/client/impl/grpc/ShuffleServerGrpcNettyClient.java#L305-L306
Is there any thinking when we first design here? Will there be any risk?
@zuston @leixm
You are right, we can refer to https://github.com/apache/spark/blob/master/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageEncoder. Java, implement Encoder zero-copy, write an AbstractFileRegion.
The FileSegmentManagedBuffer#nioByteBuffer method is for the Grpc server, not for the nettty server. I am not sure whether to use off-heap memory as much as possible in Grpc mode, what do you think? cc @zuston .
Regarding the second question, the client will use GetLocalShuffleDataResponse as the decode result. In fact, the buffer here comes from netty's ByteBuf. As long as the netty client uses off-heap memory, off-heap memory is also used here. You can refer to TransportConf#preferDirectBufs.
@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?
@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?
Maybe we're not familiar with the Netty. Could you raise a pr to enable the Netty?
@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?
Maybe we're not familiar with the Netty. Could you raise a pr to enable the Netty?
Sure.
@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?
Sure. Will continue following this thread. Thanks @leixm !