incubator-uniffle icon indicating copy to clipboard operation
incubator-uniffle copied to clipboard

[#1163] improvement(netty): return the direct byte buffer when getting localfile data by using netty

Open pegasas opened this issue 2 years ago • 11 comments

What changes were proposed in this pull request?

Currently, the FileSegmentManagedBuffer will return the heap bytebuffer when reading data from localfile. It will be better to using the direct byte buffer

Why are the changes needed?

Fix: # (issue)

Does this PR introduce any user-facing change?

No.

pegasas avatar Aug 24 '23 17:08 pegasas

try to resolve https://github.com/apache/incubator-uniffle/issues/1163

pegasas avatar Aug 24 '23 17:08 pegasas

Could you help describe more in description ? @pegasas

zuston avatar Aug 30 '23 03:08 zuston

Codecov Report

Merging #1171 (3964b6a) into master (523cc4f) will increase coverage by 1.12%. Report is 7 commits behind head on master. The diff coverage is 0.00%.

@@             Coverage Diff              @@
##             master    #1171      +/-   ##
============================================
+ Coverage     53.61%   54.74%   +1.12%     
- Complexity     2575     2582       +7     
============================================
  Files           391      371      -20     
  Lines         22366    20047    -2319     
  Branches       1876     1875       -1     
============================================
- Hits          11992    10975    -1017     
+ Misses         9664     8436    -1228     
+ Partials        710      636      -74     
Files Changed Coverage Δ
...niffle/server/netty/ShuffleServerNettyHandler.java 1.47% <0.00%> (+<0.01%) :arrow_up:

... and 29 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Aug 30 '23 03:08 codecov-commenter

We know the managedBuffer is FileSegmentManagedBuffer . To keep performance, we should ensure the FileChannel is used, which is invoked by convertToNetty method. Right?

Sure. I will fix this and add some description.

pegasas avatar Aug 30 '23 05:08 pegasas

@pegasas In fact, this line of code in getShuffleData#transferTo method channel.write(buffer.convertToNetty()) will use zero-copy api, we should keep using FileSegmentManagedBuffer instead of NettyManagedBuffer.

leixm avatar Sep 01 '23 05:09 leixm

@pegasas In fact, this line of code in getShuffleData#transferTo method channel.write(buffer.convertToNetty()) will use zero-copy api, we should keep using FileSegmentManagedBuffer instead of NettyManagedBuffer.

It looks like we indeed use zero copy when in MessageEncoder#write,

Perhaps we may replace https://github.com/apache/incubator-uniffle/blob/master/common/src/main/java/org/apache/uniffle/common/netty/buffer/FileSegmentManagedBuffer.java#L64 into direct byte buffer?

It seems it use heap byte buffer in https://github.com/apache/incubator-uniffle/blob/master/internal-client/src/main/java/org/apache/uniffle/client/impl/grpc/ShuffleServerGrpcNettyClient.java#L305-L306

Is there any thinking when we first design here? Will there be any risk?

@zuston @leixm

pegasas avatar Sep 04 '23 02:09 pegasas

@pegasas In fact, this line of code in getShuffleData#transferTo method channel.write(buffer.convertToNetty()) will use zero-copy api, we should keep using FileSegmentManagedBuffer instead of NettyManagedBuffer.

It looks like we indeed use zero copy when in MessageEncoder#write,

Perhaps we may replace https://github.com/apache/incubator-uniffle/blob/master/common/src/main/java/org/apache/uniffle/common/netty/buffer/FileSegmentManagedBuffer.java#L64 into direct byte buffer?

It seems it use heap byte buffer in https://github.com/apache/incubator-uniffle/blob/master/internal-client/src/main/java/org/apache/uniffle/client/impl/grpc/ShuffleServerGrpcNettyClient.java#L305-L306

Is there any thinking when we first design here? Will there be any risk?

@zuston @leixm

You are right, we can refer to https://github.com/apache/spark/blob/master/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageEncoder. Java, implement Encoder zero-copy, write an AbstractFileRegion.

The FileSegmentManagedBuffer#nioByteBuffer method is for the Grpc server, not for the nettty server. I am not sure whether to use off-heap memory as much as possible in Grpc mode, what do you think? cc @zuston .

Regarding the second question, the client will use GetLocalShuffleDataResponse as the decode result. In fact, the buffer here comes from netty's ByteBuf. As long as the netty client uses off-heap memory, off-heap memory is also used here. You can refer to TransportConf#preferDirectBufs.

leixm avatar Sep 04 '23 03:09 leixm

@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?

leixm avatar Sep 04 '23 03:09 leixm

@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?

Maybe we're not familiar with the Netty. Could you raise a pr to enable the Netty?

jerqi avatar Sep 04 '23 06:09 jerqi

@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?

Maybe we're not familiar with the Netty. Could you raise a pr to enable the Netty?

Sure.

leixm avatar Sep 04 '23 06:09 leixm

@pegasas Are you willing to raise a PR to implement the zero-copy logic of Encoder? also there is another problem. At present, only sendShuffleData uses netty. The three interfaces getLocalShuffleData, getMemoryShuffleData, and getLocalShuffleIndex all support netty, but they have not been enabled yet. Can you raise a PR to enable it?

Sure. Will continue following this thread. Thanks @leixm !

pegasas avatar Sep 04 '23 07:09 pegasas