rocketmq icon indicating copy to clipboard operation
rocketmq copied to clipboard

[ISSUE #4265] Fix query message get wrong result when tls enabled

Open zhanxuefeng opened this issue 3 years ago • 10 comments

Make sure set the target branch to develop

What is the purpose of the change

Fix query message get the wrong result when tls enabled

Brief changelog

Correct the return value of WritableByteChannel's write method in FileRegionEncoder

zhanxuefeng avatar May 09 '22 06:05 zhanxuefeng

Can you create an issue to describe this?

hzh0425 avatar May 09 '22 10:05 hzh0425

Can you create an issue to describe this?

ISSUE #4265 describe the bug.

zhanxuefeng avatar May 09 '22 12:05 zhanxuefeng

@zhanxuefeng oh, I understand, can you add a test to prove your pr?

hzh0425 avatar May 11 '22 14:05 hzh0425

Codecov Report

Merging #4266 (2ffa0a1) into develop (859bc15) will decrease coverage by 0.01%. The diff coverage is 20.00%.

@@              Coverage Diff              @@
##             develop    #4266      +/-   ##
=============================================
- Coverage      45.14%   45.12%   -0.02%     
+ Complexity      7634     7626       -8     
=============================================
  Files            976      976              
  Lines          67814    67816       +2     
  Branches        8964     8963       -1     
=============================================
- Hits           30612    30601      -11     
- Misses         33437    33445       +8     
- Partials        3765     3770       +5     
Impacted Files Coverage Δ
...va/org/apache/rocketmq/namesrv/NamesrvStartup.java 16.66% <0.00%> (-0.15%) :arrow_down:
...che/rocketmq/remoting/netty/FileRegionEncoder.java 92.85% <100.00%> (+0.54%) :arrow_up:
...rg/apache/rocketmq/common/stats/StatsSnapshot.java 84.61% <0.00%> (-15.39%) :arrow_down:
...apache/rocketmq/broker/longpolling/PopRequest.java 31.03% <0.00%> (-13.80%) :arrow_down:
...ache/rocketmq/common/stats/MomentStatsItemSet.java 39.13% <0.00%> (-13.05%) :arrow_down:
.../apache/rocketmq/common/stats/MomentStatsItem.java 38.09% <0.00%> (-9.53%) :arrow_down:
...va/org/apache/rocketmq/common/stats/StatsItem.java 50.00% <0.00%> (-6.67%) :arrow_down:
...etmq/namesrv/routeinfo/BatchUnRegisterService.java 94.73% <0.00%> (-5.27%) :arrow_down:
...org/apache/rocketmq/common/stats/StatsItemSet.java 50.74% <0.00%> (-4.48%) :arrow_down:
...controller/impl/DefaultBrokerHeartbeatManager.java 80.48% <0.00%> (-3.66%) :arrow_down:
... and 23 more

:mega: Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

codecov-commenter avatar May 12 '22 00:05 codecov-commenter

@hzh0425 The same logs mentioned in ISSUE #4265

FileRegionEncoder

image

I search the same msg mentioned in ISSUE #4265 image Header is 193 bytes and transferred 193 bytes, the 1st msg is 378 bytes and transferred 378 bytes, the 2nd msg is 450 bytes and transferred 450 bytes, total 1021 bytes transferred. image NettyDecoder get 1021 bytes and show the sub trace and pub trace.

zhanxuefeng avatar May 12 '22 01:05 zhanxuefeng

Good catch for this issue.

ByteBuffer's capacity is always greater than it's readable bytes

However, this bug involves QueryMessageProcessor, which is core function for RocketMQ. I wonder why this bug has never been found. Does it only occur when TLS enabled ?

Your code change in FileRegionEncoder is very underlying. Could you assess the impact on other upper-layer functions?

MatrixHB avatar May 12 '22 09:05 MatrixHB

@MatrixHB Yes, the bug only occur when tls enabled, FIleRegionEncoder will be add to NettyRemotingServer's ChannelPipline only when tlsModel is enforcing.

HandshakeHandler

image

ManyMessageTransfer, OneMessageTransfer and QueryMessageTransfer are extend from FileRegion in RocketMQ

OneMessageTransfer may not occur because it has a header and only one ByteBuffer.

But ManyMessageTransfer has the problem too, which involves PullMessageProcessor, I'll show the test later.

zhanxuefeng avatar May 13 '22 02:05 zhanxuefeng

Test ManyMessageTransfer

RocketMQ version: 4.9.3 tls.server.mode: enforcing transferMsgByHeap=false(default true) this is very important

The topic I created has 1 writeQueueNums and 1 readQueueNums, and I send 2 msgs to the topic.

I add some logs.

PullMessageProcessor

image

ManyMessageProcessor

image

A simple pull consumer

image

Console log

image

Consume log

image

Similar to ISSUE #4265 610 bytes and 2 msgs was found, but only transferred 413(216 header + 197 1st msg) bytes, because 768(256 capacity + 512 capacity) > 610, so the 2nd msg will not be transferred. The consume get 413 bytes but 610(4 length + 606 data) needed, no msg decoded.

zhanxuefeng avatar May 13 '22 03:05 zhanxuefeng

Prove ManyMessageProcessor

With my code changed in FileRegionEncoder image

Console log

image

Consumer log

image

One header and 2 msgs were transferred, and consumer received 610 bytes, decode to 2 msgs.

zhanxuefeng avatar May 13 '22 03:05 zhanxuefeng

Coverage Status

Coverage increased (+0.1%) to 48.848% when pulling 28a1bc5f856b2904bff2527af464834273d1a5ea on zhanxuefeng:develop into 8504abfafcf763be6fc593183edb1a2f8f061e51 on apache:develop.

coveralls avatar May 24 '22 10:05 coveralls