fluentd icon indicating copy to clipboard operation
fluentd copied to clipboard

Fix line skipping issue in receive_lines method

Open yugeeklab opened this issue 1 year ago • 7 comments

Which issue(s) this PR fixes: Fixes #4494

What this PR does / why we need it: Before this patch, long lines could cause breakdowns in fluentd, potentially posing a vulnerability. With this patch, max_line_size will be integrated into the FIFO, enabling the system to skip lines exceeding the maximum size before executing receive_lines.

Docs Changes:

Release Note:

yugeeklab avatar May 10 '24 11:05 yugeeklab

@yugeeklab Thanks for this fix! CI is currently unstable because of #4487. We will fix it. Sorry for the trouble.

I see the intent of this fix as follows.

  • In the current implementation, large lines that would eventually be discarded in receive_lines are temporarily held in IOHandler's @lines.
  • This is a waste of memory.
  • This PR resolves the waste.

Surely, such a fix would allow us to limit memory consumption by the max_line_size setting to some extent!

This PR would be effective to some extent, however I believe the problem of memory consumption will remain. It would be possible that FIFO's @buffer becomes unlimitedly large if the @eol does not appear in the data.

Are these my understandings correct?

daipom avatar May 13 '24 03:05 daipom

Hi, @daipom

I've just published an issue #4491 for more information.

This PR would be effective to some extent, however I believe the problem of memory consumption will remain. It would be possible that FIFO's @buffer becomes unlimitedly large if the @eol does not appear in the data.

When max_line_size isn't set, FIFO's @buffer can grow indefinitely. Or if max_line_size has large value, FIFO's buffer will be limited, but there's still a possibility of fluentd experiencing slowdowns.

Summary:

as-is: max_line_size helps you avoid buffer overflow configuring via buffer section. to-be: max_line_size helps prevent buffer overflow by configuring the buffer section and also ensures FIFO's buffer size remains limited.

If you have any suggestions, such as the fifo_buffer_size parameter or any other ideas, please feel free to discuss them with me.

Thank you for your review!

yugeeklab avatar May 13 '24 09:05 yugeeklab

@yugeeklab

I've just published an issue #4491 for more information.

Thanks so much!

as-is: max_line_size helps you avoid buffer overflow configuring via buffer section. to-be: max_line_size helps prevent buffer overflow by configuring the buffer section and also ensures FIFO's buffer size remains limited.

Now I understand! The following understanding was not correct.

This PR would be effective to some extent, however I believe the problem of memory consumption will remain. It would be possible that FIFO's @buffer becomes unlimitedly large if the @eol does not appear in the data.

This fix clears FIFO's @buffer when read_lines. So, this fix ensures FIFO's buffer size remains limited.

If you have any suggestions, such as the fifo_buffer_size parameter or any other ideas, please feel free to discuss them with me.

Thanks! Basically, it seems to be a very good idea to limit the FIFO's buffer.

daipom avatar May 13 '24 10:05 daipom

Basically, it seems to be a very good idea to limit the FIFO's buffer.

Thank you for your comment!! @daipom

Please let me know if there's any feedback on my code or idea. I'll review and accept your feedback as soon as possible.

Thank you.

yugeeklab avatar May 13 '24 11:05 yugeeklab

About CI failures, although #4493 has been resolved, we still need to resolve #4487.

daipom avatar May 17 '24 04:05 daipom

@yugeeklab The CI issue has been resolved. Sorry for the trouble. Could you please rebase this branch on the latest master?

daipom avatar May 17 '24 08:05 daipom

Hi, @daipom

Rebase is done!!

Thank you for your review!!

yugeeklab avatar May 19 '24 08:05 yugeeklab

Sorry for waiting. I will review this soon.

daipom avatar May 27 '24 08:05 daipom

@yugeeklab Thanks for updating. I'm checking the CI failures.

daipom avatar Jun 04 '24 05:06 daipom

Current CI failures have nothing to do with this PR. Sorry for the trouble again.

daipom avatar Jun 04 '24 05:06 daipom

The CI issue has been resolved. So, could you please rebase this to the latest master? Sorry for the trouble again.

daipom avatar Jun 07 '24 08:06 daipom

Hi, @daipom

Rebase is done. I also added a commit to resolve the following issue. Please review it if you don't mind.

Thank you.

yugeeklab avatar Jun 09 '24 08:06 yugeeklab

We should not update pos like this commit (bf73efe). ... If updating pos like this commit, some data may be lost when BufferOverflowError occurs or when Fluentd is forced to stop.

Oh, sorry, it was wrong. The @lines.empty? condition will prevent it (probably...).

https://github.com/fluent/fluentd/blob/bf73efeea722376f30bccac0f10095fb6601939c/lib/fluent/plugin/in_tail.rb#L1230-L1232

So, we need to consider only the following points.

For this feature, we need to take care of @was_long_line in particular. We need to make sure that the restart of Fluentd does not cause a subsequent incomplete log to be sent.

daipom avatar Jun 12 '24 05:06 daipom

So, we need to consider only the following points.

For this feature, we need to take care of @was_long_line in particular. We need to make sure that the restart of Fluentd does not cause a subsequent incomplete log to be sent.

I think we should change FIFO#bytesize.

https://github.com/fluent/fluentd/blob/bf73efeea722376f30bccac0f10095fb6601939c/lib/fluent/plugin/in_tail.rb#L1087-L1089

It is used in the following pos logic:

https://github.com/fluent/fluentd/blob/bf73efeea722376f30bccac0f10095fb6601939c/lib/fluent/plugin/in_tail.rb#L1230-L1241

https://github.com/fluent/fluentd/blob/bf73efeea722376f30bccac0f10095fb6601939c/lib/fluent/plugin/in_tail.rb#L1246-L1248

The bytesize should be the uncommitted byte size that FIFO is still handling. It does not equal the size of the buffer of FIFO anymore because FIFO can clear the buffer to skip the long line.

In the following case (max_line_size 12), very long line not finished yet will be cleared from the buffer soon.

short line\n # To be committed to the pos
very long line not finished yet # Not to be committed to the pos until the `@eol` occurs.

However, that data size should be considered for pos handling. Since the line is not finished yet, the pos update should be done up to the end of short line\n. (When Fluentd restarts, Fluentd should continue the process from the end of short line\n.) Also, the reopening pos should be from the end of very long line not finished yet (especially for the case open_on_every_update).

For this, FIFO#bytesize should be the uncommitted byte size that FIFO is still handling, not the real buffer size of FIFO.

daipom avatar Jun 12 '24 07:06 daipom

@yugeeklab I have fixed the remaining points and pushed them to my tmp branch (the following 3 commits). Could you please check them? If there is no problem, I will push these commits to this PR. If you have any concerns or ideas, please let me know.

The main point is to resolve the issue that is tested on the 'discards a subsequent data in a long line even if restarting occurs between' test in fix to commit the correct pos to continue processing correctly. This test would fail in the current branch.

daipom avatar Jun 12 '24 09:06 daipom

Hi @daipom

So, Here is the summary of your code

AS-IS: When a restart occurs while reading a long line, because it doesn't record the position properly, a long line can be recognized as a short line.

T0-BE: When a restart occurs while reading a long line and the position is recorded properly, the long line is recognized as a long line.

It looks good to me!!

Thank you!!!

yugeeklab avatar Jun 14 '24 09:06 yugeeklab

@yugeeklab Yes! Thanks for checking it! I will push them.

daipom avatar Jun 14 '24 10:06 daipom

@yugeeklab Sorry, I failed to push. Something wrong happens... I'm fixing it. Please wait...

daipom avatar Jun 14 '24 10:06 daipom

@yugeeklab Sorry for the trouble. Could you please run the following command to recover your origin/master branch? (I wrongly pushed the current master to your origin/master. Sorry for the trouble.)

git push -f

daipom avatar Jun 14 '24 10:06 daipom

Hi @daipom

I recover my origin/master!!

Should I also reopen Pull Request?

Thank you.

yugeeklab avatar Jun 17 '24 00:06 yugeeklab

Should I also reopen Pull Request?

Of course! Thanks for reopening it! Sorry for the trouble.

daipom avatar Jun 17 '24 00:06 daipom