fluent-logger-ruby

Logging messages can block

Open txase opened this issue 10 years ago • 10 comments

This gem can block execution, effectively allowing a website to be DoS'd, if too many messages are sent at the same time. The TCPSocket is set to be synchronous, so the only buffering of messages is done in the kernel. But writes to the socket use write instead of write_nonblock, so if the kernel send buffer fills up, the write blocks and the executing Ruby thread has to wait until data is drained from the buffer.

Compounding the issue, there is only one TCPSocket connection for all threads of a process. If a web app uses Puma or some other threaded server model, then as soon as the kernel buffer fills up, all threads have to wait until buffer room frees up.

To partially solve this issue, I think the write needs to be turned into a write_nonblock. The gem could then detect overflows and discard messages when that happens. It could also buffer some messages if they overflow, but that would just be a nicety.
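Roughly, the write path could become something like this, assuming @con is the gem's shared TCPSocket (the method name and return convention here are just for illustration, not the gem's actual API):

def send_data(data)
  written = 0
  while written < data.bytesize
    begin
      written += @con.write_nonblock(data.byteslice(written..-1))
    rescue IO::WaitWritable
      # The kernel send buffer is full. Give up instead of blocking the
      # thread, and report the overflow so the caller can discard or
      # re-buffer the message.
      return false
    end
  end
  true
end

One wrinkle: bailing out after a partial write leaves a torn record on the wire, so a real implementation would also need to remember the unsent tail or reconnect before sending the next message.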

To solve the issue further, there probably should be a separate connection per thread. A process with many threads writing messages would then get proportionally more kernel buffer space than a single shared connection provides, so socket buffering would scale with thread count. But this is also just a nicety on top.

txase avatar Jul 15 '15 16:07 txase

To solve the issue further, there probably should be a separate connection per thread.

Do you mean a socket should be created on each send, or a socket stored on TLS? Could you show me an implementation example?

repeatedly avatar Jul 21 '15 00:07 repeatedly

@repeatedly: stored socket on TLS (thread-local-storage, at first I thought you meant TLSv1.0 like SSL :).

Something like this:

require "socket"

def connect!
  con = TCPSocket.new(@host, @port)
  # More setup here
  con
end

def send_data(data)
  # Lazily create one connection per thread, kept in thread-local storage.
  con = Thread.current[:fluent_logger_connection] ||= connect!
  con.write data
end

But again, this is extra niceness. The real issue is the blocking writes. That should be addressed first.

txase avatar Jul 31 '15 22:07 txase

any update? This could be a major issue: if you're sending data from a web server and there is some problem with td-agent, this logger totally blocks the whole web server, which can easily crash (and that is the issue we are already having now).

breath103 avatar Dec 10 '15 02:12 breath103

I will investigate this issue next week.

okkez avatar Dec 10 '15 05:12 okkez

I want to reproduce this issue, but I cannot reproduce it so far. I wrote a very small script for this.

require "fluent-logger"

Fluent::Logger::FluentLogger.open(nil, :host=>'localhost', :port=>24224)

a = []

start = Time.now

100.times do |i|
  a << Thread.start(i) do
    puts "start#{i}"
    1000.times do
      Fluent::Logger.post("myapp.access", ENV.to_hash)
    end
    puts "end#{i}"
  end
end

a.each(&:join)

p :done
p Time.now - start
And the fluentd config:

<source>
  @type forward
  port 24224
</source>

<match *.**>
  @type stdout
</match>
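Maybe the stdout plugin drains the socket too fast for the kernel buffer to fill. To force the blocking write, a dummy receiver that accepts but never reads should work (run it instead of fluentd, then run the script above):

require "socket"

# Accept forward connections on 24224 but never read from them, so
# TCP flow control fills the client's kernel send buffer and its
# write() eventually blocks.
server = TCPServer.new("localhost", 24224)
sockets = []
loop do
  sockets << server.accept # hold the connection open, read nothing
end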

okkez avatar Dec 15 '15 09:12 okkez

@txase @breath103 Please try #39.

okkez avatar Dec 24 '15 04:12 okkez

@okkez thanks! sorry for late reply. i'll let you know after checking out this

breath103 avatar Mar 21 '16 05:03 breath103

Try setting up a logger to send to "localhohoho.com", and wait, wait, wait... both at logger creation time and every time the connection is retried.
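At minimum the connect should be bounded so an unreachable host fails fast. A sketch (the helper name is made up, not the gem's API):

require "socket"

# Hypothetical helper: bound the time spent connecting instead of
# letting an unreachable host hang the calling thread.
def connect_with_timeout(host, port, timeout = 5)
  Socket.tcp(host, port, connect_timeout: timeout)
rescue Errno::ETIMEDOUT, SocketError => e
  warn "fluent-logger connect failed: #{e.message}"
  nil
end

Name resolution can still block before connect_timeout applies; newer Rubies also take a resolv_timeout keyword on Socket.tcp for that.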

aotenko avatar May 10 '16 12:05 aotenko

any updates on this? It's been a year since the last update.

bogdanRada avatar Jul 04 '17 13:07 bogdanRada

Here is the patch: https://github.com/fluent/fluent-logger-ruby/pull/39. We need feedback from several environments...

repeatedly avatar Jul 04 '17 15:07 repeatedly