python-logstash-async

Memory leak when logstash is not available

Open · xunto opened this issue 4 years ago · 8 comments

Hello. My team recently faced a problem with python-logstash-async: when the Logstash server went into maintenance, the memory usage of one of our services started to grow rapidly.

It turns out this is caused by python-logstash-async's feature of storing failed events in an SQLite database. But even though everything is supposed to be written to the filesystem, the memory usage is not even comparable to the database size (memory usage is much higher).

To demonstrate this, I created a small Django project: https://github.com/xunto-examples/async_logstash_memory_leak

I run start_logstash.sh or start_no_logstash.sh to start the service, and tests.sh to start sending requests.
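For context, wiring python-logstash-async into a Django project typically looks roughly like the following. This is a minimal sketch, not copied from the linked repository; the host, port and database path are placeholders.

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        # AsynchronousLogstashHandler queues events in memory and persists
        # undeliverable ones in the SQLite database given by database_path.
        'logstash': {
            'level': 'INFO',
            'class': 'logstash_async.handler.AsynchronousLogstashHandler',
            'host': 'localhost',
            'port': 5959,
            'database_path': 'logstash_events.db',
        },
    },
    'root': {
        'handlers': ['logstash'],
        'level': 'INFO',
    },
}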

xunto · Mar 29 '21 18:03

Test results when Logstash is not available: [screenshot: logstash]

For comparison, the database size was 71.7 MB at the end of testing.

xunto · Mar 29 '21 18:03

Test results for the same project but without python-logstash-async enabled: [screenshot: no_logstash]

xunto · Mar 29 '21 18:03

I can also confirm that the leak doesn't happen anymore if the event_ttl parameter is set.
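For reference, event_ttl is a constructor argument of AsynchronousLogstashHandler that expires stored events after the given number of seconds. A minimal sketch; the host, port, database path and TTL value below are placeholders.

import logging
from logstash_async.handler import AsynchronousLogstashHandler

# Expire events older than 60 seconds instead of keeping them indefinitely;
# host/port/database_path are placeholders.
handler = AsynchronousLogstashHandler(
    'localhost', 5959,
    database_path='logstash_events.db',
    event_ttl=60)

logging.getLogger('app').addHandler(handler)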

xunto · Mar 29 '21 18:03

Thanks for the detailed report! I'm currently quite limited on time, so it might take some time before I can look into it.

eht16 · Mar 29 '21 18:03

It's OK, we are going to stick with setting event_ttl for now anyway.

xunto · Mar 29 '21 18:03

I had a look at it but unfortunately I'm not able to reproduce it, not even with your great test setup. [screenshot: pla_py37_memory_issue63] As you can see, I had the test setup running for hours, continuously firing requests at it, but the memory consumption stayed at the same level.

What versions of python-logstash-async, Django and Python are you using? I used the following versions:

(venv) enrico@localhost: ~% python --version
Python 3.7.3
(venv) enrico@localhost: ~% pip list
Package               Version
--------------------- ---------
Django                3.2
memory-profiler       0.58.0
python-logstash-async 2.2.0
...

eht16 · Apr 30 '21 08:04

Hi, sorry for the slow response. We used:

Django==3.1.7
python-logstash-async==2.2.0

and

Python 3.9.3

I also found out that the REAL reason for the RAM growth was that the queue referenced here: https://github.com/eht16/python-logstash-async/blob/018d238e3f76b97def3b279b0226d8e326bd3dee/logstash_async/worker.py#L49 was growing too fast. Not sure we could do anything about it.

It also turned out that event_ttl didn't really help, so we moved to using stdout and fluentd instead of your library. :(

xunto · May 11 '21 09:05

> I also found out that the REAL reason for the RAM growth was that the queue referenced here: https://github.com/eht16/python-logstash-async/blob/018d238e3f76b97def3b279b0226d8e326bd3dee/logstash_async/worker.py#L49 was growing too fast. Not sure we could do anything about it.

Good catch! I was able to reproduce the behaviour by slowing down writing events to the SQLite database. The events then pile up in the queue and so consume memory.

Is it possible that the storage you used could be too slow to keep up with the high event rate? If so, it would explain the high memory usage.
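To illustrate the mechanism outside the library: with an unbounded queue.Queue, a producer that outpaces its consumer makes the queue, and therefore memory usage, grow without limit. A self-contained sketch, not library code.

import queue
import threading
import time

q = queue.Queue()  # unbounded, analogous to an internal event queue

def producer():
    # Simulates a high log-event rate (roughly 1000 events/s).
    while True:
        q.put({'event': time.time()})
        time.sleep(0.001)

def consumer():
    # Simulates slow persistence, e.g. slow SQLite writes (roughly 100 events/s).
    while True:
        q.get()
        time.sleep(0.01)

threading.Thread(target=producer, daemon=True).start()
threading.Thread(target=consumer, daemon=True).start()

for _ in range(5):
    time.sleep(1)
    print('queued events:', q.qsize())  # grows steadily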

> It also turned out that event_ttl didn't really help, so we moved to using stdout and fluentd instead of your library. :(

Sad to hear. Sorry for not being able to help you in time.

If you still want to try it, you could set the database_path setting to None to use an in-memory database instead of a file-based storage. If the above assumption is true, this could help.
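In code, that amounts to passing database_path=None when creating the handler, which keeps unsent events in an in-memory cache instead of the SQLite file. A minimal sketch; host and port are placeholders.

from logstash_async.handler import AsynchronousLogstashHandler

# database_path=None switches the failed-event storage from a SQLite file
# to an in-memory cache; host/port are placeholders.
handler = AsynchronousLogstashHandler('localhost', 5959, database_path=None)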

eht16 · May 25 '21 21:05

Closing due to missing response. Feel free to re-open if the problem persists.

eht16 · Dec 20 '23 12:12