client_python

labels method leads to memory leak

Open zhaozhiming37 opened this issue 2 years ago • 2 comments

I found that using metrics in my Celery task leads to a memory leak, and I reproduced the problem locally:

import tracemalloc
from prometheus_client import Counter

# counter_a has a 'name' label; counter_b has no labels.
counter_a = Counter('request_count_a', 'request count to a', ['name'])
counter_b = Counter('request_count_b', 'request count to b')

tracemalloc.start()

# Increment via labels(): the case that appears to leak.
for _ in range(10):
    counter_a.labels('a').inc()
    curr_mem, peak_mem = tracemalloc.get_traced_memory()
    print(f"curr mem: {curr_mem / 10 ** 3} Kb, peak mem: {peak_mem / 10 ** 3} Kb")

print("*****************************************")

# Increment without labels, for comparison.
for _ in range(10):
    counter_b.inc()
    curr_mem, peak_mem = tracemalloc.get_traced_memory()
    print(f"curr mem: {curr_mem / 10 ** 3} Kb, peak mem: {peak_mem / 10 ** 3} Kb")

It seems to be caused by the labels method. PS: I am using prometheus-client==0.17.1.

zhaozhiming37 avatar Jul 12 '23 07:07 zhaozhiming37

If you run the garbage collector in your example, most of that allocated memory is reclaimed:

import gc
import tracemalloc
from prometheus_client import Counter

counter_a = Counter('request_count_a', 'request count to a', ['name'])

tracemalloc.start()

for _ in range(10):
    counter_a.labels('a').inc()
    curr_mem, peak_mem = tracemalloc.get_traced_memory()
    print(f"curr mem: {curr_mem / 10 ** 3} Kb, peak mem: {peak_mem / 10 ** 3} Kb")

print("Run the garbage collector")
gc.collect()

curr_mem, peak_mem = tracemalloc.get_traced_memory()
print(f"curr mem: {curr_mem / 10 ** 3} Kb, peak mem: {peak_mem / 10 ** 3} Kb")

gives the following:

curr mem: 3.56 Kb, peak mem: 3.704 Kb
curr mem: 3.664 Kb, peak mem: 3.952 Kb
curr mem: 3.712 Kb, peak mem: 4.0 Kb
curr mem: 3.76 Kb, peak mem: 4.048 Kb
curr mem: 3.808 Kb, peak mem: 4.096 Kb
curr mem: 3.856 Kb, peak mem: 4.144 Kb
curr mem: 3.904 Kb, peak mem: 4.192 Kb
curr mem: 3.952 Kb, peak mem: 4.24 Kb
curr mem: 4.0 Kb, peak mem: 4.288 Kb
curr mem: 4.048 Kb, peak mem: 4.336 Kb
Run the garbage collector
curr mem: 3.584 Kb, peak mem: 4.381 Kb

However, perhaps your real code is different. Could it be that you're creating labels that have an unbounded set of possible values? The Prometheus client has to hold on to every label value it has ever seen, so if you keep passing in different label values (e.g. user IDs), memory usage will keep growing with each new label value. See the Prometheus label documentation:

image
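
To make the difference concrete, here is a minimal sketch rather than a diagnosis of the original code. It assumes a private CollectorRegistry and a hypothetical metric name, request_count_demo, neither of which comes from the report above. It counts the distinct label sets the counter reports through collect(), first when one label value is reused and then when every call passes a new value:

```python
from prometheus_client import CollectorRegistry, Counter

# A private registry (an assumption for this sketch) avoids clashing with
# metrics already registered in the default registry.
registry = CollectorRegistry()
counter = Counter('request_count_demo', 'demo counter', ['name'], registry=registry)


def tracked_label_sets(metric):
    """Return the distinct label sets the metric currently reports."""
    return {
        tuple(sorted(sample.labels.items()))
        for family in metric.collect()
        for sample in family.samples
    }


# Bounded: the same label value is reused on every call.
for _ in range(1000):
    counter.labels('a').inc()
bounded = len(tracked_label_sets(counter))

# Unbounded: every call passes a new value (standing in for a user ID),
# so the counter has to keep one more label set each time.
for user_id in range(1000):
    counter.labels(f'user-{user_id}').inc()
unbounded = len(tracked_label_sets(counter))

print(f"label sets after reusing one value: {bounded}")
print(f"label sets after 1000 more distinct values: {unbounded}")
```

If the second number keeps climbing in a long-running worker, the growth is the client retaining label values rather than a leak in labels() itself.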

angusholder avatar Aug 22 '23 16:08 angusholder

I've also found memory leaks: on a few computers, memory usage appears to grow by 3 gigabytes immediately after the metrics are pulled by Prometheus.

zhanghaofei avatar Nov 22 '23 08:11 zhanghaofei