
Aggregate expired pid db files, control the number of files and improve scrape effectiveness

cai-personal opened this issue on Jul 26, 2019 · 1 comment

Related issues (history): https://github.com/prometheus/client_python/pull/441 https://github.com/prometheus/client_python/pull/430

Can we aggregate all the db files that were written a while ago by non-current pids into a single total db file, to control the number of pid files?

I have implemented this idea in Golang; here are some details:

Project deployment info:

- gunicorn + Django
- 128 workers
- gunicorn max_requests: 10000 (so a new pid file is created almost every minute)
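
For context, the standard client_python multiprocess scrape path re-reads every db file in the multiproc directory on each request, which is why scrape time scales with the number of pid files. A minimal sketch of such an endpoint, assuming a Django view and the usual prometheus_multiproc_dir environment variable:

```python
# Minimal sketch of the standard multiprocess scrape path (assumes the
# prometheus_multiproc_dir env var is set for all gunicorn workers).
from django.http import HttpResponse
from prometheus_client import CollectorRegistry, generate_latest
from prometheus_client import multiprocess

def metrics_view(request):
    registry = CollectorRegistry()
    # Reads and merges every *.db file under prometheus_multiproc_dir
    # on every scrape -- linear in the number of pid files.
    multiprocess.MultiProcessCollector(registry)
    return HttpResponse(generate_latest(registry),
                        content_type='text/plain; version=0.0.4')
```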

  1. I can't stop the number of pid files from growing; it can reach 6,000 in four days.
  2. Deleting expired pid files regularly from the code makes the graphs in Grafana drop, since counter totals lose the dead pids' contributions (the documented cleanup hook only covers live gauges; see the sketch after this list).
  3. The time to serve the metrics endpoint keeps getting longer as the program runs.
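
For reference, the cleanup hook that client_python documents for gunicorn only removes the live-mode gauge files of a dead worker; counter and histogram files deliberately stay behind, which is exactly why deleting them by hand makes counters drop:

```python
# gunicorn.conf.py
from prometheus_client import multiprocess

def child_exit(server, worker):
    # Removes only the gauge_livesum/gauge_liveall files for this pid;
    # counter/histogram files must survive so totals stay monotonic.
    multiprocess.mark_process_dead(worker.pid)
```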

Improving scrape efficiency: I rewrote client_python's metric-aggregation logic in Golang (metrics are still generated by Python). After the rewrite, each scrape takes less than 1 second.
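
The Python-side entry point such a rewrite replaces is, roughly, MultiProcessCollector.merge, which parses an explicit list of db files and folds the samples together (signature as in the client_python releases around this issue; it may differ in other versions):

```python
import glob
import os

from prometheus_client.multiprocess import MultiProcessCollector

path = os.environ['prometheus_multiproc_dir']
files = glob.glob(os.path.join(path, '*.db'))
# One full pass over every file per scrape; with thousands of pid
# files this pass is what dominates the response time.
metrics = MultiProcessCollector.merge(files, accumulate=True)
```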

Solving the growing pid files: aggregate all the db files that are older than some period and belong to non-current pids into a single total db file, then delete those files. When calculating metrics: history total db + current pid dbs = current metrics. (I do this every hour; see the sketch below.)
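
A minimal Python sketch of that hourly job under those assumptions (the Go version follows the same shape). write_total_db is hypothetical: client_python exposes no public writer for db files, so a real implementation would have to reuse its internal MmapedDict format, and gauges/histograms would each need their own total file:

```python
import glob
import os
import re
import time

from prometheus_client.multiprocess import MultiProcessCollector

def pid_alive(pid):
    """Best-effort liveness probe: signal 0 checks without killing."""
    try:
        os.kill(pid, 0)
        return True
    except OSError:
        return False

def aggregate_counters(path, max_age=3600):
    """Fold expired per-pid counter files into one rolling total file."""
    cutoff = time.time() - max_age
    stale = []
    for f in glob.glob(os.path.join(path, 'counter_*.db')):
        m = re.search(r'counter_(\d+)\.db$', f)
        if m and not pid_alive(int(m.group(1))) and os.path.getmtime(f) < cutoff:
            stale.append(f)
    if not stale:
        return
    total = os.path.join(path, 'counter_total.db')  # hypothetical rolling total
    if os.path.exists(total):
        stale.append(total)  # history total + expired pids -> new total
    merged = MultiProcessCollector.merge(stale, accumulate=True)
    write_total_db(merged, total)  # hypothetical writer, see lead-in
    for f in stale:
        if f != total:
            os.remove(f)
```

Run it from a single process per host (a cron-style job) so two aggregators never race over the same files.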

Now the number of pid files in my project stays below 200. If this change could be made, it would be a big improvement, much like Prometheus itself compacting historical data.

cai-personal · Jul 26 '19 05:07

I just raised this idea to see whether it is necessary. And sorry, I rewrote the metric aggregation in Golang only because I was trying to improve the scrape time.

cai-personal · Jul 26 '19 05:07