Error on ingesting samples that are too old or are too far into the future
Hi!
I've run into a strange situation with timestamps.
Briefly:
- If the timestamp is taken from datetime.datetime.now(), everything is OK.
- If the timestamp is parsed from a date string that looks the same, I get a warning in the Prometheus server log about a bad datetime, and the metric is not shown.
Here is a code example with comments showing which metrics work and which don't:
import datetime
import sys
import time

import pytz
from prometheus_client import (
    start_http_server,
)
from prometheus_client.core import (
    GaugeMetricFamily,
    REGISTRY,
)


class BugDemoMetricsCollector:
    def collect(self):
        dt_format = '%Y-%m-%d_%H-%M-%S.%f %z'
        dt_now = datetime.datetime.now(tz=pytz.timezone('UTC'))
        print(dt_now)

        # Works
        gobj = GaugeMetricFamily('FooMetricGood', '')
        gobj.add_metric([], 123, timestamp=dt_now.timestamp())
        yield gobj

        # Works too
        dt_now_str = dt_now.strftime(dt_format)
        dt_parsed = datetime.datetime.strptime(dt_now_str, dt_format)
        gobj = GaugeMetricFamily('FooMetricGoodToo', '')
        gobj.add_metric([], 456, timestamp=dt_parsed.timestamp())
        yield gobj

        # Does not work, but same date
        dt_custom_str = '2021-11-11_18-12-59.000000 +0000'
        dt_parsed_from_custom = datetime.datetime.strptime(dt_custom_str, dt_format)
        gobj = GaugeMetricFamily('FooMetricNotWorking', '')
        gobj.add_metric([], 789987, timestamp=dt_parsed_from_custom.timestamp())
        yield gobj


def main():
    start_http_server(8080)
    REGISTRY.register(BugDemoMetricsCollector())
    while True:
        time.sleep(1)


if __name__ == '__main__':
    sys.exit(main())
Message from the Prometheus log about the problematic metric:
prometheus-prometheus-1 | ts=2021-11-11T13:41:01.895Z caller=scrape.go:1563 level=warn component="scrape manager" scrape_pool=services target=http://192.168.64.1:8080/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=1
But the date is correct; here is the Python output for the date string and format used:
>>> datetime.datetime.strptime('2021-11-11_18-12-59.000000 +0000', '%Y-%m-%d_%H-%M-%S.%f %z')
datetime.datetime(2021, 11, 11, 18, 12, 59, tzinfo=datetime.timezone.utc)
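One thing that may be worth checking is how far the hard-coded timestamp sits from the server's own clock. The sketch below compares the parsed sample time with the `ts=` value from the log line quoted above; both values are copied from this post. The `is_plausible` helper and its 10-minute tolerance are hypothetical, chosen only for illustration, and are not a documented Prometheus setting.

```python
from datetime import datetime, timedelta, timezone

# Sample timestamp and format, copied from the collector above.
sample_format = "%Y-%m-%d_%H-%M-%S.%f %z"
sample_time = datetime.strptime("2021-11-11_18-12-59.000000 +0000", sample_format)

# Server time, copied from the `ts=` field of the Prometheus log line.
server_time = datetime(2021, 11, 11, 13, 41, 1, 895000, tzinfo=timezone.utc)

# A positive offset means the sample lies in the future from the server's view.
offset = sample_time - server_time
print(offset)  # 4:31:57.105000

# Hypothetical tolerance for illustration only; not a documented Prometheus value.
ASSUMED_TOLERANCE = timedelta(minutes=10)


def is_plausible(sample: datetime, now: datetime,
                 tolerance: timedelta = ASSUMED_TOLERANCE) -> bool:
    """Return True if `sample` is within `tolerance` of `now` in either direction."""
    return abs(sample - now) <= tolerance


print(is_plausible(sample_time, server_time))  # False
```

With these two values the sample is about four and a half hours ahead of the time the server logged, so the string parses correctly but does not describe the same moment as datetime.datetime.now() did when the scrape happened.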
I've spent more than a day searching, reading the docs, and trying to resolve it, with no results.
Any help would be much appreciated.
I ran into this same error message when using Prometheus. This is an issue with Prometheus itself, not an error in client_python.
It turns out that it's not comparing the time of the incoming entry to the current system time. It is comparing it to the latest time in the database. The log message is missing three key pieces of data: what the time of the entry is, whether the entry is too old or too far in the future, and the boundary date/time that determines whether an entry is acceptable.
As best I can tell, the responsible code is here. I say this because commenting out the similar code in target.go and deploying my modified build did not change the error message, whereas commenting out the checks in head_append.go caused the entries to be rejected for being out of order instead of too old or too far in the future.
Thus, if the database contains an entry with a future timestamp, all new data will be rejected because it's too old. It doesn't look like promtool tsdb ... has any way to delete future entries. I solved it on my system by stopping Prometheus, removing the data directory, and starting Prometheus again.
This can be reproduced by setting the system clock to the future, adding some data to Prometheus, and then correcting the system clock.
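To make the failure mode above concrete, here is a toy model of it. This is not Prometheus code: `ToyStore`, its `append` method, and the one-hour `window_s` are all hypothetical, and the real checks are more involved. It only illustrates how a store that validates new entries against its newest stored timestamp, instead of against the wall clock, ends up rejecting correct data after a single future-dated entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyStore:
    """Hypothetical stand-in for a time-series store; NOT Prometheus code."""

    window_s: float = 3600.0  # assumed tolerance, chosen for illustration only
    newest_ts: Optional[float] = None

    def append(self, ts: float) -> str:
        # The check is relative to the newest stored entry, not to the wall clock.
        if self.newest_ts is not None and ts < self.newest_ts - self.window_s:
            return "rejected: too old relative to newest stored entry"
        self.newest_ts = ts if self.newest_ts is None else max(self.newest_ts, ts)
        return "accepted"


store = ToyStore()
true_now = 1_000_000.0  # arbitrary "real" time in seconds

# 1. The system clock is wrongly set five hours ahead while data is written.
first = store.append(true_now + 5 * 3600)

# 2. The clock is corrected; correctly timestamped data now looks "too old".
second = store.append(true_now)

print(first)
print(second)
```

In this model the only ways out are to wait until real time catches up with the future entry or to discard the stored data, which matches the workaround of removing the data directory.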
Apologies if replying to an old post is a faux pas here, but there's only one other place this issue seems to have been discussed and that thread was closed without resolving the issue. So my hope is that this comment will help people in the future when they run into this same error message.