haproxy_exporter MAINT status should not return 0

When a frontend or backend is set to maintenance mode, it is down on purpose and should not report a failure state to Prometheus/Grafana by returning 0. Instead I suggest a return code of 2, so Grafana can give the status a special color in case a load balancer is equipped with such a setting.

We've tried changing the return value in the code but that didn't seem to do the trick.

Dec 12 '17 10:12 jzielke84

To avoid the complexity of interpreting different status codes, I would propose renaming the up metric to something like:

status{status="up", backend="my_backend_name"} 42
status{status="down", backend="my_backend_name"} 5
status{status="maint", backend="my_backend_name"} 18
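
To make the shape of these series concrete, here is a minimal sketch of how such per-status counts could be derived. It assumes a list of (backend, status) pairs has already been parsed from the HAProxy stats; the function names and the sample data are hypothetical and not taken from the exporter.

```python
from collections import Counter


def status_counts(servers):
    """Count servers per (backend, status) pair.

    `servers` is an iterable of (backend, status) tuples. This input
    shape is an assumption made for illustration only.
    """
    return Counter((backend, status.lower()) for backend, status in servers)


def render(counts, metric="status"):
    """Render the counts as exposition-style lines, sorted for stable output."""
    return [
        f'{metric}{{status="{status}", backend="{backend}"}} {n}'
        for (backend, status), n in sorted(counts.items())
    ]


# Hypothetical sample: three servers up, one in maintenance.
sample = [("my_backend_name", "UP")] * 3 + [("my_backend_name", "MAINT")]
lines = render(status_counts(sample))
```

Each status becomes its own series, so nothing has to be encoded in the sample value itself.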

WDYT @grobie?

Dec 13 '17 13:12 wdauchy

Strongly advise this change

Dec 19 '17 09:12 jzielke84

waiting for @grobie acknowledgment before starting any patch

Dec 19 '17 16:12 wdauchy

I wouldn't change the current up metric, it's being used in many dashboards and alert expressions. I'd be fine adding an additional metric, but I'm a bit worried about the label cardinality. I'm counting at least 9 different status values. Maybe only add that metric to backend and frontend lines?

Dec 19 '17 17:12 grobie

@grobie the MAINT status is only available in the backend. At the very least, MAINT shouldn't return 0, because planned maintenance is by definition neither an error nor an unwanted condition. Normally the HAProxy status page only shows the statuses UP, DOWN, MAINT and DRAIN (see: Link).

Also, DRAIN is a forced action that the administrators of the proxy want, and thus should not be considered a failure either. Generally speaking, UP and DOWN are conditions that can occur without any action (e.g. in an error case), while DRAIN and MAINT are forced actions that should be treated differently. Hence my suggestion to return 2 and expand the haproxy metrics.
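
As a rough illustration of that split, the sketch below sorts a status into "forced" or "unforced". It only covers the four states named above; the set names and the `classify` helper are hypothetical, and HAProxy reports more states than these.

```python
# Assumption: only the four states discussed in this thread are classified.
FORCED = {"MAINT", "DRAIN"}   # set on purpose by an administrator
UNFORCED = {"UP", "DOWN"}     # can occur without any operator action


def classify(status):
    """Return 'forced', 'unforced', or 'unknown' for a status string."""
    s = status.strip().upper()
    if s in FORCED:
        return "forced"
    if s in UNFORCED:
        return "unforced"
    return "unknown"
```

For example, `classify("MAINT")` gives `"forced"`, while any state outside the four falls through to `"unknown"`.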

Dec 20 '17 10:12 jzielke84

ok I can add what I proposed

Dec 20 '17 18:12 wdauchy

The up metric is a common pattern in Prometheus and is a boolean with the values 0 or 1. An instance or server that can't serve requests is not up; this metric does not say whether that is because of errors or maintenance. If that distinction is relevant to users, I'm happy to accept a PR that adds a new metric broken down by status type.

Dec 21 '17 15:12 grobie

I would very much like the status to be exposed, particularly per server. A few thoughts (partly echoing what has already been said):

"returncode of 2" would be meaningless once you start summing across metrics.
Not (numerically) counting MAINT as down is dangerous. Let's say I alert when I have less than 5 servers ready. I normally have 10 servers in the backend, I take 2 down for maintenance (MAINT), then 4 of the remaining servers fail. If MAINT isn't being counted as down I will not get an alert.
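
The arithmetic of that scenario, as a small sketch (the `ready_count` helper and the status list are made up for illustration):

```python
def ready_count(statuses, up_values=("UP",)):
    """Number of servers whose status counts as up."""
    return sum(1 for s in statuses if s in up_values)


# 10 servers: 2 put into MAINT, 4 of the remaining 8 failed, 4 still UP.
statuses = ["MAINT"] * 2 + ["DOWN"] * 4 + ["UP"] * 4
threshold = 5  # alert when fewer than 5 servers are ready

# MAINT counted as down: 4 ready, so the alert fires.
ready = ready_count(statuses)
alert_fires = ready < threshold

# MAINT (numerically) counted as up: 6 "ready", so the alert stays silent.
ready_with_maint = ready_count(statuses, up_values=("UP", "MAINT"))
alert_missed = not (ready_with_maint < threshold)
```

Only four servers can actually take traffic in both cases; the second count just hides it.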

I would support leaving the current up metric as is, and creating a new metric, e.g.:

haproxy_server_status{status="MAINT",backend="foo",instance="bar",job="haproxy",server="server1"} 1

Transitional statuses could perhaps be normalised, e.g. "UP 1/3" and "UP 2/3" become "UP".
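
A minimal sketch of that normalisation, assuming the transitional part is always a trailing "n/m" counter as in the examples; `normalise_status` is a hypothetical helper, not code from the exporter:

```python
import re

# Matches a trailing check counter such as " 1/3" in "UP 1/3".
_COUNTER = re.compile(r"\s+\d+/\d+$")


def normalise_status(raw):
    """Strip a trailing 'n/m' counter so 'UP 1/3' becomes 'UP'."""
    return _COUNTER.sub("", raw.strip())
```

Statuses without a counter, such as "MAINT", pass through unchanged, which keeps the number of distinct label values small.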

Dec 30 '17 21:12 Tom-Fawcett

So rather than changing the metric, adding a new attribute is the right way to go, so users can decide whether or not to work with the new information without breaking anything they've created so far.

Unfortunately I'm not familiar with Go syntax, and although I understand parts of it, I think it's better if someone who knows what they're doing opens the pull request. The suggestion of @Tom-Fawcett seems to point in the right direction.

Jan 02 '18 11:01 jzielke84

Any plans on implementing this one in the near future?

Feb 16 '18 12:02 jzielke84

@jzielke-nli I have opened #101. Depending on feedback it may require some additional work.

Feb 26 '18 19:02 Tom-Fawcett

@grobie Please be so kind as to commit this change if it's OK.

Apr 27 '18 14:04 jzielke84

With MAINT of a server being configured as a new metric, as in this thread's example, how would that work with Grafana where you have a single panel for the status of a server?

May 18 '18 21:05 Shadow00Caster

Any update on this? @grobie

Aug 29 '18 06:08 jnogol

Dead end here?

Aug 02 '19 08:08 jzielke84

Any update on this one?

Nov 19 '19 09:11 ekm1908

Hi, I am closing this issue because we are retiring this exporter. We will not be implementing new features anymore.

Please use the Prometheus support in HAProxy directly. It may already support this; if not, please open an issue against the HAProxy repository.

Feb 15 '23 10:02 matthiasr