Monitoring endpoints unavailable
**Slack us first!** I have raised the issue via Slack here, but we were unable to fix it.
**Bug description** I'd like to improve our monitoring for DefectDojo, which is running inside Kubernetes, deployed via the Helm Terraform provider. Through kube-state-metrics we already monitor for crash loops and the like, but I'd like to get more in-depth information from the setup.
- **No monitoring port on the pod**, even though `monitoring.enabled` as well as `monitoring.prometheus.enabled` are set to `true`. Yet, as you can see here, the Helm chart did not open a port on the pod, so we can use neither a PodMonitor nor a ServiceMonitor of the prometheus-operator to gather the metrics (see the PodMonitor sketch further below). When querying the nginx container internally, you do receive nginx metrics.
- **Django metrics unavailable** The DefectDojo metrics appear to be unavailable as well. In the nginx config there is a path for `/django_metrics`, which should then be served on port 8080. Yet, it does not deliver anything under that route:
```
# curl defectdojo-django/django_metrics --user "monitoring:xxx" -IL
HTTP/1.1 302 Found
Server: nginx/1.25.0
Date: Mon, 12 Jun 2023 13:41:03 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: keep-alive
Location: /login?next=/django_metrics
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
Referrer-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin
Vary: Cookie

HTTP/1.1 200 OK
Server: nginx/1.25.0
Date: Mon, 12 Jun 2023 13:41:04 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 15976
Connection: keep-alive
Expires: Mon, 12 Jun 2023 13:41:04 GMT
Cache-Control: max-age=0, no-cache, no-store, must-revalidate, private
Vary: Cookie
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
Referrer-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin
Set-Cookie: csrftoken=xxx; expires=Mon, 10 Jun 2024 13:41:04 GMT; HttpOnly; Max-Age=31449600; Path=/; SameSite=Lax; Secure
```
(This was run within the cluster against the DefectDojo service.)
Because of `X-Frame-Options: DENY`, I'd guess that the request is denied.
Username and password are correct.
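For reference, this is roughly the PodMonitor we would like to point at the pod once a metrics port exists. This is only a sketch: the label selector and the port name `metrics` are assumptions, since the chart currently renders no such port at all:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: defectdojo-django
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: defectdojo  # assumption: match your release's pod labels
  podMetricsEndpoints:
    - port: metrics   # assumption: the named port the chart would need to expose
      path: /metrics
```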
**Steps to reproduce**
- In the Helm chart, we've activated both `monitoring.enabled` and `monitoring.prometheus.enabled`
- `DD_DJANGO_METRICS_ENABLED=true` is set in the environment of the container (a sketch of both steps follows below)
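A minimal sketch of those two steps, assuming all other chart options are left at their defaults and that the variable ends up in the uwsgi container's environment:

```yaml
# Step 1: Helm values
monitoring:
  enabled: true
  prometheus:
    enabled: true
---
# Step 2: the effective container environment; how the variable is injected
# depends on your deployment, this fragment is only illustrative
env:
  - name: DD_DJANGO_METRICS_ENABLED
    value: "true"
```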
**Expected behavior** Be able to gather monitoring data about both the nginx container and the uwsgi container.
**Deployment method** (select with an X)
- [ ] Docker Compose
- [X] Kubernetes
- [ ] GoDojo
**Environment information**
- Operating System: AWS EKS (Bottlerocket)
- DefectDojo version (see footer) or commit message: 2.23.1
**Logs**
```
defectdojo-django-fb486bd59-rbvvn nginx 10.14.196.168 - monitoring [15/Jun/2023:13:46:13 +0000] "GET /django_metrics HTTP/1.1" 302 0 "-" "curl/8.0.1" "-"
defectdojo-django-fb486bd59-rbvvn uwsgi [pid: 15|app: -|req: -/-] 10.14.196.168 (-) {34 vars in 457 bytes} [Thu Jun 15 13:46:13 2023] GET /django_metrics => generated 0 bytes in 27 msecs (HTTP/1.1 302) 8 headers in 261 bytes (1 switches on core 0)
```
**Additional context** (optional)
`/django_metrics` appears to be in `LOGIN_EXEMPT_URLS` using the `settings.dist.py` default provided in the image.
Hi @KarstenSiemer, did you ever figure this out? I pretty much reproduced what you described running the latest version, Helm chart version 1.6.135. From a pod inside the cluster, I run
```
curl -v -u monitoring:<PASSWORD> -H 'Host: <HOSTNAME>' <POD_IP>:80/django_metrics
```
and I get a redirect to `https://<HOSTNAME>/django_metrics`. Then I tried from outside of the cluster:
```
curl -v -u monitoring:<PASSWORD> https://<HOSTNAME>/django_metrics
```
and I get a redirect to `/login?next=/django_metrics`.
I'd love to have a ServiceMonitor for it; I think creating one would be a good enhancement for the chart, but the first step is being able to retrieve the metrics at all.
On another attempt, I tried to use `local_settings.py` with the following content, because I thought the trailing `/` of `django_metrics/` could be the issue:
```python
LOGIN_EXEMPT_URLS += (rf'^{URL_PREFIX}django_metrics',)
```
After that,

```
curl -v -u monitoring:<PASSWORD> https://<HOSTNAME>/django_metrics
```

gives `404 Not Found`. In the logs, I see:

```
WARNING [django.request:241] Not Found: /django_metrics
```
Well, that was quite a surprise.
```
curl -v -u monitoring:<PASSWORD> https://<HOSTNAME>/django_metrics/metrics
```
Worked! :clown_face:
No modification needed in the setup, just that path.
Next step is to set up a ServiceMonitor for it.
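For anyone following along, here is a minimal ServiceMonitor sketch. Assumptions on my side: the Service is called `defectdojo-django`, the nginx port is exposed under the name `http`, and the basic-auth credentials live in a Secret named `defectdojo-monitoring`; your labels may differ:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: defectdojo-django
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: defectdojo  # assumption: match your Service's labels
  endpoints:
    - port: http                          # assumption: named port on the Service
      path: /django_metrics/metrics       # the path that actually worked above
      basicAuth:
        username:
          name: defectdojo-monitoring     # assumption: Secret holding the credentials
          key: username
        password:
          name: defectdojo-monitoring
          key: password
```

Note that if nginx routes on the `Host` header (see the 404 below when hitting the IP directly), the scrape may run into the same hostname issue.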
TBH I have a use case where an AWS ALB needs to check the health of the application (which is behind SSO), and said ALB doesn't know anything about hostnames or auth. I think there would be value in having a totally open endpoint that responds with even a simple 200 if the app is alive.
Is there anything like that? Nothing I can see, unfortunately. For more context, the nginx frontend has this entry:
```nginx
# Used by Kubernetes readiness checks
location = /uwsgi_health {
    limit_except GET { deny all; }
    rewrite /.+ /login?force_login_form&next=/ break;
    include /run/defectdojo/uwsgi_pass;
    include /etc/nginx/wsgi_params;
    access_log off;
}
```
But if I hit it from an ALB, which can't set the hostname, it fails with a 404. Example:
```
curl https://$CONFIGURED_HOSTNAME/uwsgi_health -I
HTTP/2 200
[...]

curl https://$IP_ADDRESS/uwsgi_health -Ik
HTTP/2 404
[...]
```
For future readers: I've worked around this by overriding the AWS ALB health check success codes to also accept a 400 response... ugly, but at least if we get a response, we know we're alive, right?
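Concretely, assuming the ALB is provisioned by the AWS Load Balancer Controller through an Ingress, the override can be expressed with the controller's documented annotations, roughly like this (accepting 400 as healthy is the ugly part):

```yaml
# Ingress annotations for the AWS Load Balancer Controller: point the health
# check at /uwsgi_health and also accept the 400 described above as healthy
alb.ingress.kubernetes.io/healthcheck-path: /uwsgi_health
alb.ingress.kubernetes.io/success-codes: "200,400"
```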