naemon-core icon indicating copy to clipboard operation
naemon-core copied to clipboard

Implemented: Adding configuration for start and reload (#364)

Open ccztux opened this issue 4 years ago • 11 comments

This pull request adds a config check before reloading/starting naemon. The implementation is done in the systemd unit and the init script. If the config check throws an error, reloading/starting will be aborted.

ccztux avatar Nov 02 '21 13:11 ccztux

Thanks for sending this is. We used to have something similar in the past although for restarts, however it was removed (see references below). The past is messy though, and it seems it was never removed for systemd, even though that was probably intended.

The reason at that time, is that for very large systems, reload/restarts/verify-config can take a significant amount of time. Bundling verify-config with the systemd/service commands makes it hard for a user/other software to make a choice what you want to prioritize. I.e. there is nothing stopping you today from running a verify config yourselves before doing the systemctl/service call, if that's what you prefer.

I personally don't have a strong opinion on the subject (and we ship our own systemd files anyway), but it's not entirely clear to me what the "correct" / "best" approach is.

References: https://github.com/naemon/naemon-core/pull/170 https://github.com/naemon/naemon-core/pull/175 https://github.com/naemon/naemon-core/pull/177

jacobbaungard avatar Nov 02 '21 14:11 jacobbaungard

What's the problem to check the scale, the size of naemon installation and to perform the check at the restart when number of services/hosts/etc is not greater than X? For me it is very expected feature, missing in Naemon, existing in Nagios. For my installation (2388 services, 169 hosts) config check takes miliseconds. People make mistakes, forget about something and restarting Naemon with existing errors sometimes can take time to find the cause of it.

tomaszdubiel18 avatar Nov 03 '21 08:11 tomaszdubiel18

The problem is, this will add a random and arbitrary boundary at which reloads/restart behave differently. The reason why we removed the config check is, that its super easy to do the check whenever you need it and its quite hard to remove the check once its there and you don't want it.

I would consider this a standard workflow, you change something, you run a config check and if thats sucessful, do a reload.

In OMD we have a special environment variable CORE_NOVERIFY=yes which controls the behaviour if config checks before reload/restart/start commands. Maybe something like that would be a good idea for the standard rc files here as well.

sni avatar Nov 03 '21 09:11 sni

Maybe just add the parameter in Naemon configuration? "Check Naemon configuration before the restart = true" with default "No"

tomaszdubiel18 avatar Nov 03 '21 09:11 tomaszdubiel18

In OMD we have a special environment variable CORE_NOVERIFY=yes which controls the behaviour if config checks before reload/restart/start commands. Maybe something like that would be a good idea for the standard rc files here as well.

I think that sounds like a great solution :+1:

jacobbaungard avatar Nov 03 '21 09:11 jacobbaungard

Maybe just add the parameter in Naemon configuration? "Check Naemon configuration before the restart = true" with default "No"

I think that's not so easy to implement as we'd need to parse the config file somehow then (i.e in the systemd/service files). It would be much simpler to implement with an environment variable.

jacobbaungard avatar Nov 03 '21 09:11 jacobbaungard

I think this would be useful for users that are not working with Naemon all day long :)

To be fair, I guess most of us are using some tool to generate the Naemon config files for us. I don't think that @jacobbaungard or @sni are messing around with config files anymore.

I also assume that the power users are using own systemd services anyway. Mine for example does a config check:

[Service]
Type=forking
PIDFile=/opt/openitc/nagios/var/nagios.lock
ExecStartPre=/opt/openitc/nagios/bin/naemon -v /opt/openitc/etc/nagios/nagios.cfg
ExecStart=/opt/openitc/nagios/bin/naemon -d /opt/openitc/etc/nagios/nagios.cfg
ExecReload=/bin/kill -HUP $MAINPID

Also the config generator is running a validation before reloading Naemon:

Bildschirmfoto 2021-11-03 um 10 33 22

On my test system with 5024 hosts and 93363 services the verification takes 2.5 seconds.

In my opinion a config validation should be enabled by default. Users with such big setups where the verification takes too long have enough experience to disable the verification, or are already using own systemd service configurations anyway.

nook24 avatar Nov 03 '21 09:11 nook24

In my opinion a config validation should be enabled by default. Users with such big setups where the verification takes too long have enough experience to disable the verification, or are already using own systemd service configurations anyway.

OK for me as long as it's easy to disable with i.e an env var. You have any preference for default behaviour @sni ?

jacobbaungard avatar Nov 03 '21 11:11 jacobbaungard

its fine for me if the default is enabled. As long as its easy to disable.

sni avatar Nov 03 '21 12:11 sni

In OMD we have a special environment variable CORE_NOVERIFY=yes which controls the behaviour if config checks before reload/restart/start commands. Maybe something like that would be a good idea for the standard rc files here as well.

I think that sounds like a great solution 👍

That sounds great.

ccztux avatar Nov 03 '21 12:11 ccztux

Any news on this one? And could you please rebase onto the latest upstream? thanks

sni avatar Sep 07 '23 07:09 sni