Dependencies and/or escalations corrupt the database
See issue #53:
Host X has service Y, host X is behind switch Z.
I added and removed various combinations of dependencies between X, Y and Z via the web interface, and tested them by closing and reopening the port on Z. After noticing issues with notification reliability (see https://forum.centreon.com/forum/plugins-aa/notification-et-escalation/141928-host-service-dependency-notifications-filtering-only-works-50-of-the-time), I removed the dependency definition.
I similarly tested an escalation on the same host/service combination and again removed it after noticing notification reliability issues, but the escalation definition isn't gone from the database and is active for a newly defined host Q, which has the same IP address as host X! (Two issues there: the escalation is not gone, and the hosts are confused due to the shared IP address, even though hostname/description are different.)
Please provide guidance on how to isolate and fix the database problem as this is a production system.
CES 3.3 ISO, updated to Centreon 2.7.5.
Edit: issues #51 and #52 may be symptoms of the db corruption.
Almost 5 months have passed. Has anyone looked at this? If you need any more information, please let me know!
Hi @btassite,
Thanks for the detailed information. We hope all your issues are linked, and there is good news: we got a core dump of something very similar to your issue, and related issues about engine segfaults tend to confirm this.
@ganoze, don't you think there is a link between this and the escalation case?
@btassite, if you can provide a core dump when the engine crashes, that would help a lot! To do so, please set debug to 1 in the centengine init script. When a crash occurs, please let us know and we'll give you an FTP location to drop the files.
Thanks, Simon
@Sims24 Yes, might be related to commit b8235d8d5d8d54b0ccf28f66286e4d2a930b105c, but cannot be sure without updating and testing.
I've changed the init script to debug=1 but haven't restarted the engine yet. There have been no crashes since my report, possibly because we have stopped trying to use dependencies and escalations until the bug is fixed :)
Is the commit public? Maybe I'm looking in the wrong place but couldn't find it through GitHub search or using Google.
Edit: it would still be nice to know how to check the db integrity and clean out any remaining related garbage, regardless of the bug fix.
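
To make the request concrete, here is a rough sketch of the kind of check I have in mind. It is only an illustration: the table and column names (`host`, `escalation`, `escalation_host_link`) are hypothetical stand-ins and not the real Centreon schema, the demo data is made up, and it runs against an in-memory SQLite database rather than the production MySQL one. It looks for escalation links that point at a deleted escalation or host, and for hosts that share an IP address.

```python
import sqlite3


def find_orphaned_escalation_links(conn):
    """Return (escalation_id, host_id) links whose escalation or host row is missing."""
    return conn.execute(
        """
        SELECT l.escalation_id, l.host_id
        FROM escalation_host_link AS l
        LEFT JOIN escalation AS e ON e.id = l.escalation_id
        LEFT JOIN host AS h ON h.id = l.host_id
        WHERE e.id IS NULL OR h.id IS NULL
        ORDER BY l.escalation_id, l.host_id
        """
    ).fetchall()


def find_shared_addresses(conn):
    """Return (address, host_count) for every address used by more than one host."""
    return conn.execute(
        """
        SELECT address, COUNT(*)
        FROM host
        GROUP BY address
        HAVING COUNT(*) > 1
        ORDER BY address
        """
    ).fetchall()


def build_demo_db():
    """Build a tiny in-memory database with hypothetical tables and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
        CREATE TABLE escalation (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE escalation_host_link (escalation_id INTEGER, host_id INTEGER);
        -- Hosts X and Q share an address, as in the report above.
        INSERT INTO host VALUES (1, 'X', '10.0.0.5'), (2, 'Q', '10.0.0.5'), (3, 'Z', '10.0.0.1');
        INSERT INTO escalation VALUES (10, 'kept');
        -- Escalation 11 was deleted, but its link to host Q was left behind.
        INSERT INTO escalation_host_link VALUES (10, 1), (11, 2);
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print("orphaned links:", find_orphaned_escalation_links(conn))
    print("shared addresses:", find_shared_addresses(conn))
```

If something along these lines could be written against the actual Centreon tables, it would at least tell me which leftover rows need cleaning up before I touch anything on the production system.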
Can you give an update on the escalation bug? Has this been looked at or reproduced? I can't see anything relating to it in the changelogs up to 2.8.3.