Should get_available_modes be influenced by inference
If we know through mode inference that a certain state or mode of a (sub-)system can't be transitioned to, should `get_available_states` and `get_available_modes` of (sub-)systems report accordingly?
Example:
- Subsystem A contains several nodes, e.g., node B
- If node B transitions into `error_processing`, certain states and modes of subsystem A are not available until the errors in node B are fixed
- `get_available_states` and `get_available_modes` of subsystem A only report states and modes that allow node B to be in `error_processing`, e.g., degraded modes
Proposal after discussion with MROS team:
- `get_available_modes` should stay consistent with `get_available_states`, i.e., return all modes of a node/system
- In addition, the service response should be extended by a list, e.g., termed `reachable_modes`, that only shows modes that are currently available/reachable
- `reachable_modes` can probably be inferred for systems automatically by the mode_manager. Still to be decided is the default for nodes that don't implement it explicitly: either `reachable_modes = []` or `reachable_modes = available_modes`
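To make the proposal concrete, here is a minimal sketch of what the extended response could look like. It is plain Python rather than the actual service definition; `AvailableModesResponse`, `build_response`, and the `default_all_reachable` switch are hypothetical names used only to illustrate the two candidate defaults for nodes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AvailableModesResponse:
    # Hypothetical response shape: field names follow the proposal,
    # not an existing service definition.
    available_modes: List[str] = field(default_factory=list)
    reachable_modes: List[str] = field(default_factory=list)


def build_response(
    available_modes: List[str],
    reachable_modes: Optional[List[str]] = None,
    default_all_reachable: bool = True,
) -> AvailableModesResponse:
    """Build a response that always lists all modes plus the reachable subset.

    If the node reports no reachability information (None), fall back to one
    of the two defaults under discussion: all available modes, or none.
    """
    if reachable_modes is None:
        reachable = list(available_modes) if default_all_reachable else []
    else:
        # Keep only modes that are actually available, in declaration order.
        reachable = [m for m in available_modes if m in reachable_modes]
    return AvailableModesResponse(list(available_modes), reachable)
```

With this shape, `available_modes` stays stable and consistent with `get_available_states`, while clients that care about the current situation read `reachable_modes` instead.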
When/how can we consider modes to be not reachable by nodes? Available modes are currently reported by the mode manager based on the SMH file (this is because the lifecycle node itself doesn't have any idea about modes). We could obviously conclude that no mode is reachable when the node is in error-processing, but is this the only thing we know? @chcorbato Do you have an idea on this?
Maybe we could associate the ErrorProcessing transition to the node MODE at that moment, so that only that MODE is considered not reachable. What do you think? Could the Mode Manager keep track of reachable modes?
The Mode Manager has to keep track of reachable modes, yes. So you suggest that all modes are considered reachable when the node is inactive/active and no modes are considered reachable when the node is in error processing? That should be doable.
No, not all modes. I think only those node modes that were active when the node went into ErrorProcessing should be considered not reachable. It could be that only specific configurations cause the node to go into error, right?
Okay, so the mode manager would then keep track of the modes that "made" the nodes transition into error processing and exclude them from the list of reachable modes. Maybe until the node gets reset?
Sounds reasonable to me.
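A rough sketch of the bookkeeping agreed on above, assuming the mode manager is notified about mode changes, error-processing transitions, and resets. `ReachableModeTracker` and its method names are hypothetical and not part of the current mode manager; what exactly counts as a "reset" is also left open here.

```python
from typing import Dict, List, Set


class ReachableModeTracker:
    """Tracks, per node, the modes that were active when it entered error processing."""

    def __init__(self, available_modes: Dict[str, List[str]]):
        # Available modes per node, e.g. as read from the SMH file.
        self._available = {node: list(modes) for node, modes in available_modes.items()}
        self._current: Dict[str, str] = {}
        self._failed: Dict[str, Set[str]] = {node: set() for node in available_modes}

    def on_mode_changed(self, node: str, mode: str) -> None:
        if mode not in self._available[node]:
            raise ValueError(f"unknown mode {mode!r} for node {node!r}")
        self._current[node] = mode

    def on_error_processing(self, node: str) -> None:
        # Blame the mode that was active at the moment of the error transition.
        mode = self._current.get(node)
        if mode is not None:
            self._failed[node].add(mode)

    def on_reset(self, node: str) -> None:
        # Assumption: a reset makes all modes of the node reachable again.
        self._failed[node].clear()
        self._current.pop(node, None)

    def reachable_modes(self, node: str) -> List[str]:
        return [m for m in self._available[node] if m not in self._failed[node]]


tracker = ReachableModeTracker({"node_b": ["__DEFAULT__", "FAST", "SAFE"]})
tracker.on_mode_changed("node_b", "FAST")
tracker.on_error_processing("node_b")
print(tracker.reachable_modes("node_b"))  # ['__DEFAULT__', 'SAFE']
```

Only the mode that was active at the error is excluded, so a node that fails in one specific configuration still reports its other modes as reachable until it is reset.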