Fix: agent: prevent fencing from hanging on sbd commands if any of the devices is silently blocked
If any of the configured SBD devices is silently blocked without an explicit I/O error from the kernel, fencing gets stuck and times out, even if the majority of the devices are still available.
In this situation, the gethosts, off/reset, and status actions get stuck on the sbd list, dump, or message commands, which hang in exit_aio() on exit and end up in D state.
With these commits, the sbd fence agent invokes the commands asynchronously and individually for each device, and returns as soon as the purpose of the command has been achieved. This prevents the fence agent from hanging unnecessarily in such a situation.
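The per-device fan-out described above could be sketched as follows. This is a minimal illustration of the idea, not the agent's actual code; `majority_result` and the command lines are hypothetical placeholders, and a real agent would invoke something like `sbd -d <device> list` per device:

```python
import subprocess
import time

def majority_result(cmds, timeout=10.0):
    """Run one subprocess per device command; return True as soon as a
    majority exit successfully, False once that becomes impossible or
    the deadline passes. Stragglers (e.g. stuck on a silently blocked
    device) are abandoned so the caller itself never hangs."""
    procs = [subprocess.Popen(argv,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
             for argv in cmds]
    needed = len(procs) // 2 + 1          # a majority of devices suffices
    deadline = time.monotonic() + timeout
    ok = failed = 0
    pending = set(procs)
    success = False
    while pending:
        for p in list(pending):
            rc = p.poll()                 # non-blocking status check
            if rc is not None:
                pending.discard(p)
                ok += (rc == 0)
                failed += (rc != 0)
        if ok >= needed:                  # quorum reached: done
            success = True
            break
        if failed > len(cmds) - needed or time.monotonic() >= deadline:
            break                         # quorum impossible or timed out
        time.sleep(0.05)
    for p in pending:                     # don't wait on stuck children
        p.kill()
    return success
```

The key property is that the parent only ever polls; it never blocks in a wait on a child that may be stuck in uninterruptible I/O.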
Let's first confirm the concept. I'll ask the user to test it as well.
Definitely something that looks as if it needs improvement. Personally I think we should try to tackle the issue in sbd itself and not in the fencing script. One reason is that this way we have to do it twice: fence_sbd should suffer from similar issues. I was also wondering whether we could pass the logic to pacemaker-fenced, e.g. using topology. Not directly usable, but the concept could be extended ...
@kgaillot What do you think?
@gao-yan what are you using to simulate the stall behavior?
Maybe we can add it to the tests then.
> Personally I guess we should try to tackle the issue in sbd itself and not in the fencing-script. One of the reasons is that like this we have to do it twice. fence_sbd should suffer from similar issues.
Indeed... I'll take a look what could be done with sbd itself.
> @gao-yan what are you using to simulate the stall behavior?
> Maybe we can add it to the tests then.
I haven't figured out how to simulate it in a test environment. So far I've been testing it with an iscsi setup and iptables :-)
Maybe something simple in the testbed I've written would be enough. I'm already intercepting most of the I/O, so we could add configurable delays/stalls. That includes the libaio calls, where I'm either just wrapping the original calls or replacing them with an implementation that maps to read/write. That behavior is switchable, so we can pick whatever sounds easier for this purpose.
> Was as well thinking if we could pass the logic to pacemaker-fenced - like using topology - not directly usable but the concept could be extended ...
@kgaillot What do you think?
Not sure what you mean. Are you suggesting using something like topology to manage multiple sbd devices, so that sbd and fence_sbd only ever deal with a single device?
Yes. I wanted to see if thinking in that direction leads somewhere useful, as pacemaker already has the logic for individual timeouts and for combining results. Parallel fencing is already available, though I guess not in the case of topologies. Being content with a quorate number of positive results would have to be added.
@gao-yan Sorry for not driving this further in the last nearly two years. Do you have updates on your work on the fencing script? I would still prefer doing the parallel-execution stuff in the C code, although I've just submitted a PR for fence_sbd that should do a similar thing. With a device-mapper disk that is suspended, we are currently creating D-state children, which is really ugly but probably hard to avoid with the current architecture. An alternative might be handling the messaging via some kind of IPC with the sbd daemon.
You happened to ask :-) I actually just got the chance to get back to this one. I'm working on a solution in the C code that makes several sbd commands execute in sub-processes for the respective devices, so that the main process cannot hang. It should be universally beneficial for both sbd fence agents, so we could avoid changing the fence agents. I'm going to open a PR soon to show you the draft, so that we can talk about the details.
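The D-state problem mentioned above is the crux: a child blocked in uninterruptible I/O ignores even SIGKILL, so the only safe option for the parent is a bounded, non-blocking wait after which the child is abandoned. A minimal sketch of that pattern, in illustrative Python rather than the actual C code being discussed (`wait_with_deadline` is a hypothetical helper):

```python
import os
import time

def wait_with_deadline(pid, timeout):
    """Reap pid if it exits within `timeout` seconds and return its exit
    code; otherwise return None and abandon the child. A child stuck in
    D state cannot be killed, so the parent must never block in
    waitpid(); an abandoned child is reparented to init once the parent
    exits and reaped there whenever its I/O finally completes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)  # non-blocking reap
        if done == pid:
            return os.waitstatus_to_exitcode(status)
        time.sleep(0.05)
    return None
```

The same structure maps directly to C: `fork()` the per-device worker, then poll with `waitpid(pid, &status, WNOHANG)` against a deadline instead of blocking.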
That's cool! Good that I asked before starting a parallel project ;-)