
host-device: cmdDel fails if device was renamed in container

Open olivier-matz-6wind opened this issue 3 years ago • 0 comments

Hi,

[creating an issue from the pull request #766]

When a device previously passed to the container has to be moved back to the host, netlink.LinkByName() is used to find it in the container. This does not work if the device was renamed in the container.

In my use-case, it blocks the destruction of the pod (when I'm using kubectl scale deployment NAME --replicas=0) with the following error:

error killing pod: failed to "KillPodSandbox" for "90accdea-b2d9-4f1f-a9fe-8c838defd726" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"3ae636bf65be936512620499bca54789767f874d4cfd16497fc91a9fba02a6a2\": delegateDel: error invoking DelegateDel - \"host-device\": error in getting result from DelNetwork: failed to find \"net1\""

In pull request #766, a first fix was proposed: store the initial interface name in the alias, in addition to the host interface name. To find the device, the link list is iterated until the link whose alias matches is found.

However, as explained in a later comment of the pull request, I'm not sure this is the best way to fix the issue. Using the ifalias as a database does not look very reliable either, because it can still be changed by the user in the container.

My second idea was to store the ifindex of the interface (maybe in /var/lib/cni) in cmdAdd(), and re-use it in cmdDel() to find the interface in the container. That looks better because we don't have to change the interface alias. But still, it won't work if the interface was moved into another sub-netns inside the container.

I cannot find a good solution that would work in any case. That leads me to this question:

Would it make sense to ignore the error when we cannot find the device, and just issue a warning? Indeed, in Linux, when a netns is destroyed, all physical devices that were present in the netns reappear in init_net.

Note: I'm not very familiar with development of containernetworking plugins, or even with golang. Any advice would be greatly appreciated.

Thanks, Olivier

olivier-matz-6wind, Sep 16 '22 12:09