cni icon indicating copy to clipboard operation
cni copied to clipboard

AddNetworkList will cause something leak

Open hongli-my opened this issue 5 years ago • 4 comments

https://github.com/containernetworking/cni/blob/master/libcni/api.go#L428

func (c *CNIConfig) AddNetworkList(ctx context.Context, list *NetworkConfigList, rt *RuntimeConf) (types.Result, error) {
	var err error
	var result types.Result
	for _, net := range list.Plugins {
		result, err = c.addNetwork(ctx, list.Name, list.CNIVersion, net, result, rt)
		if err != nil {
			return nil, err
		}
	}

	if err = c.cacheAdd(result, list.Bytes, list.Name, rt); err != nil {
		return nil, fmt.Errorf("failed to set network %q cached result: %v", list.Name, err)
	}

	return result, nil
}

for .conflist cni conf like: if flannel cni run success, but tunning failed, will not be cleaned for flannel.

{
  "cniVersion": "0.3.1",
  "name": "flannel",
  "plugins": [
  {
    "cniVersion": "0.3.1",
    "name": "flannel",
    "type": "flannel",
    "subnetFile": "/run/flannel/subnet.env",
    "delegate": {
      "bridge": "cbr0",
      "ipMasq": false,
      "mtu": 1450,
      "isDefaultGateway": true
    }
  },
  {
    "cniVersion": "0.3.1",
    "name": "tuning",
    "type": "tuning",
    "ethtoolConf": {
      "ifName": "eth0",
      "offloads": {
        "tso": false
      }
    }
  }
  ]
}

hongli-my avatar Nov 20 '20 04:11 hongli-my

I believe this is covered in the spec:

If an ADD action fails, when the runtime decides to handle the failure it should execute the DEL action (in reverse order from the ADD as specified above) for all plugins in the list, even if some were not called during the ADD action.

bboreham avatar Nov 25 '20 16:11 bboreham

We're considering changing the specification (and libcni) to

  • delete only the plugins that were ADD'ed
  • Automatically delete in libcni on failure.

squeed avatar Nov 25 '20 16:11 squeed

This is definitely not a spec change, so removing the milestone.

Because libcni itself might crash during ADD (or the subsequent DEL), the runtime (e.g. podman, containerd) should ALWAYS call DEL on a failed ADD. So, while we might paper this over in libcni, the behavior doesn't change: you should always DEL.

squeed avatar Dec 02 '20 16:12 squeed

Do you mean crash as in "panic", or some more general concept of "really bad error where we can't recover enough to call DEL, but haven't actually killed the process" ?

bboreham avatar Dec 02 '20 18:12 bboreham