Add an ipam option for `docker network create` to allow hinting a different preferred size than what may be configured in default-address-pools
Description
When doing local Docker development, the default /16 (and /20) subnet sizes configured in default-address-pools can exhaust the available default network pools if networks aren't regularly cleaned up. Where a smaller subnet (say /24) would be viable, the available workarounds are to manually configure network subnets or to override the default-address-pools configuration. Both require per-machine configuration (and educating users on how to make the appropriate change or select an available network range), and neither is particularly viable for tooling or libraries that need to create one or more networks with a configuration that works on any developer's machine.
I'd like to propose a new optional ipam option value that could be passed to docker network create and act as a hint for a preferred network subnet size from the default pools. This hint would be optional and the ipam driver could choose to ignore the value; the created network wouldn't be guaranteed to match the requested size. For example, if the default address pool was configured to issue /24 subnets and a request indicated a preferred size of 20, the user may still receive a /24 network. Or in the event that a user indicated that they preferred an invalid network subnet size (a non-numeric value, a value greater than is allowed for the current subnet IP type (IPv4, IPv6), etc.), the hint would be ignored and the default size used.
Effectively, I'd like to see something like the following:
docker network create --ipam-opt preferred_default_address_pool_size=24 test
For the default address pool configurations, this might return 172.18.0.0/24 as the subnet for the newly created network.
The important thing is that the value is simply a hint that a network can accept a smaller subnet and the ipam driver can choose to honor or ignore it as makes sense on a request-by-request basis.
I have a local branch where I've been playing around with this idea and it's a fairly small change to the default ipam driver to implement this in such a way that it doesn't impact any existing behavior but enables allocating smaller than default subnets for networks from the default pools.
Thank you @danegsta - providing a size hint seems like a good idea to me, or perhaps a strict minimum size.
The current built-in default address pools are too big. I think @akerouanton has a plan to do something about that and may have had something in progress, but I've forgotten what the plan was ... let's wait for his thoughts (but it will be a couple of weeks).
I have a local branch where I've been playing around with this idea
FWIW, feel free to open a draft PR if you want to show the changes you had in mind; sometimes it helps the conversation if engineers / maintainers have code-changes to look at.
Sounds good; I'll clean up my branch and get a few more tests added, then open a draft PR linked to this issue.
Wanted to see if there'd been any chance to look at this proposal?
Thanks for the proposal! This is indeed something we discussed a few times internally and we'd like to see implemented.
This would be especially useful for bridge networks as the Linux kernel doesn't support more than 1024 ports per bridge -- you can't even change that through a compile-time flag, you'd need to do kernel surgery to get anything bigger. So it doesn't make sense to allocate /16 for them.
Or in the event that a user indicated that they preferred an invalid network subnet size
In that case, it's even better to just return an error to the user. If they specify something invalid and the Engine falls back to a smaller subnet, the user starts attaching containers to the misconfigured network, realizes the mistake too late, and has to either detach or destroy their containers to recreate a network of the correct size. If the Engine instead falls back to something bigger, the default-address-pools will be subnetted faster than anticipated. Either way, it's bad from a user standpoint.
This hint would be optional and the ipam driver could choose to ignore the value; the created network wouldn't be guaranteed to match the requested size.
For builtin IPAM drivers, we should return an error if a subnet size is specified but the driver doesn't understand it (e.g. windows IPAM). Otherwise, for IPAM plugins, we don't require them to reject requests with options they don't understand, so this is fine.
docker network create --ipam-opt preferred_default_address_pool_size=24 test
Maybe we can improve the UX by making this shorter. What about docker network create --subnet /28 test? Whether a dynamic or a static allocation is requested would depend on whether a subnet address is specified. Leaving --subnet unspecified would be equivalent to --subnet 0.0.0.0/0 ("anything goes" = dynamic allocation with predefined subnet size), and --subnet /28 would be equivalent to 0.0.0.0/28 ("any /28" = dynamic allocation with a specific subnet size).
In terms of implementation, this is slightly more involved though, as we need to take care of backward compatibility. We don't officially support backward compatibility across major versions, but we also try to not break it unless really needed.
The defaultipam driver uses netip.ParsePrefix, which doesn't understand CIDR notation with an unspecified subnet address, so we'd need to store the requested subnet size in an IPAM option. That way, only compatible Engines would take it into account.
As a side note, I recall there were also discussions about dropping the subnet size from default-address-pools altogether. We can leave this for now, but it'd be a nice follow-up. Once we get this proposal in, we could set the default subnet size to /20 (to not break bridge users), and let users change the default through their daemon.json (although the current default-network-opts doesn't influence IPAM allocations unfortunately).
In that case, it's even better to just return an error to the user.
That would simplify the logic quite a bit as fallback can get complicated.
docker network create --ipam-opt preferred_default_address_pool_size=24 test
Maybe we can improve the UX by making this shorter. What about docker network create --subnet /28 test?
I definitely prefer the --subnet notation in terms of overall UX; the main reason for suggesting an IPAM option was that it would make dealing with backwards compatibility easier for tooling that wants to make use of the feature (either it works or is silently ignored, but the user gets a network either way).
One option could be to support both syntaxes (but raise an error if both the --subnet and IPAM arguments are specified). That would give users (and eventually tools) access to a more straightforward flag, while allowing the IPAM option to be used by tooling in the short term when there's a need to support older engine versions.
I just rebased https://github.com/moby/moby/pull/50114 on latest master and updated it to use the subnet/pool option from above. The subnet size is no longer considered a hint; the request will fail if an invalid subnet size is provided. I've simplified the ipam option to subnet_size (from preferred_default_address_pool_size).
docker network create --subnet /24 test, docker network create --subnet 0.0.0.0/24 test, docker network create --subnet ::/24 test, and docker network create --ipam-opt subnet_size=24 test will all return a /24 sized subnet from the default pools if possible.
I haven't had a chance yet to look at providing a useful error message when attempting to pass subnet to the windows ipam driver.
Implemented in https://github.com/moby/moby/pull/50114, for Docker 29.0.0.