Running multiple `img build` instances concurrently?
Running multiple `img build` instances concurrently with a single state dir is currently impossible, due to the boltdb lock.
Some solutions I can come up with:
1. Eliminate boltdb
   - Pro: Ideal
   - Con: Hard
2. Internally spawns `buildkitd` and `containerd`
   - Pro: Easy
   - Con: Too much complexity
3. Define `img build-batch` as follows
   - Pro: Easy
   - Con: Weird UX. Still multiple `img` commands cannot be executed simultaneously.
```
$ cat << EOF | img build-batch
-t foo /tmp/foo
-t example.com/bar --target release --push /tmp/bar
EOF
```
Any thoughts?
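For illustration only, the input format proposed for `img build-batch` can be mimicked with a small wrapper around the existing `img build` command. This is a rough sketch, not the proposed implementation: `build_batch` and `IMG_BIN` are hypothetical names, each line is split on whitespace with no quoting support, and the builds run one after another, so it shows the UX but gives no concurrency.

```shell
# Hypothetical wrapper that mimics the proposed `img build-batch` input format.
# IMG_BIN is an assumed override so the binary can be swapped out for testing.
IMG_BIN="${IMG_BIN:-img}"

# build_batch: read one set of `img build` arguments per line from stdin and
# run the builds sequentially. The body runs in a subshell so that `set -f`
# (no glob expansion) does not leak into the caller.
build_batch() (
  set -f
  rc=0
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines.
    [ -z "$line" ] && continue
    # Intentional word splitting: arguments are separated by whitespace only.
    # stdin is redirected so the build cannot consume the remaining lines.
    # shellcheck disable=SC2086
    "$IMG_BIN" build $line </dev/null || rc=1
  done
  # Non-zero if any build in the batch failed.
  exit "$rc"
)

# Usage (same input as the example above):
#   cat << EOF | build_batch
#   -t foo /tmp/foo
#   -t example.com/bar --target release --push /tmp/bar
#   EOF
```

A failed build does not stop the batch; the wrapper keeps going and reports the failure through its exit status.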
I am all for the easiest solution... so #3 :)
Our use case (a docker image ci-build-server) wouldn't benefit from #3 unfortunately.
I like 3, although I'm not sure whether it makes sense for standalone builders to do this work or for higher-level pipelines to do it. Also, what about: 4. Just keep boltdb for now and allow parallel execution using multiple state dirs. This trades caching for concurrency; for heterogeneous builds it should improve things significantly at a low implementation cost.
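Option 4 can be sketched from the outside with a wrapper that gives every build its own state directory, so no two `img` processes contend for the same boltdb file. This is a minimal sketch under assumptions: it assumes `img build` accepts a `--state` flag for the state directory, and `build_isolated`, `state_dir_for`, `IMG_BIN` and `STATE_ROOT` are hypothetical names. As noted above, the builds share no cache.

```shell
# Sketch: one state dir per build, so each img process takes its own lock.
# Assumption: `img build` accepts `--state DIR` to choose the state directory.
IMG_BIN="${IMG_BIN:-img}"
STATE_ROOT="${STATE_ROOT:-${TMPDIR:-/tmp}/img-states}"

# state_dir_for NAME: print a per-build state directory path under STATE_ROOT.
state_dir_for() {
  # Replace characters that are awkward in paths (e.g. '/' and ':' in tags).
  printf '%s/%s\n' "$STATE_ROOT" "$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')"
}

# build_isolated TAG CONTEXT [extra `img build` args...]
build_isolated() {
  tag=$1
  ctx=$2
  shift 2
  dir=$(state_dir_for "$tag")
  mkdir -p "$dir" || return 1
  "$IMG_BIN" build --state "$dir" -t "$tag" "$@" "$ctx"
}

# Usage: start the builds in the background, then wait for all of them.
#   build_isolated foo /tmp/foo &
#   build_isolated example.com/bar /tmp/bar --target release --push &
#   wait
```

A bare `wait` does not report the exit status of individual builds; collecting each background PID and waiting on it separately would be needed for that.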
4.1: multiple state dir but with single blob storage dir? (gc might be hard)
could re-write the interface to use sqlite which will then allow multiple concurrent processes to open... i might try this in a timebox
actually just realized there are like 4 interfaces that would need that so nope lol
Cc @tonistiigi
It is not boltdb that causes the problems here. That could easily be solved by releasing the boltdb lock between batches of queries. The real problems start when two parallel builds return the same cache keys or when gc runs. The two cache keys are merged internally in the solver, and the solver also internally keeps track of the snapshot reference counts. If you run prune on one of the builders using the same state / boltdb, it is likely that the second build will fail with some panic-like error.
I'm all for starting a hidden shared process that does the actual job. If you don't want to call it a daemon for some reason, then that is fine. The shared process can go away as soon as there are no main processes running. If there are any changes needed for that in buildkit, then I'm happy to help. Either in combination with getting access to a client for a controller directly while having this capability, or something like this in buildctl (or buildkitrun). There are some related ideas also in https://github.com/moby/buildkit/issues/237
> The shared process can go away as soon as there are no main processes running. If there are any changes needed for that in buildkit, then I'm happy to help.
In the current design, we need to use the containerd worker and spawn containerd to access the image store and the content store from the client (img).
This would be OK, but maybe we want to add an image store service and a content service to buildkitd (with the runc worker) as well?
@AkihiroSuda This has come up before in https://github.com/moby/buildkit/pull/289 . I'm ok with adding it to the buildkitd as it is a set of features atm that only work with a specific worker and that isn't very nice. Another way would be to push the imagestore (referencestore really) to the client and providing a callback for the resolver/prune to check local images.
I would propose this be considered out of scope for img. Multiple img builds can be run in separate containers already, and if you want a daemon and scalable builds, I think buildctl would be a better option for those use cases.