coreos-assembler icon indicating copy to clipboard operation
coreos-assembler copied to clipboard

oscontainer-deprecated-legacy-format: use runvm not nested containers

Open jmarrero opened this issue 3 years ago • 45 comments

Initial effort to fix: https://github.com/openshift/os/issues/1009

jmarrero avatar Sep 22 '22 20:09 jmarrero

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci[bot] avatar Sep 22 '22 20:09 openshift-ci[bot]

Trying to get this running locally first, right now hitting python module issues:

  File "/usr/lib/coreos-assembler/oscontainer-deprecated-legacy-format.py", line 22, in <module>
    from cosalib import cmdlib
  File "/usr/lib/coreos-assembler/cosalib/cmdlib.py", line 15, in <module>
    import yaml
ModuleNotFoundError: No module named 'yaml'
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-upload-oscontainer", line 120, in <module>
    subprocess.check_call(cosa_argv +
  File "/usr/lib64/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)

maybe it's time to just farm out the buildah/podman call to the vm.

jmarrero avatar Sep 22 '22 20:09 jmarrero

Short term I'm OK with just taking the hit of dragging python into supermin.

(Hmm...i wonder if we could actually just mount the host container's /usr into the supermin over 9p VM and use that...)

cgwalters avatar Sep 22 '22 21:09 cgwalters

I see, I'll try that next I was trying to modify oscontainer.py instead of calling buildah, to call something like

buildah_base_argv = ['. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- buildah']

But keep getting

  File "/usr/lib64/python3.10/subprocess.py", line 1845, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- buildah'

jmarrero avatar Sep 22 '22 22:09 jmarrero

I think I am hitting some kind of argument limit when doing this via string because I get:

oscontainer-deprecated-legacy-format.py: error: unrecognized arguments: CoreOS kernel-rt-core ostree rpm-ostree ignition systemd runc cri-o /srv/tmp/repo ece6f5adbc58edd21135a922e9adb0c224cd0409fe284cdb6c2fe59483ba4b8a rhcos:412.86.202209021413-0

when I add the {arguments} to the call which look like:

['sudo', '--preserve-env=container,DISABLE_TLS_VERIFICATION,SSL_CERT_DIR,SSL_CERT_FILE,REGISTRY_AUTH_FILE,OSCONTAINER_CERT_DIR', '/bin/sh', '-c', '. /usr/lib/coreos-assembler/cmdlib.sh; prepare_build && . /usr/lib/coreos-assembler/cmdlib.sh && runvm -- /usr/lib/coreos-assembler/oscontainer-deprecated-legacy-format.py --workdir=./tmp build --from=registry.access.redhat.com/ubi8/ubi:latest  --display-name="Red Hat Enterprise Linux CoreOS" --labeled-packages="kernel kernel-rt-core ostree rpm-ostree ignition systemd runc cri-o" --digestfile=tmp/oscontainer-digest --push /srv/tmp/repo ece6f5adbc58edd21135a922e9adb0c224cd0409fe284cdb6c2fe59483ba4b8a rhcos:412.86.202209021413-0']

Without the arguments that have spaces, it starts the build in the VM. Still debugging what is going on.

jmarrero avatar Sep 23 '22 21:09 jmarrero

quite close, just need to handle the actual upload. The image is now being built on the vm.

error pushing image "jmarrerotest:412.86.202209302130-0" to "docker://jmarrerotest:412.86.202209302130-0": trying to reuse blob sha256:b38cb92596778e2c18c2bde15f229772fe794af39345dd456c3bf6702cc11eef at destination: checking whether a blob sha256:b38cb92596778e2c18c2bde15f229772fe794af39345dd456c3bf6702cc11eef exists in docker.io/library/jmarrerotest: errors:
denied: requested access to the resource is denied
error parsing HTTP 401 response body: unexpected end of JSON input: ""

Just working on moving the upload outside the vm as discussed with Colin.

jmarrero avatar Sep 30 '22 21:09 jmarrero

Just working on moving the upload outside the vm as discussed with Colin.

If the image is in the local builds dir and also in the meta.json then you should be able to use cosa push-container-manifest for this. See https://github.com/coreos/fedora-coreos-pipeline/blob/7aea3fe533f19acc7ce76cac372d8d0f65f1d4ea/jobs/release.Jenkinsfile#L184-L187

dustymabe avatar Sep 30 '22 22:09 dustymabe

tested this on a pod and locally, it created a oci-archive with the name I pass to the command under --name. for example:

cosa upload-oscontainer --name rhctest

Now generates:

skopeo inspect --config oci-archive:rhctest | grep version 
            "io.buildah.version": "1.27.0",
            "io.openshift.build.version-display-names": "machine-os=Red Hat Enterprise Linux CoreOS",
            "io.openshift.build.versions": "machine-os=412.86.202209021413-0",
            "version": "412.86.202209021413-0"

The meta.json shows the entry too:

  "name": "rhcos",
  "oscontainer": {
    "digest": "sha256:0498f3bf675eb3700bd8c789da2707e71a73a1da26e67d4a250ed74827c76011",
    "image": "rhctest"
  },

jmarrero avatar Oct 03 '22 19:10 jmarrero

@dustymabe you mean to use it in the pipeline right? If so then this PR now creates the oci-archive. So in the pipeline I need to call cosa upload-oscontainer and then cosa push-container-manifest. If that is the case I will raise a PR with that next.

jmarrero avatar Oct 04 '22 18:10 jmarrero

@dustymabe you mean to use it in the pipeline right?

Yes

If so then this PR now creates the oci-archive. So in the pipeline I need to call cosa upload-oscontainer and then cosa push-container-manifest. If that is the case I will raise a PR with that next.

We might want to rename cosa upload-oscontainer since it's not doing that any longer. cosa create-legacy-oscontainer maybe?

We also should consider how we want the containers in the registry to look. For example the resulting container from our FCOS pipeline is manifest listed with all architectures referenced. We run this in the release job (after all arches have finished building) rather than in each build job for each architecture.

Just something to think about.

dustymabe avatar Oct 04 '22 20:10 dustymabe

Renamed, I think we can influence how it looks with the --name we pass to cosa create-legacy-oscontainer and when we push it. We can discuss it on the PR doing the push I think.

jmarrero avatar Oct 04 '22 21:10 jmarrero

So the hard rename i.e. requiring the pipeline change seems to run straight into https://issues.redhat.com/browse/MCO-392

IOW, if we merge this change as is, won't it break the current (legacy) RHCOS pipeline for 4.12? Or I guess the answer is: we can just hold this until we do a hard cutover to the new pipeline?

cgwalters avatar Oct 05 '22 13:10 cgwalters

Playing with the upload-container-manifest not doing the upload while creating the container might be a bit tricky. The meta.json currently has oscontainer in a seperate entry from the artifacts.

  "oscontainer": {
    "digest": "sha256:6a4fd352a230431efe0ddfc49a432673692e36d6e08785244f11c70aa465466e",
    "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
  },

And image points to a repo not a file.

I can call upload-container-manifest with the oci-image but, question for you guys... does it matter that the meta.json does not point to a repository anymore and just to a file? Like so:

  "oscontainer": {
    "digest": "sha256:6a4fd352a230431efe0ddfc49a432673692e36d6e08785244f11c70aa465466e",
    "image": "oscontainer-test"
  },

jmarrero avatar Oct 05 '22 21:10 jmarrero

Hmm I think https://github.com/openshift/doozer might rely on that...why can't we write the remote registry here still?

cgwalters avatar Oct 05 '22 21:10 cgwalters

Hmm I think https://github.com/openshift/doozer might rely on that...why can't we write the remote registry here still?

with: https://github.com/coreos/coreos-assembler/pull/3111/commits/22d1a813e17593e81271071c236f6b516846866a it can. But without that, the --name could be a url.

jmarrero avatar Oct 06 '22 01:10 jmarrero

Playing with the upload-container-manifest not doing the upload while creating the container might be a bit tricky. The meta.json currently has oscontainer in a seperate entry from the artifacts.

Do you mean push-container-manifest when you say upload-container-manifest?

  "oscontainer": {
    "digest": "sha256:6a4fd352a230431efe0ddfc49a432673692e36d6e08785244f11c70aa465466e",
    "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev"
  },

And image points to a repo not a file.

I can call upload-container-manifest with the oci-image but, question for you guys... does it matter that the meta.json does not point to a repository anymore and just to a file? Like so:

  "oscontainer": {
    "digest": "sha256:6a4fd352a230431efe0ddfc49a432673692e36d6e08785244f11c70aa465466e",
    "image": "oscontainer-test"
  },

I think it's worth us taking a step back and looking at what is done for other artifacts here. In most cases we have something that is built in the pipeline and stored as a file (look at all the entries in meta.json under images). Then we have other entries in meta.json for when those artifacts were pushed to remote resources (AMIs, container registry, etc). I think it could make sense to do that here too. If you look at what we're doing in FCOS for the new style oscontainer we have images.ostree.path that points to something like fedora-coreos-36.20220906.3.2-ostree.x86_64.ociarchive which is stored along with the meta.json file and then we have a base-oscontainer.image entry with a value of quay.io/fedora/fedora-coreos@sha256:de2dbf690f82077d1eae67da30e2a918908dcd457b518ae1de8671192309aba8.

First step: create artifact, store artifact, update meta.json with entry in images Second step: collect artifacts (multi-arch), push manifest listed container to registry, update meta.json with oscontainer entry.

dustymabe avatar Oct 06 '22 01:10 dustymabe

Yeah, I think you're right - it would make sense to write an OCI archive stored as an artifact indeed, then this code/logic looks exactly the same as the "new" base image and extensions containers.

cgwalters avatar Oct 06 '22 11:10 cgwalters

Tested that:

cosa push-container-manifest --repo quay.io/MYREPO/MYIMG --tag oscontainer --artifact=oscontainer --metajsonname=oscontainer

Now works with the file and meta.json updated generated by:

cosa create-legacy-oscontainer

The meta.json entry looks like:

    "oscontainer": {
      "path": "rhcos-412.86.202209021413-0-oscontainer.x86_64.ociarchive",
      "sha256": "5a8d43ceec92980f07d5ddcdd477699a41498673ab7b3c3fe53f085003225c25",
      "size": 1330113536
    },

jmarrero avatar Oct 07 '22 16:10 jmarrero

Upgrades test failing with

Error: failed to parse build: json: unknown field "digest"
[2348](https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/coreos_coreos-assembler/3111/pull-ci-coreos-coreos-assembler-main-rhcos/1578423504221507584#1:build-log.txt%3A2348)
2022-10-07T17:35:24Z cli: failed to parse build: json: unknown field "digest"
[2349](https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/coreos_coreos-assembler/3111/pull-ci-coreos-coreos-assembler-main-rhcos/1578423504221507584#1:build-log.txt%3A2349)
{"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"k8s.io/test-infra/prow/entrypoint/run.go:80","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2022-10-07T17:35:24Z"}

I guess the old oscontainer entry in the meta.json is needed. Retesting just in case but I can quickly re-add the old oscontainer entry with the digest too I think they will likely not conflict at all with the new artifact oscontainer entry.

jmarrero avatar Oct 08 '22 14:10 jmarrero

/retest-required

jmarrero avatar Oct 08 '22 14:10 jmarrero

This LGTM from a quick look. Now the question becomes: How can we backport that to 4.8+?

travier avatar Oct 10 '22 14:10 travier

I don't see push-container-manifest in https://github.com/coreos/coreos-assembler/commits/rhcos-4.8 so that will be required to be backported too. @dustymabe you had any other plan? like trying to build old versions with the latest cosa too?

jmarrero avatar Oct 10 '22 14:10 jmarrero

I don't see push-container-manifest in https://github.com/coreos/coreos-assembler/commits/rhcos-4.8 so that will be required to be backported too. @dustymabe you had any other plan? like trying to build old versions with the latest cosa too?

I don't have any silver bullet here. We'll just have to look at the backports after we get it working with :latest and see what the options are.

dustymabe avatar Oct 10 '22 19:10 dustymabe

Is the update to go modules (vendoring) required in this PR? Can we split it to a separate PR?

dustymabe avatar Oct 10 '22 20:10 dustymabe

Is the update to go modules (vendoring) required in this PR? Can we split it to a separate PR?

Since I am adding something to new to the schema I believe it's required: https://github.com/jmarrero/coreos-assembler/blob/main/Makefile#L88

jmarrero avatar Oct 10 '22 22:10 jmarrero

Is the update to go modules (vendoring) required in this PR? Can we split it to a separate PR?

Since I am adding something to new to the schema I believe it's required: https://github.com/jmarrero/coreos-assembler/blob/main/Makefile#L88

If I drop your second commit in a local checkout and then run make schema and cd mantle && go mod vendor I only end up with the expected files changed:

$ git diff --stat
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/build.go      |  8 ++------
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/cosa_v1.go    |  8 +++++++-
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/schema_doc.go | 25 ++++++++++++++++++++++++-
 pkg/builds/cosa_v1.go                                                     |  8 +++++++-
 pkg/builds/schema_doc.go                                                  | 25 ++++++++++++++++++++++++-
 5 files changed, 64 insertions(+), 10 deletions(-)

dustymabe avatar Oct 11 '22 03:10 dustymabe

Is the update to go modules (vendoring) required in this PR? Can we split it to a separate PR?

Since I am adding something to new to the schema I believe it's required: https://github.com/jmarrero/coreos-assembler/blob/main/Makefile#L88

If I drop your second commit in a local checkout and then run make schema and cd mantle && go mod vendor I only end up with the expected files changed:

$ git diff --stat
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/build.go      |  8 ++------
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/cosa_v1.go    |  8 +++++++-
 mantle/vendor/github.com/coreos/coreos-assembler/pkg/builds/schema_doc.go | 25 ++++++++++++++++++++++++-
 pkg/builds/cosa_v1.go                                                     |  8 +++++++-
 pkg/builds/schema_doc.go                                                  | 25 ++++++++++++++++++++++++-
 5 files changed, 64 insertions(+), 10 deletions(-)

It must be something in my environment, maybe an older golang, I tried again and I get the same result as the commit. I will try a new toolbox and see if the result is the same as yours.

jmarrero avatar Oct 11 '22 13:10 jmarrero

Only skimming the conversation here, I'd just note that while I'm all in favor of renaming the CLI etc. to call the old container "deprecated" in main/latest/4.12, I don't think we can change the meta.json schema for older OCP releases because that is used by other tools.

cgwalters avatar Oct 11 '22 13:10 cgwalters

Only skimming the conversation here, I'd just note that while I'm all in favor of renaming the CLI etc. to call the old container "deprecated" in main/latest/4.12, I don't think we can change the meta.json schema for older OCP releases because that is used by other tools.

That probably means that the original oscontainer entry is needed by them as well not the artifact entry.

jmarrero avatar Oct 11 '22 13:10 jmarrero

That probably means that the original oscontainer entry is needed by them as well not the artifact entry.

Right, they don't need the artifact entry. But it doesn't hurt to add it, and I do think it makes sense to do that because as discussed it makes the whole "build and push" flow symmetrical.

IOW to summarize:

  • Don't change the oscontainer meta.json schema
  • Do add a new legacy-oscontainer or whatever artifact entry

WDYT?

cgwalters avatar Oct 11 '22 14:10 cgwalters