slsa discussion: How to interpret "was built from the official source and build process" in Build L3

Moving the discussion on the slsa-github-generator repository here for broader visibility.

According to SLSA v1.0, Build L3: Hardened builds

Provides strong confidence that the package was built from the official source and build process.

Depending on how we interpret this sentence, the following provenance generators on slsa-github-generator would be classified as Build L2 instead of L3 as @laurentsimon pointed out:

GitHub workflows: users can clone any repo, so we can't guarantee it (that's why you're asking if it should be labelled L2)
GCB: GCB clones the repo on behalf of users (based on webhook info). But the build script could be doing anything, e.g. download a binary without building from source code
BYOB: similar to GCB, BYOB clones the repo on behalf of users

What is the definition of "source" in the SLSA v1.0 specification? Is it fair to classify the above generators as Build L2?

Jul 21 '23 06:07 behnazh-w

I feel like these could still be classified as Build L3 because they should work properly for "well-intentioned" builds. The provenance SHOULD be complete, but the build system might not have the ability to ensure that it is complete (until a future higher level).

We mentioned this in the isolated requirement but I can see that it might extend to the source requirements as well.

There are no sub-requirements on the build itself. Build L3 is limited to ensuring that a well-intentioned build runs securely. It does not require that a build platform prevents a producer from performing a risky or insecure build. In particular, the “Isolated” requirement does not prohibit a build from calling out to a remote execution service or a “self-hosted runner” that is outside the trust boundary of the build platform.

In these GitHub actions, is it true that some verifier would be able to see the script that produced the artifact and therefore any potential remote clone that influenced the build? As long as the provenance can accurately represent the source that triggered the build, that seems reasonable to me. Any additional git dependencies that the build needs and clones manually could be classified as incomplete resolvedDependencies.
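To make the "incomplete resolvedDependencies" idea concrete, here is a minimal sketch in Python. The field names (buildDefinition, externalParameters, resolvedDependencies, runDetails) follow the SLSA v1.0 provenance format; every value, the example-org and other-org repositories, the build type URL, and the unrecorded_fetches helper are hypothetical and only illustrate the point. It assumes a verifier has read the build script and listed what it clones.

```python
# Hypothetical, trimmed SLSA v1.0 provenance predicate. All values are
# illustrative placeholders, not output from a real generator.
provenance = {
    "buildDefinition": {
        "buildType": "https://example.com/buildtypes/workflow/v1",
        "externalParameters": {
            "workflow": {
                "repository": "https://github.com/example-org/example",
                "ref": "refs/heads/main",
                "path": ".github/workflows/build.yaml",
            }
        },
        # Only the source that triggered the build is recorded here.
        "resolvedDependencies": [
            {
                "uri": "git+https://github.com/example-org/example@refs/heads/main",
                "digest": {"gitCommit": "0" * 40},
            }
        ],
    },
    "runDetails": {"builder": {"id": "https://example.com/builder/v1"}},
}


def recorded_dependency_uris(predicate):
    """Collect the URIs listed under resolvedDependencies."""
    deps = predicate["buildDefinition"].get("resolvedDependencies", [])
    return {dep["uri"] for dep in deps if "uri" in dep}


def unrecorded_fetches(predicate, fetched_uris):
    """Return the fetched URIs that the provenance does not record.

    fetched_uris is assumed to come from a reviewer reading the build
    script; the provenance itself cannot supply it.
    """
    recorded = recorded_dependency_uris(predicate)
    return sorted(uri for uri in fetched_uris if uri not in recorded)


# The build script also clones a second repository by hand.
seen_in_script = [
    "git+https://github.com/example-org/example@refs/heads/main",
    "git+https://github.com/other-org/lib@refs/heads/main",
]
missing = unrecorded_fetches(provenance, seen_in_script)
```

In this sketch the provenance is still accurate about the triggering source; the manually cloned repository simply shows up as a gap in resolvedDependencies rather than as a false statement.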

Jul 21 '23 12:07 arewm

I think there are two problems:

Use of the word "official" without any mention of verification or expectations. Previously we had verification as part of the requirements (1127029, de5117a) but later decided to remove it (84bb75b). Looks like we forgot to update this bullet.
Use of the word "source" without a definition. Elsewhere we have avoided "source" due to this confusion. Again, looks like we forgot to address this one here.

A potential fix is to split:

Provides strong confidence that the package was built from the official source and build process.

Into two bullets:

Provides strong confidence in the accuracy [correctness? trustworthiness?] of the package's provenance.

With verification, prevents building from unofficial or unexpected build inputs.

Rationale:

The first bullet is about transparency, i.e. understanding where the artifact came from. We should be mindful that Build L3 is not "complete" whereas a future Build L4 (or above?) would likely be complete.
The second bullet is about control or tamper prevention, but that requires an expectation of what "official" is. The wording mimics the equivalent Build L1 bullet: "With verification, prevents mistakes..."
Switch from "source and build process" to "build inputs" to avoid confusion around what "source" means. I don't love it, but it seems like it causes enough confusion that we should avoid "source".

Thoughts?

Jul 21 '23 17:07 MarkLodato

The second bullet seems too specific to me for level 3 since no guarantees can be made about the build itself. At L3, the only hardening being done is on the build platform, and no additional requirements are placed on the producer. L3 can produce strong confidence in the correctness of the provenance, but if the build subverts the provenance by cloning another repo to build, that is outside of the controls presented.

Jul 21 '23 17:07 arewm

What if we say "top-level build inputs"? Would that fix it?

I do think it makes sense for Build L3. We are not making any claims about what is done after the workload starts or what git repos are actually cloned during the build. Rather, we're only saying that the top-level inputs (e.g. the repo and path for the GitHub Actions Workflow) were as expected. Those top-level inputs are the most critical ones; if they are wrong, all bets are off. That's why I think Build L3 has value.

For example, suppose you want to compromise NPM package "foo". The NPM ACL has 20 accounts on it. The package is built by workflow github.com/foo-org/foo/.github/workflows/build.yaml. That workflow depends on git repos X, Y, and Z as well as NPM packages A, B, C, and D.

Compromising one of those 20 accounts is the most direct attack (C/F in our diagram), and comparing Build L3 provenance to expectations stops it. It does not address compromise of dependencies (D), but we are planning to tackle that another day.
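A rough sketch of what "comparing Build L3 provenance to expectations" could look like for the "foo" example, in Python. It assumes the provenance signature has already been verified and that externalParameters carries a workflow object with repository and path keys; the Expectations class, the check_top_level_inputs function, the builder ID, and the attacker repository are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectations:
    """Top-level inputs the consumer expects (hypothetical shape)."""
    builder_id: str
    repository: str
    workflow_path: str


def check_top_level_inputs(predicate, expected):
    """Return a list of mismatches; an empty list means expectations are met.

    Only top-level inputs are compared. What the build clones or downloads
    after it starts is deliberately not examined.
    """
    problems = []
    builder_id = predicate.get("runDetails", {}).get("builder", {}).get("id")
    if builder_id != expected.builder_id:
        problems.append(f"unexpected builder: {builder_id}")

    params = predicate.get("buildDefinition", {}).get("externalParameters", {})
    workflow = params.get("workflow", {})
    if workflow.get("repository") != expected.repository:
        problems.append(f"unexpected repository: {workflow.get('repository')}")
    if workflow.get("path") != expected.workflow_path:
        problems.append(f"unexpected workflow path: {workflow.get('path')}")
    return problems


def make_predicate(repository):
    """Build a trimmed, illustrative provenance predicate for one repository."""
    return {
        "buildDefinition": {
            "externalParameters": {
                "workflow": {
                    "repository": repository,
                    "path": ".github/workflows/build.yaml",
                }
            }
        },
        "runDetails": {"builder": {"id": "https://example.com/builder/v1"}},
    }


EXPECTED = Expectations(
    builder_id="https://example.com/builder/v1",
    repository="https://github.com/foo-org/foo",
    workflow_path=".github/workflows/build.yaml",
)

# Built by the expected workflow in the expected repository.
legit = make_predicate("https://github.com/foo-org/foo")
# Published with a stolen account, but built from the attacker's own repo.
hijacked = make_predicate("https://github.com/attacker/foo")

legit_problems = check_top_level_inputs(legit, EXPECTED)
hijacked_problems = check_top_level_inputs(hijacked, EXPECTED)
```

Under these assumptions the hijacked upload is rejected because its repository does not match, while a compromised dependency such as X or A would pass unnoticed, which matches the scope described above.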

Jul 21 '23 18:07 MarkLodato

+1. Something like top-level build inputs would make sense, but I am not convinced by the wording. Looking at "what's new", it seems like we might have lost something that would make this information easier to convey:

These changes include: renaming, simplifying, and merging some requirements; removing the redundant “scripted build” and “config as code” requirements;

If the requirements still referred to scripted build or config as code, we could state that this is the top-level build input that defines the build configuration. I think this requirement was implied even for Build L1, but it isn't explicitly mentioned.

Jul 24 '23 16:07 arewm

Discussed in 2023-07-24 spec meeting. Sounds like we want to:

Clarify the levels.md bullet in question according to suggestion from https://github.com/slsa-framework/slsa/issues/926#issuecomment-1646035110.
Clarify in whats-new.md and elsewhere where "scripted build" and "config as code" went and how external parameters are the new equivalent.

Jul 24 '23 16:07 MarkLodato

Thanks @MarkLodato and @arewm . I'm trying to differentiate Build L2 and L3 based on the suggestions in https://github.com/slsa-framework/slsa/issues/926#issuecomment-1646035110.

With the first Build L3 provenance trustworthiness/accuracy bullet, the main additional benefit over L2 is that insider threats and project maintainers themselves cannot tamper with the provenance. If that's correct, it would be helpful to spell it out?

For the second bullet, compare the new verification requirement for Build L3 with the verification in Build L2, which is:

Downstream verification of provenance includes validating the authenticity of the provenance.

Build L3 takes it one step further from verifying the authenticity of the provenance, and also checks the top-level inputs against expectations. For instance, we can accurately (verifiably) answer questions like:

Which build platform has built this artifact?

Jul 25 '23 01:07 behnazh-w

With the first Build L3 provenance trustworthiness/accuracy bullet, the main additional benefit over L2 is that insider threats and project maintainers themselves cannot tamper with the provenance. If that's correct, it would be helpful to spell it out?

Could you be more specific? Do you mean replace "insider threats" with "project maintainers themselves", or that the existing first bullet and the proposed one from the comment overlap, or something else?

Build L3 takes it one step further from verifying the authenticity of the provenance, and also checks the top-level inputs against expectations.

To clarify, there is no requirement at L3 (or any level) to actually check the top-level inputs against expectations. Rather, if you do that, then you get the benefits of protecting against mistakes (at L1) and attacks (at L3). (I don't love this complication, but it's the best we could do for v1. I'm hoping that we can streamline this in a future version of the specification.)

Jul 25 '23 13:07 MarkLodato

Could you be more specific? Do you mean replace "insider threats" with "project maintainers themselves", or that the existing first bullet and the proposed one from the comment overlap, or something else?

My bad, I forgot about the existing first bullet, which already mentions "insider threats".

To clarify, there is no requirement at L3 (or any level) to actually check the top-level inputs against expectations. Rather, if you do that, then you get the benefits of protecting against mistakes (at L1) and attacks (at L3). (I don't love this complication, but it's the best we could do for v1. I'm hoping that we can streamline this in a future version of the specification.)

Makes sense. Thanks for the clarification.

Jul 26 '23 03:07 behnazh-w