SLSA source track: clarify definition of "contributor" and recommend best practices for SCPs.

Thanks @steiza and @zachariahcox! This is a great start. My team has thought about this topic a great deal and I'm happy to share those thoughts as well (they match fairly well with what has been said above).

But before I do that, let's figure out the best way to collaborate here. Should we discuss these ideas via GitHub Issues? GitHub Wiki? Google Doc? Something else?

My inclination is to use a Google Doc or wiki so that people can have threads and iterate on ideas. A comment thread is pretty hard to follow.

Either way, I think we might want to break this down along the following lines:

What high-level, hand-wavy guarantees might we care about, and how do we organize them into a meaningful set of levels? For example (building on ideas from @steiza):

All contributions can be traced to one or more strongly authenticated authors

All contributions went through multi-party review and approval (there are probably degrees of strength here, including number of reviewers, changes after approval, whether it can be bypassed, etc.)

All source code was retained for at least X period of time (for incident response, investigations, auditing, etc.)

All contributions passed some automated check I care about (e.g. DCO)

Eventually we'll want to aggregate these into a single "theme" for the track. But that might come later.

How do we translate those high-level ideas into concrete requirements? Here is where we would answer the nitty gritty questions, such as:

What is the subject of the thing that has a level (a commit, a repo, etc.)?

What range of time will source attestations cover?

How do we decide what contributions to include?

Who "contributed" a line of code?

What about changes contributed by a robot?

Who attests to this information? The code-hosting server?

How does this information propagate (attestation formats, storage, and APIs)?

These two pieces will necessarily influence each other, but they can happen in parallel. The reason I think it might be valuable to split them is that it's hard to have conversations at two very different levels of abstraction.

Originally posted by @MarkLodato in #956

Sep 30 '24 14:09 zachariahcox

The current state of the project seems to answer @MarkLodato's questions like this:

1: What high-level, hand-wavy guarantees might we care about?

SLSA itself doesn't enforce policy; it ensures that tamper-resistant data can be produced during various phases of the SDLC. VSAs are the characters that have policy opinions and care about the contents of attestations. To help guide VSA authors, we have provided some down-to-earth, old-fashioned, grandma's-secret-recipe-style guidance in the "verifying-systems" and "verifying-source" documents, but they don't answer the same question.

Our guidance documents seem to answer the questions about the domain itself.

how do domain experts think about this problem?
which threats do they consider most serious?
what combinations of tools actually help mitigate the risks from those threats?
how do those combinations actually mitigate the risk?

With this advice, an organization is well-positioned to choose combinations of policies to apply to its process. The policy enforcement would happen via a VSA or some other kind of "gate" feature.
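To make the "gate" idea concrete, here is a minimal sketch of how a consumer might check a VSA against the levels its organization requires. It is only an illustration: the `Vsa` class, the `gate_allows` function, and the level strings are hypothetical names chosen for this example, loosely modeled on the `verificationResult` and `verifiedLevels` fields of a VSA predicate. It also assumes the VSA's signature and verifier identity have already been checked elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vsa:
    """Hypothetical, simplified view of a Verification Summary Attestation."""
    resource_uri: str            # what was verified, e.g. a repo or revision URI
    verification_result: str     # "PASSED" or "FAILED"
    verified_levels: frozenset   # levels the verifier says were met


def gate_allows(vsa: Vsa, required_levels: set) -> bool:
    """Return True only if the VSA passed and covers every required level.

    The organization's policy is expressed here as the set of required levels.
    """
    if vsa.verification_result != "PASSED":
        return False
    return set(required_levels) <= set(vsa.verified_levels)


# Example: level names are illustrative assumptions, not quoted from the spec.
example = Vsa(
    resource_uri="git+https://example.com/org/repo",
    verification_result="PASSED",
    verified_levels=frozenset({"SLSA_SOURCE_LEVEL_2"}),
)
print(gate_allows(example, {"SLSA_SOURCE_LEVEL_2"}))
```

The point of the sketch is the division of labor: the attestation carries tamper-resistant facts, and the organization's opinion lives entirely in the `required_levels` argument.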

2: How do we translate those high-level ideas into concrete requirements?

Based on the content of the source-requirements document, I believe these are the answers:

Q: What is the subject of the thing that has a level (a commit, a repo, etc.)? A: It's the revision (the commit in git terms). A git repo has many revisions, all of which could contain completely different content.

Q: What range of time will source attestations cover? A: The part of the SCS that controls the introduction of new revisions can issue attestations about the process that created those revisions. On GitHub, this would typically be the pull request application.

Q: How do we decide what contributions to include? A: This may vary depending on the underlying VCS tech. It should be only the content that was subjected to the full process. For git, in practice it should be the diff that was reviewed, squashed into a new commit by a trusted application.

Q: Who "contributed" a line of code? A: This is defined by the SCS. There are multiple ways to do this safely.

Q: What about changes contributed by a robot? A: Robots are not special by default.

Q: Who attests to this information? A: The authoritative source as defined by the SCS. In practice, this should virtually always be the change revision tooling controlled by the canonical repository server. For teams that rely on commit metadata contents, this information may be distributed elsewhere.

Q: How does this information propagate (attestation formats, storage, and APIs)? A: Defined by the SCS (kind of a weak answer), but the important thing is that the relevant VSA can access it.
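A few of these answers can be pictured together in a rough sketch: the subject is a single revision, the attestation is only trusted when it comes from the authoritative issuer, and robots get no special treatment. Everything here is hypothetical and for illustration only. `Contribution`, `meets_review_policy`, and the field names are not taken from the source-requirements document, and a real SCS may define "contributor" quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """Hypothetical record of what an SCS might attest about one revision."""
    revision: str      # the subject: a single commit id, not the whole repo
    author: str        # whoever the SCS says contributed the change
    approvers: tuple   # identities that approved the reviewed diff
    issued_by: str     # the tooling that produced this attestation


def meets_review_policy(c: Contribution, trusted_issuer: str,
                        min_approvers: int = 1) -> bool:
    """Check one revision against a simple multi-party review rule."""
    # Only trust statements from the authoritative source for this repo.
    if c.issued_by != trusted_issuer:
        return False
    # An author cannot count as their own reviewer. The same rule applies
    # whether the author is a person or a bot: robots are not special.
    independent = {a for a in c.approvers if a != c.author}
    return len(independent) >= min_approvers


bot_change = Contribution(
    revision="0123abc",
    author="dependency-bot",
    approvers=("dependency-bot",),
    issued_by="canonical-pr-tooling",
)
print(meets_review_policy(bot_change, "canonical-pr-tooling"))
```

In this sketch the bot's self-approved change is rejected for exactly the same reason a human's self-approved change would be.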

Oct 02 '24 20:10 zachariahcox

I think we're in a place where we can close this issue. These things are largely addressed by the current draft language and apply to the track as a whole.

I'm tempted to close this issue and address any deficiencies in the track during the RC review process.

Thoughts?

Jun 02 '25 15:06 TomHennen

@TomHennen makes sense to me

Jun 02 '25 19:06 mlieberman85

Good enough for me. We can reopen if anyone thinks otherwise.

Jun 02 '25 20:06 TomHennen