
Provenance: add optional URI and digest for build logs

Open shaunmlowry opened this issue 3 years ago • 4 comments

Currently, the spec for predicate metadata includes:

metadata.buildInvocationId string, optional

Identifies this particular build invocation, which can be useful for finding associated logs or other ad-hoc analysis. The exact meaning and format is defined by builder.id; by default it is treated as opaque and case-sensitive. The value SHOULD be globally unique.

Seems like logs are useful not just for debugging, but also for verifying the actions laid out elsewhere in the attestation were actually those taken. I'd like to suggest that logs get special treatment in the metadata section such that they include a URI and a digest similar to other referenced objects, e.g.:

"logs": [
    {
        "uri": "scheme://some/path",
        "digest": {
            "sha256": "eb158a15263554e65cab9dfd6f2a640b434d17e7e6af8f40435707662f88234a"
        }
    }
]
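To illustrate how a consumer might use such an entry, here is a minimal Python sketch that checks downloaded log bytes against the digests listed in a `logs` item. The `logs` field is only the proposal above, not part of the spec; the function name `verify_log_digest`, the `SUPPORTED_ALGORITHMS` set, and the sample log content are assumptions made for illustration, and fetching the log from its URI is left out.

```python
import hashlib
import hmac

# Assumption: only these algorithms are accepted in this sketch.
SUPPORTED_ALGORITHMS = {"sha256", "sha512"}


def verify_log_digest(log_bytes: bytes, log_entry: dict) -> bool:
    """Return True only if every digest listed in log_entry matches log_bytes."""
    digests = log_entry.get("digest") or {}
    if not digests:
        return False  # no digest present, so nothing can be verified
    for algorithm, expected_hex in digests.items():
        if algorithm not in SUPPORTED_ALGORITHMS:
            return False  # unknown algorithm: treat the entry as unverifiable
        actual_hex = hashlib.new(algorithm, log_bytes).hexdigest()
        # Constant-time comparison of the two hex strings.
        if not hmac.compare_digest(actual_hex, expected_hex.lower()):
            return False
    return True


# Hypothetical log content and the matching entry a builder might emit.
log_bytes = b"step 1: fetch sources\nstep 2: compile\n"
log_entry = {
    "uri": "scheme://some/path",
    "digest": {"sha256": hashlib.sha256(log_bytes).hexdigest()},
}

print(verify_log_digest(log_bytes, log_entry))
print(verify_log_digest(b"tampered log", log_entry))
```

The check fails closed: an entry with no digest, or with an algorithm the verifier does not recognise, is reported as unverified rather than silently accepted.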

shaunmlowry avatar Jun 21 '22 21:06 shaunmlowry

Great discussion topic. AIUI the provenance specification does attempt to ensure that there's enough information captured to verify the actions taken. This might be implicitly through invocation.configSource and the repository it describes or explicitly through buildConfig.

Do you think that logs would provide additional information for verification on top of the above? Would this verification be more about being able to check the build service executed the invocation as expected?

Aside: one thing the spec does allow for, which is easy to miss, is extensibility. Both by having several "arbitrary JSON object with a schema defined by buildType." (invocation.parameters, invocation.environment and buildConfig) and through the parsing rules explicitly defining how to add extension fields:

Producers MAY add extension fields using field names that are URIs.
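As a rough sketch of what that rule could look like in practice, the Python below attaches log references under a URI-named field and shows how a consumer might tell extension fields apart from standard ones. The field name `https://example.com/slsa-extensions/logs`, the helper `is_extension_field`, the placement of the field at the top level of the predicate, and the "scheme plus host" test for "is a URI" are all assumptions for illustration, not something the spec defines.

```python
from urllib.parse import urlparse

# Hypothetical extension field name; any URI the producer controls would do.
LOGS_EXTENSION = "https://example.com/slsa-extensions/logs"


def is_extension_field(name: str) -> bool:
    """Treat a field name as an extension if it parses as a URI with scheme and host."""
    parsed = urlparse(name)
    return bool(parsed.scheme and parsed.netloc)


# A trimmed-down predicate with one standard field and one extension field.
predicate = {
    "metadata": {"buildInvocationId": "build-1234"},  # illustrative value
    LOGS_EXTENSION: [
        {
            "uri": "scheme://some/path",
            "digest": {
                "sha256": "eb158a15263554e65cab9dfd6f2a640b434d17e7e6af8f40435707662f88234a"
            },
        }
    ],
}

extension_fields = [name for name in predicate if is_extension_field(name)]
print(extension_fields)
```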

joshuagl avatar Jun 21 '22 21:06 joshuagl

I believe the goal is to aid in debugging and auditing. I've heard from others that it is valuable to link to "evidence" of the attestation, where logs would be perhaps the most common form of evidence. This is similar to the original in-toto link format's byproducts: other outputs that are not the main output of the build.

I'm open to adding it as an optional field.

Other alternatives to adding it to the provenance predicate:

  • Add it to the statement (most likely as evidence or similar).
  • Generalize the notion of relationships: https://github.com/in-toto/attestation/issues/6.

MarkLodato avatar Jun 22 '22 14:06 MarkLodato

Do you think that logs would provide additional information for verification on top of the above? Would this verification be more about being able to check the build service executed the invocation as expected?

Exactly. This is useful when doing either a preventative or diagnostic forensic investigation of the claims made in the attestation. It's even more useful if the logs are non-falsifiable, hence the suggestion of adding a digest for them.

one thing the spec does allow for, which is easy to miss, is extensibility

The metadata field seems not to be one of the explicitly extensible fields.

Producers MAY add extension fields using field names that are URIs.

I'd really like this to be a field that contains both the URI and DigestSet. Maybe we need another field type to describe external resources like this, one that contains both a URI and an optional DigestSet, given that a number of fields already fall into this category (materials, subject(?), configSource) and there are others to which it might also apply (logs, policies in VSAs, etc.).
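A shared field type along those lines could be sketched as follows. The class name `ExternalResource` and its `to_json` method are hypothetical, chosen only to illustrate the "URI plus optional DigestSet" shape; they are not taken from the spec.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ExternalResource:
    """Hypothetical shared type: a URI plus an optional DigestSet."""

    uri: str
    # DigestSet modelled as algorithm name -> lowercase hex digest.
    digest: Optional[Dict[str, str]] = None

    def to_json(self) -> dict:
        """Serialise to a plain dict, omitting the digest when it is absent."""
        out: dict = {"uri": self.uri}
        if self.digest:
            out["digest"] = dict(self.digest)
        return out


# The same type could describe a log with a digest...
log = ExternalResource(
    uri="scheme://some/path",
    digest={
        "sha256": "eb158a15263554e65cab9dfd6f2a640b434d17e7e6af8f40435707662f88234a"
    },
)
# ...or a resource for which only a location is known.
policy = ExternalResource(uri="scheme://some/policy")

print(log.to_json())
print(policy.to_json())
```

Keeping the digest optional lets one type cover both resources that can be pinned by hash and those that can only be pointed at.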

Add it to the statement (most likely as evidence or similar).

I really like this idea, but I think it needs more discussion, especially regarding what else should be included in this field and how it relates to the intent of the other content of the metadata field.

shaunmlowry avatar Jun 22 '22 21:06 shaunmlowry

Thanks both for articulating the need. Thinking of some of the other predicate types we've observed in the wild (see https://github.com/in-toto/attestation/issues/98) I can picture this being in the statement layer.

For example, a vulnerability attestation might include only a high-level report, with the evidence linking to the full scan results.

joshuagl avatar Jul 01 '22 16:07 joshuagl

I've filed https://github.com/in-toto/attestation/issues/114 against the in-toto/attestation repo to discuss the idea of including an evidence field in the statement.

joshuagl avatar Oct 19 '22 15:10 joshuagl

This is now present as the byproducts field.

MarkLodato avatar Feb 06 '23 17:02 MarkLodato