Provenance

To trace software back to the source and define the moving parts in a complex supply chain, provenance needs to be there from the very beginning. It’s the verifiable information about software artifacts describing where, when, and how something was produced. For higher SLSA levels and more resilient integrity guarantees, provenance requirements are stricter and need a deeper, more technical understanding of the predicate.

This document defines the following predicate type within the in-toto attestation framework:

"predicateType": "https://slsa.dev/provenance/v1"

Important: Always use the above string for predicateType rather than what is in the URL bar. The predicateType URI will always resolve to the latest minor version of this specification. See parsing rules for more information.
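As a minimal sketch of the rule above, a consumer might compare predicateType against the exact string before reading the predicate. The helper name is_slsa_v1_provenance and the truncated example statement are hypothetical and are not defined by this specification.

```python
import json

# The exact string given in this document; it is not derived from any page URL.
SLSA_PROVENANCE_V1 = "https://slsa.dev/provenance/v1"


def is_slsa_v1_provenance(statement: dict) -> bool:
    """Return True if the statement declares the SLSA v1 provenance predicate."""
    return statement.get("predicateType") == SLSA_PROVENANCE_V1


# Hypothetical, heavily truncated statement used only for illustration.
example = json.loads("""
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {}
}
""")

print(is_slsa_v1_provenance(example))
```

An exact string comparison is used here because the identifier is an opaque type URI rather than a location to be normalized or fetched.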

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

Purpose

Describe how an artifact or set of artifacts was produced so that:

This predicate is the RECOMMENDED way to satisfy the SLSA v1.0 provenance requirements.

Model

Provenance is an attestation that a particular build platform produced a set of software artifacts through execution of the buildDefinition.

Build Model

The model is as follows:

For concrete examples, see index of build types.

Parsing rules

This predicate follows the in-toto attestation parsing rules. Summary:

Schema

Summary

NOTE: This summary (in cue) is informative. In the event of a disagreement with the text description, the text is authoritative.

{
    // Standard attestation fields:
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [...],

    // Predicate:
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {
        "buildDefinition": {
            "buildType": string,
            "externalParameters": object,
            "internalParameters": object,
            "resolvedDependencies": [ ...#ResourceDescriptor ],
        },
        "runDetails": {
            "builder": {
                "id": string,
                "builderDependencies": [ ...#ResourceDescriptor ],
                "version": { ...string },
            },
            "metadata": {
                "invocationId": string,
                "startedOn": #Timestamp,
                "finishedOn": #Timestamp,
            },
            "byproducts": [ ...#ResourceDescriptor ],
        }
    }
}

#ResourceDescriptor: {
    "uri": string,
    "digest": {
        "sha256": string,
        "sha512": string,
        "gitCommit": string,
        [string]: string,
    },
    "name": string,
    "downloadLocation": string,
    "mediaType": string,
    "content": bytes, // base64-encoded
    "annotations": {
        [string]: _  // any JSON type
    }
}

#Timestamp: string  // <YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z
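The #Timestamp format in the summary above can be produced and parsed with the Python standard library. This is a sketch only: the helper names to_timestamp and parse_timestamp are hypothetical, and it assumes whole-second precision in UTC, as the format comment suggests.

```python
from datetime import datetime, timedelta, timezone

# Matches <YYYY>-<MM>-<DD>T<hh>:<mm>:<ss>Z from the schema summary.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp string."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text: str) -> datetime:
    """Parse a timestamp string back into an aware UTC datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


# A build that started at 05:04:05 in a UTC+2 zone is recorded in UTC.
started = datetime(2023, 1, 2, 5, 4, 5, tzinfo=timezone(timedelta(hours=2)))
print(to_timestamp(started))
```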

Protocol buffer schema

NOTE: This summary (in protobuf) is informative. In the event of a disagreement with the text description, the text is authoritative.

Link: provenance.proto

NOTE: This protobuf definition prioritizes being a human-readable summary of the schema for readers of the specification. A version of the protobuf definition useful for code generation is maintained in the in-toto attestation repository.

syntax = "proto3";

package slsa.v1;

import "google/protobuf/struct.proto";
import "google/protobuf/timestamp.proto";

// NOTE: While this file uses snake_case as per the Protocol Buffers Style
// Guide, the provenance is always serialized using JSON with lowerCamelCase.
// Protobuf tooling performs this case conversion automatically.

message Provenance {
  BuildDefinition build_definition = 1;
  RunDetails run_details = 2;
}

message BuildDefinition {
  string build_type = 1;
  google.protobuf.Struct external_parameters = 2;
  google.protobuf.Struct internal_parameters = 3;
  repeated ResourceDescriptor resolved_dependencies = 4;
}

message ResourceDescriptor {
  string uri = 1;
  map<string, string> digest = 2;
  string name = 3;
  string download_location = 4;
  string media_type = 5;
  bytes content = 6;
  map<string, google.protobuf.Value> annotations = 7;
}

message RunDetails {
  Builder builder = 1;
  BuildMetadata metadata = 2;
  repeated ResourceDescriptor byproducts = 3;
}

message Builder {
  string id = 1;
  map<string, string> version = 2;
  repeated ResourceDescriptor builder_dependencies = 3;
}

message BuildMetadata {
  string invocation_id = 1;
  google.protobuf.Timestamp started_on = 2;
  google.protobuf.Timestamp finished_on = 3;
}
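The note in the protobuf definition says that snake_case field names are serialized as lowerCamelCase JSON keys. Protobuf tooling does this itself; the sketch below only illustrates the mapping for the field names in this schema. The function name to_lower_camel is hypothetical.

```python
def to_lower_camel(name: str) -> str:
    """Convert a snake_case field name to its lowerCamelCase JSON key."""
    head, *rest = name.split("_")
    # Keep the first word as is and capitalize each following word.
    return head + "".join(part.capitalize() for part in rest)


for field in ("build_definition", "resolved_dependencies", "invocation_id", "uri"):
    print(field, "->", to_lower_camel(field))
```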

Provenance

NOTE: This section describes the fields within predicate. For a description of the other top-level fields, such as subject, see Statement.

REQUIRED for SLSA Build L1: buildDefinition, runDetails

Field | Type | Description
buildDefinition | BuildDefinition | The input to the build. The accuracy and completeness are implied by runDetails.builder.id.
runDetails | RunDetails | Details specific to this particular execution of the build.

BuildDefinition

REQUIRED for SLSA Build L1: buildType, externalParameters

Field | Type | Description
buildType | string (TypeURI) | Identifies the template for how to perform the build and interpret the parameters and dependencies. The URI SHOULD resolve to a human-readable specification that includes: an overall description of the build type; the schema for externalParameters and internalParameters; unambiguous instructions for how to initiate the build given this BuildDefinition; and a complete example. Example: https://slsa-framework.github.io/github-actions-buildtypes/workflow/v1
externalParameters | object | The parameters that are under external control, such as those set by a user or tenant of the build platform. They MUST be complete at SLSA Build L3, meaning that there is no additional mechanism for an external party to influence the build. (At lower SLSA Build levels, the completeness MAY be best effort.) The build platform SHOULD be designed to minimize the size and complexity of externalParameters, in order to reduce fragility and ease verification. Consumers SHOULD have an expectation of what “good” looks like; the more information that they need to check, the harder that task becomes. Verifiers SHOULD reject unrecognized or unexpected fields within externalParameters.
internalParameters | object | The parameters that are under the control of the entity represented by builder.id. The primary intention of this field is for debugging, incident response, and vulnerability management. The values here MAY be necessary for reproducing the build. There is no need to verify these parameters because the build platform is already trusted, and in many cases it is not practical to do so.
resolvedDependencies | array (ResourceDescriptor) | Unordered collection of artifacts needed at build time. Completeness is best effort, at least through SLSA Build L3. For example, if the build script fetches and executes “example.com/foo.sh”, which in turn fetches “example.com/bar.tar.gz”, then both “foo.sh” and “bar.tar.gz” SHOULD be listed here.
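The guidance that verifiers SHOULD reject unrecognized or unexpected fields within externalParameters could be sketched as a comparison against an expected set of values. The function name check_external_parameters and the idea of a flat "expected" dictionary are assumptions made for illustration; real policies may be richer.

```python
from typing import List


def check_external_parameters(actual: dict, expected: dict) -> List[str]:
    """Return a list of problems; an empty list means the parameters match."""
    problems = []
    # Fields the verifier did not anticipate are rejected outright.
    for name in sorted(actual.keys() - expected.keys()):
        problems.append(f"unexpected parameter: {name}")
    # Fields the verifier expected but the provenance lacks.
    for name in sorted(expected.keys() - actual.keys()):
        problems.append(f"missing parameter: {name}")
    # Fields present on both sides must carry the expected value.
    for name in sorted(actual.keys() & expected.keys()):
        if actual[name] != expected[name]:
            problems.append(f"mismatched parameter: {name}")
    return problems


# Hypothetical expectation for a single repository and ref.
expected = {"repository": "https://github.com/octocat/hello-world", "ref": "refs/heads/main"}
actual = dict(expected, debug="true")
print(check_external_parameters(actual, expected))
```

Keeping externalParameters small, as the description recommends, keeps an expectation table like this one short enough to review by hand.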

The BuildDefinition describes all of the inputs to the build. It SHOULD contain all the information necessary and sufficient to initialize the build and begin execution.

The externalParameters and internalParameters are the top-level inputs to the template, meaning inputs not derived from another input. Each is an arbitrary JSON object, though it is RECOMMENDED to keep the structure simple with string values to aid verification. The same field name SHOULD NOT be used for both externalParameters and internalParameters.
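A provenance producer or linter might check the SHOULD NOT above by looking for top-level names that appear in both objects. This is a sketch; the function name shared_parameter_names is hypothetical.

```python
def shared_parameter_names(build_definition: dict) -> set:
    """Return top-level names used in both externalParameters and internalParameters."""
    external = build_definition.get("externalParameters") or {}
    internal = build_definition.get("internalParameters") or {}
    return set(external) & set(internal)


# Hypothetical build definition in which "ref" is used on both sides.
definition = {
    "externalParameters": {"repository": "https://github.com/octocat/hello-world", "ref": "refs/heads/main"},
    "internalParameters": {"ref": "internal-value", "runner": "linux"},
}
print(shared_parameter_names(definition))
```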

The parameters SHOULD only contain the actual values passed in through the interface to the build platform. Metadata about those parameter values, particularly digests of artifacts referenced by those parameters, SHOULD instead go in resolvedDependencies. The documentation for buildType SHOULD explain how to convert from a parameter to the dependency uri. For example:

"externalParameters": {
    "repository": "https://github.com/octocat/hello-world",
    "ref": "refs/heads/main"
},
"resolvedDependencies": [{
    "uri": "git+https://github.com/octocat/hello-world@refs/heads/main",
    "digest": {"gitCommit": "7fd1a60b01f91b314f59955a4e4d4e80d8edf11d"}
}]
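To illustrate how a consumer could follow a parameter to its dependency, the sketch below derives the uri from the example's repository and ref and looks up the matching digest. The conversion rule ("git+" + repository + "@" + ref) is an assumption modelled on the example above; each buildType defines its own rule, and both function names are hypothetical.

```python
from typing import Optional


def source_dependency_uri(external_parameters: dict) -> str:
    """Assumed conversion rule, modelled on the example: git+<repository>@<ref>."""
    return f"git+{external_parameters['repository']}@{external_parameters['ref']}"


def find_source_digest(build_definition: dict) -> Optional[dict]:
    """Return the digest of the dependency that matches the source parameters."""
    wanted = source_dependency_uri(build_definition["externalParameters"])
    for dependency in build_definition.get("resolvedDependencies", []):
        if dependency.get("uri") == wanted:
            return dependency.get("digest")
    return None


build_definition = {
    "externalParameters": {
        "repository": "https://github.com/octocat/hello-world",
        "ref": "refs/heads/main",
    },
    "resolvedDependencies": [{
        "uri": "git+https://github.com/octocat/hello-world@refs/heads/main",
        "digest": {"gitCommit": "7fd1a60b01f91b314f59955a4e4d4e80d8edf11d"},
    }],
}
print(find_source_digest(build_definition))
```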

Guidelines:

RunDetails

REQUIRED for SLSA Build L1: builder

Field | Type | Description
builder | Builder | Identifies the build platform that executed the invocation, which is trusted to have correctly performed the operation and populated this provenance.
metadata | BuildMetadata | Metadata about this particular execution of the build.
byproducts | array (ResourceDescriptor) | Additional artifacts generated during the build that are not considered the “output” of the build but that might be needed during debugging or incident response. For example, this might reference logs generated during the build and/or a digest of the fully evaluated build configuration. In most cases, this SHOULD NOT contain all intermediate files generated during the build. Instead, this SHOULD only contain files that are likely to be useful later and that cannot be easily reproduced.

Builder

REQUIRED for SLSA Build L1: id

Field | Type | Description
id | string (TypeURI) | URI indicating the transitive closure of the trusted build platform. This is intended to be the sole determiner of the SLSA Build level. If a build platform has multiple modes of operations that have differing security attributes or SLSA Build levels, each mode MUST have a different builder.id and SHOULD have a different signer identity. This is to minimize the risk that a less secure mode compromises a more secure one. The builder.id URI SHOULD resolve to documentation explaining: the scope of what this ID represents; the claimed SLSA Build level; the accuracy and completeness guarantees of the fields in the provenance; any fields that are generated by the tenant-controlled build process and not verified by the trusted control plane, except for the subject; and the interpretation of any extension fields.
builderDependencies | array (ResourceDescriptor) | Dependencies used by the orchestrator that are not run within the workload and that do not affect the build, but might affect the provenance generation or security guarantees.
version | map (string→string) | Map of names of components of the build platform to their version.

The build platform, or builder for short, represents the transitive closure of all the entities that are, by necessity, trusted to faithfully run the build and record the provenance. This includes not only the software but the hardware and people involved in running the service. For example, a particular instance of Tekton could be a build platform, while Tekton itself is not. For more info, see Build model.

The id MUST reflect the trust base that consumers care about. How detailed to be is a judgement call. For example, GitHub Actions supports both GitHub-hosted runners and self-hosted runners. The GitHub-hosted runner might be a single identity because it’s all GitHub from the consumer’s perspective. Meanwhile, each self-hosted runner might have its own identity because not all runners are trusted by all consumers.

Consumers MUST accept only specific signer-builder pairs. For example, “GitHub” can sign provenance for the “GitHub Actions” builder, and “Google” can sign provenance for the “Google Cloud Build” builder, but “GitHub” cannot sign for the “Google Cloud Build” builder.
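The signer-builder pairing rule can be sketched as an explicit allowlist of pairs. The signer names and builder URIs below are hypothetical placeholders, not real identities, and how the signer identity is obtained from the envelope is outside this sketch.

```python
# Hypothetical allowlist of (signer identity, builder.id) pairs.
TRUSTED_PAIRS = frozenset({
    ("signer-a.example", "https://builder-a.example/hosted"),
    ("signer-b.example", "https://builder-b.example/cloud-build"),
})


def is_trusted_pair(signer: str, builder_id: str) -> bool:
    """Accept provenance only when this signer is allowed to speak for this builder."""
    return (signer, builder_id) in TRUSTED_PAIRS


# A known signer vouching for another signer's builder is rejected.
print(is_trusted_pair("signer-a.example", "https://builder-a.example/hosted"))
print(is_trusted_pair("signer-a.example", "https://builder-b.example/cloud-build"))
```

Checking the pair, rather than the signer and builder separately, is what prevents a trusted signer from vouching for a builder it does not operate.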

Design rationale: The builder is distinct from the signer in order to support the case where one signer generates attestations for more than one builder, as in the GitHub Actions example above. The field is REQUIRED, even if it is implicit from the signer, to aid readability and debugging. It is an object to allow additional fields in the future, in case one URI is not sufficient.

BuildMetadata

REQUIRED: (none)

Field | Type | Description
invocationId | string | Identifies this particular build invocation, which can be useful for finding associated logs or other ad-hoc analysis. The exact meaning and format are defined by builder.id; by default it is treated as opaque and case-sensitive. The value SHOULD be globally unique.
startedOn | string (Timestamp) | The timestamp of when the build started.
finishedOn | string (Timestamp) | The timestamp of when the build completed.
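Taken together, the "REQUIRED for SLSA Build L1" lines in the sections above name six field paths. A consumer might check for their presence as sketched below; the function name missing_required_fields is hypothetical, and the check covers presence only, not types or values.

```python
from typing import List

# Field paths marked REQUIRED for SLSA Build L1 in the sections above.
REQUIRED_PATHS = (
    ("buildDefinition",),
    ("buildDefinition", "buildType"),
    ("buildDefinition", "externalParameters"),
    ("runDetails",),
    ("runDetails", "builder"),
    ("runDetails", "builder", "id"),
)


def missing_required_fields(predicate: dict) -> List[str]:
    """Return dotted paths of REQUIRED fields that are absent from the predicate."""
    missing = []
    for path in REQUIRED_PATHS:
        node = predicate
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


# Hypothetical predicate that omits runDetails.builder.id.
predicate = {
    "buildDefinition": {"buildType": "https://example.com/buildtype/v1", "externalParameters": {}},
    "runDetails": {"builder": {}},
}
print(missing_required_fields(predicate))
```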

Extension fields

Implementations MAY add extension fields to any JSON object to describe information that is not captured in a standard field. Guidelines:

Verification

Please see Verifying Artifacts for a detailed discussion of provenance verification.

Index of build types

The following is a partial index of build type definitions. Each contains a complete example predicate.

To add an entry here, please send a pull request on GitHub.

Migrating from 0.2

To migrate from version 0.2 (old), use the following pseudocode. The meaning of each field is unchanged unless otherwise noted.

{
    "buildDefinition": {
        // The `buildType` MUST be updated for v1.0 to describe how to
        // interpret `resolvedDependencies`.
        "buildType": /* updated version of */ old.buildType,
        "externalParameters":
            old.invocation.parameters + {
            // It is RECOMMENDED to rename "entryPoint" to something more
            // descriptive.
            "entryPoint": old.invocation.configSource.entryPoint,
            // It is OPTIONAL to rename "source" to something more descriptive,
            // especially if "source" is ambiguous or confusing.
            "source": old.invocation.configSource.uri,
        },
        "internalParameters": old.invocation.environment,
        "resolvedDependencies":
            old.materials + [
            {
                "uri": old.invocation.configSource.uri,
                "digest": old.invocation.configSource.digest,
            }
        ]
    },
    "runDetails": {
        "builder": {
            "id": old.builder.id,
            "builderDependencies": null,  // not in v0.2
            "version": null,  // not in v0.2
        },
        "metadata": {
            "invocationId": old.metadata.buildInvocationId,
            "startedOn": old.metadata.buildStartedOn,
            "finishedOn": old.metadata.buildFinishedOn,
        },
        "byproducts": null,  // not in v0.2
    },
}
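The pseudocode above could be turned into a runnable function along the following lines. This is a sketch under assumptions: the function name migrate_v02_to_v1 is hypothetical, the caller supplies the updated buildType, the "entryPoint" and "source" names are kept as-is rather than renamed, and fields that do not exist in v0.2 are omitted instead of being set to null.

```python
def migrate_v02_to_v1(old: dict, new_build_type: str) -> dict:
    """Map a v0.2 provenance predicate onto the v1 layout."""
    invocation = old.get("invocation") or {}
    config_source = invocation.get("configSource") or {}
    metadata = old.get("metadata") or {}

    # externalParameters = old parameters plus the config source entry point and URI.
    external = dict(invocation.get("parameters") or {})
    external["entryPoint"] = config_source.get("entryPoint")
    external["source"] = config_source.get("uri")

    # resolvedDependencies = old materials plus the config source itself.
    dependencies = list(old.get("materials") or [])
    dependencies.append({
        "uri": config_source.get("uri"),
        "digest": config_source.get("digest"),
    })

    return {
        "buildDefinition": {
            "buildType": new_build_type,
            "externalParameters": external,
            "internalParameters": invocation.get("environment"),
            "resolvedDependencies": dependencies,
        },
        "runDetails": {
            # builderDependencies, version, and byproducts have no v0.2 source.
            "builder": {"id": (old.get("builder") or {}).get("id")},
            "metadata": {
                "invocationId": metadata.get("buildInvocationId"),
                "startedOn": metadata.get("buildStartedOn"),
                "finishedOn": metadata.get("buildFinishedOn"),
            },
        },
    }
```

The updated buildType is passed in explicitly because, as the pseudocode notes, it cannot be derived mechanically from the old value.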

The following fields from v0.2 are no longer present in v1.0:

Change history

v1.0

Major refactor to reduce misinterpretation, including a minor change in model.

Differences from RC1 and RC2:

v0.2

Refactored to aid clarity and added buildConfig. The model is unchanged.

rename: slsa.dev/provenance

Renamed to “slsa.dev/provenance”.

v0.1.1

v0.1

Initial version, named “in-toto.io/Provenance”

  1. The externalParameters SHOULD reflect reality. If clients send the evaluated configuration object directly to the build server, record the digest directly in externalParameters. If clients upload the configuration object to a temporary storage location and send that location to the build server, record the location in externalParameters as a URI and record the uri and digest in resolvedDependencies.