The In-Toto Framework

Securing individual components of a software supply chain, such as signing a commit or scanning a Docker image, is essential but not sufficient. A sophisticated attacker can leave the source code untouched and the final signature valid while compromising the build server in between to inject malicious code.

In-toto (Latin for “in whole” or “in total”) is an open-source framework designed to solve this specific problem. It provides a mechanism to verify the integrity of the entire software supply chain, ensuring that not only are the ingredients safe, but the “recipe” was followed exactly as intended.

It allows project owners to define a layout (the rules) for how software should be built, and requires every actor in the chain to provide links (evidence) that they followed those rules.


The Problem: The Chain of Custody

In a complex CI/CD pipeline, a file passes through many hands (developers, build scripts, packagers, servers). If you only verify the end result, you have a blind spot regarding how that result was achieved.

Consider a standard pipeline:

  1. Developer pushes code (Source).
  2. CI Server builds binary (Build).
  3. CD Server packages binary (Package).

If an attacker compromises the CI Server, they can take the legitimate source code, compile a malicious binary, and pass it to the CD server. The CD server, seeing a binary coming from the trusted CI server, packages and signs it. The final artifact looks legitimate, but it is compromised.

In-toto solves this by requiring a cryptographically verifiable chain of custody. It ensures that step 2 actually used the output from step 1, and step 3 used the output from step 2, without any unauthorized tampering in between.
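
The chaining idea can be illustrated with plain digests. The following is a minimal sketch, assuming SHA-256 and in-memory byte strings standing in for files; the function and variable names are illustrative and not part of in-toto.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def handoff_intact(recorded_product_digest: str, received: bytes) -> bool:
    """Check that what a step received is what the previous step recorded producing."""
    return digest(received) == recorded_product_digest


# The build step records the digest of the binary it produced.
binary = b"legitimate binary"
recorded = digest(binary)

# The package step compares what it actually received against that record.
untouched_ok = handoff_intact(recorded, binary)
tampered_ok = handoff_intact(recorded, b"legitimate binary + backdoor")
```

A digest alone only detects changes between steps. In-toto additionally requires each record to be signed by an authorized actor, which this sketch leaves out.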


Core Concepts

Understanding in-toto requires distinguishing between what you expect to happen and what actually happened.

1. The Layout (The Expectation)

The Layout is the policy or the “contract” for the supply chain. It is a file (signed by the project owner) that defines:

  • Steps: What actions must be performed (e.g., “clone-repo”, “build-binary”, “run-tests”).
  • Functionaries: Who is allowed to perform these steps (identified by their public keys).
  • Materials and Products:
    • Materials: What files go into a step.
    • Products: What files come out of a step.
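  • Artifact Rules: Constraints on each step's materials and products, including rules that the products of one step must match the materials of the next.
  • Inspections: Commands the verifier runs at verification time (for example, unpacking an archive) to check the final artifact.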

Example Layout Logic:

“I expect the user ‘Bob’ to perform the ‘build’ step. He must take the source code files (Materials) and produce a binary file (Product). I expect the ‘test’ step to run on exactly that binary.”
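
The logic above could be written down roughly as follows. This is a sketch only: the field names are modeled on the in-toto layout format and should be checked against the specification, the key ID is a hypothetical placeholder, assigning the test step to Bob is an assumption, and a real layout also carries an expiry date and a signature from the project owner.

```python
import json

# Hypothetical key ID standing in for Bob's public key fingerprint.
BOB_KEYID = "bob-key-id"

layout = {
    "_type": "layout",
    "keys": {BOB_KEYID: {"keyid": BOB_KEYID}},  # public key material omitted
    "steps": [
        {
            "_type": "step",
            "name": "build",
            "pubkeys": [BOB_KEYID],  # who may perform the step
            "threshold": 1,
            "expected_command": ["make", "build"],
            # Only Go source files may go in; only app.bin may come out.
            "expected_materials": [["ALLOW", "*.go"], ["DISALLOW", "*"]],
            "expected_products": [["CREATE", "app.bin"], ["DISALLOW", "*"]],
        },
        {
            "_type": "step",
            "name": "test",
            "pubkeys": [BOB_KEYID],  # assumption: Bob also runs the tests
            "threshold": 1,
            "expected_command": ["make", "test"],
            # The tested binary must be exactly the one the build step produced.
            "expected_materials": [
                ["MATCH", "app.bin", "WITH", "PRODUCTS", "FROM", "build"],
                ["DISALLOW", "*"],
            ],
            "expected_products": [],
        },
    ],
    "inspect": [],
}

layout_json = json.dumps(layout, indent=2)
```

The MATCH rule in the test step is what ties the two steps together: it expresses "exactly that binary" as a constraint the verifier can check.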

2. The Link (The Evidence)

As the software moves through the supply chain, every actor (or “Functionary”) creates a Link metadata file. This is an attestation of what they actually did.

When a step is performed, the in-toto client records:

  • The exact hash of the files used (Materials).
  • The exact hash of the files created (Products).
  • The command that was executed.
  • The signature of the actor (Functionary) who performed the work.

Example Link Logic:

“I am the CI Runner. I ran the command make build. I started with source file main.go (Hash: abc…) and I produced app.bin (Hash: xyz…).”
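
What the client records can be sketched in a few lines. The example below is an illustration, not the in-toto client itself (the reference implementation ships an `in-toto-run` command for this purpose). The helper names `hash_files` and `record_step` are hypothetical, the "build" is a stand-in that copies a file, and signing is omitted.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def hash_files(root: Path) -> dict:
    """Map each file's relative path to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): {
            "sha256": hashlib.sha256(p.read_bytes()).hexdigest()
        }
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def record_step(name: str, command: list, workdir: Path) -> dict:
    """Run a command and record what went in and what came out (unsigned)."""
    materials = hash_files(workdir)  # state before the command runs
    result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
    products = hash_files(workdir)  # state after the command runs
    return {
        "_type": "link",
        "name": name,
        "command": command,
        "materials": materials,
        "products": products,
        "byproducts": {"return-value": result.returncode},
    }


# Usage with a stand-in "build" that copies main.go to app.bin.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / "main.go").write_text("package main\n")
    build_cmd = [
        sys.executable,
        "-c",
        "import shutil; shutil.copy('main.go', 'app.bin')",
    ]
    link = record_step("build", build_cmd, work)
```

In a real link, the functionary signs this record with their private key so the verifier can tie it to a key listed in the Layout.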


How Verification Works

The power of in-toto lies in the final verification phase. When a user (or an admission controller like DevGuard) wants to install the software, they run the in-toto verification.

The verifier takes the Layout (the rules) and collects all the Links (the evidence). It then overlays them to check for discrepancies.

[Figure: In-toto metadata flow. Source: in-toto project, CC BY-SA 4.0, via Creative Commons.]

The verification workflow: The Layout defines the expected steps, and the Links provide the evidence. The Verifier ensures they match.

The verification fails if:

  1. Unauthorized Actor: A step was signed by a key not listed in the Layout.
  2. Tampering: The hash of a “Product” from Step A does not match the hash of the “Material” in Step B. (This indicates the file was modified in transit.)
  3. Missing Step: A required step in the Layout has no corresponding Link metadata.
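  4. Command Mismatch: The command executed differs from the expected command in the Layout. (How strictly this is enforced depends on the verifier; the in-toto reference implementation reports a mismatch as a warning.)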
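
The four checks can be sketched as a small function over simplified metadata. This is an illustration under stated assumptions, not the in-toto verifier: signatures are reduced to a `signed_by` key ID, digests are short placeholder strings, `match_products_from` is a hypothetical stand-in for in-toto's artifact rules, and treating a command mismatch as a violation is a strictness choice of this sketch.

```python
def verify(layout: dict, links: dict) -> list:
    """Return a list of policy violations; an empty list means the chain verifies.

    `links` maps step name -> link dict. Each link carries a `signed_by` key ID,
    which stands in for a real signature check (an assumption of this sketch).
    """
    errors = []
    for step in layout["steps"]:
        name = step["name"]
        link = links.get(name)
        if link is None:
            errors.append(f"missing step: {name}")
            continue
        if link["signed_by"] not in step["pubkeys"]:
            errors.append(f"unauthorized actor: {name}")
        if link["command"] != step["expected_command"]:
            errors.append(f"command mismatch: {name}")
        # Each material taken from an earlier step must match that step's product.
        for path, source in step.get("match_products_from", {}).items():
            produced = links.get(source, {}).get("products", {}).get(path)
            if produced is None or produced != link["materials"].get(path):
                errors.append(f"tampering: {path} between {source} and {name}")
    return errors


layout = {"steps": [
    {"name": "build", "pubkeys": ["ci-key"], "expected_command": ["make", "build"]},
    {"name": "package", "pubkeys": ["cd-key"], "expected_command": ["make", "package"],
     "match_products_from": {"app.bin": "build"}},
]}
good_links = {
    "build": {"signed_by": "ci-key", "command": ["make", "build"],
              "materials": {"main.go": "abc"}, "products": {"app.bin": "xyz"}},
    "package": {"signed_by": "cd-key", "command": ["make", "package"],
                "materials": {"app.bin": "xyz"}, "products": {"app.tar": "def"}},
}
# Same evidence, except the package step received a different app.bin.
bad_links = {**good_links,
             "package": {**good_links["package"], "materials": {"app.bin": "evil"}}}

ok = verify(layout, good_links)
violations = verify(layout, bad_links)
```

Returning every violation, instead of stopping at the first one, makes it easier to see how a single compromised step shows up in the evidence.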

In-Toto and SLSA

You will often see in-toto mentioned alongside SLSA (Supply-chain Levels for Software Artifacts). It is important to understand their relationship:

  • SLSA is the specification that defines security levels (for example, Build Levels 1 through 3 in SLSA v1.0) and the requirements for each.
  • In-toto provides the attestation format commonly used to record the evidence that those requirements were met.

SLSA recommends using the in-toto attestation format to store provenance data. When you generate a “SLSA Provenance” document, it is technically a JSON in-toto Statement whose predicate carries the provenance, usually signed and wrapped in a DSSE envelope.
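
The shape of such a document can be sketched as follows. The type URIs and top-level field names follow the in-toto Statement v1 and SLSA Provenance v1 formats. The artifact bytes, the builder and build-type URLs are placeholders, the predicate is heavily abbreviated, and the envelope is left unsigned.

```python
import base64
import hashlib
import json

artifact = b"example artifact bytes"  # stand-in for the built file

# The in-toto Statement binds a subject (the artifact) to a typed predicate.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "app.bin",
         "digest": {"sha256": hashlib.sha256(artifact).hexdigest()}}
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {
        # Placeholder values; a real predicate carries far more detail.
        "buildDefinition": {"buildType": "https://example.com/build-type/v1"},
        "runDetails": {"builder": {"id": "https://example.com/builder"}},
    },
}

# The envelope carries the serialized Statement plus signatures over it.
envelope = {
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(json.dumps(statement).encode()).decode(),
    "signatures": [],  # a real envelope holds at least one {"keyid", "sig"} entry
}

# A consumer decodes the payload to get the Statement back.
decoded = json.loads(base64.b64decode(envelope["payload"]))
```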

Read more about this relationship in the SLSA Framework chapter.


Conclusion

The in-toto framework shifts security from trusting the person to trusting the process. By cryptographically chaining every step of the software delivery pipeline—from the developer’s laptop to the final production server—it eliminates the “black box” of software building.

It provides cryptographically verifiable evidence that the software running in production was produced by the authorized steps defined in the Layout, with no unauthorized modifications along the way.


References