Commit 0a81f5e

chore: refactor provenance level 3 check into analysis (#817)
Signed-off-by: Ben Selwyn-Smith <[email protected]>
1 parent 5aa1321 commit 0a81f5e

File tree

87 files changed

+899
-1113
lines changed

docs/source/pages/cli_usage/command_analyze.rst

Lines changed: 5 additions & 0 deletions
@@ -80,6 +80,11 @@ Options
8080

8181
The path to the local .m2 directory. If this option is not used, Macaron will use the default location at $HOME/.m2
8282

83+
.. option:: --verify-provenance
84+
85+
Allow the analysis to attempt to verify provenance files as part of its normal operations.
86+
87+
8388
-----------
8489
Environment
8590
-----------
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
macaron.provenance package
2+
==========================
3+
4+
.. automodule:: macaron.provenance
5+
:members:
6+
:undoc-members:
7+
:show-inheritance:
8+
9+
Submodules
10+
----------
11+
12+
macaron.provenance.provenance\_extractor module
13+
-----------------------------------------------
14+
15+
.. automodule:: macaron.provenance.provenance_extractor
16+
:members:
17+
:undoc-members:
18+
:show-inheritance:
19+
20+
macaron.provenance.provenance\_finder module
21+
--------------------------------------------
22+
23+
.. automodule:: macaron.provenance.provenance_finder
24+
:members:
25+
:undoc-members:
26+
:show-inheritance:
27+
28+
macaron.provenance.provenance\_verifier module
29+
----------------------------------------------
30+
31+
.. automodule:: macaron.provenance.provenance_verifier
32+
:members:
33+
:undoc-members:
34+
:show-inheritance:

docs/source/pages/developers_guide/apidoc/macaron.repo_finder.rst

Lines changed: 0 additions & 16 deletions
@@ -17,22 +17,6 @@ macaron.repo\_finder.commit\_finder module
1717
:undoc-members:
1818
:show-inheritance:
1919

20-
macaron.repo\_finder.provenance\_extractor module
21-
-------------------------------------------------
22-
23-
.. automodule:: macaron.repo_finder.provenance_extractor
24-
:members:
25-
:undoc-members:
26-
:show-inheritance:
27-
28-
macaron.repo\_finder.provenance\_finder module
29-
----------------------------------------------
30-
31-
.. automodule:: macaron.repo_finder.provenance_finder
32-
:members:
33-
:undoc-members:
34-
:show-inheritance:
35-
3620
macaron.repo\_finder.repo\_finder module
3721
----------------------------------------
3822

docs/source/pages/developers_guide/apidoc/macaron.rst

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ Subpackages
2020
macaron.output_reporter
2121
macaron.parsers
2222
macaron.policy_engine
23+
macaron.provenance
2324
macaron.repo_finder
2425
macaron.repo_verifier
2526
macaron.slsa_analyzer

docs/source/pages/developers_guide/apidoc/macaron.slsa_analyzer.checks.rst

Lines changed: 0 additions & 8 deletions
@@ -89,14 +89,6 @@ macaron.slsa\_analyzer.checks.provenance\_commit\_check module
8989
:undoc-members:
9090
:show-inheritance:
9191

92-
macaron.slsa\_analyzer.checks.provenance\_l3\_check module
93-
----------------------------------------------------------
94-
95-
.. automodule:: macaron.slsa_analyzer.checks.provenance_l3_check
96-
:members:
97-
:undoc-members:
98-
:show-inheritance:
99-
10092
macaron.slsa\_analyzer.checks.provenance\_l3\_content\_check module
10193
-------------------------------------------------------------------
10294

docs/source/pages/tutorials/npm_provenance.rst

Lines changed: 2 additions & 2 deletions
@@ -42,7 +42,7 @@ To perform an analysis on the latest version of semver (when this tutorial was w
4242

4343
.. code-block:: shell
4444
45-
./run_macaron.sh analyze -purl pkg:npm/semver@7.6.2
45+
./run_macaron.sh analyze -purl pkg:npm/semver@7.6.2 --verify-provenance
4646
4747
The analysis involves Macaron downloading the contents of the target repository to the configured, or default, ``output`` folder. Results from the analysis, including checks, are stored in the database found at ``output/macaron.db`` (See :ref:`Output Files Guide <output_files_guide>`). Once the analysis is complete, Macaron will also produce a report in the form of a HTML file.
4848

@@ -52,7 +52,7 @@ During this analysis, Macaron will retrieve two provenance files from the npm re
5252

5353
.. note:: Most of the details from the two provenance files can be found through the links provided on the artifacts page on the npm website. In particular: `Sigstore Rekor <https://search.sigstore.dev/?logIndex=92391688>`_. The provenance file itself can be found at: `npm registry <https://registry.npmjs.org/-/npm/v1/attestations/semver@7.6.2>`_.
5454

55-
Of course to reliably say the above does what is claimed here, proof is needed. For this we can rely on the check results produced from the analysis run. In particular, we want to know the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two to ensure that the commit and the repository being analyzed match those found in the provenance file, and the last check to ensure that the provenance file has been verified.
55+
Of course to reliably say the above does what is claimed here, proof is needed. For this we can rely on the check results produced from the analysis run. In particular, we want to know the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two to ensure that the commit and the repository being analyzed match those found in the provenance file, and the last check to ensure that the provenance file has been verified. For the third check to succeed, you need to enable provenance verification in Macaron by using the ``--verify-provenance`` command-line argument, as demonstrated above. This verification is disabled by default because it can be slow in some cases due to I/O-bound operations.
5656

5757
.. _fig_semver_7.6.2_report:
5858

src/macaron/__main__.py

Lines changed: 12 additions & 5 deletions
@@ -1,4 +1,4 @@
1-
# Copyright (c) 2022 - 2024, Oracle and/or its affiliates. All rights reserved.
1+
# Copyright (c) 2022 - 2025, Oracle and/or its affiliates. All rights reserved.
22
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.
33

44
"""This is the main entrypoint to run Macaron."""
@@ -32,7 +32,6 @@
3232

3333
def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None:
3434
"""Run the SLSA checks against a single target repository."""
35-
deps_depth = None
3635
if analyzer_single_args.deps_depth == "inf":
3736
deps_depth = -1
3837
else:
@@ -173,7 +172,8 @@ def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None
173172
analyzer_single_args.sbom_path,
174173
deps_depth,
175174
provenance_payload=prov_payload,
176-
validate_malware_switch=analyzer_single_args.validate_malware_switch,
175+
validate_malware=analyzer_single_args.validate_malware,
176+
verify_provenance=analyzer_single_args.verify_provenance,
177177
)
178178
sys.exit(status_code)
179179

@@ -360,7 +360,7 @@ def main(argv: list[str] | None = None) -> None:
360360
help="The directory where Macaron looks for already cloned repositories.",
361361
)
362362

363-
# Add sub parsers for each action
363+
# Add sub parsers for each action.
364364
sub_parser = main_parser.add_subparsers(dest="action", help="Run macaron <action> --help for help")
365365

366366
# Use Macaron to analyze one single repository.
@@ -470,12 +470,19 @@ def main(argv: list[str] | None = None) -> None:
470470
)
471471

472472
single_analyze_parser.add_argument(
473-
"--validate-malware-switch",
473+
"--validate-malware",
474474
required=False,
475475
action="store_true",
476476
help=("Enable malware validation."),
477477
)
478478

479+
single_analyze_parser.add_argument(
480+
"--verify-provenance",
481+
required=False,
482+
action="store_true",
483+
help=("Allow the analysis to attempt to verify provenance files as part of its normal operations."),
484+
)
485+
479486
# Dump the default values.
480487
sub_parser.add_parser(name="dump-defaults", description="Dumps the defaults.ini file to the output directory.")
481488
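
Note on the hunks above: the flag is renamed from ``--validate-malware-switch`` to ``--validate-malware``, and ``--verify-provenance`` is added as a second opt-in ``store_true`` flag. The following is a minimal, self-contained sketch of that argparse pattern, not Macaron's actual parser: ``build_parser`` and the ``macaron-sketch`` program name are hypothetical, and only the two flag names are taken from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser (hypothetical; not Macaron's real one)."""
    parser = argparse.ArgumentParser(prog="macaron-sketch")
    sub_parser = parser.add_subparsers(dest="action")
    analyze = sub_parser.add_parser("analyze")
    # Both flags default to False and become True only when passed.
    analyze.add_argument("--validate-malware", required=False, action="store_true")
    analyze.add_argument("--verify-provenance", required=False, action="store_true")
    return parser


# argparse maps "--verify-provenance" to the attribute "verify_provenance".
args = build_parser().parse_args(["analyze", "--verify-provenance"])
print(args.verify_provenance, args.validate_malware)
```

Because the flag is ``store_true``, omitting it leaves verification off, which matches the tutorial text stating that verification is disabled by default.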

src/macaron/config/defaults.ini

Lines changed: 1 addition & 6 deletions
@@ -46,11 +46,6 @@ validate = True
4646
# The CycloneDX schema version used for validation.
4747
schema = 1.6
4848

49-
# This is the Analyzer section used as part of Macaron's analysis.
50-
[analyzer]
51-
# This enables or disables attempts at verification of provenance.
52-
verify_provenance = True
53-
5449
# This is the repo finder script.
5550
[repofinder]
5651
find_repos = True
@@ -569,7 +564,7 @@ purl_endpoint = v3alpha/purl
569564
# [analysis.checks]
570565
# exclude =
571566
# mcn_build_as_code_1
572-
# mcn_provenance_level_three_1
567+
# mcn_provenance_verified_1
573568
# include = *
574569
# ```
575570
# 3. Exclude multiple checks that start with `mcn_provenance`:
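
Note on the commented examples above: they describe selecting checks through ``include`` and ``exclude`` lists, where ``include = *`` suggests glob-style patterns. The sketch below illustrates that idea under the assumption that patterns follow shell-glob semantics. Macaron's real matching logic is not shown in this diff, and ``select_checks`` is a hypothetical helper. The check names are the ones that appear in this commit.

```python
from fnmatch import fnmatchcase


def select_checks(all_checks: list[str], include: list[str], exclude: list[str]) -> list[str]:
    """Keep checks that match any include pattern and no exclude pattern (assumed semantics)."""
    included = [c for c in all_checks if any(fnmatchcase(c, p) for p in include)]
    return [c for c in included if not any(fnmatchcase(c, p) for p in exclude)]


checks = [
    "mcn_build_as_code_1",
    "mcn_provenance_verified_1",
    "mcn_provenance_derived_repo_1",
    "mcn_provenance_derived_commit_1",
]

# Mirrors the second commented example: include everything, exclude two named checks.
selected = select_checks(
    checks,
    include=["*"],
    exclude=["mcn_build_as_code_1", "mcn_provenance_verified_1"],
)
print(selected)
```

Under the same assumption, a single pattern such as ``mcn_provenance*`` in ``exclude`` would drop every check whose name starts with that prefix.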

src/macaron/database/db_custom_types.py

Lines changed: 69 additions & 6 deletions
@@ -1,13 +1,21 @@
1-
# Copyright (c) 2023 - 2024, Oracle and/or its affiliates. All rights reserved.
1+
# Copyright (c) 2023 - 2025, Oracle and/or its affiliates. All rights reserved.
22
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.
33

4-
"""This module implements SQLAlchemy type for converting date format to RFC3339 string representation."""
4+
"""This module implements SQLAlchemy types for Python data types that cannot be automatically stored."""
55

66
import datetime
7+
import json
78
from typing import Any
89

910
from sqlalchemy import JSON, String, TypeDecorator
1011

12+
from macaron.slsa_analyzer.provenance.intoto import (
13+
InTotoPayload,
14+
InTotoV01Payload,
15+
InTotoV1Payload,
16+
validate_intoto_payload,
17+
)
18+
1119

1220
class RFC3339DateTime(TypeDecorator): # pylint: disable=W0223
1321
"""
@@ -36,7 +44,7 @@ def process_bind_param(self, value: None | Any, dialect: Any) -> None | str:
3644
if the provided ``datetime`` is a naive ``datetime`` object then UTC is added.
3745
3846
value: None | datetime.datetime
39-
The value being stored
47+
The value being stored.
4048
"""
4149
if value is None:
4250
return None
@@ -52,7 +60,7 @@ def process_result_value(self, value: None | str, dialect: Any) -> None | dateti
5260
If the deserialized ``datetime`` has a timezone then return it, otherwise add UTC as its timezone.
5361
5462
value: None | str
55-
The value being loaded
63+
The value being loaded.
5664
"""
5765
if value is None:
5866
return None
@@ -76,7 +84,7 @@ def process_bind_param(self, value: None | dict, dialect: Any) -> None | dict:
7684
"""Process when storing a dict object to the SQLite db.
7785
7886
value: None | dict
79-
The value being stored
87+
The value being stored.
8088
"""
8189
if not isinstance(value, dict):
8290
raise TypeError("DBJsonDict type expects a dict.")
@@ -87,8 +95,63 @@ def process_result_value(self, value: None | dict, dialect: Any) -> None | dict:
8795
"""Process when loading a dict object from the SQLite db.
8896
8997
value: None | dict
90-
The value being loaded
98+
The value being loaded.
9199
"""
92100
if not isinstance(value, dict):
93101
raise TypeError("DBJsonDict type expects a dict.")
94102
return value
103+
104+
105+
class ProvenancePayload(TypeDecorator): # pylint: disable=W0223
106+
"""SQLAlchemy column type to serialize InTotoProvenance."""
107+
108+
# It is stored in the database as a String value.
109+
impl = String
110+
111+
# To prevent Sphinx from rendering the docstrings for `cache_ok`, make this docstring private.
112+
#: :meta private:
113+
cache_ok = True
114+
115+
def process_bind_param(self, value: InTotoPayload | None, dialect: Any) -> str | None:
116+
"""Process when storing an InTotoPayload object to the SQLite db.
117+
118+
value: InTotoPayload | None
119+
The value being stored.
120+
"""
121+
if value is None:
122+
return None
123+
124+
if not isinstance(value, InTotoPayload):
125+
raise TypeError("ProvenancePayload type expects an InTotoPayload.")
126+
127+
payload_type = value.__class__.__name__
128+
payload_dict = {"payload_type": payload_type, "payload": value.statement}
129+
return json.dumps(payload_dict)
130+
131+
def process_result_value(self, value: str | None, dialect: Any) -> InTotoPayload | None:
132+
"""Process when loading an InTotoPayload object from the SQLite db.
133+
134+
value: str | None
135+
The value being loaded.
136+
"""
137+
if value is None:
138+
return None
139+
140+
try:
141+
payload_dict = json.loads(value)
142+
except ValueError as error:
143+
raise TypeError(f"Error parsing str as JSON: {error}") from error
144+
145+
if not isinstance(payload_dict, dict):
146+
raise TypeError("Parsed data is not a dict.")
147+
148+
if "payload_type" not in payload_dict or "payload" not in payload_dict:
149+
raise TypeError("Missing keys in dict for ProvenancePayload type.")
150+
151+
payload = payload_dict["payload"]
152+
if payload_dict["payload_type"] == "InTotoV01Payload":
153+
return InTotoV01Payload(statement=payload)
154+
if payload_dict["payload_type"] == "InTotoV1Payload":
155+
return InTotoV1Payload(statement=payload)
156+
157+
return validate_intoto_payload(payload)

src/macaron/database/table_definitions.py

Lines changed: 16 additions & 4 deletions
@@ -34,7 +34,7 @@
3434

3535
from macaron.artifact.maven import MavenSubjectPURLMatcher
3636
from macaron.database.database_manager import ORMBase
37-
from macaron.database.db_custom_types import RFC3339DateTime
37+
from macaron.database.db_custom_types import ProvenancePayload, RFC3339DateTime
3838
from macaron.errors import InvalidPURLError
3939
from macaron.repo_finder.repo_finder_enums import CommitFinderInfo, RepoFinderInfo
4040
from macaron.slsa_analyzer.provenance.intoto import InTotoPayload, ProvenanceSubjectPURLMatcher
@@ -491,16 +491,28 @@ class Provenance(ORMBase):
491491
component: Mapped["Component"] = relationship(back_populates="provenance")
492492

493493
#: The SLSA version.
494-
version: Mapped[str] = mapped_column(String, nullable=False)
494+
slsa_version: Mapped[str] = mapped_column(String, nullable=True)
495+
496+
#: The SLSA level.
497+
slsa_level: Mapped[int] = mapped_column(Integer, default=0)
495498

496499
#: The release tag commit sha.
497500
release_commit_sha: Mapped[str] = mapped_column(String, nullable=True)
498501

499502
#: The release tag.
500503
release_tag: Mapped[str] = mapped_column(String, nullable=True)
501504

502-
#: The provenance payload content in JSON format.
503-
provenance_json: Mapped[str] = mapped_column(String, nullable=False)
505+
#: The repository URL from the provenance.
506+
repository_url: Mapped[str] = mapped_column(String, nullable=True)
507+
508+
#: The commit sha from the provenance.
509+
commit_sha: Mapped[str] = mapped_column(String, nullable=True)
510+
511+
#: The provenance payload.
512+
provenance_payload: Mapped[InTotoPayload] = mapped_column(ProvenancePayload, nullable=False)
513+
514+
#: The verified status of the provenance.
515+
verified: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)
504516

505517
#: A one-to-many relationship with the release artifacts.
506518
artifact: Mapped[list["ReleaseArtifact"]] = relationship(back_populates="provenance")

src/macaron/provenance/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
# Copyright (c) 2024 - 2025, Oracle and/or its affiliates. All rights reserved.
2+
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.
3+
4+
"""This package contains the provenance tools for software components."""
