macaron.repo_finder package
This package contains the dependency resolvers for Java projects.
- macaron.repo_finder.to_domain_from_known_purl_types(purl_type)
Return the git service domain from a known web-based purl type.
This method is used to handle cases where the purl type value is not the git domain but a pre-defined repo-based type in https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst.
Note that this method will be updated when there are new pre-defined types as per the PURL specification.
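The lookup described above can be pictured as a small table from purl type to domain. This is a minimal sketch, not Macaron's implementation: the table contents and the name `to_domain_sketch` are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical table: purl types that name a hosting service rather than a domain.
# The entries are illustrative and are not copied from Macaron's source.
KNOWN_PURL_TYPE_DOMAINS = {
    "github": "github.com",
    "bitbucket": "bitbucket.org",
}


def to_domain_sketch(purl_type: str) -> Optional[str]:
    """Return the git service domain for a known purl type, or None if unknown."""
    return KNOWN_PURL_TYPE_DOMAINS.get(purl_type)
```

A type such as `maven` is not repository-based, so the sketch returns `None` for it.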
Submodules
macaron.repo_finder.commit_finder module
This module contains the logic for matching PackageURL versions to repository commits via the tags they contain.
- class macaron.repo_finder.commit_finder.AbstractPurlType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
The type represented by a PURL in terms of repositories versus artifacts.
Unsupported types are allowed as a third type.
- REPOSITORY = (0,)
- ARTIFACT = (1,)
- UNSUPPORTED = (2,)
- macaron.repo_finder.commit_finder.find_commit(git_obj, purl)
Try to find the commit matching the passed PURL.
The PURL may be a repository type, e.g. GitHub, in which case the commit might be in its version part. Otherwise, the PURL should be a package manager type, e.g. Maven, in which case the commit must be found from the artifact version.
- Parameters:
git_obj (Git) – The repository.
purl (PackageURL) – The PURL of the analysis target.
- Returns:
The digest, or None if the commit cannot be correctly retrieved.
- Return type:
str | None
- macaron.repo_finder.commit_finder.determine_abstract_purl_type(purl)
Determine if the passed purl is a repository type, artifact type, or unsupported type.
- Parameters:
purl (PackageURL) – A PURL that represents a repository, artifact, or something that is not supported.
- Returns:
The identified type of the PURL.
- Return type:
AbstractPurlType
- macaron.repo_finder.commit_finder.extract_commit_from_version(git_obj, version)
Try to extract the commit from the PURL’s version parameter.
E.g. With commit: pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c. With tag: pkg:github/apache/maven@maven-3.9.1.
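One way to tell the two cases above apart is to test whether the version string looks like a hexadecimal commit hash. The sketch below is an assumption about how such a check could be written; the helper name `looks_like_commit` and the 7 to 40 character range are illustrative choices, not Macaron's logic.

```python
import re

# A full SHA-1 hash is 40 hex characters; abbreviated hashes are usually 7 or more.
_HEX_COMMIT = re.compile(r"^[0-9a-f]{7,40}$", re.IGNORECASE)


def looks_like_commit(version: str) -> bool:
    """Return True if the version string could be a (possibly abbreviated) commit hash."""
    return bool(_HEX_COMMIT.match(version))
```

A version that fails this test, such as `maven-3.9.1`, would then be treated as a candidate tag name.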
- macaron.repo_finder.commit_finder.find_commit_from_version_and_name(git_obj, name, version)
Try to find the matching commit in a repository of a given version (and name) via tags.
The passed version is used to match with the tags in the target repository. The passed name is used in cases where a repository makes use of named prefixes in its tags.
- macaron.repo_finder.commit_finder.match_tags(tag_list, name, version)
Return items of the passed tag list that match the passed artifact name and version.
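To illustrate the idea of matching tags against a name and version, here is a deliberately simple sketch. It assumes that a tag matches when it equals the version, optionally preceded by `v` and by the artifact name plus a separator. The function name `match_tags_sketch` and the pattern are hypothetical; the real matcher may handle many more tag conventions.

```python
import re


def match_tags_sketch(tag_list, name, version):
    """Return tags equal to the version, optionally prefixed by the name and/or 'v'."""
    pattern = re.compile(
        rf"^(?:{re.escape(name)}[-_/]?)?v?{re.escape(version)}$",
        re.IGNORECASE,
    )
    # Preserve the input order so callers can apply their own ranking afterwards.
    return [tag for tag in tag_list if pattern.match(tag)]
```

Anchoring the pattern at both ends keeps `maven-3.9.10` from matching version `3.9.1`.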
macaron.repo_finder.provenance_extractor module
This module contains methods for extracting repository and commit metadata from provenance files.
- macaron.repo_finder.provenance_extractor.extract_repo_and_commit_from_provenance(payload)
Extract the repository and commit metadata from the passed provenance payload.
- Parameters:
payload (InTotoPayload) – The payload to extract from.
- Returns:
The repository URL and commit hash if found, a pair of empty strings otherwise.
- Return type:
- Raises:
ProvenanceError – If the extraction process fails for any reason.
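As a rough illustration of the extraction, the sketch below reads the repository and commit from a SLSA v1 style statement held as a plain dictionary. It assumes the source is listed under `predicate.buildDefinition.resolvedDependencies` with a `gitCommit` digest; the function name is hypothetical, and other provenance formats store these values elsewhere.

```python
def extract_repo_and_commit_sketch(statement: dict) -> tuple:
    """Return (repo_url, commit) from a SLSA v1 style dict, or ("", "") if absent."""
    build_def = statement.get("predicate", {}).get("buildDefinition", {})
    for dep in build_def.get("resolvedDependencies", []):
        uri = dep.get("uri", "")
        commit = dep.get("digest", {}).get("gitCommit", "")
        if uri and commit:
            # Drop the "git+" scheme prefix and any "@ref" suffix.
            # Simplification: this would mis-handle URLs that contain "user@host".
            repo = uri.removeprefix("git+").split("@", 1)[0]
            return repo, commit
    return "", ""
```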
- macaron.repo_finder.provenance_extractor.check_if_input_repo_provenance_conflict(repo_path_input, provenance_repo_url)
Test if the input repo and commit match the contents of the provenance.
- macaron.repo_finder.provenance_extractor.check_if_input_purl_provenance_conflict(git_obj, repo_path_input, digest_input, provenance_repo_url, provenance_commit_digest, purl)
Test if the input repository type PURL’s repo and commit match the contents of the provenance.
- Parameters:
git_obj (Git) – The Git object.
repo_path_input (bool) – True if there is a repo as input.
digest_input (str) – True if there is a commit as input.
provenance_repo_url (str | None) – The repo url from provenance.
provenance_commit_digest (str | None) – The commit digest from provenance.
purl (PackageURL) – The input repository PURL.
- Returns:
True if there is a conflict between the inputs, False otherwise, or if the comparison cannot be performed.
- Return type:
- macaron.repo_finder.provenance_extractor.check_if_repository_purl_and_url_match(url, repo_purl)
Compare a repository PURL and URL for equality.
- class macaron.repo_finder.provenance_extractor.ProvenanceBuildDefinition
Bases:
ABC
Abstract base class for representing provenance build definitions.
This class serves as a blueprint for various types of build definitions in provenance data. It outlines the methods and properties that derived classes must implement to handle specific build definition types.
- abstract get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
This method is intended to be implemented by subclasses to extract specific invocation details from a provenance statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- Raises:
NotImplementedError – If the method is called directly without being overridden in a subclass.
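The abstract-base-class pattern described above can be sketched as follows. The class names, the example build type URL, and the predicate field names `entryPoint` and `invocationUrl` are all hypothetical and chosen only to show the shape of a subclass.

```python
from abc import ABC, abstractmethod
from typing import Optional, Tuple


class BuildDefinitionSketch(ABC):
    """Hypothetical stand-in for an abstract build definition."""

    expected_build_type: str

    @abstractmethod
    def get_build_invocation(self, statement: dict) -> Tuple[Optional[str], Optional[str]]:
        """Return (entry point, invocation URL), each possibly None."""
        raise NotImplementedError


class ExampleBuildDefinition(BuildDefinitionSketch):
    """Illustrative subclass; the field names below are assumptions."""

    expected_build_type = "https://example.com/buildtypes/demo/v1"

    def get_build_invocation(self, statement: dict) -> Tuple[Optional[str], Optional[str]]:
        predicate = statement.get("predicate", {})
        return predicate.get("entryPoint"), predicate.get("invocationUrl")
```

Because the base class declares an abstract method, Python refuses to instantiate it directly, so each supported build type has to provide its own extraction logic.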
- class macaron.repo_finder.provenance_extractor.SLSAGithubGenericBuildDefinitionV01
Bases:
ProvenanceBuildDefinition
Class representing the SLSA GitHub Generic Build Definition (v0.1).
This class implements the abstract methods defined in ProvenanceBuildDefinition to extract build invocation details specific to the GitHub provenance generator’s generic build type.
- expected_build_type: str = 'https://github.com/slsa-framework/slsa-github-generator/generic@v1'
Determines the expected buildType field in the provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.SLSAGithubActionsBuildDefinitionV1
Bases:
ProvenanceBuildDefinition
Class representing the SLSA GitHub Actions Build Definition (v1).
This class implements the abstract methods from the ProvenanceBuildDefinition to extract build invocation details specific to the GitHub Actions build type.
- expected_build_type: str = 'https://slsa-framework.github.io/github-actions-buildtypes/workflow/v1'
Determines the expected buildType field in the provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.SLSANPMCLIBuildDefinitionV2
Bases:
ProvenanceBuildDefinition
Class representing the SLSA NPM CLI Build Definition (v2).
This class implements the abstract methods from the ProvenanceBuildDefinition to extract build invocation details specific to the GitHub Actions build type.
- expected_build_type: str = 'https://github.com/npm/cli/gha/v2'
Determines the expected buildType field in the provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.SLSAGCBBuildDefinitionV1
Bases:
ProvenanceBuildDefinition
Class representing the SLSA Google Cloud Build (GCB) Build Definition (v1).
This class implements the abstract methods from ProvenanceBuildDefinition to extract build invocation details specific to Google Cloud Build (GCB).
- expected_build_type: str = 'https://slsa-framework.github.io/gcb-buildtypes/triggered-build/v1'
Determines the expected buildType field in the provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.SLSAOCIBuildDefinitionV1
Bases:
ProvenanceBuildDefinition
Class representing the SLSA Oracle Cloud Infrastructure (OCI) Build Definition (v1).
This class implements the abstract methods from ProvenanceBuildDefinition to extract build invocation details specific to OCI builds.
- expected_build_type: str = 'https://github.com/oracle/macaron/tree/main/src/macaron/resources/provenance-buildtypes/oci/v1'
Determines the expected buildType field in the provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.WitnessGitLabBuildDefinitionV01
Bases:
ProvenanceBuildDefinition
Class representing the Witness GitLab Build Definition (v0.1).
This class implements the abstract methods from ProvenanceBuildDefinition to extract build invocation details specific to GitLab.
- expected_build_type: str = 'https://witness.testifysec.com/attestation-collection/v0.1'
Determines the expected buildType field in the provenance predicate.
- expected_attestation_type = 'https://witness.dev/attestations/gitlab/v0.1'
Determines the expected attestations.type field in the Witness provenance predicate.
- get_build_invocation(statement)
Retrieve the build invocation information from the given statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build invocation details. This statement contains the metadata about the build process and its associated artifacts.
- Returns:
A tuple containing two elements: - The first element is the build invocation entry point (e.g., workflow name), or None if not found. - The second element is the invocation URL or identifier (e.g., job URL), or None if not found.
- Return type:
- class macaron.repo_finder.provenance_extractor.ProvenancePredicate
Bases:
object
Class providing utility methods for handling provenance predicates.
This class contains static methods for extracting information from predicates in provenance statements related to various build definitions. It serves as a helper for identifying build types and finding the appropriate build definitions based on the extracted data.
- static get_build_type(statement)
Extract the build type from the provided provenance statement.
- Parameters:
statement (InTotoV1Statement | InTotoV01Statement) – The provenance statement from which to extract the build type.
- Returns:
The build type if found; otherwise, None.
- Return type:
str | None
- static find_build_def(statement)
Find the appropriate build definition class based on the extracted build type.
This method checks the provided provenance statement for its build type and returns the corresponding ProvenanceBuildDefinition subclass.
- Parameters:
statement (InTotoV01Statement | InTotoV1Statement) – The provenance statement containing the build type information.
- Returns:
An instance of the appropriate build definition class that matches the extracted build type.
- Return type:
- Raises:
ProvenanceError – Raised when the build definition cannot be found in the provenance statement.
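The lookup from build type to build definition can be sketched as a dictionary dispatch. The sketch assumes that v1 statements keep the build type under `predicate.buildDefinition.buildType` and v0.1 statements under `predicate.buildType`. It returns class names as strings instead of instances to stay self-contained, and `ProvenanceErrorSketch` is a hypothetical stand-in for the real exception.

```python
from typing import Optional


class ProvenanceErrorSketch(Exception):
    """Hypothetical stand-in for the error raised when no definition matches."""


# Build type strings as listed in this reference, mapped to the class names above.
BUILD_DEFINITION_NAMES = {
    "https://github.com/slsa-framework/slsa-github-generator/generic@v1": "SLSAGithubGenericBuildDefinitionV01",
    "https://slsa-framework.github.io/github-actions-buildtypes/workflow/v1": "SLSAGithubActionsBuildDefinitionV1",
    "https://github.com/npm/cli/gha/v2": "SLSANPMCLIBuildDefinitionV2",
    "https://slsa-framework.github.io/gcb-buildtypes/triggered-build/v1": "SLSAGCBBuildDefinitionV1",
}


def get_build_type_sketch(statement: dict) -> Optional[str]:
    """Read the build type from either a v1 or a v0.1 style predicate."""
    predicate = statement.get("predicate") or {}
    build_def = predicate.get("buildDefinition")
    if isinstance(build_def, dict):
        return build_def.get("buildType")
    return predicate.get("buildType")


def find_build_def_sketch(statement: dict) -> str:
    """Return the name of the matching build definition, or raise if none matches."""
    build_type = get_build_type_sketch(statement)
    name = BUILD_DEFINITION_NAMES.get(build_type)
    if name is None:
        raise ProvenanceErrorSketch(f"No build definition for build type: {build_type}")
    return name
```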
macaron.repo_finder.provenance_finder module
This module contains methods for finding provenance files.
- class macaron.repo_finder.provenance_finder.ProvenanceFinder
Bases:
object
This class is used to find and retrieve provenance files from supported registries.
- __init__()
- macaron.repo_finder.provenance_finder.find_npm_provenance(purl, registry)
Find and download the npm-based provenance for the passed PURL.
Two kinds of attestation can be retrieved from npm: “Provenance” and “Publish”. The “Provenance” attestation contains the important information Macaron seeks, but is not signed. The “Publish” attestation is signed. Comparing the signed and unsigned attestations at the subject level allows the unsigned one to be verified. See: https://docs.npmjs.com/generating-provenance-statements
- Parameters:
purl (PackageURL) – The PURL of the analysis target.
registry (NPMRegistry) – The npm registry to use.
- Returns:
The provenance payload(s), or an empty list if not found.
- Return type:
list[InTotoPayload]
- macaron.repo_finder.provenance_finder.verify_npm_provenance(purl, provenance)
Compare the unsigned payload subject digest with the signed payload digest, if available.
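A subject-level comparison of this kind could look like the sketch below. It assumes both attestations have already been decoded into in-toto statement dictionaries with a `subject` list of `name` and `digest` entries; the function name `subjects_match` is hypothetical, and the real verification may apply additional checks.

```python
def subjects_match(unsigned: dict, signed: dict) -> bool:
    """Return True if both statements list the same non-empty set of subject digests."""

    def digests(statement: dict) -> set:
        # Flatten every subject into (name, algorithm, value) triples.
        return {
            (subject.get("name"), algorithm, value)
            for subject in statement.get("subject", [])
            for algorithm, value in subject.get("digest", {}).items()
        }

    unsigned_digests = digests(unsigned)
    # An empty subject list is treated as a failed comparison.
    return bool(unsigned_digests) and unsigned_digests == digests(signed)
```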
- macaron.repo_finder.provenance_finder.find_gav_provenance(purl, registry)
Find and download the GAV-based provenance for the passed PURL.
- Parameters:
purl (PackageURL) – The PURL of the analysis target.
registry (JFrogMavenRegistry) – The registry to use for finding.
- Returns:
The provenance payload if found, or an empty list otherwise.
- Return type:
list[InTotoPayload] | None
- Raises:
ProvenanceAvailableException – If the discovered provenance file size exceeds the configured limit.
- macaron.repo_finder.provenance_finder.find_provenance_from_ci(analyze_ctx, git_obj)
Try to find provenance from CI services of the repository.
Note that we stop going through the CI services once we encounter a CI service that does host provenance assets.
This method also loads the provenance payloads into the CIInfo object where the provenance assets are found.
- Parameters:
analyze_ctx (AnalyzeContext) – The context of the ongoing analysis.
git_obj (Git | None) – The Pydriller Git object representing the repository, if any.
- Returns:
The provenance payload, or None if not found.
- Return type:
InTotoPayload | None
macaron.repo_finder.repo_finder module
This module contains the logic for using/calling the different repo finders.
Input
The entry point of the repo finder depends on the type of PURL being analyzed.
- If passing a PURL representing an artifact, the find_repo function in this file should be called.
- If passing a PURL representing a repository, the to_repo_path function in this file should be called.
Artifact PURLs
For artifact PURLs, the PURL type determines how the repositories are searched for. Currently, for Maven PURLs, SCM metadata is retrieved from the matching POM retrieved from Maven Central (or other configured location).
For Python, .NET, Rust, and NodeJS type PURLs, Google’s Open Source Insights API is used to find the metadata.
In either case, any repository links are extracted from the metadata, then checked for validity via repo_validator::find_valid_repository_url, which accepts URLs that point to a GitHub repository or similar.
Repository PURLs
For repository PURLs, the type is checked against the configured valid domains, and accepted or rejected based on that data.
Result
If all goes well, a repository URL that matches the initial artifact or repository PURL will be returned for analysis.
- macaron.repo_finder.repo_finder.find_repo(purl)
Retrieve the repository URL that matches the given PURL.
- Parameters:
purl (PackageURL) – The parsed PURL to convert to the repository path.
- Returns:
The repository URL found for the passed package.
- Return type:
- macaron.repo_finder.repo_finder.to_repo_path(purl, available_domains)
Return the repository path from the PURL string.
This method only supports converting a PURL with the following format:
pkg:<type>/<namespace>/<name>[…]
Where type could be either:
- The pre-defined repository-based PURL type as defined in https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst
- The supported git service domains (e.g. github.com) defined in available_domains.
The repository path will be generated with the following format: https://<type>/<namespace>/<name>.
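The conversion can be sketched with plain string handling. This is a simplified illustration, not Macaron's code: the function name, the `KNOWN_TYPES` table, and the choice to return `None` for unsupported input are assumptions, and a real PURL parser handles encoding and edge cases that this sketch ignores.

```python
from typing import List, Optional

# Hypothetical table of repository-based purl types and their domains.
KNOWN_TYPES = {"github": "github.com", "bitbucket": "bitbucket.org"}


def to_repo_path_sketch(purl: str, available_domains: List[str]) -> Optional[str]:
    """Convert pkg:<type>/<namespace>/<name>[...] to https://<domain>/<namespace>/<name>."""
    if not purl.startswith("pkg:"):
        return None
    body = purl[len("pkg:"):]
    # Strip the optional subpath, qualifiers, and version, in that order.
    for separator in ("#", "?", "@"):
        body = body.split(separator, 1)[0]
    parts = body.split("/")
    if len(parts) != 3:
        return None
    purl_type, namespace, name = parts
    # A known type maps to its domain; otherwise the type itself must be a domain.
    domain = KNOWN_TYPES.get(purl_type, purl_type)
    if domain not in available_domains:
        return None
    return f"https://{domain}/{namespace}/{name}"
```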
- macaron.repo_finder.repo_finder.find_source(purl_string, input_repo)
Perform repo and commit finding for a passed PURL, or commit finding for a passed PURL and repo.
- macaron.repo_finder.repo_finder.get_tags_via_git_remote(repo)
Retrieve all tags from a given repository using ls-remote.
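For reference, `git ls-remote --tags` prints one line per ref in the form `<hash><TAB>refs/tags/<name>`, with an extra `^{}` line giving the commit behind an annotated tag. The sketch below parses that output; the function names are hypothetical and the real function may return a different structure.

```python
import subprocess


def parse_ls_remote_tags(output: str) -> dict:
    """Map tag names to hashes, preferring the peeled (^{}) commit when present."""
    tags = {}
    for line in output.splitlines():
        sha, _, ref = line.partition("\t")
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit that an annotated tag points to.
            tags[name[:-3]] = sha
        else:
            tags.setdefault(name, sha)
    return tags


def list_remote_tags(repo: str) -> dict:
    """Run git ls-remote against a remote repository and parse its tags."""
    result = subprocess.run(
        ["git", "ls-remote", "--tags", repo],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ls_remote_tags(result.stdout)
```

Listing tags this way avoids cloning the repository first.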
macaron.repo_finder.repo_finder_base module
This module contains the base class for the repo finders.
macaron.repo_finder.repo_finder_deps_dev module
This module contains the DepsDevRepoFinder class to be used for finding repositories using deps.dev.
- class macaron.repo_finder.repo_finder_deps_dev.DepsDevType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
StrEnum
The package manager types supported by deps.dev.
This enum should be updated based on updates to deps.dev.
- MAVEN = 'maven'
- PYPI = 'pypi'
- NUGET = 'nuget'
- CARGO = 'cargo'
- NPM = 'npm'
- class macaron.repo_finder.repo_finder_deps_dev.DepsDevRepoFinder
Bases:
BaseRepoFinder
This class is used to find repositories using Google’s Open Source Insights, also known as deps.dev.
- find_repo(purl)
Attempt to retrieve a repository URL that matches the passed artifact.
- Parameters:
purl (PackageURL) – The PURL of an artifact.
- Returns:
The URL of the found repository.
- Return type:
macaron.repo_finder.repo_finder_java module
This module contains the JavaRepoFinder class to be used for finding Java repositories.
- class macaron.repo_finder.repo_finder_java.JavaRepoFinder
Bases:
BaseRepoFinder
This class is used to find Java repositories.
- __init__()
Initialise the Java repository finder instance.
macaron.repo_finder.repo_utils module
This module contains the utility functions for repo and commit finder operations.
- macaron.repo_finder.repo_utils.create_filename(purl)
Create the filename of the report based on the PURL.
- Parameters:
purl (PackageURL) – The PackageURL of the artifact.
- Returns:
The filename to save the report under.
- Return type:
- macaron.repo_finder.repo_utils.generate_report(purl, commit, repo, target_dir)
Create the report and save it to the passed directory.
- Parameters:
- Returns:
True if the report was created. False otherwise.
- Return type:
- macaron.repo_finder.repo_utils.create_report(purl, commit, repo)
Generate report for standalone uses of the repo / commit finder.
- macaron.repo_finder.repo_utils.prepare_repo(target_dir, repo_path, branch_name='', digest='', purl=None)
Prepare the target repository for analysis.
If repo_path is a remote path, the target repo is cloned to {target_dir}/{unique_path}. The unique_path of a repository will depend on its remote url. For example, if given the repo_path https://github.com/org/name.git, it will be cloned to {target_dir}/github_com/org/name.
If repo_path is a local path, this method will check if repo_path resolves to a directory inside local_repos_path and to a valid git repository.
- Parameters:
target_dir (str) – The directory where all remote repositories will be cloned.
repo_path (str) – The path to the repository, can be either local or remote.
branch_name (str) – The name of the branch we want to check out.
digest (str) – The hash of the commit that we want to check out in the branch.
purl (PackageURL | None) – The PURL of the analysis target.
- Returns:
The pydriller.Git object of the repository, or None if an error occurs.
- Return type:
Git | None
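The unique_path example above can be reproduced with a short sketch. It assumes the path is built from the host with dots replaced by underscores, followed by the URL path without a trailing `.git`. The function name is hypothetical and the real derivation may differ for other URL shapes.

```python
from urllib.parse import urlparse


def unique_path_sketch(remote_url: str) -> str:
    """Derive a directory path such as github_com/org/name from a remote URL."""
    parsed = urlparse(remote_url)
    host = parsed.netloc.replace(".", "_")
    path = parsed.path.strip("/").removesuffix(".git")
    return f"{host}/{path}"
```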
- macaron.repo_finder.repo_utils.get_local_repos_path()
Get the local repos path from global config or use default.
If the directory does not exist, it is created.
- Return type:
macaron.repo_finder.repo_validator module
This module exists to validate URLs in terms of their use as a repository that can be analyzed.
- macaron.repo_finder.repo_validator.find_valid_repository_url(urls)
Find a valid URL from the provided URLs.
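A minimal sketch of such a validation step is shown below. The allowed hosts, the requirement of an owner and repository segment, and the empty-string result for no match are all assumptions made for illustration; the real validator relies on Macaron's configured git services.

```python
from urllib.parse import urlparse

# Hypothetical default list of accepted git hosting domains.
DEFAULT_ALLOWED_HOSTS = ("github.com", "gitlab.com", "bitbucket.org")


def find_valid_repository_url_sketch(urls, allowed_hosts=DEFAULT_ALLOWED_HOSTS) -> str:
    """Return the first URL that points at <host>/<owner>/<repo> on an allowed host."""
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue
        segments = [segment for segment in parsed.path.split("/") if segment]
        if parsed.hostname in allowed_hosts and len(segments) >= 2:
            # Normalise to https and drop anything after the repository name.
            repo = segments[1].removesuffix(".git")
            return f"https://{parsed.hostname}/{segments[0]}/{repo}"
    return ""
```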