Using Macaron
Note
The instructions below assume that you have set up your environment correctly to run Macaron (if not, please refer to the Installation Guide).
Analyzing an artifact with a PURL string
Macaron can analyze an artifact (and its dependencies) to determine its supply chain security posture. To analyze an artifact, you need to provide the PURL identifier of the artifact:
pkg:<package_type>/<artifact_details>
Where artifact_details varies based on the provided package_type. Examples for those currently supported by Macaron are as follows:
Package Type | PURL String |
---|---|
Maven (Java) | |
PyPi (Python) | |
Cargo (Rust) | |
NuGet (.Net) | |
NPM (NodeJS) | |
For more detailed information on converting a given artifact into a PURL, see PURL Specification and PURL Types.
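As a rough illustration of the pkg:<package_type>/<artifact_details> layout, the following sketch assembles PURL strings in Python. The helper build_purl is hypothetical and not part of Macaron; it assumes a single-segment namespace and only covers the type, namespace, name, version and qualifier parts of the PURL specification.

```python
from urllib.parse import quote, urlencode


def build_purl(package_type, name, version=None, namespace=None, qualifiers=None):
    """Assemble pkg:<type>/<namespace>/<name>@<version>?<qualifiers> (hypothetical helper)."""
    # Percent-encode each path segment; the namespace is optional.
    parts = [quote(segment, safe="") for segment in (namespace, name) if segment]
    purl = f"pkg:{package_type}/" + "/".join(parts)
    if version:
        purl += "@" + quote(version, safe="")
    if qualifiers:
        # Qualifiers are emitted in sorted key order.
        purl += "?" + urlencode(sorted(qualifiers.items()))
    return purl


print(build_purl("pypi", "django", "5.0.6"))
print(build_purl("maven", "maven", "3.9.7", namespace="org.apache.maven",
                 qualifiers={"type": "pom"}))
```

The two calls reproduce PURL strings that are used elsewhere in this guide for Django and Apache Maven.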
To run Macaron on an artifact, we use the following command:
./run_macaron.sh analyze -purl <artifact-purl>
Automated repository and commit finder
Macaron is capable of automatically determining the repository and exact commit that match a given artifact. For repository URLs, this is achieved by examining the SCM metadata found within artifact POM files (for Java), or by using Google’s Open Source Insights API (for other languages). For commits, Macaron will attempt to match repository tags with the artifact version being sought, which requires that the repository uses tags on the commits that were used for releases.
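To give an intuition for why release tags are needed, here is a deliberately simplified sketch of matching tags against an artifact version. It is not Macaron's actual matching logic, which is not described here; the function find_matching_tags and the sample tag names are assumptions made for illustration only.

```python
import re


def find_matching_tags(tags, version):
    """Return tags that end with the version, not preceded by a digit or dot (illustrative)."""
    # The boundary check keeps "14.0.1" from matching version "4.0.1".
    pattern = re.compile(r"(^|[^0-9.])" + re.escape(version) + r"$")
    return [tag for tag in tags if pattern.search(tag)]


# Hypothetical tag list for a repository.
tags = ["v4.0.0", "v4.0.1", "release-4.0.1", "14.0.1", "4.0.10"]
print(find_matching_tags(tags, "4.0.1"))
```

If a repository has no tag that corresponds to the requested version, a matcher of this kind has nothing to select, which is why untagged releases cannot be resolved to a commit this way.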
By default, Macaron will try to discover the corresponding repository of an artifact unless it is already provided as input (as shown later). To disable or otherwise configure this behavior, or others, a custom .ini file should be passed to Macaron during execution. See How to change the default configuration for more details.
For example, under the repofinder header, three options exist: find_repos, use_open_source_insights, and redirect_urls:
find_repos (Values: True or False) - Enables or disables the Repository Finding feature.
use_open_source_insights (Values: True or False) - Enables or disables use of Google’s Open Source Insights API.
redirect_urls (Values: List of URLs) - These are URLs that are known to redirect to actual repository URLs.
To turn off the automatic source repo finding feature, change the following section in the configuration ini file:
[repofinder]
find_repos = False
Within the configuration file under the repofinder.java header, three options exist: artifact_repositories, repo_pom_paths, and find_parents. These options behave as follows:
artifact_repositories (Values: List of URLs) - Determines the remote artifact repositories to attempt to retrieve dependency information from.
repo_pom_paths (Values: List of POM tags) - Determines where to search for repository information in the POM files. E.g. scm.url.
find_parents (Values: True or False) - When enabled, the Repository Finding feature will also search for repository URLs in the parent POM files of the current dependency.
Note
Finding repositories requires at least one remote call, adding some additional overhead to an analysis run.
Note
Google’s Open Source Insights API is currently used to find repositories for: Python, Rust, .Net, NodeJS
An example configuration file for utilising this feature:
[repofinder]
find_repos = True
use_open_source_insights = True
redirect_urls =
    gitbox.apache.org
    git-wip-us.apache.org

[repofinder.java]
artifact_repositories = https://repo.maven.apache.org/maven2
repo_pom_paths =
    scm.url
    scm.connection
    scm.developerConnection
find_parents = True
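To check how such a file is interpreted before passing it to Macaron, it can be loaded with Python's standard configparser module. This is only a sketch: the helper read_list is hypothetical, and it assumes that list values are written one entry per indented line, as in the example configuration.

```python
import configparser

EXAMPLE_CONFIG = """
[repofinder]
find_repos = True
use_open_source_insights = True
redirect_urls =
    gitbox.apache.org
    git-wip-us.apache.org

[repofinder.java]
artifact_repositories = https://repo.maven.apache.org/maven2
repo_pom_paths =
    scm.url
    scm.connection
    scm.developerConnection
find_parents = True
"""


def read_list(parser, section, option):
    """Split a multi-line option value into a list of non-empty entries."""
    return [line.strip() for line in parser.get(section, option).splitlines() if line.strip()]


parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)

print(parser.getboolean("repofinder", "find_repos"))
print(read_list(parser, "repofinder", "redirect_urls"))
print(read_list(parser, "repofinder.java", "repo_pom_paths"))
```

Note that the continuation lines of a multi-line value must be indented; otherwise an INI parser treats each of them as a separate, malformed entry.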
Analyzing a source code repository
Analyzing a public GitHub repository
Macaron can also analyze a public GitHub repository (and potentially the repositories of its dependencies).
To run Macaron on a GitHub public repository, we use the following command:
./run_macaron.sh analyze -rp <repo_path>
With repo_path being the remote path to your target repository.
By default, Macaron will analyze the latest commit of the default branch. However, you could specify the branch and commit digest to run the analysis against:
./run_macaron.sh analyze -rp <repo_path> -b <branch_name> -d <digest>
For example, to analyze the SLSA posture of micronaut-core at branch 4.0.x and commit 82d115b4901d10226552ac67b0a10978cd5bc603, we could use the following command:
./run_macaron.sh analyze -rp https://github.com/micronaut-projects/micronaut-core -b 4.0.x -d 82d115b4901d10226552ac67b0a10978cd5bc603
Note
Macaron automatically detects and analyzes direct dependencies for Java Maven and Gradle projects. This process might take a while and can be skipped by using the --skip-deps option.
Using the same example as above, to skip the analysis of the direct dependencies of micronaut-core, we could use the following command:
./run_macaron.sh analyze -rp https://github.com/micronaut-projects/micronaut-core -b 4.0.x -d 82d115b4901d10226552ac67b0a10978cd5bc603 --skip-deps
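When the analysis is launched from a script, the command line can be assembled programmatically. The sketch below only uses the options shown in this section (-rp, -b, -d and --skip-deps); the helper build_analyze_command is hypothetical and merely builds the argument list without running Macaron.

```python
import shlex


def build_analyze_command(repo_path, branch=None, digest=None, skip_deps=False):
    """Build the argument list for an analyze run (hypothetical helper)."""
    command = ["./run_macaron.sh", "analyze", "-rp", repo_path]
    if branch:
        command += ["-b", branch]
    if digest:
        command += ["-d", digest]
    if skip_deps:
        command.append("--skip-deps")
    return command


command = build_analyze_command(
    "https://github.com/micronaut-projects/micronaut-core",
    branch="4.0.x",
    digest="82d115b4901d10226552ac67b0a10978cd5bc603",
    skip_deps=True,
)
# shlex.join renders the list as a single shell-safe string for logging.
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) from the directory that contains run_macaron.sh.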
Note
By default, Macaron generates report files in the output directory under the current working directory. To understand the structure of this directory, please see Output Files Guide.
With the example above, the generated output reports can be seen here:
Analyzing a GitLab repository
Macaron supports analyzing GitLab repositories, whether they are hosted on gitlab.com or on your self-hosted GitLab instance. The setup differs slightly between these two cases.
Analyzing a repository on gitlab.com
Analyzing a public repository on gitlab.com is quite similar to analyzing a public GitHub repository: you just need to pass a proper GitLab repository URL to macaron analyze.
To analyze a private repository hosted on gitlab.com, you need to obtain a GitLab access token having at least the read_repository permission and store it in the MCN_GITLAB_TOKEN environment variable. For more detailed instructions, see GitLab documentation.
Analyzing a repository on a self-hosted GitLab instance
To analyze a repository on a self-hosted GitLab instance, you need to do the following:
Add the following [git_service.gitlab.self_hosted] section to your .ini config. In the default .ini configuration (generated using macaron dump-defaults, see instructions), this section is already present but commented out. You can start by un-commenting this section and replacing the hostname value with the hostname of your self-hosted GitLab instance.
# Access to a self-hosted GitLab instance (e.g. your organization's self-hosted GitLab instance).
# If this section is enabled, an access token must be provided through the ``MCN_SELF_HOSTED_GITLAB_TOKEN`` environment variable.
# The `read_repository` permission is required for this token.
[git_service.gitlab.self_hosted]
hostname = internal.gitlab.org
Obtain a GitLab access token having at least the read_repository permission and store it in the MCN_SELF_HOSTED_GITLAB_TOKEN environment variable. For more detailed instructions, see GitLab documentation.
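Before starting an analysis, a wrapper script could check that the relevant token is available. The following sketch maps a GitLab hostname to the environment variable named in this section; the helper functions are hypothetical, and the check is only relevant for gitlab.com when the target repository is private.

```python
import os


def required_token_variable(hostname):
    """Return the environment variable that holds the token for this GitLab host."""
    if hostname == "gitlab.com":
        return "MCN_GITLAB_TOKEN"
    # Any other hostname is treated as the configured self-hosted instance.
    return "MCN_SELF_HOSTED_GITLAB_TOKEN"


def token_is_configured(hostname, environ=None):
    """Report whether a non-empty token is set for the given host."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(required_token_variable(hostname)))


# "internal.gitlab.org" is the example hostname from the configuration above.
for host in ("gitlab.com", "internal.gitlab.org"):
    print(host, required_token_variable(host), token_is_configured(host))
```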
Providing a PURL string instead of a repository path
Instead of providing the repository path to analyze a software component, you can use a PURL string for the target git repository.
To simplify the examples, we use the same configurations as above if needed (e.g., for the self-hosted GitLab instances). The PURL string for a git repository should have the following format:
pkg:<git_service_hostname>/<organization>/<name>
The list below shows examples for the corresponding PURL strings for different git repositories:
Repository path |
PURL string |
---|---|
|
Both |
|
Both |
|
|
|
|
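The pkg:<git_service_hostname>/<organization>/<name> format can be derived mechanically from a repository URL. The sketch below is an illustration only: repo_url_to_purl is a hypothetical helper, and it assumes the URL path consists of exactly an organization and a repository name.

```python
from urllib.parse import urlparse


def repo_url_to_purl(repo_url):
    """Convert https://<host>/<organization>/<name> to pkg:<host>/<organization>/<name>."""
    parsed = urlparse(repo_url)
    segments = [segment for segment in parsed.path.split("/") if segment]
    if not parsed.hostname or len(segments) != 2:
        raise ValueError(f"Expected <host>/<organization>/<name>, got: {repo_url}")
    organization, name = segments
    # Drop a trailing ".git" that clone URLs often carry.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return f"pkg:{parsed.hostname}/{organization}/{name}"


print(repo_url_to_purl("https://github.com/micronaut-projects/micronaut-core"))
```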
Run the analysis using the PURL string as follows:
./run_macaron.sh analyze -purl <purl_string>
You can also provide the PURL string together with the repository path. In this case, the PURL string will be used as the unique identifier for the analysis target. If providing a PURL with a version, providing the repository path as well is sufficient for analysis to take place. If providing a PURL without a version, the branch and digest must also be provided alongside the repository path. Examples of both use cases follow.
Analyzing a PURL (with an included version) and a repository path:
./run_macaron.sh analyze -purl <purl_string_with_version> -rp <repo_path>
Analyzing a PURL (without an included version) and a repository path (with a digest and branch):
./run_macaron.sh analyze -purl <purl_string> -rp <repo_path> -b <branch> -d <digest>
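The rule about which inputs must accompany a PURL can be expressed as a small pre-flight check. This is a sketch under the assumptions stated in the paragraph above; the functions purl_has_version and missing_inputs are hypothetical and do not reflect how Macaron validates its arguments internally.

```python
def purl_has_version(purl):
    """Check whether the PURL carries an @<version> part."""
    # Strip the optional subpath (#...) and qualifiers (?...) first.
    body = purl.split("#", 1)[0].split("?", 1)[0]
    # The version, if any, is attached to the last path segment.
    return "@" in body.rsplit("/", 1)[-1]


def missing_inputs(purl, repo_path=None, branch=None, digest=None):
    """Return the names of inputs that still need to be supplied (empty if none)."""
    if repo_path is None or purl_has_version(purl):
        return []
    # A PURL without a version plus a repository path needs both branch and digest.
    return [name for name, value in (("branch", branch), ("digest", digest)) if not value]


print(missing_inputs("pkg:maven/org.apache.maven/maven@3.9.7?type=pom", repo_path="<repo_path>"))
print(missing_inputs("pkg:github.com/micronaut-projects/micronaut-core",
                     repo_path="https://github.com/micronaut-projects/micronaut-core",
                     branch="4.0.x"))
```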
Verifying provenance expectations in CUE language
When a project generates provenances, you can add a build expectation in the form of a Configure Unify Execute (CUE) policy to check the content of provenances. For instance, the expectation can specify the accepted GitHub Actions workflows that trigger a build, which helps prevent the use of artifacts built by attackers' workflows.
./run_macaron.sh analyze -pe micronaut-core.cue -rp https://github.com/micronaut-projects/micronaut-core -b 4.0.x -d 82d115b4901d10226552ac67b0a10978cd5bc603 --skip-deps
where the micronaut-core.cue file can contain:
{
    target: "pkg:github.com/micronaut-projects/micronaut-core",
    predicate: {
        invocation: {
            configSource: {
                uri: =~"^git\\+https://github.com/micronaut-projects/micronaut-core@refs/tags/v[0-9]+.[0-9]+.[0-9]+$"
                entryPoint: ".github/workflows/release.yml"
            }
        }
    }
}
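To see which uri values the expectation above accepts, the regular expression can be tried out in Python. This is only an approximation: CUE uses its own regular expression engine, although for this particular pattern Python's re module should behave the same. The STRICT variant with escaped dots is an assumption added for comparison and is not part of the original expectation.

```python
import re

# The pattern from the expectation, with the CUE string escape "\\+" resolved to "\+".
ORIGINAL = re.compile(
    r"^git\+https://github.com/micronaut-projects/micronaut-core@refs/tags/v[0-9]+.[0-9]+.[0-9]+$"
)
# Hypothetical stricter variant: dots are escaped so they only match a literal ".".
STRICT = re.compile(
    r"^git\+https://github\.com/micronaut-projects/micronaut-core@refs/tags/v[0-9]+\.[0-9]+\.[0-9]+$"
)

BASE = "git+https://github.com/micronaut-projects/micronaut-core@refs/"

for uri in (BASE + "tags/v4.0.1", BASE + "heads/4.0.x", BASE + "tags/v4x0y1"):
    print(uri, bool(ORIGINAL.match(uri)), bool(STRICT.match(uri)))
```

Because an unescaped dot matches any character, the original pattern also accepts a tag such as v4x0y1; escaping the dots restricts the match to ordinary three-part version tags.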
Note
The provenance expectation is verified via the provenance_expectation check in Macaron. You can see the result of this check in the HTML or JSON report and see if the provenance found by Macaron meets the expectation CUE file.
Analyzing with an SBOM
Macaron can run the analysis against an existing SBOM in CycloneDX format that contains all the necessary information about the dependencies of a target software component. In this case, the dependencies will not be resolved automatically.
CycloneDX provides open-source SBOM generators for different types of projects (e.g., Maven, Gradle, etc.). For instructions on generating a CycloneDX SBOM for your project, see CycloneDX documentation.
SBOM for Maven projects
For example, let’s analyze the dependencies of pkg:maven/org.apache.maven/maven@3.9.7?type=pom, using the SBOM generated by CycloneDX Maven plugin.
To run the analysis against that SBOM, run this command:
./run_macaron.sh analyze -purl pkg:maven/org.apache.maven/maven@3.9.7?type=pom -sbom <path_to_sbom>
Where path_to_sbom is the path to the SBOM you want to use.
SBOM for Python projects
For Python projects, you can use cyclonedx-py to generate the SBOM. First install the package in a virtual environment, and then use cyclonedx-py to generate an SBOM for it. Here is an example:
python -m venv .django_venv # Create a virtual environment called .django_venv
.django_venv/bin/pip install django==5.0.6 # Install the package in the virtual environment
cyclonedx-py environment .django_venv --output-format json --outfile django_sbom.json # Generate the SBOM
Then run Macaron and pass the SBOM file as input:
./run_macaron.sh analyze -purl pkg:pypi/django@5.0.6 -sbom <path_to_django_sbom.json>
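Before passing an SBOM to Macaron, it can be useful to list the dependencies it declares. The sketch below reads the components array of a CycloneDX JSON document and collects the purl fields. The embedded sample document, including the dependency versions, is a made-up illustration and not the output of an actual cyclonedx-py run.

```python
import json

# Hypothetical, heavily trimmed CycloneDX document for illustration.
SAMPLE_SBOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "asgiref", "version": "3.8.1", "purl": "pkg:pypi/asgiref@3.8.1"},
    {"type": "library", "name": "sqlparse", "version": "0.5.0", "purl": "pkg:pypi/sqlparse@0.5.0"},
    {"type": "library", "name": "no-purl-component", "version": "1.0"}
  ]
}
"""


def list_dependency_purls(sbom_text):
    """Return the PURL of every component that declares one."""
    bom = json.loads(sbom_text)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("Not a CycloneDX JSON document")
    return [component["purl"] for component in bom.get("components", []) if "purl" in component]


print(list_dependency_purls(SAMPLE_SBOM))
```

For a real file, replace SAMPLE_SBOM with the contents of the generated SBOM, for example the django_sbom.json file produced above.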
Analyzing dependencies in the SBOM without the main software component
In the case where the repository URL of the main software component is not available (e.g. the repository is in a self-hosted git service instance that Macaron cannot access), Macaron can still run the analysis on the dependencies listed in the SBOM.
To do that, you must first create a PURL to represent the main software component, e.g., pkg:maven/private.apache.maven/maven@4.0.0-alpha-1-SNAPSHOT?type=pom.
Then the analysis can be run as follows:
./run_macaron.sh analyze -purl pkg:maven/private.apache.maven/maven@4.0.0-alpha-1-SNAPSHOT?type=pom -sbom <path_to_sbom>
Where path_to_sbom is the path to the SBOM you want to use.
Analyzing dependencies using Python virtual environment
Macaron can automatically identify and analyze the dependencies of a Python package if you provide the path to the virtual environment where the package is installed.
Let’s say you want to analyze django@5.0.6 and its dependencies. First create a virtual environment and install django@5.0.6:
python3.11 -m venv /tmp/.django_venv
/tmp/.django_venv/bin/pip install django==5.0.6
Then run Macaron as follows:
./run_macaron.sh analyze -purl pkg:pypi/django@5.0.6 --python-venv "/tmp/.django_venv"
Where the value of --python-venv is the path to the virtual environment.
Alternatively, you can create an SBOM for the Python package and provide it to Macaron as input as explained here.
Note
We only support Python 3.11 for this feature of Macaron. Please make sure to install the package using this version of Python.
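One way to confirm which Python version a virtual environment was created with is to read its pyvenv.cfg file. The following sketch parses the text of such a file; it assumes the environment was created by the standard venv module (which writes a version key) or by virtualenv (which writes version_info). The helper name and the sample file content are illustrative.

```python
def venv_python_version(pyvenv_cfg_text):
    """Return (major, minor) from pyvenv.cfg text, or None if no version is recorded."""
    for line in pyvenv_cfg_text.splitlines():
        key, separator, value = line.partition("=")
        if separator and key.strip() in ("version", "version_info"):
            major, minor = value.strip().split(".")[:2]
            return int(major), int(minor)
    return None


# Hypothetical content of /tmp/.django_venv/pyvenv.cfg.
SAMPLE_CFG = """home = /usr/bin
include-system-site-packages = false
version = 3.11.4
"""

print(venv_python_version(SAMPLE_CFG) == (3, 11))
```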
Analyzing a repository on the local file system
Note
We assume that the origin remote exists in the cloned repository and check out the relevant commits from origin only.
Macaron supports analyzing a repository on the local file system.
Analyzing a repository whose git service is not supported by Macaron
If the repository remote URL is from an unknown git service (see Git Services for a list of supported git services in Macaron), Macaron won’t recognize it when analyzing the repository.
You would need to tell Macaron about that git service through the defaults.ini config.
For example, let’s say you want to analyze a repository hosted at https://git.example.com/foo/target. First, you need to create a defaults.ini file in the current working directory with the following content:
[git_service.local_repo]
hostname = git.example.com
In which hostname contains the hostname of the git service URL. In this example it is git.example.com.
Note
This defaults.ini section must only be used for analyzing a repository on the local file system. If the hostname is already supported by another git service, it doesn’t need to be defined again here.
Assume that the dir tree at the current working directory has the following structure:
boo
├── foo
│ └── target
We can run Macaron against the local repository at target by using this command:
./run_macaron.sh --local-repos-path ./boo/foo --defaults-path ./defaults.ini analyze --repo-path target <rest_of_args>
With rest_of_args being the arguments to the analyze command (e.g. --branch/-b, --digest/-d or --skip-deps, similar to the two previous examples).
The --local-repos-path/-lr flag tells Macaron to look into ./boo/foo for local repositories. For more information, please see Command Line Usage.
Note
If --local-repos-path/-lr is not provided, Macaron will look inside <current_working_directory>/output/git_repos/local_repos/ whenever you provide a local path to --repo-path/-rp.
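The way the two options combine into a directory on disk can be pictured with the following sketch. It mirrors the behavior described in this section but is not Macaron's implementation; the function resolve_local_repo is hypothetical.

```python
from pathlib import Path


def resolve_local_repo(repo_path, local_repos_path=None, cwd="."):
    """Return the directory a local --repo-path value would refer to (illustrative)."""
    if local_repos_path:
        base = Path(local_repos_path)
    else:
        # Fallback location used when --local-repos-path/-lr is not given.
        base = Path(cwd) / "output" / "git_repos" / "local_repos"
    return base / repo_path


print(resolve_local_repo("target", "./boo/foo"))
print(resolve_local_repo("target", cwd="/work"))
```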
Analyzing a local repository with supported git service
If the local repository you want to analyze has a remote origin hosted on a supported git service, you can run the analysis directly without having to prepare defaults.ini as above.
Assume that the dir tree at the current working directory has the following structure:
boo
├── foo
│ └── target
We can run Macaron against the local repository at target by using this command:
./run_macaron.sh --local-repos-path ./boo/foo analyze --repo-path target <rest_of_args>
With rest_of_args being the arguments to the analyze command (e.g. --branch/-b, --digest/-d or --skip-deps, similar to the two previous examples).
The --local-repos-path/-lr flag tells Macaron to look into ./boo/foo for local repositories. For more information, please see Command Line Usage.
Note
If --local-repos-path/-lr is not provided, Macaron will look inside <current_working_directory>/output/git_repos/local_repos/ whenever you provide a local path to --repo-path/-rp.
Warning
Macaron by default analyzes the current state of the local repository. However, if the user provides a branch or commit hash as input, Macaron may reset the index and working tree of the repository to check out a specific commit. Therefore, any uncommitted changes in the repository need to be backed up to prevent loss (these include unstaged changes, staged changes and untracked files). However, Macaron will not modify the history of the repository.
Running the policy engine
Macaron’s policy engine accepts policies specified in Datalog. An example policy can verify if a project and all its dependencies pass certain checks. We use Soufflé as the Datalog engine in Macaron. Once you run the checks on a target project as described here, the check results will be stored in macaron.db in the output directory. We pass the check results to the policy engine by providing the path to macaron.db together with a Datalog policy file to be validated by the policy engine.
In the Datalog policy file, we must specify the identifier of the target software component that the policy should be validated against. There are two ways to specify the target software component in the Datalog policy file:
Using the complete name of the target component (e.g. github.com/oracle-quickstart/oci-micronaut).
Using the PURL string of the target component (e.g. pkg:github.com/oracle-quickstart/oci-micronaut@<commit_sha>).
We use the Micronaut MuShop project as a case study to show how to run the policy engine. Micronaut MuShop is a cloud-native microservices example for Oracle Cloud Infrastructure. When we run Macaron on the Micronaut MuShop GitHub project, it automatically finds the project’s dependencies and runs checks for the top-level project and dependencies independently. For example, the build service check, as defined in SLSA, analyzes the CI configurations to determine if its artifacts are built using a build service. Another example is the check that determines whether a SLSA provenance document is available for an artifact. If so, it verifies whether the provenance document attests to the produced artifacts. For the Micronaut MuShop project, Macaron identifies 48 dependencies that map to 24 unique repositories and generates an HTML report that summarizes the check results.
Now we can run the policy engine over these results and enforce a policy:
./run_macaron.sh verify-policy -o outputs -d outputs/macaron.db --file <policy_file>
In this example, the Datalog policy files for both ways (as mentioned previously) are provided in oci-micronaut-repo.dl and oci-micronaut-purl.dl.
The differences between the two policy files can be observed below:
apply_policy_to("oci_micronaut_dependencies", repo_id) :- is_repo(repo_id, "github.com/oracle-quickstart/oci-micronaut", _).
apply_policy_to("oci_micronaut_dependencies", component_id) :- is_component(component_id, "<target_software_component_purl>").
The PURL string for the target software component is printed to the console by the analyze command. For example:
> ./run_macaron.sh analyze -rp https://github.com/oracle-quickstart/oci-micronaut
> ...
> 2023-08-15 14:36:56,672 [INFO] The PURL string for the main target software component in this analysis is
'pkg:github.com/oracle-quickstart/oci-micronaut@3ebe0c9520a25feeae983eac6eb956de7da29ead'.
> 2023-08-15 14:36:56,672 [INFO] Analysis Completed!
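If the analysis is driven from a script, the PURL can be pulled out of the captured console output and placed into the policy file. The sketch below relies solely on the wording of the log message shown above; the message format may differ between Macaron versions, and extract_target_purl is a hypothetical helper.

```python
import re

# Console output copied from the example above.
LOG = """2023-08-15 14:36:56,672 [INFO] The PURL string for the main target software component in this analysis is
'pkg:github.com/oracle-quickstart/oci-micronaut@3ebe0c9520a25feeae983eac6eb956de7da29ead'.
2023-08-15 14:36:56,672 [INFO] Analysis Completed!
"""


def extract_target_purl(log_text):
    """Return the quoted PURL from the analyze log, or None if the message is absent."""
    match = re.search(
        r"main target software component in this analysis is\s+'([^']+)'", log_text
    )
    return match.group(1) if match else None


print(extract_target_purl(LOG))
```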
This example policy can verify if the Micronaut MuShop project and all its dependencies pass the build_service check and the Micronaut provenance documents meet the expectation provided as a CUE file.
Thanks to Datalog’s expressive language model, it’s easy to add exception rules if certain dependencies do not meet a requirement. For example, the MySQL Connector/J dependency in the Micronaut MuShop project does not pass the build_service check, but can be manually investigated and exempted if trusted. Overall, policies expressed in Datalog can be enforced by Macaron as part of your CI/CD pipeline to detect regressions or unexpected behavior.
Modifying the default configuration
See dump-defaults, the CLI command to dump the default configurations in defaults.ini. After making changes, see analyze CLI command for the option to pass the modified defaults.ini file.
For example, to turn off the automatic source repo finding feature, change the following section in the configuration ini file:
[repofinder]
find_repos = False
Then run Macaron passing the modified configuration file:
./run_macaron.sh -dp <path-to-modified-default.ini> analyze -purl <artifact-purl>
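The same change can be scripted when the configuration is generated automatically. The following sketch uses Python's configparser to set find_repos to False in the text of a configuration file. Note that configparser discards comments when it writes the file back, so editing the file by hand may be preferable if the comments in defaults.ini should be kept. The function disable_repo_finder is a hypothetical helper.

```python
import configparser
import io


def disable_repo_finder(defaults_text):
    """Return the configuration text with [repofinder] find_repos set to False."""
    # Interpolation is disabled so that "%" characters in values are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(defaults_text)
    if not parser.has_section("repofinder"):
        parser.add_section("repofinder")
    parser.set("repofinder", "find_repos", "False")
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(disable_repo_finder("[repofinder]\nfind_repos = True\nuse_open_source_insights = True\n"))
```

The returned text can be written to a file and passed to Macaron through the -dp option as shown above.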