macaron.slsa_analyzer.git_service package
The git_service package contains the supported git services for Macaron.
Submodules
macaron.slsa_analyzer.git_service.api_client module
The module provides API clients for VCS services, such as GitHub.
- class macaron.slsa_analyzer.git_service.api_client.GitHubReleaseAsset(name: str, url: str, size_in_bytes: int, api_client: GhAPIClient)
Bases:
NamedTuple
An asset published from a GitHub Release.
-
api_client:
GhAPIClient
The GitHub API client.
- class macaron.slsa_analyzer.git_service.api_client.BaseAPIClient
Bases:
object
This is the base class for API clients.
- get_latest_release(full_name)
Return the latest release for the repo.
- fetch_assets(release, ext='')
Return the release assets that match, or an empty sequence if none exist.
The extension is ignored if name is set.
- download_asset(url, download_path)
Download the assets of the release that match the pattern (if specified).
- get_file_link(full_name, commit_sha, file_path)
Return a hyperlink to the file.
- class macaron.slsa_analyzer.git_service.api_client.GhAPIClient(profile)
Bases:
BaseAPIClient
This class acts as a client to use GitHub API.
See https://docs.github.com/en/rest for the GitHub API documentation.
- __init__(profile)
Initialize the GhAPIClient.
- Parameters:
profile (dict) – The JSON object describing the profile to be included in each request by this client.
- get_repo_workflow_data(full_name, workflow_name)
Query GitHub REST API for the information of a workflow.
The url would be in the following form:
https://api.github.com/repos/{full_name}/actions/workflows/{workflow_name}
- Parameters:
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to
https://api.github.com/repos/owner/repo/actions/workflows/build.yml
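The URL form and the "empty dict if failed" convention described above can be sketched as follows. This is only an illustration, not Macaron's implementation: `workflow_url`, `query_json` and the injected `fetch` callable are hypothetical names introduced here so the example runs without network access.

```python
import json
from typing import Callable

API_ROOT = "https://api.github.com"


def workflow_url(full_name: str, workflow_name: str) -> str:
    """Build the workflow endpoint URL (hypothetical helper)."""
    return f"{API_ROOT}/repos/{full_name}/actions/workflows/{workflow_name}"


def query_json(url: str, fetch: Callable[[str], str]) -> dict:
    """Return the decoded JSON object, or an empty dict on any failure.

    `fetch` is an assumed callable that takes a URL and returns the
    response body as text; a real client would perform an HTTP GET here.
    """
    try:
        result = json.loads(fetch(url))
    except (OSError, ValueError):
        # Network errors and malformed JSON both map to an empty dict.
        return {}
    return result if isinstance(result, dict) else {}


url = workflow_url("owner/repo", "build.yml")
data = query_json(url, lambda _url: '{"name": "build"}')
```

Returning an empty dict rather than raising lets callers use a simple truthiness check on the result.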
- get_workflow_runs(full_name, branch_name=None, created_after=None, page=1)
Query the GitHub REST API for the data of all workflow runs of a repository.
The url would be in the following form:
https://api.github.com/repos/{full_name}/actions/runs?page={page}&branch={branch_name}&created=>={created_after}&per_page={MAX_ITEMS_NUM}
The branch_name and created_after parameters can be empty. MAX_ITEMS_NUM can be configured via the defaults.ini.
- Parameters:
full_name (str) – The full name of the target repo in the form owner/repo.
branch_name (str | None) – The name of the branch to look for workflow runs (e.g. master).
created_after (str) – Only look for workflow runs after this date (e.g. 2022-03-11T16:44:40Z).
page (int) – The page number to query, as the workflow run we want might be on a different page (results are limited to 100 items per page).
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to
https://api.github.com/repos/owner/repo/actions/runs?page=1&branch=master&created=>=2022-03-11T16:44:40Z&per_page=100
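A minimal sketch of how such a query string could be assembled, assuming that empty optional parameters are simply left out. The function name `workflow_runs_url` and the `per_page` argument are hypothetical; the `>=` and `:` characters are kept unescaped only to mirror the documented URL form, whereas an HTTP client would normally percent-encode them.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def workflow_runs_url(
    full_name: str,
    branch_name: Optional[str] = None,
    created_after: Optional[str] = None,
    page: int = 1,
    per_page: int = 100,  # stands in for MAX_ITEMS_NUM (assumed value)
) -> str:
    """Build the workflow runs query URL, skipping empty optional filters."""
    params: Dict[str, object] = {"page": page}
    if branch_name:
        params["branch"] = branch_name
    if created_after:
        # GitHub's "created" filter accepts a comparison prefix such as ">=".
        params["created"] = f">={created_after}"
    params["per_page"] = per_page
    query = urlencode(params, safe=">=:")
    return f"https://api.github.com/repos/{full_name}/actions/runs?{query}"


runs_url = workflow_runs_url("owner/repo", "master", "2022-03-11T16:44:40Z")
```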
- get_workflow_run_jobs(full_name, run_id)
Query the GitHub REST API for the workflow run jobs.
The url would be in the following form:
https://api.github.com/repos/{full_name}/actions/runs/<run_id>/jobs
- Parameters:
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to
https://api.github.com/repos/{full_name}/actions/runs/<run_id>/jobs
- get_workflow_run_for_date_time_range(full_name, datetime_range)
Query the GitHub REST API for the workflow run within a datetime range.
The url would be in the following form:
https://api.github.com/repos/{full_name}/actions/runs?created=datetime-range
- Parameters:
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to
https://api.github.com/repos/owner/repo/actions/runs?created=2022-11-05T20:38:40..2022-11-05T20:38:58
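The datetime range in the example uses the `start..end` form. A small sketch of producing such a range string from two `datetime` objects is shown below; `created_range` is a hypothetical helper, and whether Macaron builds the range this way is an assumption.

```python
from datetime import datetime


def created_range(start: datetime, end: datetime) -> str:
    """Format two datetimes as a 'start..end' range with second precision."""
    if end < start:
        raise ValueError("The end of the range must not precede its start.")
    return (
        f"{start.isoformat(timespec='seconds')}"
        f"..{end.isoformat(timespec='seconds')}"
    )


datetime_range = created_range(
    datetime(2022, 11, 5, 20, 38, 40),
    datetime(2022, 11, 5, 20, 38, 58),
)
range_url = f"https://api.github.com/repos/owner/repo/actions/runs?created={datetime_range}"
```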
- get_commit_data_from_hash(full_name, commit_hash)
Query the GitHub API for the data of a commit using the hash for that commit.
The url would be in the following form:
https://api.github.com/repos/{full_name}/commits/{commit_hash}
- Parameters:
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to:
https://api.github.com/repos/owner/repo/commits/6dcb09b5b57875f334f61aebed695e2e4193db5e
gh_client.get_commit_data_from_hash(
    full_name="owner/repo",
    commit_hash="6dcb09b5b57875f334f61aebed695e2e4193db5e",
)
- search(target, query)
Perform a search using GitHub REST API.
This query is at endpoint:
api.github.com/search/{target}?{query}
- Parameters:
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
The following call to this method will perform a query to:
https://api.github.com/search/code?q=addClass+in:file+language:js+repo:jquery/jquery
gh_client.search(
    target="code",
    query="q=addClass+in:file+language:js+repo:jquery/jquery",
)
- get(url)
Perform a GET request to the given URL.
- get_job_build_log(log_url)
Download and return the build log indicated at log_url.
- get_repo_data(full_name)
Get the repo data using GitHub REST API.
The query is at endpoint:
api.github.com/repos/{full_name}
- Parameters:
full_name (str) – The full name of the repository in the format {owner/name}.
- Returns:
The json query result or an empty dict if failed.
- Return type:
Examples
To get the repo data from repository apache/maven:
gh_client.get_repo_data("apache/maven")
- get_file_link(full_name, commit_sha, file_path)
Return a GitHub hyperlink tag or just a link to the file.
The format for the link is https://github.com/{full_name}/blob/{commit_sha}/{file_path}. The path of the file is relative to the root dir of the repository. The commit sha must be in full form.
- Parameters:
- Returns:
The hyperlink tag to the file.
- Return type:
Examples
>>> api_client = GhAPIClient(profile={"headers": "", "query": []})
>>> api_client.get_file_link("owner/repo", "5aaaaa43caabbdbc26c254df8f3aaa7bb3f4ec01", ".travis_ci.yml")
'https://github.com/owner/repo/blob/5aaaaa43caabbdbc26c254df8f3aaa7bb3f4ec01/.travis_ci.yml'
- get_relative_path_of_workflow(workflow_name)
Return the relative path of the workflow from the root dir of the repo.
- Parameters:
workflow_name (str) – The file name of the YAML GitHub Actions workflow.
- Returns:
The relative path of the workflow from the root dir of the repo.
- Return type:
Examples
>>> api_client = GhAPIClient(profile={"headers": "", "query": []})
>>> api_client.get_relative_path_of_workflow("build.yaml")
'.github/workflows/build.yaml'
- get_release_by_tag(full_name, tag)
Return the release of the passed tag.
- get_latest_release(full_name)
Return the latest release for the repo.
- Parameters:
full_name (str) – The full name of the repo.
- Returns:
The latest release object in JSON format. Schema: https://docs.github.com/en/rest/releases/releases?apiVersion=2022-11-28#get-the-latest-release.
- Return type:
- fetch_assets(release, ext='')
Return the release assets that match, or an empty sequence if none exist.
The extension is ignored if name is set.
- Parameters:
release (dict) – The release payload in JSON format. Schema: https://docs.github.com/en/rest/releases/releases?apiVersion=2022-11-28#get-the-latest-release.
ext (str) – The asset extension to find; this parameter is ignored if name is set.
- Returns:
A sequence of release assets.
- Return type:
Sequence[AssetLocator]
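As a rough illustration of filtering a release payload by extension, the sketch below relies only on the `assets` list and the `name`, `url` and `size` fields of GitHub's release schema. The names `ReleaseAsset` and `fetch_assets_sketch` are hypothetical, and the suffix-based matching rule is an assumption rather than Macaron's actual logic.

```python
from typing import Any, Dict, List, NamedTuple


class ReleaseAsset(NamedTuple):
    """A simplified stand-in for an asset record (no API client attached)."""

    name: str
    url: str
    size_in_bytes: int


def fetch_assets_sketch(release: Dict[str, Any], ext: str = "") -> List[ReleaseAsset]:
    """Return the assets whose file name ends with `ext`; all assets if `ext` is empty."""
    found = []
    for item in release.get("assets") or []:
        name = item.get("name", "")
        if name.endswith(ext):
            found.append(ReleaseAsset(name, item.get("url", ""), int(item.get("size", 0))))
    return found


release = {
    "assets": [
        {"name": "provenance.intoto.jsonl", "url": "https://example.invalid/1", "size": 10},
        {"name": "notes.txt", "url": "https://example.invalid/2", "size": 5},
    ]
}
matching = fetch_assets_sketch(release, ext=".jsonl")
```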
macaron.slsa_analyzer.git_service.base_git_service module
This module contains the BaseGitService class to be inherited by a git service.
- class macaron.slsa_analyzer.git_service.base_git_service.BaseGitService(name)
Bases:
object
This abstract class is used to implement git services.
- abstract load_defaults()
Load the values for this git service from the ini configuration.
- Return type:
- load_hostname(section_name)
Load the hostname of the git service from the ini configuration section section_name.
The section may or may not be available in the configuration. In both cases, the method should not raise ConfigurationError.
Meanwhile, if the section is present but there is a schema violation (e.g. a key such as hostname is missing), this method will raise a ConfigurationError.
- Parameters:
section_name (str) – The name of the git service section in the ini configuration file.
- Returns:
The hostname. This can be None if the git service section is not found in the ini configuration file, meaning the user does not enable the corresponding git service.
- Return type:
str | None
- Raises:
ConfigurationError – If there is a schema violation in the git service section.
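The three outcomes described for load_hostname (section absent, section valid, section present but missing hostname) can be sketched with the standard `configparser` module. This is an assumption-laden illustration: the local `ConfigurationError` class and `load_hostname_sketch` are stand-ins, and the hostname value `gitlab.com` is made up for the example.

```python
import configparser
from typing import Optional


class ConfigurationError(Exception):
    """Local stand-in for Macaron's ConfigurationError."""


def load_hostname_sketch(config: configparser.ConfigParser, section_name: str) -> Optional[str]:
    """Return the hostname, None if the section is absent, or raise on a schema violation."""
    if not config.has_section(section_name):
        # The git service is not enabled by the user.
        return None
    hostname = config.get(section_name, "hostname", fallback=None)
    if not hostname:
        raise ConfigurationError(f"The hostname key is missing in section [{section_name}].")
    return hostname


config = configparser.ConfigParser()
config.read_string(
    "[git_service.gitlab.publicly_hosted]\n"
    "hostname = gitlab.com\n"
    "[git_service.gitlab.self_hosted]\n"
    "unrelated_key = value\n"
)
public_hostname = load_hostname_sketch(config, "git_service.gitlab.publicly_hosted")
```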
- is_detected(url)
Check if the remote repo at the given url is hosted on this git service.
This check is done by checking the URL of the repo against the hostname of this git service.
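One way to picture the hostname comparison is the sketch below, which uses `urllib.parse.urlparse`. The helper `is_detected_sketch` is hypothetical, and it assumes exact, case-insensitive hostname equality; SSH-style remotes such as `git@host:owner/repo.git` are not handled here.

```python
from typing import Optional
from urllib.parse import urlparse


def is_detected_sketch(url: str, hostname: Optional[str]) -> bool:
    """Return True if the URL's hostname equals the configured service hostname."""
    if hostname is None:
        # The service is not configured, so it cannot host any repository.
        return False
    return urlparse(url).hostname == hostname.lower()


on_github = is_detected_sketch("https://github.com/owner/repo", "github.com")
```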
- abstract clone_repo(clone_dir, url)
Clone a repository.
- Parameters:
- Raises:
CloneError – If there is an error cloning the repo.
- Return type:
- abstract check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
- Parameters:
- Returns:
The same Git object from the input.
- Return type:
Git
- Raises:
RepoError – If there is an error while checking out the specific branch or commit.
- class macaron.slsa_analyzer.git_service.base_git_service.NoneGitService
Bases:
BaseGitService
This class can be used to initialize an empty git service.
- __init__()
Initialize instance.
- load_defaults()
Load the values for this git service from the ini configuration.
In this particular case, since this class represents a None git service, we do nothing.
- Return type:
- is_detected(url)
Return True if the remote repo is using this git service.
- clone_repo(_clone_dir, url)
Clone a repo.
In this particular case, since this class represents a None git service, we do nothing but raise a CloneError.
- Raises:
CloneError – Always raised, since this method should not be used to clone any repository.
- Return type:
- check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
In this particular case, since this class represents a None git service, we do nothing but raise a RepoError.
- Raises:
RepoError – Always raised, since this method should not be used to check out any repository.
- Return type:
Git
macaron.slsa_analyzer.git_service.bitbucket module
This module contains the spec for the BitBucket service.
- class macaron.slsa_analyzer.git_service.bitbucket.BitBucket
Bases:
BaseGitService
This class contains the spec of the BitBucket service.
- __init__()
Initialize instance.
- check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
- Return type:
Git
macaron.slsa_analyzer.git_service.github module
This module contains the spec for the GitHub service.
- class macaron.slsa_analyzer.git_service.github.GitHub
Bases:
BaseGitService
This class contains the spec of the GitHub service.
- __init__()
Initialize instance.
- load_defaults()
Load the values for this git service from the ini configuration and environment variables.
- Raises:
ConfigurationError – If there is an error loading the configuration.
- Return type:
- property api_client: GhAPIClient
Return the API client used for querying GitHub API.
This API is used to check if a GitHub repo can be cloned.
- clone_repo(clone_dir, url)
Clone a GitHub repository.
- Return type:
- clone_dir: str
The name of the directory to clone into. This is equivalent to the <directory> argument of git clone.
- url: str
The url to the repository.
- Raises:
CloneError – If there is an error cloning the repo.
- check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
- Parameters:
- Returns:
The same Git object from the input.
- Return type:
Git
- Raises:
RepoError – If there is an error while checking out the specific branch and digest.
macaron.slsa_analyzer.git_service.gitlab module
This module contains the spec for the GitLab service.
Note: We are making the assumption that we are only supporting two different GitLab services: one is called publicly_hosted and the other is called self_hosted.
The corresponding access tokens are stored in the environment variables MCN_GITLAB_TOKEN and MCN_SELF_HOSTED_GITLAB_TOKEN, respectively.
The reason for this is mostly our assumption that Macaron is used as a container. Fixing static names for the environment variables allows for easier propagation of these variables into the container.
In the ini configuration file, settings for the publicly_hosted GitLab service are in the [git_service.gitlab.publicly_hosted] section; settings for the self_hosted GitLab service are in the [git_service.gitlab.self_hosted] section.
- class macaron.slsa_analyzer.git_service.gitlab.GitLab(token_function)
Bases:
BaseGitService
This class contains the spec of the GitLab service.
- __init__(token_function)
Initialize instance.
- Parameters:
token_function (Callable[[], str]) – A function that returns a token when called.
- construct_clone_url(url)
Construct a clone URL for GitLab, with or without access token.
- Parameters:
url (str) – The URL of the repository to be cloned.
- Returns:
The URL that is actually used for cloning, containing the access token. See GitLab documentation: https://docs.gitlab.com/ee/gitlab-basics/start-using-git.html#clone-using-a-token.
- Return type:
- Raises:
CloneError – If there is an error parsing the URL.
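Embedding a token in an https clone URL might look like the sketch below. It is not Macaron's code: `CloneError` is a local stand-in, `construct_clone_url_sketch` is a hypothetical name, and both the `oauth2` user name and the restriction to https URLs are assumptions made for the illustration.

```python
from urllib.parse import urlparse


class CloneError(Exception):
    """Local stand-in for Macaron's CloneError."""


def construct_clone_url_sketch(url: str, token: str) -> str:
    """Return the clone URL, with the token embedded when one is available."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise CloneError(f"Cannot parse an https clone URL from {url!r}.")
    if not token:
        # Without a token, the URL is used as-is.
        return url
    # Drop any existing user info before adding the token.
    host = parsed.netloc.rpartition("@")[2]
    return parsed._replace(netloc=f"oauth2:{token}@{host}").geturl()


clone_url = construct_clone_url_sketch("https://gitlab.com/owner/repo.git", "abc123")
```

Because the resulting URL contains a secret, it should not be logged or persisted.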
- clone_repo(clone_dir, url)
Clone a repository.
To clone a GitLab repository with access token, we embed the access token in the https URL. See GitLab documentation: https://docs.gitlab.com/ee/gitlab-basics/start-using-git.html#clone-using-a-token.
If we clone using the https URL with the token embedded, this URL will be stored as plain text in .git/config as the origin remote URL. Therefore, after a repository is cloned, this remote origin URL will be set with the value of the original url (which does not have the embedded token).
- Parameters:
- Raises:
CloneError – If there is an error cloning the repository.
- Return type:
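The clone-then-reset sequence described for clone_repo corresponds roughly to two git invocations. The sketch below only builds the argument lists so that it stays runnable without git or network access; `clone_commands` is a hypothetical helper, and how Macaron actually invokes git is not shown in this reference.

```python
from typing import List


def clone_commands(clone_dir: str, url: str, token_url: str) -> List[List[str]]:
    """Return the git commands for a token-based clone followed by an origin reset."""
    return [
        # Clone using the URL that carries the access token.
        ["git", "clone", token_url, clone_dir],
        # Overwrite the origin URL so the token is not left in .git/config.
        ["git", "-C", clone_dir, "remote", "set-url", "origin", url],
    ]


commands = clone_commands(
    "repo_dir",
    "https://gitlab.com/owner/repo.git",
    "https://oauth2:abc123@gitlab.com/owner/repo.git",
)
```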
- check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
For GitLab, this method sets the origin remote URL of the target repository to the token-embedded URL, if a token is available, before performing the checkout operation.
After the checkout operation finishes, the origin remote URL is set back again to ensure that no token-embedded URL remains.
- Parameters:
- Returns:
The same Git object from the input.
- Return type:
Git
- Raises:
RepoCheckOutError – If there is an error while checking out the specific branch and digest.
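The set-then-restore behaviour of the origin URL around a checkout can be expressed as a context manager with try/finally, so the original URL is restored even if the checkout fails. In this sketch, `temporary_origin` and the injected `set_origin` callable are hypothetical; a real implementation would update the git remote instead of appending to a list.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def temporary_origin(
    set_origin: Callable[[str], None], original_url: str, token_url: str
) -> Iterator[None]:
    """Point origin at the token-embedded URL for the duration of the block."""
    set_origin(token_url)
    try:
        yield
    finally:
        # Always restore, so that no token-embedded URL remains.
        set_origin(original_url)


history: List[str] = []
with temporary_origin(history.append, "https://gitlab.com/o/r", "https://oauth2:t@gitlab.com/o/r"):
    history.append("checkout")
```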
- class macaron.slsa_analyzer.git_service.gitlab.SelfHostedGitLab
Bases:
GitLab
The self-hosted GitLab instance.
- __init__()
Initialize instance.
- load_defaults()
Load the values for this git service from the ini configuration and environment variables.
In this case, the environment variable MCN_SELF_HOSTED_GITLAB_TOKEN holding the access token for the private GitLab service is expected.
- Raises:
ConfigurationError – If there is an error loading the configuration.
- Return type:
- class macaron.slsa_analyzer.git_service.gitlab.PubliclyHostedGitLab
Bases:
GitLab
The publicly-hosted GitLab instance.
- __init__()
Initialize instance.
- load_defaults()
Load the values for this git service from the ini configuration and environment variables.
In this case, the environment variable MCN_GITLAB_TOKEN holding the access token for the public GitLab service is optional.
- Raises:
ConfigurationError – If there is an error loading the configuration.
- Return type:
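The difference between the expected self-hosted token and the optional public token could be modelled as below. This is a sketch under assumptions: the function names and the local `ConfigurationError` are stand-ins, and raising when the self-hosted token is absent, or returning an empty string when the public one is absent, is a guess at reasonable behaviour rather than a statement of what Macaron does.

```python
import os
from typing import Mapping


class ConfigurationError(Exception):
    """Local stand-in for Macaron's ConfigurationError."""


def self_hosted_token(environ: Mapping[str, str] = os.environ) -> str:
    """Return the self-hosted GitLab token; treat its absence as an error."""
    token = environ.get("MCN_SELF_HOSTED_GITLAB_TOKEN")
    if not token:
        raise ConfigurationError("MCN_SELF_HOSTED_GITLAB_TOKEN is expected but not set.")
    return token


def publicly_hosted_token(environ: Mapping[str, str] = os.environ) -> str:
    """Return the public GitLab token, or an empty string when it is not set."""
    return environ.get("MCN_GITLAB_TOKEN", "")


example_env = {"MCN_SELF_HOSTED_GITLAB_TOKEN": "secret"}
token = self_hosted_token(example_env)
```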
macaron.slsa_analyzer.git_service.local_repo_git_service module
This module contains the spec for the local repo git service.
- class macaron.slsa_analyzer.git_service.local_repo_git_service.LocalRepoGitService
Bases:
BaseGitService
This class contains the spec of the local repo git service.
- __init__()
Initialize instance.
- clone_repo(_clone_dir, _url)
Cloning from a local repo git service is not supported.
- Return type:
- check_out_repo(git_obj, branch, digest, offline_mode)
Check out the branch and commit of a repository specified by the user.
- Return type:
Git