macaron.slsa_analyzer.ci_service.github_actions package

Submodules

macaron.slsa_analyzer.ci_service.github_actions.analyzer module

This module provides the intermediate representations and analysis functions for GitHub Actions.

class macaron.slsa_analyzer.ci_service.github_actions.analyzer.ThirdPartyAction(action_name, action_version)

Bases: object

The representation for a third-party GitHub Action.

action_name: str

The name of the GitHub Action.

action_version: str | None

The version of the GitHub Action.

__init__(action_name, action_version)
class macaron.slsa_analyzer.ci_service.github_actions.analyzer.GitHubWorkflowType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: str, Enum

This class represents different GitHub Actions workflow types.

INTERNAL = 'internal'
EXTERNAL = 'external'
REUSABLE = 'reusable'
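
Because the class derives from both str and Enum, its members behave like plain strings. The following is a minimal sketch of that behavior; WorkflowKind and kind_from_string are hypothetical stand-ins that only mirror the documented member values and are not part of Macaron.

```python
from enum import Enum


class WorkflowKind(str, Enum):
    """Hypothetical stand-in that mirrors the documented member values."""

    INTERNAL = "internal"
    EXTERNAL = "external"
    REUSABLE = "reusable"


def kind_from_string(raw: str) -> WorkflowKind:
    """Look up a member by its string value; raises ValueError if unknown."""
    return WorkflowKind(raw)
```

With the str mixin, a member can be compared directly against a raw string read from a configuration or a serialized call graph.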
class macaron.slsa_analyzer.ci_service.github_actions.analyzer.GitHubWorkflowNode(name, node_type, source_path, parsed_obj, model=None, **kwargs)

Bases: BaseNode

This class represents a callgraph node for GitHub Actions workflows.

__init__(name, node_type, source_path, parsed_obj, model=None, **kwargs)

Initialize instance.

Parameters:
  • name (str) – Name of the workflow (or URL for reusable and external workflows).

  • node_type (GitHubWorkflowType) – The type of workflow.

  • source_path (str) – The path of the workflow.

  • parsed_obj (Workflow | Identified[ReusableWorkflowCallJob] | ActionStep) – The parsed Actions workflow object. Actual type must correspond to node type. (INTERNAL -> Workflow, REUSABLE -> Identified[ReusableWorkflowCallJob], EXTERNAL -> ActionStep)

  • caller (BaseNode | None) – The caller node.

  • model (ThirdPartyAction | None) – The static analysis abstraction for the third-party GitHub Action.

class macaron.slsa_analyzer.ci_service.github_actions.analyzer.GitHubJobNode(name, source_path, parsed_obj, **kwargs)

Bases: BaseNode

This class represents a callgraph node for GitHub Actions jobs.

__init__(name, source_path, parsed_obj, **kwargs)

Initialize instance.

Parameters:
  • name (str) – Name of the job.

  • source_path (str) – The path of the workflow.

  • parsed_obj (Identified[Job]) – The parsed Actions workflow object.

  • caller (BaseNode) – The caller node.

macaron.slsa_analyzer.ci_service.github_actions.analyzer.is_parsed_obj_workflow(parsed_obj)

Type guard for Workflow parsed_obj.

Return type:

TypeGuard[Workflow]

macaron.slsa_analyzer.ci_service.github_actions.analyzer.is_parsed_obj_reusable_workflow_call_job(obj)

Type guard for ReusableWorkflowCallJob parsed_obj.

Return type:

TypeGuard[Identified[ReusableWorkflowCallJob]]

macaron.slsa_analyzer.ci_service.github_actions.analyzer.is_parsed_obj_action_step(parsed_obj)

Type guard for ActionStep parsed_obj.

Return type:

TypeGuard[Step4]

macaron.slsa_analyzer.ci_service.github_actions.analyzer.find_expression_variables(value, exp_var)

Find all the matching GitHub Actions expression variables in a string value.

GitHub Actions expression syntax: ${{ <expression> }}. See https://docs.github.com/en/actions/learn-github-actions/expressions#about-expressions

Parameters:
  • value (str) – The string in which the expression variables are searched.

  • exp_var (str) – The expression variable name.

Yields:

Iterable[str] – The expression variable names.

Return type:

Iterable[str]

Examples

>>> list(find_expression_variables("echo ${{ inputs.foo }}", "inputs"))
['foo']
>>> list(find_expression_variables("echo ${{ inputs.foo }} ${{ inputs.bar }}", "inputs"))
['foo', 'bar']
>>> list(find_expression_variables("echo ${{ inputs.foo }} ${{ inputs.bar }}", "matrix"))
[]
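
The matching shown in the examples above could be approximated with a regular expression. This is only a sketch under assumptions: find_vars_sketch is a hypothetical name, and the pattern is a guess at a reasonable implementation, not the one used by Macaron.

```python
import re
from collections.abc import Iterable


def find_vars_sketch(value: str, exp_var: str) -> Iterable[str]:
    """Yield <name> for every `${{ <exp_var>.<name> }}` found in value."""
    # Assumption: variable names consist of word characters and hyphens.
    pattern = re.compile(r"\$\{\{\s*" + re.escape(exp_var) + r"\.([\w-]+)\s*\}\}")
    for match in pattern.finditer(value):
        yield match.group(1)
```

Escaping exp_var keeps a name that contains regex metacharacters from changing the pattern.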
macaron.slsa_analyzer.ci_service.github_actions.analyzer.resolve_matrix_variable(job_node, var)

Resolve the value of a GitHub Actions matrix variable.

For the specification of matrix variables in GitHub Actions see: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs

Parameters:
  • job_node (GitHubJobNode) – The target GitHub Actions job.

  • var (str) – The matrix variable that needs to be resolved.

Yields:

str – The possible values of the matrix variable.

Raises:

GitHubActionsValueError – When the matrix variable cannot be found.

Return type:

Iterable[str]
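
As a rough illustration of matrix resolution, the sketch below looks up a variable in a job represented as a plain dict. The dict layout follows the GitHub Actions strategy.matrix syntax; resolve_matrix_sketch is a hypothetical helper, and ValueError stands in for GitHubActionsValueError. The real function operates on a GitHubJobNode instead.

```python
from collections.abc import Iterable
from typing import Any


def resolve_matrix_sketch(job: dict[str, Any], var: str) -> Iterable[str]:
    """Yield the possible values of a matrix variable as strings."""
    matrix = (job.get("strategy") or {}).get("matrix")
    if not isinstance(matrix, dict) or var not in matrix:
        # Stand-in for GitHubActionsValueError in this sketch.
        raise ValueError(f"Unable to find the matrix variable {var}.")
    values = matrix[var]
    if not isinstance(values, list):
        values = [values]
    for item in values:
        yield str(item)
```

Since this is a generator, the error is raised when the result is first iterated, not when the function is called.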

macaron.slsa_analyzer.ci_service.github_actions.analyzer.is_expression(value)

Determine if a value is a GitHub Actions expression.

Parameters:

value (str) – The input value.

Returns:

True if the input value is a GitHub Actions expression.

Return type:

bool

Examples

>>> is_expression("${{ foo }}")
True
>>> is_expression("${{ foo }")
False
>>> is_expression("${ foo }")
False
macaron.slsa_analyzer.ci_service.github_actions.analyzer.find_language_setup_action(job_node, lang_name)

Find the step that calls a language setup GitHub Action and return its model.

Parameters:
  • job_node (GitHubJobNode) – The target GitHub Actions job node.

  • lang_name (BuildLanguage) – The target language used in the build.

Returns:

The language model for the language setup GitHub Action or None.

Return type:

Language | None

macaron.slsa_analyzer.ci_service.github_actions.analyzer.build_call_graph_from_node(node, repo_path)

Analyze the GitHub Actions node to build the call graph.

Parameters:
  • node (GitHubWorkflowNode) – The node for a single GitHub Actions workflow.

  • repo_path (str) – The file system path to the repo.

Return type:

None

macaron.slsa_analyzer.ci_service.github_actions.analyzer.build_call_graph_from_path(root, workflow_path, repo_path, macaron_path='')

Build the call graph for GitHub Actions workflows.

At the moment it does not analyze third-party workflows to include their callees.

Parameters:
  • root – The root node of the call graph.

  • workflow_path (str) – The path to the CI workflow file.

  • repo_path (str) – The path to the target repository.

  • macaron_path (str) – Macaron’s root path (optional).

Returns:

The callgraph node for the GitHub Actions workflow.

Return type:

BaseNode

Raises:

ParseError – When parsing the workflow fails with error.

macaron.slsa_analyzer.ci_service.github_actions.analyzer.get_reachable_secrets(step_node)

Get the secrets that are reachable by a GitHub Actions step.

Parameters:

step_node (BashNode) – The target GitHub Action step node.

Yields:

str – The reachable secret variable name.

Return type:

Iterable[str]

macaron.slsa_analyzer.ci_service.github_actions.analyzer.get_ci_events(workflow_node)

Get the CI events that trigger the GitHub Action workflow.

Parameters:

workflow_node (GitHubWorkflowNode) – The target GitHub Action workflow node.

Returns:

The list of event names or None.

Return type:

list[str] | None
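
In GitHub Actions workflow syntax the trigger section (the on key) can be a single event name, a list of names, or a mapping from event name to its configuration. The sketch below normalizes those three shapes for a workflow represented as a plain dict. ci_events_sketch is a hypothetical helper; the real function takes a GitHubWorkflowNode.

```python
from typing import Any, Optional


def ci_events_sketch(workflow: dict[str, Any]) -> Optional[list[str]]:
    """Return the trigger event names of a parsed workflow, or None."""
    triggers = workflow.get("on")
    if isinstance(triggers, str):
        return [triggers]
    if isinstance(triggers, list):
        return [str(event) for event in triggers]
    if isinstance(triggers, dict):
        # Mapping form: the keys are the event names.
        return [str(event) for event in triggers]
    return None
```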

class macaron.slsa_analyzer.ci_service.github_actions.analyzer.SetupJava(external_node)

Bases: Language, ThirdPartyAction

This class models the official setup-java GitHub Action from GitHub.

For the table of supported distributions see: https://github.com/actions/setup-java?tab=readme-ov-file#supported-distributions

action_name: str = 'actions/setup-java'

Name of the GitHub Action.

action_version: None

Version of the GitHub Action.

__init__(external_node)

Initialize the setup-java GitHub Action model.

Parameters:

external_node (GitHubWorkflowNode) – The external GitHub Action workflow node.

property lang_name: str

Get the name of the language.

property lang_versions: list[str] | None

Get the possible versions of the language.

property lang_distributions: list[str] | None

Get the possible distributions of the language.

property lang_url: str | None

Get the URL that provides information about the language distributions and versions.

class macaron.slsa_analyzer.ci_service.github_actions.analyzer.OracleSetupJava(external_node)

Bases: Language, ThirdPartyAction

This class models the Oracle setup-java GitHub Action.

For the table of supported distributions see: https://github.com/oracle-actions/setup-java?tab=readme-ov-file#input-overview

action_name: str = 'oracle-actions/setup-java'

Name of the GitHub Action.

action_version: None

Version of the GitHub Action.

__init__(external_node)

Initialize the Oracle setup-java GitHub Action model.

Parameters:

external_node (GitHubWorkflowNode) – The external GitHub Action workflow node.

property lang_name: str

Get the name of the language.

property lang_versions: list[str] | None

Get the possible versions of the language.

property lang_distributions: list[str] | None

Get the possible distributions of the language.

property lang_url: str | None

Get the URL that provides information about the language distributions and versions.

class macaron.slsa_analyzer.ci_service.github_actions.analyzer.GraalVMSetup(external_node)

Bases: Language, ThirdPartyAction

This class models the GraalVM setup GitHub Action.

For the table of supported distributions see: https://github.com/graalvm/setup-graalvm

action_name: str = 'graalvm/setup-graalvm'

Name of the GitHub Action.

action_version: None

Version of the GitHub Action.

__init__(external_node)

Initialize the GraalVM setup GitHub Action model.

Parameters:

external_node (GitHubWorkflowNode) – The external GitHub Action workflow node.

property lang_name: str

Get the name of the language.

property lang_versions: list[str] | None

Get the possible versions of the language.

property lang_distributions: list[str] | None

Get the possible distributions of the language.

property lang_url: str | None

Get the URL that provides information about the language distributions and versions.

macaron.slsa_analyzer.ci_service.github_actions.analyzer.create_third_party_action_model(external_node)

Create an instance of the third-party action model.

Parameters:

external_node (GitHubWorkflowNode) – The external GitHub Actions workflow node.

Returns:

An instance object for the ThirdPartyAction model.

Return type:

ThirdPartyAction

macaron.slsa_analyzer.ci_service.github_actions.github_actions_ci module

This module analyzes GitHub Actions CI.

class macaron.slsa_analyzer.ci_service.github_actions.github_actions_ci.GitHubActions

Bases: BaseCIService

This class contains the spec of GitHub Actions.

__init__()

Initialize instance.

set_api_client()

Set the GitHub client using the personal access token.

Return type:

None

load_defaults()

Load the default values from defaults.ini.

Return type:

None

is_detected(repo_path, git_service=None)

Return True if this CI service is used in the target repo.

Parameters:
  • repo_path (str) – The path to the target repo.

  • git_service (BaseGitService) – The Git service hosting the target repo.

Returns:

True if this CI service is detected, else False.

Return type:

bool

get_workflows(repo_path)

Get all workflows in a repository.

Parameters:

repo_path (str) – The path to the repository.

Returns:

The list of workflow files in this repository.

Return type:

list

has_latest_run_passed(repo_full_name, branch_name, commit_sha, commit_date, workflow)

Check if the latest run of workflow on commit commit_sha is passing.

This method queries the list of workflow runs from the GitHub API using the repository full name. It first performs a search using branch_name and commit_date as filters. If that fails, it performs the same search without any filtering.

Parameters:
  • repo_full_name (str) – The target repo’s full name.

  • branch_name (str | None) – The target branch.

  • commit_sha (str) – The commit sha of the target repo.

  • commit_date (str) – The commit date of the target repo.

  • workflow (str) – The name of the workflow file (e.g. build.yml).

Returns:

The URL for the passing workflow run, or empty if no passing GitHub Action build workflow is found.

Return type:

str

check_publish_start_commit_timestamps(started_at, publish_date_time, commit_date_time, time_range)

Check if the timestamps of CI run, artifact publishing, and commit date are within the acceptable time range and valid.

This function checks that the CI run has happened before the artifact publishing timestamp.

This function also verifies whether the commit date is within an acceptable time range from the publish start time. The acceptable range is defined as half of the provided time range parameter.

Parameters:
  • started_at (datetime) – The timestamp indicating when the GitHub Actions workflow started.

  • publish_date_time (datetime) – The timestamp indicating when the artifact is published.

  • commit_date_time (datetime) – The timestamp of the source code commit.

  • time_range (int) – The total acceptable time range in seconds.

Returns:

True if the commit date is within the acceptable range from the publish start time, False otherwise. Returns False in case of any errors during timestamp comparisons.

Return type:

bool
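
The two checks described above can be sketched as follows. This is an illustration under stated assumptions, not Macaron's implementation: timestamps_ok_sketch is a hypothetical name, and the sketch reads "publish start time" as the workflow start time (started_at), which the description does not confirm.

```python
from datetime import datetime, timedelta


def timestamps_ok_sketch(
    started_at: datetime,
    publish_date_time: datetime,
    commit_date_time: datetime,
    time_range: int,
) -> bool:
    """Return True if the CI run precedes publishing and is close to the commit."""
    try:
        # The CI run must have started before the artifact was published.
        if started_at >= publish_date_time:
            return False
        # Assumption: the commit must lie within time_range / 2 seconds
        # of the workflow start time.
        return abs(started_at - commit_date_time) <= timedelta(seconds=time_range / 2)
    except TypeError:
        # Raised, for example, when naive and aware datetimes are mixed.
        return False
```

Returning False on TypeError mirrors the documented behavior of reporting False when a timestamp comparison fails.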

workflow_run_in_date_time_range(repo_full_name, workflow, publish_date_time, commit_date_time, job_id, step_name, step_id, time_range=0, callee_node_type=None)

Check if the repository has a workflow run that started before the date_time timestamp and within the time_range.

  • This method queries the list of workflow runs using the GitHub API for the provided repository full name.

  • It will filter out the runs that are not triggered by the given workflow.

  • It will only accept the runs that started between date_time - time_range and date_time.

  • If a step_name is provided, it checks that the step started before date_time and succeeded.

Parameters:
  • repo_full_name (str) – The target repo’s full name.

  • workflow (str) – The workflow URL.

  • date_time (datetime) – The datetime object to query.

  • step_name (str | None) – The name of the step in the GitHub Action workflow that needs to be checked.

  • step_id (str | None) – The ID of the step in the GitHub Action workflow that needs to be checked.

  • time_range (int) – The date-time range in seconds. The default value is 0.

Returns:

The set of URLs found for the workflow within the time range.

Return type:

set[str]

Raises:

GitHubActionsValueError – This error is raised when the GitHub Action workflow run misses values.
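
The time-window filter described above can be sketched over a list of run records. The sample records use the created_at and html_url fields of GitHub workflow run objects; run_urls_in_window and parse_github_timestamp are hypothetical helpers, and the sketch omits the workflow and step checks that the real method performs.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def parse_github_timestamp(raw: str) -> datetime:
    """Parse a timestamp such as 2022-03-11T16:44:40Z into an aware datetime."""
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def run_urls_in_window(
    runs: list[dict[str, Any]], date_time: datetime, time_range: int = 0
) -> set[str]:
    """Collect URLs of runs created in [date_time - time_range, date_time]."""
    earliest = date_time - timedelta(seconds=time_range)
    urls: set[str] = set()
    for run in runs:
        created = parse_github_timestamp(run["created_at"])
        if earliest <= created <= date_time:
            urls.add(run["html_url"])
    return urls
```

With the default time_range of 0, only a run created exactly at date_time would be accepted.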

workflow_run_deleted(timestamp)

Check if the CI run data is deleted based on a retention policy.

Parameters:

timestamp (datetime) – The timestamp of the CI run.

Returns:

True if the CI run data is deleted.

Return type:

bool
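
A retention check of this kind reduces to comparing the age of the run against a retention period. In the sketch below, ASSUMED_RETENTION_DAYS is a placeholder value chosen for illustration and run_data_deleted_sketch is a hypothetical helper; the retention period Macaron actually applies is not stated in this reference.

```python
from datetime import datetime, timedelta, timezone

# Placeholder for illustration only; not the value used by Macaron.
ASSUMED_RETENTION_DAYS = 400


def run_data_deleted_sketch(
    timestamp: datetime,
    now: datetime,
    retention_days: int = ASSUMED_RETENTION_DAYS,
) -> bool:
    """Return True if the run is older than the retention period."""
    return now - timestamp > timedelta(days=retention_days)
```

Passing now explicitly keeps the check deterministic in tests; a caller would typically pass datetime.now(timezone.utc).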

search_for_workflow_run(workflow_id, commit_sha, full_name, branch_name=None, created_after=None)

Search for the target workflow run using GitHub API.

This method performs a query to get workflow runs. It then looks through the data of each run to determine the target workflow run. It only stops if:

  • There are no results left

  • It reaches the maximum number of results (1000) allowed by GitHub API

  • It finds the workflow run we are looking for

Parameters:
  • workflow_id (str) – The unique id of the workflow file obtained through GitHub API.

  • commit_sha (str) – The digest of the commit that the workflow ran on.

  • full_name (str) – The full name of the repository (e.g. owner/repo).

  • branch_name (str | None) – The branch name used to filter workflow runs.

  • created_after (str | None) – Only look for workflow runs after this date (e.g. 2022-03-11T16:44:40Z).

Returns:

The response data of the latest workflow run, or an empty dict if an error occurs.

Return type:

dict
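
The three stop conditions listed above map onto a simple paging loop. The sketch below takes a caller-supplied fetch_page function in place of the real API client, so no network access is involved. search_run_sketch and fetch_page are hypothetical names; head_sha is the commit field of GitHub workflow run objects.

```python
from collections.abc import Callable
from typing import Any

MAX_RESULTS = 1000  # The documented cap on results from the GitHub API.


def search_run_sketch(
    fetch_page: Callable[[int], list[dict[str, Any]]], commit_sha: str
) -> dict[str, Any]:
    """Page through workflow runs until a stop condition is met."""
    seen = 0
    page = 1
    while seen < MAX_RESULTS:
        runs = fetch_page(page)
        if not runs:
            # No results left.
            return {}
        for run in runs:
            if run.get("head_sha") == commit_sha:
                # Found the workflow run for the target commit.
                return run
        seen += len(runs)
        page += 1
    # Reached the maximum number of results without a match.
    return {}
```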

has_kws_in_log(latest_run, build_log)

Check the build log of this workflow run to see if it has build keywords.

Parameters:
  • latest_run (dict) – The latest run data from GitHub API.

  • build_log (list) – The list of keywords used to analyze the build log.

Returns:

Whether the build log contains a build keyword.

Return type:

bool

build_call_graph(repo_path, macaron_path='')

Build the call graph for GitHub Actions workflows.

At the moment it does not analyze third-party workflows to include their callees.

Parameters:
  • repo_path (str) – The path to the repo.

  • macaron_path (str) – Macaron’s root path (optional).

Returns:

CallGraph – The call graph built for GitHub Actions.

Return type:

CallGraph

get_build_tool_commands(callgraph, build_tool)

Traverse the callgraph and find all the reachable build tool commands.

This generator yields sorted build tool command objects to allow a deterministic behavior. The objects are sorted based on the string representation of the build tool object.

Parameters:
  • callgraph (CallGraph) – The callgraph reachable from the CI workflows.

  • build_tool (BaseBuildTool) – The corresponding build tool for which shell commands need to be detected.

Yields:

BuildToolCommand – The object that contains the build command as well as useful contextual information.

Raises:

CallGraphError – Error raised when an error occurs while traversing the callgraph.

Return type:

Iterable[BuildToolCommand]

get_third_party_configurations()

Get the list of third-party CI configuration files.

Returns:

The list of third-party CI configuration files.

Return type:

list[str]