macaron.code_analyzer.dataflow_analysis package
Submodules
macaron.code_analyzer.dataflow_analysis.analysis module
Entry points to perform and use the dataflow analysis.
- macaron.code_analyzer.dataflow_analysis.analysis.analyse_github_workflow_file(workflow_path, repo_path, dump_debug=False)
Perform dataflow analysis for a GitHub Actions workflow file.
- macaron.code_analyzer.dataflow_analysis.analysis.analyse_github_workflow(workflow, workflow_source_path, repo_path, dump_debug=False)
Perform dataflow analysis for a GitHub Actions workflow.
- Parameters:
- Returns:
Graph representation of workflow and analysis results.
- Return type:
- macaron.code_analyzer.dataflow_analysis.analysis.analyse_bash_script(bash_content, source_path, repo_path, dump_debug=False)
Perform dataflow analysis for a Bash script.
- Parameters:
- Returns:
Graph representation of Bash script and analysis results.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.analysis.FindSecretsVisitor(workflow_var_scope)
Bases:
object
Visitor to find references to GitHub secrets in analysis expressions.
- __init__(workflow_var_scope)
Construct a visitor to find secrets.
- Parameters:
workflow_var_scope (facts.Scope) – Scope in which secrets may be found
- macaron.code_analyzer.dataflow_analysis.analysis.get_reachable_secrets(bash_cmd_node)
Get GitHub secrets that are reachable at a bash command.
- Parameters:
bash_cmd_node (bash.BashSingleCommandNode) – The target Bash command node.
- Returns:
The set of reachable secret variable names.
- Return type:
- macaron.code_analyzer.dataflow_analysis.analysis.get_containing_github_job(node, parents)
Return the GitHub job node containing the given node, if any.
- Parameters:
- Returns:
The containing job node, or None if there is no containing job.
- Return type:
- macaron.code_analyzer.dataflow_analysis.analysis.get_containing_github_step(node, parents)
Return the GitHub step node containing the given node, if any.
- Parameters:
- Returns:
The containing step node, or None if there is no containing step.
- Return type:
- macaron.code_analyzer.dataflow_analysis.analysis.get_containing_github_workflow(node, parents)
Return the GitHub workflow node containing the given node, if any.
- Parameters:
- Returns:
The containing workflow node, or None if there is no containing workflow.
- Return type:
- macaron.code_analyzer.dataflow_analysis.analysis.get_build_tool_commands(nodes, build_tool)
Traverse the callgraph and find all the reachable build tool commands.
This generator yields sorted build tool command objects to ensure deterministic behavior. The objects are sorted by the string representation of the build tool object.
- Parameters:
nodes (core.NodeForest) – The callgraph reachable from the CI workflows.
build_tool (BaseBuildTool) – The corresponding build tool for which shell commands need to be detected.
- Yields:
BuildToolCommand – The object that contains the build command as well as useful contextual information.
- Return type:
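The determinism guarantee above can be illustrated with a minimal sketch (hypothetical names, not the real implementation): sorting by string representation makes the yield order independent of the order of the underlying collection.

```python
# Sketch: yield command objects sorted by str() so iteration order is
# deterministic even when the input collection order varies.
def yield_sorted_commands(commands):
    for cmd in sorted(commands, key=str):
        yield cmd

# Two differently ordered inputs produce the same output sequence.
a = list(yield_sorted_commands(["mvn package", "gradle build"]))
b = list(yield_sorted_commands(["gradle build", "mvn package"]))
```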
macaron.code_analyzer.dataflow_analysis.bash module
Dataflow analysis implementation for analysing Bash shell scripts.
- class macaron.code_analyzer.dataflow_analysis.bash.BashExit
Bases:
ExitType
Exit type for Bash exit statement.
- class macaron.code_analyzer.dataflow_analysis.bash.BashReturn
Bases:
ExitType
Exit type for returning from a Bash function.
- class macaron.code_analyzer.dataflow_analysis.bash.BashScriptContext(outer_context, filesystem, env, func_decls, stdin_scope, stdin_loc, stdout_scope, stdout_loc, source_filepath)
Bases:
Context
Context for a Bash script.
-
outer_context:
Union[OwningContextRef[GitHubActionsStepContext],NonOwningContextRef[GitHubActionsStepContext],OwningContextRef[BashScriptContext],NonOwningContextRef[BashScriptContext],OwningContextRef[AnalysisContext],NonOwningContextRef[AnalysisContext]] Outer context, which may be a GitHub run step, another Bash script that ran this script, or just the outermost analysis context if analysing the script in isolation.
-
filesystem:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for filesystem used by the script.
-
env:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for env variables within the script.
-
func_decls:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for defined functions within the script.
-
stdin_scope:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for the stdin attached to the Bash process.
-
stdin_loc:
LocationSpecifier Location for the stdin attached to the Bash process.
-
stdout_scope:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for the stdout attached to the Bash process.
-
stdout_loc:
LocationSpecifier Location for the stdout attached to the Bash process.
- static create_from_run_step(context, source_filepath)
Create a new Bash script context (for being called from a GitHub step) and its associated scopes.
Reuses the filesystem and stdout scopes from the outer context; the env scope inherits from the outer scope.
- Parameters:
context (core.ContextRef[github.GitHubActionsStepContext]) – Outer step context.
source_filepath (str) – Filepath of Bash script file.
- Returns:
The new Bash script context.
- Return type:
- static create_from_bash_script(context, source_filepath)
Create a new Bash script context (for being called from another Bash script) and its associated scopes.
Reuses the filesystem, stdin, and stdout scopes from the outer context; the env scope inherits from the outer context.
- Parameters:
context (core.ContextRef[BashScriptContext]) – Outer Bash script context.
source_filepath (str) – Filepath of Bash script file.
- Returns:
The new Bash script context.
- Return type:
- static create_in_isolation(context, source_filepath)
Create a new Bash script context (for being analysed in isolation) and its associated scopes.
- Parameters:
context (core.ContextRef[core.AnalysisContext]) – Outer analysis context.
source_filepath (str) – Filepath of Bash script file.
- Returns:
The new Bash script context.
- Return type:
- with_stdin(stdin_scope, stdin_loc)
Return a modified bash script context with the given stdin.
- Return type:
- with_stdout(stdout_scope, stdout_loc)
Return a modified bash script context with the given stdout.
- Return type:
- get_containing_github_context()
Return the (possibly transitive) containing GitHub step context, if there is one.
- Return type:
- get_containing_analysis_context()
Return the (possibly transitive) containing analysis context.
- Return type:
- direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(outer_context, filesystem, env, func_decls, stdin_scope, stdin_loc, stdout_scope, stdout_loc, source_filepath)
-
outer_context:
- class macaron.code_analyzer.dataflow_analysis.bash.RawBashScriptNode(script, context)
Bases:
InterpretationNode
Interpretation node representing a Bash script (with the script as an unparsed string value).
Defines how to resolve and parse the Bash script content and generate the analysis representation.
- __init__(script, context)
Initialize Bash script node.
- Parameters:
script (facts.Value) – Value for Bash script content (as a string).
context (core.ContextRef[BashScriptContext]) – Bash script context.
- script: facts.Value
Value for Bash script content (as a string).
- context: core.ContextRef[BashScriptContext]
Bash script context.
- identify_interpretations(state)
Interpret the Bash script to resolve and parse the Bash script content and generate the analysis representation.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashScriptNode(definition, stmts, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a Bash script.
Control flow structure consists of a sequence of Bash statements. Note that this can model complex control flow with branching, loops, etc. because those control flow constructs will be statement nodes with their own control flow nested within.
Control flow that cuts across multiple levels, such as an exit statement within an if statement branch that causes the entire script to exit early, is modelled using the alternate exits mechanism: an exit statement creates a BashExit exit state; in each enclosing control-flow construct, the successor of a child node's BashExit exit is an early BashExit exit of that construct, and so on up to this node, which has an early normal exit, so the caller of this script proceeds as normal after the script exits.
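The alternate-exits propagation can be sketched as a toy model (illustrative names only, not the real macaron classes): each enclosing construct re-raises an early exit of the same type, and the script node converts BashExit into a normal exit at the boundary.

```python
# Toy model of the alternate exits mechanism: an exit statement yields
# a BashExit exit type; enclosing constructs propagate it upward, and
# the script node converts it to a normal exit for the caller.
BASH_EXIT, DEFAULT_EXIT = "BashExit", "DefaultExit"

def construct_successor(exit_type):
    # Enclosing constructs (if, block, ...) re-raise the early exit.
    return exit_type

def script_successor(exit_type):
    # At the script boundary, BashExit becomes a normal exit, so the
    # caller proceeds as usual after the script terminates.
    return DEFAULT_EXIT if exit_type == BASH_EXIT else exit_type

# exit inside an if branch: the if node propagates BashExit upward,
# then the script node exits normally.
result = script_successor(construct_successor(BASH_EXIT))
```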
- __init__(definition, stmts, context)
Initialize Bash script node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.File) – Parsed Bash script AST.
stmts (list[BashStatementNode]) – Statement nodes in execution order.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.File
Parsed Bash script AST.
- stmts: list[BashStatementNode]
Statement nodes in execution order.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns the next in the sequence or the exit in the case of the last node, or an early exit in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the scopes.
- static create(script, context)
Create Bash script node from Bash script AST.
- Parameters:
script (bashparser_model.File) – Parsed Bash script AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
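The successor rule described for get_successors can be sketched as follows (hypothetical helper, not the real signature): the successor of a statement is the next statement in the sequence, the construct's own exit for the last statement, or an early exit when the child exited via BashExit or BashReturn.

```python
# Sketch of sequence successor logic for a statement list.
def get_successor(stmts, node, exit_type):
    if exit_type in ("BashExit", "BashReturn"):
        return ("early_exit", exit_type)  # short-circuit to early exit
    i = stmts.index(node)
    if i + 1 < len(stmts):
        return ("next", stmts[i + 1])     # next statement in sequence
    return ("exit", "DefaultExit")        # last statement: normal exit

stmts = ["s1", "s2", "s3"]
```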
- class macaron.code_analyzer.dataflow_analysis.bash.BashBlockNode(definition, stmts, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a Bash block.
Control flow structure consists of a sequence of Bash statements.
- __init__(definition, stmts, context)
Initialize Bash block node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.Block | list[bashparser_model.Stmt]) – Parsed block AST or list of statement ASTs.
stmts (list[BashStatementNode]) – Statement nodes in execution order.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.Block | list[bashparser_model.Stmt]
Parsed block AST or list of statement ASTs.
- stmts: list[BashStatementNode]
Statement nodes in execution order.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns the next in the sequence or the exit in the case of the last node, or a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(script, context)
Create Bash block node from block AST or list of statement ASTs.
- Parameters:
script (bashparser_model.Block | list[bashparser_model.Stmt]) – Parsed block AST or list of statement ASTs.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashFuncCallNode(call_definition, func_definition, block, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a call to a Bash function.
Control flow structure consists of a single block containing the function body.
- __init__(call_definition, func_definition, block, context)
Initialize Bash function call node.
- Parameters:
call_definition (bashparser_model.Stmt) – The parsed AST of the callsite statement.
func_definition (bashparser_model.FuncDecl) – The parsed AST of the function declaration.
block (BashBlockNode) – Node representing the function body.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- call_definition: bashparser_model.Stmt
The parsed AST of the callsite statement.
- func_definition: bashparser_model.FuncDecl
The parsed AST of the function declaration.
- block: BashBlockNode
Node representing the function body.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns the next node in the sequence or the exit in the case of the last node, or an early exit in the case of a BashReturn exit type, or a propagated early BashExit exit in the case of a BashExit exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- macaron.code_analyzer.dataflow_analysis.bash.get_stdout_redirects(stmt, context)
Extract the stdout redirects specified on the statement as a set of location expressions.
- class macaron.code_analyzer.dataflow_analysis.bash.BashStatementNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing any kind of Bash statement.
Defines how to interpret the different kinds of statements and generate the appropriate analysis representation.
- __init__(definition, context)
Initialize statement node.
- definition: bashparser_model.Stmt
The parsed statement AST.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- identify_interpretations(state)
Interpret the different kinds of statements and generate the appropriate analysis representation.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashIfClauseNode(definition, cond_stmts, then_stmts, else_stmts, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a Bash if statement.
Control flow structure consists of executing the statements of the condition, followed by a branch to execute either the then node or the else node (or if there is no else node, exit immediately). The analysis is not path sensitive, so both branches are always considered possible regardless of the condition.
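Path insensitivity means the states produced by both branches are merged at the if statement's exit. A minimal sketch of that merge (toy representation, not the real State class), unioning the possible values per location:

```python
# Sketch: because the condition is never evaluated, facts from the
# then and else branches are unioned per location after the if.
def merge_branch_states(then_state, else_state):
    merged = {}
    for loc in set(then_state) | set(else_state):
        merged[loc] = then_state.get(loc, set()) | else_state.get(loc, set())
    return merged

after_if = merge_branch_states({"x": {"1"}}, {"x": {"2"}, "y": {"3"}})
```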
- __init__(definition, cond_stmts, then_stmts, else_stmts, context)
Initialize Bash if statement node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.IfClause) – Parsed if statement AST.
cond_stmts (BashBlockNode) – Block node to execute the condition.
then_stmts (BashBlockNode) – Block node for the case where the condition is true.
else_stmts (BashBlockNode | BashIfClauseNode | None) – Node for the case where the condition is false, if any (will be another if node in the case of an elif).
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.IfClause
Parsed if statement AST.
- cond_stmts: BashBlockNode
Block node to execute the condition.
- then_stmts: BashBlockNode
Block node for the case where the condition is true.
- else_stmts: BashBlockNode | BashIfClauseNode | None
Node for the case where the condition is false, if any (will be another if node in the case of an elif).
- context: core.ContextRef[BashScriptContext]
Bash script context.
- children()
Yield the condition node, then node and (if present) else node.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(if_stmt, context)
Create a Bash if statement node from if statement AST.
- Parameters:
if_stmt (bashparser_model.IfClause) – Parsed if statement AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashForClauseNode(definition, init_stmts, cond_stmts, body_stmts, post_stmts, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a Bash for statement.
Control flow structure consists of executing the statements of the condition, followed by a branch to execute or skip the loop body node. The analysis is not path sensitive, so both branches are always considered possible regardless of the condition.
TODO: Currently doesn’t actually model the loop back edge (need more testing to be confident of analysis termination in the presence of loops).
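Given the missing back edge, the for statement is effectively a forward-only chain over its optional parts. A toy sketch of that ordering (hypothetical helper):

```python
# Sketch: the for clause is modelled as a forward-only chain over the
# optional init/cond/body/post parts, with no back edge from post to
# cond (matching the TODO above).
def for_clause_order(init_stmts, cond_stmts, body_stmts, post_stmts):
    return [part for part in (init_stmts, cond_stmts, body_stmts, post_stmts)
            if part is not None]

order = for_clause_order("init", "cond", "body", "post")
```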
- __init__(definition, init_stmts, cond_stmts, body_stmts, post_stmts, context)
Initialize Bash for statement node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.ForClause) – Parsed for statement AST.
init_stmts (BashBlockNode | None) – Block node to execute the initializer.
cond_stmts (BashBlockNode | None) – Block node to execute the condition.
body_stmts (BashBlockNode) – Block node for the body.
post_stmts (BashBlockNode | None) – Block node to execute the post.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.ForClause
Parsed for statement AST.
- init_stmts: BashBlockNode | None
Block node to execute the initializer.
- cond_stmts: BashBlockNode | None
Block node to execute the condition.
- body_stmts: BashBlockNode
Block node for the loop body.
- post_stmts: BashBlockNode | None
Block node to execute the post.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(for_stmt, context)
Create a Bash for statement node from for statement AST.
- Parameters:
for_stmt (bashparser_model.ForClause) – Parsed for statement AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashPipeContext(bash_script_context, pipe_scope, pipe_loc)
Bases:
Context
Context for a Bash pipe operation.
Introduces a scope and location to represent the pipe itself connecting the piped commands, where output from the piped-from command is written prior to being read as input by the piped-to command.
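A toy illustration of the pipe location (function names are hypothetical, not the real API): the left command's output is written to a dedicated pipe location, which the right command then reads as its input.

```python
# Sketch: model the pipe as an intermediate location between the two
# piped commands.
def run_pipe(state, lhs, rhs):
    state["pipe_loc"] = lhs()        # lhs stdout -> pipe location
    return rhs(state["pipe_loc"])    # pipe location -> rhs stdin

state = {}
result = run_pipe(state, lambda: "hello", lambda data: data.upper())
```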
-
bash_script_context:
Union[OwningContextRef[BashScriptContext],NonOwningContextRef[BashScriptContext]] Outer Bash script context
-
pipe_scope:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for pipe.
-
pipe_loc:
LocationSpecifier Location for pipe.
- static create(context)
Create a new pipe context and its associated scope.
- Return type:
- direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(bash_script_context, pipe_scope, pipe_loc)
-
bash_script_context:
- class macaron.code_analyzer.dataflow_analysis.bash.BashPipeNode(definition, lhs, rhs, context)
Bases:
ControlFlowGraphNode
Control flow node representing a Bash pipe (“|”) binary command.
Control flow structure consists of executing the left-hand side, followed by the right-hand side. A pipe scope and location are introduced to model the piping of the output from the first command to the input of the second command.
- __init__(definition, lhs, rhs, context)
Initialize Bash pipe node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.BinaryCmd) – Parsed pipe binary command AST.
lhs (BashStatementNode) – Left-hand side (first) command.
rhs (BashStatementNode) – Right-hand side (second) command.
context (core.ContextRef[BashPipeContext]) – Pipe context.
- definition: bashparser_model.BinaryCmd
Parsed pipe binary command AST.
- lhs: BashStatementNode
Left-hand side (first) command.
- rhs: BashStatementNode
Right-hand side (second) command.
- context: core.ContextRef[BashPipeContext]
Pipe context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(pipe_cmd, context)
Create Bash pipe node from pipe binary command AST.
- Parameters:
pipe_cmd (bashparser_model.BinaryCmd) – Parsed pipe binary command AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashAndNode(definition, lhs, rhs, context)
Bases:
ControlFlowGraphNode
Control flow node representing a Bash AND (“&&”) binary command.
Control flow structure consists of executing the left-hand side, followed by the right-hand side.
(TODO model short circuit?)
- __init__(definition, lhs, rhs, context)
Initialize Bash and node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.BinaryCmd) – Parsed AND binary command AST.
lhs (BashStatementNode) – Left-hand side (first) command.
rhs (BashStatementNode) – Right-hand side (second) command.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.BinaryCmd
Parsed AND binary command AST.
- lhs: BashStatementNode
Left-hand side (first) command.
- rhs: BashStatementNode
Right-hand side (second) command.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(and_cmd, context)
Create Bash and node from AND binary command AST.
- Parameters:
and_cmd (bashparser_model.BinaryCmd) – Parsed AND binary command AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashOrNode(definition, lhs, rhs, context)
Bases:
ControlFlowGraphNode
Control flow node representing a Bash OR (“||”) binary command.
Control flow structure consists of executing the left-hand side, followed by the right-hand side.
(TODO model short circuit?)
- __init__(definition, lhs, rhs, context)
Initialize Bash OR node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (bashparser_model.BinaryCmd) – Parsed OR binary command AST.
lhs (BashStatementNode) – Left-hand side (first) command.
rhs (BashStatementNode) – Right-hand side (second) command.
context (core.ContextRef[BashScriptContext]) – Bash script context.
- definition: bashparser_model.BinaryCmd
Parsed OR binary command AST.
- lhs: BashStatementNode
Left-hand side (first) command.
- rhs: BashStatementNode
Right-hand side (second) command.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- get_successors(node, exit_type)
Return the successor for a given node.
Returns a propagated early exit of the same type in the case of a BashExit or BashReturn exit type.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the line number and scopes.
- static create(or_cmd, context)
Create Bash OR node from OR binary command AST.
- Parameters:
or_cmd (bashparser_model.BinaryCmd) – Parsed OR binary command AST.
context (core.NonOwningContextRef[BashScriptContext]) – Bash script context.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashSingleCommandNode(definition, context, cmd, args, stdout_redirects)
Bases:
InterpretationNode
Interpretation node representing a single Bash command.
Defines how to interpret the semantics of the different supported commands that may be invoked.
- __init__(definition, context, cmd, args, stdout_redirects)
Initialize Bash single command node.
- Parameters:
definition (bashparser_model.Stmt) – Parsed statement AST.
context (core.ContextRef[BashScriptContext]) – Bash script context.
cmd (facts.Value) – Expression for command name.
args (list[facts.Value | None]) – Expressions for argument values (None if unrepresentable).
stdout_redirects (set[facts.Location]) – Location expressions for where stdout is redirected to.
- definition: bashparser_model.Stmt
Parsed statement AST.
- context: core.ContextRef[BashScriptContext]
Bash script context.
- cmd: facts.Value
Expression for command name.
- args: list[facts.Value | None]
Expressions for argument values (None if unrepresentable).
- stdout_redirects: set[facts.Location]
Location expressions for where stdout is redirected to.
- identify_interpretations(state)
Interpret the semantics of the different supported commands that may be invoked.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.bash.BashExitNode
Bases:
StatementNode
Statement node representing a Bash exit command.
Always exits with the BashExit exit type (which causes the whole script to exit).
- class macaron.code_analyzer.dataflow_analysis.bash.LiteralOrEnvVar(is_env_var, literal)
Bases:
object
Represents either a literal or a read of an environment variable.
- __init__(is_env_var, literal)
- macaron.code_analyzer.dataflow_analysis.bash.is_simple_var_read(param_exp)
Return whether the expression is a simple env var read, e.g. $ENV_VAR.
- Return type:
- macaron.code_analyzer.dataflow_analysis.bash.parse_env_var_read_word_part(part, allow_dbl_quoted)
Parse word part as a read of an environment variable.
If the given word part is a read of an env var (possibly enclosed in double quotes, if allowed), return the name of the variable, otherwise None.
- Return type:
str | None
- macaron.code_analyzer.dataflow_analysis.bash.parse_env_var_read_word(word, allow_dbl_quoted)
Parse word as a read of an environment variable.
If the given word is a read of an env var (possibly enclosed in double quotes, if allowed), return the name of the variable, otherwise None.
- macaron.code_analyzer.dataflow_analysis.bash.parse_content(parts, allow_dbl_quoted)
Parse the given sequence of word parts.
Return a representation as a sequence of string literal and env var reads, or else return None if not representable in this way.
If allow_dbl_quoted is True, permit word parts to be double quoted expressions, the content of which will be included in the sequence (if False, return None if the sequence contains double quoted expressions).
- Return type:
list[LiteralOrEnvVar] | None
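The idea behind parse_content can be illustrated with a small self-contained re-implementation (not the real code; the real version works on parsed word parts rather than raw strings): split input into string literals and $VAR env reads, returning None for anything not representable this way.

```python
import re

# Tokens: a $VAR / ${VAR} env read, or a run of literal characters.
TOKEN = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?|([^$]+)")

def parse_literals_and_env_reads(text):
    out, pos = [], 0
    for m in TOKEN.finditer(text):
        if m.start() != pos:
            return None  # unconsumed input, e.g. a stray "$"
        pos = m.end()
        out.append(("env", m.group(1)) if m.group(1) else ("lit", m.group(2)))
    return out if pos == len(text) else None

parsed = parse_literals_and_env_reads("prefix-$HOME/bin")
```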
- macaron.code_analyzer.dataflow_analysis.bash.convert_shell_value_sequence_to_fact_value(content, context)
Convert sequence of Bash values into a single concatenated expression.
- Return type:
- macaron.code_analyzer.dataflow_analysis.bash.convert_shell_value_to_fact_value(val, context)
Convert a Bash literal or env var read into a value expression.
- Return type:
- macaron.code_analyzer.dataflow_analysis.bash.convert_shell_word_to_value(word, context)
Convert a Bash word into a value expression.
Return the value expression alongside a bool indicating whether the value is “quoted” (or else may require further expansion post-resolution if “unquoted”).
- macaron.code_analyzer.dataflow_analysis.bash.parse_dbl_quoted_string(word)
Parse double quoted string.
If the given word is a double quoted expression, return a representation as a sequence of string literal and env var reads, or else return None if it is not a double quoted expression or if it is not representable in this way.
- Return type:
- macaron.code_analyzer.dataflow_analysis.bash.parse_sgl_quoted_string(word)
Parse single quoted string.
If the given word is a single quoted string, return the string literal content, otherwise return None.
- macaron.code_analyzer.dataflow_analysis.bash.parse_singular_literal(word)
Parse singular literal word.
If the given word is a single literal, return the string literal content, otherwise return None.
macaron.code_analyzer.dataflow_analysis.cmd_parser module
This module contains parsers for command line interfaces for commands relevant to analysis.
- macaron.code_analyzer.dataflow_analysis.cmd_parser.parse_python_command_line(args)
Parse a Python command line.
- Parameters:
- Returns:
Parsed Python command args.
- Return type:
macaron.code_analyzer.dataflow_analysis.core module
Core dataflow analysis framework definitions and algorithm.
- macaron.code_analyzer.dataflow_analysis.core.reset_debug_sequence_number()
Reset debug sequence number.
- Return type:
- macaron.code_analyzer.dataflow_analysis.core.get_debug_sequence_number()
Get current debug sequence number value.
- Return type:
- macaron.code_analyzer.dataflow_analysis.core.increment_debug_sequence_number()
Increment debug sequence number.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.core.StateDebugLabel(sequence_number, copied)
Bases:
object
Label for state fact providing information useful for debugging.
Provides a record of analysis ordering and whether the fact was just copied from another state rather than newly produced.
-
copied:
bool Whether the state fact is just copied from another state rather than newly produced.
- __init__(sequence_number, copied)
-
copied:
- class macaron.code_analyzer.dataflow_analysis.core.StateTransferFilter
Bases:
ABC
Interface for state transfer filters, which filter out state facts by location.
- class macaron.code_analyzer.dataflow_analysis.core.State
Bases:
object
Representation of the abstract storage state at some program point.
Consists of a set of abstract locations, each associated with a set of possible values.
- __init__()
Construct an empty state.
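The description above (a set of abstract locations, each with a set of possible values) suggests a mapping from locations to value sets. A minimal toy sketch (not the real State class):

```python
# Sketch: abstract state as a map from locations to possible values.
class ToyState:
    def __init__(self):
        self.facts = {}

    def add(self, location, value):
        self.facts.setdefault(location, set()).add(value)

    def possible_values(self, location):
        return self.facts.get(location, set())

s = ToyState()
s.add("env.PATH", "/usr/bin")
s.add("env.PATH", "/opt/bin")
```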
- class macaron.code_analyzer.dataflow_analysis.core.DefaultStateTransferFilter
Bases:
StateTransferFilter
Default state transfer filter that includes all locations.
- class macaron.code_analyzer.dataflow_analysis.core.ExcludedLocsStateTransferFilter(excluded_locs)
Bases:
StateTransferFilter
State transfer filter that excludes any locations in the given set.
- __init__(excluded_locs)
Construct filter that excludes the given locations.
- class macaron.code_analyzer.dataflow_analysis.core.ExcludedScopesStateTransferFilter(excluded_scopes)
Bases:
StateTransferFilter
State transfer filter that excludes any locations that are within the scopes in the given set.
- __init__(excluded_scopes)
Construct filter that excludes the given scopes.
- macaron.code_analyzer.dataflow_analysis.core.transfer_state(src_state, dest_state, transfer_filter=<macaron.code_analyzer.dataflow_analysis.core.DefaultStateTransferFilter object>, debug_is_copy=True)
Transfer/copy all facts in the src state to the dest state, except those excluded by the given filter.
- Parameters:
src_state (State) – The state to transfer facts from.
dest_state (State) – The state to modify by transferring facts to.
transfer_filter (StateTransferFilter) – The filter to apply to the transferred facts (by default, transfer all).
debug_is_copy (bool) – Whether the facts newly added to the dest state should be recorded as being copied or not (for debugging purposes).
- Returns:
Whether the dest state was modified.
- Return type:
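The transfer/filter mechanics can be illustrated with a minimal sketch. This is not the Macaron implementation: here a state is just a dict mapping location names to sets of values, and the filter is a plain predicate, but the modified-flag contract matches the description above.

```python
def transfer_state(src_state, dest_state, keep=lambda loc: True):
    """Copy all facts from src to dest, except locations rejected by `keep`.

    Returns True if dest_state was modified (new locations or new values).
    """
    modified = False
    for loc, values in src_state.items():
        if not keep(loc):
            continue  # excluded by the filter
        dest_values = dest_state.setdefault(loc, set())
        if not values <= dest_values:
            dest_values |= values
            modified = True
    return modified

# Facts in a filtered-out (hypothetical) "tmp." scope are not transferred,
# and transferring already-present facts reports no modification.
src = {"env.PATH": {"/usr/bin"}, "tmp.x": {"secret"}}
dest = {"env.PATH": {"/usr/bin"}}
changed = transfer_state(src, dest, keep=lambda loc: not loc.startswith("tmp."))
```

Returning whether the destination changed is what lets the fixpoint loop in `ControlFlowGraphNode.analyse` know when to stop reprocessing nodes.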
- class macaron.code_analyzer.dataflow_analysis.core.ExitType
Bases:
ABC
Representation of an exit type, describing the manner in which the execution of a node may terminate.
- class macaron.code_analyzer.dataflow_analysis.core.DefaultExit
Bases:
ExitType
Default, normal exit.
- class macaron.code_analyzer.dataflow_analysis.core.Node
Bases:
ABC
Base class of all node types in dataflow analysis.
Subclasses will represent the various program/semantic constructs, and define how to analyse them.
- __init__()
Initialize with empty states.
-
exit_states:
dict[ExitType,State] Abstract state at the point after the execution of this node, for each possible distinct exit type.
-
created_debug_sequence_num:
int Sequence number at the point the node was created, recorded for debugging purposes.
-
processed_log:
list[tuple[int,int]] Log of begin/end sequence numbers each time this node was processed, recorded for debugging purposes.
- abstractmethod analyse()
Perform analysis of this node (and potentially any child nodes).
Update the exit states with the analysis result. Returns whether anything was modified.
- Return type:
- notify_processed(begin_seq_num, end_seq_num)
Record that this node has been processed.
- Return type:
- get_exit_state_transfer_filter()
Return the state transfer filter applicable to the exit state of this node.
By default, nothing is excluded. Subclasses should override to provide appropriate filters to avoid transferring state that will be irrelevant after the node exits.
- Return type:
- get_printable_properties_table()
Return a table of stringified properties, describing the details of this node, for debugging purposes.
The returned properties table is a mapping of name to value-set, which can be rendered via the functions in the printing module.
- macaron.code_analyzer.dataflow_analysis.core.node_is_not_none(node)
Return whether the given node is not None.
- Return type:
TypeGuard[Node]
- macaron.code_analyzer.dataflow_analysis.core.traverse_bfs(node)
Traverse the node tree in a breadth-first manner, yielding the nodes (including this node) in traversal order.
- macaron.code_analyzer.dataflow_analysis.core.build_parent_mapping(node)
Construct a mapping of nodes to their parent nodes.
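A toy sketch of these two traversal helpers, assuming (hypothetically) that each node exposes a `children` list; the real `Node` subclasses define their own child structure:

```python
from collections import deque

class Node:
    """Minimal stand-in node with an explicit child list."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def traverse_bfs(root):
    """Yield root and its descendants in breadth-first order."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)

def build_parent_mapping(root):
    """Map each node to its parent; the root maps to None."""
    parents = {root: None}
    for node in traverse_bfs(root):
        for child in node.children:
            parents[child] = node
    return parents
```

The parent mapping is what functions like `get_containing_github_job` walk upwards through to find an enclosing node.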
- class macaron.code_analyzer.dataflow_analysis.core.NodeForest(root_nodes)
Bases:
object
A collection of independent root nodes (with no control-flow or other relation between them).
- __init__(root_nodes)
Construct a NodeForest for the given nodes, and build the parent mapping.
- class macaron.code_analyzer.dataflow_analysis.core.ControlFlowGraph(entry)
Bases:
object
Graph structure to represent control flow graphs.
- __init__(entry)
Construct an initially-empty control flow graph.
-
successors:
dict[Node,dict[ExitType,set[Node|ExitType]]] Graph of successor edges. Each edge is from a particular exit of a particular node, either to a node or to an exit of the control flow itself.
- add_successor(src, exit_type, dest)
Add a successor edge to the control flow graph.
- Return type:
- get_successors(node, exit_type)
Return the successors for a particular exit of a particular node.
- static create_from_sequence(seq)
Construct a control flow graph from a linear sequence of nodes.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.core.ControlFlowGraphNode
Bases:
Node
Base class for nodes representing control-flow constructs.
Defines the generic algorithm for analysing control flow graphs. Subclasses will define the child nodes and concrete graph structure.
- analyse()
Perform analysis of this node.
Performs analysis of the child nodes and propagates state from the exit state of an updated node to the before state of its successor nodes, according to the control-flow-graph structure, then analyses the successor nodes, and so on until a fixpoint is reached and no further updates may be made to any node states.
Returns whether anything was modified.
- Return type:
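The fixpoint propagation described above can be sketched generically. This is a simplified, hypothetical model: states are plain sets of facts, the transfer function is a parameter, and successors come from a dict rather than a `ControlFlowGraph`, but the worklist-until-no-change shape is the same.

```python
from collections import deque

def analyse_cfg(entry, successors, transfer):
    """Propagate facts until no node's before-state changes (a fixpoint).

    successors: dict mapping node -> iterable of successor nodes.
    transfer:   callable(node, before_state) -> exit_state.
    """
    before = {entry: set()}
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        exit_state = transfer(node, before.get(node, set()))
        for succ in successors.get(node, ()):
            old = before.setdefault(succ, set())
            if not exit_state <= old:
                old |= exit_state        # successor's state grew...
                worklist.append(succ)    # ...so it must be reprocessed
    return before
```

Termination follows because states only ever grow and are drawn from a finite set of facts, so each node can change (and be re-queued) only finitely many times, even when the graph contains cycles.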
- class macaron.code_analyzer.dataflow_analysis.core.StatementNode
Bases:
Node
Base class for nodes representing constructs with direct effects (and no child nodes).
Subclasses will define the effects that apply when the node is executed.
- analyse()
Perform analysis of this node, by applying the effects to update the after state.
Returns whether anything was modified.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.core.NoOpStatementNode
Bases:
StatementNode
Statement that has no effect.
- class macaron.code_analyzer.dataflow_analysis.core.InterpretationKey(*args, **kwargs)
Bases:
Protocol
Interpretation key used to identify interpretations that have been produced before.
Must support hashing and equality comparison to allow use as a dict key.
- __init__(*args, **kwargs)
- class macaron.code_analyzer.dataflow_analysis.core.InterpretationNode
Bases:
Node
Base class for nodes representing constructs requiring interpretation.
Such constructs must be interpreted to produce possibly-multiple child nodes representing possible interpretations of the semantics of the node.
Analysing the interpretation node will apply the combined effects of all of the possible interpretations. Subclasses will define how to identify the possible interpretations and generate the corresponding nodes.
- __init__()
Initialize node with no interpretations.
-
interpretations:
dict[InterpretationKey,Node] The generated interpretations of this node, identified/deduplicated by some interpretation key.
- update_interpretations()
Analyse the node to identify interpretations.
Analysis is done in the context of the current before state, adding any new interpretations generated to the interpretations dict.
- Return type:
- abstractmethod identify_interpretations(state)
Analyse the node, in the context of the given before state, to identify interpretations.
Returns, for each discovered interpretation, an identifying interpretation key that can be used to determine if the interpretation has been produced previously, and a callable that generates the node representing that interpretation (used to generate the node if the interpretation is new, otherwise the previously-generated node will be reused).
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- class macaron.code_analyzer.dataflow_analysis.core.OwningContextRef(ref)
Bases:
Generic[R_co]
A reference to a part of a node’s context that “owns” it.
Ownership is used to identify what scopes are tied to a particular node such that they cease to exist or become irrelevant after the node exits, and thus any values stored in locations within those scopes may be erased from the state beyond that point to simplify the state.
- get_non_owned()
Return a non-owning reference to the same object.
- Return type:
NonOwningContextRef[TypeVar(R_co, covariant=True)]
- __init__(ref)
- class macaron.code_analyzer.dataflow_analysis.core.NonOwningContextRef(ref)
Bases:
Generic[R_co]
A reference to a part of a node’s context that does not “own” it.
Ownership is used to identify what scopes are tied to a particular node such that they cease to exist or become irrelevant after the node exits, and thus any values stored in locations within those scopes may be erased from the state beyond that point to simplify the state.
- get_non_owned()
Return a non-owning reference to the same object.
- Return type:
NonOwningContextRef[TypeVar(R_co, covariant=True)]
- __init__(ref)
- class macaron.code_analyzer.dataflow_analysis.core.Context
Bases:
ABC
Base class for node contexts.
Represents the necessary context that influences the analysis of a node, primarily that of identifying the concrete scopes that fill particular roles in the node.
- abstractmethod direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- owned_scopes()
Yield the scopes that are owned by this context.
Owned scopes are those referenced directly by owning references, or indirectly through chains of contexts that are themselves reached via owning references.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.core.AnalysisContext(repo_path)
Bases:
Context
Outermost context of the analysis.
Records the path to the repo checkout, to allow the analysis access to files in the repo.
- direct_refs()
No direct references, yields nothing.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(repo_path)
- class macaron.code_analyzer.dataflow_analysis.core.SimpleSequence(seq)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing the execution of a sequence of nodes.
- __init__(seq)
Construct control-flow-graph from sequence.
- class macaron.code_analyzer.dataflow_analysis.core.SimpleAlternatives(alts)
Bases:
InterpretationNode
Interpretation node representing a concrete set of alternative nodes.
- __init__(alts)
Initialize node.
- identify_interpretations(state)
Return the interpretations of this node, that is, each of the alternatives.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
macaron.code_analyzer.dataflow_analysis.evaluation module
Functions for evaluating and resolving dataflow analysis expressions.
- macaron.code_analyzer.dataflow_analysis.evaluation.evaluate(node, value)
Evaluate the given value, at the point immediately prior to the execution of the given node.
- Parameters:
node (core.Node) – The node at which to evaluate the value (i.e. in the context of the before state of the node).
value (facts.Value) – The value expression to evaluate.
- Returns:
The set of possible resolved values for the value expression, each with a record of the resolved value chosen for any read expressions.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.evaluation.WriteStatement(location, value)
Bases:
object
Representation of a write of a given value to a given location.
- perform_write(before_state)
Return a state containing only the values stored by the write operation, in context of the before state.
Also returns the set of locations within that state which should be considered to have been overwritten, erasing any previous values.
- __init__(location, value)
- class macaron.code_analyzer.dataflow_analysis.evaluation.StatementSet(stmts)
Bases:
object
Representation of a set of (simultaneous) write operations.
-
stmts:
set[WriteStatement] The set of writes.
- apply_effects(before_state)
Apply the effect of the set of writes, returning the resulting state.
- Return type:
- static union(*stmt_sets)
Combine multiple write sets into one.
- Return type:
- __init__(stmts)
-
stmts:
- class macaron.code_analyzer.dataflow_analysis.evaluation.ParameterPlaceholderTransformer(allow_unbound_params=True, value_parameter_binds=None, location_parameter_binds=None, scope_parameter_binds=None)
Bases:
object
Expression transformer which replaces parameter placeholders with their corresponding bound values.
- __init__(allow_unbound_params=True, value_parameter_binds=None, location_parameter_binds=None, scope_parameter_binds=None)
Initialize transformer with bindings.
- Parameters:
allow_unbound_params (bool) – Whether to allow parameters with no provided binding (if False, an exception is raised when such a parameter is found).
value_parameter_binds (dict[str, facts.Value] | None) – Bindings for value parameter placeholders, mapping parameter name to bound value expression.
location_parameter_binds (dict[str, facts.LocationSpecifier] | None) – Bindings for location parameter placeholders, mapping parameter name to bound location expression.
scope_parameter_binds (dict[str, facts.Scope] | None) – Bindings for scope parameter placeholders, mapping parameter name to bound scope.
-
allow_unbound_params:
bool Whether to allow parameters with no provided binding (if False, an exception is raised when such a parameter is found).
-
value_parameter_binds:
dict[str,Value] Bindings for value parameter placeholders, mapping parameter name to bound value expression.
-
location_parameter_binds:
dict[str,LocationSpecifier] Bindings for location parameter placeholders, mapping parameter name to bound location expression.
-
scope_parameter_binds:
dict[str,Scope] Bindings for scope parameter placeholders, mapping parameter name to bound scope.
- transform_value(value)
Transform given value expression.
Returns a value expression with any parameter placeholders replaced with their bound values.
- Return type:
- transform_location(location)
Transform given location expression.
Returns a location expression with any parameter placeholders replaced with their bound values.
- Return type:
- transform_location_specifier(location)
Transform given location specifier expression.
Returns a location specifier expression with any parameter placeholders replaced with their bound values.
- Return type:
- transform_scope(scope)
Transform given scope.
Returns a scope with any parameter placeholders replaced with their bound values.
- Return type:
- transform_statement(statement)
Transform given write statement.
Returns a write statement with any parameter placeholders replaced with their bound values.
- Return type:
- transform_statement_set(statement_set)
Transform given write statement set.
Returns a write statement set with any parameter placeholders replaced with their bound values.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.is_singleton(s, e)
Return whether the given set contains only the single given element.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.is_singleton_no_bindings(s, e)
Return whether the given set contains only the single given element with no read bindings.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.scope_matches(read_scope, stored_scope)
Return whether the given read scope matches the given stored scope.
Matching means that a read of the read scope may return values from the stored scope.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.location_subsumes(loc, subloc)
Return whether the given location subsumes the given sub location.
Subsumption means that a read of subloc may be considered to be a read of loc or some part thereof.
- Return type:
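Subsumption can be illustrated with a toy sketch. The string encodings here (`"dir:"`/`"file:"` prefixes) are hypothetical stand-ins for `FilesystemAnyUnderDir` and `Filesystem` locations; the point is just that a directory-wildcard location covers reads of any file beneath it.

```python
import posixpath

def location_subsumes(loc, subloc):
    """Return whether a read of subloc may be considered a read of loc."""
    if loc == subloc:
        return True
    if loc.startswith("dir:") and subloc.startswith("file:"):
        directory = loc[len("dir:"):].rstrip("/")
        path = subloc[len("file:"):]
        # subsumed iff the file path lies under the directory
        return posixpath.commonpath([directory, path]) == directory
    return False
```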
- macaron.code_analyzer.dataflow_analysis.evaluation.get_values_for_subsumed_read(read_loc, state_loc, state_vals)
Return the set of values stored in the state location, if relevant for the given read location.
- class macaron.code_analyzer.dataflow_analysis.evaluation.ReadBindings(binds=None)
Bases:
object
Set of bindings of read expressions to the values resolved for those reads.
- __init__(binds=None)
Initialize with given bindings.
- with_binding(read, value)
Return bindings with the given additional binding, or None if the bindings conflict.
- Return type:
- with_bindings(bindings)
Return bindings with the given additional bindings, or None if the bindings conflict.
- Return type:
- static combine_bindings(bindings_list)
Return bindings combining all bindings in the given list, or None if the bindings conflict.
- Return type:
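A minimal sketch of the conflict-detecting combination: here bindings are plain dicts from a read expression (represented as a string key) to the single value chosen for it, and a conflict means the same read was resolved to two different values.

```python
def combine_bindings(bindings_list):
    """Merge bindings; return None if any read is bound to two distinct values."""
    combined = {}
    for bindings in bindings_list:
        for read, value in bindings.items():
            if read in combined and combined[read] != value:
                return None  # conflicting resolutions of the same read
            combined[read] = value
    return combined
```

Rejecting conflicting combinations is what keeps a resolved value internally consistent: every occurrence of the same read expression must have been resolved to the same stored value.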
- class macaron.code_analyzer.dataflow_analysis.evaluation.EvaluationTransformer(state)
Bases:
object
Expression transformer which evaluates the expression to produce a set of resolved values.
The expression is evaluated in the context of a specified abstract storage state.
- __init__(state)
Initialize transformer with state from which to resolve reads.
- transform_write(location, value)
Transform a write location and value, returning the set of resolved values with the necessary bindings.
- Return type:
- transform_value(value)
Transform a value expression, returning the set of resolved values with the necessary bindings.
- Return type:
- transform_location(location)
Transform a location expression, returning the set of resolved values with the necessary bindings.
- Return type:
- transform_location_specifier(location)
Transform a location specifier expression, returning the set of resolved values with the necessary bindings.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.evaluation.ContainsSymbolicVisitor
Bases:
object
Visitor to determine whether a given expression contains any symbolic expressions.
- visit_value(value)
Search value expression for symbolic expressions and return whether any were found.
- Return type:
- visit_location(location)
Search location expression for symbolic expressions and return whether any were found.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.filter_symbolic_values(values)
Filter out symbolic values.
Returns a set containing all elements from the given set that do not contain any symbolic expressions.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.filter_symbolic_locations(locs)
Filter out symbolic locations.
Returns a set containing all elements from the given set that do not contain any symbolic expressions.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.filter_symbolic_location_specifiers(locs)
Filter out symbolic location specifiers.
Returns a set containing all elements from the given set that do not contain any symbolic expressions.
- Return type:
- macaron.code_analyzer.dataflow_analysis.evaluation.get_single_resolved_str(resolved_values)
If the given set contains only a single string literal value, return that string, or else None.
- macaron.code_analyzer.dataflow_analysis.evaluation.get_single_resolved_str_with_default(resolved_values, default_value)
If the given set contains only a single string literal value, return that string, otherwise return the default value.
- Return type:
macaron.code_analyzer.dataflow_analysis.facts module
Definitions of dataflow analysis representation for value expressions and abstract storage locations.
Also includes an incomplete implementation of serialization/deserialization to a Souffle-datalog-compatible representation, which originated as a remnant of a previous prototype version that involved the datalog engine in the analysis, but is retained here because the serialization is useful for producing a human-readable string representation for debugging purposes, and it may be necessary in future to make these expressions available to the policy engine (which uses datalog). Deserialization is currently non-functional primarily due to the inability to deserialize scope identity, but may potentially be revisited in future, so is left here for posterity.
- class macaron.code_analyzer.dataflow_analysis.facts.Value
Bases:
ABC
Base class for value expressions.
Subclasses should be comparable by structural equality.
- class macaron.code_analyzer.dataflow_analysis.facts.LocationSpecifier
Bases:
ABC
Base class for location expressions.
Subclasses should be comparable by structural equality.
- class macaron.code_analyzer.dataflow_analysis.facts.Scope(name, outer_scope=None)
Bases:
object
Representation of a scope in which a location may exist.
This allows for distinct locations with the same name/path/expression to exist separately in different namespaces.
A scope may have an outer scope, such that a read from a scope may return values from the outer scope(s).
Unlike other expression classes, scopes are distinguished by object identity and not structural equality (TODO now that scopes have names, maybe should revisit this since it makes serialization/deserialization difficult).
- __init__(name, outer_scope=None)
Initialize scope.
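Identity-based comparison and outer-scope chaining can be sketched with a minimal stand-in (the real `Scope` may carry more state). Two scopes with the same name remain distinct objects, and lookups may walk the outer chain:

```python
class Scope:
    """Minimal sketch: scopes compare by object identity (the Python default)."""
    def __init__(self, name, outer_scope=None):
        self.name = name
        self.outer_scope = outer_scope

def in_chain(scope, target):
    """Return whether target is scope itself or one of its outer scopes."""
    while scope is not None:
        if scope is target:
            return True
        scope = scope.outer_scope
    return False

workflow_env = Scope("env")
job_env = Scope("env", outer_scope=workflow_env)  # same name, distinct scope
```

Identity semantics are why serialization is awkward (as the module docstring notes): a name alone cannot reconstruct which of several same-named scope objects was meant.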
- class macaron.code_analyzer.dataflow_analysis.facts.ParameterPlaceholderScope(name)
Bases:
Scope
Special scope placeholder to allow generic parameterized expressions.
TODO This is not really a proper subclass of Scope, should revisit type relationship.
- __init__(name)
Initialize placeholder scope with given parameter name.
- class macaron.code_analyzer.dataflow_analysis.facts.Location(scope, loc)
Bases:
object
A location expression qualified with the scope it resides in.
-
loc:
LocationSpecifier Location expression.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(scope, loc)
-
loc:
- class macaron.code_analyzer.dataflow_analysis.facts.StringLiteral(literal)
Bases:
Value
Value expression representing a string literal.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(literal)
- class macaron.code_analyzer.dataflow_analysis.facts.Read(loc)
Bases:
Value
Value expression representing a read of the value stored at a location.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(loc)
- class macaron.code_analyzer.dataflow_analysis.facts.ArbitraryNewData(at)
Bases:
Value
Value expression representing some arbitrary data.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(at)
- class macaron.code_analyzer.dataflow_analysis.facts.InstalledPackage(name, version, distribution, url)
Bases:
Value
Value expression representing an installed package, with identifying metadata (name, version, etc.).
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name, version, distribution, url)
- class macaron.code_analyzer.dataflow_analysis.facts.UnaryStringOperator(value)
Bases:
Enum
Unary operators.
- BASENAME = 1
- BASE64_ENCODE = 2
- BASE64DECODE = 3
- macaron.code_analyzer.dataflow_analysis.facts.un_op_to_datalog_fact_string(op)
Return string representation of operator (in datalog serialized format).
- Return type:
- class macaron.code_analyzer.dataflow_analysis.facts.BinaryStringOperator(value)
Bases:
Enum
Binary operators.
- STRING_CONCAT = 1
- macaron.code_analyzer.dataflow_analysis.facts.bin_op_to_datalog_fact_string(op)
Return string representation of operator (in datalog serialized format).
- Return type:
- class macaron.code_analyzer.dataflow_analysis.facts.UnaryStringOp(op, operand)
Bases:
Value
Value expression representing a unary operator.
-
op:
UnaryStringOperator Operator.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(op, operand)
-
op:
- class macaron.code_analyzer.dataflow_analysis.facts.BinaryStringOp(op, operand1, operand2)
Bases:
Value
Value expression representing a binary operator.
-
op:
BinaryStringOperator Operator.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- static get_string_concat(operand1, operand2)
Construct a string concatenation operator.
Applies some simple constant-folding simplifications.
- Return type:
- __init__(op, operand1, operand2)
-
op:
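The constant folding mentioned for get_string_concat can be sketched as follows. This is a hypothetical simplification where string literals are plain `str` values and unresolved operands are arbitrary non-`str` objects; the real code operates on `Value` expression nodes.

```python
def get_string_concat(a, b):
    """Build a concatenation, folding constants where possible."""
    if isinstance(a, str) and isinstance(b, str):
        return a + b                 # two literals fold into one literal
    if a == "":
        return b                     # "" is the identity on the left
    if b == "":
        return a                     # "" is the identity on the right
    return ("concat", a, b)          # otherwise keep a symbolic operator node
```

Folding keeps the abstract state small: long chains of literal fragments (common when expanding shell strings piece by piece) collapse to a single literal instead of deep operator trees.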
- class macaron.code_analyzer.dataflow_analysis.facts.ParameterPlaceholderValue(name)
Bases:
Value
Special placeholder value to allow generic parameterized expressions.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name)
- class macaron.code_analyzer.dataflow_analysis.facts.Symbolic(val)
Bases:
Value
Value expression representing a symbolic expression.
Represents an expression that has been “frozen” in symbolic form rather than evaluated concretely.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(val)
- class macaron.code_analyzer.dataflow_analysis.facts.SingleBashTokenConstraint(val)
Bases:
Value
Value expression representing a constraint that the underlying value does not parse as multiple Bash tokens.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(val)
- class macaron.code_analyzer.dataflow_analysis.facts.Filesystem(path)
Bases:
LocationSpecifier
Location expression representing a filesystem location at a particular file path.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(path)
- class macaron.code_analyzer.dataflow_analysis.facts.Variable(name)
Bases:
LocationSpecifier
Location expression representing a variable.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name)
- class macaron.code_analyzer.dataflow_analysis.facts.Artifact(name, file)
Bases:
LocationSpecifier
Location expression representing a file stored within some named artifact storage location.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name, file)
- class macaron.code_analyzer.dataflow_analysis.facts.FilesystemAnyUnderDir(path)
Bases:
LocationSpecifier
Location expression representing any file under a particular directory.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(path)
- class macaron.code_analyzer.dataflow_analysis.facts.ArtifactAnyFilename(name)
Bases:
LocationSpecifier
Location expression representing any file contained within a named artifact storage location.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name)
- class macaron.code_analyzer.dataflow_analysis.facts.ParameterPlaceholderLocation(name)
Bases:
LocationSpecifier
Special placeholder location expression to allow generic parameterized expressions.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name)
- class macaron.code_analyzer.dataflow_analysis.facts.Console
Bases:
LocationSpecifier
Location expression representing a console, pipe or other text stream.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__()
- class macaron.code_analyzer.dataflow_analysis.facts.Installed(name)
Bases:
LocationSpecifier
Location expression representing an installed package.
- to_datalog_fact_string()
Return string representation of expression (in datalog serialized format).
- Return type:
- __init__(name)
- macaron.code_analyzer.dataflow_analysis.facts.enquote_datalog_string_literal(literal)
Enquote a datalog string literal, with appropriate escaping.
- Return type:
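A minimal sketch of such enquoting, under the assumption (hedged: the exact escape set used by the module is not shown here) that double quotes and backslashes are the characters needing escapes in a datalog string literal:

```python
def enquote_datalog_string_literal(literal):
    """Wrap a string in double quotes, escaping backslashes and quotes."""
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

Escaping backslashes first matters: doing it second would re-escape the backslashes just introduced for the quotes.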
- exception macaron.code_analyzer.dataflow_analysis.facts.FactParseError
Bases:
Exception
Raised when an error occurs during fact parsing.
- macaron.code_analyzer.dataflow_analysis.facts.consume_whitespace(text)
Consume leading whitespace, returning the remainder of the text.
- Return type:
- macaron.code_analyzer.dataflow_analysis.facts.consume(text, token)
Consume the leading token from the text.
Raises exception if text does not start with the token.
- Return type:
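These consume-style helpers follow a common hand-written parsing pattern: each function takes the remaining text and returns whatever is left after the leading piece, raising on mismatch. A plausible sketch (not the module's exact code):

```python
def consume_whitespace(text):
    """Strip leading whitespace, returning the remainder of the text."""
    return text.lstrip()

def consume(text, token):
    """Consume the leading token, raising if the text does not start with it."""
    if not text.startswith(token):
        raise ValueError(f"expected {token!r} at: {text[:20]!r}")
    return text[len(token):]
```

Chaining the helpers threads the remaining text through the parse, e.g. `consume(consume_whitespace('  read(x)'), 'read(')` leaves `'x)'` for the next parsing step.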
- macaron.code_analyzer.dataflow_analysis.facts.parse_qualified_name(text)
Parse a qualified name, returning the name and the remainder of the text.
- macaron.code_analyzer.dataflow_analysis.facts.parse_symbol(text)
Parse datalog-serialized string literal.
- macaron.code_analyzer.dataflow_analysis.facts.parse_location_specifier(text)
Deserialize location specifier from string representation (in datalog serialized format).
- Return type:
- macaron.code_analyzer.dataflow_analysis.facts.parse_location(text)
Deserialize location from string representation (in datalog serialized format).
Currently non-functional primarily due to the inability to deserialize scope identity.
- macaron.code_analyzer.dataflow_analysis.facts.parse_value(text)
Deserialize value expression from string representation (in datalog serialized format).
- macaron.code_analyzer.dataflow_analysis.facts.parse_un_op(text)
Deserialize unary operator from string representation (in datalog serialized format).
- Return type:
- macaron.code_analyzer.dataflow_analysis.facts.parse_bin_op(text)
Deserialize binary operator from string representation (in datalog serialized format).
- Return type:
macaron.code_analyzer.dataflow_analysis.github module
Dataflow analysis implementation for analysing GitHub Actions Workflow build pipelines.
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsWorkflowContext(analysis_context, artifacts, releases, env, workflow_variables, console, source_filepath)
Bases:
Context
Context for the top-level scope of a GitHub Actions Workflow.
-
analysis_context:
Union[OwningContextRef[AnalysisContext],NonOwningContextRef[AnalysisContext]] Outer analysis context.
-
artifacts:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for artifact storage within the pipeline execution (for upload/download artifact).
-
releases:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for artifacts published as GitHub releases by the pipeline.
-
env:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for environment variables (env block at top-level of workflow).
-
workflow_variables:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for variables within the workflow.
-
console:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for console output.
- static create(analysis_context, source_filepath)
Create a new workflow context and its associated scopes.
- Parameters:
analysis_context (core.ContextRef[core.AnalysisContext]) – Outer analysis context.
source_filepath (str) – Filepath of workflow file.
- Returns:
The new workflow context.
- Return type:
- direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(analysis_context, artifacts, releases, env, workflow_variables, console, source_filepath)
-
analysis_context:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsJobContext(workflow_context, filesystem, env, job_variables)
Bases:
Context
Context for a job within a GitHub Actions Workflow.
-
workflow_context:
Union[OwningContextRef[GitHubActionsWorkflowContext],NonOwningContextRef[GitHubActionsWorkflowContext]] Outer workflow context.
-
filesystem:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for filesystem used by the job and its steps.
-
env:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for environment variables (env block at job level).
-
job_variables:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for variables within the job (step output variables, etc.).
- static create(workflow_context)
Create a new job context and its associated scopes.
Env and job variables scopes inherit from outer context.
- Parameters:
workflow_context (core.ContextRef[GitHubActionsWorkflowContext]) – Outer workflow context.
- Returns:
The new job context.
- Return type:
- direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(workflow_context, filesystem, env, job_variables)
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsStepContext(job_context, env, output_var_prefix)
Bases:
Context
Context for a step within a job within a GitHub Actions Workflow.
-
job_context:
Union[OwningContextRef[GitHubActionsJobContext],NonOwningContextRef[GitHubActionsJobContext]] Outer job context.
-
env:
Union[OwningContextRef[Scope],NonOwningContextRef[Scope]] Scope for environment variables (env block at step level).
-
output_var_prefix:
str | None Name prefix for step output variables (stored in the job variables) belonging to this step (e.g. “steps.step_id.outputs.”).
- static create(job_context, step_id)
Create a new step context and its associated scopes.
Env scope inherits from outer context. Output var prefix is derived from step_id.
- Parameters:
job_context (core.ContextRef[GitHubActionsJobContext]) – Outer job context.
step_id (str | None) – Step id. If provided, used to derive the name prefix for step output variables.
- Returns:
The new step context.
- Return type:
- direct_refs()
Yield the direct references of the context, either to scopes or to other contexts.
- Return type:
Iterator[Union[OwningContextRef[Context],NonOwningContextRef[Context],OwningContextRef[Scope],NonOwningContextRef[Scope]]]
- __init__(job_context, env, output_var_prefix)
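The output_var_prefix convention described above can be sketched as follows. This is an illustrative reconstruction of the documented "steps.step_id.outputs." naming scheme only; the helper names (make_output_var_prefix, store_step_output) are hypothetical and not part of macaron's API.

```python
# Sketch of how step output variables might be namespaced within the
# flat job-variables scope, per the documented prefix convention.
# All names here are illustrative assumptions.

def make_output_var_prefix(step_id):
    """Derive the name prefix for a step's output variables, or None
    when the step has no id (its outputs are then not addressable)."""
    if step_id is None:
        return None
    return f"steps.{step_id}.outputs."

def store_step_output(job_variables, step_id, output_name, value):
    """Record a step output in a flat job-variables mapping."""
    prefix = make_output_var_prefix(step_id)
    if prefix is not None:
        job_variables[prefix + output_name] = value

job_vars = {}
store_step_output(job_vars, "build", "version", "1.2.3")
# job_vars == {"steps.build.outputs.version": "1.2.3"}
```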
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsWorkflowNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Workflow.
Defines how to interpret a parsed workflow and generate its analysis representation.
- __init__(definition, context)
Initialize node.
Typically, construction should be done via the create function rather than using this constructor directly.
- definition: github_workflow_model.Workflow
Parsed workflow AST.
- context: core.ContextRef[GitHubActionsWorkflowContext]
Workflow context.
- identify_interpretations(state)
Interpret the workflow AST to generate control flow representation.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the workflow name and scopes.
- static create(workflow, analysis_context, source_filepath)
Create workflow node and its associated context.
- Parameters:
workflow (github_workflow_model.Workflow) – Parsed workflow AST.
analysis_context (core.ContextRef[core.AnalysisContext]) – Outer analysis context.
source_filepath (str) – Filepath of workflow file.
- Returns:
The new workflow node.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsWorkflowNode(definition, context, env_block, jobs, order)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a GitHub Actions Workflow.
Control flow structure executes each job in an arbitrary linear sequence (by default a topological sort satisfying the job dependencies). If an env block exists, it is applied beforehand.
- __init__(definition, context, env_block, jobs, order)
Initialize workflow node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (github_workflow_model.Workflow) – Parsed workflow AST.
context (core.ContextRef[GitHubActionsWorkflowContext]) – Workflow context.
env_block (RawGitHubActionsEnvNode | None) – Node to apply effects of env block, if any.
jobs (dict[str, RawGitHubActionsJobNode]) – Job nodes, identified by their job id.
order (list[str]) – List of job ids specifying job execution order.
- definition: github_workflow_model.Workflow
Parsed workflow AST.
- context: core.ContextRef[GitHubActionsWorkflowContext]
Workflow context.
- env_block: RawGitHubActionsEnvNode | None
Node to apply effects of env block, if any.
- jobs: dict[str, RawGitHubActionsJobNode]
Job nodes, identified by their job id.
- order: list[str]
List of job ids specifying job execution order.
- get_successors(node, exit_type)
Return the successors for a particular exit of a particular node.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the workflow name and scopes.
- static create(workflow, context)
Create workflow node from workflow AST.
Also creates a job node for each job, and performs a topological sort of the job dependency graph to choose an arbitrary valid sequential execution order.
- Parameters:
workflow (github_workflow_model.Workflow) – Parsed workflow AST.
context (core.NonOwningContextRef[GitHubActionsWorkflowContext]) – Workflow context.
- Returns:
The new workflow node.
- Return type:
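The topological sort performed by create can be sketched with Kahn's algorithm over the jobs' `needs` dependencies. This is a self-contained illustration under assumed input shapes (job id mapped to the list of job ids it depends on); topological_job_order is a hypothetical name, and macaron's actual implementation may pick any valid order.

```python
from collections import deque

def topological_job_order(jobs):
    """Return one valid sequential execution order for a mapping of
    job id -> list of job ids it depends on (GitHub Actions `needs`).
    Uses Kahn's algorithm; raises on dependency cycles."""
    indegree = {job_id: 0 for job_id in jobs}
    dependents = {job_id: [] for job_id in jobs}
    for job_id, needs in jobs.items():
        for dep in needs:
            indegree[job_id] += 1
            dependents[dep].append(job_id)
    # Start with jobs that have no dependencies (sorted for determinism).
    ready = deque(sorted(j for j, d in indegree.items() if d == 0))
    order = []
    while ready:
        job_id = ready.popleft()
        order.append(job_id)
        for succ in dependents[job_id]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    if len(order) != len(jobs):
        raise ValueError("cycle in job dependency graph")
    return order

jobs = {"build": [], "test": ["build"], "release": ["build", "test"]}
# topological_job_order(jobs) -> ["build", "test", "release"]
```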
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsJobNode(definition, job_id, context)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Job.
Defines how to interpret the different kinds of jobs (normal jobs, reusable workflow call jobs), and generate their analysis representation.
- __init__(definition, job_id, context)
Initialize node.
- definition: github_workflow_model.Job
Parsed job AST.
- job_id: str
Job id.
- context: core.ContextRef[GitHubActionsJobContext]
Job context.
- identify_interpretations(state)
Interpret job AST to generate representation for either a normal job or a reusable workflow call job.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsNormalJobNode(definition, job_id, matrix_block, env_block, steps, output_block, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a GitHub Actions Normal Job.
Control flow structure executes each step in the order defined by the job, preceded by applying the effects of the matrix and env blocks (if present) and followed by applying the effects of the output block (if present). (TODO: generating the output block is not yet implemented.)
- __init__(definition, job_id, matrix_block, env_block, steps, output_block, context)
Initialize job node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (github_workflow_model.NormalJob) – Parsed job AST.
job_id (str) – Job id.
matrix_block (RawGitHubActionsMatrixNode | None) – Node to apply effects of matrix block, if any.
env_block (RawGitHubActionsEnvNode | None) – Node to apply effects of env block, if any.
steps (list[RawGitHubActionsStepNode]) – Step nodes, in execution order.
output_block (core.Node | None) – Node to apply effects of output block, if any.
context (core.ContextRef[GitHubActionsJobContext]) – Job context.
- definition: github_workflow_model.NormalJob
Parsed job AST.
- job_id: str
Job id.
- matrix_block: RawGitHubActionsMatrixNode | None
Node to apply effects of matrix block, if any.
- env_block: RawGitHubActionsEnvNode | None
Node to apply effects of env block, if any.
- steps: list[RawGitHubActionsStepNode]
Step nodes, in execution order.
- output_block: core.Node | None
Node to apply effects of output block, if any.
- context: core.ContextRef[GitHubActionsJobContext]
Job context.
- get_successors(node, exit_type)
Return the successors for a particular exit of a particular node.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the job id and scopes.
- static create(job, job_id, context)
Create normal job node from job AST. Also creates a step node for each step.
- Parameters:
job (github_workflow_model.NormalJob) – Parsed job AST.
job_id (str) – Job id.
context (core.NonOwningContextRef[GitHubActionsJobContext]) – Job context.
- Returns:
The new job node.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsReusableWorkflowCallNode(definition, job_id, context, uses_name, uses_version, with_parameters)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Reusable Workflow Call Job.
Defines how to interpret the semantics of different supported reusable workflows that may be invoked (TODO currently none are supported).
- __init__(definition, job_id, context, uses_name, uses_version, with_parameters)
Initialize reusable workflow call node.
- Parameters:
definition (github_workflow_model.ReusableWorkflowCallJob) – Parsed reusable workflow call AST.
job_id (str) – Job id.
context (core.ContextRef[GitHubActionsJobContext]) – Job context.
uses_name (str) – Name of the reusable workflow being invoked (without version component).
uses_version (str | None) – Version of the reusable workflow being invoked (if specified).
with_parameters (dict[str, facts.Value]) – Input parameters specified for reusable workflow.
- definition: github_workflow_model.ReusableWorkflowCallJob
Parsed reusable workflow call AST.
- job_id: str
Job id.
- context: core.ContextRef[GitHubActionsJobContext]
Job context.
- uses_name: str
Name of the reusable workflow being invoked (without version component).
- uses_version: str | None
Version of the reusable workflow being invoked (if specified).
- with_parameters: dict[str, facts.Value]
Input parameters specified for reusable workflow.
- identify_interpretations(state)
Interpret the semantics of the different supported reusable workflows.
(TODO currently none are supported).
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsStepNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Step.
Defines how to interpret the different kinds of steps (run jobs, action steps), and generate their analysis representation.
- __init__(definition, context)
Initialize node.
- definition: github_workflow_model.Step
Parsed step AST.
- context: core.ContextRef[GitHubActionsStepContext]
Step context.
- identify_interpretations(state)
Interpret step AST to generate representation depending on whether it is a run step or an action step.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsActionStepNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Action Step.
Defines how to extract the name, version and parameters used to invoke the action, and generate a node with those details resolved for further interpretation.
- __init__(definition, context)
Initialize node.
- definition: github_workflow_model.ActionStep
Parsed step AST.
- context: core.ContextRef[GitHubActionsStepContext]
Step context.
- identify_interpretations(state)
Interpret action step AST to extract the name, version and parameters.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsActionStepNode(definition, context, uses_name, uses_version, with_parameters)
Bases:
InterpretationNode
Interpretation node representing a GitHub Actions Action Step.
Defines how to interpret the semantics of different supported actions that may be invoked.
- __init__(definition, context, uses_name, uses_version, with_parameters)
Initialize action step node.
- Parameters:
definition (github_workflow_model.ActionStep) – Parsed step AST.
context (core.ContextRef[GitHubActionsStepContext]) – Step context.
uses_name (str) – Name of the action being invoked (without version component).
uses_version (str | None) – Version of the action being invoked (if specified).
with_parameters (dict[str, facts.Value]) – Input parameters specified for action.
- definition: github_workflow_model.ActionStep
Parsed step AST.
- context: core.ContextRef[GitHubActionsStepContext]
Step context.
- uses_name: str
Name of the action being invoked (without version component).
- uses_version: str | None
Version of the action being invoked (if specified).
- with_parameters: dict[str, facts.Value]
Input parameters specified for action.
- identify_interpretations(state)
Interpret the semantics of the different supported actions.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.GitHubActionsRunStepNode(definition, env_block, shell_block, context)
Bases:
ControlFlowGraphNode
Control-flow-graph node representing a GitHub Actions Run Step.
Control flow structure executes the shell script defined by the step. If an env block exists, it is applied beforehand.
- __init__(definition, env_block, shell_block, context)
Initialize run step node.
Typically, construction should be done via the create function rather than using this constructor directly.
- Parameters:
definition (github_workflow_model.RunStep) – Parsed step AST.
env_block (RawGitHubActionsEnvNode | None) – Node to apply effects of env block, if any.
shell_block (bash.RawBashScriptNode) – Shell script to be run.
context (core.ContextRef[GitHubActionsStepContext]) – Step context.
- definition: github_workflow_model.RunStep
Parsed step AST.
- env_block: RawGitHubActionsEnvNode | None
Node to apply effects of env block, if any.
- shell_block: bash.RawBashScriptNode
Shell script to be run.
- context: core.ContextRef[GitHubActionsStepContext]
Step context.
- get_successors(node, exit_type)
Return the successors for a particular exit of a particular node.
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- get_printable_properties_table()
Return a properties table containing the step id, name, and scopes.
- static create(run_step, context)
Create run step node from step AST.
- Parameters:
run_step (github_workflow_model.RunStep) – Parsed step AST.
context (core.NonOwningContextRef[GitHubActionsStepContext]) – Step context.
- Returns:
The new run step node.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsEnvNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing an env block in a GitHub Actions Workflow/Job/Step.
Defines how to interpret the declarative env block to generate imperative constructs to write the values to the env variables.
- __init__(definition, context)
Initialize env block node.
- Parameters:
definition (github_workflow_model.Env) – Parsed env block AST.
context (core.ContextRef[GitHubActionsWorkflowContext | GitHubActionsJobContext | GitHubActionsStepContext]) – Outer context.
- definition: github_workflow_model.Env
Parsed env block AST.
- context: core.ContextRef[GitHubActionsWorkflowContext | GitHubActionsJobContext | GitHubActionsStepContext]
Outer context.
- identify_interpretations(state)
Interpret declarative env block to generate imperative constructs to write to the env vars.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
- class macaron.code_analyzer.dataflow_analysis.github.RawGitHubActionsMatrixNode(definition, context)
Bases:
InterpretationNode
Interpretation node representing a matrix block in a GitHub Actions Job.
Defines how to interpret the declarative matrix block to generate imperative constructs to write the values to the matrix variables.
- __init__(definition, context)
Initialize matrix node.
- Parameters:
definition (github_workflow_model.Matrix) – Parsed matrix block AST.
context (core.ContextRef[GitHubActionsJobContext]) – Outer job context.
- definition: github_workflow_model.Matrix
Parsed matrix block AST.
- context: core.ContextRef[GitHubActionsJobContext]
Outer job context.
- identify_interpretations(state)
Interpret declarative matrix block to generate imperative constructs to write to the matrix variables.
- Return type:
dict[InterpretationKey,Callable[[],Node]]
- get_exit_state_transfer_filter()
Return state transfer filter to clear scopes owned by this node after this node exits.
- Return type:
macaron.code_analyzer.dataflow_analysis.github_expr module
Parser for GitHub Actions expression language.
- macaron.code_analyzer.dataflow_analysis.github_expr.extract_expr_variable_name(node)
Return variable access path for token.
If the given node is a variable access or sequence of property accesses, return the access path as a string, otherwise return None.
- macaron.code_analyzer.dataflow_analysis.github_expr.extract_value_from_expr_string(s, var_scope)
Return a value expression representation of a string containing GitHub Actions expressions.
GitHub Actions expressions within the string are denoted by “${{ <expr> }}”.
Returns None if the string is unrepresentable.
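The "${{ <expr> }}" scanning described above can be sketched as follows. This is a deliberately simplified illustration, not macaron's github_expr parser: it only splits literal text from expression segments and recognises plain dotted access paths; the real parser handles the full GitHub Actions expression grammar.

```python
import re

# Simplified scanner for GitHub Actions expression delimiters.
# Helper names here are illustrative assumptions.
EXPR_RE = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")
ACCESS_PATH_RE = re.compile(r"^[A-Za-z_][\w-]*(\.[A-Za-z_][\w-]*)*$")

def split_expr_string(s):
    """Split a string into ("text", literal) and ("expr", expr) parts."""
    parts = []
    pos = 0
    for m in EXPR_RE.finditer(s):
        if m.start() > pos:
            parts.append(("text", s[pos:m.start()]))
        parts.append(("expr", m.group(1)))
        pos = m.end()
    if pos < len(s):
        parts.append(("text", s[pos:]))
    return parts

def extract_variable_name(expr):
    """Return the access path if expr is a plain variable or property
    access (e.g. "secrets.MY_TOKEN"), otherwise None."""
    return expr if ACCESS_PATH_RE.match(expr) else None

parts = split_expr_string("v=${{ github.ref }} t=${{ secrets.TOKEN }}")
```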
macaron.code_analyzer.dataflow_analysis.models module
Models of supported commands, actions, etc. that may be invoked by build pipelines.
Defines how they are modelled by the dataflow analysis in terms of their effect on the abstract state.
- class macaron.code_analyzer.dataflow_analysis.models.BoundParameterisedStatementSet(parameterised_stmts, value_parameter_binds=None, location_parameter_binds=None, scope_parameter_binds=None)
Bases:
object
Representation of a set of (simultaneous) write operations.
Defined as a reference to a set of generic parameterised statements, along with a set of parameter bindings that instantiate the parameterised statements with concrete subexpressions.
- __init__(parameterised_stmts, value_parameter_binds=None, location_parameter_binds=None, scope_parameter_binds=None)
Initialize bound parameterised statement set.
- Parameters:
parameterised_stmts (evaluation.StatementSet) – Set of generic parameterised statements.
value_parameter_binds (dict[str, facts.Value] | None) – Parameter bindings for value.
location_parameter_binds (dict[str, facts.LocationSpecifier] | None) – Parameter bindings for locations.
scope_parameter_binds (dict[str, facts.Scope] | None) – Parameter bindings for scopes.
-
parameterised_stmts:
StatementSet Set of generic parameterised statements.
-
location_parameter_binds:
dict[str,LocationSpecifier] Parameter bindings for locations.
-
instantiated_statements:
StatementSet Instantiated statements.
- get_statements()
Return instantiated statement set.
- Return type:
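The binding idea behind BoundParameterisedStatementSet can be sketched generically: a parameterised statement template contains named holes, and binding maps instantiate those holes with concrete subexpressions. The tuple encoding and the instantiate helper below are hypothetical stand-ins, not macaron's representation.

```python
# Illustrative sketch of instantiating parameterised statements with
# concrete parameter bindings. All shapes here are assumptions.

def instantiate(parameterised_stmts, value_binds):
    """Replace ("param", name) placeholders in each statement tuple
    with the concrete value bound to that name."""
    def subst(term):
        if isinstance(term, tuple) and term and term[0] == "param":
            return value_binds[term[1]]
        return term
    return [tuple(subst(t) for t in stmt) for stmt in parameterised_stmts]

# A generic "write value to location" model with two holes:
template = [("write", ("param", "loc"), ("param", "val"))]
stmts = instantiate(template, {"loc": "env.FOO", "val": "bar"})
# stmts == [("write", "env.FOO", "bar")]
```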
- class macaron.code_analyzer.dataflow_analysis.models.BoundParameterisedModelNode(stmts)
Bases:
StatementNode
Statement node that applies effects as defined in a provided model.
Subclasses will define a statement node with a specific model.
- __init__(stmts)
Initialize model statement node.
- stmts: BoundParameterisedStatementSet
Statement effects model.
- class macaron.code_analyzer.dataflow_analysis.models.InstallPackageNode(install_scope, name, version, distribution, url)
Bases:
BoundParameterisedModelNode
Model for package installation.
Stores a representation of the installed package into the abstract “installed packages” location.
- static get_model()
Return the model.
- Return type:
- __init__(install_scope, name, version, distribution, url)
Initialize install package node.
- Parameters:
install_scope (facts.Scope) – Scope into which to install.
name (facts.Value) – Package name.
version (facts.Value) – Package version.
distribution (facts.Value) – Package distribution.
url (facts.Value) – URL of package.
- install_scope: facts.Scope
Scope into which to install.
- name: facts.Value
Package name.
- version: facts.Value
Package version.
- distribution: facts.Value
Package distribution.
- url: facts.Value
URL of package.
- class macaron.code_analyzer.dataflow_analysis.models.VarAssignKind(value)
Bases:
Enum
Kind of variable assignment.
- BASH_ENV_VAR = 1
Bash environment variable.
- BASH_FUNC_DECL = 2
Bash function declaration.
- GITHUB_JOB_VAR = 3
GitHub job variable.
- GITHUB_ENV_VAR = 4
GitHub environment variable.
- OTHER = 5
Other uncategorized variable.
- class macaron.code_analyzer.dataflow_analysis.models.VarAssignNode(kind, var_scope, var_name, value)
Bases:
BoundParameterisedModelNode
Model for variable assignment.
Stores the assigned value to the variable location.
- static get_model()
Return the model.
- Return type:
- __init__(kind, var_scope, var_name, value)
Initialize variable assignment node.
- Parameters:
kind (VarAssignKind) – The kind of variable.
var_scope (facts.Scope) – The scope in which the variable is stored.
var_name (facts.Value) – The name of the variable.
value (facts.Value) – The value to assign to the variable.
- kind: VarAssignKind
The kind of variable.
- var_scope: facts.Scope
The scope in which the variable is stored.
- var_name: facts.Value
The name of the variable.
- value: facts.Value
The value to assign to the variable.
- class macaron.code_analyzer.dataflow_analysis.models.GitHubActionsGitCheckoutModelNode
Bases:
StatementNode
Model for GitHub git checkout operation.
Currently modelled as a no-op.
- class macaron.code_analyzer.dataflow_analysis.models.GitHubActionsUploadArtifactModelNode(artifacts_scope, artifact_name, artifact_file, filesystem_scope, path)
Bases:
BoundParameterisedModelNode
Model for uploading artifacts to GitHub pipeline artifact storage.
Stores the content read from a file to the artifact storage location.
- static get_model()
Return the model.
- Return type:
- __init__(artifacts_scope, artifact_name, artifact_file, filesystem_scope, path)
Initialize upload artifacts node.
- Parameters:
artifacts_scope (facts.Scope) – Scope for pipeline artifact storage.
artifact_name (facts.Value) – Artifact name.
artifact_file (facts.Value) – Artifact filename.
filesystem_scope (facts.Scope) – Scope for filesystem from which to read file.
path (facts.Value) – File path to read artifact content from.
- artifacts_scope: facts.Scope
Scope for pipeline artifact storage.
- artifact_name: facts.Value
Artifact name.
- artifact_file: facts.Value
Artifact filename.
- filesystem_scope: facts.Scope
Scope for filesystem from which to read file.
- path: facts.Value
File path to read artifact content from.
- class macaron.code_analyzer.dataflow_analysis.models.GitHubActionsDownloadArtifactModelNode(artifacts_scope, artifact_name, filesystem_scope)
Bases:
BoundParameterisedModelNode
Model for downloading artifacts from GitHub pipeline artifact storage.
For each file in the artifact, reads the content of that artifact and stores it to the filesystem under the same filename.
- static get_model()
Return model.
- Return type:
- __init__(artifacts_scope, artifact_name, filesystem_scope)
Initialize download artifacts node.
- Parameters:
artifacts_scope (facts.Scope) – Scope for pipeline artifact storage.
artifact_name (facts.Value) – Artifact name.
filesystem_scope (facts.Scope) – Scope for filesystem to store artifacts to.
- artifacts_scope: facts.Scope
Scope for pipeline artifact storage.
- artifact_name: facts.Value
Artifact name.
- filesystem_scope: facts.Scope
Scope for filesystem to store artifacts to.
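The documented model behaviour of this node can be sketched with plain mappings: for each file recorded under the artifact name, the content is copied to the filesystem under the same filename. The dict-based storage shapes and the download_artifact helper are hypothetical stand-ins for macaron's abstract scopes, not its actual code.

```python
# Illustrative sketch of the artifact-download model. All names and
# data shapes here are assumptions for illustration only.

def download_artifact(artifact_storage, artifact_name, filesystem):
    """Copy every file of the named artifact into the filesystem map,
    keeping the same filename."""
    for filename, content in artifact_storage.get(artifact_name, {}).items():
        filesystem[filename] = content

artifacts = {"dist": {"pkg-1.0.jar": "<jar bytes>", "pkg-1.0.pom": "<pom>"}}
fs = {}
download_artifact(artifacts, "dist", fs)
# fs now maps both filenames to the stored contents
```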
- class macaron.code_analyzer.dataflow_analysis.models.GitHubActionsReleaseModelNode(artifacts_scope, artifact_name, artifact_file, filesystem_scope, path)
Bases:
GitHubActionsUploadArtifactModelNode
Model for uploading artifacts to a GitHub release.
Modelled in the same way as artifact upload.
- class macaron.code_analyzer.dataflow_analysis.models.BashEchoNode(out_loc, value)
Bases:
BoundParameterisedModelNode
Model for Bash echo command, which writes the echoed value to some location.
- static get_model()
Return model.
- Return type:
- __init__(out_loc, value)
Initialize echo node.
- Parameters:
out_loc (facts.Location) – Output location.
value (facts.Value) – Value written.
- out_loc: facts.Location
Output location.
- value: facts.Value
Value written.
- class macaron.code_analyzer.dataflow_analysis.models.Base64EncodeNode(in_loc, out_loc)
Bases:
BoundParameterisedModelNode
Model for Base64 encode operation.
Reads a value from some location, Base64-encodes it and writes the result to another location.
- static get_model()
Return model.
- Return type:
- __init__(in_loc, out_loc)
Initialize Base64 encode node.
- Parameters:
in_loc (facts.Location) – Location to read input from.
out_loc (facts.Location) – Location to write encoded output to.
- in_loc: facts.Location
Location to read input from.
- out_loc: facts.Location
Location to write encoded output to.
- class macaron.code_analyzer.dataflow_analysis.models.Base64DecodeNode(in_loc, out_loc)
Bases:
BoundParameterisedModelNode
Model for Base64 decode operation.
Reads a value from some location, Base64-decodes it and writes the result to another location.
- static get_model()
Return model.
- Return type:
- __init__(in_loc, out_loc)
Initialize Base64 decode node.
- Parameters:
in_loc (facts.Location) – Location to read input from.
out_loc (facts.Location) – Location to write decoded output to.
- in_loc: facts.Location
Location to read input from.
- out_loc: facts.Location
Location to write decoded output to.
- class macaron.code_analyzer.dataflow_analysis.models.MavenBuildModelNode(filesystem_scope)
Bases:
BoundParameterisedModelNode
Model for Maven build commands.
Maven build behaviour is approximated as writing some files under the target directory.
- static get_model()
Return model.
- Return type:
- __init__(filesystem_scope)
Initialize Maven build node.
- Parameters:
filesystem_scope (facts.Scope) – Scope for filesystem written to.
- filesystem_scope: facts.Scope
Scope for filesystem written to.
macaron.code_analyzer.dataflow_analysis.printing module
Functions for printing/displaying dataflow analysis nodes in the form of graphviz (dot) output.
Allows the analysis representation and results to be rendered as a human-readable node-link graph.
Makes use of graphviz’s html-like label feature to add detailed information to each node. Tables are specified as a dict[str, set[tuple[str | None, str]]], which is rendered as a two-column table: the first column contains each key of the dict, and the second column contains the corresponding set of values as a nested vertical table. Each value has an optional label that, if present, is rendered in a visually distinguished manner alongside the value.
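The table shape and rendering described in the module comment can be sketched as follows. This is an illustrative approximation only: render_table is a hypothetical helper mimicking the described two-column layout with nested value tables and distinguished labels, not macaron's actual rendering code.

```python
from html import escape

def render_table(data):
    """Render dict[str, set[tuple[str | None, str]]] as a two-column
    graphviz HTML-like table string: keys in the first column, the set
    of values as a nested vertical table in the second, with optional
    labels rendered in bold alongside their value."""
    rows = []
    for key in sorted(data):
        cells = []
        for label, value in sorted(data[key], key=lambda pair: pair[1]):
            text = escape(value)
            if label is not None:
                text = "<B>" + escape(label) + "</B> " + text
            cells.append("<TR><TD>" + text + "</TD></TR>")
        inner = '<TABLE BORDER="0">' + "".join(cells) + "</TABLE>"
        rows.append("<TR><TD>" + escape(key) + "</TD><TD>" + inner + "</TD></TR>")
    return '<TABLE BORDER="1">' + "".join(rows) + "</TABLE>"

# Example state table: a location mapped to two possible values, one
# carrying a "new" label.
state_table = {"env.FOO": {(None, '"bar"'), ("new", '"baz"')}}
label = render_table(state_table)
```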
- macaron.code_analyzer.dataflow_analysis.printing.print_as_dot_graph(node, out, include_properties, include_states)
Print root node as dot graph.
- Parameters:
node (core.Node) – The root node to print.
out (TextIO) – Output stream to print to.
include_properties (bool) – Whether to include detail on the properties of each node (disable to make nodes simpler/smaller).
include_states (bool) – Whether to include detail on the abstract state at each node (disable to make nodes simpler/smaller).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.get_printable_table_for_state(state, state_filter=None)
Return a table of the stringified representation of the state.
Consists of a mapping of storage locations to the set of values they may contain (see module comment for description of the return type).
Values are additionally labeled with whether they were new and not copied, and whether they will be excluded by the given filter.
- macaron.code_analyzer.dataflow_analysis.printing.print_as_dot_string(node, out, include_properties, include_states)
Print node as dot representation (to be embedded within a dot graph).
- Parameters:
node (core.Node) – The node to print.
out (TextIO) – Output stream to print to.
include_properties (bool) – Whether to include detail on the properties of each node (disable to make nodes simpler/smaller).
include_states (bool) – Whether to include detail on the abstract state at each node (disable to make nodes simpler/smaller).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.print_cfg_node_as_dot_string(cfg_node, out, include_properties, include_states)
Print control-flow-graph node as dot representation (to be embedded within a dot graph).
- Parameters:
cfg_node (core.ControlFlowGraphNode) – The control-flow-graph node to print.
out (TextIO) – Output stream to print to.
include_properties (bool) – Whether to include detail on the properties of each node (disable to make nodes simpler/smaller).
include_states (bool) – Whether to include detail on the abstract state at each node (disable to make nodes simpler/smaller).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.print_statement_node_as_dot_string(node, out, include_properties, include_states)
Print statement node as dot representation (to be embedded within a dot graph).
- Parameters:
node (core.StatementNode) – The statement node to print.
out (TextIO) – Output stream to print to.
include_properties (bool) – Whether to include detail on the properties of each node (disable to make nodes simpler/smaller).
include_states (bool) – Whether to include detail on the abstract state at each node (disable to make nodes simpler/smaller).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.print_interpretation_node_as_dot_string(node, out, include_properties, include_states)
Print interpretation node as dot representation (to be embedded within a dot graph).
- Parameters:
node (core.InterpretationNode) – The interpretation node to print.
out (TextIO) – Output stream to print to.
include_properties (bool) – Whether to include detail on the properties of each node (disable to make nodes simpler/smaller).
include_states (bool) – Whether to include detail on the abstract state at each node (disable to make nodes simpler/smaller).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.escape_for_dot_html_like_label(s)
Return string escape for inclusion in a dot html-like label.
- Return type:
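The escaping required here can be approximated with the standard library: graphviz HTML-like labels treat "<", ">" and "&" as markup, so literal text must be escaped before embedding. A minimal sketch, assuming html.escape covers the needed characters (the real escape_for_dot_html_like_label may handle additional graphviz-specific cases):

```python
from html import escape

def escape_for_html_like_label(s):
    """Escape markup-significant characters for embedding in a
    graphviz HTML-like label. Illustrative helper, not macaron's."""
    return escape(s, quote=True)

escaped = escape_for_html_like_label('a < b && c > "d"')
# escaped == 'a &lt; b &amp;&amp; c &gt; &quot;d&quot;'
```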
- class macaron.code_analyzer.dataflow_analysis.printing.DotHtmlLikeTableConfiguration(header_colour, header_font_colour, header_font_size, header_font_bold, body_colour, body_font_colour, body_font_size)
Bases:
object
Configuration for rendering of dot html-like table.
- __init__(header_colour, header_font_colour, header_font_size, header_font_bold, body_colour, body_font_colour, body_font_size)
- macaron.code_analyzer.dataflow_analysis.printing.truncate_long_strings_for_display(s)
Truncate long string if necessary for display.
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.produce_dot_html_like_table(header, data, config)
Return the given data table rendered as a dot html-like label table.
See module comment for description of how data tables are rendered.
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.produce_node_dot_html_like_label(node_kind, node_type, node_label, config, subtables)
Return the given node table data rendered as a dot html-like label table.
Contains nested tables for each subtable (see module comment for description of how data tables are rendered).
- Return type:
- macaron.code_analyzer.dataflow_analysis.printing.produce_node_dot_def(node_id, node_kind, node_type, node_label, config, subtables)
Return the given node table data rendered as a dot node containing a html-like label table.
Contains nested tables for each subtable (see module comment for description of how data tables are rendered).
- Return type:
macaron.code_analyzer.dataflow_analysis.run_analysis_standalone module
Module providing an entry point to run the dataflow analysis independently of the Macaron command.
For experimentation and debugging purposes only.