macaron.malware_analyzer.pypi_heuristics.sourcecode package

Submodules

macaron.malware_analyzer.pypi_heuristics.sourcecode.pypi_sourcecode_analyzer module

Detect suspicious function calls in the code and trace the arguments back to their original values.

This allows for deeper analysis of potentially malicious behavior.

class macaron.malware_analyzer.pypi_heuristics.sourcecode.pypi_sourcecode_analyzer.PyPISourcecodeAnalyzer(resources_path=None)

Bases: BaseHeuristicAnalyzer

This class analyzes the source code of Python packages from PyPI. This analyzer is a work in progress.

Currently, the analyzer performs textual pattern matching and dataflow analysis using the open-source features of Semgrep. Semgrep's open-source taint tracking works only within a single locale; this is a known limitation. Default rules are stored in ‘macaron/resources/pypi_malware_rules’ as Semgrep .yaml rule files. Users may add rules by placing them in a directory of their choice and specifying that directory in the ‘defaults.ini’ configuration file.
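As a rough illustration of how a default rule directory and an optional custom rule directory could be combined in one Semgrep run, the sketch below only builds the command line. The function name and its parameters are hypothetical and are not part of Macaron; how the analyzer actually invokes Semgrep may differ. It assumes only the Semgrep CLI's `scan` subcommand with its `--config` and `--json` options.

```python
from __future__ import annotations

from pathlib import Path


def build_semgrep_command(default_rules: Path, custom_rules: Path | None, target: Path) -> list[str]:
    """Build a Semgrep command line (hypothetical helper, not Macaron's API)."""
    # --config may be given more than once, so each rule directory gets its own flag.
    cmd = ["semgrep", "scan", "--json", "--config", str(default_rules)]
    if custom_rules is not None:
        cmd += ["--config", str(custom_rules)]
    # The directory containing the unpacked package source comes last.
    cmd.append(str(target))
    return cmd
```

The resulting list could be passed to `subprocess.run`, with the JSON report read from standard output.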

__init__(resources_path=None)

Initialise the source code analyzer and load default and custom semgrep rulesets.

Parameters:

resources_path (str | None) – The path to the resources directory, which must contain a ‘pypi_malware_rules’ directory of Semgrep rules. If None is provided, the path is loaded from the global config resources path. Defaults to None.

Raises:

ConfigurationError – If the default rule path is invalid, the heuristic.pypi entry is not present, or the Semgrep validation of the custom rule path fails.

analyze(pypi_package_json)

Analyze the source code of the package for malicious patterns.

This is the first phase of the source code analyzer.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

A tuple containing the analysis result and the relevant patterns identified.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

Raises:

HeuristicAnalyzerValueError – If there is no source code available.

macaron.malware_analyzer.pypi_heuristics.sourcecode.suspicious_setup module

This analyzer checks for suspicious patterns in setup.py.

class macaron.malware_analyzer.pypi_heuristics.sourcecode.suspicious_setup.SuspiciousSetupAnalyzer

Bases: BaseHeuristicAnalyzer

Check whether suspicious packages are imported in setup.py.

__init__()

analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

extract_from_ast(source_content)

Extract imports from source code using the parsed AST.

Parameters:

source_content (str) – The source code as a string.

Returns:

The set of imports.

Return type:

set[str]

Raises:

SyntaxError – If the code could not be parsed.
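A minimal sketch of AST-based import extraction, using only the standard `ast` module, is given here for orientation. It is not Macaron's implementation: the function name is hypothetical, and recording the module part of `from X import Y` (rather than the imported name) is an assumption.

```python
import ast


def extract_imports_from_ast(source_content: str) -> set[str]:
    """Collect imported module names by walking the parsed AST (illustrative only)."""
    imports: set[str] = set()
    # ast.parse raises SyntaxError when the source cannot be parsed.
    tree = ast.parse(source_content)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os, sys as system" contributes "os" and "sys".
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from urllib import request" contributes "urllib";
            # "from . import x" has no module name and is skipped.
            imports.add(node.module)
    return imports
```

Because the whole file has to parse, a single syntax error makes this approach fail. That is presumably why a line-based extractor is also provided.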

extract_from_lines(source_content)

Extract imports from source code using per-line pattern matching.

Parameters:

source_content (str) – The source code as a string.

Returns:

The set of imports.

Return type:

set[str]
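The sketch below shows one way per-line matching could work with the standard `re` module. The regular expressions and the function name are assumptions made for illustration; the patterns Macaron actually uses are not shown in this reference.

```python
import re

# Illustrative patterns: "import a.b, c as d" and "from a.b import c".
IMPORT_RE = re.compile(r"^\s*import\s+(.+)$")
FROM_RE = re.compile(r"^\s*from\s+([A-Za-z_][\w.]*)\s+import\b")


def extract_imports_from_lines(source_content: str) -> set[str]:
    """Collect imported module names line by line (illustrative only)."""
    imports: set[str] = set()
    for line in source_content.splitlines():
        from_match = FROM_RE.match(line)
        if from_match:
            imports.add(from_match.group(1))
            continue
        import_match = IMPORT_RE.match(line)
        if import_match:
            # Drop a trailing comment before splitting the comma-separated names.
            names = import_match.group(1).split("#", 1)[0]
            for part in names.split(","):
                # Drop any "as alias" suffix.
                name = part.strip().split(" as ")[0].strip()
                if name:
                    imports.add(name)
    return imports
```

Unlike the AST approach, this still returns results for files that do not parse. The trade-off is that it can miss imports spread over several lines and can match import-like text inside strings.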