macaron.malware_analyzer.pypi_heuristics.metadata package

Submodules

macaron.malware_analyzer.pypi_heuristics.metadata.anomalous_version module

The heuristic analyzer to check for an anomalous package version.

class macaron.malware_analyzer.pypi_heuristics.metadata.anomalous_version.AnomalousVersionAnalyzer

Bases: BaseHeuristicAnalyzer

Analyze the version number (if there is only a single release) to detect if it is anomalous.

A version number is anomalous if any of its values are greater than the epoch or major threshold values. If the version does not adhere to PyPI standards (PEP 440, as per the ‘packaging’ module), this heuristic cannot analyze it.
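The threshold rule above can be sketched as follows. This is a minimal illustration, not the analyzer's code: the threshold values and both function names are hypothetical, and the hand-rolled parsing only handles plain `N!A.B.C` or `A.B.C` strings rather than the full PEP 440 grammar that the `packaging` module supports.

```python
# Hypothetical thresholds for illustration; the real values are configured elsewhere.
EPOCH_THRESHOLD = 3
MAJOR_THRESHOLD = 20


def split_epoch_and_major(version: str) -> tuple[int, int]:
    """Return (epoch, major) from a simple 'N!A.B.C' or 'A.B.C' version string."""
    epoch_part, sep, release_part = version.partition("!")
    if not sep:
        # No explicit epoch: PEP 440 defines the implicit epoch as 0.
        epoch_part, release_part = "0", version
    major_part = release_part.split(".")[0]
    return int(epoch_part), int(major_part)


def is_anomalous(version: str) -> bool:
    """Flag the version if its epoch or major value exceeds its threshold."""
    epoch, major = split_epoch_and_major(version)
    return epoch > EPOCH_THRESHOLD or major > MAJOR_THRESHOLD
```

With these assumed thresholds, `is_anomalous("99.0.0")` and `is_anomalous("5!1.0")` are both true, while `is_anomalous("1.2.3")` is false.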

Calendar versioning is detected as version numbers with the year, month and day present in the following combinations (using the example 11th October 2016):

- YYYY.MM.DD, e.g. 2016.10.11
- YYYY.DD.MM, e.g. 2016.11.10
- YY.DD.MM, e.g. 16.11.10
- YY.MM.DD, e.g. 16.10.11
- MM.DD.YYYY, e.g. 10.11.2016
- DD.MM.YYYY, e.g. 11.10.2016
- DD.MM.YY, e.g. 11.10.16
- MM.DD.YY, e.g. 10.11.16
- YYYYMMDD, e.g. 20161011
- YYYYDDMM, e.g. 20161110
- YYDDMM, e.g. 161110
- YYMMDD, e.g. 161011
- MMDDYYYY, e.g. 10112016
- DDMMYYYY, e.g. 11102016
- DDMMYY, e.g. 111016
- MMDDYY, e.g. 101116

This may be followed by further versioning (e.g. 2016.10.11.5.6.2). This type of versioning is detected based on the date of the upload time for the release within a threshold of a number of days (in the defaults file).
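A rough sketch of how the concatenated digit forms (such as 20161011) might be matched against the upload time is shown below. It reuses the format strings listed under DIGIT_DATE_FORMATS; the function name and the `DAY_THRESHOLD` value are assumptions made for illustration, and the dotted forms are not handled here.

```python
from datetime import datetime, timedelta

# Format strings as listed in the DIGIT_DATE_FORMATS attribute of this class.
DIGIT_DATE_FORMATS = [
    "%Y%m%d", "%Y%d%m", "%d%m%Y", "%m%d%Y",
    "%y%m%d", "%y%d%m", "%d%m%y", "%m%d%y",
]

# Hypothetical threshold; the real value comes from the defaults file.
DAY_THRESHOLD = 2


def looks_like_calendar_version(digits: str, upload_time: datetime) -> bool:
    """Return True if `digits` parses as a date close to the upload time."""
    for fmt in DIGIT_DATE_FORMATS:
        try:
            candidate = datetime.strptime(digits, fmt)
        except ValueError:
            # The digits do not fit this format; try the next one.
            continue
        if abs(candidate - upload_time) <= timedelta(days=DAY_THRESHOLD):
            return True
    return False
```

Trying every format and keeping only dates near the upload time is what resolves the ambiguity between, say, YYMMDD and YYDDMM: `161110` only matches 11th October 2016 when read as YYDDMM.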

Calendar-semantic versioning is detected as version numbers with the major value as the year (either yyyy or yy), and any other series of numbers following it:

- 2016.7.1 would be version 7.1 of 2016
- 16.1.4 would be version 1.4 of 2016

This type of versioning is detected based on the exact year of the upload time for the release.

All other versionings are detected as semantic versioning.
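The fallback from calendar-semantic to semantic versioning can be sketched as below. The enum mirrors the Versioning members documented in this module; `classify_by_year` is a hypothetical helper, and the sketch assumes the calendar check has already been tried and did not match.

```python
from datetime import datetime
from enum import Enum


class Versioning(Enum):
    """Local copy of the documented enum so the sketch is self-contained."""

    INVALID = "invalid"
    CALENDAR = "calendar"
    CALENDAR_SEMANTIC = "calendar_semantic"
    SEMANTIC = "semantic"


def classify_by_year(major: int, upload_time: datetime) -> Versioning:
    """Treat the version as calendar-semantic if major equals the upload year."""
    # Accept both the four-digit year (2016) and the two-digit year (16).
    if major in (upload_time.year, upload_time.year % 100):
        return Versioning.CALENDAR_SEMANTIC
    return Versioning.SEMANTIC
```

For a release uploaded in 2016, major values of 2016 and 16 are both classified as calendar-semantic, and any other major value falls through to semantic.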

DETAIL_INFO_KEY: str = 'versioning'
DIGIT_DATE_FORMATS: list[str] = ['%Y%m%d', '%Y%d%m', '%d%m%Y', '%m%d%Y', '%y%m%d', '%y%d%m', '%d%m%y', '%m%d%y']
__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

Raises:

HeuristicAnalyzerValueError – if there is no release information available.

class macaron.malware_analyzer.pypi_heuristics.metadata.anomalous_version.Versioning(value)

Bases: Enum

Enum used to assign different versioning methods.

INVALID = 'invalid'
CALENDAR = 'calendar'
CALENDAR_SEMANTIC = 'calendar_semantic'
SEMANTIC = 'semantic'

macaron.malware_analyzer.pypi_heuristics.metadata.closer_release_join_date module

Analyzer that checks whether the maintainers’ join dates are close to the package’s latest release date.

class macaron.malware_analyzer.pypi_heuristics.metadata.closer_release_join_date.CloserReleaseJoinDateAnalyzer

Bases: BaseHeuristicAnalyzer

Check whether the maintainers’ join dates are close to the package’s latest release date.

If the time between any maintainer’s join date and the latest release date is larger than the threshold, the heuristic is considered a “PASS”.
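The pass condition can be illustrated with a small sketch. The function name and the `GAP_THRESHOLD` value are hypothetical, and the inputs are plain datetimes rather than the data the analyzer actually reads.

```python
from datetime import datetime, timedelta

# Hypothetical threshold for illustration only.
GAP_THRESHOLD = timedelta(days=5)


def passes_join_date_check(join_dates: list[datetime], latest_release: datetime) -> bool:
    """Pass if at least one maintainer joined well before the latest release."""
    return any(latest_release - joined > GAP_THRESHOLD for joined in join_dates)
```

A package whose only maintainer joined one day before the release would not pass under this assumed threshold, while one with a maintainer who joined a year earlier would.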

__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

macaron.malware_analyzer.pypi_heuristics.metadata.high_release_frequency module

Analyzer checks the frequent release heuristic.

class macaron.malware_analyzer.pypi_heuristics.metadata.high_release_frequency.HighReleaseFrequencyAnalyzer

Bases: BaseHeuristicAnalyzer

Check whether the release frequency is high.
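The documentation does not state how frequency is measured. One plausible approach, sketched here purely as an assumption, is to compare the average gap between consecutive release upload times with a threshold. All names and the threshold value are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical threshold: an average gap below this counts as "high frequency".
FREQUENCY_THRESHOLD = timedelta(days=2)


def average_release_gap(upload_times: list[datetime]) -> Optional[timedelta]:
    """Return the mean gap between consecutive releases, or None if undefined."""
    if len(upload_times) < 2:
        return None
    ordered = sorted(upload_times)
    # The sum of consecutive gaps equals last minus first.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def is_high_frequency(upload_times: list[datetime]) -> bool:
    """Report high frequency when the mean gap is below the threshold."""
    gap = average_release_gap(upload_times)
    return gap is not None and gap < FREQUENCY_THRESHOLD
```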

__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

macaron.malware_analyzer.pypi_heuristics.metadata.one_release module

Analyzer that checks whether the package contains only one release.

class macaron.malware_analyzer.pypi_heuristics.metadata.one_release.OneReleaseAnalyzer

Bases: BaseHeuristicAnalyzer

Determine if there is only one release of the package.

__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

macaron.malware_analyzer.pypi_heuristics.metadata.source_code_repo module

The heuristic analyzer to check if a source code repo was found.

class macaron.malware_analyzer.pypi_heuristics.metadata.source_code_repo.SourceCodeRepoAnalyzer

Bases: BaseHeuristicAnalyzer

Analyze the accessibility of the source code repository.

Passes if a repository was found and validated by the repo finder, otherwise fails.

__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

macaron.malware_analyzer.pypi_heuristics.metadata.unchanged_release module

Heuristics analyzer to check unchanged content in multiple releases.

class macaron.malware_analyzer.pypi_heuristics.metadata.unchanged_release.UnchangedReleaseAnalyzer

Bases: BaseHeuristicAnalyzer

Analyze whether the content of the package is updated by the maintainer.
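How "unchanged content" is determined is not described here. A simple way to picture it, assuming each release is represented by a content digest (for example the sha256 value that PyPI publishes for each file), is to look for the same digest appearing under more than one version. The function name and input shape are hypothetical.

```python
def has_unchanged_releases(digest_by_version: dict[str, str]) -> bool:
    """Return True if two or more versions share the same content digest."""
    digests = list(digest_by_version.values())
    # A repeated digest means at least one release shipped identical content.
    return len(set(digests)) < len(digests)
```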

__init__()
analyze(pypi_package_json)

Check whether the content of the releases keeps being updated.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

macaron.malware_analyzer.pypi_heuristics.metadata.wheel_absence module

The heuristic analyzer to check .whl file absence.

class macaron.malware_analyzer.pypi_heuristics.metadata.wheel_absence.WheelAbsenceAnalyzer

Bases: BaseHeuristicAnalyzer

Analyze to see if a .whl file is available for the package.

If a package is distributed with a .whl file, this heuristic passes. Otherwise, the heuristic fails.
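As an illustration of the pass condition, the sketch below scans a list of distribution-file records for one whose `packagetype` equals the documented WHEEL value. The records imitate the per-file entries of the PyPI JSON API; the function name and the sample data are hypothetical.

```python
# Value of the WHEEL attribute documented for this class.
WHEEL = "bdist_wheel"


def has_wheel(release_files: list[dict[str, str]]) -> bool:
    """Return True if any distribution file of the release is a wheel."""
    return any(entry.get("packagetype") == WHEEL for entry in release_files)


# Hypothetical sample data shaped like PyPI JSON file entries.
sdist_only = [{"filename": "example-1.0.tar.gz", "packagetype": "sdist"}]
with_wheel = sdist_only + [
    {"filename": "example-1.0-py3-none-any.whl", "packagetype": "bdist_wheel"}
]
```

Here `has_wheel(with_wheel)` is true, so the heuristic would pass, and `has_wheel(sdist_only)` is false, so it would fail.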

WHEEL: str = 'bdist_wheel'
INSPECTOR_TEMPLATE = '{inspector_url_scheme}://{inspector_url_netloc}/project/{name}/{version}/packages/{first}/{second}/{rest}/{filename}'
__init__()
analyze(pypi_package_json)

Analyze the package.

Parameters:

pypi_package_json (PyPIPackageJsonAsset) – The PyPI package JSON asset object.

Returns:

The result and related information collected during the analysis.

Return type:

tuple[HeuristicResult, dict[str, JsonType]]

Raises:

HeuristicAnalyzerValueError – If there is no release information, or if other package information is missing.
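To show how the INSPECTOR_TEMPLATE attribute of WheelAbsenceAnalyzer expands, the sketch below fills it with `str.format`. Every value passed in (scheme, host, package name, version, path segments and filename) is a made-up example, and how the analyzer derives `first`, `second` and `rest` is not covered here.

```python
# Template string as documented for WheelAbsenceAnalyzer.INSPECTOR_TEMPLATE.
INSPECTOR_TEMPLATE = (
    "{inspector_url_scheme}://{inspector_url_netloc}/project/{name}/{version}"
    "/packages/{first}/{second}/{rest}/{filename}"
)

# All values below are hypothetical and chosen only for illustration.
url = INSPECTOR_TEMPLATE.format(
    inspector_url_scheme="https",
    inspector_url_netloc="inspector.pypi.io",
    name="example-pkg",
    version="1.0.0",
    first="ab",
    second="cd",
    rest="ef0123",
    filename="example_pkg-1.0.0-py3-none-any.whl",
)
```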