macaron.slsa_analyzer.package_registry package

This module defines the package registries.

Submodules

macaron.slsa_analyzer.package_registry.deps_dev module

This module contains the implementation of the deps.dev service.

class macaron.slsa_analyzer.package_registry.deps_dev.DepsDevService

Bases: object

The deps.dev service class.

static get_purl_endpoint(purl)

Build the purl API endpoint for the deps.dev service and return it.

Parameters:

purl (PackageURL | str) – The PURL to append to the API endpoint.

Returns:

The purl API endpoint.

Return type:

urllib.parse.SplitResult

Raises:

APIAccessError – If building the API endpoint fails.

static get_endpoint(path=None)

Build the API endpoint for the deps.dev service and return it.

Parameters:

path (str | None) – A path to be appended to the API endpoint.

Returns:

The API endpoint.

Return type:

urllib.parse.SplitResult
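As a rough illustration of what a SplitResult endpoint looks like, the following sketch assembles one with the standard library. The host api.example.org and the base path v3alpha are placeholders rather than values read from the Macaron configuration, and build_endpoint is a hypothetical helper, not the method documented above.

```python
from typing import Optional
from urllib.parse import SplitResult


def build_endpoint(netloc: str, base_path: str, path: Optional[str] = None, scheme: str = "https") -> SplitResult:
    """Join a base API path and an optional extra path into a SplitResult."""
    # Drop empty parts and stray slashes so the joined path has single separators.
    parts = [p.strip("/") for p in (base_path, path or "") if p]
    return SplitResult(scheme=scheme, netloc=netloc, path="/".join(parts), query="", fragment="")


# Placeholder host and paths, used only for illustration.
endpoint = build_endpoint("api.example.org", "v3alpha", "purl")
print(endpoint.geturl())
```

Returning a SplitResult instead of a plain string keeps the scheme, netloc, and path separately accessible to callers, while geturl() still yields the full URL when needed.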

static encode_purl(purl)

Encode a PURL to match the deps.dev requirements.

The fragment (subpath) and query (qualifiers) PURL sections are not accepted by deps.dev. See: https://docs.deps.dev/api/v3alpha/index.html#purllookup. The documentation claims that all special characters must be percent-encoded. This is not strictly true, as ‘@’ and ‘:’ are accepted as is. The forward slashes in the PURL must be encoded to distinguish them from URL parts.

Parameters:

purl (PackageURL | str) – The PURL to encode.

Returns:

The encoded PURL.

Return type:

str | None
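The encoding rules described above can be sketched with the standard library alone. This is a minimal illustration that assumes the PURL is passed as a plain string; encode_purl_sketch is a hypothetical name, and the real method may validate its input differently (for example, it can return None).

```python
from urllib.parse import quote


def encode_purl_sketch(purl: str) -> str:
    """Drop qualifiers and subpath, then percent-encode everything except '@' and ':'."""
    # The subpath follows '#' and the qualifiers follow '?'; deps.dev accepts neither.
    base = purl.split("#", 1)[0].split("?", 1)[0]
    # Leaving '/' out of `safe` encodes it as %2F, so it cannot be mistaken for a URL path separator.
    return quote(base, safe="@:")


print(encode_purl_sketch("pkg:maven/com.experlog/xapool@1.5.0?type=jar#src"))
```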

static get_package_info(purl)

Check if the package identified by the PackageURL (PURL) exists and return its information.

Parameters:

purl (PackageURL | str) – The PackageURL (PURL).

Returns:

The package metadata.

Return type:

dict

Raises:

APIAccessError – If the service is misconfigured, the API is invalid, a network error happens, or an unexpected response is returned by the API.

macaron.slsa_analyzer.package_registry.jfrog_maven_registry module

Assets on a package registry.

class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenAsset(name: str, group_id: str, artifact_id: str, version: str, metadata: JFrogMavenAssetMetadata, jfrog_maven_registry: JFrogMavenRegistry)

Bases: NamedTuple

An asset hosted on a JFrog Artifactory repository with Maven layout.

name: str

The name of the Maven asset.

group_id: str

The group id.

artifact_id: str

The artifact id.

version: str

The version of the Maven asset.

metadata: JFrogMavenAssetMetadata

The metadata of the JFrog Maven asset.

jfrog_maven_registry: JFrogMavenRegistry

The JFrog repo that acts as a package registry following the Maven layout.

property url: str

Get the URL to the asset.

This URL can be used to download the asset.

property sha256_digest: str

Get the SHA256 digest of the asset.

property size_in_bytes: int

Get the size of the asset (in bytes).

download(dest)

Download the asset.

Parameters:

dest (str) – The local destination where the asset is downloaded to. Note that this must include the file name.

Returns:

True if the asset is downloaded successfully; False if not.

Return type:

bool

class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenAssetMetadata(size_in_bytes: int, sha256_digest: str, download_uri: str)

Bases: NamedTuple

Metadata of an asset on a JFrog Maven registry.

size_in_bytes: int

The size of the asset (in bytes).

sha256_digest: str

The SHA256 digest of the asset.

download_uri: str

The download URI of the asset.

class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenRegistry(hostname=None, repo=None, request_timeout=None, download_timeout=None, enabled=None)

Bases: PackageRegistry

A JFrog Artifactory repository that acts as a package registry with Maven layout.

For more details on JFrog Artifactory repository, see: https://jfrog.com/help/r/jfrog-artifactory-documentation/repository-management

__init__(hostname=None, repo=None, request_timeout=None, download_timeout=None, enabled=None)

Instantiate a JFrogMavenRegistry object.

Parameters:
  • hostname (str) – The hostname of the JFrog instance.

  • repo (str | None) – The Artifactory repository with Maven layout on the JFrog instance.

  • request_timeout (int | None) – The timeout (in seconds) for regular requests made to the package registry.

  • download_timeout (int | None) – The timeout (in seconds) for downloading files from the package registry.

  • enabled (bool | None) – Whether the package registry should be active in the analysis or not. “Not active” means no target repo/software component can be matched against this package registry.

load_defaults()

Load the .ini configuration for the current package registry.

Raises:

ConfigurationError – If there is a schema violation in the package_registry.jfrog.maven section.

Return type:

None

fetch_artifact_ids(group_id)

Get all artifact ids under a group id.

This is done by fetching all children folders under the group folder on the registry.

Parameters:

group_id (str) – The group id.

Returns:

The artifact ids under the group.

Return type:

list[str]

construct_folder_info_url(folder_path)

Construct a URL for the JFrog Folder Info API.

Documentation: https://jfrog.com/help/r/jfrog-rest-apis/folder-info.

Parameters:

folder_path (str) – The path to the folder.

Returns:

The URL to request the info of the folder.

Return type:

str

construct_file_info_url(file_path)

Construct a URL for the JFrog File Info API.

Documentation: https://jfrog.com/help/r/jfrog-rest-apis/file-info.

Parameters:

file_path (str) – The path to the file.

Returns:

The URL to request the info of the file.

Return type:

str

construct_latest_version_url(group_id, artifact_id)

Construct a URL for the JFrog Latest Version Search API.

The response payload includes the latest version of the package with the given group id and artifact id. Documentation: https://jfrog.com/help/r/jfrog-rest-apis/artifact-latest-version-search-based-on-layout.

Parameters:
  • group_id (str) – The group id of the package.

  • artifact_id (str) – The artifact id of the package.

Returns:

The URL to request the latest version of the package.

Return type:

str

fetch_latest_version(group_id, artifact_id)

Fetch the latest version of a Java package on this JFrog Maven registry.

Parameters:
  • group_id (str) – The group id of the Java package.

  • artifact_id (str) – The artifact id of the Java package.

Returns:

The latest version of the Java package if it could be retrieved, or None otherwise.

Return type:

str | None

fetch_asset_names(group_id, artifact_id, version, extensions=None)

Retrieve the names of the assets published for a version of a Maven package.

Parameters:
  • group_id (str) – The group id of the Maven package.

  • artifact_id (str) – The artifact id of the Maven package.

  • version (str) – The version of the Maven package.

  • extensions (set[str] | None) – The set of asset extensions. Only assets with names ending in these extensions are fetched. If this is None, then all assets are returned regardless of their extensions.

Returns:

The list of asset names.

Return type:

list[str]

extract_folder_names_from_folder_info_payload(folder_info_payload)

Extract a list of folder names from the Folder Info payload of a Maven group folder.

Parameters:

folder_info_payload (str) – The Folder Info payload.

Returns:

The artifact ids found in the payload.

Return type:

list[str]

extract_file_names_from_folder_info_payload(folder_info_payload, extensions=None)

Extract file names from the Folder Info response payload.

For the schema of this payload and other details regarding the API, see: https://jfrog.com/help/r/jfrog-rest-apis/folder-info.

Note: Currently, we do not try to validate the schema of the payload. Rather, we only read as much of the payload as we can recognise.

Parameters:
  • folder_info_payload (JsonType) – The JSON payload of a Folder Info response.

  • extensions (set[str] | None) – The set of allowed extensions. Filenames not ending in these extensions are omitted from the result. If this is None, then all file names are returned regardless of their extensions.

Returns:

The list of filenames in the folder, extracted from the payload.

Return type:

list[str]
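The lenient reading described in the note can be sketched as follows. The sketch assumes the Folder Info payload has a "children" list whose entries carry a "uri" string and a boolean "folder" flag, as shown in the JFrog documentation; extract_file_names is a hypothetical stand-in for the method above.

```python
def extract_file_names(payload, extensions=None):
    """Collect the names of non-folder children, optionally filtered by extension."""
    if not isinstance(payload, dict):
        return []
    children = payload.get("children")
    if not isinstance(children, list):
        return []
    names = []
    for child in children:
        # Skip anything that does not look like a file entry instead of failing.
        if not isinstance(child, dict) or child.get("folder") is not False:
            continue
        uri = child.get("uri")
        if not isinstance(uri, str):
            continue
        name = uri.lstrip("/")
        if extensions is None or any(name.endswith(ext) for ext in extensions):
            names.append(name)
    return names


# A hand-written sample payload, including one unrecognised entry.
sample = {
    "children": [
        {"uri": "/foo-1.0.jar", "folder": False},
        {"uri": "/foo-1.0.pom", "folder": False},
        {"uri": "/sub", "folder": True},
        "junk",
    ]
}
print(extract_file_names(sample, {"jar"}))
```

Unrecognised entries are skipped rather than raising, which matches the choice not to validate the full schema.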

fetch_asset_metadata(group_id, artifact_id, version, asset_name)

Fetch an asset’s metadata from JFrog.

Parameters:
  • group_id (str) – The group id of the package containing the asset.

  • artifact_id (str) – The artifact id of the package containing the asset.

  • version (str) – The version of the package containing the asset.

  • asset_name (str) – The name of the asset.

Returns:

The asset’s metadata, or None if the metadata cannot be retrieved.

Return type:

JFrogMavenAssetMetadata | None

extract_asset_metadata_from_file_info_payload(file_info_payload)

Extract the metadata of an asset from the File Info response payload.

Documentation: https://jfrog.com/help/r/jfrog-rest-apis/file-info.

Parameters:

file_info_payload (str) – The File Info response payload from which the metadata of an asset is extracted.

Returns:

The asset’s metadata, or None if the metadata cannot be retrieved.

Return type:

JFrogMavenAssetMetadata | None
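A minimal sketch of this extraction is shown below. It assumes the File Info payload is a JSON string containing "size", "checksums.sha256", and "downloadUri" fields, following the JFrog documentation; AssetMetadata and extract_asset_metadata are hypothetical names that mirror, but are not, the classes and methods documented here.

```python
import json
from typing import NamedTuple, Optional


class AssetMetadata(NamedTuple):
    """Hypothetical stand-in for JFrogMavenAssetMetadata."""

    size_in_bytes: int
    sha256_digest: str
    download_uri: str


def extract_asset_metadata(file_info_payload: str) -> Optional[AssetMetadata]:
    """Return the asset metadata, or None if the payload cannot be read."""
    try:
        data = json.loads(file_info_payload)
        # int() accepts both a number and a numeric string, whichever form the payload uses.
        return AssetMetadata(
            size_in_bytes=int(data["size"]),
            sha256_digest=str(data["checksums"]["sha256"]),
            download_uri=str(data["downloadUri"]),
        )
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, or unexpected types all map to None.
        return None


sample = json.dumps(
    {"size": "1024", "checksums": {"sha256": "abc123"}, "downloadUri": "https://example.org/a.jar"}
)
print(extract_asset_metadata(sample))
```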

fetch_assets(group_id, artifact_id, version, extensions=None)

Fetch the assets of a Maven package.

Parameters:
  • group_id (str) – The group id of the Maven package.

  • artifact_id (str) – The artifact id of the Maven package.

  • version (str) – The version of the Maven package.

  • extensions (set[str] | None) – The extensions of the assets to fetch. If this is None, all available assets are fetched.

Returns:

The list of assets of the package.

Return type:

list[JFrogMavenAsset]

construct_asset_url(group_id, artifact_id, version, asset_name)

Get the URL to download an asset.

Parameters:
  • group_id (str) – The group id of the package containing the asset.

  • artifact_id (str) – The artifact id of the package containing the asset.

  • version (str) – The version of the package containing the asset.

  • asset_name (str) – The name of the asset.

Returns:

The URL to the asset, which can be used for downloading the asset.

Return type:

str

download_asset(url, dest)

Download an asset from the given URL to a given location.

Parameters:
  • url (str) – The URL to the asset on the package registry.

  • dest (str) – The local destination where the asset is downloaded to.

Returns:

True if the file is downloaded successfully; False if not.

Return type:

bool

find_publish_timestamp(purl)

Make a search request to Maven Central to find the publishing timestamp of an artifact.

The reason for directly fetching timestamps from Maven Central is that deps.dev occasionally misses timestamps for Maven artifacts, making it unreliable for this purpose.

For the search API syntax, see: https://central.sonatype.org/search/rest-api-guide/

Parameters:

purl (str) – The Package URL (purl) of the package whose publication timestamp is to be retrieved. This should conform to the PURL specification.

Returns:

A timezone-aware datetime object representing the publication timestamp of the specified package.

Return type:

datetime

Raises:
  • InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.

  • NotImplementedError – If not implemented for a registry.

macaron.slsa_analyzer.package_registry.maven_central_registry module

The module provides abstractions for the Maven Central package registry.

macaron.slsa_analyzer.package_registry.maven_central_registry.same_organization(group_id_1, group_id_2)

Check if two Maven group ids are from the same organization.

Note: It is assumed that for recognized source platforms, the top level domain doesn’t change the organization. I.e., io.github.foo and com.github.foo are assumed to be from the same organization.

Parameters:
  • group_id_1 (str) – The first group id.

  • group_id_2 (str) – The second group id.

Returns:

True if the two group ids are from the same organization, False otherwise.

Return type:

bool

class macaron.slsa_analyzer.package_registry.maven_central_registry.MavenCentralRegistry(search_netloc=None, search_scheme=None, search_endpoint=None, registry_url_netloc=None, registry_url_scheme=None, request_timeout=None)

Bases: PackageRegistry

This class implements a Maven Central package registry.

__init__(search_netloc=None, search_scheme=None, search_endpoint=None, registry_url_netloc=None, registry_url_scheme=None, request_timeout=None)

Initialize a Maven Central Registry instance.

Parameters:
  • search_netloc (str | None) – The netloc of the Maven Central search URL.

  • search_scheme (str | None) – The scheme of the Maven Central search URL.

  • search_endpoint (str | None) – The search REST API to find artifacts.

  • registry_url_netloc (str | None) – The netloc of the Maven Central registry url.

  • registry_url_scheme (str | None) – The scheme of the Maven Central registry url.

  • request_timeout (int | None) – The timeout (in seconds) for requests made to the package registry.

load_defaults()

Load the .ini configuration for the current package registry.

Raises:

ConfigurationError – If there is a schema violation in the maven_central section.

Return type:

None

find_publish_timestamp(purl)

Make a search request to Maven Central to find the publishing timestamp of an artifact.

The reason for directly fetching timestamps from Maven Central is that deps.dev occasionally misses timestamps for Maven artifacts, making it unreliable for this purpose.

For the search API syntax, see: https://central.sonatype.org/search/rest-api-guide/

Parameters:

purl (str) – The Package URL (purl) of the package whose publication timestamp is to be retrieved. This should conform to the PURL specification.

Returns:

A timezone-aware datetime object representing the publication timestamp of the specified package.

Return type:

datetime

Raises:

InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.
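The search request and timestamp parsing can be sketched offline as follows. The host search.maven.org, the endpoint solrsearch/select, and the query layout are assumptions based on the Sonatype REST API guide rather than values taken from the Macaron configuration, and the "timestamp" field is assumed to be in milliseconds since the Unix epoch. Both function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode, urlunsplit


def build_search_url(group_id, artifact_id, version, netloc="search.maven.org", endpoint="solrsearch/select"):
    """Build a search URL for one group/artifact/version coordinate."""
    query = urlencode(
        {"q": f"g:{group_id} AND a:{artifact_id} AND v:{version}", "core": "gav", "rows": 1, "wt": "json"}
    )
    return urlunsplit(("https", netloc, endpoint, query, ""))


def parse_timestamp(payload):
    """Convert the first document's millisecond timestamp to an aware datetime."""
    millis = payload["response"]["docs"][0]["timestamp"]
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(build_search_url("com.experlog", "xapool", "1.5.0"))
# A hand-written payload; no network request is made in this sketch.
print(parse_timestamp({"response": {"docs": [{"timestamp": 1000}]}}))
```

A real implementation would also need to handle an empty docs list and non-numeric timestamps, which the method above reports as InvalidHTTPResponseError.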

get_artifact_hash(purl)

Return the hash of the artifact identified by the given PURL, resolved relative to the registry’s URL.

An artifact’s URL will be as follows: {registry_url}/{artifact_path}/{file_name}

Where:

  • {registry_url} is determined by the setup/config of the registry.

  • {artifact_path} is determined by the Maven repository layout. (See: https://maven.apache.org/repository/layout.html and https://maven.apache.org/guides/mini/guide-naming-conventions.html)

  • {file_name} is {purl.name}-{purl.version}.jar (for a JAR artifact)

Example

PURL: pkg:maven/com.experlog/xapool@1.5.0

URL: https://repo1.maven.org/maven2/com/experlog/xapool/1.5.0/xapool-1.5.0.jar

Parameters:

purl (PackageURL) – The purl of the artifact.

Returns:

The hash of the artifact, or None if not found.

Return type:

str | None
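The mapping from PURL to artifact URL in the example can be reproduced with a short sketch. It parses a simple pkg:maven PURL by hand (no qualifiers or subpath) and assumes a JAR artifact; maven_artifact_url is a hypothetical helper, and computing the hash itself is not shown.

```python
def maven_artifact_url(purl: str, registry_url: str = "https://repo1.maven.org/maven2") -> str:
    """Map a simple pkg:maven PURL to the URL of its JAR under the Maven layout."""
    prefix = "pkg:maven/"
    if not purl.startswith(prefix) or "@" not in purl:
        raise ValueError(f"Unsupported PURL: {purl}")
    coordinates, version = purl[len(prefix):].split("@", 1)
    group_id, name = coordinates.split("/", 1)
    # In the Maven layout, each dot-separated part of the group id becomes a folder.
    artifact_path = "/".join([*group_id.split("."), name, version])
    file_name = f"{name}-{version}.jar"
    return f"{registry_url}/{artifact_path}/{file_name}"


print(maven_artifact_url("pkg:maven/com.experlog/xapool@1.5.0"))
```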

macaron.slsa_analyzer.package_registry.npm_registry module

The module provides abstractions for the npm package registry.

class macaron.slsa_analyzer.package_registry.npm_registry.NPMRegistry(hostname=None, attestation_endpoint=None, request_timeout=None, enabled=True)

Bases: PackageRegistry

This class implements the npm package registry.

There is no complete and up-to-date API documentation for the npm registry and the endpoints are discovered by manual inspection of links on https://www.npmjs.com.

__init__(hostname=None, attestation_endpoint=None, request_timeout=None, enabled=True)

Initialize the npm Registry instance.

Parameters:
  • hostname (str | None) – The hostname of the npm registry.

  • attestation_endpoint (str | None) – The attestation REST API.

  • request_timeout (int | None) – The timeout (in seconds) for requests made to the package registry.

  • enabled (bool) – Whether making REST API calls to the npm registry is enabled.

load_defaults()

Load the .ini configuration for the current package registry.

Raises:

ConfigurationError – If there is a schema violation in the npm registry section.

Return type:

None

download_attestation_payload(url, download_path)

Download the npm attestation from npm registry.

Each npm package can have the following types of attestations:

We download the unsigned SLSA provenance v0.2 or v1 in this method, and the signed npm type.

An example SLSA v0.2 provenance: https://registry.npmjs.org/-/npm/v1/attestations/@sigstore/mock@0.1.0

An example SLSA v1 provenance: https://registry.npmjs.org/-/npm/v1/attestations/@sigstore/mock@0.6.3

Parameters:
  • url (str) – The attestation URL.

  • download_path (str) – The download path for the asset.

Returns:

True if the asset is downloaded successfully; False if not.

Return type:

bool

Raises:

InvalidHTTPResponseError – If the HTTP request to the registry fails or an unexpected response is returned.
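The shape of the example attestation URLs can be reproduced with a small sketch. The default hostname and endpoint below are read off those example URLs, not from the Macaron configuration, and build_attestation_url is a hypothetical helper. The path parameters are assumed to be sanitized already.

```python
from typing import Optional


def build_attestation_url(
    namespace: Optional[str],
    name: str,
    version: str,
    hostname: str = "registry.npmjs.org",
    attestation_endpoint: str = "-/npm/v1/attestations",
) -> str:
    """Build the attestation URL for a package, with or without an npm scope."""
    # A scoped package is addressed as "<scope>/<name>", e.g. "@sigstore/mock".
    package = f"{namespace}/{name}" if namespace else name
    return f"https://{hostname}/{attestation_endpoint}/{package}@{version}"


print(build_attestation_url("@sigstore", "mock", "0.1.0"))
```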

get_latest_version(namespace, name)

Try to retrieve the latest version of a package from the registry.

Parameters:
  • namespace (str | None) – The optional namespace of the package.

  • name (str) – The name of the package.

Returns:

The latest version of the package, or None if one cannot be found.

Return type:

str | None

class macaron.slsa_analyzer.package_registry.npm_registry.NPMAttestationAsset(namespace: str | None, artifact_id: str, version: str, npm_registry: NPMRegistry, size_in_bytes: int)

Bases: NamedTuple

An attestation asset hosted on the npm registry.

The API Documentation can be found here:

namespace: str | None

The optional scope of a package on npm, which is used as the namespace in a PURL string. See https://docs.npmjs.com/cli/v10/using-npm/scope for details on npm scopes. See https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#npm for the namespace in an npm PURL string.

artifact_id: str

The artifact ID.

version: str

The version of the asset.

npm_registry: NPMRegistry

The npm registry.

size_in_bytes: int

The size of the asset (in bytes). This attribute is added to match the AssetLocator protocol and is not used because the npm registry API does not provide it.

property name: str

Get the asset name.

property url: str

Get the download URL of the asset.

Note: we assume that the path parameters used to construct the URL are sanitized already.

Return type:

str

download(dest)

Download the asset.

Parameters:

dest (str) – The local destination where the asset is downloaded to. Note that this must include the file name.

Returns:

True if the asset is downloaded successfully; False if not.

Return type:

bool

macaron.slsa_analyzer.package_registry.osv_dev module

This module contains the implementation of the osv.dev service.

class macaron.slsa_analyzer.package_registry.osv_dev.OSVDevService

Bases: object

The osv.dev service class.

static get_vulnerabilities_purl(purl)

Retrieve vulnerabilities associated with a specific package URL (PURL) by querying the OSV API.

This method calls the OSV query API with the provided package URL (PURL) to fetch any known vulnerabilities associated with that package.

Parameters:

purl (str) – A string representing the Package URL (PURL) of the package to query for vulnerabilities.

Returns:

A list of vulnerabilities under the key “vulns” if any vulnerabilities are found for the provided package.

Return type:

list

Raises:

APIAccessError – If there are issues with the API URL construction, missing configuration values, or invalid responses.

static get_vulnerabilities_package_name(ecosystem, name)

Retrieve vulnerabilities associated with a specific package name and ecosystem by querying the OSV API.

This method calls the OSV query API with the provided ecosystem and package name to fetch any known vulnerabilities associated with that package.

Parameters:
  • ecosystem (str) – A string representing the ecosystem of the package (e.g., “GitHub Actions”, “npm”, etc.).

  • name (str) – A string representing the name of the package to query for vulnerabilities.

Returns:

A list of vulnerabilities under the key “vulns” if any vulnerabilities are found for the provided ecosystem and package name.

Return type:

list

Raises:

APIAccessError – If there are issues with the API URL construction, missing configuration values, or invalid responses.

static get_vulnerabilities_package_name_batch(packages)

Retrieve vulnerabilities for a batch of packages based on their ecosystem and name.

This method constructs a batch query to the OSV API to check for vulnerabilities in multiple packages by querying the ecosystem and package name. It processes the results while preserving the order of the input packages. If a package has associated vulnerabilities, it is included in the returned list.

Parameters:

packages (list) – A list of dictionaries, where each dictionary represents a package with keys:

  • “ecosystem” (str): The package’s ecosystem (e.g., “GitHub Actions”, “npm”).

  • “name” (str): The name of the package.

Returns:

A list of packages from the input packages list that have associated vulnerabilities. The order of the returned packages matches the order of the input.

Return type:

list

Raises:

APIAccessError – If there is an issue with querying the OSV API or if the results do not match the expected size.
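The batch flow can be sketched without any network access. The sketch assumes the OSV querybatch request format ({"queries": [...]}) and a response whose "results" list is aligned one-to-one with the queries; both helper names are hypothetical, and ValueError stands in for APIAccessError.

```python
def build_batch_query(packages):
    """Build a batch query body with one query per input package."""
    return {"queries": [{"package": {"ecosystem": p["ecosystem"], "name": p["name"]}} for p in packages]}


def vulnerable_packages(packages, results):
    """Return the input packages whose aligned result contains vulnerabilities."""
    if len(results) != len(packages):
        raise ValueError("The number of results does not match the number of queried packages.")
    # results[i] corresponds to packages[i]; a result without "vulns" means nothing was found.
    return [pkg for pkg, result in zip(packages, results) if result.get("vulns")]


packages = [
    {"ecosystem": "npm", "name": "left-pad"},
    {"ecosystem": "GitHub Actions", "name": "actions/checkout"},
    {"ecosystem": "npm", "name": "lodash"},
]
# Hand-written results standing in for the "results" list of an API response.
results = [{}, {"vulns": [{"id": "EXAMPLE-1"}]}, {"vulns": [{"id": "EXAMPLE-2"}]}]
print(vulnerable_packages(packages, results))
```

Because the results are matched to the inputs by position, a size mismatch makes the pairing unreliable, which is why it is treated as an error rather than silently truncated.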

static get_osv_url(endpoint)

Construct a full API URL for a given OSV endpoint using values from the .ini configuration.

The configuration is expected to be in a section named [osv_dev] within the defaults object, and uses the following keys:

  • url_netloc: The base domain of the API.

  • url_scheme (optional): The scheme (e.g., “https”). Defaults to “https” if not provided.

  • A key matching the provided endpoint argument (e.g., “query_endpoint”), which defines the URL path.

Parameters:

endpoint (str) – The key name of the endpoint in the [osv_dev] section to construct the URL path.

Returns:

The fully constructed API URL.

Return type:

str

Raises:

APIAccessError – If required keys are missing from the configuration or if the URL cannot be constructed.
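A minimal sketch of this lookup, using configparser directly instead of the Macaron defaults object, is shown below. The values in SAMPLE_INI are illustrative assumptions, build_osv_url is a hypothetical helper, and KeyError stands in for APIAccessError.

```python
import configparser
from urllib.parse import urlunsplit

# Illustrative configuration; the actual defaults may differ.
SAMPLE_INI = """
[osv_dev]
url_netloc = api.osv.dev
query_endpoint = v1/query
"""


def build_osv_url(config: configparser.ConfigParser, endpoint: str) -> str:
    """Build the API URL for the endpoint key found in the [osv_dev] section."""
    if not config.has_section("osv_dev"):
        raise KeyError("The [osv_dev] section is missing.")
    section = config["osv_dev"]
    netloc = section.get("url_netloc")
    path = section.get(endpoint)
    if not netloc or not path:
        raise KeyError(f"Missing url_netloc or {endpoint} in [osv_dev].")
    # url_scheme is optional and falls back to https.
    scheme = section.get("url_scheme", fallback="https")
    return urlunsplit((scheme, netloc, path, "", ""))


config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)
print(build_osv_url(config, "query_endpoint"))
```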

static call_osv_query_api(query_data)

Query the OSV (Open Source Vulnerability) knowledge base API with the given data.

This method sends a POST request to the OSV API and processes the response to extract information about vulnerabilities based on the provided query data.

Parameters:

query_data (dict) – A dictionary containing the query parameters to be sent to the OSV API. The query data should conform to the format expected by the OSV API for querying vulnerabilities.

Returns:

A list of vulnerabilities under the key “vulns” if the query is successful and the response is valid.

Return type:

list

Raises:

APIAccessError – If there are issues with the API URL construction, missing configuration values, or invalid responses.
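The request side of such a query can be sketched with the standard library. The sketch only builds the POST request and reads a hand-written response; nothing is sent. The URL https://api.osv.dev/v1/query and the {"package": {"purl": ...}} body follow the public OSV API, while the helper names are hypothetical and ValueError stands in for APIAccessError.

```python
import json
from urllib.request import Request


def build_query_request(purl: str, url: str = "https://api.osv.dev/v1/query") -> Request:
    """Prepare (but do not send) a POST request querying OSV by PURL."""
    body = json.dumps({"package": {"purl": purl}}).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")


def extract_vulns(response_json: dict) -> list:
    """Return the list under "vulns", treating a missing key as no vulnerabilities."""
    vulns = response_json.get("vulns", [])
    if not isinstance(vulns, list):
        raise ValueError("Unexpected response: 'vulns' is not a list.")
    return vulns


request = build_query_request("pkg:npm/lodash@4.17.20")
print(request.get_method(), request.full_url)
print(extract_vulns({"vulns": [{"id": "EXAMPLE-1"}]}))
```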

static call_osv_querybatch_api(query_data, expected_size=None)

Query the OSV (Open Source Vulnerability) knowledge base API in batch mode and retrieve vulnerability data.

This method sends a batch query to the OSV API and processes the response to extract a list of results. The method also validates that the number of results matches an optional expected size. It handles API URL construction, error handling, and response validation.

Parameters:
  • query_data (dict) – A dictionary containing the batch query data to be sent to the OSV API. This data should conform to the expected format for batch querying vulnerabilities.

  • expected_size (int, optional) – The expected number of results from the query. If provided, the method checks that the number of results matches this value. If the actual number of results does not match the expected size, an exception is raised. Default is None.

Returns:

A list of results from the OSV API containing the vulnerability data that matches the query parameters.

Return type:

list

Raises:

APIAccessError – If any of the required configuration keys are missing, if the API URL construction fails, or if the response from the OSV API is invalid or the number of results does not match the expected size.

static is_version_affected(vuln, pkg_name, pkg_version, ecosystem, source_repo=None)

Check whether a specific version of a package is affected by a vulnerability.

This method parses a vulnerability dictionary to determine whether a given package version falls within the affected version ranges for the specified ecosystem. The function handles version comparisons, extracting details about introduced and fixed versions, and determines if the version is affected by the vulnerability.

Parameters:
  • vuln (dict) – A dictionary representing the vulnerability data. It should contain the affected versions and ranges of the package in question, as well as the details of the introduced and fixed versions for each affected range.

  • pkg_name (str) – The name of the package to check for vulnerability. This should match the package name in the vulnerability data.

  • pkg_version (str) – The version of the package to check against the vulnerability data.

  • ecosystem (str) – The ecosystem (e.g., npm, GitHub Actions) to which the package belongs. This should match the ecosystem in the vulnerability data.

  • source_repo (str | None, optional) – The source repository URL, used if the pkg_version is a commit hash. If provided, the method will try to retrieve the corresponding version tag from the repository. Default is None.

Returns:

True if the given package version is affected by the vulnerability, False otherwise.

Return type:

bool

Raises:

APIAccessError – If the vulnerability data is incomplete or malformed, or if the version strings cannot be parsed correctly. This is raised in cases such as:

  • Missing affected version information

  • Malformed version data (e.g., invalid version strings)

  • Failure to parse the version ranges
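The range check can be illustrated with a simplified sketch. It assumes the OSV schema layout (affected[].package and affected[].ranges[].events with "introduced" and "fixed" entries), purely numeric dotted versions, and a single introduced/fixed pair per range. The real method may parse versions differently and also resolves commit hashes, which this sketch does not attempt.

```python
def parse_version(version: str) -> tuple:
    """Parse a dotted numeric version such as "v1.2.3" into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_affected_sketch(vuln: dict, pkg_name: str, pkg_version: str, ecosystem: str) -> bool:
    """Check whether pkg_version lies in any [introduced, fixed) range for the package."""
    target = parse_version(pkg_version)
    for affected in vuln.get("affected", []):
        package = affected.get("package", {})
        if package.get("name") != pkg_name or package.get("ecosystem") != ecosystem:
            continue
        for version_range in affected.get("ranges", []):
            introduced, fixed = None, None
            for event in version_range.get("events", []):
                if "introduced" in event:
                    introduced = parse_version(event["introduced"])
                elif "fixed" in event:
                    fixed = parse_version(event["fixed"])
            # Affected when introduced <= target, and target < fixed if a fix exists.
            if introduced is not None and target >= introduced and (fixed is None or target < fixed):
                return True
    return False


vuln = {
    "affected": [
        {
            "package": {"name": "example-pkg", "ecosystem": "npm"},
            "ranges": [{"type": "SEMVER", "events": [{"introduced": "0"}, {"fixed": "1.2.3"}]}],
        }
    ]
}
print(is_affected_sketch(vuln, "example-pkg", "1.2.0", "npm"))
```

Tuple comparison is only a stand-in for proper version ordering; pre-release tags or non-numeric parts would need a dedicated version parser.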

macaron.slsa_analyzer.package_registry.package_registry module

This module defines package registries.

class macaron.slsa_analyzer.package_registry.package_registry.PackageRegistry(name, build_tool_names)

Bases: ABC

Base package registry class.

__init__(name, build_tool_names)

abstractmethod load_defaults()

Load the .ini configuration for the current package registry.

Return type:

None

is_detected(build_tool_name)

Detect if artifacts of the repo under analysis can possibly be published to this package registry.

The detection here is based on the repo’s detected build tool. If the package registry is compatible with the given build tool, it can be a possible place where the artifacts produced from the repo are published.

Parameters:

build_tool_name (str) – The name of a detected build tool of the repository under analysis.

Returns:

True if the repo under analysis can be published to this package registry, based on the given build tool.

Return type:

bool
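The relationship between the abstract base class and build-tool detection can be sketched as follows. The membership test in is_detected is an assumption about how compatibility is decided, and RegistrySketch and MavenLikeRegistry are hypothetical classes, not part of Macaron.

```python
from abc import ABC, abstractmethod


class RegistrySketch(ABC):
    """Hypothetical stand-in for the PackageRegistry base class."""

    def __init__(self, name, build_tool_names):
        self.name = name
        self.build_tool_names = set(build_tool_names)

    @abstractmethod
    def load_defaults(self) -> None:
        """Load the configuration for this registry."""

    def is_detected(self, build_tool_name: str) -> bool:
        # Assumption: a registry is a candidate when it lists the detected build tool.
        return build_tool_name in self.build_tool_names


class MavenLikeRegistry(RegistrySketch):
    """A concrete registry compatible with two illustrative build tools."""

    def __init__(self):
        super().__init__("maven-like", {"maven", "gradle"})

    def load_defaults(self) -> None:
        # Nothing to load in this sketch.
        return None


registry = MavenLikeRegistry()
print(registry.is_detected("gradle"), registry.is_detected("pip"))
```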

find_publish_timestamp(purl)

Retrieve the publication timestamp for a package specified by its purl from the deps.dev repository by default.

This method constructs a request URL based on the provided purl, sends an HTTP GET request to fetch metadata about the package, and extracts the publication timestamp from the response.

Note: The method expects the response to include a version field with a publishedAt subfield containing an ISO 8601 formatted timestamp.

Parameters:

purl (str) – The Package URL (purl) of the package whose publication timestamp is to be retrieved. This should conform to the PURL specification.

Returns:

A timezone-aware datetime object representing the publication timestamp of the specified package.

Return type:

datetime

Raises:
  • InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.

  • NotImplementedError – If not implemented for a registry.
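Extracting the timestamp from such a response can be sketched as follows. The sketch assumes the metadata has already been fetched and decoded into a dict with a version.publishedAt field, as described in the note; parse_published_at is a hypothetical helper, and ValueError stands in for InvalidHTTPResponseError.

```python
from datetime import datetime, timezone


def parse_published_at(metadata: dict) -> datetime:
    """Return the publication time as a timezone-aware datetime."""
    try:
        raw = metadata["version"]["publishedAt"]
        # Replace a trailing "Z" because older Python versions reject it in fromisoformat.
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except (KeyError, TypeError, ValueError, AttributeError) as error:
        raise ValueError("The publishedAt timestamp is missing or invalid.") from error
    if parsed.tzinfo is None:
        # Assumption made for this sketch: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


print(parse_published_at({"version": {"publishedAt": "2023-05-01T12:30:00Z"}}))
```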

macaron.slsa_analyzer.package_registry.pypi_registry module

The module provides abstractions for the pypi package registry.

class macaron.slsa_analyzer.package_registry.pypi_registry.PyPIRegistry(registry_url_netloc=None, registry_url_scheme=None, fileserver_url_netloc=None, fileserver_url_scheme=None, inspector_url_netloc=None, inspector_url_scheme=None, request_timeout=None, enabled=True)

Bases: PackageRegistry

This class implements the pypi package registry.

__init__(registry_url_netloc=None, registry_url_scheme=None, fileserver_url_netloc=None, fileserver_url_scheme=None, inspector_url_netloc=None, inspector_url_scheme=None, request_timeout=None, enabled=True)

Initialize the pypi Registry instance.

Parameters:
  • registry_url_netloc (str | None) – The netloc of the pypi registry url.

  • registry_url_scheme (str | None) – The scheme of the pypi registry url.

  • fileserver_url_netloc (str | None) – The netloc of the server url that stores package source files, which contains the hostname and port.

  • fileserver_url_scheme (str | None) – The scheme of the server url that stores package source files.

  • inspector_url_netloc (str | None) – The netloc of the inspector server url, which contains the hostname and port.

  • inspector_url_scheme (str | None) – The scheme of the inspector server url.

  • request_timeout (int | None) – The timeout (in seconds) for requests made to the package registry.

  • enabled (bool) – Whether making REST API calls to the pypi registry is enabled.

load_defaults()

Load the .ini configuration for the current package registry.

Raises:

ConfigurationError – If there is a schema violation in the pypi section.

Return type:

None

download_package_json(url)

Download the package JSON metadata from pypi registry.

Parameters:

url (str) – The package JSON url.

Returns:

The JSON response if the request is successful.

Return type:

dict

Raises:

InvalidHTTPResponseError – If the HTTP request to the registry fails or an unexpected response is returned.

static cleanup_sourcecode_directory(directory, error_message=None, error=None)

Remove the target directory, and report the passed error if present.

Parameters:
  • directory (str) – The directory to remove.

  • error_message (str | None) – The error message to report.

  • error (Exception | None) – The error to inherit from.

Raises:

InvalidHTTPResponseError – If there was an error during the operation.

Return type:

None

download_package_sourcecode(url)

Download the package source code from pypi registry.

Parameters:

url (str) – The package source code url.

Returns:

The temp directory with the source code.

Return type:

str

Raises:

InvalidHTTPResponseError – If the HTTP request to the registry fails or an unexpected response is returned.

get_artifact_hash(artifact_url)

Return the hash of the artifact found at the passed URL.

Parameters:

artifact_url (str) – The URL of the artifact.

Returns:

The hash of the artifact, or None if not found.

Return type:

str | None

get_package_page(package_name)

Implement custom API to get package main page.

Parameters:

package_name (str) – The package name.

Returns:

The package main page.

Return type:

str | None

get_maintainers_of_package(package_name)

Implement custom API to get all maintainers of the package.

Parameters:

package_name (str) – The package name.

Returns:

The list of maintainers.

Return type:

list | None

get_maintainer_profile_page(username)

Implement custom API to get maintainer’s profile page.

Parameters:

username (str) – The maintainer’s username.

Returns:

The profile page.

Return type:

str | None

get_packages_by_username(username)

Implement custom API to get the maintainer’s packages.

Parameters:

username (str) – The maintainer’s username.

Returns:

A list of package names.

Return type:

list[str] | None

get_maintainer_join_date(username)

Implement custom API to get the maintainer’s join date.

Parameters:

username (str) – The maintainer’s username.

Returns:

The maintainer’s join date. Only data for recent maintainers is available.

Return type:

datetime | None

static extract_attestation(attestation_data)

Extract the first attestation file from a PyPI attestation response.

Parameters:

attestation_data (dict) – The JSON data representing a bundle of attestations.

Returns:

The first attestation, or None if not found.

Return type:

dict | None
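A defensive extraction of the first attestation might look like the sketch below. The key names "attestation_bundles" and "attestations" are assumptions about the response layout and are not confirmed by the text above; first_attestation is a hypothetical helper name.

```python
def first_attestation(attestation_data):
    """Return the first attestation of the first bundle, or None."""
    # Assumed layout: {"attestation_bundles": [{"attestations": [{...}, ...]}]}
    bundles = attestation_data.get("attestation_bundles")
    if not isinstance(bundles, list) or not bundles:
        return None
    first_bundle = bundles[0]
    if not isinstance(first_bundle, dict):
        return None
    attestations = first_bundle.get("attestations")
    if not isinstance(attestations, list) or not attestations:
        return None
    first = attestations[0]
    return first if isinstance(first, dict) else None
```

Each level is type-checked before indexing, so a malformed response produces None rather than an exception.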

class macaron.slsa_analyzer.package_registry.pypi_registry.PyPIPackageJsonAsset(component_name, component_version, has_repository, pypi_registry, package_json, package_sourcecode_path)

Bases: object

The package JSON hosted on the PyPI registry.

component_name: str

The target pypi software component name.

component_version: str | None

The target pypi software component version.

has_repository: bool

Whether the component of this asset has a related repository.

pypi_registry: PyPIRegistry

The pypi registry.

package_json: dict

The asset content.

package_sourcecode_path: str

The name of the temporary location of the source code.

property size_in_bytes: int

Get the size of the asset.

property name: str

Get the asset name.

property url: str

Get the download URL of the asset.

Note: we assume that the path parameters used to construct the URL are sanitized already.

Return type:

str

download(dest)

Download the package JSON metadata and store it in the package_json attribute.

Returns:

True if the asset is downloaded successfully; False if not.

Return type:

bool

get_releases()

Get all releases.

Returns:

A mapping from each version to its metadata.

Return type:

dict | None

Retrieve the project links from the base metadata.

This method accesses the “info” section of the base metadata to extract the “project_urls” dictionary, which contains various links related to the project.

Returns:

A dictionary of project URLs, where the keys are the names of the links and the values are the corresponding URLs, or None if the “project_urls” section is not found in the base metadata.

Return type:

dict | None
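The lookup of “project_urls” inside the “info” section can be sketched on a plain dictionary. The helper name project_links is hypothetical, and the sample data is made up for illustration.

```python
def project_links(package_json):
    """Return the "project_urls" mapping from the "info" section, or None."""
    info = package_json.get("info")
    if not isinstance(info, dict):
        return None
    urls = info.get("project_urls")
    # PyPI may report a null "project_urls"; treat anything but a dict as missing.
    return urls if isinstance(urls, dict) else None
```

For example, project_links({"info": {"project_urls": {"Homepage": "https://example.org"}}}) returns the inner dictionary unchanged.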

get_latest_version()

Get the latest version of the package.

Returns:

The latest version.

Return type:

str | None

get_sourcecode_url(package_type='sdist')

Get the URL of the source distribution.

Parameters:

package_type (str) – The package type to retrieve the URL of.

Returns:

The URL of the source distribution.

Return type:

str | None
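Selecting a distribution by package type can be sketched as a scan over the distribution entries of the package JSON. The sketch assumes the entries live under a "urls" key and carry "packagetype" and "url" fields; the real method may consult a different section of the JSON. The helper name sourcecode_url is hypothetical.

```python
def sourcecode_url(package_json, package_type="sdist"):
    """Return the URL of the first distribution of the given type, or None."""
    # Assumed layout: {"urls": [{"packagetype": "sdist", "url": "..."}, ...]}
    for dist in package_json.get("urls") or []:
        if isinstance(dist, dict) and dist.get("packagetype") == package_type:
            return dist.get("url")
    return None
```

Passing package_type="bdist_wheel" to the same helper would select a wheel entry instead of the source distribution.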

get_latest_release_upload_time()

Get upload time of the latest release.

Returns:

The upload time of the latest release.

Return type:

str | None

sourcecode()

Download the source code of the package and clean it up afterwards, using a context manager.

Return type:

Generator[None]
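The download-then-clean-up lifecycle can be illustrated with a generator-based context manager. This sketch only creates and removes a temporary directory; the download step is omitted, and temporary_sourcecode is a hypothetical name. Unlike the documented method, whose return type indicates that it yields nothing, the sketch yields the directory path for convenience.

```python
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def temporary_sourcecode():
    """Yield a temporary directory and remove it when the block exits."""
    path = tempfile.mkdtemp(prefix="sourcecode_")
    try:
        # A real implementation would download and extract the source here.
        yield path
    finally:
        # Runs on normal exit and when the with-block raises.
        shutil.rmtree(path, ignore_errors=True)
```

The try/finally around the yield is what guarantees cleanup even if analysis inside the with-block fails.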

download_sourcecode()

Get the source code of the package and store it in a temporary directory.

Returns:

True if the source code is downloaded successfully; False if not.

Return type:

bool

get_sourcecode_file_contents(path)

Get the contents of a single source code file specified by the path.

The path can be relative to the package_sourcecode_path attribute, or an absolute path.

Parameters:

path (str) – The path of the file to open, either absolute or relative to package_sourcecode_path.

Returns:

The raw contents of the source code file.

Return type:

bytes

Raises:

SourceCodeError – If the source code has not been downloaded, or there is an error accessing the file.
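Accepting both relative and absolute paths can be handled by os.path.join, which discards the root when the second argument is absolute. The sketch below is an illustration under that assumption, not Macaron's code: read_sourcecode_file is a hypothetical name, and SourceCodeError is redefined locally so the example runs without Macaron installed.

```python
import os


class SourceCodeError(Exception):
    """Local stand-in so the sketch runs without Macaron installed."""


def read_sourcecode_file(sourcecode_root, path):
    """Return the raw bytes of ``path``, resolved against ``sourcecode_root``."""
    if not sourcecode_root:
        raise SourceCodeError("The source code has not been downloaded.")
    # os.path.join ignores sourcecode_root when path is already absolute.
    full_path = os.path.join(sourcecode_root, path)
    try:
        with open(full_path, "rb") as handle:
            return handle.read()
    except OSError as error:
        raise SourceCodeError(f"Cannot read {full_path}") from error
```

Note that this sketch does not check whether an absolute path actually lies inside the source code directory.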

iter_sourcecode()

Iterate through all source code files.

Returns:

The source code file path, and the raw contents of the source code file.

Return type:

tuple[str, bytes]

Raises:

SourceCodeError – If the source code has not been downloaded.
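Iterating over every file as a (path, contents) pair can be sketched with os.walk. The helper name iter_files is hypothetical, and the error handling documented above is left out for brevity.

```python
import os


def iter_files(root):
    """Yield (file path, raw contents) for every file below ``root``."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            with open(file_path, "rb") as handle:
                yield file_path, handle.read()
```

Because the function is a generator, each file is read only when the caller advances to it.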

get_sha256()

Get the sha256 hash of the artifact from its payload.

Returns:

The sha256 hash of the artifact, or None if not found.

Return type:

str | None

__init__(component_name, component_version, has_repository, pypi_registry, package_json, package_sourcecode_path)

macaron.slsa_analyzer.package_registry.pypi_registry.find_or_create_pypi_asset(asset_name, asset_version, pypi_registry_info)

Find the matching asset in the provided package registry information, or if not found, create and add it.

Parameters:
  • asset_name (str) – The name of the asset.

  • asset_version (str | None) – The version of the asset.

  • pypi_registry_info (PackageRegistryInfo) – The package registry information. If a new asset is created, it will be added to the metadata of this registry.

Returns:

The asset, or None if not found.

Return type:

PyPIPackageJsonAsset | None
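The find-or-create pattern described above can be sketched with simplified stand-in types. Asset and RegistryInfo are hypothetical classes that only mimic the documented roles; the real PyPIPackageJsonAsset and PackageRegistryInfo have more fields, and the real function can also return None.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    """Hypothetical stand-in for PyPIPackageJsonAsset."""

    name: str
    version: Optional[str]


@dataclass
class RegistryInfo:
    """Hypothetical stand-in for PackageRegistryInfo."""

    metadata: list = field(default_factory=list)


def find_or_create_asset(name, version, registry_info):
    """Return the matching asset from the metadata, creating it if absent."""
    for item in registry_info.metadata:
        if isinstance(item, Asset) and item.name == name and item.version == version:
            return item
    asset = Asset(name, version)
    # A newly created asset is stored so later lookups reuse it.
    registry_info.metadata.append(asset)
    return asset
```

Calling the helper twice with the same name and version returns the same object, so work such as downloads is not repeated for one asset.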