macaron.slsa_analyzer.package_registry package
This module defines the package registries.
Submodules
macaron.slsa_analyzer.package_registry.jfrog_maven_registry module
Assets on a package registry.
- class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenAsset(name: str, group_id: str, artifact_id: str, version: str, metadata: JFrogMavenAssetMetadata, jfrog_maven_registry: JFrogMavenRegistry)
Bases: NamedTuple
An asset hosted on a JFrog Artifactory repository with Maven layout.
- metadata: JFrogMavenAssetMetadata
The metadata of the JFrog Maven asset.
- jfrog_maven_registry: JFrogMavenRegistry
The JFrog repo that acts as a package registry following the Maven layout.
- class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenAssetMetadata(size_in_bytes: int, sha256_digest: str, download_uri: str)
Bases: NamedTuple
Metadata of an asset on a JFrog Maven registry.
- class macaron.slsa_analyzer.package_registry.jfrog_maven_registry.JFrogMavenRegistry(hostname=None, repo=None, request_timeout=None, download_timeout=None, enabled=None)
Bases: PackageRegistry
A JFrog Artifactory repository that acts as a package registry with Maven layout.
For more details on JFrog Artifactory repository, see: https://jfrog.com/help/r/jfrog-artifactory-documentation/repository-management
- __init__(hostname=None, repo=None, request_timeout=None, download_timeout=None, enabled=None)
Instantiate a JFrogMavenRegistry object.
- Parameters:
hostname (str) – The hostname of the JFrog instance.
repo (str | None) – The Artifactory repository with Maven layout on the JFrog instance.
request_timeout (int | None) – The timeout (in seconds) for regular requests made to the package registry.
download_timeout (int | None) – The timeout (in seconds) for downloading files from the package registry.
enabled (bool | None) – Whether the package registry should be active in the analysis or not. “Not active” means no target repo/software component can be matched against this package registry.
- load_defaults()
Load the .ini configuration for the current package registry.
- Raises:
ConfigurationError – If there is a schema violation in the package_registry.jfrog.maven section.
- Return type:
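The kind of check described above can be sketched with Python's configparser. This is a minimal illustration, not Macaron's implementation: the key names (hostname, repo, request_timeout) simply mirror the constructor parameters and are assumptions, as are the local ConfigurationError class and the fallback timeout of 10 seconds.

```python
import configparser
from typing import Optional


class ConfigurationError(Exception):
    """Hypothetical stand-in for the error raised on a schema violation."""


def read_jfrog_section(ini_text: str) -> Optional[dict]:
    """Parse the JFrog Maven section; return None if the section is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section_name = "package_registry.jfrog.maven"
    if not parser.has_section(section_name):
        return None
    section = parser[section_name]
    hostname = section.get("hostname")  # assumed key name
    if not hostname:
        raise ConfigurationError(f"'hostname' is missing in section [{section_name}].")
    try:
        # getint() raises ValueError when the value is not an integer.
        request_timeout = section.getint("request_timeout", fallback=10)
    except ValueError as error:
        raise ConfigurationError("'request_timeout' must be an integer.") from error
    return {
        "hostname": hostname,
        "repo": section.get("repo"),
        "request_timeout": request_timeout,
    }
```

Returning None for a missing section is one way to model a registry that is simply not configured, as opposed to one that is configured incorrectly.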
- is_detected(build_tool)
Detect if artifacts of the repo under analysis can possibly be published to this package registry.
The detection here is based on the repo’s detected build tool. If the package registry is compatible with the given build tool, it can be a possible place where the artifacts produced from the repo are published.
JFrogMavenRegistry is compatible with Maven and Gradle.
- Parameters:
build_tool (BaseBuildTool) – A detected build tool of the repository under analysis.
- Returns:
True if the repo under analysis can be published to this package registry, based on the given build tool.
- Return type:
- construct_maven_repository_path(group_id, artifact_id=None, version=None, asset_name=None)
Construct a path to a folder or file on the registry, assuming Maven repository layout.
For more details regarding Maven repository layout, see https://maven.apache.org/repository/layout.html and https://maven.apache.org/guides/mini/guide-naming-conventions.html.
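To make the Maven repository layout concrete, here is a minimal standalone sketch of such a path builder. The function name is hypothetical, and stopping at the first missing component is an assumption about how the optional arguments combine.

```python
from typing import Optional


def maven_repository_path(
    group_id: str,
    artifact_id: Optional[str] = None,
    version: Optional[str] = None,
    asset_name: Optional[str] = None,
) -> str:
    """Build a Maven-layout path, stopping at the first missing component."""
    # In the Maven layout, each dot in the group id becomes a directory level.
    path = group_id.replace(".", "/")
    for part in (artifact_id, version, asset_name):
        if not part:
            break
        path = f"{path}/{part}"
    return path
```

For example, the group id com.example with artifact lib, version 1.0 and asset lib-1.0.jar maps to com/example/lib/1.0/lib-1.0.jar.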
- fetch_artifact_ids(group_id)
Get all artifact ids under a group id.
This is done by fetching all children folders under the group folder on the registry.
- construct_folder_info_url(folder_path)
Construct a URL for the JFrog Folder Info API.
Documentation: https://jfrog.com/help/r/jfrog-rest-apis/folder-info.
- construct_file_info_url(file_path)
Construct a URL for the JFrog File Info API.
Documentation: https://jfrog.com/help/r/jfrog-rest-apis/file-info.
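The JFrog REST documentation describes both Folder Info and File Info as GET requests to /api/storage/{repoKey}/{path}, so the two URL builders can share one shape. The sketch below is only illustrative: the function name, the https scheme, and the absence of a base path are assumptions (many deployments serve the REST API under an extra prefix such as /artifactory).

```python
from urllib.parse import quote


def storage_info_url(hostname: str, repo: str, item_path: str) -> str:
    """Build a Folder Info / File Info URL for an item in a repository."""
    # Normalise leading/trailing slashes and percent-encode unsafe characters.
    clean_path = quote(item_path.strip("/"))
    return f"https://{hostname}/api/storage/{repo}/{clean_path}"
```

Whether the response describes a folder or a file depends only on what the path points to.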
- construct_latest_version_url(group_id, artifact_id)
Construct a URL for the JFrog Latest Version Search API.
The response payload includes the latest version of the package with the given group id and artifact id. Documentation: https://jfrog.com/help/r/jfrog-rest-apis/artifact-latest-version-search-based-on-layout.
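As a rough sketch, the Latest Version Search API takes the group id and artifact id as the query parameters g and a, with repos restricting the search to given repositories. The function name, the https scheme, and the missing base path are assumptions for illustration.

```python
from urllib.parse import urlencode


def latest_version_url(hostname: str, repo: str, group_id: str, artifact_id: str) -> str:
    """Build a Latest Version Search URL for one Maven package."""
    # urlencode() percent-encodes any characters that are unsafe in a query.
    query = urlencode({"g": group_id, "a": artifact_id, "repos": repo})
    return f"https://{hostname}/api/search/latestVersion?{query}"
```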
- fetch_latest_version(group_id, artifact_id)
Fetch the latest version of a Java package on this JFrog Maven registry.
- fetch_asset_names(group_id, artifact_id, version, extensions=None)
Retrieve the names of assets published for a version of a Maven package.
- Parameters:
group_id (str) – The group id of the Maven package.
artifact_id (str) – The artifact id of the Maven package.
version (str) – The version of the Maven package.
extensions (set[str] | None) – The set of asset extensions. Only assets with names ending in these extensions are fetched. If this is None, then all assets are returned regardless of their extensions.
- Returns:
The list of asset names.
- Return type:
- extract_folder_names_from_folder_info_payload(folder_info_payload)
Extract a list of folder names from the Folder Info payload of a Maven group folder.
- extract_file_names_from_folder_info_payload(folder_info_payload, extensions=None)
Extract file names from the Folder Info response payload.
For the schema of this payload and other details regarding the API, see: https://jfrog.com/help/r/jfrog-rest-apis/folder-info.
Note: Currently, we do not try to validate the schema of the payload. Rather, we only read as much of the payload as we can recognise.
- Parameters:
- Returns:
The list of filenames in the folder, extracted from the payload.
- Return type:
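The lenient reading described in the note can be sketched as follows. The sketch assumes the Folder Info payload is a JSON object with a children list whose entries have a uri string and a folder boolean; the function name is hypothetical, and anything that does not match this shape is skipped rather than reported.

```python
import json
from typing import Optional


def file_names_from_folder_info(payload: str, extensions: Optional[set] = None) -> list:
    """Collect file names from a Folder Info payload, ignoring unknown shapes."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return []
    children = data.get("children") if isinstance(data, dict) else None
    if not isinstance(children, list):
        return []
    names = []
    for child in children:
        # Keep only entries explicitly marked as files.
        if not isinstance(child, dict) or child.get("folder") is not False:
            continue
        uri = child.get("uri")
        if not isinstance(uri, str):
            continue
        name = uri.lstrip("/")
        if extensions is None or any(name.endswith(ext) for ext in extensions):
            names.append(name)
    return names
```

The extension filter matches on the end of the file name, so passing {"jar"} keeps lib-1.0.jar and drops lib-1.0.pom.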
- fetch_asset_metadata(group_id, artifact_id, version, asset_name)
Fetch an asset’s metadata from JFrog.
- Parameters:
- Returns:
The asset’s metadata, or None if the metadata cannot be retrieved.
- Return type:
JFrogMavenAssetMetadata | None
- extract_asset_metadata_from_file_info_payload(file_info_payload)
Extract the metadata of an asset from the File Info response payload.
Documentation: https://jfrog.com/help/r/jfrog-rest-apis/file-info.
- Parameters:
file_info_payload (str) – The File Info response payload used to extract the metadata of an asset.
- Returns:
The asset’s metadata, or None if the metadata cannot be retrieved.
- Return type:
JFrogMavenAssetMetadata | None
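A minimal sketch of this extraction, assuming the File Info payload carries a size value, a checksums object with a sha256 entry, and a downloadUri string. The AssetMetadata tuple below is a local stand-in that copies the three fields of JFrogMavenAssetMetadata; the function name is hypothetical.

```python
import json
from typing import NamedTuple, Optional


class AssetMetadata(NamedTuple):
    """Local stand-in with the same fields as JFrogMavenAssetMetadata."""

    size_in_bytes: int
    sha256_digest: str
    download_uri: str


def asset_metadata_from_file_info(payload: str) -> Optional[AssetMetadata]:
    """Return the asset metadata, or None if any expected field is unusable."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    checksums = data.get("checksums")
    sha256 = checksums.get("sha256") if isinstance(checksums, dict) else None
    download_uri = data.get("downloadUri")
    try:
        # The size may arrive as a string, so convert it explicitly.
        size = int(data.get("size"))
    except (TypeError, ValueError):
        return None
    if not isinstance(sha256, str) or not isinstance(download_uri, str):
        return None
    return AssetMetadata(size, sha256, download_uri)
```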
- fetch_assets(group_id, artifact_id, version, extensions=None)
Fetch the assets of a Maven package.
- Parameters:
- Returns:
The list of assets of the package.
- Return type:
- construct_asset_url(group_id, artifact_id, version, asset_name)
Get the URL to download an asset.
- Parameters:
- Returns:
The URL to the asset, which can be used for downloading the asset.
- Return type:
- download_asset(url, dest)
Download an asset from the given URL to a given location.
- find_publish_timestamp(purl, registry_url=None)
Make a search request to Maven Central to find the publishing timestamp of an artifact.
The reason for directly fetching timestamps from Maven Central is that deps.dev occasionally misses timestamps for Maven artifacts, making it unreliable for this purpose.
For the search API syntax, see: https://central.sonatype.org/search/rest-api-guide/
- Parameters:
- Returns:
A timezone-aware datetime object representing the publication timestamp of the specified package.
- Return type:
datetime
- Raises:
InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.
NotImplementedError – If not implemented for a registry.
macaron.slsa_analyzer.package_registry.maven_central_registry module
The module provides abstractions for the Maven Central package registry.
- macaron.slsa_analyzer.package_registry.maven_central_registry.same_organization(group_id_1, group_id_2)
Check if two Maven group ids are from the same organization.
Note: It is assumed that for recognized source platforms, the top-level domain doesn’t change the organization. For example, io.github.foo and com.github.foo are assumed to be from the same organization.
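One possible reading of this rule is sketched below. It is hypothetical throughout: the list of recognized platforms, the choice to compare the first two components for all other group ids, and the function name are assumptions made for illustration, not the library's actual logic.

```python
# Assumed list of recognized source platforms (hypothetical).
RECOGNIZED_PLATFORMS = {"github", "gitlab", "bitbucket"}


def same_organization_sketch(group_id_1: str, group_id_2: str) -> bool:
    """Guess whether two Maven group ids belong to the same organization."""
    if group_id_1 == group_id_2:
        return True
    parts_1 = group_id_1.split(".")
    parts_2 = group_id_2.split(".")
    if len(parts_1) < 2 or len(parts_2) < 2:
        return False
    on_same_platform = parts_1[1] == parts_2[1] and parts_1[1] in RECOGNIZED_PLATFORMS
    if on_same_platform and len(parts_1) >= 3 and len(parts_2) >= 3:
        # Ignore the top-level domain and compare the account name instead.
        return parts_1[2] == parts_2[2]
    # Otherwise compare the top-level domain and the organization name.
    return parts_1[:2] == parts_2[:2]
```

Under these assumptions, io.github.foo and com.github.foo match, while io.github.foo and io.github.bar do not.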
- class macaron.slsa_analyzer.package_registry.maven_central_registry.MavenCentralRegistry(search_netloc=None, search_scheme=None, search_endpoint=None, registry_url_netloc=None, registry_url_scheme=None, request_timeout=None)
Bases: PackageRegistry
This class implements a Maven Central package registry.
- __init__(search_netloc=None, search_scheme=None, search_endpoint=None, registry_url_netloc=None, registry_url_scheme=None, request_timeout=None)
Initialize a Maven Central Registry instance.
- Parameters:
search_netloc (str | None) – The netloc of the Maven Central search URL.
search_scheme (str | None) – The scheme of the Maven Central search URL.
search_endpoint (str | None) – The search REST API to find artifacts.
registry_url_netloc (str | None) – The netloc of the Maven Central registry url.
registry_url_scheme (str | None) – The scheme of the Maven Central registry url.
request_timeout (int | None) – The timeout (in seconds) for requests made to the package registry.
- load_defaults()
Load the .ini configuration for the current package registry.
- Raises:
ConfigurationError – If there is a schema violation in the maven_central section.
- Return type:
- is_detected(build_tool)
Detect if artifacts of the repo under analysis can possibly be published to this package registry.
The detection here is based on the repo’s detected build tools. If the package registry is compatible with the given build tools, it can be a possible place where the artifacts produced from the repo are published.
MavenCentralRegistry is compatible with Maven and Gradle.
- Parameters:
build_tool (BaseBuildTool) – A detected build tool of the repository under analysis.
- Returns:
True if the repo under analysis can be published to this package registry, based on the given build tool.
- Return type:
- find_publish_timestamp(purl, registry_url=None)
Make a search request to Maven Central to find the publishing timestamp of an artifact.
The reason for directly fetching timestamps from Maven Central is that deps.dev occasionally misses timestamps for Maven artifacts, making it unreliable for this purpose.
For the search API syntax, see: https://central.sonatype.org/search/rest-api-guide/
- Parameters:
- Returns:
A timezone-aware datetime object representing the publication timestamp of the specified package.
- Return type:
datetime
- Raises:
InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.
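The final step, turning a search response into a timezone-aware datetime, might look like the sketch below. It assumes the Solr-style response shape of the public search API, where response.docs[0].timestamp holds milliseconds since the Unix epoch. The function name is hypothetical, and a plain ValueError stands in for InvalidHTTPResponseError.

```python
from datetime import datetime, timezone


def publish_time_from_search_response(response: dict) -> datetime:
    """Convert the first document's timestamp into a UTC datetime."""
    try:
        timestamp_ms = response["response"]["docs"][0]["timestamp"]
        # The timestamp is assumed to be in milliseconds, so scale to seconds.
        return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    except (KeyError, IndexError, TypeError, ValueError, OverflowError, OSError) as error:
        raise ValueError("The timestamp is missing or invalid.") from error
```

Passing tz=timezone.utc is what makes the result timezone-aware; without it, fromtimestamp() returns a naive datetime in local time.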
macaron.slsa_analyzer.package_registry.npm_registry module
The module provides abstractions for the npm package registry.
- class macaron.slsa_analyzer.package_registry.npm_registry.NPMRegistry(hostname=None, attestation_endpoint=None, request_timeout=None, enabled=True)
Bases: PackageRegistry
This class implements the npm package registry.
There is no complete, up-to-date API documentation for the npm registry; the endpoints were discovered by manually inspecting links on https://www.npmjs.com.
- __init__(hostname=None, attestation_endpoint=None, request_timeout=None, enabled=True)
Initialize the npm Registry instance.
- Parameters:
- load_defaults()
Load the .ini configuration for the current package registry.
- Raises:
ConfigurationError – If there is a schema violation in the npm registry section.
- Return type:
- is_detected(build_tool)
Detect if artifacts under analysis can be published to this package registry.
The detection here is based on the repo’s detected build tools. If the package registry is compatible with the given build tools, it can be a possible place where the artifacts are published.
NPMRegistry is compatible with npm and Yarn build tools.
Note: if the npm registry is disabled through the ini configuration, this method returns False.
- Parameters:
build_tool (BaseBuildTool) – A detected build tool of the repository under analysis.
- Returns:
True if the repo under analysis can be published to this package registry, based on the given build tool.
- Return type:
- download_attestation_payload(url, download_path)
Download the npm attestation from npm registry.
Each npm package can have the following types of attestations:
publish with “https://github.com/npm/attestation/tree/main/specs/publish/v0.1” predicateType
SLSA with “https://slsa.dev/provenance/v0.2” predicateType
SLSA with “https://slsa.dev/provenance/v1” predicateType
We download the unsigned SLSA provenance v0.2 or v1 in this method, and the signed npm type.
An example SLSA v0.2 provenance: https://registry.npmjs.org/-/npm/v1/attestations/@sigstore/mock@0.1.0
An example SLSA v1 provenance: https://registry.npmjs.org/-/npm/v1/attestations/@sigstore/mock@0.6.3
- Parameters:
- Returns:
True if the asset is downloaded successfully; False if not.
- Return type:
- Raises:
InvalidHTTPResponseError – If the HTTP request to the registry fails or an unexpected response is returned.
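To illustrate how the attestation types listed above can be told apart, the sketch below filters a downloaded payload by predicateType. It assumes the payload is a JSON object with a top-level attestations list whose entries each carry a predicateType field; the function name is hypothetical.

```python
# The two SLSA predicate types named in the description above.
SLSA_PREDICATE_TYPES = {
    "https://slsa.dev/provenance/v0.2",
    "https://slsa.dev/provenance/v1",
}


def select_slsa_attestations(payload: dict) -> list:
    """Return only the attestations whose predicateType is a SLSA provenance."""
    attestations = payload.get("attestations")
    if not isinstance(attestations, list):
        return []
    return [
        item
        for item in attestations
        if isinstance(item, dict) and item.get("predicateType") in SLSA_PREDICATE_TYPES
    ]
```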
- get_latest_version(namespace, name)
Try to retrieve the latest version of a package from the registry.
- class macaron.slsa_analyzer.package_registry.npm_registry.NPMAttestationAsset(namespace: str | None, artifact_id: str, version: str, npm_registry: NPMRegistry, size_in_bytes: int)
Bases: NamedTuple
An attestation asset hosted on the npm registry.
The API Documentation can be found here:
- namespace: str | None
The optional scope of a package on npm, which is used as the namespace in a PURL string. See https://docs.npmjs.com/cli/v10/using-npm/scope to know about npm scopes. See https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#npm for the namespace in an npm PURL string.
- npm_registry: NPMRegistry
The npm registry.
- size_in_bytes: int
The size of the asset (in bytes). This attribute is added to match the AssetLocator protocol and is not used because the npm registry API does not provide it.
macaron.slsa_analyzer.package_registry.package_registry module
This module defines package registries.
- class macaron.slsa_analyzer.package_registry.package_registry.PackageRegistry(name)
Bases: ABC
Base package registry class.
- __init__(name)
- abstract load_defaults()
Load the .ini configuration for the current package registry.
- Return type:
- abstract is_detected(build_tool)
Detect if artifacts of the repo under analysis can possibly be published to this package registry.
The detection here is based on the repo’s detected build tool. If the package registry is compatible with the given build tool, it can be a possible place where the artifacts produced from the repo are published.
- Parameters:
build_tool (BaseBuildTool) – A detected build tool of the repository under analysis.
- Returns:
True if the repo under analysis can be published to this package registry, based on the given build tool.
- Return type:
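The detection contract can be illustrated with a toy class hierarchy. Every name below (ToyBuildTool, ToyRegistry, ToyMavenRegistry, and the compatible_build_tools attribute) is hypothetical and only mimics the pattern of an abstract is_detected method that concrete registries implement by checking build tool compatibility.

```python
from abc import ABC, abstractmethod


class ToyBuildTool:
    """Hypothetical stand-in for a detected build tool."""

    def __init__(self, name: str) -> None:
        self.name = name


class ToyRegistry(ABC):
    """Hypothetical base class with an abstract detection hook."""

    def __init__(self, name: str, enabled: bool = True) -> None:
        self.name = name
        self.enabled = enabled

    @abstractmethod
    def is_detected(self, build_tool: ToyBuildTool) -> bool:
        """Return True if artifacts built with this tool may be published here."""


class ToyMavenRegistry(ToyRegistry):
    """A registry that accepts artifacts built with Maven or Gradle."""

    compatible_build_tools = {"maven", "gradle"}

    def is_detected(self, build_tool: ToyBuildTool) -> bool:
        # A disabled registry never matches, whatever the build tool is.
        return self.enabled and build_tool.name in self.compatible_build_tools
```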
- find_publish_timestamp(purl, registry_url=None)
Retrieve the publication timestamp for a package specified by its purl from the deps.dev repository by default.
This method constructs a request URL based on the provided purl, sends an HTTP GET request to fetch metadata about the package, and extracts the publication timestamp from the response.
Note: The method expects the response to include a version field with a publishedAt subfield containing an ISO 8601 formatted timestamp.
- Parameters:
- Returns:
A timezone-aware datetime object representing the publication timestamp of the specified package.
- Return type:
datetime
- Raises:
InvalidHTTPResponseError – If the URL construction fails, the HTTP response is invalid, or if the response cannot be parsed correctly, or if the expected timestamp is missing or invalid.
NotImplementedError – If not implemented for a registry.
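A minimal sketch of the parsing step, assuming the response has already been decoded into a dictionary with the version.publishedAt shape described above. The function name is hypothetical, a plain ValueError stands in for InvalidHTTPResponseError, and treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_published_at(response: dict) -> datetime:
    """Read version.publishedAt and return a timezone-aware datetime."""
    try:
        published_at = response["version"]["publishedAt"]
        # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
        parsed = datetime.fromisoformat(published_at.replace("Z", "+00:00"))
    except (KeyError, TypeError, AttributeError, ValueError) as error:
        raise ValueError("The publishedAt timestamp is missing or invalid.") from error
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```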
macaron.slsa_analyzer.package_registry.pypi_registry module
The module provides abstractions for the pypi package registry.
- class macaron.slsa_analyzer.package_registry.pypi_registry.PyPIRegistry(registry_url_netloc=None, registry_url_scheme=None, fileserver_url_netloc=None, fileserver_url_scheme=None, request_timeout=None, enabled=True)
Bases: PackageRegistry
This class implements the pypi package registry.
- __init__(registry_url_netloc=None, registry_url_scheme=None, fileserver_url_netloc=None, fileserver_url_scheme=None, request_timeout=None, enabled=True)
Initialize the pypi Registry instance.
- Parameters:
registry_url_netloc (str | None) – The netloc of the pypi registry url.
registry_url_scheme (str | None) – The scheme of the pypi registry url.
fileserver_url_netloc (str | None) – The netloc of the server url that stores package source files, which contains the hostname and port.
fileserver_url_scheme (str | None) – The scheme of the server url that stores package source files.
request_timeout (int | None) – The timeout (in seconds) for requests made to the package registry.
enabled (bool) – Shows whether making REST API calls to pypi registry is enabled.
- load_defaults()
Load the .ini configuration for the current package registry.
- Raises:
ConfigurationError – If there is a schema violation in the
ConfigurationError – If there is a schema violation in the pypi section.
- Return type:
- is_detected(build_tool)
Detect if artifacts of the repo under analysis can possibly be published to this package registry.
The detection here is based on the repo’s detected build tools. If the package registry is compatible with the given build tools, it can be a possible place where the artifacts produced from the repo are published.
PyPIRegistry is compatible with Pip and Poetry.
- Parameters:
build_tool (BaseBuildTool) – A detected build tool of the repository under analysis.
- Returns:
True if the repo under analysis can be published to this package registry, based on the given build tool.
- Return type:
- download_package_json(url)
Download the package JSON metadata from pypi registry.
- Parameters:
url (str) – The package JSON url.
- Returns:
The JSON response if the request is successful.
- Return type:
- Raises:
InvalidHTTPResponseError – If the HTTP request to the registry fails or an unexpected response is returned.
- get_package_page(package_name)
Implement custom API to get package main page.
- get_maintainers_of_package(package_name)
Implement custom API to get all maintainers of the package.
- get_maintainer_profile_page(username)
Implement custom API to get maintainer’s profile page.
- class macaron.slsa_analyzer.package_registry.pypi_registry.PyPIPackageJsonAsset(component, pypi_registry, package_json)
Bases: object
The package JSON hosted on the PyPI registry.
- pypi_registry: PyPIRegistry
The pypi registry.
- property url: str
Get the download URL of the asset.
Note: we assume that the path parameters used to construct the URL are sanitized already.
- Return type:
- download(dest)
Download the package JSON metadata and store it in the package_json attribute.
- Returns:
True if the asset is downloaded successfully; False if not.
- Return type:
- get_project_links()
Retrieve the project links from the base metadata.
This method accesses the “info” section of the base metadata to extract the “project_urls” dictionary, which contains various links related to the project.
- Returns:
A dictionary of project URLs, where the keys are the names of the links and the values are the corresponding URLs. Returns None if the “project_urls” section is not found in the base metadata.
- Return type:
dict | None
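The lookup described above amounts to a guarded read of a nested dictionary. The sketch below works on an already-downloaded package JSON document passed in as a plain dict; the function name is hypothetical.

```python
from typing import Optional


def project_links(package_json: dict) -> Optional[dict]:
    """Return info.project_urls, or None if it is absent or not a mapping."""
    info = package_json.get("info")
    if not isinstance(info, dict):
        return None
    links = info.get("project_urls")
    # The field can be null in the JSON document, so check the type.
    return links if isinstance(links, dict) else None
```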
- get_latest_version()
Get the latest version of the package.
- Returns:
The latest version.
- Return type:
str | None
- get_sourcecode_url()
Get the url of the source distribution.
- Returns:
The URL of the source distribution.
- Return type:
str | None
- get_latest_release_upload_time()
Get upload time of the latest release.
- Returns:
The upload time of the latest release.
- Return type:
str | None
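The three accessors above can be sketched against the public PyPI JSON document, where info.version holds the latest version and urls lists the files of that release with their packagetype, url, and upload_time. The function names are hypothetical, and reading the upload time from the first listed file is an assumption.

```python
from typing import Optional


def latest_version(package_json: dict) -> Optional[str]:
    """Return info.version, or None if it is missing."""
    info = package_json.get("info")
    version = info.get("version") if isinstance(info, dict) else None
    return version if isinstance(version, str) else None


def _release_files(package_json: dict) -> list:
    """Return the well-formed file entries of the latest release."""
    urls = package_json.get("urls")
    if not isinstance(urls, list):
        return []
    return [entry for entry in urls if isinstance(entry, dict)]


def sdist_url(package_json: dict) -> Optional[str]:
    """Return the URL of the source distribution, if one is listed."""
    for entry in _release_files(package_json):
        if entry.get("packagetype") == "sdist":
            return entry.get("url")
    return None


def latest_upload_time(package_json: dict) -> Optional[str]:
    """Return the upload time of the first listed file (an assumption)."""
    files = _release_files(package_json)
    return files[0].get("upload_time") if files else None
```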
- __init__(component, pypi_registry, package_json)