Query API ignores purl qualifiers
Describe the bug
The OSV API does not respect the distro qualifier when querying with a purl, potentially leading to false positive vulnerability reports. Specifically, this leads to false positives in GUAC when using the OSV certifier.
To Reproduce
Using the sample package busybox in version 1.36.1-r19 from Alpine 3.19.
Query the OSV API using a purl:
curl -d '{"package": { "purl": "pkg:apk/alpine/busybox@1.36.1-r19?distro=alpine-3.19" } }' https://api.osv.dev/v1/query
Pipe through jq to only show relevant parts:
jq '.vulns[] | {id: .id, affected: (.affected[]? | select(has("package") and .package.ecosystem == "Alpine:v3.19" and .package.name == "busybox") | {ecosystem: .package.ecosystem, fixed_version: .ranges[].events[] | select(has("fixed")).fixed})}'
The result is:
{
"id": "CVE-2023-42363",
"affected": {
"ecosystem": "Alpine:v3.19",
"fixed_version": "1.36.1-r17"
}
}
{
"id": "CVE-2023-42364",
"affected": {
"ecosystem": "Alpine:v3.19",
"fixed_version": "1.36.1-r19"
}
}
{
"id": "CVE-2023-42365",
"affected": {
"ecosystem": "Alpine:v3.19",
"fixed_version": "1.36.1-r19"
}
}
{
"id": "CVE-2023-42366",
"affected": {
"ecosystem": "Alpine:v3.19",
"fixed_version": "1.36.1-r16"
}
}
The CVEs listed do not affect the queried version in Alpine 3.19, but do affect this version in Alpine 3.20 and 3.21.
When querying with name, ecosystem, and version instead of a purl:
curl -d '{"package": { "name": "busybox", "ecosystem": "Alpine:v3.19" }, "version": "1.36.1-r19" }' https://api.osv.dev/v1/query
The API returns no results, as expected.
Additional context
Looking at https://github.com/google/osv.dev/blob/master/osv/purl_helpers.py reveals that purl qualifiers are disregarded entirely after parsing.
While the distro qualifier is not properly standardized (see https://github.com/package-url/purl-spec/issues/247), I think this should be considered.
Without the distro qualifier being properly defined (there's not even an example of its usage for apk 😕) this is hard to properly support.
We might be able to add a best-effort attempt to interpret the distro field for ecosystems that use it. Looking around online for Alpine, I've seen e.g. alpine-3.19, alpine-3.19.1, 3.19.1.
I agree that the missing definition is probably the biggest blocker here. The discussion related to that in https://github.com/package-url/purl-spec/issues/247 was last updated almost exactly one year ago. I guess we'll have to wait a bit longer for proper specification.
To add more detail to your observations of how the qualifier is used:
TL;DR: The pattern <ID>-<VERSION_ID>, with the <ID>- prefix being optional, is the only one I could find.
- https://github.com/anchore/syft uses distro=alpine-X.Y.Z; the general format is <ID>-<VERSION_ID> (values from os-release, see implementation in https://github.com/anchore/syft/blob/a17fe480a063998e152e6aee617bb0cb93fa0588/syft/pkg/url.go#L24)
- https://github.com/aquasecurity/trivy only uses the version (distro=3.21.2)
- https://github.com/cyclonedx/cdxgen uses the same pattern as syft to construct the distro qualifier from os-release
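A best-effort interpretation of the Alpine forms seen so far (alpine-3.19, alpine-3.19.1, 3.19.1) could look roughly like the sketch below. The function name and regex are hypothetical and not part of osv.dev; it assumes the OSV ecosystem only needs the major.minor release and that anything else should be rejected rather than guessed.

```python
import re
from typing import Optional

# Hypothetical helper, not part of osv.dev.
# Accepts "<ID>-<VERSION_ID>" with an optional "alpine-" prefix and an
# optional patch component, e.g. "alpine-3.19", "alpine-3.19.1", "3.19.1".
_ALPINE_DISTRO = re.compile(r"^(?:alpine-)?(\d+)\.(\d+)(?:\.\d+)?$")


def alpine_ecosystem_from_distro(distro: str) -> Optional[str]:
    """Map an Alpine distro qualifier to an OSV ecosystem like 'Alpine:v3.19'.

    Returns None when the value does not match a known pattern, so the
    caller can fall back to the current qualifier-less behaviour.
    """
    match = _ALPINE_DISTRO.match(distro.strip().lower())
    if match is None:
        return None
    major, minor = match.groups()
    return f"Alpine:v{major}.{minor}"
```

Returning None for unrecognized values keeps this strictly additive: a purl with an unknown distro string would be handled exactly as it is today.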
From our own observations, it is unfortunate that using the purl format and specifying a distro does not help with filtering the results, compared to specifying a package name, ecosystem, and version:
Compare:
curl -sd '{"package": {"purl": "pkg:deb/debian/ghostscript@9.53.3~dfsg-7+deb11u11?arch=source&distro=bullseye"}}' "https://api.osv.dev/v1/query" | jq '.vulns | length'
48
(Replacing bullseye with debian-11 has no effect.)
with
curl -sd '{"package": {"name": "ghostscript", "ecosystem": "Debian:11"}, "version": "9.53.3~dfsg-7+deb11u11"}' https://api.osv.dev/v1/query | jq '.vulns | length'
6
@dinofizz yeah, that's what I've ended up using. I just split the purl into name, ecosystem, and version locally and query OSV that way.
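For anyone landing here, a minimal sketch of that workaround in Python. Everything in it is an assumption for illustration: the ECOSYSTEMS table and the function names are hypothetical, the purl parsing is deliberately simplified (no validation beyond the pkg: scheme), and the table only covers the two purls used in this thread.

```python
from urllib.parse import unquote

# Hypothetical lookup from (purl type, distro qualifier) to an OSV ecosystem.
# Extend it for the distros you actually scan.
ECOSYSTEMS = {
    ("apk", "alpine-3.19"): "Alpine:v3.19",
    ("deb", "bullseye"): "Debian:11",
}


def split_purl(purl: str) -> dict:
    """Split a purl into type, name, version and qualifiers (simplified)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    rest = purl[len("pkg:"):]
    rest, _, _subpath = rest.partition("#")   # drop optional subpath
    rest, _, query = rest.partition("?")      # separate qualifiers
    version = ""
    if "@" in rest:
        rest, _, version = rest.rpartition("@")
    parts = rest.split("/")
    # Split qualifiers by hand so "+" is not turned into a space.
    qualifiers = {}
    for pair in query.split("&"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            qualifiers[key] = unquote(value)
    return {
        "type": parts[0],
        "name": unquote(parts[-1]),
        "version": unquote(version),
        "qualifiers": qualifiers,
    }


def osv_query_from_purl(purl: str) -> dict:
    """Build a name/ecosystem/version body for POST /v1/query."""
    fields = split_purl(purl)
    distro = fields["qualifiers"].get("distro", "")
    ecosystem = ECOSYSTEMS.get((fields["type"], distro))
    if ecosystem is None:
        raise ValueError(f"no ecosystem mapping for {fields['type']}/{distro!r}")
    return {
        "package": {"name": fields["name"], "ecosystem": ecosystem},
        "version": fields["version"],
    }
```

The returned dict can be serialized with json.dumps and sent to https://api.osv.dev/v1/query in place of the purl-based body.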