Unable to improve advisory database for C / C++ packages
The README does document that contributions are not accepted for advisories outside the supported ecosystems. But some of the highest-impact vulnerability bulletins that need improvement are for C and C++ packages, which don't have an "ecosystem" as such. They are part of all the ecosystems.
I would really like to be able to improve https://github.com/advisories/GHSA-mq29-j5xf-cjwr in light of all the confusion seen in https://github.com/madler/zlib/issues/868#issuecomment-1821577241. But there's no way to do this.
What could possibly be done to improve these bulletins?
An important ticket!
They are part of all the ecosystems.
That's kinda the problem, actually. Our ecosystems provide a one-to-one mapping between a package namespace and an advisory. As a result we don't have false positives from dependency matching, which makes advisories more actionable and leads to better outcomes for developers receiving alerts.
What could possibly be done to improve these bulletins?
It's an open question and one that we're thinking about.
For what it's worth, we do have plenty of advisories about C/C++ code that has been bundled into a package in one of our ecosystems, e.g. https://github.com/advisories?query=tensorflow+type%3Areviewed+ecosystem%3Apip
A big part of the C/C++ ecosystem relies on either git submodules or CMake's built-in functionality. The first should be trivial for GitHub to track; the second can probably be handled decently with a regex. A typical CMakeLists.txt may start like this:
cmake_minimum_required(VERSION 3.14)
project(MyProject VERSION 1.0)
include(FetchContent)
FetchContent_Declare(
    SomeLibrary
    GIT_REPOSITORY https://github.com/username/SomeLibrary.git
    GIT_TAG main # or specific version tag
)
FetchContent_MakeAvailable(SomeLibrary)
include(ExternalProject)
ExternalProject_Add(
    OtherLibrary
    GIT_REPOSITORY https://github.com/username/OtherLibrary.git
    GIT_TAG main # or specific version tag
    SOURCE_DIR "${CMAKE_BINARY_DIR}/_deps/otherlibrary-src"
    BINARY_DIR "${CMAKE_BINARY_DIR}/_deps/otherlibrary-build"
    CONFIGURE_COMMAND "" # Custom configure command
    BUILD_COMMAND "" # Custom build command
    INSTALL_COMMAND "" # Custom install command
    TEST_COMMAND "" # Custom test command
)
# FetchContent exposes <lowercase name>_SOURCE_DIR, i.e. somelibrary_SOURCE_DIR here
include_directories(${somelibrary_SOURCE_DIR}/include)
In that case, matching GitHub-hosted dependencies is probably as easy as:
(?:FetchContent_Declare|ExternalProject_Add)\s*\([^)]*\bGIT_REPOSITORY\s+https:\/\/github\.com\/[^\s)]+
This would cover a huge part of the ecosystem and would be invaluable for dependency analysis and security audits of low-level infrastructure.
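To make the idea concrete, here is a minimal Python sketch that applies the regex above to the text of a CMakeLists.txt. The named capture groups (`name`, `url`, `tag`), the `find_github_deps` helper and the `SAMPLE` input are my own additions for illustration, not anything GitHub uses. It also assumes no `)` appears between the opening parenthesis and the values being captured, so a comment or generator expression containing one would defeat it.

```python
import re

# The regex from the comment above, extended with hypothetical named groups
# so the dependency name, repository URL and (optional) GIT_TAG are captured.
DEP_PATTERN = re.compile(
    r"(?:FetchContent_Declare|ExternalProject_Add)\s*\(\s*(?P<name>\w+)"
    r"[^)]*?\bGIT_REPOSITORY\s+(?P<url>https://github\.com/[^\s)]+)"
    r"(?:[^)]*?\bGIT_TAG\s+(?P<tag>[^\s)]+))?"
)

# Illustrative input, trimmed from the CMakeLists.txt example above.
SAMPLE = """\
include(FetchContent)
FetchContent_Declare(
    SomeLibrary
    GIT_REPOSITORY https://github.com/username/SomeLibrary.git
    GIT_TAG main # or specific version tag
)
FetchContent_MakeAvailable(SomeLibrary)
include(ExternalProject)
ExternalProject_Add(
    OtherLibrary
    GIT_REPOSITORY https://github.com/username/OtherLibrary.git
    GIT_TAG main # or specific version tag
    SOURCE_DIR "${CMAKE_BINARY_DIR}/_deps/otherlibrary-src"
)
"""


def find_github_deps(cmake_text):
    """Return (name, url, tag) tuples for GitHub-hosted dependencies.

    tag is None when GIT_TAG is missing or precedes GIT_REPOSITORY.
    """
    return [
        (m.group("name"), m.group("url"), m.group("tag"))
        for m in DEP_PATTERN.finditer(cmake_text)
    ]


if __name__ == "__main__":
    for name, url, tag in find_github_deps(SAMPLE):
        print(f"{name}: {url} @ {tag}")
```

A tag like `main` is a moving target, so matching it against an advisory's affected version range would still need the branch resolved to a commit or release; the sketch only shows that the declaration itself is easy to find.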
Thanks all for the feedback here! We at GitHub are actively exploring what we can do in this space, so all ideas are welcome! I'll keep this issue open in case more folks want to chime in and share thoughts/ideas.
I understand the argument for narrow ecosystem support (the data is incredibly high quality).
Perhaps there could be a way to discuss some community support for unsupported ecosystems. For example, this could be related to https://github.com/github/advisory-database/issues/2900
Now that Grype relies on the GitHub data, there are some instances where we need to modify data in order to provide users with the best experience possible. We don't really want to do this, but there's no way around it, especially for NVD data: https://github.com/anchore/grype/issues/1607
The GitHub Advisory data is considerably easier to modify than other data sources, but there are times we cannot submit updates here, usually because the ecosystem is unsupported.
There was a proposal from Oliver at https://github.com/github/advisory-database/discussions/568#discussioncomment-3330460, on how to potentially mark unsupported ecosystems in OSV. What are folks' feelings on adopting that?
There was a proposal from Oliver at https://github.com/github/advisory-database/discussions/568#discussioncomment-3330460, on how to potentially mark unsupported ecosystems in OSV. What are folks' feelings on adopting that?
I would much rather we figure out a useful way to map the data. Accepting and validating contributions on a multitude of undefined/undefinable ecosystems is not exactly viable.
In case there's confusion about the advisory: after I opened this issue, https://github.com/advisories/GHSA-mq29-j5xf-cjwr was essentially repurposed for a supported ecosystem, Python.