Support Static Analysis Results Interchange Format (SARIF) format?
CSA just gained the ability to output SARIF: https://github.com/llvm-mirror/clang/commit/962c092aae53360ab4d5b1adac78963694b5963f#diff-e47bf599aad9618f970aa41d0f09bf4f
Do we want to support this for interop with other tools? (Or prefer over plist at some point?) Even if we consume plist it would be great to be able to generate both plist and sarif at the same time so both CodeChecker and other tools can consume the same results.
Please note that the Sarif output still needs a great amount of maturing, for example, it can't handle checkers loaded from plugins, which is a particularly great weakness for our use case. Sadly, the elegant solution for this is very non-trivial, but is on the way.
But of course, that shouldn't stop us from considering using it in the future.
I do not see any bug id (hash) in the generated report. Maybe it should be added to the output or we should generate the report id in CodeChecker for it.
Documentation:
- https://sarifweb.azurewebsites.net/
- https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012580
Questions:
-
What is missing from the Clang Static Analyzer SARIF output compared to Plist? I have analyzed the following file with the following commands:
// main.cpp #define SET_PTR_VAR_TO_NULL \ ptr = 0 void nonFunctionLikeMacroTest() { int *ptr; SET_PTR_VAR_TO_NULL; *ptr = 5; // expected-warning{{Dereference of null pointer}} } void log() {} int max(int a, int b) { // expected-warning{{Duplicate code detected}} log(); if (a > b) return a; return b; } int maxClone(int a, int b) { // expected-note{{Similar code here}} log(); if (a > b) return a; return b; } int main() { return 0; }# Create plist output. clang --analyze -Xclang -analyzer-output=plist-multi-file -o ./main_1.plist -Xclang -analyzer-config -Xclang expand-macros=true -Xclang -analyzer-config -Xclang alpha.clone.CloneChecker:MinimumCloneComplexity=5 -Xclang -analyzer-checker=alpha.clone.CloneChecker -Xclang -analyzer-checker=core.NullDereference ./main.cpp # Create sarif output. clang --analyze -Xclang -analyzer-output=sarif -o ./main_1.sarif -Xclang -analyzer-config -Xclang expand-macros=true -Xclang -analyzer-config -Xclang alpha.clone.CloneChecker:MinimumCloneComplexity=5 -Xclang -analyzer-checker=alpha.clone.CloneChecker -Xclang -analyzer-checker=core.NullDereference ./main.cpp-
issue_hash_content_of_line_in_context(bug hash) is not found in the sarif output. I think we can use one of the following property:-
fingerprints property: http://docs.oasis-open.org/sarif/sarif/v2.0/csprd01/sarif-v2.0-csprd01.html#_Toc517436070 -
partialFingerprints property: http://docs.oasis-open.org/sarif/sarif/v2.0/csprd01/sarif-v2.0-csprd01.html#_Toc517436071. For example Github uses thepartialFingerprintsproperty: https://docs.github.com/en/code-security/secure-coding/sarif-support-for-code-scanning#preventing-duplicate-alerts-using-fingerprints
-
-
Notesare supported:"importance": "unimportant", "location": { "message": { "text": "Similar code here" } ... } - Sarif output doesn't contain
macro_expansionstext like plist file does:
vs<key>name</key><string>SET_PTR_VAR_TO_NULL</string> <key>expansion</key><string>ptr = 0</string>){ "importance": "unimportant", "location": { "physicalLocation": { "artifactLocation": { "index": 0, "uri": "file:///home/username/projects/simple/main.cpp" }, "region": { "endColumn": 3, "startColumn": 3, "startLine": 6 } } } }
-
-
Which analyzer supports sarif as an output?
-
Spotbugs: https://github.com/spotbugs/spotbugs/tree/master/spotbugs/src/main/java/edu/umd/cs/findbugs/sarif -
Gosec: https://github.com/securego/gosec -
Eslint(eslint formatter): https://eslint.org/docs/user-guide/formatters/
-
-
SARIF converters
- https://github.com/microsoft/sarif-sdk/tree/master/src/Sarif.Converters
- https://github.com/Microsoft/binskim
- https://github.com/GrammaTech/pylint-sarif
-
Which are the viewers supporting sarif?
- https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer
- https://marketplace.visualstudio.com/items?itemName=WDGIS.MicrosoftSarifViewer
- https://github.com/microsoft/sarif-web-component
- https://blogs.grammatech.com/integrating-clang-static-analyzer-with-codesonar-using-sarif
- https://blogs.grammatech.com/using-sarif-to-extend-analysis-of-sast-tools
- https://www.shiftleft.io/scan/
Remaining questions:
- One sarif file for every analysis run with all the results.
- JSON parsing might get slow with lot of results.
- This will change the current behavior where we update only a few plist files.
Further notes:
- Thread flows in code flow object can be used to represent a bug path (thread flow describes a time-ordered sequence of code locations on a single thread of execution). Code flow can have multiple thread flows
Useful links:
- SARIF format sdk (C#): https://github.com/Microsoft/sarif-sdk
- Python library: https://github.com/microsoft/sarif-python-om
- Good introduction: https://github.com/microsoft/sarif-tutorials
@csordasmarton did a really good analysis above. I wanted to provide some updates on tools using/supporting SARIF since then:
- The MSVC static analysis toolset supports the SARIF format: https://devblogs.microsoft.com/cppblog/microsoft-c-code-analysis-supports-sarif-2-1/
- And we recently added notes support to our tools (also in SARIF): https://devblogs.microsoft.com/cppblog/microsoft-cpp-code-analysis-warnings-with-key-events/
- The MSVC warnings are also betting on SARIF in the long run: https://devblogs.microsoft.com/cppblog/the-future-of-c-compiler-diagnostics-in-msvc-and-visual-studio/
- There were some recent proposals in Clang to use SARIF for richer diagnostics emitted by the frontend: https://discourse.llvm.org/t/rfc-improving-clang-s-diagnostics/62584
- SARIF support is a top requested feature for SonarQube: https://portal.productboard.com/sonarsource/3-sonarqube/tabs/5-under-consideration
- CodeQL supports Sarif: https://codeql.github.com/docs/codeql-cli/sarif-output/
- CodeQL's and MSVC Code Analysis' integration with GitHub is based on SARIF: https://devblogs.microsoft.com/cppblog/microsoft-cpp-code-analysis-with-github-actions/
It looks like GCC's static analyzer will also have SARIF support in GCC 13.
Yeah Github integration in general is based on sarif. https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/uploading-a-sarif-file-to-github
Any news on this new feature?
we will resume work on this soon, @franco0700
Apologies for confusion, but it seems to me that #4011 added support for parsing SARIF into the CodeChecker, while i (and i suspect many others) was fully expecting that this issues was tracking the exact opposite feature, exporting CodeChecker results into SARIF format (so e.g. GitHub could visualize them). Am i missing the point?
Apologies for confusion, but it seems to me that #4011 added support for parsing SARIF into the CodeChecker, while i (and i suspect many others) was fully expecting that this issues was tracking the exact opposite feature, exporting CodeChecker results into SARIF format (so e.g. GitHub could visualize them). Am i missing the point?
I actually expected parsing an input SARIF file, so CodeChecker can ingest the output of more linters.
Apologies for confusion, but it seems to me that #4011 added support for parsing SARIF into the CodeChecker, while i (and i suspect many others) was fully expecting that this issues was tracking the exact opposite feature, exporting CodeChecker results into SARIF format (so e.g. GitHub could visualize them). Am i missing the point?
You can convert existing report to sarif using report-converter:
report-converter -t cppcheck out/double-free_5197372318879967385.plist -e sarif -o cppcheck.sarif
It is true though that results stored on the server cannot be exported just yet. Also, I realize that report-converter to manually postprocess results isn't the most convenient :)
edit: It may have been a little overkill to close this issue straight away, we support only a subset of sarif that we immediately needed to support the gcc static analyzer. Considering that we have a new issue already, I'll leave this closed.
You can convert existing report to sarif using report-converter:
This requires using git-tip of codechecker, as it was implemented in this PR and the commit https://github.com/Ericsson/codechecker/commit/90a39054d52f91f1acb50620b03beee6d0829e2c is not yet released.
- https://github.com/Ericsson/codechecker/pull/4011
I was initially surprised that my report-converter does not recognize -e sarif. So that's the reason.