Analysis upload fails with `rejecting SARIF, as there are more threadflow steps per result than allowed (19350 > 10000)`
Hello!
Today I wanted to enhance our CodeQL scan in the systemd repo by using the security-extended and security-and-quality query sets, but after adding them the CodeQL action can no longer upload the resulting SARIF file:
Waiting for processing to finish
Analysis upload status is pending.
Analysis upload status is failed.
Error: Code Scanning could not process the submitted SARIF file:
rejecting SARIF, as there are more threadflow steps per result than allowed (19350 > 10000)
Error: Code Scanning could not process the submitted SARIF file:
rejecting SARIF, as there are more threadflow steps per result than allowed (19350 > 10000)
at Object.waitForProcessing (/home/runner/work/_actions/github/codeql-action/0c670bbf0414f39666df6ce8e718ec5662c21e03/lib/upload-lib.js:334:19)
at async run (/home/runner/work/_actions/github/codeql-action/0c670bbf0414f39666df6ce8e718ec5662c21e03/lib/analyze-action.js:131:13)
at async runWrapper (/home/runner/work/_actions/github/codeql-action/0c670bbf0414f39666df6ce8e718ec5662c21e03/lib/analyze-action.js:221:9)
Example job: https://github.com/systemd/systemd/actions/runs/3053021449/jobs/4923112318
Configuration:
---
# vi: ts=2 sw=2 et:
# SPDX-License-Identifier: LGPL-2.1-or-later
#
name: "CodeQL"
on:
pull_request:
branches:
- main
- v[0-9]+-stable
paths:
- '**/meson.build'
- '.github/**/codeql*'
- 'src/**'
- 'test/**'
- 'tools/**'
push:
branches:
- main
- v[0-9]+-stable
permissions:
contents: read
jobs:
analyze:
name: Analyze
runs-on: ubuntu-22.04
concurrency:
group: ${{ github.workflow }}-${{ matrix.language }}-${{ github.ref }}
cancel-in-progress: true
permissions:
actions: read
security-events: write
strategy:
fail-fast: false
matrix:
language: ['cpp', 'python']
steps:
- name: Checkout repository
uses: actions/checkout@2541b1294d2704b0964813337f33b291d3f8596b
- name: Initialize CodeQL
uses: github/codeql-action/init@0c670bbf0414f39666df6ce8e718ec5662c21e03
with:
languages: ${{ matrix.language }}
queries: +security-extended,security-and-quality
- run: sudo -E .github/workflows/unit_tests.sh SETUP
- name: Autobuild
uses: github/codeql-action/autobuild@0c670bbf0414f39666df6ce8e718ec5662c21e03
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@0c670bbf0414f39666df6ce8e718ec5662c21e03
Since the build & analysis finishes successfully and only the last step fails, I don't think this is an issue on our side - is there something which can be done to mitigate this or we're out of luck?
Thank you!
Hello! Thanks for reporting this. The first thing to do would be to figure out which rules/alerts are causing the problem. As an immediate fix you could remove the problematic alerts using https://github.com/advanced-security/filter-sarif .
Could you Re-run all jobs on the workflow and toggle the Enable debug logging checkbox? When the workflow finishes a debug artefact should have been uploaded. This artefact contains a .sarif file. The debug artefact is a relatively new feature, so you may need to adjust the SHA of the codeql-action steps in the workflow. Alternatively, you can set the output property of the codeql-action/analyze step to a folder name and use the actions/upload action to upload that folder as an artefact.
Once you obtained the SARIF file, could you run the following jq commands on it and post the output?
cat FILE.sarif | jq '.runs[].results[].ruleId' | sort | uniq -c
cat FILE.sarif | jq -rc '.runs[].results[] | {rule: .ruleId, file: .locations[0].physicalLocation.artifactLocation.uri, codeFlowsCount: (.codeFlows | length), threadFlowStepsCount: (.codeFlows | reduce .[]? as $item (0; . + ($item.threadFlows[0].locations | length)))}' | jq -s -c 'sort_by(.threadFlowStepsCount) | .[]'
Hello! Thanks for reporting this. The first thing to do would be to figure out which rules/alerts are causing the problem. As an immediate fix you could remove the problematic alerts using https://github.com/advanced-security/filter-sarif .
Could you
Re-run all jobson the workflow and toggle the Enable debug logging checkbox? When the workflow finishes a debug artefact should have been uploaded. This artefact contains a.sariffile. The debug artefact is a relatively new feature, so you may need to adjust the SHA of the codeql-action steps in the workflow.
You were right, I had to use the main ref instead of the pinned SHAs to get the artifacts, so thanks for that!
Once you obtained the SARIF file, could you run the following
jqcommands on it and post the output?cat FILE.sarif | jq '.runs[].results[].ruleId' | sort | uniq -c
$ cat cpp.sarif | jq '.runs[].results[].ruleId' | sort | uniq -c
1 "cpp/cleartext-storage-file"
16 "cpp/commented-out-code"
2 "cpp/complex-condition"
2 "cpp/constant-comparison"
36 "cpp/fixme-comment"
4 "cpp/include-non-header"
3 "cpp/inconsistent-null-check"
70 "cpp/irregular-enum-init"
1 "cpp/leap-year/unchecked-after-arithmetic-year-modification"
92 "cpp/long-switch"
33 "cpp/loop-variable-changed"
1 "cpp/missing-case-in-switch"
7 "cpp/missing-header-guard"
1 "cpp/non-constant-format"
1 "cpp/offset-use-before-range-check"
1 "cpp/overrunning-write"
137 "cpp/path-injection"
293 "cpp/poorly-documented-function"
25 "cpp/stack-address-escape"
4 "cpp/suspicious-allocation-size"
2 "cpp/system-data-exposure"
19 "cpp/toctou-race-condition"
43 "cpp/too-many-format-arguments"
9 "cpp/trivial-switch"
113 "cpp/unbounded-write"
4 "cpp/uncontrolled-allocation-size"
15 "cpp/uncontrolled-process-operation"
81 "cpp/uninitialized-local"
82 "cpp/unterminated-variadic-call"
1 "cpp/unused-local-variable"
183 "cpp/unused-static-function"
3 "cpp/unused-static-variable"
34 "cpp/world-writable-file-creation"
cat FILE.sarif | jq -rc '.runs[].results[] | {rule: .ruleId, file: .locations[0].physicalLocation.artifactLocation.uri, codeFlowsCount: (.codeFlows | length), threadFlowStepsCount: (.codeFlows | reduce .[]? as $item (0; . + ($item.threadFlows[0].locations | length)))}' | jq -s -c 'sort_by(.threadFlowStepsCount) | .[]'
https://gist.github.com/mrc0mmand/d516a60a02c2f15b9bbfeac9b589f8bd
In case a further inspection is needed, I also uploaded the SARIF file to https://mrc0mmand.fedorapeople.org/gh_2022-09-14-cpp.sarif
Thanks for all the details!
The tail of the second jq command shows there are 3 alerts that exceed the limits:
{"rule":"cpp/unbounded-write","file":"src/libsystemd/sd-bus/sd-bus.c","codeFlowsCount":1208,"threadFlowStepsCount":14496}
{"rule":"cpp/path-injection","file":"src/test/test-fs-util.c","codeFlowsCount":606,"threadFlowStepsCount":15135}
{"rule":"cpp/unbounded-write","file":"src/basic/unit-name.c","codeFlowsCount":1211,"threadFlowStepsCount":19350}
You could try to add a advanced-security/filter-sarif to filter out the cpp/unbounded-write results for src/libsystemd/sd-bus/sd-bus.c and src/basic/unit-name.c and also the cpp/path-injection results for src/test/test-fs-util.c.
Alternatively, you could use a query-filter in a CodeQL config file to drop the two queries that are causing problems.
I'll let the team know about this too, so they can investigate the issue and perhaps improve the queries to avoid them producing that much data.
There is a new feature in code-scanning that allows you to remove queries from the analysis. Using this feature should remove the queries that are causing the problems. The documentation for this feature has not yet been published, but I will explain how to use it here.
- Create a custom configuration file, as explained here: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-a-custom-configuration-file
- The file can be empty, except for a
query-filtersblock, like this:
query-filters:
- exclude:
id:
- cpp/unbounded-write
- cpp/path-injection
That should be enough.
Thank you both for the suggestions. I tried filtering the SARIF file using advanced-security/filter-sarif and it seems to be doing the right thing*.
Using a CodeQL config file to disable the checks completely would also be an option (and would be probably much easier, given we already utilize such config file), but I'd like to avoid that, since both checks seem quite useful (especially in systemd's case).
On a related note - is this issue caused by some limitations of the SARIF format or it's limited somewhere in the CodeQL action?
* updated config for reference
---
# vi: ts=2 sw=2 et:
# SPDX-License-Identifier: LGPL-2.1-or-later
#
name: "CodeQL"
on:
pull_request:
branches:
- main
- v[0-9]+-stable
paths:
- '**/meson.build'
- '.github/**/codeql*'
- 'src/**'
- 'test/**'
- 'tools/**'
push:
branches:
- main
- v[0-9]+-stable
permissions:
contents: read
jobs:
analyze:
name: Analyze
runs-on: ubuntu-22.04
concurrency:
group: ${{ github.workflow }}-${{ matrix.language }}-${{ github.ref }}
cancel-in-progress: true
permissions:
actions: read
security-events: write
strategy:
fail-fast: false
matrix:
language: ['cpp', 'python']
steps:
- name: Checkout repository
uses: actions/checkout@2541b1294d2704b0964813337f33b291d3f8596b
- name: Initialize CodeQL
uses: github/codeql-action/init@0c670bbf0414f39666df6ce8e718ec5662c21e03
with:
languages: ${{ matrix.language }}
config-file: ./.github/codeql-config.yml
queries: +security-extended,security-and-quality
- run: sudo -E .github/workflows/unit_tests.sh SETUP
- name: Autobuild
uses: github/codeql-action/autobuild@0c670bbf0414f39666df6ce8e718ec5662c21e03
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@0c670bbf0414f39666df6ce8e718ec5662c21e03
with:
upload: False
output: sarif-results
- name: Filter SARIF results
uses: advanced-security/filter-sarif@54b1ee6ebe059d29692bcc246e3c397e99176c6b
if: ${{ matrix.language == 'cpp' }}
with:
patterns: |
-src/basic/unit-name.c:cpp/unbounded-write
-src/libsystemd/sd-bus/sd-bus.c:cpp/unbounded-write
-src/test/test-fs-util.c:cpp/path-injection
input: sarif-results/cpp.sarif
output: sarif-results/cpp.sarif
- name: Upload SARIF results
uses: github/codeql-action/upload-sarif@0c670bbf0414f39666df6ce8e718ec5662c21e03
with:
sarif_file: sarif-results/${{ matrix.language }}.sarif
The issue is that the two queries are behaving badly and producing a large number of paths for an alert. Our team is still diagnosing why. This is producing a sarif file with a large number of threadflow steps. Our code-scanning processor has trouble with these kinds of files and it rejects them.
The issue is that the two queries are behaving badly and producing a large number of paths for an alert. Our team is still diagnosing why. This is producing a sarif file with a large number of threadflow steps. Our code-scanning processor has trouble with these kinds of files and it rejects them.
Ah, I see, thank you for the explanation. I also just noticed how many alerts the cpp/unbounded-write check generates (and by a quick look they don't seem quite useful for us anyway), so I might end up going with your suggestion and disabling the check completely . For some reason I thought we had this one enabled on LGTM as well, but it turns out that wasn't the case.
Ah, I see, thank you for the explanation. I also just noticed how many alerts the
cpp/unbounded-writecheck generates (and by a quick look they don't seem quite useful for us anyway), so I might end up going with your suggestion and disabling the check completely . For some reason I thought we had this one enabled on LGTM as well, but it turns out that wasn't the case.
If you have time, could you create an issue on github/codeql with examples of alerts you don't find useful? Perhaps the team can improve the query to improve the quality of the result.
Ah, I see, thank you for the explanation. I also just noticed how many alerts the
cpp/unbounded-writecheck generates (and by a quick look they don't seem quite useful for us anyway), so I might end up going with your suggestion and disabling the check completely . For some reason I thought we had this one enabled on LGTM as well, but it turns out that wasn't the case.If you have time, could you create an issue on
github/codeqlwith examples of alerts you don't find useful? Perhaps the team can improve the query to improve the quality of the result.
That's definitely a good idea. After enabling the security-enhanced and security-and-quality query sets, they spawned around ~1400 new alerts where most of them were false positives - mostly a couple of the same alerts spread through the entire codebase. Once I have a bit of spare time I'll go through them, prepare some minimal examples, and report them to github/codeql.