Audit storage: validate consistency of replica and shard location metadata
Introduction
AuditStorage audits the system's data storage by checking the data for consistency. It is triggered when a client requests an audit or when a consistency check is required at the end of simulation.
When a client issues an auditStorage request, the request is first received by CC, which forwards it to DD. DD is responsible for processing audit requests and checks for ongoing audits before handling a new one. If there is an ongoing audit with the same audit type and range as the new request, DD obtains the audit ID of that audit. If there is an ongoing but unrelated audit, DD returns an error indicating that the system is busy, because currently only one ongoing audit is allowed at a time. If there is no ongoing audit, DD creates a new audit and persists its state.
AuditStorage requests are asynchronous: DD immediately replies to CC with the audit ID. If DD is unable to get and persist the result, it captures the failure and automatically retries the audit until one of three outcomes is reached: (1) the maximum number of retry attempts is exceeded, resulting in a "Failed" result; (2) the audit completes without any errors, resulting in a "Complete" result; or (3) the audit completes with errors detected, resulting in an "Error" result.
In some cases, CC might not know whether the request has been delivered to DD, for example when DD restarts after CC sends an audit request. In such cases, CC replies to the client with "request_maybe_delivered". The client can then issue a new audit request if necessary.
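The request handling and retry outcomes described above can be sketched as follows. This is a minimal in-memory illustration, not the actual DD code: the class and function names (`DataDistributor`, `handle_request`, `run_with_retries`, `SystemBusy`) and the audit type strings are hypothetical, and persistence is replaced by a plain attribute.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Tuple


class AuditPhase(Enum):
    RUNNING = "Running"
    COMPLETE = "Complete"  # finished, no inconsistency found
    ERROR = "Error"        # finished, inconsistency detected
    FAILED = "Failed"      # retry budget exhausted


@dataclass
class Audit:
    audit_id: int
    audit_type: str
    key_range: Tuple[str, str]
    phase: AuditPhase = AuditPhase.RUNNING


class SystemBusy(Exception):
    """A different audit is already running."""


class DataDistributor:
    """Hypothetical stand-in for DD; state is kept in memory, not persisted."""

    def __init__(self) -> None:
        self.ongoing: Optional[Audit] = None
        self.next_id = 1

    def handle_request(self, audit_type: str, key_range: Tuple[str, str]) -> int:
        cur = self.ongoing
        if cur is not None and cur.phase is AuditPhase.RUNNING:
            if (cur.audit_type, cur.key_range) == (audit_type, key_range):
                return cur.audit_id  # same type and range: reuse the audit ID
            raise SystemBusy("only one ongoing audit is allowed at a time")
        # No ongoing audit: create one (a real DD would persist it here).
        self.ongoing = Audit(self.next_id, audit_type, key_range)
        self.next_id += 1
        return self.ongoing.audit_id


def run_with_retries(audit: Audit, attempt: Callable[[], bool], max_retries: int) -> AuditPhase:
    """attempt() returns True for a clean audit, False when inconsistencies
    are found, and raises RuntimeError when the attempt itself fails."""
    for _ in range(max_retries + 1):
        try:
            clean = attempt()
        except RuntimeError:
            continue  # capture the failure and retry
        audit.phase = AuditPhase.COMPLETE if clean else AuditPhase.ERROR
        return audit.phase
    audit.phase = AuditPhase.FAILED
    return audit.phase
```

In this sketch a repeated request for the same type and range returns the same ID, a request for a different audit raises `SystemBusy` while one is running, and a new audit can be created once the previous one has left the running phase.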
The following design principles were followed when developing AuditStorage: (1) an audit request generates an AuditStorage; (2) an AuditStorage automatically retries on failure; (3) no component of AuditStorage may block or kill SS or DD; (4) AuditStorage must be retriable, that is, able to make progress by retrying.
A large audit is partitioned into tasks that are assigned to SSes. Each SS runs its assigned tasks until it completes all of them or one fails. After completing each task, the SS persists its progress. If a task fails, the SS notifies DD, and DD loads the progress made by the SS and resends the remaining tasks to it.
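The partition, persist, and resend cycle can be illustrated with a small sketch. Everything here is an assumption made for illustration: integer key ranges, the `partition` and `audit_on_server` helpers, the `StorageServer` class, and the in-memory `progress` set that stands in for persisted progress.

```python
from typing import List, Optional, Set, Tuple

Range = Tuple[int, int]


def partition(begin: int, end: int, task_size: int) -> List[Range]:
    """Split [begin, end) into consecutive tasks of at most task_size keys."""
    return [(b, min(b + task_size, end)) for b in range(begin, end, task_size)]


class StorageServer:
    """Hypothetical SS that fails once on a chosen task."""

    def __init__(self, fail_on: Optional[Range] = None) -> None:
        self.fail_on = fail_on
        self.executed: List[Range] = []

    def run(self, tasks: List[Range], progress: Set[Range]) -> bool:
        for task in tasks:
            if task == self.fail_on:
                self.fail_on = None  # fail only the first time
                return False         # stands in for notifying DD
            self.executed.append(task)
            progress.add(task)       # persist progress after each task
        return True


def audit_on_server(ss: StorageServer, tasks: List[Range], max_retries: int) -> Tuple[bool, Set[Range]]:
    """DD side: resend only the tasks that are not yet recorded as done."""
    progress: Set[Range] = set()
    remaining = list(tasks)
    for _ in range(max_retries + 1):
        if ss.run(remaining, progress):
            return True, progress
        remaining = [t for t in tasks if t not in progress]
    return False, progress
```

Because progress is recorded per task, a retry in this sketch never re-runs a task that already completed, which is what lets the audit make progress by retrying.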
Current limitations: (1) TSSes are not covered; (2) a bad assignment that is written consistently to the metadata is not detected. For example, if DD assigns an empty SS or a removed SS in both KeyServer and ServerKey, the current implementation of AuditStorage cannot detect this bad assignment.
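The second limitation follows from the nature of a cross-check: comparing two metadata maps only reveals where they disagree. The sketch below is not the AuditStorage implementation. It assumes a simplified model in which KeyServer maps a range to a set of servers and ServerKey maps a server to a set of ranges, and the helper name `find_mismatches` is hypothetical.

```python
from typing import Dict, Set, Tuple

Range = Tuple[str, str]


def find_mismatches(
    key_servers: Dict[Range, Set[str]],
    server_keys: Dict[str, Set[Range]],
) -> Set[Tuple[Range, str]]:
    """Return (range, server) pairs present in one map but not the other."""
    forward = {(rng, ss) for rng, servers in key_servers.items() for ss in servers}
    reverse = {(rng, ss) for ss, ranges in server_keys.items() for rng in ranges}
    return forward ^ reverse  # symmetric difference: disagreements only


# A removed SS recorded in both maps is invisible to the comparison.
rng = ("a", "m")
both_bad = find_mismatches({rng: {"removed_ss"}}, {"removed_ss": {rng}})
one_sided = find_mismatches({rng: {"removed_ss"}}, {})
```

Here `both_bad` is empty even though the assignment is wrong, while `one_sided` contains the offending pair. Detecting the first case would require an additional source of truth, such as the set of live storage servers.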
AuditStorageTest 100k: 20230429-063732-zhewang-b06297784516243e compressed=True data_size=32945656 duration=3028909 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:45:18 sanity=False started=100000 stopped=20230429-072250 submitted=20230429-063732 timeout=5400 username=zhewang
100k correctness: 20230429-063413-zhewang-422a958e38f9fa0b compressed=True data_size=32915465 duration=5142264 ended=100000 fail=2 fail_fast=10 max_runs=100000 pass=99998 priority=100 remaining=0 runtime=1:19:41 sanity=False started=100000 stopped=20230429-075354 submitted=20230429-063413 timeout=5400 username=zhewang
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
- [ ] The PR has a description, explaining both the problem and the solution.
- [ ] The description mentions which forms of testing were done and the testing seems reasonable.
- [ ] Every function/class/actor that was touched is reasonably well documented.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
- [ ] This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
- [ ] There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: fd3a91a61afd02801b76d2e077d53cfcbe3e8b29
- Duration 0:04:25
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: fd3a91a61afd02801b76d2e077d53cfcbe3e8b29
- Duration 0:04:24
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: fd3a91a61afd02801b76d2e077d53cfcbe3e8b29
- Duration 0:04:24
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: fd3a91a61afd02801b76d2e077d53cfcbe3e8b29
- Duration 0:04:27
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 0:09:35
- Result: :x: FAILED
- Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 0:28:17
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 0:36:29
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 0:37:31
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 1:06:48
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Duration 1:21:14
- Result: :x: FAILED
- Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Doxense CI Report for Windows 10
- Commit ID: 4bb084b891a5be7a93666db4a33d53313e3ade05
- Result: :x: FAILED
- Build Logs (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 17a0200fd3e5b5fe9ca4b087102950c681d234cb
- Duration 0:04:18
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 17a0200fd3e5b5fe9ca4b087102950c681d234cb
- Duration 0:04:17
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 17a0200fd3e5b5fe9ca4b087102950c681d234cb
- Duration 0:04:15
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 17a0200fd3e5b5fe9ca4b087102950c681d234cb
- Duration 0:04:19
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Doxense CI Report for Windows 10
- Commit ID: 17a0200fd3e5b5fe9ca4b087102950c681d234cb
- Result: :heavy_check_mark: SUCCEEDED
- Build Logs (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 28811e6060642caf136f496302e93174c1abe84f
- Duration 0:04:07
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 28811e6060642caf136f496302e93174c1abe84f
- Duration 0:04:18
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 28811e6060642caf136f496302e93174c1abe84f
- Duration 0:04:22
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 28811e6060642caf136f496302e93174c1abe84f
- Duration 0:04:24
- Result: :x: FAILED
- Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 0:08:03
- Result: :x: FAILED
- Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 0:28:47
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 0:34:16
- Result: :x: FAILED
- Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 0:37:14
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Doxense CI Report for Windows 10
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Result: :heavy_check_mark: SUCCEEDED
- Build Logs (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 0:54:43
- Result: :x: FAILED
- Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 29ef692e56a55394c4b4a2e4452e123605c6ec9a
- Duration 1:24:27
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 79d35af45d679d6a5246149a7d0a7c1f2bd8f65e
- Duration 0:08:15
- Result: :x: FAILED
- Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Doxense CI Report for Windows 10
- Commit ID: 79d35af45d679d6a5246149a7d0a7c1f2bd8f65e
- Result: :heavy_check_mark: SUCCEEDED
- Build Logs (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 79d35af45d679d6a5246149a7d0a7c1f2bd8f65e
- Duration 0:42:01
- Result: :white_check_mark: SUCCEEDED
- Error: N/A
- Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)