Propagate rv to tLogs on version vector recovery
This PR fixes many of the "quiet database" errors in simulation tests running version vector, caused by peeks waiting for versions that never arrive during recovery.
in version vector, a locked tLog may be requested for a version beyond what the tLog has. During recovery tLogs are peeked up to the RV, but this can higher than logData->version, as tLogs in version vector advance at different rates.
To avoid deadlocking, return with the end version received in the request, which is the RV if set to a valid version during recovery. Do not do this if the version is not set or is not an valid version. For example, the SS may also be peeking tLogs during recovery, but they may not yet know the RV; they set the end version to infinity.
The PR modifies log routers instantiated for recovery to receive the RV in their initiation message, as do tLogs. This ensures they send a valid end version (RV) during recovery in peek messages.
This PR removes the previous solution, in which RPCs were sent to tLogs during recovery with the RV.
Joshua
20241008-140414-dlambrig-a9091fcbb91cc075 (commit 7)
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
- [x] The PR has a description, explaining both the problem and the solution.
- [x] The description mentions which forms of testing were done and the testing seems reasonable.
- [x] Every function/class/actor that was touched is reasonably well documented.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
- [ ] This change/bugfix is a cherry-pick from the next younger branch (younger
release-branchormainif this is the youngest branch) - [ ] There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:06:24
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /opt/homebrew/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:11:09
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:16:50
- Result: :x: FAILED
- Error:
Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:07:00
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /opt/homebrew/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:11:32
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:15:21
- Result: :x: FAILED
- Error:
Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-arm on Linux CentOS 7
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:53:04
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:53:51
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 0:54:52
- Result: :x: FAILED
- Error:
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 16f7fb217dd100c574a9baf19563663ccc468ae4
- Duration 1:05:52
- Result: :x: FAILED
- Error:
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:56:01
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:57:06
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-arm on Linux CentOS 7
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 0:57:18
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: be140c5c2602c6bf7ef9c867a3ff7b19f425024d
- Duration 1:07:23
- Result: :x: FAILED
- Error:
Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:07:35
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /opt/homebrew/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:12:32
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:17:57
- Result: :x: FAILED
- Error:
Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:50:14
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-arm on Linux CentOS 7
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:52:10
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 0:56:18
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: bcad4ec0b265dcd93dbfbcdb646a92895b8ab944
- Duration 1:05:32
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 0:06:31
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /opt/homebrew/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-macos on macOS Ventura 13.x
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 0:11:14
- Result: :x: FAILED
- Error:
Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1 - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 0:21:17
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-clang-arm on Linux CentOS 7
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 0:50:59
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 0:59:00
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
- Cluster Test Logs zip file of the test logs (available for 30 days)
Result of foundationdb-pr-clang on Linux CentOS 7
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 1:13:37
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Result of foundationdb-pr on Linux CentOS 7
- Commit ID: 4c24394e161a5210998651d619068d9956346278
- Duration 1:16:05
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)
Does this PR handle all (known) race conditions (or, are there any cases that need follow up PRs)? Thanks!
Result of foundationdb-pr-clang-ide on Linux CentOS 7
- Commit ID: a33504aee44d6f93cf93f6fd35c26f5ea155641b
- Duration 0:21:17
- Result: :white_check_mark: SUCCEEDED
- Error:
N/A - Build Log terminal output (available for 30 days)
- Build Workspace zip file of the working directory (available for 30 days)