Fixed inability to run more than one commit proxy in small configurations

Open oleg68 opened this issue 2 years ago • 126 comments

Problem statement

By default, the desired number of commit proxies is 3. But in small configurations with 3 stateless processes (and any number of transaction and storage processes), FoundationDB runs only one commit proxy process, which may cause performance issues under high write volume.

Attempts to change the number of proxies with configure commit_proxies=2 and so on do not help: they change only the Desired commit proxies value shown by the status command in fdbcli, not the actual number of commit proxies.

Steps to reproduce

  1. Create an FDB cluster
  2. Set up three stateless server processes and, for example, three transaction and three storage processes
  3. Execute fdbcli --exec "status json" | grep commit_proxy

Expected result

Three lines

                        "role" : "commit_proxy"
                        "role" : "commit_proxy"
                        "role" : "commit_proxy"

Actual result

Only one line

                        "role" : "commit_proxy"
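
As an alternative to grep, the check in step 3 can be sketched in Python by parsing the status document and counting roles per process. This is a minimal sketch, assuming the usual `status json` layout in which each entry under `cluster.processes` carries a `roles` list; the `count_role` helper and the sample document are hypothetical and only illustrate the idea.

```python
import json


def count_role(status: dict, role: str) -> int:
    """Count how many processes in a parsed `status json` document run `role`."""
    processes = status.get("cluster", {}).get("processes", {})
    return sum(
        1
        for proc in processes.values()
        for entry in proc.get("roles", [])
        if entry.get("role") == role
    )


# Hypothetical, heavily trimmed sample of `fdbcli --exec "status json"` output.
sample = json.loads("""
{
  "cluster": {
    "processes": {
      "proc-a": {"roles": [{"role": "commit_proxy"}, {"role": "resolver"}]},
      "proc-b": {"roles": [{"role": "grv_proxy"}]},
      "proc-c": {"roles": [{"role": "master"}]}
    }
  }
}
""")

commit_proxy_count = count_role(sample, "commit_proxy")
```

With real output, the same function could be applied to the JSON captured from fdbcli and compared against the desired number of commit proxies.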

Findings

As far as I understand, the list of worker candidates is determined by the ClusterControllerData::getWorkersForRoleInDatacenter method.

It returns no more than one worker because the condition id_used[it.first] <= minWorker.get().used is never satisfied when the number of stateless processes is small.
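
The effect of that condition can be illustrated with a simplified Python model. This is not the FoundationDB code: the `Worker` type, the `extra_candidates` function, and the usage counts are assumptions made for illustration, and the model keeps only the usage comparison quoted above (all processes are assumed to have equal fitness).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    used: int  # number of roles already placed on this process (assumed values)


def extra_candidates(workers, min_worker):
    """Model of the filter: keep another worker only if it is used no more
    than the worker that was already chosen (min_worker)."""
    return [
        w
        for w in workers
        if w.name != min_worker.name and w.used <= min_worker.used
    ]


# Three stateless processes. The first proxy went to the least-used process;
# the other two are assumed to already host other stateless roles.
workers = [Worker("p1", used=0), Worker("p2", used=1), Worker("p3", used=1)]
min_worker = workers[0]

extra = extra_candidates(workers, min_worker)  # empty: no additional proxies
```

In this model every remaining process is busier than the one already chosen, so the filter rejects all of them and only a single commit proxy ends up recruited.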

Proposal

  1. Fill the worker candidate list fitness_workers regardless of how heavily the workers are used.
  2. Fill the result worker list with the desired number of workers that have the lowest usage count.

The proposed PR implements these ideas. It

  • introduces the used_fitness local variable, which is set to the lowest fitness value among the available processes
  • adds filtering by used_fitness when the result list is filled, so that only workers with the lowest fitness value are used
  • eliminates the condition id_used[it.first] <= minWorker.get().used

After applying this PR, the desired number of commit proxies is recruited as long as there is a sufficient number of stateless processes.
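
The proposed selection can be sketched as a simplified Python model. It is an illustration under stated assumptions, not the code in the PR: the `Worker` type and `select_workers` function are hypothetical, a lower fitness number is assumed to mean a better fit, and only the name `used_fitness` is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    fitness: int  # assumed: lower value means a better fit for the role
    used: int     # number of roles already placed on this process


def select_workers(workers, amount):
    """Pick up to `amount` workers: best fitness only, least used first."""
    if amount <= 0 or not workers:
        return []
    # Lowest fitness value among all available candidates.
    used_fitness = min(w.fitness for w in workers)
    # Keep every candidate with that fitness, regardless of usage.
    best = [w for w in workers if w.fitness == used_fitness]
    # Prefer the least-used workers; sorted() is stable, so ties keep input order.
    return sorted(best, key=lambda w: w.used)[:amount]


workers = [
    Worker("p1", fitness=0, used=0),
    Worker("p2", fitness=0, used=1),
    Worker("p3", fitness=0, used=1),
    Worker("p4", fitness=1, used=0),  # worse fitness, never selected here
]

chosen = [w.name for w in select_workers(workers, 3)]
```

Because usage only orders the candidates instead of excluding them, the model returns three workers for the three-process case, which matches the behaviour the PR description claims.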

oleg68 avatar Jun 06 '23 08:06 oleg68

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:04:38
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:04:36
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:04:37
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:04:42
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:05:35
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: b756c1c94ba7b00c58752276fb37b39172c7a966
  • Duration 0:05:46
  • Result: :x: FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 08:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 0:20:07
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 10:06 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 0:30:06
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 10:06 foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 0:43:30
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 11:06 foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 0:47:10
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 11:06 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 1:04:06
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 11:06 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 06228ae4a97c46bc3664919c9387f2cdd47183b4
  • Duration 1:22:35
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci avatar Jun 06 '23 11:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: d92fa5ac965ec9ea518a3dd86a1f99eeb632ef93
  • Duration 0:22:31
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 14:06 foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: d92fa5ac965ec9ea518a3dd86a1f99eeb632ef93
  • Duration 0:46:17
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 15:06 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: d92fa5ac965ec9ea518a3dd86a1f99eeb632ef93
  • Duration 1:11:27
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 06 '23 15:06 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: d92fa5ac965ec9ea518a3dd86a1f99eeb632ef93
  • Duration 1:24:11
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci avatar Jun 06 '23 15:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 0:20:55
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 10:06 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 0:29:32
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 10:06 foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 0:45:56
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 11:06 foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 0:46:02
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 11:06 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 1:19:46
  • Result: :x: FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci avatar Jun 08 '23 11:06 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: a664b4e17327a37434bcccc6d7aeff5620473ec2
  • Duration 1:20:48
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 11:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 0:20:38
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 14:06 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 0:29:46
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 14:06 foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 0:42:50
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 14:06 foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 0:50:37
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 14:06 foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 1:01:33
  • Result: :x: FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 14:06 foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 0b9d42246f677773bd2be5e3814946e8e73825db
  • Duration 1:21:44
  • Result: :x: FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

foundationdb-ci avatar Jun 08 '23 15:06 foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 31d713a5d6909f8189ef088ae7d865f77610604f
  • Duration 0:21:42
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 15:06 foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 31d713a5d6909f8189ef088ae7d865f77610604f
  • Duration 0:30:02
  • Result: :white_check_mark: SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

foundationdb-ci avatar Jun 08 '23 15:06 foundationdb-ci