Polling worker Tentacles are slower when running many concurrent deployments (or concurrent steps).
Severity
No response
Version
2023.3.8103 (with feature toggle) or since at least 2023.4.1479
Latest Version
None
What happened?
The problem
When running concurrent deployments which all use a single polling worker, deployments take longer.
Context:
- Polling tentacles are limited to a single TCP connection to Octopus Server and can do at most one RPC at a time.
- ScriptServiceV2 introduced a concept of waiting for the script to finish. see also PR
- Octopus is using that durationToWaitForScriptToFinish with a value of 5s.
- Tentacle workers usually run many scripts concurrently for many deployments concurrently.
Cause
The 5s delay set on durationToWaitForScriptToFinish creates a bottleneck when starting scripts: multiple deployments may each send a script to be executed by the worker, but because the polling worker can do only one RPC at a time, each StartScript call must wait up to 5s for every other script the worker is already starting.
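The queueing effect can be sketched with a small simulation. This is an illustrative model, not Octopus or Halibut code: the function name start_script_latencies_ms and its parameters are hypothetical, and it assumes every StartScript call is queued at the same moment on a single connection that serves one RPC at a time.

```python
from typing import List, Optional


def start_script_latencies_ms(
    script_durations_ms: List[int],
    wait_for_finish_ms: Optional[int],
) -> List[int]:
    """Model StartScript latency on one connection that serves one RPC at a time.

    Assumption (hypothetical model): all calls are queued at t=0, and each
    call holds the connection until its script finishes or the wait elapses,
    whichever comes first. A wait of None means the call returns immediately.
    """
    latencies = []
    connection_free_at = 0
    for duration in script_durations_ms:
        if wait_for_finish_ms is None:
            hold = 0  # no waiting: the call returns as soon as the script is started
        else:
            hold = min(duration, wait_for_finish_ms)
        connection_free_at += hold
        # The caller sees its own hold time plus the time spent queued behind others.
        latencies.append(connection_free_at)
    return latencies


if __name__ == "__main__":
    # Two long-running scripts queued ahead of one that finishes in 1300ms.
    print(start_script_latencies_ms([30_000, 30_000, 1_300], 5_000))
    print(start_script_latencies_ms([30_000, 30_000, 1_300], None))
```

Under this model, with a 5s wait the third call completes after 11300ms (two 5000ms waits ahead of it plus its own 1300ms), while with the wait set to None every call returns without queueing delay. Real calls also carry network and processing overhead, which the model ignores.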
Suggested Fix
Set durationToWaitForScriptToFinish to null when starting scripts on polling workers.
Reproduction
Create many deployments to a single polling worker, each with a mix of many steps, where each step takes longer than 5 seconds.
Error and Stacktrace
Start Octopus Server with the environment variable: OCTOPUS__Feature__LogTentacleRpcTimedOperationsWhenLongerThan_ms=0
Run the deployments as above.
Look for logs like:
Halibut RPC to tentacle calling IScriptServiceV2.StartScript succeeded after 11321ms.
That call took 11.3s, which suggests it had to wait for two other scripts to start (2 x 5s) and then itself took 1.3s to run the script, making it roughly 9x slower than the 1.3s it needed.
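To find affected calls in a larger log, a short script can pull out the slow RPCs. This is a minimal sketch that assumes every relevant log line contains text in the same shape as the sample above; real log lines may carry timestamps or other prefixes, which the regex ignores. The names slow_rpcs and estimated_calls_ahead are hypothetical helpers, and the estimate simply mirrors the reasoning above (one 5s wait per call queued ahead).

```python
import re
from typing import Iterable, List, Tuple

# Pattern based on the single sample log line in this issue (an assumption).
RPC_LOG = re.compile(
    r"Halibut RPC to tentacle calling (?P<method>\S+) succeeded after (?P<ms>\d+)ms"
)


def slow_rpcs(lines: Iterable[str], threshold_ms: int = 5000) -> List[Tuple[str, int]]:
    """Return (method, duration_ms) for RPCs that took at least threshold_ms."""
    hits = []
    for line in lines:
        match = RPC_LOG.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append((match.group("method"), int(match.group("ms"))))
    return hits


def estimated_calls_ahead(duration_ms: int, wait_ms: int = 5000) -> int:
    """Rough guess at how many full waits were spent queued behind other calls."""
    return duration_ms // wait_ms


if __name__ == "__main__":
    sample = [
        "Halibut RPC to tentacle calling IScriptServiceV2.StartScript succeeded after 11321ms.",
        "Halibut RPC to tentacle calling IScriptServiceV2.StartScript succeeded after 800ms.",
    ]
    for method, ms in slow_rpcs(sample):
        print(method, ms, estimated_calls_ahead(ms))
```

For the sample line, the estimate is two calls ahead, matching the 2 x 5s reading. The estimate is only a heuristic: a call whose own script runs longer than 5s also spends a full wait on itself.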
More Information
No response
Workaround
Configure the polling Tentacle to poll Octopus Server over many TCP connections by specifying other URLs/ports to poll. Hosted users can additionally poll the standard 443 port
by configuring their Tentacle with:
/path/to/tentacle/Tentacle poll-server --server https://<yoururl>.octopus.app --apikey=API-APIKEY01 --server-comms-address "https://polling.<yoururl>.octopus.app" --server-comms-port=443