Supply step do not retry when error is connection reset by peer
What version of Cloud Foundry and CF CLI are you using? (i.e. What is the output of running cf curl /v2/info && cf version?
{
...
"version": 0,
"min_cli_version": null,
"min_recommended_cli_version": null,
"api_version": "2.203.0",
"osbapi_version": "2.15",
}
cf version 7.5.0+0ad1d63.2022-06-04
What version of the buildpack you are using?
We have multiple applications, the below versions are used currently.
Version: 1.8.5, 1.8.7 and 1.8.12
If you were attempting to accomplish a task, what was it you were attempting to do?
Package staging randomly fails at supply/finalize step when downloading Go without attempting any retry.
What did you expect to happen?
The curl command already has retry mechanism, and it should retry if the attempts encounter any network connection problems.
What was the actual behavior?
If the error is "connection reset by peer", the curl command do not retry and immediately return error
see https://stackoverflow.com/questions/42873285/curl-retry-mechanism
Sample log output from running the same curl command
2023-07-18T18:17:02.90+0000 [APP/TASK/eab4dfa6/0] ERR % Total % Received % Xferd Average Speed Time Time Time Current
2023-07-18T18:17:02.90+0000 [APP/TASK/eab4dfa6/0] ERR Dload Upload Total Spent Left Speed
2023-07-18T18:17:03.07+0000 [APP/TASK/eab4dfa6/0] ERR 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2023-07-18T18:17:03.74+0000 [APP/TASK/eab4dfa6/0] ERR 0 151M 0 142k 0 0 812k 0 0:03:10 --:--:-- 0:03:10 809k
2023-07-18T18:17:03.75+0000 [APP/TASK/eab4dfa6/0] ERR 33 151M 33 50.1M 0 0 59.3M 0 0:00:02 --:--:-- 0:00:02 59.3M
2023-07-18T18:17:03.75+0000 [APP/TASK/eab4dfa6/0] ERR curl: (56) OpenSSL SSL_read: Connection reset by peer, errno 104
2023-07-18T18:17:03.75+0000 [APP/TASK/eab4dfa6/0] OUT Exit: 56, Elapsed: 1
To reproduce, create a sample app(doesn't matter what app), then create a run-task that perpetually download the go binary
cf run-task app-1 --command 'while true; do start_time=$(date +%s); curl https://buildpacks.cloudfoundry.org/dependencies/go/go_1.19_linux_x64_cflinuxfs3_7e231ea5.tgz --location --retry 15 --retry-delay 2 --output /tmp/go.tgz; echo -n "Exit: $?, "; end_time=$(date +%s); elapsed=$(( end_time - start_time )); echo "Elapsed: $elapsed"; sleep 10; done'
Watch the app log output for errors.
Can you provide a sample app?
N/A
Please confirm where necessary:
- [x] I have included a log output
- [x] My log includes an error message
- [x] I have included steps for reproduction
One quick way to fix this is to use --retry-all-errors instead of --retry https://stackoverflow.com/questions/10285700/curl-error-recv-failure-connection-reset-by-peer-php-curl#:~:text=One%20way%20we%20can%20avoid,number%20of%20worker_connections%20in%20nginx.
Note that --retry-all-errors only exists since curl v7.71.0 changelog. Therefore, it cannot be used in cflinuxfs3, but it should be available in cflinuxfs4.
A fix that would address this behaviour for cflinuxfs4 stack is proposed: PR Add option --retry-all-errors when downloading go version
@ivanovac implemented a small go server, which simulates the "connection reset by peer" by sending RST to the client. With this we were able to confirm, that those types of transient TCP issues are ignored by the curl --retry mechanism and can be handled only by --retry-all-errors.