Add a retry mechanism for VM migration
Description
This PR adds a retry mechanism for VM migration, controlled by two settings: the number of retries and the wait time between retries. The initial VM deployment process already has such retries, which help work around hypervisor issues and race conditions when allocating resources.
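As a rough illustration of the mechanism described above, here is a minimal, self-contained sketch of a retry loop driven by two values. It is not the code from this PR: the class name `MigrationRetrySketch`, the method `runWithRetries`, and the parameters `maxRetries` and `waitMillis` are hypothetical stand-ins for the two settings, and the "migration" is simulated by a lambda.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and structure are assumptions, not the PR's code.
public class MigrationRetrySketch {

    /**
     * Runs the operation once, then up to maxRetries more times if it throws,
     * sleeping waitMillis between attempts. With maxRetries = 0 the operation
     * runs exactly once, so there is no retry at all.
     */
    public static <T> T runWithRetries(Callable<T> operation, int maxRetries, long waitMillis)
            throws Exception {
        int retries = Math.max(0, maxRetries); // treat negative values as "no retry"
        Exception lastFailure = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                lastFailure = e;
                if (attempt < retries) {
                    Thread.sleep(waitMillis); // wait before the next attempt
                }
            }
        }
        throw lastFailure; // every attempt failed; surface the last error
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger attempts = new AtomicInteger();
        // Simulated migration that fails on the first two attempts.
        String result = runWithRetries(() -> {
            if (attempts.incrementAndGet() < 3) {
                throw new IllegalStateException("simulated migration failure");
            }
            return "migrated";
        }, 3, 10L);
        System.out.println(result + " after " + attempts.get() + " attempts");
    }
}
```

In this sketch a retry count of 0 degenerates to a single attempt, which is one way a default could leave existing behaviour unchanged. Whether the PR implements its default this way is an assumption.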
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
Feature/Enhancement Scale or Bug Severity
This change is trivial.
Feature/Enhancement Scale
- [ ] Major
- [X] Minor
Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [ ] Major
- [ ] Minor
- [X] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
Codecov Report
Attention: Patch coverage is 6.45161% with 29 lines in your changes missing coverage. Please review.
Project coverage is 12.70%. Comparing base (2e6100d) to head (226ea94). Report is 755 commits behind head on main.
:exclamation: Current head 226ea94 differs from pull request most recent head 1fa563e
Please upload reports for the commit 1fa563e to get more accurate results.
| Files | Patch % | Lines |
|---|---|---|
| .../src/main/java/com/cloud/vm/UserVmManagerImpl.java | 6.45% | 29 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #7383 +/- ##
============================================
- Coverage 14.40% 12.70% -1.70%
+ Complexity 10111 8691 -1420
============================================
Files 2748 2729 -19
Lines 259390 256623 -2767
Branches 40381 39997 -384
============================================
- Hits 37365 32607 -4758
- Misses 217190 219866 +2676
+ Partials 4835 4150 -685
@shwstppr The default value has been changed to respect the original behaviour. VM migration is already tested in the vm_lifecycle test suite. Is there anything else that could prevent this from moving forward?
Thanks for addressing the suggestion. I don't think the VM migration tests in the vm_lifecycle test suite can exercise retries. Have you done any testing where migration failed first but succeeded on a retry?
@blueorangutan package
@shwstppr a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: el9 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 5961
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
@blueorangutan package
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: el9 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 7582
@blueorangutan test
@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
[SF] Trillian test result (tid-8193)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 45280 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7383-t8193-kvm-centos7.zip
Smoke tests completed. 114 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_02_upgrade_kubernetes_cluster | Failure | 592.74 | test_kubernetes_clusters.py |
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
Hi @benj-n, please check and resolve the conflicts. Can we target this PR for 4.19.1?
Ping @benj-n, can you check and address the comments? Thanks.
@benj-n, I moved this to the next major release as there has been little activity on it. cc @JoaoJandre
I think it is almost done though.
Hi @benj-n, please check and resolve any conflicts in the branch.
@benj-n, will you still be looking at this?
@benj-n, closing this one as it has conflicts and is old. Please update and reopen if you think it is still relevant.








