
Storage Filesystem as a First Class Feature

Open abh1sar opened this issue 1 year ago • 77 comments

Description

This PR implements Storage filesystem as a first class feature. https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+Filesystem+as+a+First+Class+Feature

Documentation PR: https://github.com/apache/cloudstack-documentation/pull/420

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [x] Major
  • [ ] Minor

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [ ] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

abh1sar avatar Jun 11 '24 03:06 abh1sar

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar Jun 11 '24 05:06 github-actions[bot]

Codecov Report

Attention: Patch coverage is 38.06986% with 1046 lines in your changes missing coverage. Please review.

Project coverage is 15.64%. Comparing base (72d0546) to head (c5214ac). Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...tack/storage/sharedfs/query/vo/SharedFSJoinVO.java | 1.70% | 114 Missing and 1 partial :warning: |
| ...mmand/user/storage/sharedfs/CreateSharedFSCmd.java | 0.00% | 108 Missing :warning: |
| ...oudstack/storage/sharedfs/SharedFSServiceImpl.java | 73.38% | 76 Missing and 31 partials :warning: |
| ...oudstack/storage/sharedfs/dao/SharedFSDaoImpl.java | 0.00% | 56 Missing :warning: |
| ...age/sharedfs/ChangeSharedFSServiceOfferingCmd.java | 0.00% | 51 Missing :warning: |
| ...mand/user/storage/sharedfs/RestartSharedFSCmd.java | 0.00% | 51 Missing :warning: |
| ...ache/cloudstack/api/response/SharedFSResponse.java | 59.84% | 49 Missing and 2 partials :warning: |
| ...apache/cloudstack/storage/sharedfs/SharedFSVO.java | 40.00% | 51 Missing :warning: |
| ...ommand/user/storage/sharedfs/StartSharedFSCmd.java | 0.00% | 48 Missing :warning: |
| ...sharedfs/lifecycle/StorageVmSharedFSLifeCycle.java | 68.99% | 28 Missing and 12 partials :warning: |
| ... and 30 more | | |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #9208      +/-   ##
============================================
+ Coverage     15.57%   15.64%   +0.06%     
- Complexity    12056    12110      +54     
============================================
  Files          5506     5535      +29     
  Lines        482997   485033    +2036     
  Branches      59483    62628    +3145     
============================================
+ Hits          75236    75887     +651     
- Misses       399450   400773    +1323     
- Partials       8311     8373      +62     
Flag Coverage Δ
uitests 4.12% <ø> (-0.05%) :arrow_down:
unittests 16.44% <38.06%> (+0.08%) :arrow_up:


codecov[bot] avatar Jun 21 '24 19:06 codecov[bot]

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar Jul 12 '24 08:07 github-actions[bot]

Hey @abh1sar

Good to see you actually have a specification for such a big feature. Sadly, we have too many PRs that change 2k+ lines with little to no specification.

I have a few suggestions regarding the feature:

  1. In the file_share table in the spec, I think we are missing a created column to track when the file_share was created. I would also add a removed column to track when it was removed.
  2. In your FSM, add a final state called Expunged, which is reached when the ExpungeOperation is successful.
  3. I know that using shell scripts or Python might be tempting, but these tend to be hard to maintain and expand without breaking. I would advise going back to the idea of doing something similar to what is done for the CPVM/SSVM.
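Suggestions 1 and 2 can be pictured with a minimal sketch. It is an illustration only: the state names other than Expunged, the transition table, and the class `FileShareFsmSketch` are hypothetical and are not taken from the PR. It shows a final Expunged state with no outgoing transitions, plus created and removed timestamps that mirror the proposed columns.

```java
import java.time.Instant;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the PR's real FSM and table layout may differ.
final class FileShareFsmSketch {
    // Only EXPUNGED comes from the review comment; the other names are assumed.
    enum State { ALLOCATED, READY, STOPPED, DESTROYED, EXPUNGING, EXPUNGED }

    // Allowed transitions. EXPUNGED has no outgoing edges, so it is final.
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.ALLOCATED, EnumSet.of(State.READY, State.DESTROYED));
        ALLOWED.put(State.READY, EnumSet.of(State.STOPPED, State.DESTROYED));
        ALLOWED.put(State.STOPPED, EnumSet.of(State.READY, State.DESTROYED));
        ALLOWED.put(State.DESTROYED, EnumSet.of(State.EXPUNGING));
        ALLOWED.put(State.EXPUNGING, EnumSet.of(State.EXPUNGED, State.DESTROYED));
        ALLOWED.put(State.EXPUNGED, EnumSet.noneOf(State.class));
    }

    private State state = State.ALLOCATED;
    private final Instant created; // mirrors the proposed "created" column
    private Instant removed;       // mirrors the proposed "removed" column; null until expunged

    FileShareFsmSketch(Instant created) {
        this.created = created;
    }

    // Moves to the next state, or throws if the transition is not in the table.
    void transitionTo(State next, Instant now) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
        if (next == State.EXPUNGED) {
            removed = now; // set only when the expunge succeeded
        }
    }

    State getState() { return state; }
    Instant getCreated() { return created; }
    Instant getRemoved() { return removed; }

    public static void main(String[] args) {
        FileShareFsmSketch share = new FileShareFsmSketch(Instant.now());
        share.transitionTo(State.READY, Instant.now());
        share.transitionTo(State.DESTROYED, Instant.now());
        share.transitionTo(State.EXPUNGING, Instant.now());
        share.transitionTo(State.EXPUNGED, Instant.now());
        System.out.println(share.getState() + " removed=" + share.getRemoved());
    }
}
```

Keeping the removed timestamp unset until the final state is reached means a failed expunge leaves the record visibly not removed.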

JoaoJandre avatar Jul 26 '24 13:07 JoaoJandre

@blueorangutan package

abh1sar avatar Jul 30 '24 11:07 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Jul 30 '24 11:07 blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10514

blueorangutan avatar Jul 30 '24 12:07 blueorangutan

@blueorangutan test

abh1sar avatar Jul 30 '24 20:07 abh1sar

@abh1sar a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

blueorangutan avatar Jul 30 '24 20:07 blueorangutan

@blueorangutan package

abh1sar avatar Aug 02 '24 04:08 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 02 '24 04:08 blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10543

blueorangutan avatar Aug 02 '24 05:08 blueorangutan

@blueorangutan package

abh1sar avatar Aug 02 '24 08:08 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 02 '24 08:08 blueorangutan

@weizhouapache The PR is ready for review. The documentation PR is pending; I will link it in a few days. I am also working on unit tests and integration tests.

abh1sar avatar Aug 02 '24 08:08 abh1sar

> Hey @abh1sar
>
> Good to see you actually have a specification for such a big feature. Sadly, we have too many PRs that change 2k+ lines with little to no specification.
>
> I have a few suggestions regarding the feature:
>
>   1. In the file_share table in the spec, I think we are missing a created column to track when the file_share was created. I would also add a removed column to track when it was removed.
>   2. In your FSM, add a final state called Expunged, which is reached when the ExpungeOperation is successful.
>   3. I know that using shell scripts or Python might be tempting, but these tend to be hard to maintain and expand without breaking. I would advise going back to the idea of doing something similar to what is done for the CPVM/SSVM.

Hi @JoaoJandre, I am afraid the specification is not up to date with the design changes that were discussed on the dev ML. Apologies for that. https://lists.apache.org/thread/rd4sborvzlpsw64o3mq257423ronnq53 I will update the spec as soon as possible.

1 and 2. The implementation already has the created and removed columns and the Expunged state.

3. We have moved away from the VR-based model of using shell scripts and Python code. Please refer to the ML discussion. The VM is deployed as a normal user VM, and the userdata is used to push some udev rules to the VM. These udev rules take care of the following operations:

  • Format and export the FS the first time a data volume is added to the VM
  • Resize the FS whenever the volume is resized
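The two operations above amount to a small decision rule, sketched below in Java purely for illustration. The real logic lives in udev rules pushed through userdata, which this thread does not show. The class `StorageVmUdevSketch`, its enums, and the `decide` parameters are all hypothetical. The sketch assumes a resize surfaces as a change event on the block device, and that a volume which already carries a filesystem is never reformatted.

```java
// Hypothetical model of the udev-driven behaviour; not code from the PR.
final class StorageVmUdevSketch {
    enum Event { ADD, CHANGE }                          // assumed device events
    enum Action { FORMAT_AND_EXPORT, RESIZE_FS, NONE }  // assumed outcomes

    // Picks what to do for a data volume when a device event arrives.
    static Action decide(Event event, boolean hasFilesystem, long fsBytes, long volumeBytes) {
        if (event == Event.ADD && !hasFilesystem) {
            // First time the data volume is attached: create and export the FS.
            return Action.FORMAT_AND_EXPORT;
        }
        if (event == Event.CHANGE && hasFilesystem && volumeBytes > fsBytes) {
            // The volume grew: grow the filesystem to match.
            return Action.RESIZE_FS;
        }
        // Assumption: anything else is left untouched to avoid data loss.
        return Action.NONE;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024L * 1024L;
        System.out.println(decide(Event.ADD, false, 0, 10 * gib));           // FORMAT_AND_EXPORT
        System.out.println(decide(Event.CHANGE, true, 10 * gib, 20 * gib));  // RESIZE_FS
        System.out.println(decide(Event.ADD, true, 10 * gib, 10 * gib));     // NONE
    }
}
```

How a reattached, already formatted volume gets exported again is not described in this comment, so the sketch returns NONE for that case.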

We can think about using the agent model in the future if required, but using userdata lets us implement the basic functionality while keeping communication between the management server and the VM very lightweight. Please let me know your thoughts.

abh1sar avatar Aug 02 '24 08:08 abh1sar

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10547

blueorangutan avatar Aug 02 '24 09:08 blueorangutan

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar Aug 06 '24 17:08 github-actions[bot]

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar Aug 07 '24 09:08 github-actions[bot]

@blueorangutan package

abh1sar avatar Aug 08 '24 20:08 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 08 '24 20:08 blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10598

blueorangutan avatar Aug 08 '24 21:08 blueorangutan

@blueorangutan test

abh1sar avatar Aug 09 '24 02:08 abh1sar

@abh1sar a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

blueorangutan avatar Aug 09 '24 02:08 blueorangutan

[SF] Trillian test result (tid-11047)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 54625 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9208-t11047-kvm-ol8.zip
Smoke tests completed. 138 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_redundant_vpc_site2site_vpn | Failure | 439.46 | test_vpc_vpn.py |

blueorangutan avatar Aug 09 '24 18:08 blueorangutan

@blueorangutan package

abh1sar avatar Aug 12 '24 03:08 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 12 '24 03:08 blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10611

blueorangutan avatar Aug 12 '24 04:08 blueorangutan

@blueorangutan package

abh1sar avatar Aug 13 '24 08:08 abh1sar

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 13 '24 08:08 blueorangutan