mds: wait for snapupdate to finish before notifying clients
When creating a snapshot, the primary MDS notifies the other MDSes first and then the clients. However, the other MDSes may receive the notification late, just after the clients have flushed the snapcap to them. In that case the other MDSes simply drop the snapcap flush request on the floor.
The primary MDS needs to wait until all the other MDSes have received the notification before it notifies the clients.
The same applies to the rmsnap and renamesnap ops.
Fixes: https://tracker.ceph.com/issues/56011
Signed-off-by: Xiubo Li [email protected]
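The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this PR: `PeerMDS`, `make_snapshot`, and the `wait_for_acks` flag are hypothetical names invented for illustration, and the "late notification" is modelled as the worst case where every peer receives the snap update only after the clients have flushed.

```python
from dataclasses import dataclass, field


@dataclass
class PeerMDS:
    """Toy stand-in for a non-primary MDS (hypothetical, not Ceph's class)."""
    known_snaps: set = field(default_factory=set)
    accepted: list = field(default_factory=list)
    dropped: list = field(default_factory=list)

    def handle_snap_update(self, snapid):
        # Record the snapshot and acknowledge the notification.
        self.known_snaps.add(snapid)
        return "ack"

    def handle_flushsnap(self, client, snapid):
        # A flush for a snapshot this MDS has not heard of yet is dropped.
        if snapid in self.known_snaps:
            self.accepted.append((client, snapid))
        else:
            self.dropped.append((client, snapid))


def make_snapshot(peers, clients, snapid, wait_for_acks):
    """Replay one toy mksnap against the peers, in delivery order."""
    pending_updates = [(peer, snapid) for peer in peers]

    if wait_for_acks:
        # Ordering with the wait: deliver every snap update and collect
        # its ack before any client is told about the snapshot.
        acks = [peer.handle_snap_update(s) for peer, s in pending_updates]
        assert len(acks) == len(peers)
        pending_updates = []

    # Clients learn about the snapshot and flush their snap caps to the peers.
    for client in clients:
        for peer in peers:
            peer.handle_flushsnap(client, snapid)

    # Ordering without the wait (worst case): the delayed snap updates
    # reach the peers only after the flushes have already been handled.
    for peer, s in pending_updates:
        peer.handle_snap_update(s)
```

In this model, calling `make_snapshot` with `wait_for_acks=False` leaves the client's flush in `dropped`, while `wait_for_acks=True` leaves it in `accepted`. The same reasoning would apply to the rmsnap and renamesnap cases.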
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
Resolved the conflicts.
jenkins test api
jenkins test make check
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
@lxbsz this needs a rebase
Done. Thanks for reminding.
> When making snapshot the MDS will notify the mds first and then the clients, but the other MDSes may get the notify late and just after the clients flush the snapcap to them. The MDSes will just drop the snapcap flush request to floor.

I think "When creating snapshot, the primary MDS will notify other mds first ..." would sound better; also "The other MDSs will just drop the snapcap flush request to the floor".

> We need to wait for a while and make sure all the other MDSes get notifications first.
> The same with rmsnap and renamesnap ops.
Revised it and thanks @mchangir
jenkins retest this please
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward. If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!
@lxbsz Please rebase and push - I'll try to have a look again.
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward. If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.
@lxbsz please rebase. Will put this to test.
Sorry @vshankar, I missed your last two comments about this. Rebased it now.
jenkins retest this please