Ensure lifecycle tasks wait for messages to be pushed
The lifecycle task pushes new entries to the bucket topic, but may commit before those entries are committed, which allows multiple lifecycle iterations to happen in parallel.
Issue: BB-641
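
A minimal sketch of the idea in the description above, not the actual change: the task only signals completion (which is what lets the consumer commit) once every pushed entry has been acknowledged. The helper name `pushAllThenDone` and the `producer.send(entries, cb)` interface are assumptions made for illustration.

```javascript
// Hypothetical helper: call `done` only once every pushed entry has been
// acknowledged by the producer, or as soon as one push fails.
function pushAllThenDone(producer, entries, done) {
    if (entries.length === 0) {
        done(null);
        return;
    }
    let pending = entries.length;
    let failed = false;
    entries.forEach(entry => {
        // Assumed interface: `cb` fires when the delivery report is received.
        producer.send([entry], err => {
            if (failed) {
                return; // `done` was already called with an error
            }
            if (err) {
                failed = true;
                done(err);
                return;
            }
            pending -= 1;
            if (pending === 0) {
                done(null); // safe to release the slot and commit
            }
        });
    });
}
```

With this shape, the consumer's slot stays occupied while pushes are in flight, so a new iteration cannot observe an empty pipeline too early.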
Codecov Report
Attention: Patch coverage is 85.71429% with 5 lines in your changes missing coverage. Please review.
Project coverage is 55.34%. Comparing base (61b9e9a) to head (0c257d7).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| extensions/lifecycle/tasks/LifecycleTaskV2.js | 88.46% | 3 Missing :warning: |
| extensions/lifecycle/tasks/LifecycleTask.js | 77.77% | 2 Missing :warning: |
Additional details and impacted files
| Files with missing lines | Coverage Δ | |
|---|---|---|
| extensions/lifecycle/tasks/LifecycleTask.js | 83.30% <77.77%> (+0.11%) | :arrow_up: |
| extensions/lifecycle/tasks/LifecycleTaskV2.js | 89.74% <88.46%> (+0.85%) | :arrow_up: |
... and 4 files with indirect coverage changes
| Components | Coverage Δ | |
|---|---|---|
| Bucket Notification | 18.51% <ø> (ø) | |
| Core Library | 61.90% <ø> (-0.23%) | :arrow_down: |
| Ingestion | 67.53% <ø> (ø) | |
| Lifecycle | 47.15% <85.71%> (+0.24%) | :arrow_up: |
| Oplog Populator | 84.20% <ø> (ø) | |
| Replication | 51.01% <ø> (-0.04%) | :arrow_down: |
| Bucket Scanner | 85.60% <ø> (ø) | |
@@ Coverage Diff @@
## development/8.6 #2603 +/- ##
===================================================
- Coverage 55.40% 55.34% -0.06%
===================================================
Files 198 198
Lines 12915 12928 +13
===================================================
Hits 7155 7155
- Misses 5750 5763 +13
Partials 10 10
| Flag | Coverage Δ | |
|---|---|---|
| api:retry | 9.62% <0.00%> (-0.01%) | :arrow_down: |
| api:routes | 9.51% <0.00%> (-0.01%) | :arrow_down: |
| bucket-scanner | 85.60% <ø> (ø) | |
| ingestion | 12.45% <0.00%> (-0.02%) | :arrow_down: |
| lib | 7.51% <0.00%> (-0.01%) | :arrow_down: |
| lifecycle | 19.44% <85.71%> (+0.08%) | :arrow_up: |
| notification | 0.88% <0.00%> (-0.01%) | :arrow_down: |
| replication | 18.87% <0.00%> (-0.13%) | :arrow_down: |
| unit | 5.13% <0.00%> (-0.01%) | :arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Request integration branches
Waiting for integration branch creation to be requested by the user.
To request integration branches, please comment on this pull request with the following command:
/create_integration_branches
Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.
> The message is already considered as locally consumed even before it reaches the queue processor queue.
That is a fair point (and may actually help with another issue), but I don't really see how this is a problem for this change: handling an entry in the bucket processor typically takes at least one second already (scanning and checking the state of every object), so we face this discrepancy anyway...
This change is simply about ensuring that we keep the "slot" until the entry is "fully" processed, instead of leaving many things pending, which can be an issue, especially since we are listing and pushing continuation messages.
What am I missing here?
> If we decide to block or wait synchronously for that delivery report every time we send a message, it will impact performance and throughput.
In theory, yes. In practice, however, since we process up to 1000 entries at a time, I wonder if this has a real impact: most of the reports would be received in the time we process each entry... (except for very small buckets, in which case throughput may not be so important)
It is certainly a trade-off, but consistent processing seems important as well. Or do you think it is completely safe to leave all these messages dangling and already start processing the next message(s)?
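
To make the trade-off concrete, here is a rough sketch, under assumptions, of waiting for delivery reports without blocking on each send: pushes are started as entries are processed, and the wait happens once at the end of the batch. `processBatch`, `processEntry` and `sendAsync` are hypothetical names; `sendAsync` is assumed to return a Promise that resolves when the delivery report arrives.

```javascript
// Hypothetical batch handler: overlaps entry processing with message delivery,
// and only resolves once every delivery report has been received.
async function processBatch(entries, processEntry, sendAsync) {
    const reports = [];
    for (const entry of entries) {
        // Start the push without awaiting it, so the next entry can be
        // processed while this delivery is still in flight.
        reports.push(sendAsync(processEntry(entry)));
    }
    // Single wait at the end of the batch, instead of one wait per message.
    await Promise.all(reports);
    return reports.length;
}
```

In this shape, the cost of waiting is roughly the delivery time of the last few messages rather than the sum of all deliveries, which is the situation described above for large batches.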
> What am I missing here?
My understanding was that the goal of this PR is to prevent multiple lifecycle iterations (triggered by Conductor) from running in parallel. I just pointed out that the lag is based on the “locally consumed” offset rather than on a processed or stored offset. So even if we wait for an entry to be fully processed, it won't stop the bucket-lifecycle topic lag from being zero while there are still other bucket messages in the pipeline.
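
As an illustration of this point (the offsets and names are made up, not taken from the code), a lag derived from the locally consumed offset can already be zero while the committed offset still trails behind:

```javascript
// Hypothetical lag computation for one partition of the bucket topic.
function computeLag(latestOffset, referenceOffset) {
    return Math.max(0, latestOffset - referenceOffset);
}

// Made-up offsets: everything was fetched locally, 20 messages not yet committed.
const partition = { latest: 120, consumed: 120, committed: 100 };

// Lag seen when the reference is the locally consumed offset.
const lagFromConsumed = computeLag(partition.latest, partition.consumed);
// Lag seen when the reference is the committed offset.
const lagFromCommitted = computeLag(partition.latest, partition.committed);

console.log({ lagFromConsumed, lagFromCommitted });
```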
> most of the reports would be received in the time we process each entry
Regarding the internal lifecycle listing, it does not necessarily return 1,000 objects; it only includes those that meet the specified criteria (prefix, age, etc.) from the next 10,000 entries. We might even end up with a listing response containing only a few objects, or none at all. NOTE: This 10,000-entry limit helps avoid placing excessive load on Metadata by preventing the evaluation of an unbounded number of entries.
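
A simplified sketch of that listing behavior, for illustration only: `listEligible`, its parameters and the in-memory array are assumptions, and the 1,000-result cap is inferred from the discussion rather than taken from the Metadata implementation.

```javascript
// Hypothetical listing: scan at most `maxScanned` entries and return only
// those matching the filter, capped at `maxKeys` results.
function listEligible(entries, isEligible, maxScanned = 10000, maxKeys = 1000) {
    const results = [];
    let scanned = 0;
    for (const entry of entries) {
        if (scanned >= maxScanned || results.length >= maxKeys) {
            break;
        }
        scanned += 1;
        if (isEligible(entry)) {
            results.push(entry);
        }
    }
    // More entries remain, so a continuation message would be needed.
    return { results, scanned, isTruncated: scanned < entries.length };
}

// 20,000 entries of which very few match: the page is nearly empty even
// though the full scan budget was spent.
const sample = Array.from({ length: 20000 }, (_, i) => i);
const page = listEligible(sample, i => i % 5000 === 0);
console.log(page.results.length, page.scanned, page.isTruncated);
```

A page like this carries very little processing work, so there is little time during which delivery reports for earlier messages could arrive.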
Incorrect fix version
The Fix Version/s in issue BB-641 contains:
- 8.6.56
- 9.0.19
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
- 8.6.57
- 9.0.19
- 9.1.1
Please check the Fix Version/s of BB-641, or the target
branch of this pull request.