bug 2093051: use reflects to drive node status monitor to improve reliability

Open deads2k opened this issue 3 years ago • 8 comments

While chasing failures of "[bz-Machine Config Operator] Nodes should reach OSUpdateStaged in a timely fashion", such as the one in https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.11-e2e-aws-upgrade/1532346119462326272, I found the same informer-versus-reflector issue that we changed for events and pods. This change makes the node monitor match the pod monitor.

I suggest doing a sweep eventually.

/assign @dgoodwin

deads2k avatar Jun 02 '22 19:06 deads2k

This is a significant failure on some NURPs, so I suggest a soonish review: https://sippy.dptools.openshift.org/sippy-ng/tests/4.11/analysis?test=[bz-Machine%20Config%20Operator]%20Nodes%20should%20reach%20OSUpdateStaged%20in%20a%20timely%20fashion

deads2k avatar Jun 02 '22 19:06 deads2k

@deads2k: This pull request references Bugzilla bug 2093051, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

bug 2093051: use reflects to drive node status monitor to improve reliability

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci[bot] avatar Jun 02 '22 19:06 openshift-ci[bot]

/lgtm

dgoodwin avatar Jun 03 '22 10:06 dgoodwin

/retest-required

Remaining retests: 2 against base HEAD 099fac2d32b2eecfa4cb461936897df46ec6116d and 8 for PR HEAD 7eb2b7d32501cc6bf31a9337a28920d42d658197 in total

openshift-ci-robot avatar Jun 03 '22 11:06 openshift-ci-robot

/hold

while I double check results.

deads2k avatar Jun 03 '22 14:06 deads2k

New changes are detected. LGTM label has been removed.

openshift-ci[bot] avatar Jun 03 '22 20:06 openshift-ci[bot]

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, dgoodwin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Jun 03 '22 20:06 openshift-ci[bot]

@deads2k: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/verify-deps | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | true | /test verify-deps |
| ci/prow/e2e-gcp-ovn-rt-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-aws-single-node-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-aws-single-node-upgrade |
| ci/prow/e2e-gcp-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | true | /test e2e-gcp-upgrade |
| ci/prow/e2e-aws-single-node | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-aws-single-node |

openshift-ci[bot] avatar Jun 04 '22 00:06 openshift-ci[bot]

@deads2k: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/verify-deps | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | true | /test verify-deps |
| ci/prow/e2e-gcp-ovn-rt-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-aws-single-node-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-aws-single-node-upgrade |
| ci/prow/e2e-gcp-upgrade | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | true | /test e2e-gcp-upgrade |
| ci/prow/e2e-aws-single-node | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | false | /test e2e-aws-single-node |
| ci/prow/e2e-gcp-ovn-image-ecosystem | a8eabf0ab855c0a25cda3fbfec31b17736958526 | link | true | /test e2e-gcp-ovn-image-ecosystem |

openshift-ci[bot] avatar Nov 05 '22 00:11 openshift-ci[bot]

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot avatar Feb 04 '23 01:02 openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot avatar Mar 06 '23 08:03 openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-bot avatar Apr 06 '23 00:04 openshift-bot

@openshift-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] avatar Apr 06 '23 00:04 openshift-ci[bot]

@deads2k: This pull request references Bugzilla bug 2093051. The bug has been updated to no longer refer to the pull request using the external bug tracker.

In response to this:

bug 2093051: use reflects to drive node status monitor to improve reliability

openshift-ci[bot] avatar Apr 06 '23 00:04 openshift-ci[bot]