Bug 1920221: Allow test invokers to skip test waits before and after #98781
A number of e2e tests are useful to run after the system has been disrupted or while it is in the process of being disrupted, but the current suite and test logic block progress while waiting for all nodes to be healthy. By passing -1 to the --minStartupPods or --allowed-not-ready-nodes flag, the caller can bypass the wait logic before and after test suites that would otherwise prevent running e2e during disruption. This allows parts of the e2e suite to be used while the cluster is under duress to verify that controllers or components still function.
@smarterclayton: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: smarterclayton
/retest
Controlled skips of preflights make sense and don't impact callers that rely on them. /lgtm
A number of e2e tests are useful to run after the system has been disrupted or while it is in the process of being disrupted, but the current suite and test logic block progress while waiting for all nodes to be healthy.
By passing -1 to the --minStartupPods or --allowed-not-ready-nodes flag, the caller can bypass the wait logic before and after test suites that would otherwise prevent running e2e during disruption. This allows parts of the e2e suite to be used while the cluster is under duress to verify that controllers or components still function.
One specific example is testing clusters that have a number of nodes marked unschedulable or not ready and verifying that the system still functions.
In general, some of the hardcoded waits don't make sense on every Kubernetes distribution anyway (for example, those without pods in kube-system), so bypassing them may be useful for others who wrap e2e with their own logic. A caller should not need pods in kube-system to be conformant, for example.
This should not impact any existing callers of these APIs, because passing -1 previously either failed or waited forever.
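
For reviewers who want a picture of the -1 sentinel described above, here is a minimal, standalone sketch. It is not the code in this PR: `waitConfig`, `parseWaitFlags`, `skipPodWait`, and `skipNodeWait` are hypothetical names, the flag defaults of 0 are an assumption, and the sketch assumes each flag gates only its own wait. Only the flag names and the meaning of -1 come from the description.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// waitConfig is a hypothetical stand-in for the e2e framework's settings.
// Only the two flag names are taken from the PR description.
type waitConfig struct {
	MinStartupPods       int
	AllowedNotReadyNodes int
}

// parseWaitFlags reads the two flags from args.
// The default of 0 for both flags is an assumption made for this sketch.
func parseWaitFlags(args []string) (waitConfig, error) {
	var cfg waitConfig
	fs := flag.NewFlagSet("e2e-sketch", flag.ContinueOnError)
	fs.IntVar(&cfg.MinStartupPods, "minStartupPods", 0,
		"minimum number of pods to wait for; -1 skips the wait")
	fs.IntVar(&cfg.AllowedNotReadyNodes, "allowed-not-ready-nodes", 0,
		"number of nodes allowed to be not ready; -1 skips the wait")
	err := fs.Parse(args)
	return cfg, err
}

// skipPodWait reports whether the pod startup wait should be bypassed.
// Assumption: -1 on this flag affects only the pod wait.
func (c waitConfig) skipPodWait() bool { return c.MinStartupPods == -1 }

// skipNodeWait reports whether the node readiness wait should be bypassed.
// Assumption: -1 on this flag affects only the node wait.
func (c waitConfig) skipNodeWait() bool { return c.AllowedNotReadyNodes == -1 }

func main() {
	cfg, err := parseWaitFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	if cfg.skipPodWait() {
		fmt.Println("skipping wait for startup pods")
	} else {
		fmt.Printf("would wait for at least %d pods\n", cfg.MinStartupPods)
	}
	if cfg.skipNodeWait() {
		fmt.Println("skipping wait for node readiness")
	} else {
		fmt.Printf("would tolerate %d not-ready nodes\n", cfg.AllowedNotReadyNodes)
	}
}
```

Running the sketch with `--minStartupPods=-1 --allowed-not-ready-nodes=-1` takes both skip branches; any non-negative value keeps the corresponding wait in place, which matches the claim that existing callers are unaffected.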
/kind cleanup