New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Indexed completionMode to Job API #98441
Conversation
/hold |
1ef63c6
to
778da29
Compare
/retest |
/assign @soltysh |
Keeping the hold until implementation PR (TODO) is also approved |
cc @ahg-g |
447128f
to
0ee7ce4
Compare
// CompletionMode specifies how Pod completions of a Job are tracked. | ||
type CompletionMode string | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting that the comment describes completion from the pod perspective. Since the mode is a job level parameter, it might be clearer if we describe it from the job perspective?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggested wording?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps:
NonIndexedMode: a job is completed when the number of successfully completed pods is equal to Completions
, if completions is nil, then the success of any pod signals the success of the job.
IndexedMode: a job is completed when at least one pod for each unique index between 0 to Completions - 1
successfully completed.
When we start describing the mode from the perspective of the job, I think it becomes more important to document the restrictions on Completions/Parallelism values here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, comments for constants don't make it to the reference documentation https://kubernetes.io/docs/reference/kubernetes-api/
So leaving the comments here lean and putting all details in the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, that is bad!
pkg/apis/batch/types.go
Outdated
NonIndexedCompletion CompletionMode = "NonIndexed" | ||
|
||
// IndexedCompletion means that each Pod completion of a Job is tracked | ||
// individually, being associated to a completion index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there restrictions on what values Parallelism
and Completions
can take when using this mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I'm keeping them in the field, as that's where people will take a look first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be not in this case, but a field can exist in multiple places, and so we wouldn't repeat the description of each value on each field. The documentation is for the type, not really for the field.
// more consecutive numbers are compressed and represented by the first and | ||
// last element of the series, separated by a hyphen. | ||
// For example, if the completed indexes are 1, 3, 4, 5 and 7, they are | ||
// represented as "1,3-5,7". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mention the limit on this field if there is one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no limit here. It's just bounded by parallelism.
pkg/apis/batch/types.go
Outdated
// | ||
// This field is alpha-level and is only honored by servers that enable the | ||
// IndexedJob feature gate. More completion modes can be added in the future. | ||
// If the Job controller observes a mode that it doesn't recognize, it manages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not reject it at validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do reject it at validation. But an apiserver can temporarily be at a newer version than the controller. Then the controller might see a mode that it doesn't recognize.
This note is a safeguard to allow the addition of new modes in the future. In production clusters, the situation described wouldn't happen, as we first release new modes with a disabled feture gate.
But actually, it's safer to skip the sync. Changing.
@@ -31,6 +31,8 @@ import ( | |||
apivalidation "k8s.io/kubernetes/pkg/apis/core/validation" | |||
) | |||
|
|||
const maxParallelismForIndexedJob = 100000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment explaining why parallelism is now capped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
looks good to me |
89a0d09
to
5f364c5
Compare
45865c3
to
5ed5e6c
Compare
/retest |
/retest |
5ed5e6c
to
fa66b11
Compare
/retest |
1 similar comment
/retest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, erictune, lavalamp, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
/retest Review the full test history for this PR. Silence the bot with an |
/retest |
1 similar comment
/retest |
And IndexedJob feature gate, disabled by default. Update JobDescriber
fa66b11
to
a1a5868
Compare
/retest |
/lgtm |
What type of PR is this?
/kind feature
/kind api-change
/sig apps
/area workload-api/job
/area batch
What this PR does / why we need it:
Add the
.spec.completionMode
field with accepted valuesNonIndexed
(default) andIndexed
.Also requires a feature gate
IndexedJob
and the.status.completedIndexes
field in Job.Which issue(s) this PR fixes:
Ref #97169 #14188 kubernetes/enhancements#2214
Special notes for your reviewer:
This PR only contains API changes. Changes to Job controller will follow up in another PR.
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: