New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JobDeletionDurationSeconds metric in TTLAfterFinished controller #98676
Conversation
/remove-kind api-change |
/assign @soltysh |
/assign @janetkuo |
a0be7db
to
ab069ff
Compare
/retest |
06a3830
to
c744af1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
TimeToDeletionSeconds = metrics.NewHistogram( | ||
&metrics.HistogramOpts{ | ||
Subsystem: TTLAfterFinishedSubsystem, | ||
Name: "time_to_deletion_seconds", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally durations have the suffix _duration_seconds
, such as http_request_duration_seconds
, or scheduling_duration_seconds
. This is to distinguish them from timestamps (e.g. process_start_time_seconds
).
The prefix is up to you, but time_to_deletion_duration_seconds
sounds a bit odd. Maybe deletion_duration_seconds
, or job_deletion_duration_seconds
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, changed to job_deletion_duration_seconds
/retest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, dashpole, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
}, | ||
{ | ||
name: "Job failed 10s ago, 15s TTL", | ||
failedTime: metav1.NewTime(now.Add(-10 * time.Second)), | ||
ttl: utilpointer.Int32Ptr(15), | ||
since: &now.Time, | ||
expectedTimeLeft: durationPointer(5), | ||
expectedExpireAt: now.Add(5 * time.Second), | ||
}, | ||
} | ||
|
||
for _, tc := range testCases { | ||
job := newJob(tc.completionTime, tc.failedTime, tc.ttl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Off the logic, but we may migrate this to a pure subtest style.
/lgtm |
/retest |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds a metric to track the time it took to delete a job by the ttl-after-finished controller.
Which issue(s) this PR fixes:
Part of kubernetes/enhancements#592
Special notes for your reviewer:
This is a requirement to graduate TTLAfterFinished to Beta: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/592-ttl-after-finish#monitoring-requirements
Does this PR introduce a user-facing change?: