[volumeScheduling/metrics] Fix buckets initialization #100720
Welcome @dntosas!
Hi @dntosas. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test.
Hola @lichuqiang @msau42! Any chance you could take a look at this? It's pretty small and quick :)
/ok-to-test
/lgtm
/retest
/hold
This metric is deprecated as of 1.19. Presumably it should be getting removed entirely in the upcoming release?
/triage accepted
Given that this metric is deprecated, what are your plans for this PR? Do you want to backport it? Ideally we would just remove it in the 1.23 release.
Yep, I was thinking of backporting this to the existing releases and then creating a new PR that removes the metric from the master branch. WDYT?
@logicalhan is the plan above acceptable? Arguably, the existing metric is broken, but changing the buckets is probably a breaking change. Right?
Changing buckets is actually fine; it's adding or removing labels that breaks people.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dntosas, logicalhan, msau42
/hold cancel
/retest
/retest
The failure seems unrelated. Opened #104463
/retest
As part of kubernetes#100720 we backported the fix to the existing releases, and in this commit we completely remove the deprecated metric from the master branch.
Signed-off-by: dntosas <ntosas@gmail.com>
…720-upstream-release-1.20 Automated cherry pick of #100720: Fix buckets initialization
…720-upstream-release-1.21 Automated cherry pick of #100720: Fix buckets initialization
…720-upstream-release-1.19 Automated cherry pick of #100720: Fix buckets initialization
What type of PR is this?
/kind bug
What this PR does / why we need it:
This metric is measured in seconds, so it makes no sense for its buckets
to start at 1000. This also breaks the scheduler e2e metric, leaving
users unable to compute, for example, their SLO for the scheduler.
Even if this metric is deprecated, it should behave correctly until it is
completely removed to avoid user confusion.
For example, for each volume created, the smallest value the metric can
report is about 16.7 minutes (1000 s / 60), which is clearly wrong.
In this commit, we change bucket creation to start from reasonable
values, following the same increments that the scheduler itself uses
for its own metrics.
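To illustrate the problem, here is a minimal, self-contained sketch. It reimplements an exponential bucket helper locally rather than importing the real metrics package, and the factor of 2, the count of 15, and the 0.001 starting value are assumptions chosen for illustration, not values quoted from this PR. `firstBucket` is a hypothetical helper that reports which bucket an observation lands in.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, beginning at start and
// multiplying by factor at each step. Local reimplementation for the sketch.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

// firstBucket (hypothetical helper) returns the smallest upper bound that is
// greater than or equal to the observation, i.e. the finest resolution at
// which the histogram can report that observation.
func firstBucket(buckets []float64, observation float64) (float64, bool) {
	for _, upper := range buckets {
		if observation <= upper {
			return upper, true
		}
	}
	return 0, false // falls into the implicit +Inf bucket
}

func main() {
	// Assumed parameters: factor 2 and 15 buckets in both cases.
	broken := exponentialBuckets(1000, 2, 15) // starts at 1000 seconds
	fixed := exponentialBuckets(0.001, 2, 15) // starts at 1 millisecond

	// A 0.5 s binding is indistinguishable from a 999 s one with the
	// broken buckets, because both land in the first bucket (<= 1000 s).
	observation := 0.5
	b, _ := firstBucket(broken, observation)
	f, _ := firstBucket(fixed, observation)
	fmt.Printf("broken buckets: %gs lands in bucket <= %gs\n", observation, b)
	fmt.Printf("fixed buckets:  %gs lands in bucket <= %gs\n", observation, f)
}
```

With buckets that start at 1000, every realistic observation falls into the first bucket, so any latency percentile derived from the histogram is dominated by that 1000-second boundary and cannot be used for an SLO.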
Special notes for your reviewer:
See this example of trying to define an SLO for scheduling, where the
exposed data is effectively unusable:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: