New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Count pod overhead against an entity's ResourceQuota #99600
Count pod overhead against an entity's ResourceQuota #99600
Conversation
/cc @egernst |
/sig node |
/kind bug |
/priority important-soon |
Need ok-to-test! |
/auto-cc |
/ok-to-test |
/retest |
/assign |
/assign @dims |
@gjkim42 thanks for your quick fix. About PodOverhead, I have something else to discuss. Currently, PodOverhead is defined as a fixed resource. this value looks like the average usage value and being added into requests resource during scheduling.
My use case is kata_container, the VMM related processes consume much overhead resource when pod running with high workload. A small overhead(small cgroup limits) will trigger oom but a big value result that big overhead requests take into account during scheduling. So how about define PodOverhead including limits and requests like that in Pod resources definition. It could be helpful
|
@chestack @egernst need your opinion! |
c783d35
to
53ab29b
Compare
/retest |
Rebased. PTAL |
/assign @vishh |
usage corev1.ResourceList | ||
pod *api.Pod | ||
usage corev1.ResourceList | ||
podOverheadEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so default is false
? And all tests that do not set this field will execute with the pod overhead disabled? What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would say that there is no need to explicitly enable the pod overhead for the other cases. The other cases are irrelevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it. So there is no intention to explicitly disable it for other cases. It is just a side effect of the need to enable it for one case. Since it's in beta and about to get GA, does this logic even needed?
If you think it is needed, maybe just do nothing when it's false. See comment in file below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I see the new test case. it's ok than. Hopefully will GA PodOverhead in 1.22
So this KEP paragraph wasn't implemented? This comment still lists it as a "(placeholder)". KEP xref: #kubernetes/enhancements#688 |
}, | ||
}, | ||
}, | ||
usage: corev1.ResourceList{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test case tests for usage, not for quota, correct? Sorry I might be missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the test is for usage, and resourcequota controller ensures the usage to not exceed the quota
According to the KEP (The controller will be updated to add the pod Overhead to the container resource request summation.
), I would say that the container resource request summation
means the usage. And #99577 requested the same thing.
0c65219
to
df48ee4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev
Done.
usage corev1.ResourceList | ||
pod *api.Pod | ||
usage corev1.ResourceList | ||
podOverheadEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would say that there is no need to explicitly enable the pod overhead for the other cases. The other cases are irrelevant.
}, | ||
}, | ||
}, | ||
usage: corev1.ResourceList{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the test is for usage, and resourcequota controller ensures the usage to not exceed the quota
According to the KEP (The controller will be updated to add the pod Overhead to the container resource request summation.
), I would say that the container resource request summation
means the usage. And #99577 requested the same thing.
"do not count pod overhead as usage with pod overhead disabled": { | ||
pod: &api.Pod{ | ||
Spec: api.PodSpec{ | ||
Overhead: api.ResourceList{ | ||
api.ResourceCPU: resource.MustParse("1"), | ||
}, | ||
Containers: []api.Container{ | ||
{ | ||
Resources: api.ResourceRequirements{ | ||
Requests: api.ResourceList{ | ||
api.ResourceCPU: resource.MustParse("1"), | ||
}, | ||
Limits: api.ResourceList{ | ||
api.ResourceCPU: resource.MustParse("2"), | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
usage: corev1.ResourceList{ | ||
corev1.ResourceRequestsCPU: resource.MustParse("1"), | ||
corev1.ResourceLimitsCPU: resource.MustParse("2"), | ||
corev1.ResourcePods: resource.MustParse("1"), | ||
corev1.ResourceCPU: resource.MustParse("1"), | ||
generic.ObjectCountQuotaResourceNameFor(schema.GroupResource{Resource: "pods"}): resource.MustParse("1"), | ||
}, | ||
podOverheadEnabled: false, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the case containing pod overhead with the pod overhead disabled
I don't know the full history, but I think it wasn't. |
/lgtm |
@derekwaynecarr |
@@ -354,6 +354,10 @@ func PodUsageFunc(obj runtime.Object, clock clock.Clock) (corev1.ResourceList, e | |||
limits = quota.Max(limits, pod.Spec.InitContainers[i].Resources.Limits) | |||
} | |||
|
|||
if feature.DefaultFeatureGate.Enabled(features.PodOverhead) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wish we had this in beta as intended; happy we caught it prior to moving to GA.
/approve this should probably warrant cherry-picks to past releases. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, dims, gjkim42 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…00-upstream-release-1.19 Automated cherry pick of #99600: Count pod overhead as an entity's resource usage
…00-upstream-release-1.18 Automated cherry pick of #99600: Count pod overhead as an entity's resource usage
…00-upstream-release-1.20 Automated cherry pick of #99600: Count pod overhead as an entity's resource usage
What type of PR is this?
/kind bug
What this PR does / why we need it:
According to https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/688-pod-overhead, the ResourceQuota controller should account for
pod.Spec.Overhead
Which issue(s) this PR fixes:
Fixes #99577
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: