New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cgroup handling for systemd with cgroup v2 #98365
Conversation
/sig node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Do we have any e2e-tests running with systemd/and-or cgroup v2? If not, we should probably have that.
AFAIK e2e tests were a requirement when we added the v2 support.
i found one cri-o based test here that is not running:
https://k8s-testgrid.appspot.com/sig-node-cri-o#pr-crio-cgrpv2-gce-e2e
Yeah, that was what I thought as well, but couldn't find much tbh. The cgroup hanling is quite obscure and hard to reason around with a lot of duplicate work in k8s/k8s and opencontainers/runc/libcontainer, and some is also handled by CRI. Hopefully we can get it battle tested with some solid e2e tests to avoid regressions. I am testing some cgroup v2 stuff now, and there are quite a few rough edges still (including this, and opencontainers/runc#2474 (comment)), so it is definitely not stable enough to run in production imo. I will continue to work on it tho, and I guess @giuseppe is working a bit on it as well. /retest |
/test pull-kubernetes-node-crio-cgrpv2-e2e |
Hmm. The failure on We should run these as periodic tasks, and not just presubmit. Also, this the https://testgrid.k8s.io/sig-node-kubelet#node-kubelet-serial tests are more relevant for cgroup handling, but that testgrid tab looks quite red already 😅 |
/triage accepted |
Ping @giuseppe |
we run e2e tests in the CRI-O CI. Could you open a PR for CRI-O and set the remote for Kubernetes to your branch so we can see what happens? |
This fixes issues where kubelet enforces qos and nodeAllocatable on the worng hierarchy. Kublet will now create the files /sys/fs/cgroup/kubepods/{burstable,besteffort,}/pod-xyz when running with systemd as the driver, making it impossible to enforce the limits on nodeAllocatable.
0d746a2
to
124de52
Compare
Hi @giuseppe ,
Yeah, that makes sense, but I don't think a CRI implementation have tests to verify node allocatable or pod limits, but I might be incorrect. If you do have tests running catching this, it would be nice to get them into the testgrid.
Can you clarify what you mean by "set the remote for kubernetes"? This file is not used as a go dep, so not sure what you want me to do... |
test: kubernetes/kubernetes#98365 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
I've opened a test PR cri-o/cri-o#4569 The only reason for opening it is to make sure there are no regressions in the e2e when running on cgroup v2 I think @harche is already working on adding these tests to kubernetes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests pass
LGTM
/test pull-kubernetes-node-crio-cgrpv2-e2e |
I wanted to fix the known failures in |
I see. That make sense, thanks! 😄
Ahh, nice!
Ahh, that makes sense. Do you have an issue were that is tracked? I didn't find one, so thought the errors were new (and therefore I guess I wasted some time :/)... Is fixing those errors being tracked as well, and is it being worked on? |
@odinuge I am working on fixing those 2 failures in |
/assign @sjenning |
@odinuge I created an issue to track the failures we are seeing with NodeConformance and NodeFeatures on cgroup v2, #99230 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, mrunalp, odinuge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Review the full test history for this PR. Silence the bot with an |
@odinuge: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
This fixes issues where kubelet enforces qos and nodeAllocatable on the
worng hierarchy. Kublet will now create the files
/sys/fs/cgroup/kubepods/{burstable,besteffort,}/pod-xyz
when running with systemd as the driver, making it impossible to enforce
the limits on nodeAllocatable.
Do we have any e2e-tests running with systemd/and-or cgroup v2? If not, we should probably have that.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: