Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce configmap and secret watch of kubelet #99393

Merged

Conversation

chenyw1990
Copy link
Contributor

@chenyw1990 chenyw1990 commented Feb 24, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:

In a large cluster, if a large number of pods reference configmap/secret as environment variables, kubelet will generate a lot of watches for configmap and secret, and kube-apiserver will use too much memory due to too many watches.
Although we have added Immutable features, in real usage scenarios, many configmaps are not marked as Immutable.

Which issue(s) this PR fixes:

Fixes #98660

Special notes for your reviewer:

@wojtek-t

Does this PR introduce a user-facing change?

When the kubelet is watching a ConfigMap or Secret purely in the context of setting environment variables
for containers, only hold that watch for a defined duration before cancelling it. This change reduces the CPU
and memory usage of the kube-apiserver in large clusters.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 24, 2021
@k8s-ci-robot k8s-ci-robot added area/kubelet area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 24, 2021
return
}

go wait.Forever(func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this PR works.
The fact the secrets whenever changed re automatically updated to all pods mounting them:
https://kubernetes.io/docs/concepts/configuration/secret/#mounted-secrets-are-updated-automatically
Means that they are periodically checked if nothing changed.

So this PR doesn't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following uses configmap as an example. The secret is the same as the configmap,
The configmap is used in the following two places:

  1. Used by a container as volume.
  2. Information in the configmap is injected to the container as environment variables.

For the configmap volume, we expect that the watch link can be maintained. Therefore, we set maxIdleTime to three times the value of resyncInterval. (Each resyncInterval period, VolumeManager queries the configmap information, lastAcquiredTime will be updated.)

For the configmap used by environment variables, it is expected that after the container is created, the kubelet can disconnect the watch link for the configmap.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I see what you're saying now - that makes sense.
(we can argue if three times resyncInterval isn't too small - I would feel more comfortable with something higher, but that's a minor thing)

But this kind of change for sure deserves a test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a ConfigMap or Secret is only used to set container environment variables, why watch it at all?

@chenyw1990
Copy link
Contributor Author

/retest

@ehashman ehashman added this to Triage in SIG Node PR Triage Feb 24, 2021
@ehashman
Copy link
Member

/priority important-longterm
/triage accepted

waiting on author for @wojtek-t's comments

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 24, 2021
@ehashman ehashman moved this from Triage to Waiting on Author in SIG Node PR Triage Feb 24, 2021
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/configmap/configmap_manager.go Outdated Show resolved Hide resolved
@chenyw1990 chenyw1990 force-pushed the reduceKubeletResourceWatch branch 2 times, most recently from 0a243e6 to 2e1ddc6 Compare February 27, 2021 04:41
@chenyw1990
Copy link
Contributor Author

/retest

@chenyw1990
Copy link
Contributor Author

@wojtek-t I have perfected the code. And I tested on the 4000-node cluster. This solution can significantly reduce the number of watches. Could you help me to review the code again?
image

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Feb 27, 2021
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
@chenyw1990
Copy link
Contributor Author

@wojtek-t Thanks for your suggestions, I have modified the code based on the review comments.

@chenyw1990
Copy link
Contributor Author

/retest

pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager_test.go Outdated Show resolved Hide resolved
store := manager.NewObjectCache(listObj, watchObj, newObj, isImmutable, schema.GroupResource{Group: "v1", Resource: "secrets"})

fakeClock := clock.NewFakeClock(time.Now())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove this empty line

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Mar 8, 2021

@chenyw1990: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-bazel-test 820af60db0a76b0494b148b68580ee7c11066753 link /test pull-kubernetes-bazel-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved

item.lock.Lock()
defer item.lock.Unlock()
select {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can just return item.stopped here (instead of checking channels)

staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go Outdated Show resolved Hide resolved
store := manager.NewObjectCache(listObj, watchObj, newObj, isImmutable, schema.GroupResource{Group: "v1", Resource: "secrets"})

fakeClock := clock.NewFakeClock(time.Now())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
@chenyw1990 chenyw1990 force-pushed the reduceKubeletResourceWatch branch 2 times, most recently from c6d5d6a to 390be42 Compare March 8, 2021 08:37
Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some very minor comments.

pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
pkg/kubelet/util/manager/watch_based_manager.go Outdated Show resolved Hide resolved
@wojtek-t
Copy link
Member

wojtek-t commented Mar 8, 2021

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 8, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chenyw1990, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 8, 2021
@k8s-ci-robot k8s-ci-robot merged commit a517ecc into kubernetes:master Mar 8, 2021
SIG Node PR Triage automation moved this from Waiting on Author to Done Mar 8, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

Can kubelet stop monitoring configMaps which are only used as environment variables?
6 participants