Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: volume stats disabled when value is negative.(Currently, 0 means defualt 1m) #96675

Merged
merged 1 commit into from Jan 20, 2021

Conversation

pacoxu
Copy link
Member

@pacoxu pacoxu commented Nov 18, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:

https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

--volume-stats-agg-period duration     Default: 1m0s

  | Specifies interval for kubelet to calculate and cache the volume disk usage for all pods and volumes. To disable volume calculations, set to 0. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)

the default value of volume-stats-agg-period is 1m0s
However when it is set to be 0, it means disable.

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/stats/fs_resource_analyzer.go#L60-62

  • nil means default, 1m0s.
  • 0 means disable

I tested it. It will be disabled only if we set it to be '-1s'

I tried to fix it but found that the metav1.Duration default value is zeroDuration, and it cannot decide whether it is 0s or none(not set, which should use default value).

Current logic: 1m0s will be used even you set it to 0s or 0.
So I fix it as a doc issue.

Which issue(s) this PR fixes:

Fixes #96674

Special notes for your reviewer:
I tested it. It will be disabled only if we set it to be '-1s'

Cause:

fs.DurationVar(&c.VolumeStatsAggPeriod.Duration, "volume-stats-agg-period", c.VolumeStatsAggPeriod.Duration, "Specifies interval for kubelet to calculate and cache the volume disk usage for all pods and volumes. To disable volume calculations, set to 0.")

    value, _ := time.ParseDuration("")
    value0, _ := time.ParseDuration("0s")
    fmt.Println(value == value0)

flag use ParseDuration and DurationVar which cannot distinguish no input and 0s.

Does this PR introduce a user-facing change?:

Set kubelet option `--volume-stats-agg-period` to negative value to disable volume calculations.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

It is very confused to make it different for none and 0. So I think doc fix would be a better way IMO.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 18, 2020
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. area/kubelet and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 18, 2020
@pacoxu pacoxu changed the title volume stats agg period should be disabled when value is 0 Doc error: volume stats agg period should be disabled when value is negative Nov 19, 2020
@pacoxu pacoxu changed the title Doc error: volume stats agg period should be disabled when value is negative bug or doc error: volume stats agg period should be disabled when value is negative(not 0) Nov 19, 2020
@pacoxu pacoxu changed the title bug or doc error: volume stats agg period should be disabled when value is negative(not 0) doc error: volume stats agg period should be disabled when value is negative(not 0) Nov 24, 2020
@pacoxu pacoxu changed the title doc error: volume stats agg period should be disabled when value is negative(not 0) volume stats agg period should be disabled when value is 0, and default value is 1m Nov 27, 2020
@pacoxu

This comment has been minimized.

@pacoxu pacoxu changed the title volume stats agg period should be disabled when value is 0, and default value is 1m wip: volume stats disabled when value is 0, and default value is 1m Nov 27, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 27, 2020
@pacoxu pacoxu changed the title wip: volume stats disabled when value is 0, and default value is 1m Comments/Doc fix: volume stats disabled when value is negative.(Currently, 0 means defualt 1m) Dec 7, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 7, 2020
@pacoxu pacoxu force-pushed the fix/volume-stats-agg-period branch from 475b751 to 5ab8207 Compare December 7, 2020 03:41
…disable when it is negative like -1s.

Signed-off-by: pacoxu <paco.xu@daocloud.io>
@pacoxu pacoxu force-pushed the fix/volume-stats-agg-period branch from 5ab8207 to 4901855 Compare December 7, 2020 03:47
@pacoxu
Copy link
Member Author

pacoxu commented Dec 7, 2020

@kubernetes/sig-node-pr-reviews

This should be cherry-picked to 1.19 or older versions as well.

@pacoxu pacoxu changed the title Comments/Doc fix: volume stats disabled when value is negative.(Currently, 0 means defualt 1m) fix: volume stats disabled when value is negative.(Currently, 0 means defualt 1m) Dec 21, 2020
@pacoxu
Copy link
Member Author

pacoxu commented Dec 21, 2020

@derekwaynecarr Have you had a chance to look at this?

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jan 4, 2021
@pacoxu
Copy link
Member Author

pacoxu commented Jan 6, 2021

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 6, 2021
@ehashman ehashman added this to Needs Reviewer in SIG Node PR Triage Jan 6, 2021
@ehashman
Copy link
Member

ehashman commented Jan 6, 2021

I'm a bit confused. In the linked code in the first comment, it should be disabled for 0 or any negative number (logic is x <= 0, has not been changed in 5 years):

func (s *fsResourceAnalyzer) Start() {
s.startOnce.Do(func() {
if s.calcPeriod <= 0 {
klog.Info("Volume stats collection disabled.")
return
}

Where is the = 0 case getting overridden to a default? According to #96674 (comment) the behaviour is different, but the linked code doesn't suggest it would be. cc @pacoxu

@ehashman
Copy link
Member

ehashman commented Jan 6, 2021

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 6, 2021
@pacoxu
Copy link
Member Author

pacoxu commented Jan 7, 2021

I'm a bit confused. In the linked code in the first comment, it should be disabled for 0 or any negative number (logic is x <= 0, has not been changed in 5 years):

func (s *fsResourceAnalyzer) Start() {
s.startOnce.Do(func() {
if s.calcPeriod <= 0 {
klog.Info("Volume stats collection disabled.")
return
}

Where is the = 0 case getting overridden to a default? According to #96674 (comment) the behaviour is different, but the linked code doesn't suggest it would be. cc @pacoxu

fs.DurationVar(&c.VolumeStatsAggPeriod.Duration, "volume-stats-agg-period", c.VolumeStatsAggPeriod.Duration, "Specifies interval for kubelet to calculate and cache the volume disk usage for all pods and volumes. To disable volume calculations, set to 0.")

@ehashman 0s will be treated as default 1 minute indeed.
Before running into fs_resource_analyzer.go, the value was already converted to 1m.

@ehashman
Copy link
Member

ehashman commented Jan 7, 2021

Okay I found where this gets overridden from 0 to 1m:

if obj.VolumeStatsAggPeriod == zeroDuration {
obj.VolumeStatsAggPeriod = metav1.Duration{Duration: time.Minute}
}

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 7, 2021
@ehashman ehashman moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage Jan 7, 2021
@pacoxu

This comment has been minimized.

Copy link
Contributor

@mrunalp mrunalp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, pacoxu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 20, 2021
@k8s-ci-robot k8s-ci-robot merged commit 0da0d18 into kubernetes:master Jan 20, 2021
SIG Node PR Triage automation moved this from Needs Approver to Done Jan 20, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Jan 20, 2021
@pacoxu pacoxu deleted the fix/volume-stats-agg-period branch June 23, 2021 05:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

--volume-stats-agg-period=0 cannot disable volume agg feature.
6 participants