Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: more informative generic ephemeral volume events #104605

Merged
merged 1 commit into from Sep 4, 2021

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Aug 26, 2021

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Typically, pods using generic ephemeral inline volumes trigger two events:

  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  3s    default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume" not found.
  Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

The first now is more informative:

   Type     Reason            Age   From               Message
   ----     ------            ----  ----               -------
   Warning  FailedScheduling  4s    default-scheduler  0/1 nodes are available: 1 waiting for ephemeral volume controller to create the persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume".
   Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

Which issue(s) this PR fixes:

Fixes #104525

Special notes for your reviewer:

Originally, this PR also contained a second commit which delayed the first event and, as a result, typically never showed it because the condition for it is short-lived. This was taken out based on review feedback but could be brought back if desired: https://github.com/pohly/kubernetes/commits/ephemeral-volume-events-delay

Does this PR introduce a user-facing change?

generic ephemeral volumes: better pod events ("waiting for ephemeral volume controller to create the persistentvolumeclaim"" instead of "persistentvolumeclaim not found")

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 26, 2021
@k8s-ci-robot
Copy link
Contributor

@pohly: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 26, 2021
@pohly
Copy link
Contributor Author

pohly commented Aug 26, 2021

/sig storage
/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 26, 2021
@ahg-g
Copy link
Member

ahg-g commented Aug 27, 2021

Thanks @pohly, I don't think the second commit is necessary, I think it is better for debugging to have the events consistent with pod conditions.

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 30, 2021
@pohly
Copy link
Contributor Author

pohly commented Aug 30, 2021

I don't think the second commit is necessary, I think it is better for debugging to have the events consistent with pod conditions.

Okay. I've taken out that second part and updated the PR description accordingly.

/assign @msau42

These events are currently emitted for a pod using a generic ephemeral volume:

  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  3s    default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume" not found.
  Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

The one about "persistentvolumeclaim not found" is potentially confusing. It
occurs because the scheduler typically checks the pod before the ephemeral
volume controller had a chance to create the PVC.

This is a bit easier to understand:

  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  4s    default-scheduler  0/1 nodes are available: 1 waiting for ephemeral volume controller to create the persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume".
  Warning  FailedScheduling  2s    default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
// but we can do better for generic ephemeral inline volumes where that situation
// is normal directly after creating a pod.
if ephemeral && apierrors.IsNotFound(err) {
err = fmt.Errorf("waiting for ephemeral volume controller to create the persistentvolumeclaim %q", pvcName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it always the case that this type of PVC is created by the ephemeral volume controller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, yes. It would be possible to set up a cluster where that controller is disabled and some other entity creates the PVCs, but that seems very unlikely.

@ahg-g
Copy link
Member

ahg-g commented Aug 30, 2021

/approve
leaving lgtm to @msau42

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 30, 2021
@msau42
Copy link
Member

msau42 commented Sep 3, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 3, 2021
@k8s-ci-robot k8s-ci-robot merged commit b12379e into kubernetes:master Sep 4, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

generic ephemeral volumes: avoid emitting unnecessary events
4 participants