Recover CSI volumes from dangling attachments #96617

yuga711 · 2020-11-16T19:01:13Z

Change-Id: I72105d67d8a4069ab19bfa4638a7ac365cf4194c

What type of PR is this?

/kind bug

What this PR does / why we need it:
This PR recovers from certain types of dangling CSI attachments by detaching the volumes from the undesired nodes. In the current design, if kube-controller-manager crashes or restarts while an attach or detach is in progress, the volume may be left attached to an undesired node (e.g., when the pod using the volume is deleted during the restart window). This PR fixes such dangling attachments by checking for existing VolumeAttachment (VA) objects during Attach/Detach Controller initialization.

This solution relies on the existence of the VA object. It recovers dangling volumes caused by certain races, but it will not recover from other dangling scenarios where the VA object itself is gone.
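The initialization step described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `volumeAttachment`, `volumeNode`, `actualState`, and `markUncertainFromVAs` are hypothetical stand-ins for the real API object and the controller's actual state of world, and the rule "only mark pairs that are not already tracked" is an assumption of the sketch.

```go
package main

import "fmt"

// attachState is a simplified stand-in for the controller's per-volume state.
type attachState int

const (
	stateDetached attachState = iota // zero value: nothing recorded for this pair
	stateUncertain
	stateAttached
)

// volumeAttachment is a hypothetical, trimmed-down view of a VA object.
type volumeAttachment struct {
	VolumeName string
	NodeName   string
}

// volumeNode identifies one (volume, node) pair in the toy state map.
type volumeNode struct {
	Volume string
	Node   string
}

// actualState is a toy actual state of world (ASW).
type actualState map[volumeNode]attachState

// markUncertainFromVAs records every pair that has a VA object but is not yet
// tracked, so that a later reconcile pass has to look at it. It returns the
// number of entries it newly marked as uncertain.
func markUncertainFromVAs(asw actualState, vas []volumeAttachment) int {
	marked := 0
	for _, va := range vas {
		k := volumeNode{Volume: va.VolumeName, Node: va.NodeName}
		if asw[k] != stateDetached {
			continue // already tracked as attached or uncertain
		}
		asw[k] = stateUncertain
		marked++
	}
	return marked
}

func main() {
	asw := actualState{volumeNode{Volume: "vol-a", Node: "node-1"}: stateAttached}
	vas := []volumeAttachment{
		{VolumeName: "vol-a", NodeName: "node-1"},
		{VolumeName: "vol-b", NodeName: "node-2"},
	}
	fmt.Println("newly marked uncertain:", markUncertainFromVAs(asw, vas))
}
```

In this sketch a VA with no matching entry is what signals a possibly dangling attachment; if the VA object itself is gone, nothing is marked, which matches the limitation noted above.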

Which issue(s) this PR fixes:
Partially addresses #94912
Partially addresses #80488
Fixes #77324

Special notes for your reviewer:
Verified the fix through reproduce steps in #94912

Does this PR introduce a user-facing change?:
NONE

Fix to recover CSI volumes from certain dangling attachments

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2020-11-16T19:01:21Z

Hi @yuga711. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

yuga711 · 2020-11-16T19:03:31Z

/assign @msau42

jingxu97 · 2020-11-16T19:18:00Z

pkg/controller/volume/attachdetach/attach_detach_controller.go

+			err = adc.actualStateOfWorld.MarkVolumeAsUncertain(volumeName, volumeSpec, nodeName)
+			if err != nil {
+				klog.Errorf("MarkVolumeAsUncertain fail to add the volume %q (%q) to ASW. err: %s", volumeName, nodeName, err)
+				continue


just a nit: is this continue needed?

gnufied · 2020-11-16T19:36:26Z

/assign

msau42 · 2020-11-16T19:44:50Z

/ok-to-test

msau42 · 2020-11-16T19:45:15Z

/assign @jingxu97

gnufied · 2020-11-16T20:07:53Z

Have we considered whether we can use the following error from CSI:

| Volume published to another node | 9 FAILED_PRECONDITION | Indicates that a volume corresponding to the specified volume_id has already been published at another node and does not have MULTI_NODE volume capability. If this error code is returned, the Plugin SHOULD specify the node_id of the node at which the volume is published as part of the gRPC status.message. | Caller SHOULD ensure the specified volume is not published at any other node before retrying with exponential back off. |

and use it to return dangling volume errors, so that all plugins that implement ControllerPublishVolume correctly can benefit? Using VA to track attachment status is okay but not 100% foolproof.

msau42 · 2020-11-16T20:11:19Z

@kubernetes/sig-storage-pr-reviews

I think we'll need to look at both solutions. Relying on attach failure code means that the pod had to be rescheduled.

yuga711 · 2020-11-16T20:29:49Z

@kubernetes/sig-storage-pr-reviews

I think we'll need to look at both solutions. Relying on attach failure code means that the pod had to be rescheduled.

Yes, that's right. We need both solutions, since each has its own limitations. The attach-failure-code approach will not fix anything until the next attach, and may not fix the multi-attach case.

jingxu97 · 2020-11-16T20:57:46Z

Have we considered whether we can use the following error from CSI:

| Volume published to another node | 9 FAILED_PRECONDITION | Indicates that a volume corresponding to the specified volume_id has already been published at another node and does not have MULTI_NODE volume capability. If this error code is returned, the Plugin SHOULD specify the node_id of the node at which the volume is published as part of the gRPC status.message. | Caller SHOULD ensure the specified volume is not published at any other node before retrying with exponential back off. |

and use it to return dangling volume errors, so that all plugins that implement ControllerPublishVolume correctly can benefit? Using VA to track attachment status is okay but not 100% foolproof.

I think the code here does not rely on the VA attachment status. As long as a VA exists, regardless of whether its status is attached, the code adds the volume to the ASW as uncertain, so that a subsequent attach or detach will be triggered.
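To make that point concrete, here is a minimal sketch of how an uncertain entry could drive the next operation. `attachState` and `nextAction` are hypothetical names for illustration, not the reconciler's real API, and the decision rule is a simplification of what the controller does.

```go
package main

import "fmt"

// attachState is a simplified stand-in for the per-volume state in the ASW.
type attachState string

const (
	detached  attachState = "detached"
	uncertain attachState = "uncertain"
	attached  attachState = "attached"
)

// nextAction picks the operation a reconcile pass would start for one
// (volume, node) pair. desired says whether some pod still needs the volume
// on that node.
func nextAction(state attachState, desired bool) string {
	switch {
	case desired && state != attached:
		return "attach" // uncertain entries are re-attached to confirm them
	case !desired && state != detached:
		return "detach" // uncertain entries nobody wants are cleaned up
	default:
		return "none"
	}
}

func main() {
	for _, desired := range []bool{true, false} {
		fmt.Printf("uncertain, desired=%v -> %s\n", desired, nextAction(uncertain, desired))
	}
}
```

Under this rule an uncertain entry never results in "none", which is why the VA's own attached flag does not need to be consulted.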

yuga711 · 2020-12-10T21:22:25Z

The tests seem to be failing because a function (IsVolumeAttachedToNode) was replaced after this PR was posted. I will need to rework the change a bit and repost it.

fejta-bot · 2020-12-10T23:35:56Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

msau42 · 2020-12-10T23:53:15Z

/lgtm cancel

to fix merge conflict

Change-Id: I72105d67d8a4069ab19bfa4638a7ac365cf4194c

msau42 · 2020-12-16T21:59:50Z

/lgtm

jingxu97 · 2021-01-13T01:09:14Z

Should we consider cherry-picking this PR?

msau42 · 2021-01-13T01:47:32Z

I think that's a good idea. @gnufied any concerns?

gnufied · 2021-01-20T15:14:52Z

@msau42 should be fine to cherry-pick.

…17-upstream-release-1.18 Automated cherry pick of #94599: Fixes Attach Detach Controller reconciler race reading #96617: Recover CSI volumes from dangling attachments

…17-upstream-release-1.20 Automated cherry pick of #94599: Fixes Attach Detach Controller reconciler race reading #96617: Recover CSI volumes from dangling attachments

…17-upstream-release-1.19 Automated cherry pick of #94599: Fixes Attach Detach Controller reconciler race reading #96617: Recover CSI volumes from dangling attachments

k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 16, 2020

k8s-ci-robot requested review from gnufied and jingxu97 November 16, 2020 19:01

k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 16, 2020

k8s-ci-robot assigned msau42 Nov 16, 2020

jingxu97 reviewed Nov 16, 2020

k8s-ci-robot assigned gnufied Nov 16, 2020

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 16, 2020

k8s-ci-robot assigned jingxu97 Nov 16, 2020

k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Nov 16, 2020

yuga711 force-pushed the dangling branch from 2c17ebd to ffda09e Compare November 17, 2020 00:07

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 10, 2020

Recover CSI volumes from dangling attachments

9b2b736

Change-Id: I72105d67d8a4069ab19bfa4638a7ac365cf4194c

yuga711 force-pushed the dangling branch from d861e39 to 9b2b736 Compare December 12, 2020 02:32

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 16, 2020

k8s-ci-robot merged commit efb9489 into kubernetes:master Dec 16, 2020

github-actions bot mentioned this pull request Jan 5, 2021

Week Ending January 3, 2021 dev-obs/actus#313

Open

yuga711 mentioned this pull request Jan 12, 2021

Dangling volume mechanism for CSI does not exist #80488

Open

This was referenced Jan 25, 2021

Pod recreation gets stuck due to dangling attach #94912

Closed

Volume not detached if leader election happens before detach is executed #92145

Closed

yuga711 mentioned this pull request Feb 17, 2021

REQUEST: New membership for yuga711 kubernetes/org#2507

Closed

6 tasks

chethanv28 mentioned this pull request Mar 9, 2021

fix volume detach if node is not existing anymore kubernetes-sigs/vsphere-csi-driver#529

Closed

brathina-spectro mentioned this pull request Apr 13, 2021

PV can not attach to new node if the previous node is deleted kubernetes-sigs/vsphere-csi-driver#359

Closed

svend mentioned this pull request May 5, 2021

Use CSI driver to determine unique name for migrated in-tree plugins #101423

Closed

ialidzhikov mentioned this pull request May 7, 2021

VolumeAttachment not marked as detached causes problems when the Node is deleted. kubernetes-csi/external-attacher#215

Open

This was referenced May 18, 2021

Garbage collect VolumeAttachments of migrated in-tree volumes #102097

Closed

Fix VolumeAttachment garbage collection for migrated PVs #102176

Merged

ialidzhikov mentioned this pull request Jun 14, 2021

Race condition can lead Pod to "steal" other Pod's PVC after CSI migration #102856

Closed
