Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use volumeHandle as PV name when translating EBS inline volume #96821

Merged
merged 1 commit into from Jan 12, 2021

Conversation

wongma7
Copy link
Contributor

@wongma7 wongma7 commented Nov 23, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it: EBS CSI inline migration/translation does not work if the volumeID is in the "uri" format and has a colon in it, like

  volumes:
  - awsElasticBlockStore:
      fsType: ext4
      volumeID: aws://us-west-2a/vol-0818b380d4fc22bec
    name: aws-volume-0
  1. This translation function sets PV name to ebs.csi.aws.com-aws://us-west-2a/vol-0818b380d4fc22bec which is not a valid object name.
  2. spec.Name() which points to this invalid PV name is set as CSI mounter's specVolumeID.
    specVolumeID: spec.Name(),
  3. CSI mounter returns a path including specVolumeID /var/lib/kubelet/pods/017390c5-0c21-404b-a850-228c4311dd36/volumes/kubernetes.io~csi/ebs.csi.aws.com-aws:~~us-west-2a~vol-0a4910df43136ea3e/mount
    func (c *csiMountMgr) GetPath() string {
  4. kubelet tries to create the container with this mount and since it contains a colon, it fails the assumption that the mount will be of format host path:container path:mode
    // generateMountBindings converts the mount list to a list of strings that
    resulting in this error from container runtime (docker):

    state:
      waiting:
        message: 'Error response from daemon: invalid mode: /opt/0'
        reason: CreateContainerError

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix CSI-migrated inline EBS volumes failing to mount if their volumeID is prefixed by aws://

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 23, 2020
@wongma7
Copy link
Contributor Author

wongma7 commented Nov 23, 2020

Identified by inline tests failing at https://testgrid.k8s.io/sig-aws-ebs-csi-driver#migration-test-latest

/sig storage
/area provider/aws

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. area/provider/aws Issues or PRs related to aws provider and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 23, 2020
@ayberk
Copy link
Contributor

ayberk commented Nov 24, 2020

/lgtm

@k8s-ci-robot
Copy link
Contributor

@ayberk: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wongma7
Copy link
Contributor Author

wongma7 commented Nov 25, 2020

/hold

Since I missed code freeze anyway (still intend to cherry pick) I will take some extra time to test this end to end that it actually fixes ebs migration inline volumes.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 25, 2020
@wongma7
Copy link
Contributor Author

wongma7 commented Dec 2, 2020

/hold cancel

Inline migration test passes with this patch.

[ebs-csi-migration] EBS CSI Migration [Driver: aws] [Testpattern: Inline-volume (xfs)][Slow] volumes 
  should store data
  /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/storage/testsuites/volumes.go:150
[BeforeEach] [Testpattern: Inline-volume (xfs)][Slow] volumes
  /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/storage/testsuites/base.go:101
[BeforeEach] [Testpattern: Inline-volume (xfs)][Slow] volumes
  /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/framework/framework.go:151
STEP: Creating a kubernetes client
Dec  2 11:56:34.665: INFO: >>> kubeConfig: /home/ANT.AMAZON.COM/mattwon/.kube/config
STEP: Building a namespace api object, basename volume
Dec  2 11:56:34.844: INFO: No PodSecurityPolicies found; assuming PodSecurityPolicy is disabled.
STEP: Waiting for a default service account to be provisioned in namespace
[It] should store data
  /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/storage/testsuites/volumes.go:150
Dec  2 11:56:34.882: INFO: Warning: Environment does not support getting controller-manager metrics
W1202 11:56:34.882234   12989 metrics_grabber.go:79] Master node is not registered. Grabbing metrics from Scheduler, ControllerManager and ClusterAutoscaler is disabled.
Dec  2 11:56:34.898: INFO: Warning: Environment does not support getting controller-manager metrics
STEP: creating a test aws volume
W1202 11:56:34.898106   12989 metrics_grabber.go:79] Master node is not registered. Grabbing metrics from Scheduler, ControllerManager and ClusterAutoscaler is disabled.
Dec  2 11:56:35.253: INFO: Successfully created a new PD: "aws://us-west-2a/vol-074b613450a05a730".
Dec  2 11:56:35.253: INFO: Creating resource for inline volume
STEP: starting aws-injector
STEP: Writing text file contents in the container.
Dec  2 11:57:09.303: INFO: Running '/home/ANT.AMAZON.COM/mattwon/go/src/k8s.io/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --kubeconfig=/home/ANT.AMAZON.COM/mattwon/.kube/config exec aws-injector --namespace=volume-4684 -- /bin/sh -c echo 'Hello from aws from namespace volume-4684' > /opt/0/index.html'
Dec  2 11:57:09.636: INFO: stderr: ""
Dec  2 11:57:09.636: INFO: stdout: ""
STEP: Checking that text file contents are perfect.
Dec  2 11:57:09.636: INFO: Running '/home/ANT.AMAZON.COM/mattwon/go/src/k8s.io/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --kubeconfig=/home/ANT.AMAZON.COM/mattwon/.kube/config exec aws-injector --namespace=volume-4684 -- cat /opt/0/index.html'
Dec  2 11:57:09.943: INFO: stderr: ""
Dec  2 11:57:09.943: INFO: stdout: "Hello from aws from namespace volume-4684\n"
Dec  2 11:57:09.943: INFO: ExecWithOptions {Command:[/bin/sh -c test -d /opt/0] Namespace:volume-4684 PodName:aws-injector ContainerName:aws-injector Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false}
Dec  2 11:57:09.943: INFO: >>> kubeConfig: /home/ANT.AMAZON.COM/mattwon/.kube/config
Dec  2 11:57:10.156: INFO: ExecWithOptions {Command:[/bin/sh -c test -b /opt/0] Namespace:volume-4684 PodName:aws-injector ContainerName:aws-injector Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false}
Dec  2 11:57:10.156: INFO: >>> kubeConfig: /home/ANT.AMAZON.COM/mattwon/.kube/config
STEP: Checking fsType is correct.
Dec  2 11:57:10.343: INFO: Running '/home/ANT.AMAZON.COM/mattwon/go/src/k8s.io/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --kubeconfig=/home/ANT.AMAZON.COM/mattwon/.kube/config exec aws-injector --namespace=volume-4684 -- grep  /opt/0  /proc/mounts'
Dec  2 11:57:10.679: INFO: stderr: ""
Dec  2 11:57:10.680: INFO: stdout: "/dev/xvdbd /opt/0 xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0\n"
STEP: Deleting pod aws-injector in namespace volume-4684
Dec  2 11:57:10.696: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:10.709: INFO: Pod aws-injector still exists
Dec  2 11:57:12.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:12.724: INFO: Pod aws-injector still exists
Dec  2 11:57:14.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:14.724: INFO: Pod aws-injector still exists
Dec  2 11:57:16.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:16.724: INFO: Pod aws-injector still exists
Dec  2 11:57:18.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:18.724: INFO: Pod aws-injector still exists
Dec  2 11:57:20.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:20.723: INFO: Pod aws-injector still exists
Dec  2 11:57:22.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:22.724: INFO: Pod aws-injector still exists
Dec  2 11:57:24.710: INFO: Waiting for pod aws-injector to disappear
Dec  2 11:57:24.723: INFO: Pod aws-injector no longer exists
STEP: starting aws-client
STEP: Checking that text file contents are perfect.
Dec  2 11:57:42.785: INFO: Running '/home/ANT.AMAZON.COM/mattwon/go/src/k8s.io/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --kubeconfig=/home/ANT.AMAZON.COM/mattwon/.kube/config exec aws-client --namespace=volume-4684 -- cat /opt/0/index.html'
Dec  2 11:57:43.125: INFO: stderr: ""
Dec  2 11:57:43.125: INFO: stdout: "Hello from aws from namespace volume-4684\n"
Dec  2 11:57:43.125: INFO: ExecWithOptions {Command:[/bin/sh -c test -d /opt/0] Namespace:volume-4684 PodName:aws-client ContainerName:aws-client Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false}
Dec  2 11:57:43.125: INFO: >>> kubeConfig: /home/ANT.AMAZON.COM/mattwon/.kube/config
Dec  2 11:57:43.328: INFO: ExecWithOptions {Command:[/bin/sh -c test -b /opt/0] Namespace:volume-4684 PodName:aws-client ContainerName:aws-client Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false}
Dec  2 11:57:43.328: INFO: >>> kubeConfig: /home/ANT.AMAZON.COM/mattwon/.kube/config
STEP: Checking fsType is correct.
Dec  2 11:57:43.530: INFO: Running '/home/ANT.AMAZON.COM/mattwon/go/src/k8s.io/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --kubeconfig=/home/ANT.AMAZON.COM/mattwon/.kube/config exec aws-client --namespace=volume-4684 -- grep  /opt/0  /proc/mounts'
Dec  2 11:57:43.870: INFO: stderr: ""
Dec  2 11:57:43.870: INFO: stdout: "/dev/xvdbd /opt/0 xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0\n"
STEP: cleaning the environment after aws
Dec  2 11:57:43.870: INFO: Deleting pod "aws-client" in namespace "volume-4684"
Dec  2 11:57:43.885: INFO: Wait up to 5m0s for pod "aws-client" to be fully deleted
Dec  2 11:57:50.031: INFO: Couldn't delete PD "aws://us-west-2a/vol-074b613450a05a730", sleeping 5s: error deleting EBS volumes: VolumeInUse: Volume vol-074b613450a05a730 is currently attached to i-05af9b9bd3414b1d2
	status code: 400, request id: f051cf08-90f6-4430-906b-ddf33626fe57
Dec  2 11:57:55.181: INFO: Couldn't delete PD "aws://us-west-2a/vol-074b613450a05a730", sleeping 5s: error deleting EBS volumes: VolumeInUse: Volume vol-074b613450a05a730 is currently attached to i-05af9b9bd3414b1d2
	status code: 400, request id: e85307dd-8a42-4b66-a542-96ee2a8ecf7e
Dec  2 11:58:00.369: INFO: Successfully deleted PD "aws://us-west-2a/vol-074b613450a05a730".
Dec  2 11:58:00.384: INFO: Warning: Environment does not support getting controller-manager metrics
W1202 11:58:00.384934   12989 metrics_grabber.go:79] Master node is not registered. Grabbing metrics from Scheduler, ControllerManager and ClusterAutoscaler is disabled.
Dec  2 11:58:00.402: INFO: Warning: Environment does not support getting controller-manager metrics
W1202 11:58:00.402571   12989 metrics_grabber.go:79] Master node is not registered. Grabbing metrics from Scheduler, ControllerManager and ClusterAutoscaler is disabled.
[AfterEach] [Testpattern: Inline-volume (xfs)][Slow] volumes
  /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/framework/framework.go:152
Dec  2 11:58:00.402: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "volume-4684" for this suite.

• [SLOW TEST:85.767 seconds]
[ebs-csi-migration] EBS CSI Migration
/home/ANT.AMAZON.COM/mattwon/go/src/github.com/kubernetes-sigs/aws-ebs-csi-driver/tests/e2e-migration/e2e_test.go:88
  [Driver: aws]
  /home/ANT.AMAZON.COM/mattwon/go/src/github.com/kubernetes-sigs/aws-ebs-csi-driver/tests/e2e-migration/e2e_test.go:94
    [Testpattern: Inline-volume (xfs)][Slow] volumes
    /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/storage/testsuites/base.go:100
      should store data
      /home/ANT.AMAZON.COM/mattwon/go/pkg/mod/k8s.io/kubernetes@v1.17.3/test/e2e/storage/testsuites/volumes.go:150
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Ran 1 of 225 Specs in 85.769 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 224 Skipped

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 2, 2020
@nckturner
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 7, 2021
@wongma7
Copy link
Contributor Author

wongma7 commented Jan 7, 2021

/assign @msau42

Hello, this is a oneliner AWS EBS migration fix for a bug affecting subset of inline volumes. Please review and approve ty !

Plan to cherrypick this also

@@ -98,7 +98,7 @@ func (t *awsElasticBlockStoreCSITranslator) TranslateInTreeInlineVolumeToCSI(vol
ObjectMeta: metav1.ObjectMeta{
// Must be unique per disk as it is used as the unique part of the
// staging path
Name: fmt.Sprintf("%s-%s", AWSEBSDriverName, ebsSource.VolumeID),
Name: fmt.Sprintf("%s-%s", AWSEBSDriverName, volumeHandle),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit test?

@msau42
Copy link
Member

msau42 commented Jan 8, 2021

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 8, 2021
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 12, 2021
@msau42
Copy link
Member

msau42 commented Jan 12, 2021

/approve
/lgtm

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 12, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, wongma7

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 12, 2021
@msau42
Copy link
Member

msau42 commented Jan 12, 2021

I think for the cherrypick, you'll also have to add a release note describing the issue fixed

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 9720013 into kubernetes:master Jan 12, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Jan 12, 2021
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jan 13, 2021
k8s-ci-robot added a commit that referenced this pull request Feb 4, 2021
…1-upstream-release-1.18

Automated cherry pick of #96821: Use volumeHandle as PV name when translating EBS inline
k8s-ci-robot added a commit that referenced this pull request Feb 4, 2021
…1-upstream-release-1.20

Automated cherry pick of #96821: Use volumeHandle as PV name when translating EBS inline
k8s-ci-robot added a commit that referenced this pull request Feb 4, 2021
…1-upstream-release-1.19

Automated cherry pick of #96821: Use volumeHandle as PV name when translating EBS inline
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants