Update gce-pd volume topology label to GA #98700

Merged: 1 commit merged on Feb 4, 2021

@Jiawei0227 Jiawei0227 commented Feb 2, 2021

What type of PR is this?
/kind feature

What this PR does / why we need it:
This is a follow-up PR for #97823

This PR updates the gce-pd volume topology label to the GA version. Previously, the in-tree volumes used the beta label (FailureDomain). Because the beta label has been deprecated for a while and is scheduled to be removed soon, we should switch to the GA label instead.
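
As an illustration of what the label switch means for code that reads a PV's zone, here is a minimal sketch. It assumes the key strings behind `v1.LabelTopologyZone` and `v1.LabelFailureDomainBetaZone`, and the helper name `zoneFromLabels` is hypothetical, not part of this PR.

```go
package main

import "fmt"

// Key strings behind v1.LabelTopologyZone and v1.LabelFailureDomainBetaZone,
// copied here so the sketch builds without k8s.io/api.
const (
	labelTopologyZone          = "topology.kubernetes.io/zone"
	labelFailureDomainBetaZone = "failure-domain.beta.kubernetes.io/zone"
)

// zoneFromLabels is a hypothetical helper: it prefers the GA key and falls
// back to the deprecated beta key so that older objects still resolve.
func zoneFromLabels(labels map[string]string) (string, bool) {
	if zone := labels[labelTopologyZone]; zone != "" {
		return zone, true
	}
	if zone := labels[labelFailureDomainBetaZone]; zone != "" {
		return zone, true
	}
	return "", false
}

func main() {
	oldPV := map[string]string{labelFailureDomainBetaZone: "us-central1-b"}
	newPV := map[string]string{labelTopologyZone: "us-central1-b"}
	for _, labels := range []map[string]string{oldPV, newPV, {}} {
		zone, ok := zoneFromLabels(labels)
		fmt.Printf("zone=%q found=%v\n", zone, ok)
	}
}
```

Reading the GA key first and the beta key second lets the same code handle PVs provisioned both before and after this change.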

Which issue(s) this PR fixes:

Fixes #92237 by upgrading the label to the GA version.

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

[ACTION REQUIRED] PVs newly provisioned by gce-pd will no longer have the beta FailureDomain label. The gce-pd volume plugin will apply the GA topology label instead.
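
For tooling that still selects on the beta keys, one possible migration is to rename the keys in its selectors. The sketch below is hypothetical: `migrateSelector` and `betaToGA` are illustrative names, and including the region key alongside the zone key is an assumption that it follows the same pattern.

```go
package main

import (
	"fmt"
	"sort"
)

// betaToGA maps the deprecated beta topology keys to their GA replacements.
var betaToGA = map[string]string{
	"failure-domain.beta.kubernetes.io/zone":   "topology.kubernetes.io/zone",
	"failure-domain.beta.kubernetes.io/region": "topology.kubernetes.io/region",
}

// migrateSelector returns a copy of selector with beta keys renamed to GA
// keys. A GA key that is already present is never overwritten.
func migrateSelector(selector map[string]string) map[string]string {
	out := make(map[string]string, len(selector))
	// First pass: copy everything that is not a beta key.
	for k, v := range selector {
		if _, isBeta := betaToGA[k]; !isBeta {
			out[k] = v
		}
	}
	// Second pass: rename beta keys unless the GA key already exists.
	for k, v := range selector {
		if ga, isBeta := betaToGA[k]; isBeta {
			if _, exists := out[ga]; !exists {
				out[ga] = v
			}
		}
	}
	return out
}

func main() {
	sel := migrateSelector(map[string]string{
		"failure-domain.beta.kubernetes.io/zone": "us-central1-b",
		"app":                                    "db",
	})
	keys := make([]string, 0, len(sel))
	for k := range sel {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, sel[k])
	}
}
```

The two passes keep the result deterministic when a selector already carries both the beta and the GA key.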

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig storage
/cc @msau42

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 2, 2021
@k8s-ci-robot
Contributor

@Jiawei0227: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/cloudprovider area/test sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 2, 2021
@Jiawei0227
Contributor Author

/retest

@@ -1243,7 +1243,7 @@ func InitGcePdDriver() storageframework.TestDriver {
},
SupportedFsType: supportedTypes,
SupportedMountOption: sets.NewString("debug", "nouid32"),
TopologyKeys: []string{v1.LabelFailureDomainBetaZone},
Member

Hmm, let's double-check, but I think we have upgrade tests that run the N-1 e2e version against a version N cluster. We may need to change the older e2e tests to handle both labels.

Contributor Author

so there's a couple of variations of tests, but IIRC, one variation will run N-1 e2e.test on an N-1 cluster, upgrade the cluster, and then run N-1 e2e.test on the version N cluster

I ran the e2e binary on a new cluster with my change, and the topology tests pass.

./e2e.test --ginkgo.focus=".*In-tree Volumes.*gcepd.*topology.*" -provider gce -gce-project=jiaweiwang-gke-multi-cloud-dev -gce-zone=us-central1-b

{"msg":"PASSED [sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Dynamic PV (delayed binding)] topology should provision a volume and schedule a pod with AllowedTopologies","total":8,"completed":1,"skipped":1030,"failed":0}
{"msg":"PASSED [sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Dynamic PV (immediate binding)] topology should provision a volume and schedule a pod with AllowedTopologies","total":8,"completed":2,"skipped":2859,"failed":0}
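
To see which tests a focus expression selects, the pattern can be checked against full test names. This is a rough sketch assuming the focus value is applied as an ordinary regular expression; `selected` is a hypothetical helper, and the third test name is made up for contrast.

```go
package main

import (
	"fmt"
	"regexp"
)

// focusPattern is the --ginkgo.focus value from the command above.
const focusPattern = `.*In-tree Volumes.*gcepd.*topology.*`

// selected reports whether a full test name matches the focus expression.
func selected(pattern, name string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(name), nil
}

func main() {
	names := []string{
		"[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Dynamic PV (delayed binding)] topology should provision a volume and schedule a pod with AllowedTopologies",
		"[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Dynamic PV (immediate binding)] topology should provision a volume and schedule a pod with AllowedTopologies",
		// Hypothetical name, included only to show a non-match.
		"[sig-storage] In-tree Volumes [Driver: gcepd] [Testpattern: Dynamic PV (default fs)] hypothetical unrelated test",
	}
	for _, name := range names {
		ok, err := selected(focusPattern, name)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%v  %s\n", ok, name)
	}
}
```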

Contributor Author

Can you be more specific about which tests you are concerned about?

Member

Did you run e2e.test from one version earlier, i.e. release 1.20?

Contributor Author

Yes, I built e2e.test from the release-1.20 branch.

Member

Did you also run the regional PD tests? (it requires a regional cluster). I see some parts of that test are validating the labels on the PV object, so I would expect a 1.20 test validating beta labels will fail on a 1.21 cluster that uses GA labels.

Contributor Author

It seems the regional PD test does check PV labels, so we need to fix the e2e test on the 1.20 release branch to avoid breaking the version skew test.
PR #98733 is out for review and cherry-pick approval.
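
A label check that tolerates version skew could accept either key. The following is only a sketch of that idea, not the code in #98733: `checkPVZones` is a hypothetical helper, and the `__` delimiter for multi-zone values is an assumption about how regional PD zone labels are encoded.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	gaZoneKey   = "topology.kubernetes.io/zone"
	betaZoneKey = "failure-domain.beta.kubernetes.io/zone"
	// Assumption: a regional PD stores several zones in one label value,
	// joined by this delimiter.
	multiZoneDelimiter = "__"
)

// checkPVZones is a hypothetical skew-tolerant check: it reads the zone
// value from the GA key, falls back to the beta key, and compares the
// zones with the expected set regardless of order.
func checkPVZones(labels map[string]string, want []string) error {
	value, ok := labels[gaZoneKey]
	if !ok {
		value, ok = labels[betaZoneKey]
	}
	if !ok {
		return fmt.Errorf("PV has neither %q nor %q", gaZoneKey, betaZoneKey)
	}
	got := strings.Split(value, multiZoneDelimiter)
	sort.Strings(got)
	expected := append([]string(nil), want...)
	sort.Strings(expected)
	if strings.Join(got, ",") != strings.Join(expected, ",") {
		return fmt.Errorf("zones %v do not match expected %v", got, expected)
	}
	return nil
}

func main() {
	pv := map[string]string{gaZoneKey: "us-central1-a__us-central1-b"}
	err := checkPVZones(pv, []string{"us-central1-b", "us-central1-a"})
	fmt.Println("check result:", err)
}
```

With a check of this shape, a 1.20 test binary would pass against both a 1.20 cluster (beta label) and a 1.21 cluster (GA label).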

@msau42
Member

msau42 commented Feb 2, 2021

/assign @mattcary
cc @wongma7 @andyzhangx @jsafrane FYI for other clouds

@msau42
Member

msau42 commented Feb 2, 2021

oops forgot one cc @divyenpatel @xing-yang as fyi

@Jiawei0227
Contributor Author

/retest

@msau42
Member

msau42 commented Feb 3, 2021

/lgtm
/approve
/assign @cheftako
for cloud provider updates

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2021
@msau42
Member

msau42 commented Feb 3, 2021

Can you also update the release notes saying that newly provisioned PVs will no longer have the beta label, so any outside tools need to be fixed (and add an ACTION REQUIRED)

@k8s-ci-robot k8s-ci-robot added release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Feb 3, 2021
@Jiawei0227
Contributor Author

> Can you also update the release notes saying that newly provisioned PVs will no longer have the beta label, so any outside tools need to be fixed (and add an ACTION REQUIRED)

Done

@@ -515,7 +515,10 @@ func (g *Cloud) GetLabelsForVolume(ctx context.Context, pv *v1.PersistentVolume)

// If the zone is already labeled, honor the hint
name := pv.Spec.GCEPersistentDisk.PDName
- zone := pv.Labels[v1.LabelFailureDomainBetaZone]
+ zone := pv.Labels[v1.LabelTopologyZone]
Member

FYI, we are no longer accepting enhancements to legacy-cloud-providers. Any such enhancements must be done out of tree. As this is clearly just a bug fix, I am going to let it through.

@cheftako
Member

cheftako commented Feb 3, 2021

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheftako, Jiawei0227, msau42

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2021
@Jiawei0227
Contributor Author

/retest

@AlwaySummit

Hi @Jiawei0227, sorry to trouble you, but I'm hitting an error due to missing labels on a PV, and I'm not sure whether it is related to this PR. In version 1.18, neither failure-domain.beta.kubernetes.io/zone nor topology.kubernetes.io/zone is set correctly on the PV. My guess is that the in-tree plugin was deprecated and the label-setting logic was removed at the same time, but I have no concrete evidence since I'm not an expert in this area. Could you please share more insight? Thanks.
BTW, please see #103403 for more details on this issue.

Development

Successfully merging this pull request may close these issues.

PD in-tree plugin: migrate to stable zone/region labels
6 participants