cloud-controller-manager: routes controller should not depend on --allocate-node-cidrs #97029

andrewsykim
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:
Enabling the routes controller should be independent of --allocate-node-cidrs. Checking --allocate-node-cidrs made a lot of sense in kube-controller-manager, where node IPAM was required to ensure the routes controller had node.spec.podCIDR set correctly. In cloud-controller-manager this is not necessarily true, since it does not run node IPAM (in most clusters) and instead depends on the nodeipam controller running in kube-controller-manager. In some cases, an external controller may actually be allocating CIDRs to node.spec.podCIDR. Enablement of the routes controller should depend only on the --configure-cloud-routes flag.
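
As a rough illustration of the change described above, the following sketch compares the old and proposed enablement gates across all flag combinations. The `sharedFlags` type and both helper functions are hypothetical stand-ins written for this example; they are not the actual cloud-controller-manager config types or code.

```go
package main

import "fmt"

// sharedFlags is a hypothetical stand-in for the two KubeCloudShared
// settings discussed in this PR; it is not the real config type.
type sharedFlags struct {
	AllocateNodeCIDRs    bool // --allocate-node-cidrs
	ConfigureCloudRoutes bool // --configure-cloud-routes
}

// enabledBefore models the old gate: both flags had to be true.
func enabledBefore(f sharedFlags) bool {
	return f.AllocateNodeCIDRs && f.ConfigureCloudRoutes
}

// enabledAfter models the proposed gate: only --configure-cloud-routes matters.
func enabledAfter(f sharedFlags) bool {
	return f.ConfigureCloudRoutes
}

func main() {
	// Print every flag combination so the single differing case is visible.
	for _, allocate := range []bool{false, true} {
		for _, routes := range []bool{false, true} {
			f := sharedFlags{AllocateNodeCIDRs: allocate, ConfigureCloudRoutes: routes}
			fmt.Printf("allocate-node-cidrs=%-5v configure-cloud-routes=%-5v before=%-5v after=%v\n",
				allocate, routes, enabledBefore(f), enabledAfter(f))
		}
	}
}
```

The two gates differ only when --allocate-node-cidrs is false and --configure-cloud-routes is true, which is the case this PR is meant to unblock.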

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

cloud-controller-manager: routes controller should not depend on --allocate-node-cidrs

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 3, 2020
@k8s-ci-robot
Contributor

@andrewsykim: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/cloudprovider labels Dec 3, 2020
@k8s-ci-robot k8s-ci-robot added the sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. label Dec 3, 2020
@andrewsykim
Member Author

/assign @cheftako @cici37 @nicolehanjing

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 3, 2020
@@ -104,15 +104,15 @@ func startServiceController(ctx *config.CompletedConfig, cloud cloudprovider.Int
}

 func startRouteController(ctx *config.CompletedConfig, cloud cloudprovider.Interface, stopCh <-chan struct{}) (http.Handler, bool, error) {
-	if !ctx.ComponentConfig.KubeCloudShared.AllocateNodeCIDRs || !ctx.ComponentConfig.KubeCloudShared.ConfigureCloudRoutes {
-		klog.Infof("Will not configure cloud provider routes for allocate-node-cidrs: %v, configure-cloud-routes: %v.", ctx.ComponentConfig.KubeCloudShared.AllocateNodeCIDRs, ctx.ComponentConfig.KubeCloudShared.ConfigureCloudRoutes)
+	if !ctx.ComponentConfig.KubeCloudShared.ConfigureCloudRoutes {
Member Author

Contemplating whether this is considered a breaking change, since AllocateNodeCIDRs is false by default and ConfigureCloudRoutes is true by default. This means that if you previously set neither of these flags, this check now passes where before it did not. I think it isn't a breaking change because:

  1. the cloud provider still needs to implement Routes() to enable the controller.
  2. even if the cloud provider implements Routes(), the routes controller itself is a no-op if node.spec.podCIDR is not set. If node.spec.podCIDR is set, it means --allocate-node-cidrs is enabled on kube-controller-manager, which means we should be running this controller.
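
The two safeguards listed above can be sketched roughly as follows. This is a simplified model, not the real routes controller: the `node`, `cloud`, and `fakeCloud` types and the `routesToCreate` helper are hypothetical names introduced for illustration, and the real `Routes()` method on the cloud provider interface has a different signature.

```go
package main

import "fmt"

// node is a hypothetical, trimmed-down view of a Kubernetes node.
type node struct {
	Name    string
	PodCIDR string // stands in for node.spec.podCIDR
}

// cloud is a hypothetical, simplified provider interface for this sketch.
type cloud interface {
	// Routes reports whether the provider supports route management.
	Routes() bool
}

// fakeCloud is a test double whose route support is configurable.
type fakeCloud struct{ supportsRoutes bool }

func (c fakeCloud) Routes() bool { return c.supportsRoutes }

// routesToCreate returns a node-name to pod-CIDR map of routes that would be
// programmed. It yields nothing when the flag is off, when the provider does
// not support routes, or when no node has a pod CIDR assigned.
func routesToCreate(configureCloudRoutes bool, c cloud, nodes []node) map[string]string {
	routes := map[string]string{}
	if !configureCloudRoutes || !c.Routes() {
		return routes
	}
	for _, n := range nodes {
		if n.PodCIDR == "" {
			continue // no CIDR allocated, so there is nothing to route
		}
		routes[n.Name] = n.PodCIDR
	}
	return routes
}

func main() {
	nodes := []node{{Name: "node-a"}, {Name: "node-b", PodCIDR: "10.0.1.0/24"}}
	fmt.Println(len(routesToCreate(true, fakeCloud{supportsRoutes: true}, nodes)))
	fmt.Println(len(routesToCreate(true, fakeCloud{supportsRoutes: false}, nodes)))
}
```

Under these assumptions, passing the flag check alone does not produce any routes; a provider without route support or nodes without a pod CIDR both leave the result empty.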

Member Author

@cheftako would still like your thoughts/opinions on this

Member

My worry would be the managed vs. unmanaged distinction. Take the GCP cloud provider: it has to implement Routes(), as the same cloud provider implementation is used in both managed (GKE) and unmanaged releases. The managed case is not an issue, as we have a dedicated team of experts keeping it working.

For unmanaged releases, allocate-node-cidrs defaults to true and enable-ip-aliases to false (i.e. configure-cloud-routes to true). Given that allocate-node-cidrs defaults to true and Routes() is implemented, the unmanaged instance would have failed this case, but we're saved by configure-cloud-routes being true. I'm not sure I'm sanguine about being safe in the other cases, where we may want different behaviors between managed and unmanaged Kubernetes deployments.

@nicolehanjing
Member

/lgtm I think this is a reasonable fix for the function

@cici37
Contributor

cici37 commented Dec 3, 2020

/retest

@andrewsykim
Member Author

Failing CI looks legit, I'll take a look soon

…locate-node-cidrs

Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
…ud-routes

Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
…roller-manager secure serving tests

Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 3, 2020
@andrewsykim andrewsykim force-pushed the routes-controller-allocate-node-cidr-check branch from ca0a7ef to 0c90d8d Compare December 3, 2020 19:03
@k8s-ci-robot k8s-ci-robot added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Dec 3, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cici37
Contributor

cici37 commented Dec 3, 2020

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@cici37
Contributor

cici37 commented Dec 3, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 3, 2020
@cici37
Contributor

cici37 commented Dec 3, 2020

/test pull-kubernetes-e2e-gce-ubuntu-containerd

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cloudprovider area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

5 participants