Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated cherry pick of #103532: Service: Fix semantics for Update wrt allocations #104673

Conversation

thockin
Copy link
Member

@thockin thockin commented Aug 31, 2021

Cherry pick of #103532 #104601 on release-1.21.

#103532: Service: Fix semantics for Update wrt allocations

This has now manifested as user-reported outages (errant controller looping and allocating ports causing DoS in HA apiservers), hence the backport.

It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them. They same
that YAML somewhere and later use it again in an Update, but then it
fails.

That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed. If it was just NodePort, they would
actually succeed and reallocate a new port.

After this change, we try to "patch" updates where the user did not
specify those values from the old object.

/kind bug
/kind api-change

When using `kubectl replace` (or the equivalent API call) on a Service, the caller no longer needs to do a read-modify-write cycle to fetch the allocated values for `.spec.clusterIP` and `.spec.ports[].nodePort`.  Instead the API server will automatically carry these forward from the original object when the new object does not specify them.  

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Aug 31, 2021
@k8s-ci-robot k8s-ci-robot added do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 31, 2021
@k8s-ci-robot
Copy link
Contributor

@thockin: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 31, 2021
@thockin thockin added kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Aug 31, 2021
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 31, 2021
@thockin thockin added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Aug 31, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Aug 31, 2021
@thockin
Copy link
Member Author

thockin commented Aug 31, 2021

/hold
until I can evaluate the cherrypick manually

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 31, 2021
@thockin thockin force-pushed the automated-cherry-pick-of-#103532-#104601-upstream-release-1.21 branch from e8bd33e to 27ce2fc Compare August 31, 2021 15:41
It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them.  They same
that YAML somewhere and later use it again in an Update, but then it
fails.

That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed.  If it was just NodePort, they would
actually succeed and reallocate a new port.

After this change, we try to "patch" updates where the user did not
specify those values from the old object.
Prior to 1.22 a user could change NodePort values within a service
during an update, and the apiserver would allocate values for any that
were not specified.

Consider a YAML like:

```
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  type: NodePort
  ports:
  - name: p
    port: 80
  - name: q
    port: 81
  selector:
    app: foo
```

When this is created, nodeport values will be allocated for each port.
Something like:

```
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  clusterIP: 10.0.149.11
  type: NodePort
  ports:
  - name: p
    nodePort: 30872
    port: 80
    protocol: TCP
    targetPort: 9376
  - name: q
    nodePort: 31310
    port: 81
    protocol: TCP
    targetPort: 81
  selector:
    app: foo
```

If the user PUTs (kubectl replace) the original YAML, we would see that
`.nodePort = 0`, and allocate new ports.  This was ugly at best.

In 1.22 we fixed this to not allocate new values if we still had the old
values, but instead re-assign them.  Net new ports would still be seen
as `.nodePort = 0` and so new allocations would be made.

This broke a corner case as follows:

Prior to 1.22, the user could PUT this YAML:

```
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  type: NodePort
  ports:
  - name: p
    nodePort: 31310 # note this is the `q` value
    port: 80
  - name: q
    # note this nodePort is not specified
    port: 81
  selector:
    app: foo
```

The `p` port would take the `q` port's value.  The `q` port would be
seen as `.nodePort = 0` and a new value allocated.  In 1.22 this results
in an error (duplicate value in `p` and `q`).

This is VERY minor but it is an API regression, which we try to avoid,
and the fix is not too horrible.

This commit adds more robust testing of this logic.
@thockin thockin force-pushed the automated-cherry-pick-of-#103532-#104601-upstream-release-1.21 branch from 27ce2fc to 7ef53ae Compare August 31, 2021 15:48
@thockin
Copy link
Member Author

thockin commented Aug 31, 2021

cc @liggitt @khenidak

@thockin thockin removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 31, 2021
@aojea
Copy link
Member

aojea commented Sep 4, 2021

Kal may have missed this
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 4, 2021
@thockin
Copy link
Member Author

thockin commented Sep 9, 2021

ping @kubernetes/release-managers

Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: saschagrunert, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saschagrunert saschagrunert added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. and removed do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. labels Sep 10, 2021
@cpanato
Copy link
Member

cpanato commented Sep 10, 2021

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@cpanato
Copy link
Member

cpanato commented Sep 10, 2021

/test pull-kubernetes-node-e2e

@k8s-ci-robot k8s-ci-robot merged commit 394f177 into kubernetes:release-1.21 Sep 10, 2021
@liggitt liggitt added the kind/regression Categorizes issue or PR as related to a regression from a prior release. label Apr 27, 2022
@thockin thockin deleted the automated-cherry-pick-of-#103532-#104601-upstream-release-1.21 branch July 6, 2022 21:03
@liggitt liggitt changed the title Automated cherry pick of #103532: Service: Fix semantics for Update wrt allocations #104601: Fix a small regression in Service updates Automated cherry pick of #103532: Service: Fix semantics for Update wrt allocations Sep 9, 2023
@liggitt liggitt removed the kind/regression Categorizes issue or PR as related to a regression from a prior release. label Sep 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants