New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kube-proxy treat ExternalIPs as ClusterIPs #96296
Conversation
@aojea: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/assign @danwinship @thockin |
/test pull-kubernetes-e2e-gce-ubuntu |
/test pull-kubernetes-e2e-gce-ubuntu-canary |
@aojea: The specified target(s) for
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test pull-kubernetes-e2e-gce-canary |
@aojea: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
// This is imperfect in the face of network plugins that might not use a bridge, but we can revisit that later. | ||
externalTrafficOnlyArgs := append(args, | ||
"-m", "physdev", "!", "--physdev-is-in", | ||
"-m", "addrtype", "!", "--src-type", "LOCAL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switching to the local traffic detector is probably not currently a compatible replacement for this rule, since we don't currently have the physdev-based local traffic detector, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the discussion is if this rule is correct, that it seems it is not, because CNIs not using bridges are not affected ...
what we want to do is to SNAT all the non-local traffic that is what the local detector should do, independently of the implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is quite correct.
Before this change, we try to:
- always MASQ
- send all from-off-node traffic to the service chain
- send from-on-node traffic to the service chain IFF the dst is "local"
- send any other from-on-node traffic to the network
(4) matters because the IP presumably might do some processing/telemtry/whatever and we don't want to short-circuit that
(3) matters because the result of (4) will be wrong if the IP is local
Now, most CNIs, including kubenet (I think) don't use the bridge any more so we effectively do:
- always MASQ
- send all traffic to the service chain
What you do after this change is:
- MASQ from-off-node traffic if policy is not local-only (seems ok)
- send all traffic to the service chain
Your (1) seems like an improvement.
Your (2) seems like not worse than what has ACTUALLY been happening.
This field is rather under-specified and, in thinking about it, it was probably wrong before. So OK. I think I am in. I have a nit above, but I am happy to see physdev go away
@@ -1179,6 +1179,37 @@ var _ = SIGDescribe("Services", func() { | |||
framework.ExpectNoError(err) | |||
}) | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this? C? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cherry picked the previous test, learning something new every day :)
sorry, catching up on #95785 out of order... |
// This is imperfect in the face of network plugins that might not use a bridge, but we can revisit that later. | ||
externalTrafficOnlyArgs := append(args, | ||
"-m", "physdev", "!", "--physdev-is-in", | ||
"-m", "addrtype", "!", "--src-type", "LOCAL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is quite correct.
Before this change, we try to:
- always MASQ
- send all from-off-node traffic to the service chain
- send from-on-node traffic to the service chain IFF the dst is "local"
- send any other from-on-node traffic to the network
(4) matters because the IP presumably might do some processing/telemtry/whatever and we don't want to short-circuit that
(3) matters because the result of (4) will be wrong if the IP is local
Now, most CNIs, including kubenet (I think) don't use the bridge any more so we effectively do:
- always MASQ
- send all traffic to the service chain
What you do after this change is:
- MASQ from-off-node traffic if policy is not local-only (seems ok)
- send all traffic to the service chain
Your (1) seems like an improvement.
Your (2) seems like not worse than what has ACTUALLY been happening.
This field is rather under-specified and, in thinking about it, it was probably wrong before. So OK. I think I am in. I have a nit above, but I am happy to see physdev go away
Currently kube-proxy treat ExternalIPs differently depending on: - the traffic origin - if the ExternalIP is present or not in the system. It also depends on the CNI implementation to discriminate between local and non-local traffic. Since the ExternalIP belongs to a Service, we can avoid the roundtrip of sending outside the traffic originated in the cluster. Also, we leverage the new LocalTrafficDetector to detect the local traffic and not rely on the CNI implementations for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aojea, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
writeLine(proxier.natRules, append(args, "-j", string(KubeMarkMasqChain))...) | ||
// This masquerades off-cluster traffic to a External IP. | ||
if proxier.localDetector.IsImplemented() { | ||
writeLine(proxier.natRules, proxier.localDetector.JumpIfNotLocal(args, string(KubeMarkMasqChain))...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a while since I've read the local detector code, but does this also account for traffic that would be caught with --src-type LOCAL
?
Can we assume that externalIP should bind locally in the same way status.ingress[i].ip
(a.k.a loadbalancerIP) would? Loadbalancer IP has a check for --src-type LOCAL
which accounts for host network traffic to external IPs. Should we add the same here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nvm, this would end up jumping to svcXLBChain
anyways which would have the --src-type LOCAL
check if traffic was local.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this also account for traffic that would be caught with
--src-type LOCAL
?
Curiously, NO. It does not. The "by CIDR" implementation will not catch hostNetwork pods. Looking at how it is used today, it seems like we handle it specially ("// Next, redirect all src-type=LOCAL -> LB IP to the service chain") or is moot (MARK_MASQ does nothing).
I wonder if we should fix that more completely. Awkward because there are some other rules with the same predicates (MARK_MASQ on the above case).
/hold |
I don't think the CVE makes any difference here. It's still a bad feature
that should be disabled on most clusters.
…On Mon, Dec 7, 2020 at 3:13 PM Antonio Ojea ***@***.***> wrote:
/hold
this may not be appropiate due to the CVE opened to ExternalIPs
#97110 <#97110>
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#96296 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABKWAVDU32URLXSQH4K5PQTSTVOTBANCNFSM4TMM6Y5Q>
.
|
/hold cancel 😅 👍 |
/kind cleanup
What this PR does / why we need it:
Currently kube-proxy treat ExternalIPs differently depending on
were the traffic is originated and if the ExternalIP is present
or not in the system. It also depends on the CNI implementation to
discriminate between local and non-local traffic.
Since the ExternalIP belongs to a Service, we can avoid the roundtrip
of sending the traffic originated internally outside, if the ExternalIP
is not present in the host and send it directly to the Service.
Also, we leverage the new LocalTrafficDetector to detect the local
traffic and not rely on the CNI implementations for this.
Fixes #95785
Special notes for your reviewer: