Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check Kubelet is running with correct Windows Permissions #96616

Merged

Conversation

perithompson
Copy link

@perithompson perithompson commented Nov 16, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:
This PR adds a Windows-specific permissions check on the kubelet startup. Currently, the log file outputs an error on windows stating that the UID is -1 as UID does not exist in a windows context.
Which issue(s) this PR fixes:

Fixes #96512

Special notes for your reviewer:
As part of this PR I have moved the existing checkPermissions function to server_others.go to replicate similar behaviour in kube-proxy as the checkPermissions is not linux specific it should be build whenever not running on windows.

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

The token.IsMember(sid) returns TRUE if the caller's process is a member of the Administrators local group. Caller is NOT
expected to be impersonating anyone and is expected to be able to open its own process and process token. In order for the process to "Run as administrator" the excuting user must be in the BUILTIN\Administrators Group so I have added a check for that too.

More information about this can be found in in the c++ implementation routine and this go issue

Release Notes

Kubelet.exe on Windows now checks that the process running as administrator and the executing user account is listed in the built-in administrators group.  This is the equivalent to checking the process is running as uid 0.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 16, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @perithompson!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 16, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @perithompson. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Nov 16, 2020
@jayunit100
Copy link
Member

@perithompson ok, next steps get your CLAs sorted internally (i.e. at vmware and sign the cla contributor stuff)

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 16, 2020
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 17, 2020
@perithompson
Copy link
Author

/sig windows

@k8s-ci-robot k8s-ci-robot added the sig/windows Categorizes an issue or PR as relevant to SIG Windows. label Nov 17, 2020
@perithompson perithompson marked this pull request as ready for review November 17, 2020 11:26
@perithompson perithompson changed the title (WIP) Check Kubelet is running with correct Windows Permissions Check Kubelet is running with correct Windows Permissions Nov 17, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 17, 2020
@ehashman ehashman added this to Triage in SIG Node PR Triage Jan 6, 2021
@@ -135,7 +137,7 @@ go_library(
],
"@io_bazel_rules_go//go/platform:windows": [
"//pkg/windows/service:go_default_library",
"//vendor/golang.org/x/sys/windows:go_default_library",
"//vendor/golang.org/x/sys/windows:go_default_library",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments here: Extra space.

@marosset
Copy link
Contributor

marosset commented Mar 9, 2021

/test pull-kubernetes-e2e-aks-engine-azure-windows
/test pull-kubernetes-e2e-aks-engine-windows-containerd

@perithompson
Copy link
Author

perithompson commented Mar 9, 2021

@perithompson can you rebase the PR?

Not sure how you did it @marosset but it suddenly realised it was ok 🙂

@marosset
Copy link
Contributor

marosset commented Mar 9, 2021

/retest

1 similar comment
@perithompson
Copy link
Author

/retest

@ehashman ehashman moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage Mar 9, 2021
@marosset
Copy link
Contributor

/retest

cmd/kubelet/app/server_windows.go Outdated Show resolved Hide resolved
cmd/kubelet/app/server_windows.go Outdated Show resolved Hide resolved
@adisky
Copy link
Contributor

adisky commented Mar 10, 2021

@perithompson
Copy link
Author

@perithompson The e2e test failures looks legit failures https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/96616/pull-kubernetes-e2e-aks-engine-azure-windows/1369474116045770752

From talking with some of the MSFT guys this was test infra related, but agree it does always seem like the same problem and I was thinking it was something to do with CSI that was causing it. I wouldn't have expected the kubelet to even start with where the checkPermissions function is called, if there was an error in that code then kubelet would fail to start and all the e2es would go red

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 10, 2021
@perithompson
Copy link
Author

/retest

@perithompson
Copy link
Author

/test pull-kubernetes-e2e-kind

@perithompson
Copy link
Author

@jsturtevant guessing/hoping these are still the same failures?

Mar 10 14:32:59.730: INFO: 
Latency metrics for node k8s-master-36380154-0
Mar 10 14:32:59.730: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "azuredisk-9336" for this suite.
• Failure [413.600 seconds]
Dynamic Provisioning
/home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:38
  [single-az]
  /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:41
    should create a statefulset object, write and read to it, delete the pod and write and read to it again [kubernetes.io/azure-disk] [disk.csi.azure.com] [Windows] [It]
    /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/dynamic_provisioning_test.go:702
    Unexpected error:
      
        <*errors.errorString | 0xc0003a22b0>: {
            s: "timed out waiting for the condition",
        }
        timed out waiting for the condition
    occurred
    /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/testsuites/testsuites.go:634
------------------------------
SSSSSSSSSSSSSSSSSSS
[Fail] Dynamic Provisioning [single-az] [It] should create a statefulset object, write and read to it, delete the pod and write and read to it again [kubernetes.io/azure-disk] [disk.csi.azure.com] [Windows] 
/home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/testsuites/testsuites.go:634
Ran 9 of 44 Specs in 1796.376 seconds
FAIL! -- 8 Passed | 1 Failed | 0 Pending | 35 Skipped
--- FAIL: TestE2E (1796.38s)
FAIL
FAIL	sigs.k8s.io/azuredisk-csi-driver/test/e2e	1796.452s
FAIL
make: *** [Makefile:112: e2e-test] Error 1
2021/03/10 14:33:06 process.go:155: Step 'make e2e-test' finished in 31m43.364986236s
2021/03/10 14:33:06 aksengine_helpers.go:424: downloading /root/tmp863472555/log-dump.sh from https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:33:06 util.go:68: curl https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:33:06 process.go:153: Running: chmod +x /root/tmp863472555/log-dump.sh
 Mar 10 14:16:26.559: INFO: 
Latency metrics for node k8s-master-23042803-0
Mar 10 14:16:26.559: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "azurefile-5802" for this suite.
• Failure [305.802 seconds]
Dynamic Provisioning
/home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/dynamic_provisioning_test.go:40
  should create a volume on demand with useDataPlaneAPI [kubernetes.io/azure-file] [file.csi.azure.com] [Windows] [It]
  /home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/dynamic_provisioning_test.go:646
[Fail] Dynamic Provisioning [It] should create a volume on demand with useDataPlaneAPI [kubernetes.io/azure-file] [file.csi.azure.com] [Windows] 
/home/prow/go/src/sigs.k8s.io/azurefile-csi-driver/test/e2e/testsuites/testsuites.go:219
Ran 7 of 28 Specs in 709.103 seconds
FAIL! -- 6 Passed | 1 Failed | 0 Pending | 21 Skipped
--- FAIL: TestE2E (709.10s)
FAIL
FAIL	sigs.k8s.io/azurefile-csi-driver/test/e2e	709.155s
FAIL
make: *** [Makefile:84: e2e-test] Error 1
2021/03/10 14:16:38 process.go:155: Step 'make e2e-test' finished in 13m7.41104087s
2021/03/10 14:16:38 aksengine_helpers.go:424: downloading /root/tmp863250335/log-dump.sh from https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:16:38 util.go:68: curl https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/hack/log-dump/log-dump.sh
2021/03/10 14:16:39 process.go:153: Running: chmod +x /root/tmp863250335/log-dump.sh 

@jayunit100
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 10, 2021
@marosset
Copy link
Contributor

/retest

@k8s-ci-robot
Copy link
Contributor

@perithompson: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-e2e-aks-engine-azure-windows ad8de3818efc9253e8aac5ba89fd3dfa33302588 link /test pull-kubernetes-e2e-aks-engine-azure-windows
pull-kubernetes-e2e-azure-file-windows 46738b7 link /test pull-kubernetes-e2e-azure-file-windows
pull-kubernetes-e2e-azure-disk-windows 46738b7 link /test pull-kubernetes-e2e-azure-disk-windows

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mrunalp
Copy link
Contributor

mrunalp commented Mar 11, 2021

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jayunit100, mrunalp, perithompson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 11, 2021
@mrunalp mrunalp moved this from Needs Approver to Done in SIG Node PR Triage Mar 11, 2021
@k8s-ci-robot k8s-ci-robot merged commit a8ac0c9 into kubernetes:master Mar 12, 2021
SIG-Windows automation moved this from In Review (v1.21) to Done (v1.21) Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

cheking UID's in windows, follow on (quick pr)