Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow all containers in the kubeflow namespace to run as non root #1756

Closed
wants to merge 10 commits into from

Conversation

juliusvonkohout
Copy link
Member

@juliusvonkohout juliusvonkohout commented Feb 25, 2021

My goal is to run kubeflow completly rootless except for istio-system.
I will use the istio CNI plugin to use rootless init containers.

The following changes are necessary:

  1. Use a port above 1024 in kubeflow-pipelines-profile-controller deployment to allow runasnonroot fix: runasnonroot for kubeflow-pipelines-controller pipelines#5294
  2. The metadata-db readiness probe is wrong. It does only work if you run the container as root. You can run the container with a restricted psp as another user ("daemon") without root rights. Someone forgot to take into account that the readiness probe should use the correct username "root" form the environement variable instead of relying on the wrong assumption that your container user is always root and can be abused as mysql user. @Bobgy wants to merge the deployments metadata-db(mysql 8.0) and mysql(mysql5.6) anyway feat(deployment): update mysql to 5.7 and support running as nonroot pipelines#5278
  3. Use image:mysql:8.0 instead of mysql:8.0.3 for the metadata-db deployment
  4. Use mysql 5.7 instead of 5.6 for the mysql deployment and make sure that it runs as non-root
  5. In tandem with Rebase Fix wrong port in admission-webhook-controller #5637 kubeflow#5668 we can change admission-webhook-deployment port from 443 to 4443 which then allows runasnonroot
  6. seldon-controller-manager port has already been changed from 443 to 4443 in 48ea3f8
  7. remove spartakus from kfctl_k8s_istio.yaml #1759 should cover the spartakus-volunteer
  8. Cache-server and cache-deployment also need small changes, but maybe in another PR. It might just be enough to switch from /bin/bin to /tmp in https://github.com/kubeflow/pipelines/blob/280cae3718269d5522cafafa1650204fa0bb8f2e/backend/src/cache/deployer/deploy-cache-service.sh#L33
  9. Kubeflow 1.3 uses argo 2.12 where even upstream uses runasnonroot https://github.com/argoproj/argo-workflows/blob/f1e0c6174b48af69d6e8ecd235a2d709f44f8095/manifests/base/workflow-controller/workflow-controller-deployment.yaml#L40

I really hope that this makes it into kubeflow 1.3, because rootless containers are forbidden in many enterprise environments

Checklist:

  • Unit tests pass:
    Make sure you have installed kustomize == 3.2.1
    1. make generate-changed-only
    2. make test

@aws-kf-ci-bot
Copy link

Hi @juliusvonkohout. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: juliusvonkohout
To complete the pull request process, please assign bobgy after the PR has been reviewed.
You can assign the PR to them by writing /assign @bobgy in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@juliusvonkohout juliusvonkohout changed the title WIP runasnonroot for more containers Allow all containers in the kubeflow namespace to run as non root Feb 25, 2021
@juliusvonkohout juliusvonkohout deleted the patch-1 branch February 25, 2021 19:31
@juliusvonkohout juliusvonkohout restored the patch-1 branch February 25, 2021 19:31
@juliusvonkohout
Copy link
Member Author

juliusvonkohout commented Feb 26, 2021

@Bobgy @pvaneck sorry for accidentally deleting and restoring the branch. I still hope that this is reviewed and merged before Kubeflow 1.3

@Bobgy
Copy link
Contributor

Bobgy commented Feb 26, 2021

Hi @juliusvonkohout , I think that sounds great, but we are in the process of moving development to upstream repos.

Would you mind submitting these changes in those upstreams after the transition

@davidspek
Copy link
Contributor

@juliusvonkohout I think this is a great effort. Regarding the cache deployer, it might also be interesting to look at kubeflow/pipelines#4695 as well.

@juliusvonkohout
Copy link
Member Author

Hi @juliusvonkohout , I think that sounds great, but we are in the process of moving development to upstream repos.

Would you mind submitting these changes in those upstreams after the transition

@Bobgy
I really need those changes in Kubeflow 1.3 for enterprise customers. As long as this is possible i will do whatever is necessary. Just tell me as soon as possible when i can work on something.

By the way what is your time estimation for Kubeflow 1.3?

@davidspek Yes i will look into the cache deployer with low priority as soon as the main parts are running rootless. In the mean time i will work on getting a rootless istio-init with minikube and cilium CNI.

@Bobgy
Copy link
Contributor

Bobgy commented Feb 28, 2021

@juliusvonkohout for KFP, you can contribute to github.com/kubeflow/pipelines/manifests/kustomize. @yanniszark do we have a map of where each application manifest is moved to

@juliusvonkohout
Copy link
Member Author

Should everything regarding apps/pipeline/upstream/installs/multi-user/pipelines-profile-controller/ not be merged in this repository? I could not find another corresponding repository.

@juliusvonkohout
Copy link
Member Author

Most of the fixes are upstream, but in 1.3.0 rc the newly developed Jupyterlab vscode and rstudio images are broken kubeflow/kubeflow#5808

@juliusvonkohout
Copy link
Member Author

juliusvonkohout commented Apr 26, 2021

except for the cache deployment and the notebooks everything is working with istio-cni.
I just had to disable the optional cache and write a proper single image that supports jupyterlab, Rstudio and code-server without the problematic (setuid) s6 overlay. And of course you need to write pipelines with k8sapi or use a newer argo with the emissary executor.

The only root container is the istio cni installer in kube-system. But that is not n security issue.

@davidspek
Copy link
Contributor

For the record, the fix for notebook server images is to extend the notebook controller to set the SecurityContext in the StatefulSet. Nothing in the containers actually runs as root, and the S6 overlay can be used once the notebook controller is updated to accommodate this.

@yanniszark
Copy link
Contributor

@juliusvonkohout changes to manifests under apps should happen in their respective upstream app repos.
For the rest of the changes, are they still relevant?

@juliusvonkohout
Copy link
Member Author

@juliusvonkohout changes to manifests under apps should happen in their respective upstream app repos.
For the rest of the changes, are they still relevant?

Most of them have already been merged in 5-10 other pull requests. I am primarily waiting for Argo and the emissary executer on which @NikeNano and @Bobgy are working on.

I will just close this PR at some point.

@stale
Copy link

stale bot commented Oct 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Oct 2, 2021
@juliusvonkohout
Copy link
Member Author

This is the most important next step #2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants