# Posit Team Kubernetes (k8s)

Also known as: How the heck do I reliably & repeatably deploy Posit Team (wrappers for RStudio, Jupyter Notebook, artefact management, et al.) locally and also in managed k8s environments?

## Good first reading / watching

To get a grasp of the principles Kubernetes is trying to solve (the orchestration of compute workloads across machines), the Google paper *Large-scale cluster management at Google with Borg* gives good background, as it inspired the creation of Kubernetes. If you prefer video, there's even *Kubernetes: The Documentary* (25 min).

## Prerequisites

1. Install kubectl - the command line interface (CLI) tool for interacting with your Kubernetes cluster. Enable kubectl autocompletion - you'll really, really benefit from this, do it!
2. Install minikube for a local Kubernetes cluster. Enable minikube autocompletion - you'll really, really benefit from this too, do it!

Enable KVM mode for minikube - we need to use virtual machines for the nodes, because the storage we're using requires iSCSI, which isn't easily containerised. Follow the minikube kvm2 driver docs.

First, check that your CPU supports virtualisation:

```
sudo apt install cpu-checker
kvm-ok
```

Expected output:

```
INFO: /dev/kvm exists
KVM acceleration can be used
```

Install the KVM/QEMU and libvirt packages:

```
sudo apt install qemu-system libvirt-clients libvirt-daemon-system
```

Optionally, install Virtual Machine Manager, a convenient simple UI for viewing the virtual machines. It's rare you'll need this, but if you're more visually inclined it's helpful:

```
sudo apt install virt-manager
```

Validate with `virt-host-validate` (it's OK for LXC to have failures, since we're using KVM/QEMU):

```
sudo virt-host-validate
QEMU: Checking for hardware virtualization                                 : PASS
QEMU: Checking if device /dev/kvm exists                                   : PASS
QEMU: Checking if device /dev/kvm is accessible                            : PASS
QEMU: Checking if device /dev/vhost-net exists                             : PASS
QEMU: Checking if device /dev/net/tun exists                               : PASS
QEMU: Checking for cgroup 'cpu' controller support                         : PASS
QEMU: Checking for cgroup 'cpuacct' controller support                     : PASS
QEMU: Checking for cgroup 'cpuset' controller support                      : PASS
QEMU: Checking for cgroup 'memory' controller support                      : PASS
QEMU: Checking for cgroup 'devices' controller support                     : PASS
QEMU: Checking for cgroup 'blkio' controller support                       : PASS
QEMU: Checking for device assignment IOMMU support                         : PASS
QEMU: Checking if IOMMU is enabled by kernel                               : PASS
QEMU: Checking for secure guest support                                    : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26                                          : PASS
LXC: Checking for namespace ipc                                            : PASS
LXC: Checking for namespace mnt                                            : PASS
LXC: Checking for namespace pid                                            : PASS
LXC: Checking for namespace uts                                            : PASS
LXC: Checking for namespace net                                            : PASS
LXC: Checking for namespace user                                           : PASS
LXC: Checking for cgroup 'cpu' controller support                          : PASS
LXC: Checking for cgroup 'cpuacct' controller support                      : PASS
LXC: Checking for cgroup 'cpuset' controller support                       : PASS
LXC: Checking for cgroup 'memory' controller support                       : PASS
LXC: Checking for cgroup 'devices' controller support                      : PASS
LXC: Checking for cgroup 'freezer' controller support                      : FAIL (Enable 'freezer' in kernel Kconfig file or mount/enable cgroup controller in your system)
LXC: Checking for cgroup 'blkio' controller support                        : PASS
LXC: Checking if device /sys/fs/fuse/connections exists                    : PASS
```

Make kvm2 the default driver:

```
minikube config set driver kvm2
```

Create a multi-node k8s cluster where each node is a VM:

```
minikube start --nodes=3 --driver=kvm2 --iso-url file://$(pwd)/minikube-amd64.iso
# (This may take some time)
```

Where did minikube-amd64.iso come from? It's a custom-built minikube image with iSCSI (open-iscsi) support, which is needed for Longhorn storage - see the notes on building the minikube ISO at the end of this document.

After a while, success looks like:

```
...
> kubeadm.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
> kubelet.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
> kubectl.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
> kubeadm:  71.08 MiB / 71.08 MiB [-------------] 100.00% 9.22 MiB p/s 7.9s
> kubectl:  57.34 MiB / 57.34 MiB [-------------] 100.00% 6.16 MiB p/s 9.5s
> kubelet:  77.91 MiB / 77.91 MiB [--------------] 100.00% 6.15 MiB p/s 13s
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "node" cluster and "default" namespace by default
```

You can verify the nodes (virtual machines) are running by issuing:

```
virsh list
```

You'll see (hopefully):

```
 Id   Name           State
-------------------------------
 4    minikube       running
 5    minikube-m02   running
 6    minikube-m03   running
```


Alternatively, open the application "Virtual Machine Manager" (which you installed with
`sudo apt install virt-manager`) and you'll be able to see the same information in a GUI.


3. Install [Helm](https://helm.sh/) for installing Kubernetes packages

4. It's *extremely helpful* (but not required) to also install a container runtime on your local machine, for debugging containers. The Kubernetes project defines a [Container Runtime Interface (CRI)](https://kubernetes.io/docs/concepts/architecture/cri/), which is what allows various providers to 'provide' a [container runtime](https://kubernetes.io/docs/setup/production-environment/container-runtimes/) - such as containerd, Docker Engine, CRI-O, etc. Historically, Kubernetes shipped with hard-coded/built-in Docker support (the `dockershim`), which was eventually removed in favour of this 'pluggable' container runtime interface.
   - You might want to install [Podman](https://podman.io/) or [Docker](https://www.docker.com/). You may want to use Podman to get into the habit of rootless and ever-smaller containers (for the benefit of a smaller attack surface); however it requires some extra setup and is considered [experimental for `minikube`](https://minikube.sigs.k8s.io/docs/drivers/podman/), so we'll use Docker for testing here.

## Setup Health-check stop-point

> Run these commands to verify you have everything set up OK. There's little point continuing if you see errors at this point. If you do, stop, investigate, and resolve them before continuing.

### `kubectl` health-check

Follow [Verify `kubectl` docs](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#verify-kubectl-configuration)

At your terminal:

Input:

```bash
kubectl get pods
```

Expected output:

```
No resources found in default namespace.
```

### `helm` health-check

Input:

```bash
helm list
```

Output:

```
NAME	NAMESPACE	REVISION	UPDATED	STATUS	CHART	APP VERSION
```

## High-level ingredients

- Container orchestration: Kubernetes
- Deployment orchestration: ArgoCD
- Ingress networking: KubeVIP
- Persistent storage: Longhorn
- Relational database: PostgreSQL (Patroni)

## High-level order of install steps

(don't follow this right now, it's just an overview)

  1. Installation of Kubernetes
  2. Installation of ArgoCD into Kubernetes
  3. Installation of Longhorn (storage)
  4. Installation of Postgres (database)
  5. Installation of application(s) (e.g. Posit)

## ArgoCD installation

Why? To automate the day-to-day rollout of updates and changes via git.

https://argo-cd.readthedocs.io/en/stable/getting_started/

## Storage setup

Posit requires somewhere to store state. For this, storage needs to be configured.

https://longhorn.io/docs/1.9.1/deploy/install/install-with-argocd/

## Exposing services

Prerequisite: KubeVIP is configured (for local dev using Kind + a docker/podman network range for LoadBalancer type services).

Patching an existing service to use type LoadBalancer:

```
kubectl patch svc <service-name> -n <namespace> -p '{"spec": {"type": "LoadBalancer"}}'
```
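Equivalently, the patch can be kept in a small file and applied with kubectl's `--patch-file` flag - a sketch, where the filename `lb-patch.yaml` is just an example name:

```yaml
# lb-patch.yaml - merge patch that switches an existing Service to type LoadBalancer
spec:
  type: LoadBalancer
```

Then apply it with `kubectl patch svc <service-name> -n <namespace> --patch-file lb-patch.yaml`. Keeping the patch in a file makes it easy to version-control alongside your other manifests.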

## Posit Installation

With all the Prerequisites done, these are the "Posit Team" specific steps for installing the Posit Team components into your Kubernetes cluster locally. Recall that Posit Team Components refers to "a sales bundle of Posit Workbench, Posit Connect, and Posit Package Manager software for developing data science projects, publishing data products, and managing packages." - Posit

### Add the rstudio and bitnami helm chart repos

1. Add the Helm repo for rstudio:

   ```
   helm repo add rstudio https://helm.rstudio.com
   ```

   Output:

   ```
   "rstudio" has been added to your repositories
   ```

2. Add the Helm repo for PostgreSQL (bitnami):

   ```
   helm repo add bitnami https://charts.bitnami.com/bitnami
   helm repo update
   ```

### Verify helm repos

Input:

```
helm search repo rstudio
```

Output:

```
NAME                         	CHART VERSION	APP VERSION	DESCRIPTION
rstudio/rstudio-connect      	0.8.7        	2025.07.0  	Official Helm chart for Posit Connect
rstudio/rstudio-launcher-rbac	0.2.24       	0.2.21     	RBAC definition for the RStudio Job Launcher
rstudio/rstudio-library      	0.1.34       	0.1.34     	Helm library helpers for use by official RStudi...
rstudio/rstudio-pm           	0.5.49       	2025.04.4  	Official Helm chart for Posit Package Manager
rstudio/rstudio-workbench    	0.9.11       	2025.05.1  	Official Helm chart for Posit Workbench
rstudio/posit-chronicle      	0.4.5        	2025.08.0  	Official Helm chart for Posit Chronicle Server
```
- I would suggest making somewhere to store config (values.yaml) files, e.g.

  ```
  mkdir -p $HOME/posit-team-testing/{connect,workbench,packagemanager}
  ```

- Note that this guide does not include any information about licensing, but a quick overview:
  - For each service, define a secret containing the license file:

    ```
    kubectl create secret generic rstudio-<service>-license --from-file=<license_file>
    ```

  - In the values.yaml for that service, add a section as below:

    ```yaml
    license:
      file:
        secret: rstudio-<service>-license
        secretKey: <license_file>
    ```

  - e.g. for Workbench:

    ```
    kubectl create secret generic rstudio-workbench-license --from-file=licences/rstudio-workbench.lic
    ```

    In values.yaml:

    ```yaml
    license:
      file:
        secret: rstudio-workbench-license
        secretKey: rstudio-workbench.lic
    ```

- Also note, deletion of PersistentVolumes and PersistentVolumeClaims sometimes seems not to work properly; I'm yet to figure out how to clean these up properly

- helm uninstall needs to be used to allow some PVs and PVCs to be deleted, namely with Postgres, as the chart creates a StatefulSet pod to handle persistent storage (the template can be seen here)

- Completed values.yaml files can be found under values

## Installation

- Ensure minikube is running:

  ```
  minikube start
  ```

- Note that you can entirely reset minikube by running:

  ```
  minikube delete
  ```

### Workbench

1. Ensure prerequisites are all met
2. Create and switch to a namespace for Workbench:

   ```
   kubectl create namespace posit-workbench
   kubectl config set-context --current --namespace=posit-workbench
   ```
3. Create a PostgreSQL database (in the cluster) and a Secret containing the password
   - Create the database:

     ```
     helm upgrade --install rsw-db bitnami/postgresql \
         --set auth.database="rs-workbench" \
         --set auth.username="workbench" \
         --set auth.password="workbench"
     ```

   - Create a Secret with the password:

     ```
     kubectl create secret generic rsw-database --from-literal=password=workbench
     ```

4. Create a StorageClass with ReadWriteMany access
   - minikube automatically creates a StorageClass for us under the name standard, which you can view with `kubectl get storageclass`, allowing us to skip this step for now.
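If you're not on minikube and need to create a ReadWriteMany-capable StorageClass yourself, a minimal Longhorn-backed manifest looks roughly like this - a sketch that assumes Longhorn is already installed in the cluster; the name `longhorn-rwx` and replica count are arbitrary choices for illustration:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-rwx             # arbitrary name, referenced from values.yaml
provisioner: driver.longhorn.io  # Longhorn's CSI provisioner
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"          # replicate each volume across 3 nodes
```

Apply with `kubectl apply -f`, then set `storageClassName` in the relevant values.yaml to match.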
5. Configure Helm chart values with a values.yaml file
   - Posit maintains a Helm chart for Workbench which they recommend for deployment on Kubernetes
   - values.yaml overrides defaults specified in the Helm chart
   - The complete (and most up to date) values.yaml with all defaults can be seen here, or a more realistic one here. Below is a minimal example for testing with no license file:

   ```yaml
   # Automatically create a test user with defined credentials
   userCreate: true
   userName: rstudio
   userPassword: rstudio

   sharedStorage:
     create: true
     mount: true
     storageClassName: standard
     requests:
       storage: 100G

   pod:
     env:
       - name: WORKBENCH_POSTGRES_PASSWORD
         valueFrom:
           secretKeyRef:
             name: rsw-database
             key: password

   config:
     secret:
       database.conf:
         provider: "postgresql"
         connection-uri: "postgres://workbench@rsw-db-postgresql.posit-workbench.svc.cluster.local:5432/rs-workbench?sslmode=disable"
   ```
- Postgres URL format for an in-cluster database:
  `postgres://<username>@<service_name>.<namespace>.svc.cluster.local:<port>/<database>`
- You can test that the database is actually being used by Workbench / is otherwise accessible from inside the cluster with a temporary client pod:
  1. `kubectl run -ti --rm debug --image=postgres:latest --env="PGPASSWORD=workbench" -- bash`
  2. `psql -h rsw-db-postgresql.posit-workbench.svc.cluster.local -U workbench -d rs-workbench`
  3. `\dt` to view all tables
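The connection URI is just the generic format with this guide's values filled in; as a sanity check, you can compose it yourself in the shell (plain string interpolation, no cluster required - the values below match the Workbench example above):

```shell
#!/bin/sh
# Compose the in-cluster Postgres connection URI from its parts
PGUSER=workbench
SERVICE=rsw-db-postgresql   # the Service created by the bitnami/postgresql chart
NAMESPACE=posit-workbench
PORT=5432
DATABASE=rs-workbench

URI="postgres://${PGUSER}@${SERVICE}.${NAMESPACE}.svc.cluster.local:${PORT}/${DATABASE}?sslmode=disable"
echo "$URI"
# → postgres://workbench@rsw-db-postgresql.posit-workbench.svc.cluster.local:5432/rs-workbench?sslmode=disable
```

The same pattern gives the URLs for the Connect and Package Manager databases later on, with their own service names and namespaces substituted in.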
6. Deploy:

   ```
   helm upgrade --install rstudio-workbench-testing \
       rstudio/rstudio-workbench \
       --values path/to/values.yaml
   ```

   - The same command can be used to apply changes to values.yaml in the future
   - Check it's running with `kubectl get pod -l app.kubernetes.io/name=rstudio-workbench`

7. Make it accessible by port forwarding in a new terminal (only for testing):

   ```
   kubectl port-forward svc/rstudio-workbench-testing 3940:80
   ```

   - View in your browser at localhost:3940 and log in with the credentials:
     - User: rstudio
     - Password: rstudio

8. Stop and clean up
   - Remove the helm charts:

     ```
     helm uninstall rsw-db rstudio-workbench-testing
     ```

   - If desired, delete the PersistentVolumes and PersistentVolumeClaims:

     ```
     kubectl delete pvc/data-rsw-db-postgresql-0 pvc/rstudio-workbench-testing-shared-storage
     ```
### Connect

1. Ensure all prerequisites are met
2. Create and switch to a namespace for Connect:

   ```
   kubectl create namespace posit-connect
   kubectl config set-context --current --namespace=posit-connect
   ```

3. Create a PostgreSQL database (in the cluster) and a Secret containing the password
   - Create the database:

     ```
     helm upgrade --install rsc-db bitnami/postgresql \
         --set auth.database="rs-connect" \
         --set auth.username="connect" \
         --set auth.password="connect"
     ```

   - Create a Secret with the password:

     ```
     kubectl create secret generic rsc-database --from-literal=password=connect
     ```

4. Create a StorageClass with ReadWriteMany access
   - minikube automatically creates a StorageClass for us under the name standard, which you can view with `kubectl get storageclass`, allowing us to skip this step for now.
5. Configure Helm chart values with a values.yaml file
   - Posit maintains a Helm chart for Connect which they recommend for deployment on Kubernetes
   - values.yaml overrides defaults specified in the Helm chart
   - The complete (and most up to date) values.yaml with all defaults can be seen here, or a more realistic one here. Below is a minimal example for testing with no license file:

   ```yaml
   pod:
     env:
       - name: CONNECT_POSTGRES_PASSWORD
         valueFrom:
           secretKeyRef:
             name: rsc-database
             key: password

   homeStorage:
     create: true
     mount: true
     storageClassName: standard
     requests:
       storage: 100G

   sharedStorage:
     create: true
     mount: true
     storageClassName: standard
     requests:
       storage: 1G

   # Overrides values in /etc/rstudio-connect/rstudio-connect.gcfg inside the Pod
   config:
     Database:
       Provider: "Postgres"
     Postgres:
       URL: "postgres://connect@rsc-db-postgresql.posit-connect.svc.cluster.local:5432/rs-connect?sslmode=disable"
   ```
6. Deploy and verify it's running:

   ```
   helm upgrade --install rstudio-connect-testing rstudio/rstudio-connect --values path/to/values.yaml
   ```

   - Optionally pass the -l parameter to select the container:

     ```
     kubectl get pod -l app.kubernetes.io/name=rstudio-connect
     ```

7. Access by port-forwarding in a separate terminal window (only for testing):

   ```
   kubectl port-forward svc/rstudio-connect-testing 3941:80
   ```

   - View in your browser at localhost:3941
8. Stop and clean up
   - Remove the helm charts:

     ```
     helm uninstall rsc-db rstudio-connect-testing
     ```

   - If you deployed anything (e.g. from the Gallery tab) then there may be more pods created (underservice-<random_characters>, run-python-<random_characters>, or others). These can be deleted too.
   - The entire namespace contents (deployments, services, and pods) can be purged with `kubectl delete all --all --namespace=posit-connect`
### Package Manager

1. Ensure prerequisites are all met
2. Create and switch to a namespace for Package Manager:

   ```
   kubectl create namespace posit-pm
   kubectl config set-context --current --namespace=posit-pm
   ```

3. Create a PostgreSQL database (in the cluster) and a Secret containing the password
   - Create the database:

     ```
     helm upgrade --install rspm-db bitnami/postgresql \
         --set auth.database="rs-pm" \
         --set auth.username="packageman" \
         --set auth.password="packageman"
     ```

   - Create a Secret with the password:

     ```
     kubectl create secret generic rspm-database --from-literal=password=packageman
     ```

4. Create a StorageClass with ReadWriteMany access
   - minikube automatically creates a StorageClass for us under the name standard, which you can view with `kubectl get storageclass`, allowing us to skip this step for now.
5. Configure Helm chart values with a values.yaml file
   - Posit maintains a Helm chart which they recommend for deployment on Kubernetes
   - values.yaml overrides defaults specified in the Helm chart
   - The complete (and most up to date) values.yaml with all defaults can be seen here, or a more realistic one here. Below is a minimal example for testing with no license file:

   ```yaml
   pod:
     env:
       - name: PACKAGEMANAGER_POSTGRES_PASSWORD
         valueFrom:
           secretKeyRef:
             name: rspm-database
             key: password

   sharedStorage:
     create: true
     mount: true
     storageClassName: standard
     requests:
       storage: 100G # Default, you may need to decrease this during testing

   config:
     Database:
       Provider: postgres
     Postgres:
       URL: "postgres://packageman@rspm-db-postgresql.posit-pm.svc.cluster.local:5432/rs-pm?sslmode=disable"
   ```
6. Deploy and verify it's running:

   ```
   helm upgrade --install rstudio-pm-testing rstudio/rstudio-pm --values path/to/values.yaml
   ```

   - Optionally pass the -l parameter to select the container:

     ```
     kubectl get pod -l app.kubernetes.io/name=rstudio-pm
     ```

7. Access by port-forwarding in a separate terminal window (only for testing):

   ```
   kubectl port-forward svc/rstudio-pm-testing 3942:80
   ```

   - View in your browser at localhost:3942

8. Stop and clean up:

   ```
   # This will remove everything, including shared storage
   helm uninstall rspm-db rstudio-pm-testing
   ```

## Using Package Manager

Package Manager is configured via its rspm CLI inside the container; for convenience, define a temporary alias:

```
alias rspm="kubectl exec --namespace posit-pm services/rstudio-pm-testing -- rspm"
```

For testing, we will serve CRAN packages to our Posit Workbench instance.

1. Create a repository and subscribe it to the built-in cran source:

   ```
   # Create a repository:
   rspm create repo --name=cran --description='Access CRAN packages'

   # Subscribe the repository to the cran source:
   rspm subscribe --repo=cran --source=cran
   ```

   - You can now visit the Package Manager instance in your web browser and see that the repository is present
2. Edit the values.yaml for Posit Workbench accordingly (see Configure Your Helm Chart Values - Posit Workbench Documentation)
   - Assuming you have copied the commands in this guide (resulting in the same hostnames and namespaces), add this section to the values.yaml being used for Posit Workbench:

   ```yaml
   config:
   # --- snip ---
     session:
       repos.conf:
         CRAN: http://rstudio-pm-testing.posit-pm.svc.cluster.local/cran/__linux__/jammy/latest
   ```
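The repos.conf URL follows the same in-cluster DNS pattern as the database URIs; composing it by hand (plain shell, no cluster needed - values match this guide's service and namespace names) makes the parts explicit:

```shell
#!/bin/sh
# Compose the in-cluster Package Manager CRAN repo URL
SERVICE=rstudio-pm-testing
NAMESPACE=posit-pm
REPO=cran
DISTRO=jammy   # Ubuntu 22.04, matching the Posit container images

URL="http://${SERVICE}.${NAMESPACE}.svc.cluster.local/${REPO}/__linux__/${DISTRO}/latest"
echo "$URL"
# → http://rstudio-pm-testing.posit-pm.svc.cluster.local/cran/__linux__/jammy/latest
```

If you change the repository name or namespace, only the corresponding variable needs to change.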
3. Redeploy
   - You can redeploy Posit Workbench using the same helm upgrade command used to initially deploy it:

   ```
   helm upgrade --install rstudio-workbench-testing \
     --namespace posit-workbench \
     rstudio/rstudio-workbench \
     --values path/to/values.yaml
   ```

   - You can check if the new repo URL was applied properly by opening Posit Workbench in your browser, starting a new RStudio session, and then running options('repos') in the terminal window
     - This should output the URL we set earlier in values.yaml
4. (Optional) View the repository in the web interface
   - Ensure that Package Manager is still port-forwarded in a separate terminal window (I suggest using something like tmux to make managing terminals easier):

   ```
   # Accessible at localhost:3942
   kubectl port-forward --namespace posit-pm svc/rstudio-pm-testing 3942:80
   ```

   - You can search for packages to confirm that the repository was added correctly
   - To see configuration options:
     - Select the repository in the left drop-down, then click "Setup" on the right
     - Input the following options:
       - Operating System: Linux, since we are running everything inside Linux-based containers
       - Linux distribution: Ubuntu 22.04 (Jammy), since by default the Posit pods use this distribution
         - This can be confirmed by running `kubectl exec -it svc/rstudio-workbench-testing -- cat /etc/os-release`, or seen here
     - This gives us (some) correct values/commands to use in the configuration, which as we saw is done on the CLI
       - As you probably noticed (depending on your development setup), the URL given points to localhost instead of the hostname in the cluster

## Notes / Further reading

minikube isn't the only way to run Kubernetes locally - there's also the kind project, kubeadm, Podman play kube (Red Hat), and even entirely different container orchestration systems such as Nomad by HashiCorp. The common denominator: they all manage the orchestration of containers (semi-isolated *nix processes sharing the same kernel) across multiple hosts.

### Building the minikube ISO

Build prerequisites:

- GoLang >1.22.0
- Make >4.0
- Docker
- Ubuntu 22.04 or earlier is a MUST, because the official kubernetes/minikube project has not yet updated its build tooling to drop the python2 dependency (#21441), and python2 is not packaged in Ubuntu beyond 22.04
  - Otherwise the build fails with errors compiling gl_ functions
- Ensure PATH does not include any paths with whitespace

Other prerequisites for buildroot, install with apt:

- p7zip-full
- python2
- docker-buildx
- bc

Without python2, Buildroot fails with `checking whether /usr/bin/python2 version >= 2.6... configure: error: too old` - i.e. the Buildroot/minikube build process requires python2 :O

```
apt install p7zip-full python2 docker-buildx bc
```

Make docker runnable without sudo (not required, but recommended):

```
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
```
1. Clone the minikube repo:

   ```
   git clone https://github.com/kubernetes/minikube.git
   cd minikube
   ```

2. Build the buildroot image:

   ```
   make buildroot-image
   ```

3. Modify the iso with menuconfig if desired (e.g. to add iSCSI):

   ```
   cd out/buildroot
   make menuconfig
   make
   make savedefconfig
   ```

4. Build the iso:

   ```
   cd ../..
   make minikube-iso-x86_64
   ```

   - The iso is placed at out/minikube-amd64.iso

5. Test
   - Run minikube using the local iso:

     ```
     minikube start --iso-url=file://$(pwd)/out/minikube-amd64.iso
     ```

   - SSH into the minikube node:

     ```
     ./out/minikube ssh
     ```

   - Check kernel modules with lsmod:

     ```
     docker@minikube:~$ lsmod | grep iscsi
     iscsi_tcp              20480  0
     libiscsi_tcp           28672  1 iscsi_tcp
     libiscsi               69632  2 libiscsi_tcp,iscsi_tcp
     ```
