|
| 1 | +--- |
| 2 | +title: Forensic container checkpointing in Kubernetes |
| 3 | +date: 2025-09-14 |
| 4 | +tags: ["security", "kubernetes"] |
| 5 | +authors: ["Kapil Agrawal"] |
| 6 | +comments: false |
| 7 | +--- |
| 8 | + |
| 9 | +## Identify our Pod of interest |
| 10 | + |
| 11 | +Find the node where the Pod is currently running: |
| 12 | + |
| 13 | +```sh |
| 14 | +kubectl get pod -o wide |
| 15 | +``` |
| 16 | + |
| 17 | +Locate the container ID of the Pod: |
| 18 | + |
| 19 | +```sh |
| 20 | +kubectl describe pod PODNAME | grep -i "Container ID" |
| 21 | +``` |
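The `Container ID` value printed by `kubectl describe` carries a runtime prefix such as `containerd://`. As a minimal sketch, the prefix can be dropped with shell parameter expansion so that the bare ID can be handed to node-level tools. The helper name `strip_runtime_prefix` and the sample ID are hypothetical and not part of kubectl.

```sh
# Hypothetical helper: strip the "<runtime>://" prefix from a container ID.
strip_runtime_prefix() {
  # ${1#*://} removes the shortest leading match of "*://";
  # an ID without a prefix is printed unchanged.
  printf '%s\n' "${1#*://}"
}

# Example with a made-up ID:
# strip_runtime_prefix "containerd://0123abcd"
```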
| 22 | + |
| 23 | +## Requirements |
| 24 | + |
| 25 | +1. Download and install CRIU on the node: |
| 26 | + https://criu.org/Packages |
| 27 | + |
| 28 | +2. You may need to explicitly allow access to the kubelet checkpoint API on the node: |
| 29 | + |
| 30 | +```yaml |
| 31 | +# kubectl apply -f node-checkpoint-rbac.yaml |
| 32 | +--- |
| 33 | +apiVersion: rbac.authorization.k8s.io/v1 |
| 34 | +kind: ClusterRole |
| 35 | +metadata: |
| 36 | + name: node-checkpoint-access |
| 37 | +rules: |
| 38 | + - apiGroups: [""] |
| 39 | + resources: ["nodes/checkpoint"] |
| 40 | + verbs: ["create"] |
| 41 | + |
| 42 | +--- |
| 43 | +apiVersion: rbac.authorization.k8s.io/v1 |
| 44 | +kind: ClusterRoleBinding |
| 45 | +metadata: |
| 46 | + name: node-checkpoint-access |
| 47 | +roleRef: |
| 48 | + apiGroup: rbac.authorization.k8s.io |
| 49 | + kind: ClusterRole |
| 50 | + name: node-checkpoint-access |
| 51 | +subjects: |
| 52 | + - kind: Group |
| 53 | + name: system:nodes |
| 54 | + apiGroup: rbac.authorization.k8s.io |
| 55 | +``` |
| 56 | + |
| 57 | +⚠️ During checkpointing, a .tar archive is created, which requires a functional tar binary. By default, k3s relies on the BusyBox implementation of tar, which is incompatible with CRIU. To ensure checkpointing works correctly, you may need to override this with the system’s full tar binary. |
| 58 | + |
| 59 | +```sh |
| 60 | +# /var/lib/rancher/k3s/data/ is k3s’s runtime dependency store, containing unpacked, versioned |
| 61 | +# bundles of the k3s binary, containerd, and supporting tools |
| 62 | + |
| 63 | +[root@x86-dev:~] ls -l /var/lib/rancher/k3s/data/current/bin/tar |
| 64 | +lrwxrwxrwx 1 root root 7 Sep 6 19:27 tar -> busybox* |
| 65 | + |
| 66 | +[root@x86-dev:~] rm /var/lib/rancher/k3s/data/current/bin/tar |
| 67 | +[root@x86-dev:~] ln -s $(which tar) /var/lib/rancher/k3s/data/current/bin/tar |
| 68 | +``` |
| 69 | + |
| 70 | +## Checkpoint a running pod on a K3s node |
| 71 | + |
| 72 | +```sh |
| 73 | +curl -q -s --insecure \ |
| 74 | +--cert /var/lib/rancher/k3s/agent/client-kubelet.crt \ |
| 75 | +--key /var/lib/rancher/k3s/agent/client-kubelet.key \ |
| 76 | +--cacert /var/lib/rancher/k3s/agent/client-ca.crt \ |
| 77 | +-X POST "https://$(hostname -i):10250/checkpoint/NAMESPACE/PODNAME/CONTAINERNAME" |
| 78 | +``` |
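The endpoint path has a fixed shape: namespace, Pod name, and container name, in that order, after `/checkpoint/`. As a small sketch, the URL can be assembled by a helper so the three parts are not mistyped. The function name `checkpoint_url` is hypothetical; port 10250 and the path layout are taken from the command above.

```sh
# Hypothetical helper: assemble the kubelet checkpoint URL from its parts.
# Arguments: node address, namespace, pod name, container name.
checkpoint_url() {
  printf 'https://%s:10250/checkpoint/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Usage on the node, with the same certificate paths as above:
# curl -s --insecure \
#   --cert /var/lib/rancher/k3s/agent/client-kubelet.crt \
#   --key /var/lib/rancher/k3s/agent/client-kubelet.key \
#   -X POST "$(checkpoint_url "$(hostname -i)" NAMESPACE PODNAME CONTAINERNAME)"
```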
| 79 | + |
| 80 | +## Example |
| 81 | + |
| 82 | +Try checkpointing a running netshoot pod |
| 83 | + |
| 84 | +```sh |
| 85 | +[root@x86-dev:~] curl -q -s --insecure \ |
| 86 | +--cert /var/lib/rancher/k3s/agent/client-kubelet.crt \ |
| 87 | +--key /var/lib/rancher/k3s/agent/client-kubelet.key \ |
| 88 | +--cacert /var/lib/rancher/k3s/agent/client-ca.crt \ |
| 89 | +-X POST "https://$(hostname -i):10250/checkpoint/default/netshoot/netshoot" |
| 90 | +``` |
| 91 | + |
| 92 | +Output |
| 93 | + |
| 94 | +```sh |
| 95 | +{"items":["/var/lib/kubelet/checkpoints/checkpoint-netshoot_default-netshoot-2025-09-13T18:09:15-05:00.tar"]} |
| 96 | + |
| 97 | +[root@x86-dev:~] ls /var/lib/kubelet/checkpoints/ |
| 98 | +checkpoint-netshoot_default-netshoot-2025-09-13T18:09:15-05:00.tar |
| 99 | +``` |
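When scripting the checkpoint step, the archive path can be pulled out of the JSON response instead of being copied by hand. The sketch below uses `sed` and assumes the single-line response shape shown above, with one path under `items`; the function name `checkpoint_path` is hypothetical. A JSON-aware tool would be more robust if one is available on the node.

```sh
# Hypothetical helper: read the kubelet's JSON response on stdin and print
# the first path listed under "items". Prints nothing if the shape differs.
checkpoint_path() {
  sed -n 's/.*"items":\["\([^"]*\)".*/\1/p'
}

# Usage: pipe the response of the checkpoint request into the helper, e.g.
# curl ... -X POST "https://.../checkpoint/default/netshoot/netshoot" | checkpoint_path
```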
| 100 | + |
| 101 | +## Restoring checkpoint image for analysis |
| 102 | + |
| 103 | +Download [checkpointctl](https://github.com/checkpoint-restore/checkpointctl) |
| 104 | + |
| 105 | +``` |
| 106 | +[root@x86-dev:~] checkpointctl list |
| 107 | +Listing checkpoints in path: /var/lib/kubelet/checkpoints/ |
| 108 | +NAMESPACE POD CONTAINER ENGINE TIME CHECKPOINTED CHECKPOINT NAME |
| 109 | +--------- --- --------- ------ ----------------- --------------- |
| 110 | +default netshoot netshoot containerd 13 Sep 25 18:09 CDT checkpoint-netshoot_default-netshoot-2025-09-13T18:09:15-05:00.tar |
| 111 | + |
| 112 | +``` |
| 113 | + |
| 114 | +Inspecting a checkpoint image |
| 115 | + |
| 116 | +```sh |
| 117 | +[root@x86-dev:~] checkpointctl inspect \ |
| 118 | +--files \ |
| 119 | +--metadata \ |
| 120 | +--mounts \ |
| 121 | +--ps-tree \ |
| 122 | +--ps-tree-cmd \ |
| 123 | +--ps-tree-env \ |
| 124 | +--sockets \ |
| 125 | +checkpoint-netshoot_default-netshoot-2025-09-13T18:09:15-05:00.tar |
| 126 | +``` |
| 127 | + |
| 128 | +Show memory dump |
| 129 | + |
| 130 | +```sh |
| 131 | +# kubelet stores checkpoints under /var/lib/kubelet/checkpoints/ |
| 132 | +[root@x86-dev:~] checkpointctl memparse <PATH-TO-CHECKPOINT-TAR> |
| 133 | + |
| 134 | +# show full memory dump of a process |
| 135 | +[root@x86-dev:~] checkpointctl memparse --pid PID <PATH-TO-CHECKPOINT-TAR> |
| 136 | +``` |
| 137 | + |
| 138 | +### Reference |
| 139 | + |
| 140 | +-- |
| 141 | + |
| 142 | +- https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/ |
| 143 | +- https://criu.org/Containerd |
| 144 | +- https://github.com/checkpoint-restore |