CC: newly pulled pause image by snapshotter stored in an unexpected location #5781
Description of problem
With IMAGE_OFFLOAD_TO_GUEST=yes and FORKED_CONTAINERD=no configured, pod creation under IBM Z Secure Execution (SE) sometimes gets stuck in the CreateContainerError state with the following error:
Error: failed to create containerd container: create instance 697: object with key "697" already exists: unknown
This is a known issue with upstream containerd v1.6.8 (#5775 (comment)). A quick remedy is to remove the pause image and let the snapshotter pull it again. However, the newly pulled image is then stored in an unexpected location (it is expected under /run/kata-containers/shared/sandboxes/${sandbox_id}/shared), as the following listings show:
# ls -lah /run/kata-containers/shared/sandboxes/a322d916b5dc547d1dce178d31b13091418793a9675a8aa006fcfecd49f8bbc1/shared
total 16K
drwxr-x--- 3 root root 160 Oct 12 11:04 .
drwx------ 5 root root 100 Oct 12 11:04 ..
-rw-r--r-- 1 root root 103 Oct 12 11:04 a322d916b5dc547d1dce178d31b13091418793a9675a8aa006fcfecd49f8bbc1-e9967091f9448d8a-resolv.conf
-rw-r--r-- 1 root root  11 Oct 12 11:04 efde0bf9b12e2e127bdb007f58e4dfb893d990fc64b8063f9594c1c1753c06ce-44e4e6f3b60b2926-hostname
-rw-r--r-- 1 root root 103 Oct 12 11:04 efde0bf9b12e2e127bdb007f58e4dfb893d990fc64b8063f9594c1c1753c06ce-4c6bb0d5b7fc98ff-resolv.conf
-rw-rw-rw- 1 root root   0 Oct 12 11:04 efde0bf9b12e2e127bdb007f58e4dfb893d990fc64b8063f9594c1c1753c06ce-83476f850307d009-termination-log
-rw-r--r-- 1 root root 205 Oct 12 11:04 efde0bf9b12e2e127bdb007f58e4dfb893d990fc64b8063f9594c1c1753c06ce-844b44105b991bcd-hosts
drwxrwxrwt 3 root root 140 Oct 12 11:04 efde0bf9b12e2e127bdb007f58e4dfb893d990fc64b8063f9594c1c1753c06ce-ab6d937a4d086125-serviceaccount
# ls -lah /run/containerd/io.containerd.runtime.v2.task/k8s.io/a322d916b5dc547d1dce178d31b13091418793a9675a8aa006fcfecd49f8bbc1/
total 28K
drwx------  3 root root  200 Oct 12 11:04 .
drwx--x--x 20 root root  400 Oct 12 11:04 ..
-rw-r--r--  1 root root   89 Oct 12 11:04 address
-rw-r--r--  1 root root 8.4K Oct 12 11:04 config.json
prwx------  1 root root    0 Oct 12 11:07 log
-rw-r--r--  1 root root  101 Oct 12 11:04 monitor_address
drwx--x--x  2 root root   40 Oct 12 11:04 rootfs
-rw-------  1 root root   32 Oct 12 11:04 shim-binary-path
-rw-r--r--  1 root root    7 Oct 12 11:04 shim.pid
lrwxrwxrwx  1 root root  121 Oct 12 11:04 work -> /var/lib/containerd/io.containerd.runtime.v2.task/k8s.io/a322d916b5dc547d1dce178d31b13091418793a9675a8aa006fcfecd49f8bbc1
This causes the test "Test can pull an unencrypted image inside the guest" to fail at the following check:
[ ${#rootfs[@]} -eq 1 ]
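For reference, here is a minimal sketch of what a check like this amounts to. It assumes the `rootfs` array is filled by searching the sandbox's shared directory for directories named `rootfs`. The helper name `count_rootfs_dirs` is hypothetical and not taken from the test suite.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the kata-containers test suite):
# print how many directories named "rootfs" exist under the given directory.
count_rootfs_dirs() {
    local shared_dir="$1"
    local rootfs=()
    # mapfile stores each path found by find as one array element;
    # errors from a missing directory are suppressed and yield a count of 0.
    mapfile -t rootfs < <(find "${shared_dir}" -type d -name rootfs 2>/dev/null)
    echo "${#rootfs[@]}"
}
```

With the listings above, a call such as `count_rootfs_dirs "/run/kata-containers/shared/sandboxes/${sandbox_id}/shared"` would presumably print 0 rather than 1, because the only visible `rootfs` directory sits under /run/containerd/io.containerd.runtime.v2.task/k8s.io/.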
This could be resolved by bumping containerd to v1.7, but that is not an option at the moment.
The error appears to happen only at http://jenkins.katacontainers.io/job/kata-containers-CCv0-ubuntu-20.04-s390x-SE-daily/. We could skip the test until the containerd update is finished.
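One possible way to express such a temporary skip is to gate it on the containerd version. This is only a sketch: the helper names `version_lt` and `should_skip_guest_pull_test` are hypothetical, the 1.7.0 threshold is an assumption based on the note above, and the comparison relies on GNU `sort -V`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the kata-containers test suite).

# Return 0 when version $1 is strictly older than version $2.
# A leading "v" (as in "v1.6.8") is stripped before comparing.
version_lt() {
    local a="${1#v}" b="${2#v}"
    [ "${a}" != "${b}" ] &&
        [ "$(printf '%s\n%s\n' "${a}" "${b}" | sort -V | head -n1)" = "${a}" ]
}

# Return 0 when the guest-pull test should be skipped, i.e. when the
# given containerd version is older than the assumed 1.7.0 threshold.
should_skip_guest_pull_test() {
    local containerd_version="$1"
    version_lt "${containerd_version}" "1.7.0"
}
```

If the test is a Bats test, it could call something like `should_skip_guest_pull_test "v1.6.8" && skip "containerd older than v1.7"` at its start. Because the failure has only been observed on the s390x SE job, an architecture check might be a better condition.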