CCv0 | kata-remote | Issues with running conformance on kata-remote runtime class #5779
Description
I am trying to run the e2e conformance tests as part of my setup on the kata-remote runtime class. I notice that the YAML files for the test pods are very different from those generated for the kata or kata-cc runtime classes. I understand that these differences may arise because parameters in configuration-remote.toml differ from those in the kata and kata-cc configurations, which raises the question of whether the tests are even supported on kata-remote.
To compare the two configuration files:
diff kata-config-toml/configuration-remote.toml configuration-clh-snp.toml
1,2c1,2
< # Copyright (c) 2017-2019 Intel Corporation
< # Copyright (c) 2023 IBM Corporation
---
> # Copyright (c) 2019 Ericsson Eurolab Deutschland GmbH
> # Copyright (c) 2021 Adobe Inc.
9c9
< # XXX: Source file: "config/configuration-remote.toml.in"
---
> # XXX: Source file: "config/configuration-clh.toml.in"
14,18c14,23
<
< [hypervisor.remote]
< remote_hypervisor_socket = "/run/peerpod/hypervisor.sock"
< remote_hypervisor_timeout = 600
<
---
> [hypervisor.clh]
> path = "/opt/confidential-containers/bin/cloud-hypervisor-snp"
> igvm = "/opt/confidential-containers/share/kata-containers/kata-containers-igvm.img"
> image = "/opt/confidential-containers/share/kata-containers/kata-containers.img"
>
> # rootfs filesystem type:
> #   - ext4 (default)
> #   - xfs
> #   - erofs
> rootfs_type="ext4"
29c34
< #   - CPU Hotplug
---
> #   - CPU Hotplug
32a38,40
> # Supported TEEs:
> # * Intel TDX
> #
34c42,67
< # confidential_guest = true
---
> confidential_guest = true
>
> # enable SEV SNP VMs.
> # This is not currently used by CLH
> sev_snp_guest = true
>
> # SNP guest policy
> # Based on SEV Secure Nested Paging Firmware ABI Specification section 4.3
> # If it is unspecified or 0, it will default to 0x30000 (i.e. Bit#17 is '1' which is reserved and Bit#16 is '1' which means SMT is allowed).
> # This is not currently used by CLH
> snp_guest_policy=0x30000
>
> # Enable running clh VMM as a non-root user.
> # By default clh VMM run as root. When this is set to true, clh VMM process runs as
> # a non-root random user. See documentation for the limitations of this mode.
> # rootless = true
>
> # disable applying SELinux on the VMM process (default false)
> disable_selinux=false
>
> # disable applying SELinux on the container process
> # If set to false, the type `container_t` is applied to the container process by default.
> # Note: To enable guest SELinux, the guest rootfs must be CentOS that is created and built
> # with `SELINUX=yes`.
> # (default: true)
> disable_guest_selinux=true
35a69,78
> # Path to the firmware.
> # If you want Cloud Hypervisor to use a specific firmware, set its path below.
> # This is option is only used when confidential_guest is enabled.
> #
> # For more information about firmwared that can be used with specific TEEs,
> # please, refer to:
> # * Intel TDX:
> #   - td-shim: https://github.com/confidential-containers/td-shim
> #
> # firmware = ""
40,41c83,89
< # Note: Remote hypervisor is only handling the following annotations
< enable_annotations = ["machine_type", "default_memory", "default_vcpus", "image", "volume_name"]
---
> enable_annotations = ["enable_iommu"]
>
> # List of valid annotations values for the hypervisor
> # Each member of the list is a path pattern as described by glob(3).
> # The default if not set is empty (all annotations rejected.)
> # Your distribution recommends: ["/opt/confidential-containers/bin/cloud-hypervisor-snp"]
> valid_hypervisor_paths = ["/opt/confidential-containers/bin/cloud-hypervisor-snp"]
53,58c101
< # NOTE: kernel_params are not currently passed over in remote hypervisor
< # kernel_params = ""
<
< # Path to the firmware.
< # If you want that qemu uses the default firmware leave this option empty
< firmware = ""
---
> kernel_params = " agent.enable_signature_verification=false "
65c108
< # default_vcpus = 1
---
> default_vcpus = 1
81,94c124
< # NOTICE: on arm platform with gicv2 interrupt controller, set it to 8.
< # default_maxvcpus = 0
<
< # Bridges can be used to hot plug devices.
< # Limitations:
< # * Currently only pci bridges are supported
< # * Until 30 devices per bridge can be hot plugged.
< # * Until 5 PCI bridges can be cold plugged per VM.
< #   This limitation could be a bug in qemu or in the kernel
< # Default number of bridges per SB/VM:
< # unspecified or 0   --> will be set to 1
< # > 1 <= 5           --> will be set to the specified number
< # > 5                --> will be set to 5
< default_bridges = 1
---
> default_maxvcpus = 0
98,100c128,129
< # Note: the remote hypervisor uses the peer pod config to determine the memory of the VM
< # default_memory = 2048
< #
---
> default_memory = 2048
>
104d132
< # Note: the remote hypervisor uses the peer pod config to determine the memory of the VM
106a135,196
> # Default maximum memory in MiB per SB / VM
> # unspecified or == 0           --> will be set to the actual amount of physical RAM
> # > 0 <= amount of physical RAM --> will be set to the specified number
> # > amount of physical RAM      --> will be set to the actual amount of physical RAM
> default_maxmemory = 0
>
> # Shared file system type:
> #   - virtio-fs (default)
> #   - virtio-fs-nydus
> shared_fs = "virtio-fs"
>
> # Path to vhost-user-fs daemon.
> virtio_fs_daemon = "/opt/confidential-containers/libexec/virtiofsd"
>
> # List of valid annotations values for the virtiofs daemon
> # The default if not set is empty (all annotations rejected.)
> # Your distribution recommends: ["/opt/confidential-containers/libexec/virtiofsd"]
> valid_virtio_fs_daemon_paths = ["/opt/confidential-containers/libexec/virtiofsd"]
>
> # Default size of DAX cache in MiB
> virtio_fs_cache_size = 0
>
> # Default size of virtqueues
> virtio_fs_queue_size = 1024
>
> # Extra args for virtiofsd daemon
> #
> # Format example:
> #   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
> # Examples:
> #   Set virtiofsd log level to debug : ["-o", "log_level=debug"] or ["-d"]
> # see `virtiofsd -h` for possible options.
> virtio_fs_extra_args = ["--thread-pool-size=1", "-o", "announce_submounts"]
>
> # Cache mode:
> #
> #  - never
> #    Metadata, data, and pathname lookup are not cached in guest. They are
> #    always fetched from host and any changes are immediately pushed to host.
> #
> #  - auto
> #    Metadata and pathname lookup cache expires after a configured amount of
> #    time (default is 1 second). Data is cached while the file is open (close
> #    to open consistency).
> #
> #  - always
> #    Metadata, data, and pathname lookup are cached in guest and never expire.
> virtio_fs_cache = "auto"
>
> # Block storage driver to be used for the hypervisor in case the container
> # rootfs is backed by a block device. This is virtio-blk.
> block_device_driver = "virtio-blk"
>
> # Enable huge pages for VM RAM, default false
> # Enabling this will result in the VM memory
> # being allocated using huge pages.
> #enable_hugepages = true
>
> # Disable the 'seccomp' feature from Cloud Hypervisor, default false
> # TODO - to be re-enabled with next CH-SNP release. This is fixed but the fix is not yet released
> # disable_seccomp = true
>
108c198
< # to enable debug output where available. And Debug also enable the hmp socket.
---
> # to enable debug output where available.
128,139c218,290
< #guest_hook_path = "/usr/share/oci/hooks"
<
< # disable applying SELinux on the VMM process (default false)
< disable_selinux=false
<
< # disable applying SELinux on the container process
< # If set to false, the type `container_t` is applied to the container process by default.
< # Note: To enable guest SELinux, the guest rootfs must be CentOS that is created and built
< # with `SELINUX=yes`.
< # (default: true)
< # Note: The remote hypervisor has a different guest, so currently requires this to be disabled
< disable_guest_selinux = true
---
> #guest_hook_path = "/opt/confidential-containers/share/oci/hooks"
> #
> # These options are related to network rate limiter at the VMM level, and are
> # based on the Cloud Hypervisor I/O throttling.  Those are disabled by default
> # and we strongly advise users to refer the Cloud Hypervisor official
> # documentation for a better understanding of its internals:
> # https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/io_throttling.md
> #
> # Bandwidth rate limiter options
> #
> # net_rate_limiter_bw_max_rate controls network I/O bandwidth (size in bits/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #net_rate_limiter_bw_max_rate = 0
> #
> # net_rate_limiter_bw_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if net_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #net_rate_limiter_bw_one_time_burst = 0
> #
> # Operation rate limiter options
> #
> # net_rate_limiter_ops_max_rate controls network I/O bandwidth (size in ops/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #net_rate_limiter_ops_max_rate = 0
> #
> # net_rate_limiter_ops_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if net_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #net_rate_limiter_ops_one_time_burst = 0
> #
> # These options are related to disk rate limiter at the VMM level, and are
> # based on the Cloud Hypervisor I/O throttling.  Those are disabled by default
> # and we strongly advise users to refer the Cloud Hypervisor official
> # documentation for a better understanding of its internals:
> # https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/io_throttling.md
> #
> # Bandwidth rate limiter options
> #
> # disk_rate_limiter_bw_max_rate controls disk I/O bandwidth (size in bits/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #disk_rate_limiter_bw_max_rate = 0
> #
> # disk_rate_limiter_bw_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if disk_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #disk_rate_limiter_bw_one_time_burst = 0
> #
> # Operation rate limiter options
> #
> # disk_rate_limiter_ops_max_rate controls disk I/O bandwidth (size in ops/sec
> # for SB/VM).
> # The same value is used for inbound and outbound bandwidth.
> # Default 0-sized value means unlimited rate.
> #disk_rate_limiter_ops_max_rate = 0
> #
> # disk_rate_limiter_ops_one_time_burst increases the initial max rate and this
> # initial extra credit does *NOT* affect the overall limit and can be used for
> # an *initial* burst of data.
> # This is *optional* and only takes effect if disk_rate_limiter_bw_max_rate is
> # set to a non zero value.
> #disk_rate_limiter_ops_one_time_burst = 0
168,169c319,320
< # (default: 30)
< #dial_timeout = 30
---
> # (default: 90)
> dial_timeout = 90
193,194c344
< # Note: The remote hypervisor, uses it's own network, so "none" is required
< internetworking_model="none"
---
> internetworking_model="tcfilter"
201d350
< # Note: The remote hypervisor has a different guest, so currently requires this to be set to true
204d352
<
234,235c382
< # Note: The remote hypervisor has a different networking model, which requires true
< disable_new_netns = true
---
> #disable_new_netns = true
252d398
< # Note: the remote hypervisor uses the peer pod config to determine the sandbox size, so requires this to be set to true
254a401,406
> # If specified, sandbox_bind_mounts identifieds host paths to be mounted (ro) into the sandboxes shared path.
> # This is only valid if filesystem sharing is utilized. The provided path(s) will be bindmounted into the shared fs directory.
> # If defaults are utilized, these mounts should be available in the guest at `/run/kata-containers/shared/containers/sandbox-mounts`
> # These will not be exposed to the container workloads, and are only provided for potential guest services.
> sandbox_bind_mounts=[]
>
278d429
< # Note: remote hypervisor has no sharing of emptydir mounts from host to guest
299,300c450
< # Note: The remote hypervisor offloads the pulling on images on the peer pod VM, so requries this to be true
< service_offload = true
---
> service_offload = false
I notice that the same set of tests that pass for the kata-cc and kata runtime classes fail with the kata-remote runtime class.
I am not able to pinpoint what in the configuration gives rise to this difference. An example of a test that fails on kata-remote but passes on kata/kata-cc is the "should function for intra-pod communication: udp" test.
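For context, that test spins up netserver pods running the agnhost image (visible in the diff further below). A rough, hand-written sketch of such a pod pinned to kata-remote, with the netexec arguments and ports being illustrative assumptions rather than copied from the suite, might look like this:

# Sketch of a netserver-style pod pinned to the kata-remote runtime class.
# The netexec arguments and ports are illustrative assumptions; the agnhost
# image is the one that appears in the generated test-pod YAML.
apiVersion: v1
kind: Pod
metadata:
  name: netserver-0
  labels:
    selector: netserver-0
spec:
  runtimeClassName: kata-remote
  containers:
  - name: webserver
    image: registry.k8s.io/e2e-test-images/agnhost:2.43
    args: ["netexec", "--http-port=8083", "--udp-port=8081"]
    ports:
    - containerPort: 8083
      protocol: TCP
    - containerPort: 8081
      protocol: UDP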
Some interesting differences that I observed between the YAMLs for the test pods generated with kata and kata-remote are given below:
diff sample/netserver-0-kata.yaml sample/netserver-0-kata-remote.yaml
5,6c5,28
<   creationTimestamp: "2023-10-04T23:07:51Z"
---
>     cni.projectcalico.org/containerID: 9e3a3c5f55b3e686924e3485a4de1b9175ac716dfa55c529bbed93269a89509a
>     cni.projectcalico.org/podIP: 172.28.95.139/32
>     cni.projectcalico.org/podIPs: 172.28.95.139/32
>     k8s.v1.cni.cncf.io/network-status: |-
>       [{
>           "name": "k8s-pod-network",
>           "ips": [
>               "172.28.95.139"
>           ],
>           "default": true,
>           "dns": {}
>       }]
>     k8s.v1.cni.cncf.io/networks-status: |-
>       [{
>           "name": "k8s-pod-network",
>           "ips": [
>               "172.28.95.139"
>           ],
>           "default": true,
>           "dns": {}
>       }]
>   creationTimestamp: "2023-10-05T17:10:09Z"
>   deletionGracePeriodSeconds: 30
>   deletionTimestamp: "2023-10-05T17:15:57Z"
19c41
<     image: registry.k8s.io/e2e-test-images/agnhost:2.43
---
>     image: k8s.gcr.io/e2e-test-images/agnhost:2.39
60,64c82,83
<     katacontainers.io/kata-runtime: "true"
<     kubernetes.io/hostname: aks-nodepool1-90529124-vmss000000
<   overhead:
<     cpu: 250m
<     memory: 160Mi
---
>     kubernetes.io/hostname: hardware0-control-plane
>     node.kubernetes.io/worker: ""
...
I don't quite understand which parameters lead to the introduction of these new annotations and make the behaviour of the test so different. Why are these parameters different between kata/kata-cc and kata-remote? What is the impact of setting them to be the same?
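For reference, the overhead: block that appears only in the kata pod YAML would normally be injected from the overhead stanza of the RuntimeClass object rather than from the kata configuration files, while the cni.projectcalico.org and k8s.v1.cni.cncf.io annotations come from the CNI plugins on the cluster. A minimal sketch of the two RuntimeClass definitions (values and handler names are illustrative assumptions, not taken from my cluster) is:

# Sketch of RuntimeClass objects; values and handler names are illustrative.
# The overhead stanza is what ends up as the pod-level overhead: block
# seen in the kata YAML above.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
overhead:
  podFixed:
    cpu: 250m
    memory: 160Mi
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-remote
handler: kata-remote
# no overhead stanza, so no overhead: block is injected into the pod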