melange test does not connect to qemu runner #1732

Open
stevebeattie opened this issue Jan 6, 2025 · 1 comment · May be fixed by #1747

Comments

Member

stevebeattie commented Jan 6, 2025

Running melange test with the qemu runner fails to make its initial connection to the qemu guest, as shown below, whereas melange build succeeds with the same runner:

$ make test/nano  MELANGE_EXTRA_OPTS="--runner qemu"
yamlfile is nano.yaml
Testing package nano with version nano-8.3-r0 from file nano.yaml
$HOME/go/bin/melange test nano.yaml --repository-append $HOME/git/wolfi-dev/os/packages --keyring-append local-melange.rsa.pub --arch x86_64 --pipeline-dirs ./pipelines/ --repository-append https://packages.wolfi.dev/os --keyring-append https://packages.wolfi.dev/os/wolfi-signing.rsa.pub --test-package-append wolfi-base --debug --runner qemu --source-dir ./nano/
2025/01/05 23:12:22 INFO evaluating pipelines for package requirements
2025/01/05 23:12:22 INFO building test workspace in: '$HOME/tmp/melange-guest-1343812321-main' with apko
2025/01/05 23:12:22 [DEBUG] GET https://packages.wolfi.dev/os/apk-configuration
2025/01/05 23:12:22 INFO setting apk repositories: [$HOME/git/wolfi-dev/os/packages https://packages.wolfi.dev/os]
2025/01/05 23:12:22 INFO image configuration:
2025/01/05 23:12:22 INFO   contents:
2025/01/05 23:12:22 INFO     build repositories: []
2025/01/05 23:12:22 INFO     runtime repositories: []
2025/01/05 23:12:22 INFO     keyring:      []
2025/01/05 23:12:22 INFO     packages:     [nano]
2025/01/05 23:12:24 INFO installing ca-certificates-bundle (20241121-r0)
[...]
2025/01/05 23:12:25 INFO qemu: generating ssh key pairs for ephemeral VM
2025/01/05 23:12:25 INFO qemu: no disk space specified, using default: 50Gi
2025/01/05 23:12:25 INFO qemu: generating disk image, name ./3887205368.img, size 50Gi:
2025/01/05 23:12:25 INFO qemu: executing - qemu-system-x86_64 -machine microvm,rtc=on,pcie=on,pit=off,pic=off,isa-serial=off -bios /usr/share/seabios/bios-microvm.bin -kernel /tmp/kernel/boot/vmlinuz-virt -initrd $HOME/tmp/melange-guest-2983141708.initramfs.cpio -m 8036913k -smp 16 -accel kvm -cpu host -daemonize -display none -no-reboot -no-user-config -nodefaults -parallel none -serial none -vga none -netdev user,id=id1,hostfwd=tcp:127.0.0.1:40755-:22 -device virtio-net-pci,netdev=id1 -device virtio-rng-pci,rng=rng0 -object rng-random,filename=/dev/urandom,id=rng0 -append quiet nomodeset panic=-1 sshkey=ZWNkc2Etc2hhMi1uaXN0cDI1NiBBQUFBRTJWalpITmhMWE5vWVRJdGJtbHpkSEF5TlRZQUFBQUlibWx6ZEhBeU5UWUFBQUJCQkdRN3JydE1PdHcvR2JJVHhlakljblVDUC9Pa1pYSjFOaWRCK3gzYUpqYVZLajJEc0JUeUltWk5YdzNaSnRBRDlnRnZXampYQ2VaNGpZd3BWQmU5azU4PQo= -fsdev local,security_model=mapped,id=fsdev100,path=$HOME/tmp/melange-workspace-1187461383 -device virtio-9p-pci,id=fs100,fsdev=fsdev100,mount_tag=defaultshare -object iothread,id=io1 -device virtio-blk-pci,drive=disk0,iothread=io1 -drive if=none,id=disk0,cache=none,format=raw,aio=threads,werror=report,rerror=report,file=./3887205368.img
2025/01/05 23:12:26 INFO qemu: waiting for ssh to come up, try 1 of 6000
2025/01/05 23:12:33 ERRO Failed to dial: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2025/01/05 23:12:33 INFO ERROR: failed to test package. the test environment has been preserved:
2025/01/05 23:12:33 INFO   workspace dir: $HOME/tmp/melange-workspace-1187461383
2025/01/05 23:12:33 INFO   guest dir: $HOME/tmp/melange-guest-1343812321
2025/01/05 23:12:33 ERRO failed to test package: unable to start pod: qemu: could not get VM host key

Note that, when attempting to reproduce, the make test/PKG target in wolfi-dev/os/Makefile does not honor the $MELANGE_OPTS environment variable as emitted by make fetch-kernel; in order to run the test target with the qemu runner, MELANGE_EXTRA_OPTS="--runner qemu" must be set.

This failure occurs because, unlike during a build, apko does not create the build user within the guest for a test. When the qemu runner tries to capture the ssh host public key from the guest, it connects as the build user, as seen in getHostKey:

func getHostKey(ctx context.Context, cfg *Config) error {
	var hostKey ssh.PublicKey
	signer, err := ssh.ParsePrivateKey(cfg.SSHKey)
	if err != nil {
		clog.FromContext(ctx).Errorf("Unable to parse private key: %v", err)
		return err
	}
	// Create SSH client configuration
	config := &ssh.ClientConfig{
		User: "build",

Manually editing the above to set User: "root" causes the connection to succeed and the tests to run.

One can see the difference in the apko configuration emitted by melange as part of the build and test output.

melange build for nano:

2025/01/05 23:29:27 INFO image configuration:
2025/01/05 23:29:27 INFO   contents:
2025/01/05 23:29:27 INFO     build repositories: []
2025/01/05 23:29:27 INFO     runtime repositories: []
2025/01/05 23:29:27 INFO     keyring:      []
2025/01/05 23:29:27 INFO     packages:     [apk-tools=2.14.4-r1 autoconf=2.72-r1 automake=1.17-r0 binutils=2.43.1-r2 build-base=1-r8 busybox=1.37.0-r0 ca-certificates-bundle=20241121-r0 e2fsprogs-libs=1.47.2-r0 e2fsprogs=1.47.2-r0 gcc=14.2.0-r7 glibc-dev=2.40-r3 glibc-locale-posix=2.40-r3 glibc=2.40-r3 gmp=6.3.0-r2 iproute2=6.12.0-r0 iptables=1.8.11-r0 isl=0.27-r0 kmod=33-r2 ld-linux=2.40-r3 libatomic=14.2.0-r7 libblkid=2.40.2-r1 libbz2-1=1.0.8-r9 libcom_err=1.47.2-r0 libcrypt1=2.40-r3 libcrypto3=3.4.0-r5 libelf=0.192-r1 libfdisk=2.40.2-r1 libgcc=14.2.0-r7 libgo=14.2.0-r7 libgomp=14.2.0-r7 libmnl=1.0.5-r4 libmount=2.40.2-r1 libnftnl=1.2.8-r1 libquadmath=14.2.0-r7 libsmartcols=2.40.2-r1 libssl3=3.4.0-r5 libstdc++-dev=14.2.0-r7 libstdc++=14.2.0-r7 libuuid=2.40.2-r1 libxcrypt-dev=4.4.37-r0 libxcrypt=4.4.37-r0 libzstd1=1.5.6-r5 linux-headers=6.6.69-r0 m4=1.4.19-r6 make=4.4.1-r4 melange-microvm-init=0.18.3-r0 mount=2.40.2-r1 mpc=1.3.1-r5 mpfr=4.2.1-r5 ncurses-dev=6.5_p20241228-r0 ncurses-terminfo-base=6.5_p20241228-r0 ncurses=6.5_p20241228-r0 nss-db=2.40-r3 nss-hesiod=2.40-r3 openssf-compiler-options=20240627-r6 openssh-keygen=9.9_p1-r3 openssh-server-config=9.9_p1-r3 openssh-server=9.9_p1-r3 perl=5.40.0-r3 pkgconf=2.3.0-r1 posix-cc-wrappers=1-r4 scanelf=1.3.8-r1 sqlite-libs=3.47.2-r0 util-linux-misc=2.40.2-r1 util-linux=2.40.2-r1 wget=1.25.0-r0 wolfi-base=1-r6 wolfi-baselayout=20230201-r15 wolfi-keys=1-r8 xz=5.6.3-r2 zlib=1.3.1-r4]
2025/01/05 23:29:27 INFO   accounts:
2025/01/05 23:29:27 INFO     runas:  
2025/01/05 23:29:27 INFO     users:
2025/01/05 23:29:27 INFO       - uid=1000(build) gid=824640536780
2025/01/05 23:29:27 INFO     groups:
2025/01/05 23:29:27 INFO       - gid=1000(build) members=[build]
2025/01/05 23:29:27 INFO auth configured for: []

melange test for nano:

2025/01/05 23:12:22 INFO image configuration:
2025/01/05 23:12:22 INFO   contents:
2025/01/05 23:12:22 INFO     build repositories: []
2025/01/05 23:12:22 INFO     runtime repositories: []
2025/01/05 23:12:22 INFO     keyring:      []
2025/01/05 23:12:22 INFO     packages:     [nano]
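
The difference between the two configurations above can be sketched as a simple check: does the image configuration declare the user that the runner will log in as? The following is only an illustration. The ImageConfig and User types and the hasUser helper are hypothetical stand-ins, not melange's or apko's actual types.

```go
package main

import "fmt"

// User is a hypothetical stand-in for an entry under "accounts: users:"
// in the logged image configuration. It is not apko's real type.
type User struct {
	Name string
	UID  int
}

// ImageConfig is a hypothetical, trimmed-down image configuration.
type ImageConfig struct {
	Packages []string
	Users    []User
}

// hasUser reports whether cfg declares a user with the given name.
func hasUser(cfg ImageConfig, name string) bool {
	for _, u := range cfg.Users {
		if u.Name == name {
			return true
		}
	}
	return false
}

func main() {
	// Mirrors the build log: a "build" user with uid 1000 is present.
	buildCfg := ImageConfig{
		Packages: []string{"build-base", "openssh-server"},
		Users:    []User{{Name: "build", UID: 1000}},
	}
	// Mirrors the test log: only the package under test, no accounts.
	testCfg := ImageConfig{Packages: []string{"nano"}}

	fmt.Println("build config has build user:", hasUser(buildCfg, "build"))
	fmt.Println("test config has build user:", hasUser(testCfg, "build"))
}
```

With no build user in the test configuration, an ssh login as build has no account to authenticate against, which is consistent with the handshake failure in the log.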

There is a larger, overarching issue: the user that runs the build is inconsistent across runners. The qemu runner runs the build as root, whereas the docker runner runs it as build.

stevebeattie added a commit to stevebeattie/melange that referenced this issue Jan 13, 2025
The Test pipeline environment does not get a build user included by
default; this causes test pipelines to fail when the qemu runner is used
because the initial connection attempt to get the host public key always
uses the build user to connect. Fix this by adding the build user to all
Test environments, both the primary pipeline and any subpipelines.

Fixes: chainguard-dev#1732
Signed-off-by: Steve Beattie <[email protected]>
@stevebeattie
Member Author

I have a WIP fix for this in main...stevebeattie:melange:add-build-user-to-test-environments but I need to fix up the test_test.go tests to expect the added user, and also I'm not entirely sure this is the right approach to fix this.
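
For illustration only, the approach described in the commit message (adding the build user to every test environment) might look roughly like the sketch below. This is not the code from the WIP branch. The TestEnv, Accounts, User and Group types and the ensureBuildUser helper are hypothetical, and the uid/gid of 1000 is taken from the melange build log earlier in this issue.

```go
package main

import "fmt"

// Hypothetical account types; not melange's or apko's real definitions.
type User struct {
	UserName string
	UID      uint32
	GID      uint32
}

type Group struct {
	GroupName string
	GID       uint32
	Members   []string
}

type Accounts struct {
	Users  []User
	Groups []Group
}

// TestEnv is a hypothetical test environment configuration.
type TestEnv struct {
	Packages []string
	Accounts Accounts
}

const (
	buildName = "build"
	buildID   = 1000 // matches uid=1000(build) / gid=1000(build) in the build log
)

// ensureBuildUser adds the build user and group to env if they are missing.
// It is idempotent, so calling it more than once does not add duplicates.
func ensureBuildUser(env *TestEnv) {
	hasUser := false
	for _, u := range env.Accounts.Users {
		if u.UserName == buildName {
			hasUser = true
			break
		}
	}
	if !hasUser {
		env.Accounts.Users = append(env.Accounts.Users,
			User{UserName: buildName, UID: buildID, GID: buildID})
	}

	hasGroup := false
	for _, g := range env.Accounts.Groups {
		if g.GroupName == buildName {
			hasGroup = true
			break
		}
	}
	if !hasGroup {
		env.Accounts.Groups = append(env.Accounts.Groups,
			Group{GroupName: buildName, GID: buildID, Members: []string{buildName}})
	}
}

func main() {
	// The primary test environment plus one hypothetical subpackage environment.
	envs := []*TestEnv{
		{Packages: []string{"nano"}},
		{Packages: []string{"nano-doc"}},
	}
	for _, env := range envs {
		ensureBuildUser(env)
		fmt.Printf("%v: users=%d groups=%d\n",
			env.Packages, len(env.Accounts.Users), len(env.Accounts.Groups))
	}
}
```

Any existing tests that compare the generated test configuration against an expected value would need to account for the added user and group, which matches the test_test.go fix-up mentioned above.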

stevebeattie added a commit to stevebeattie/melange that referenced this issue Jan 14, 2025
@stevebeattie stevebeattie linked a pull request Jan 14, 2025 that will close this issue