|
| 1 | +--- |
| 2 | +title: "kexec" |
| 3 | +date: 2023-12-10T21:32:56-08:00 |
| 4 | +draft: false |
| 5 | +--- |
| 6 | + |
| 7 | +* What is kexec? |
| 8 | + |
| 9 | +~kexec~ is short for "kernel execute". At a high level, it is analogous to the |
| 10 | +~exec~ family of syscalls: ~kexec~ replaces the currently running kernel image |
| 11 | +in memory with a new kernel image and begins executing it. tl;dr it enables |
| 12 | +users to run new kernels without needing to do a full power cycle of a system. |
| 13 | + |
| 14 | +* What is the value of kexec? |
| 15 | + |
| 16 | ++ Enables a "boot once" flow for testing a potentially problematic kernel. |
| 17 | + + No bootloader configuration required. |
| 18 | ++ Does not require a complete system reboot (BIOS boot) to swap between kernel |
| 19 | + images for testing. |
| 20 | ++ Lowers the difficulty of setting up ad-hoc kernel development environments |
| 21 | + using the latest kernel source tree. |
| 22 | + |
| 23 | +* What is required to use kexec? |
| 24 | + |
| 25 | +** Kconfig |
| 26 | + |
| 27 | +The currently booted kernel (the one whose image will be replaced) must be |
| 28 | +built with the following kernel configuration options. |
| 29 | + |
| 30 | +#+BEGIN_SRC |
| 31 | + CONFIG_KEXEC=y # Enables support for general kexec syscall functionality. |
| 32 | + CONFIG_KEXEC_FILE=y # Enables kexec_file_load syscall. See the manual for kexec_load(2) for more details. |
| 33 | + |
| 34 | + # Optional security options |
| 35 | + CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=n # Verify the signature of the bzImage used in kexec |
| 36 | + CONFIG_KEXEC_SIG=n # If the kernel image has a signature, make sure the signature is valid when using the kexec_file_load syscall |
| 37 | + CONFIG_KEXEC_SIG_FORCE=n # Enforce that the kernel image used in the kexec_file_load syscall has a valid signature |
| 38 | +#+END_SRC |
| 39 | + |
| 40 | +** Userspace |
| 41 | + |
| 42 | +You will need ~kexec-tools~ in userspace to be able to configure a kernel image |
| 43 | +for ~kexec~. |
| 44 | + |
| 45 | +* Using kexec |
| 46 | + |
| 47 | +Here is an example of loading a new kernel image to execute in place of the |
| 48 | +currently running image while reusing the kernel command line of the currently |
| 49 | +running kernel. |
| 50 | + |
| 51 | +#+BEGIN_SRC sh |
| 52 | + kexec -l /path/to/vmlinuz --initrd=/path/to/initramfs.img --reuse-cmdline |
| 53 | + kexec -e |
| 54 | +#+END_SRC |
| 55 | + |
| 56 | +The ~--initrd~ flag is optional. It is used when an initramfs image is needed to |
| 57 | +properly bring up a system. ~kexec -e~ then immediately executes the loaded |
| 58 | +kernel without gracefully shutting down any running system services that might |
| 59 | +be impacted. |
| 60 | + |
| 61 | +~systemd~ offers a more graceful ~kexec~ flow that stops services first. It |
| 62 | +also supports ad-hoc clean-up services via ~WantedBy=kexec.target~ in the |
| 63 | +~[Install]~ section of a systemd service definition. In this flow, ~kexec -e~ |
| 64 | +is replaced with ~systemctl kexec~. |
| 65 | + |
| 66 | +If you want to unload the target kernel that was previously loaded by ~kexec~, |
| 67 | +~kexec -u~ will unload the currently loaded (not yet executed) target kernel. |
| 68 | + |
| 69 | +There are a number of options for the ~kexec~ commandline tool that are |
| 70 | +documented in the manual for ~kexec(8)~. |
| 71 | + |
| 72 | +Usage details for ~kexec~ are also well documented on the Arch Linux wiki and |
| 73 | +Gentoo wiki. |
| 74 | + |
| 75 | +SUSE also has a more thorough [[https://documentation.suse.com/de-de/sles/15-GA/html/SLES-all/cha-tuning-kexec.html][write-up]] on ~kexec~ and ~kdump~ in its SLES documentation. |
| 76 | + |
| 77 | +NOTE: It seems Gentoo uses a patched version of ~reboot~ to offer a graceful |
| 78 | +~-k~ flag. Gentoo probably does this since it offers an alternative to |
| 79 | +~systemd~, OpenRC. |
| 80 | + |
| 81 | +* Why is kexec not enabled everywhere? |
| 82 | + |
| 83 | +As hinted at earlier in this article, ~kexec~ is not an infallible process. |
| 84 | +Unlike a userspace program, which simply has its instructions overwritten by |
| 85 | +an ~exec~ syscall, the kernel image being replaced is in charge of managing |
| 86 | +devices at a very low level. Device teardown and re-initialization may not |
| 87 | +happen in a way that leaves an "already-running" system in a stable state, |
| 88 | +since the firmware does not reset the hardware in between. |
| 89 | + |
| 90 | +A related idea, changing the code of a running kernel without rebooting or |
| 91 | +compromising the system, is called kernel livepatching. Kernel livepatching is |
| 92 | +an active area of work, with multiple entities taking their own approaches. |
| 93 | + |
| 94 | ++ [[https://ubuntu.com/blog/an-overview-of-live-kernel-patching][Canonical's solution without kexec by using ftrace hooking]] |
| 95 | ++ [[https://documentation.suse.com/sles/12-SP4/html/SLES-kgraft/index.html][kGraft by SUSE, which similarly uses ftrace hooking]] |
| 96 | ++ [[https://www.redhat.com/en/topics/linux/what-is-linux-kernel-live-patching#the-two-spaces-of-linux-system-operations][Red Hat's ftrace-hooking kpatch livepatch solution]] |
| 97 | ++ [[https://en.wikipedia.org/wiki/Ksplice#Design][Ksplice using its own injector and hooking mechanism]] |
| 98 | + |
| 99 | +*NOTE:* Ksplice was initially developed by MIT students, which illustrates |
| 100 | +that kernel livepatching is a topic |
| 101 | +worth academic exploration. |
| 102 | + |
| 103 | +*DISCLAIMER:* I have not studied any of the above approaches in detail. I just |
| 104 | +felt I should draw attention to them for curious readers interested in ways to |
| 105 | +potentially make ~kexec~ "more robust". |
| 106 | + |
| 107 | +* Alternatives to kexec |
| 108 | + |
| 109 | +For "boot once" testing, configuring the bootloader to boot a kernel only |
| 110 | +once is an option. However, many bootloaders have an involved process for |
| 111 | +achieving this. |
| 112 | + |
| 113 | +I have not found decent documentation on how to do this for GRUB 2 (not GRUB |
| 114 | +Legacy). In practice, I might work with a number of systems using different |
| 115 | +bootloaders, such as ~systemd-boot~, and learning how to do this with every |
| 116 | +bootloader implementation out there seems like an adventure. |
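
The Kconfig and "Using kexec" sections of the post can be checked from a shell before trying anything destructive. The following is a minimal sketch, not part of the original post: the function names ~kexec_config_check~ and ~kexec_is_loaded~ are hypothetical helpers, and the config file location is an assumption that varies by distribution. The sysfs file ~/sys/kernel/kexec_loaded~ reads ~1~ once ~kexec -l~ has staged a kernel.

```sh
#!/bin/sh
# Sketch: sanity checks around kexec. Function names are hypothetical helpers.

# kexec_config_check CONFIG_FILE
# Reports whether the two options listed in the Kconfig section are built in.
# CONFIG_FILE must be a plain-text kernel config.
# Returns 0 only when both options are set to "y".
kexec_config_check() {
    config_file=$1
    missing=0
    for opt in CONFIG_KEXEC CONFIG_KEXEC_FILE; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: missing"
            missing=1
        fi
    done
    return "$missing"
}

# kexec_is_loaded [STATE_FILE]
# Succeeds when a target kernel is currently staged.
# STATE_FILE defaults to the sysfs file; the parameter exists only so the
# check can be exercised against an ordinary file.
kexec_is_loaded() {
    state_file=${1:-/sys/kernel/kexec_loaded}
    [ "$(cat "$state_file" 2>/dev/null)" = "1" ]
}
```

Assuming the distribution ships a config under ~/boot~, a typical call would be ~kexec_config_check "/boot/config-$(uname -r)"~. On kernels built with ~CONFIG_IKCONFIG_PROC~, the output of ~zcat /proc/config.gz~ can be written to a temporary file and checked instead. Running ~kexec_is_loaded && echo staged~ after ~kexec -l~ (and again after ~kexec -u~) shows whether a target kernel is waiting to be executed.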