@matthewsheerin matthewsheerin commented Jun 30, 2025

Problem

We have various repos building UBI-based containers using rpmoci. That presents a problem, because whenever an RPM is bumped upstream, the old version is yanked and locked rpmoci builds break immediately.

We tried to mitigate this by using an Artifactory mirror, but the mirror serves repository metadata from the upstream rather than from its own cache, so RPMs still become unavailable to dnf even if the internal mirror configured in rpmoci.toml has them cached.

We therefore added a separate internal escrow RPM store. That faced two issues:

  1. rpmoci update failed, because dnf found the same RPMs in both the mirror and the escrow, and resolution didn't de-duplicate them, so they conflicted with each other. This turned out not to be blocking and not to need any code changes (see fix discussion below).
  2. After working around 1, builds using just the escrow store failed when the checksum algorithm differed between the upstream (as presented by the mirror) and the escrow store.

One thing that makes this work is that we don't check GPG keys on these internal repos. That makes them interchangeable at install time, because the repoid field in the lockfile is only used for GPG checking.

Fix

The code changes don't specifically mention escrow, and instead refer to the more general case that the checksum algorithm (and checksum) differ between resolution and download, which in theory could happen for other reasons too. For example, metadata could be updated to use a more secure hash algorithm.

  1. Originally I fixed this by adding some de-duplication logic. I then realised that setting a priority value for the repos also works, so this is a non-issue. I wanted to keep the de-duplication in, but factoring in the (optional) priority value when de-duplicating was very finicky, mostly because dnf's Python interface is awkward and very poorly documented. There is scope to add this in future, possibly by working out how to tell dnf to return just one RPM.
  2. Spot such mismatches, and verify those checksums after downloading the corresponding RPMs. I refactored download.py to have a new Package class to make this easier to write.
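
A minimal sketch of the post-download check described in point 2, to make the idea concrete. It is not the code in download.py: the `LockedPackage` class, its fields, and the function names are hypothetical, and only the standard-library `hashlib` is assumed.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LockedPackage:
    """Hypothetical lockfile entry; not rpmoci's actual Package class."""
    name: str
    algorithm: str  # algorithm recorded in the lockfile, e.g. "sha256"
    checksum: str   # hex digest recorded at resolution time


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks with the named algorithm and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def needs_post_download_check(pkg, repo_algorithm):
    """True when the repo metadata uses a different algorithm from the lockfile."""
    return pkg.algorithm.lower() != repo_algorithm.lower()


def verify_download(pkg, path):
    """Re-hash the downloaded file with the lockfile's algorithm and compare."""
    actual = file_digest(path, pkg.algorithm)
    if actual != pkg.checksum.lower():
        raise ValueError(
            f"checksum mismatch for {pkg.name}: "
            f"expected {pkg.algorithm}:{pkg.checksum}, got {pkg.algorithm}:{actual}"
        )
```

The point of the sketch is that the lockfile's recorded algorithm, not the one advertised by the repo serving the download, decides how the file is hashed, so a sha256 lockfile entry can still be checked against an RPM fetched from a repo whose metadata uses sha1.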

Example of configuring escrow. Note that the priority needs to be quoted because the options hash requires both keys and values to be strings:

# Set the escrow as lower priority than everything else to allow rpmoci to disambiguate, and have the non-escrow location
# be recorded in the lockfile. Confusingly, lower priority values are preferred.
repositories = [
    { url = "https://artifactory.com/artifactory/foo-mirror", id = "foo-mirror", options = { gpgcheck = "False", priority = "1" } },
    { url = "https://artifactory.com/artifactory/my-escrow", id = "my-escrow", options = { gpgcheck = "False", priority = "2" } },
]
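
To illustrate the effect of those priority values, here is a small sketch of choosing one candidate per package when the same RPM appears in several repos. dnf applies priorities itself during resolution, so this is not how dnf or rpmoci implements it; the `Candidate` type and `pick_by_priority` function are hypothetical. The default of 99 is dnf's documented default priority.

```python
from typing import NamedTuple

DEFAULT_PRIORITY = 99  # dnf's default when a repo sets no priority


class Candidate(NamedTuple):
    """Hypothetical resolved package: its NEVRA string and the repo offering it."""
    nevra: str
    repoid: str


def pick_by_priority(candidates, priorities):
    """Keep one candidate per NEVRA, preferring the lowest priority value.

    `priorities` maps repoid to a priority given as a string, mirroring the
    quoted values in the options table. Ties keep the first candidate seen.
    """
    chosen = {}
    for cand in candidates:
        prio = int(priorities.get(cand.repoid, DEFAULT_PRIORITY))
        best = chosen.get(cand.nevra)
        if best is None or prio < best[0]:
            chosen[cand.nevra] = (prio, cand)
    return [cand for _, cand in chosen.values()]


priorities = {"foo-mirror": "1", "my-escrow": "2"}
candidates = [
    Candidate("bash-5.1.8-9.el9.x86_64", "my-escrow"),
    Candidate("bash-5.1.8-9.el9.x86_64", "foo-mirror"),
]
picked = pick_by_priority(candidates, priorities)
```

With the priorities from the configuration above, the mirror copy wins, which matches the stated aim of having the non-escrow location recorded in the lockfile.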

Bonus fixes

  • Make the ordering of GPG keys in the lockfile consistent to reduce churn, especially false-positive updates where only the order has changed and none of the RPM versions have.
  • Remove a stray debugging line that causes 3 to be printed during rpmoci updates.
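
As a rough illustration of the first bullet, sorting before serialising is enough to make the output independent of the order in which keys were discovered. The data shape below (a mapping from repoid to a list of key strings) and the function name are assumptions for the example, not rpmoci's actual lockfile format or code.

```python
import json


def stable_gpg_keys(keys_by_repo):
    """Return a copy with repoids and each repo's keys in sorted order."""
    return {repoid: sorted(keys) for repoid, keys in sorted(keys_by_repo.items())}


# Two runs that discovered the same keys in a different order.
run_a = {"foo-mirror": ["key-b", "key-a"], "my-escrow": ["key-c"]}
run_b = {"my-escrow": ["key-c"], "foo-mirror": ["key-a", "key-b"]}

serialised_a = json.dumps(stable_gpg_keys(run_a), indent=2)
serialised_b = json.dumps(stable_gpg_keys(run_b), indent=2)
```

Both runs serialise to the same text, so an update that changes nothing but discovery order produces no diff.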

This also bumps the version to get a new release out.

Testing

Compiled locally and:

  1. Ran an rpmoci update while setting priorities as above. The lockfile is consistently updated to have the GPG keys ordered alphabetically, and all RPMs have their repoid set to the mirrors rather than the escrow. Locked build succeeds.
  2. Edited rpmoci.toml to have excludepkgs = "*" for all but the escrow repo. Locked build succeeds, despite the lockfile having sha256 checksums and the escrow metadata having sha1.
  3. Edited a checksum to make it incorrect. Locked build fails with an error that includes the expected and actual checksums.
  4. Did an update so that all RPMs now come from escrow (which is something that maint branches will do). Checksum algorithms are now reported as sha1. Locked build succeeds.
  5. Edited a checksum to make it incorrect. Locked build fails with the existing error message indicating that the package is missing.

This prevents rpmoci updates that seem like they have changes, but are
just reordering the GPG entries.

The checksum algorithm (and hence checksum) can differ between resolving
the RPM and writing the lockfile, and downloading the RPM later. For
these cases, verify the checksum after downloading the RPM instead by
hashing the package ourselves with the recorded algorithm.
@matthewsheerin matthewsheerin changed the title Support escrow (+ other fixes) Support checksum algorithm mismatch (+ other fixes) Jul 1, 2025
@tofay tofay merged commit fd11658 into Metaswitch:main Jul 1, 2025
5 checks passed