Use predictable VMs for jobs in the CI #13619

ssbarnea · 2025-07-25T17:07:29Z

In general it is better not to use -latest runners because they are
very often outdated versions, causing problems. As GitHub does not
update the names of the runners very often, it is easier to use
specific ones. This also prevents surprises when GitHub changes the
runner versions: they do this gradually, so -latest might point to
different runners in different jobs.

Reference: https://github.com/actions/runner-images
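
As a minimal sketch of what pinning looks like in a workflow file: the workflow name, job name, and the specific label below are illustrative assumptions, not the actual diff of this PR.

```yaml
# Hypothetical workflow excerpt (not the real .github/workflows/test.yml).
name: test
on: [push, pull_request]

jobs:
  build:
    # Pinned label: the job stays on this image family until someone edits this line.
    # With `ubuntu-latest`, GitHub decides when the label starts pointing to a newer image.
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - run: python --version
```

The only change compared to a -latest setup is the value of `runs-on`; everything else in the job stays the same.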

Pierre-Sassoulas

Looks good!

.python-version

.github/workflows/test.yml

nicoddemus · 2025-07-25T22:41:42Z

I'm hesitant to merge this, because we will not catch problems in the "outdated" Python versions, which pytest technically still supports.

Note that the tox issue has also been fixed already.

webknjaz · 2025-07-26T07:00:14Z

I think that the VM pinning part is good. I'm also uneasy about the rest.

In general it is better not to use `-latest` runners because they are very often outdated versions, causing problems. As GitHub does not update the names of the runners very often, it is easier to use specific ones. This also prevents surprises when GitHub changes the runner versions: they do this gradually, so `-latest` might point to different runners in different jobs.

Related: tox-dev/tox#3565

Reference: https://github.com/actions/runner-images

RonnyPfannschmidt · 2025-07-27T22:09:41Z

Btw, does Dependabot update those?

Pierre-Sassoulas · 2025-07-28T05:40:19Z

Having a matrix of OS versions (oldest/newest), while technically a good thing, would double the runner cost for very little value imo. I remember a single instance where the OS version number mattered in a bug report (a new macOS release removing an old API, or us using a new macOS API implicitly, not sure). I think we should settle on either oldest supported or newest supported and keep one. Using the name 'latest' for 'oldest OS version supported' is very counterintuitive in any case.
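
To make the cost trade-off concrete, here is a hedged sketch of an OS matrix with a single pinned image per platform. The labels and job name are illustrative assumptions, not the repository's actual configuration.

```yaml
# Hypothetical workflow excerpt illustrating one pinned image per platform.
name: test
on: [push, pull_request]

jobs:
  test:
    strategy:
      matrix:
        # One entry per platform. Listing both an oldest and a newest image
        # per platform (e.g. ubuntu-22.04 and ubuntu-24.04) would run every
        # job twice on that platform, which is the doubled cost mentioned above.
        os: [ubuntu-24.04, windows-2025, macos-15]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: python --version
```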

webknjaz · 2025-07-28T09:44:56Z

I think the PR title is misleading. The current diff basically pins the runner VMs to the extent possible. It's a good idea to pin them just like software deps for better predictability.

vivodi · 2025-07-30T09:02:51Z

While I understand the desire for pinning, I believe that sticking with the -latest runners is more beneficial for pytest for a few key reasons:

Maintenance Burden vs. Security: Pinning versions shifts the responsibility of keeping the environment secure and up-to-date onto us. It creates a significant maintenance burden to manually track and update the runner image. By using -latest, we automatically receive the most recent software and security patches from GitHub, preventing our CI from running on a stale or potentially vulnerable environment.
Testing on a Relevant Environment: The -latest tag ensures we are always testing against a modern OS that is representative of what our users have. For a foundational library like pytest, staying current with the ecosystem is more valuable than locking into an environment that will quickly become dated. GitHub's gradual rollout of new -latest images also minimizes the risk of sudden, unexpected breakages.
Reproducibility Where It Counts: True test reproducibility comes from pinning our software dependencies (like Python versions, libraries, etc.) within the workflow itself, which we already do. The runner is the environment, and it's better for that environment to be modern and managed by the platform provider.

For these reasons, the automated updates and relevance provided by *-latest seem to outweigh the risks. I suggest we continue with the current approach.

vivodi · 2025-07-30T09:37:53Z

I don't think the reasoning in your PR is valid. @ssbarnea

In general it is better not to use -latest runners because they are very often outdated versions, causing problems.

The term 'latest' implies the most recent version. Are you perhaps misunderstanding its meaning?

As github does not update the name of the runners very often, it is easier to use specific ones.

Based on historical data, GitHub designates a new runner image with the '*-latest' tag within four months (typically around two months) after it becomes stable.

ssbarnea · 2025-07-30T14:15:20Z

I don't think the reasoning in your PR is valid. @ssbarnea

In general it is better not to use -latest runners because they are very often outdated versions, causing problems.

The term 'latest' implies the most recent version. Are you perhaps misunderstanding its meaning?

As github does not update the name of the runners very often, it is easier to use specific ones.

Based on historical data, GitHub designates a new runner image with the '*-latest' tag within four months (typically around two months) after it becomes stable.

@vivodi I can only guess that you do not have much experience with GHA in general. While at the industry level 'latest' has the meaning you assumed, that is not the case for GHA runners. Looking back over the last decade, ~75% of the time -latest pointed to an older version of the GitHub runners when a newer -xxx version was already available. For the rest of the time it happened to be the latest. Basically, you need to mentally translate -latest to -stable.

The good part is that they add new runners only every 2-3 years for each platform, which means that using fixed ones is the better choice. I hope this explains it better.

In all other cases I know I would advocate for using -latest, but not for github runners.

There is also an even worse reason for not using them: GitHub will start to use new runners at an unpredictable moment, without telling you exactly when. That is because they migrate repositories to newer runners gradually, over a transition that lasts multiple months. You have zero control over which runner is used if you use -latest, and you will have to dig into a failed job to discover that it used a different runner.
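
One way to make the image visible in every job log, rather than only after a failure, is a step that prints it. This is a sketch under an assumption: it relies on the `ImageOS` and `ImageVersion` environment variables that GitHub-hosted runner images appear to set; check the actions/runner-images documentation before depending on them. The workflow and job names are hypothetical.

```yaml
# Hypothetical diagnostic workflow; names are illustrative.
name: show-runner-image
on: [push]

jobs:
  show-image:
    runs-on: ubuntu-latest
    steps:
      # ImageOS / ImageVersion are assumed to be provided by the hosted image.
      # The `:-unknown` fallback keeps the step from printing empty values
      # if they are missing (for example on a self-hosted runner).
      - name: Print runner image
        shell: bash
        run: echo "image=${ImageOS:-unknown} version=${ImageVersion:-unknown}"
```

With a pinned `runs-on` value the printed image family should stay constant between runs; with -latest it can change during a migration window.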

Your reasoning makes sense for other stuff, not for GitHub runners. In fact, the pinned ones receive updates the same way as the -latest ones, because -latest just points to one of the pinned ones; you just have no way of knowing which one. So -latest is no more secure and no more reproducible.

Now, if we argue about using a heavily outdated and unsupported runner that is going to be retired soon, that is another case. Yep, using ubuntu-18.04 is clearly worse than ubuntu-latest, but using ubuntu-24.04 is still superior.

If you want to keep them updated automatically, try something like https://docs.renovatebot.com/modules/manager/github-actions/

webknjaz · 2025-07-30T17:19:18Z

@ssbarnea could you add a contrib change note? This change is worth exposing to the contributors, especially more infrequent ones.

vivodi · 2025-07-31T15:28:45Z

If you want to keep them updated automatically, try something like https://docs.renovatebot.com/modules/manager/github-actions/

@ssbarnea I believe that as the author of this PR, you have the responsibility to do so within this PR.

The pytest repository is currently using Dependabot.

webknjaz · 2025-07-31T15:32:27Z

@vivodi I don't think onboarding new automations is in the scope. But regardless, this change improves predictability greatly.

vivodi · 2025-07-31T15:42:29Z

@vivodi I don't think onboarding new automations is in the scope. But regardless, this change improves predictability greatly.

Before this PR is merged, the pytest repository's workflows always run on an almost up-to-date runner image (given that the -latest tag moves to a new runner image only a few months after that image stabilizes).

If this PR cannot deliver automation, then in the future the pytest repository’s workflows may end up running on a deprecated runner image, posing security risks.

In that case, it would be better not to merge this PR at all.

webknjaz · 2025-07-31T16:19:54Z

I don't think your assessment is reasonable: it's not like nobody maintains this repo. Such updates can happen in a fairly predictable manner. Having a stable CI is always a priority.

If anything, things like dependabot and renovate are quite spammy and this annoyance is not something that can be dismissed easily.

But if you can hire somebody to be responsible for security, please do so. Burdening FOSS maintainers with this is irresponsible at best.

vivodi · 2025-07-31T16:32:23Z

In general it is better not to use -latest runners because they are
very often outdated versions, causing problems.

(This is a direct quote from the author’s PR description)

The author’s original intention in creating this PR was based on the belief that the -latest runner image is not truly up-to-date but “outdated.” They wanted to always use the latest runner image, and the only way to achieve that is by using Dependabot for automatic updates. The point you @webknjaz raised in rebuttal is in direct conflict with the original intention behind creating this PR.

If anything, things like dependabot and renovate are quite spammy and this annoyance is not something that can be dismissed easily.

The author of this PR wants to always use the latest runner image. Moreover, runner image updates are infrequent—no more than three times a year (for Ubuntu, macOS, and Windows). I have no idea how you arrived at the conclusion that “Burdening FOSS maintainers with this is irresponsible at best.”

webknjaz · 2025-07-31T16:50:12Z

The PR scope has changed since it was opened and it is a bit different now. I don't think that description accurately reflects why this is useful. I've observed statically set VM values contributing to stability. Having a random factor like GH updating a runner at an unspecified point in time hurts traceability and makes troubleshooting difficult when this happens.
Setting up additional services is a scope creep. If you want to discuss this, do it in a separate issue. This PR already contains a useful improvement.

ssbarnea added the `skip news` label (used on PRs to opt out of the changelog requirement) · Jul 25, 2025

ssbarnea force-pushed the fix/runners branch 2 times, most recently from b608574 to da50728 · July 25, 2025 17:37

ssbarnea changed the title ~~build: replace outdated windows-latest runner with windows-2025~~ build(wip): avoid running outdated python 3.9.x and 3.10.x on github runners · Jul 25, 2025

ssbarnea force-pushed the fix/runners branch 3 times, most recently from 7b97ad5 to 4d2ee07 · July 25, 2025 17:54

Pierre-Sassoulas reviewed · Jul 25, 2025

.python-version (outdated)

webknjaz reviewed · Jul 25, 2025

.github/workflows/test.yml (outdated)

ssbarnea changed the title ~~build(wip): avoid running outdated python 3.9.x and 3.10.x on github runners~~ build: use avoid using outdated github runners · Jul 27, 2025

ssbarnea requested review from webknjaz and Pierre-Sassoulas · July 27, 2025 21:30

ssbarnea marked this pull request as ready for review · July 27, 2025 21:30

ssbarnea force-pushed the fix/runners branch from 0953f05 to 94d317c · July 27, 2025 21:31

Merge branch 'main' into fix/runners (2d54444)

webknjaz changed the title ~~build: use avoid using outdated github runners~~ Use predictable VMs for jobs in the CI · Jul 31, 2025
