Commit efd420c

Auto merge of #144244 - jieyouxu:pr-full-ci, r=Kobzol
Enforce that PR CI jobs are a subset of Auto CI jobs modulo carve-outs

### Background

Currently, it is possible for a PR with red PR-only CI to pass Auto CI, after which all subsequent PR CI runs will be red until that is fixed, even in completely unrelated PRs. For instance, this happened with the PR-CI-only Spellcheck (#144183).

See more discussion at [#t-infra > Spellcheck workflow now fails on all PRs (tree bad?)](https://rust-lang.zulipchat.com/#narrow/channel/242791-t-infra/topic/Spellcheck.20workflow.20now.20fails.20on.20all.20PRs.20.28tree.20bad.3F.29/with/529769404).

### CI invariant: PR CI jobs are a subset of Auto CI jobs modulo carve-outs

To prevent red PR CI in completely unrelated subsequent PRs and PR CI runs, we need to maintain an invariant that **PR CI jobs are a subset of Auto CI jobs modulo carve-outs**.

This is **not** a "strict" subset relationship: some jobs necessarily have to differ under PR CI and Auto CI environments, at least in the current setup. Still, we can try to enforce a weaker "subset modulo carve-outs" relationship between PR CI jobs and their corresponding Auto jobs. For instance:

- `x86_64-gnu-tools` will have `auto`-only env vars like `DEPLOY_TOOLSTATES_JSON: toolstates-linux.json`.
- `tidy` will want `continue_on_error: true` in PR CI to allow for more "useful" compilation errors to also be reported, whereas it should be `continue_on_error: false` in Auto CI to prevent wasting Auto CI resources.

The **carve-outs** are:

1. `env` variables.
2. `continue_on_error`.

We enforce this invariant through `citool`, so it only affects job definitions that are handled by `citool`. Notably, this is not sufficient *alone* to address the PR-CI-only Spellcheck issue (#144183).

To carry out this enforcement, we modify `citool` to auto-register PR jobs as Auto jobs with `continue_on_error` overridden to `false` **unless** there's an overriding Auto job for the PR job of the same name that only differs by the permitted **carve-outs**.
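The auto-registration rule described above can be sketched in isolation. This is a minimal model under stated assumptions, not `citool`'s actual code: the `Job` struct is a simplified stand-in with only four fields, the function takes the two job lists directly instead of a `JobDatabase`, and the `job` helper is hypothetical and exists only to keep the example short.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a CI job definition (assumption: the real struct has more fields).
#[derive(Clone, Debug, PartialEq)]
pub struct Job {
    pub name: String,
    pub os: String,
    pub env: BTreeMap<String, String>,
    pub continue_on_error: Option<bool>,
}

/// Hypothetical helper that builds a job with an empty `env` on `ubuntu`.
pub fn job(name: &str, continue_on_error: Option<bool>) -> Job {
    Job {
        name: name.to_string(),
        os: "ubuntu".to_string(),
        env: BTreeMap::new(),
        continue_on_error,
    }
}

/// Copies every PR job that has no same-named Auto job into `auto_jobs`,
/// forcing `continue_on_error` to `Some(false)` on the copy.
pub fn register_pr_jobs_as_auto_jobs(pr_jobs: &[Job], auto_jobs: &mut Vec<Job>) {
    for pr_job in pr_jobs {
        // An explicit Auto job of the same name overrides the PR job;
        // checking that override is a separate validation step.
        if auto_jobs.iter().any(|auto_job| auto_job.name == pr_job.name) {
            continue;
        }
        auto_jobs.push(Job { continue_on_error: Some(false), ..pr_job.clone() });
    }
}
```

In this model, a PR-only `tidy` job with `continue_on_error: true` ends up in the Auto list with `continue_on_error: false`, while a PR job that already has an Auto counterpart is left alone.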
### Addressing the Spellcheck PR-only CI issue

Note that Spellcheck currently does not go through `citool` or `bootstrap`, and is its own GitHub Actions workflow.

To actually address the PR-CI-only Spellcheck issue (#144183), and carry out the subset-modulo-carve-outs enforcement universally, this PR additionally **removes the current Spellcheck implementation** (a separate GitHub Actions workflow). A separate workflow is incompatible with Homu unless we do some hacks in the main CI workflow.

This effectively partially reverts #134006 (the separate workflow part, not the tidy extra checks component), but is without prejudice to relanding the `typos`-based spellcheck in another implementation that goes through the usual bootstrap CI workflow so that it does work with Homu. The `typos`-based spellcheck seems to have a good false-positive rate.

Closes #144183.

---

r? infra-ci
2 parents ace6330 + 523594d commit efd420c

5 files changed

+352
-34
lines changed


.github/workflows/spellcheck.yml

Lines changed: 0 additions & 23 deletions
This file was deleted.

src/ci/citool/src/jobs.rs

Lines changed: 112 additions & 4 deletions
```diff
@@ -1,9 +1,9 @@
 #[cfg(test)]
 mod tests;

-use std::collections::BTreeMap;
+use std::collections::{BTreeMap, HashSet};

-use anyhow::Context as _;
+use anyhow::{Context as _, anyhow};
 use serde_yaml::Value;

 use crate::GitHubContext;
@@ -85,6 +85,10 @@ impl JobDatabase {
             .cloned()
             .collect()
     }
+
+    fn find_auto_job_by_name(&self, job_name: &str) -> Option<&Job> {
+        self.auto_jobs.iter().find(|job| job.name == job_name)
+    }
 }

 pub fn load_job_db(db: &str) -> anyhow::Result<JobDatabase> {
```
```diff
@@ -97,14 +101,118 @@ pub fn load_job_db(db: &str) -> anyhow::Result<JobDatabase> {
         db.apply_merge().context("failed to apply merge keys")
     };

-    // Apply merge twice to handle nested merges
+    // Apply merge twice to handle nested merges up to depth 2.
     apply_merge(&mut db)?;
     apply_merge(&mut db)?;

-    let db: JobDatabase = serde_yaml::from_value(db).context("failed to parse job database")?;
+    let mut db: JobDatabase = serde_yaml::from_value(db).context("failed to parse job database")?;
+
+    register_pr_jobs_as_auto_jobs(&mut db)?;
+
+    validate_job_database(&db)?;
+
     Ok(db)
 }

+/// Maintain invariant that PR CI jobs must be a subset of Auto CI jobs modulo carve-outs.
+///
+/// When PR jobs are auto-registered as Auto jobs, they will have `continue_on_error` overridden to
+/// be `false` to avoid wasting Auto CI resources.
+///
+/// When a job is already both a PR job and a auto job, we will post-validate their "equivalence
+/// modulo certain carve-outs" in [`validate_job_database`].
+///
+/// This invariant is important to make sure that it's not easily possible (without modifying
+/// `citool`) to have PRs with red PR-only CI jobs merged into `master`, causing all subsequent PR
+/// CI runs to be red until the cause is fixed.
+fn register_pr_jobs_as_auto_jobs(db: &mut JobDatabase) -> anyhow::Result<()> {
+    for pr_job in &db.pr_jobs {
+        // It's acceptable to "override" a PR job in Auto job, for instance, `x86_64-gnu-tools` will
+        // receive an additional `DEPLOY_TOOLSTATES_JSON: toolstates-linux.json` env when under Auto
+        // environment versus PR environment.
+        if db.find_auto_job_by_name(&pr_job.name).is_some() {
+            continue;
+        }
+
+        let auto_registered_job = Job { continue_on_error: Some(false), ..pr_job.clone() };
+        db.auto_jobs.push(auto_registered_job);
+    }
+
+    Ok(())
+}
+
+fn validate_job_database(db: &JobDatabase) -> anyhow::Result<()> {
+    fn ensure_no_duplicate_job_names(section: &str, jobs: &Vec<Job>) -> anyhow::Result<()> {
+        let mut job_names = HashSet::new();
+        for job in jobs {
+            let job_name = job.name.as_str();
+            if !job_names.insert(job_name) {
+                return Err(anyhow::anyhow!(
+                    "duplicate job name `{job_name}` in section `{section}`"
+                ));
+            }
+        }
+        Ok(())
+    }
+
+    ensure_no_duplicate_job_names("pr", &db.pr_jobs)?;
+    ensure_no_duplicate_job_names("auto", &db.auto_jobs)?;
+    ensure_no_duplicate_job_names("try", &db.try_jobs)?;
+    ensure_no_duplicate_job_names("optional", &db.optional_jobs)?;
+
+    fn equivalent_modulo_carve_out(pr_job: &Job, auto_job: &Job) -> anyhow::Result<()> {
+        let Job {
+            name,
+            os,
+            only_on_channel,
+            free_disk,
+            doc_url,
+            codebuild,
+
+            // Carve-out configs allowed to be different.
+            env: _,
+            continue_on_error: _,
+        } = pr_job;
+
+        if *name == auto_job.name
+            && *os == auto_job.os
+            && *only_on_channel == auto_job.only_on_channel
+            && *free_disk == auto_job.free_disk
+            && *doc_url == auto_job.doc_url
+            && *codebuild == auto_job.codebuild
+        {
+            Ok(())
+        } else {
+            Err(anyhow!(
+                "PR job `{}` differs from corresponding Auto job `{}` in configuration other than `continue_on_error` and `env`",
+                pr_job.name,
+                auto_job.name
+            ))
+        }
+    }
+
+    for pr_job in &db.pr_jobs {
+        // At this point, any PR job must also be an Auto job, auto-registered or overridden.
+        let auto_job = db
+            .find_auto_job_by_name(&pr_job.name)
+            .expect("PR job must either be auto-registered as Auto job or overridden");
+
+        equivalent_modulo_carve_out(pr_job, auto_job)?;
+    }
+
+    // Auto CI jobs must all "fail-fast" to avoid wasting Auto CI resources. For instance, `tidy`.
+    for auto_job in &db.auto_jobs {
+        if auto_job.continue_on_error == Some(true) {
+            return Err(anyhow!(
+                "Auto job `{}` cannot have `continue_on_error: true`",
+                auto_job.name
+            ));
+        }
+    }
+
+    Ok(())
+}
+
 /// Representation of a job outputted to a GitHub Actions workflow.
 #[derive(serde::Serialize, Debug)]
 struct GithubActionsJob {
```
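The carve-out comparison in `jobs.rs` destructures `Job` without a `..` rest pattern. One plausible benefit, which the commit itself does not state, is that adding a field to the struct then breaks compilation until someone decides whether the new field is compared or carved out. The sketch below shows the pattern on a reduced, hypothetical `Job` with four fields and a plain `String` error in place of `anyhow`.

```rust
/// Reduced stand-in for a CI job (assumption: not the real `citool` struct).
#[derive(Debug)]
pub struct Job {
    pub name: String,
    pub os: String,
    pub env: Vec<(String, String)>,
    pub continue_on_error: Option<bool>,
}

/// Returns `Ok(())` when the two jobs agree on every field except the carve-outs.
pub fn equivalent_modulo_carve_out(pr_job: &Job, auto_job: &Job) -> Result<(), String> {
    // Every field is listed explicitly. Because there is no `..`, a newly added
    // field makes this pattern a compile error until it is handled here.
    let Job {
        name,
        os,
        // Carve-outs: allowed to differ between the PR and Auto definitions.
        env: _,
        continue_on_error: _,
    } = pr_job;

    if *name == auto_job.name && *os == auto_job.os {
        Ok(())
    } else {
        Err(format!(
            "PR job `{}` differs from Auto job `{}` outside the carve-outs",
            pr_job.name, auto_job.name
        ))
    }
}
```

With this shape, two `tidy` jobs that differ only in `env` or `continue_on_error` compare as equivalent, while a differing `os` produces an error.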

src/ci/citool/src/jobs/tests.rs

Lines changed: 220 additions & 0 deletions
```diff
@@ -1,3 +1,4 @@
+use std::collections::BTreeMap;
 use std::path::Path;

 use super::Job;
@@ -146,3 +147,222 @@ fn validate_jobs() {
         panic!("Job validation failed:\n{error_messages}");
     }
 }
+
+#[test]
+fn pr_job_implies_auto_job() {
+    let db = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+- name: pr-ci-a
+  os: ubuntu
+  env: {}
+try:
+auto:
+optional:
+"#,
+    )
+    .unwrap();
+
+    assert_eq!(db.auto_jobs.iter().map(|j| j.name.as_str()).collect::<Vec<_>>(), vec!["pr-ci-a"])
+}
+
+#[test]
+fn implied_auto_job_keeps_env_and_fails_fast() {
+    let db = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+- name: tidy
+  env:
+    DEPLOY_TOOLSTATES_JSON: toolstates-linux.json
+  continue_on_error: true
+  os: ubuntu
+try:
+auto:
+optional:
+"#,
+    )
+    .unwrap();
+
+    assert_eq!(db.auto_jobs.iter().map(|j| j.name.as_str()).collect::<Vec<_>>(), vec!["tidy"]);
+    assert_eq!(db.auto_jobs[0].continue_on_error, Some(false));
+    assert_eq!(
+        db.auto_jobs[0].env,
+        BTreeMap::from([(
+            "DEPLOY_TOOLSTATES_JSON".to_string(),
+            serde_yaml::Value::String("toolstates-linux.json".to_string())
+        )])
+    );
+}
+
+#[test]
+#[should_panic = "duplicate"]
+fn duplicate_job_name() {
+    let _ = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+
+
+pr:
+- name: pr-ci-a
+  os: ubuntu
+  env: {}
+- name: pr-ci-a
+  os: ubuntu
+  env: {}
+try:
+auto:
+optional:
+"#,
+    )
+    .unwrap();
+}
+
+#[test]
+fn auto_job_can_override_pr_job_spec() {
+    let db = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+- name: tidy
+  os: ubuntu
+  env: {}
+try:
+auto:
+- name: tidy
+  env:
+    DEPLOY_TOOLSTATES_JSON: toolstates-linux.json
+  continue_on_error: false
+  os: ubuntu
+optional:
+"#,
+    )
+    .unwrap();
+
+    assert_eq!(db.auto_jobs.iter().map(|j| j.name.as_str()).collect::<Vec<_>>(), vec!["tidy"]);
+    assert_eq!(db.auto_jobs[0].continue_on_error, Some(false));
+    assert_eq!(
+        db.auto_jobs[0].env,
+        BTreeMap::from([(
+            "DEPLOY_TOOLSTATES_JSON".to_string(),
+            serde_yaml::Value::String("toolstates-linux.json".to_string())
+        )])
+    );
+}
+
+#[test]
+fn compatible_divergence_pr_auto_job() {
+    let db = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+- name: tidy
+  continue_on_error: true
+  env:
+    ENV_ALLOWED_TO_DIFFER: "hello world"
+  os: ubuntu
+try:
+auto:
+- name: tidy
+  continue_on_error: false
+  env:
+    ENV_ALLOWED_TO_DIFFER: "goodbye world"
+  os: ubuntu
+optional:
+"#,
+    )
+    .unwrap();
+
+    // `continue_on_error` and `env` are carve-outs *allowed* to diverge between PR and Auto job of
+    // the same name. Should load successfully.
+
+    assert_eq!(db.auto_jobs.iter().map(|j| j.name.as_str()).collect::<Vec<_>>(), vec!["tidy"]);
+    assert_eq!(db.auto_jobs[0].continue_on_error, Some(false));
+    assert_eq!(
+        db.auto_jobs[0].env,
+        BTreeMap::from([(
+            "ENV_ALLOWED_TO_DIFFER".to_string(),
+            serde_yaml::Value::String("goodbye world".to_string())
+        )])
+    );
+}
+
+#[test]
+#[should_panic = "differs"]
+fn incompatible_divergence_pr_auto_job() {
+    // `os` is not one of the carve-out options allowed to diverge. This should fail.
+    let _ = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+- name: tidy
+  continue_on_error: true
+  env:
+    ENV_ALLOWED_TO_DIFFER: "hello world"
+  os: ubuntu
+try:
+auto:
+- name: tidy
+  continue_on_error: false
+  env:
+    ENV_ALLOWED_TO_DIFFER: "goodbye world"
+  os: windows
+optional:
+"#,
+    )
+    .unwrap();
+}
+
+#[test]
+#[should_panic = "cannot have `continue_on_error: true`"]
+fn auto_job_continue_on_error() {
+    // Auto CI jobs must fail-fast.
+    let _ = load_job_db(
+        r#"
+envs:
+  pr:
+  try:
+  auto:
+  optional:
+
+pr:
+try:
+auto:
+- name: tidy
+  continue_on_error: true
+  os: windows
+  env: {}
+optional:
+"#,
+    )
+    .unwrap();
+}
```
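The failure-path tests in this commit call `.unwrap()` under `#[should_panic = "..."]`. An alternative style, shown here only as a sketch and not as what the commit does, is to inspect the returned error directly with `unwrap_err`, which avoids relying on panic message formatting. The function below is a simplified version of the duplicate-name check that takes a slice of names and returns a `String` error instead of `anyhow::Error`; both of those choices are assumptions made to keep the example free of external crates.

```rust
use std::collections::HashSet;

/// Simplified duplicate-name check over plain job names (assumption: the
/// version in the commit iterates over `Job` values and returns `anyhow::Result`).
pub fn ensure_no_duplicate_job_names(section: &str, names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in names {
        // `HashSet::insert` returns `false` if the value was already present.
        if !seen.insert(*name) {
            return Err(format!("duplicate job name `{name}` in section `{section}`"));
        }
    }
    Ok(())
}

#[test]
fn duplicate_job_name_is_reported() {
    // Assert on the error value instead of on a panic message.
    let err = ensure_no_duplicate_job_names("pr", &["pr-ci-a", "pr-ci-a"]).unwrap_err();
    assert!(err.contains("duplicate job name `pr-ci-a`"));
}
```

Either style checks the same behavior; the `should_panic` form is shorter, while the `unwrap_err` form allows exact assertions on the message.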
