
contrib/aws: skip trn1 tests for PRs without relevant code changes #12091

Open
a-szegel wants to merge 1 commit into ofiwg:main from a-szegel:only-run-trn-against-efa-related-changes

@a-szegel a-szegel commented Apr 2, 2026

trn1.32xlarge instances use a limited pool of lockable resources that causes builds to time out when too many PRs are queued. Gate the trn1 test stage on a new have_efa_provider_changes() helper that checks whether the PR modifies paths that could affect EFA provider behavior (prov/efa, prov/shm, prov/util, src, include, fabtests, contrib/aws, and build system files).

PRs that only touch documentation, other providers, or CI configs unrelated to AWS will skip the trn1 stage entirely, reducing lock contention.
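
The gating decision described above comes down to a prefix match between the PR's changed files and a list of watched paths. The following is a minimal Python sketch of that logic only, not the Groovy helper from this PR: the names `RELEVANT_PATHS` and `has_relevant_changes` are hypothetical, the list of changed files is assumed to be supplied by the caller (for example, from a `git diff --name-only` against the target branch), and plain prefix matching is an assumption about how the real helper compares paths.

```python
# Hypothetical illustration; the PR's actual helper lives in the Jenkins
# pipeline and is written in Groovy. The path list mirrors the one quoted
# in this PR's description.
RELEVANT_PATHS = [
    "prov/efa/", "prov/shm/", "prov/util/", "src/", "include/",
    "fabtests/", "contrib/aws/", "Makefile.am", "configure.ac",
    "autogen.sh", "config/",
]


def has_relevant_changes(changed_files, prefixes=RELEVANT_PATHS):
    """Return True if any changed file path starts with a watched prefix.

    Assumption: paths are repository-relative, as printed by
    `git diff --name-only`.
    """
    return any(
        path.startswith(prefix)
        for path in changed_files
        for prefix in prefixes
    )


if __name__ == "__main__":
    # A docs-only change alongside an unrelated provider: trn1 would be skipped.
    print(has_relevant_changes(["man/fi_efa.7.md", "prov/verbs/src/verbs_ep.c"]))
    # A change under prov/efa/: trn1 would run.
    print(has_relevant_changes(["prov/efa/src/efa.c"]))
```

Under this assumption, a top-level entry such as `Makefile.am` matches only the repository root file, not `Makefile.am` files in subdirectories that are outside the watched directories.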

@a-szegel a-szegel requested a review from a team April 2, 2026 14:09
@a-szegel a-szegel force-pushed the only-run-trn-against-efa-related-changes branch 3 times, most recently from ad29421 to c80ab84 Compare April 2, 2026 17:00

a-szegel commented Apr 2, 2026

Build 4 in AWS CI shows that this is working with this temporary testing commit applied (it removes contrib/aws/ from the list, and trn1 doesn't build):

 def have_efa_provider_changes() {
-    def paths = ["prov/efa/", "prov/shm/", "prov/util/", "src/", "include/", "fabtests/", "contrib/aws/", "Makefile.am", "configure.ac", "autogen.sh", "config/"]
+    def paths = ["prov/efa/", "prov/shm/", "prov/util/", "src/", "include/", "fabtests/", "Makefile.am", "configure.ac", "autogen.sh", "config/"]
     sh "git fetch origin ${env.CHANGE_TARGET}:refs/remotes/origin/${env.CHANGE_TARGET} --no-tags --no-recurse-submodules"

@a-szegel a-szegel force-pushed the only-run-trn-against-efa-related-changes branch from c80ab84 to 3d4b128 Compare April 3, 2026 02:07
When many PR builds are queued, limited lockable resources (especially
trn1.32xlarge) cause builds to time out. Add a "Check for relevant
changes" stage that skips all testing when the PR does not modify paths
that could affect provider behavior: prov/efa, prov/shm, prov/util,
prov/tcp, src, include, fabtests, contrib/aws,
and build system files.

PRs that only touch documentation, other providers, or unrelated configs
will complete immediately after checkout, freeing up CI resources.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
@a-szegel a-szegel force-pushed the only-run-trn-against-efa-related-changes branch from 3d4b128 to e9df49e Compare April 3, 2026 02:12