Merged
Commits
102 commits
e9fc454
ci: add ci_test GitHub workflows
zhangyue207 May 6, 2026
ce5dce2
fix: avoid cross-platform CUDA probing in tests
zhangyue207 May 6, 2026
751d5e7
ci: target master and setup Python in CI
zhangyue207 May 7, 2026
d1244a6
ci: use python module pip for CI dependency
zhangyue207 May 7, 2026
e247321
ci: update CI submodule for failure logs
zhangyue207 May 7, 2026
cb747dd
ci: update CI submodule for ci_ref scheduler
zhangyue207 May 7, 2026
3ae03f2
ci: update CI submodule for source-mounted scheduler
zhangyue207 May 7, 2026
fdf7b19
ci: update CI submodule for Unit generator fix
zhangyue207 May 7, 2026
30179b3
ci: update CI submodule for tag model configs
zhangyue207 May 7, 2026
a754b36
ci: update CI submodule for Unit failure logs
zhangyue207 May 7, 2026
c7c69fb
ci: pass explicit devices for XPU unit jobs
zhangyue207 May 7, 2026
6c52a33
ci: standardize CI config extension to yml
zhangyue207 May 7, 2026
18e803b
ci: update CI submodule for concise job names
zhangyue207 May 7, 2026
4864738
ci: update CI submodule for skipped job names
zhangyue207 May 7, 2026
af5c270
Remove obsolete CI and lint config files
zhangyue207 May 8, 2026
24d158a
ci: add manual platform dispatch
zhangyue207 May 12, 2026
500f30f
ci: remove smoke and performance pipeline jobs
zhangyue207 May 12, 2026
272c7c7
Update Moore CI deployment fixes
zhangyue207 May 12, 2026
0f61501
ci: rerun PR checks
zhangyue207 May 12, 2026
ac47c0d
ci: default PR tests to nvidia
zhangyue207 May 12, 2026
9890370
ci: rerun nvidia check
zhangyue207 May 12, 2026
25c71fc
ci: update nvidia unit workflow
zhangyue207 May 12, 2026
25773f0
ci: run PR checks on active platforms
zhangyue207 May 12, 2026
ad2a351
ci: register iluvatar platform
UsamaRana3444 May 13, 2026
a504c8e
ci: trigger checks on ci online branch
UsamaRana3444 May 13, 2026
8aa035b
ci: enable ascend online runner
May 13, 2026
cb46996
ci: rerun with metax scheduler fix
May 13, 2026
ffcef4d
ci: rerun metax after cancel
May 13, 2026
c363be2
ci: skip ascend image rebuild
May 13, 2026
ff47eec
ci: rerun ascend with encoded args
May 13, 2026
9dd279b
ci: rerun ascend after runner cleanup
May 13, 2026
a8ef489
ci: rerun iluvatar with timeout guard
UsamaRana3444 May 13, 2026
93f99d1
ci: cancel stale online runs
May 13, 2026
f7de477
ci: cap metax unit runtime
May 13, 2026
93f489d
ci: match ascend runner label
May 13, 2026
50d644a
ci: avoid queued platforms blocking ascend
May 13, 2026
f018422
ci: rerun ascend with runner proxy
May 13, 2026
06dd9ed
ci: rerun ascend after python compatibility fix
May 13, 2026
9f68159
ci: rerun ascend with scheduler image
May 13, 2026
8e6f6f7
ci: rerun ascend locally
May 13, 2026
dc1f5c4
ci: run metax quick operator subset
May 13, 2026
2e9ef03
ci: install ascend build dependencies
May 13, 2026
485df21
ci: rerun iluvatar after scheduler fix
UsamaRana3444 May 13, 2026
30bf022
ci: rerun metax quick subset
May 13, 2026
b323b7f
ci: rerun with safe matrix output
May 13, 2026
439dc9a
ci: rerun after matrix output fix
UsamaRana3444 May 13, 2026
d54adbd
ci: rerun after matrix output fix
May 13, 2026
7abd39e
ci: rerun iluvatar after report fix
UsamaRana3444 May 13, 2026
4b0ac87
ci: rerun ascend accepting docker 137
May 13, 2026
5f9caea
ci: limit metax online smoke cases
May 13, 2026
e576aca
ci: rerun metax after busy gpu filter
May 13, 2026
747c3e1
ci: rerun full ci online
May 13, 2026
1191a55
ci: address pr feedback
May 13, 2026
e468ed7
ci: use prebuilt ascend test image
May 13, 2026
884e8c5
test: generate fallback randint data on cpu
May 13, 2026
44d15af
test: format gemm skip reason as markdown
May 13, 2026
d9f84c2
ci: build ascend test image from dockerfile
May 13, 2026
85694ac
ci: update ci tooling submodule
May 13, 2026
a980fd3
ci: opt ascend into buildkit
May 13, 2026
8d79d1e
ci: keep default repo branch on master
May 13, 2026
aaea110
ci: run ascend tests on free npu
May 13, 2026
211c272
ci: let ascend pick an available npu
May 13, 2026
d98ee59
ci: update dynamic ascend allocation tooling
May 13, 2026
1ca8c7b
ci: update ascend npu allocation parser
May 13, 2026
271f8e9
ci: update ascend logical device mapping
May 13, 2026
2006f5c
ci: use nvidia base compatible with runner
May 13, 2026
0ed7e69
ci: align nvidia test command with master
May 13, 2026
772436f
ci: run nvidia tests on compatible base image
May 13, 2026
4e6473f
ci: address review comments
zhangyue207 May 14, 2026
3ebdae5
ci: update moore resource locking
zhangyue207 May 14, 2026
9515331
ci: update scheduler stale lock cleanup
zhangyue207 May 14, 2026
a435434
ci: update nvidia gpu allocation
zhangyue207 May 14, 2026
e8fc4bf
ci: add v2 shadow workflow
zhangyue207 May 14, 2026
bfa8c83
ci: handle unavailable v2 shadow agents
zhangyue207 May 14, 2026
974ba77
ci: add v2 agent installer
zhangyue207 May 14, 2026
7e08cff
ci: match v2 runner labels
zhangyue207 May 14, 2026
1bcd199
ci: enforce v2 shadow checks
zhangyue207 May 14, 2026
f6473cd
ci: update v2 runner user agent
zhangyue207 May 14, 2026
8da0668
ci: default v2 shadow to active platforms
zhangyue207 May 14, 2026
b17bffa
ci: limit v2 agent queue wait to ten minutes
zhangyue207 May 14, 2026
b5b60be
ci: use self-healing v2 agent workflow
zhangyue207 May 14, 2026
3e4c9bf
ci: use transient state dir fallback
zhangyue207 May 14, 2026
6fbcd86
ci: use platform lock probe workflow
zhangyue207 May 14, 2026
cf11639
ci: use checkout-free self-hosted workflow
zhangyue207 May 14, 2026
9d11aac
ci: use nested junit result detection
zhangyue207 May 14, 2026
4e9630e
ci: use per-job checked-out agent
zhangyue207 May 14, 2026
276cbd5
ci: use metax resource allocation fix
zhangyue207 May 14, 2026
1d0444b
test: keep tests aligned with master
zhangyue207 May 17, 2026
ccae706
ci: update iluvatar ci tooling
zhangyue207 May 18, 2026
dc50a03
ci: use early-exit v2 queue watchdog
zhangyue207 May 18, 2026
b7b32bf
ci: handle queued runners and update platform sets
zhangyue207 May 18, 2026
4fb2f9f
ci: add iluvatar runner filesystem repair script
zhangyue207 May 18, 2026
a28e98b
ci: enable iluvatar in legacy workflow
zhangyue207 May 18, 2026
4420783
ci: skip host gpu probing for iluvatar
zhangyue207 May 18, 2026
97b7864
ci: pin iluvatar local runner support
zhangyue207 May 18, 2026
6d0616b
ci: pin shadow workflow ci ref
zhangyue207 May 18, 2026
9fc073b
Fix Iluvatar CI container setup
zhangyue207 May 18, 2026
07496bb
Include Iluvatar CI build backend dependency
zhangyue207 May 18, 2026
a6f5bfb
ci: remove local iluvatar repair script
zhangyue207 May 18, 2026
5307f6b
ci: preflight runner availability before jobs
zhangyue207 May 18, 2026
286ab40
ci: pin runner preflight token fix
zhangyue207 May 18, 2026
2793d9a
ci: pin best-effort runner preflight
zhangyue207 May 18, 2026
1 change: 1 addition & 0 deletions .ci
Submodule .ci added at c6bf36
388 changes: 0 additions & 388 deletions .ci/README.md

This file was deleted.
