Actions: NVIDIA/Fuser

Lint

13,533 workflow runs

Fix stmatrix scheduling for persistent GEMM (#3791)
Lint #20223: Commit a3f6ba9 pushed by rdspring1
January 30, 2025 02:09 4m 15s main
Add a few benchmarks for concatenation.
Lint #20222: Pull request #3751 synchronize by tfogal
January 30, 2025 01:48 4m 5s tfogal/benchmark-cat
Add a few benchmarks for concatenation.
Lint #20221: Pull request #3751 synchronize by tfogal
January 30, 2025 01:14 5m 28s tfogal/benchmark-cat
Verify the same runtime is used for different sequence lengths
Lint #20220: Pull request #3798 opened by wujingyue
January 30, 2025 00:47 7m 38s wjy/reuse
Tensor memory support 1
Lint #20219: Pull request #3755 synchronize by zasdfgbnm
January 30, 2025 00:46 12m 23s tmem-copy-kernel
DID loop split for linear => allreduce
Lint #20218: Pull request #3797 opened by wujingyue
January 30, 2025 00:33 8m 53s wjy/linear
tcgen05.alloc TMem usage
Lint #20217: Pull request #3795 synchronize by zasdfgbnm
January 29, 2025 23:27 12m 25s alloc-tmem
tcgen05.alloc TMem usage
Lint #20216: Pull request #3795 synchronize by zasdfgbnm
January 29, 2025 23:18 9m 2s alloc-tmem
tcgen05.alloc TMem usage
Lint #20215: Pull request #3795 synchronize by zasdfgbnm
January 29, 2025 23:00 12m 30s alloc-tmem
fix error when calculating smem overhead
Lint #20214: Pull request #3790 synchronize by liqiangxl
January 29, 2025 22:59 6m 49s llu/smem_overhead
Naive register <-> tmem load/store support
Lint #20213: Pull request #3786 synchronize by zasdfgbnm
January 29, 2025 22:59 8m 49s tmem-no-alloc
tcgen05.alloc TMem usage
Lint #20212: Pull request #3795 synchronize by zasdfgbnm
January 29, 2025 22:44 16m 32s alloc-tmem
[WIP] ExprEvalExecutor Speedup
Lint #20211: Pull request #3796 synchronize by csarofeen
January 29, 2025 21:49 9m 46s expr_eval_exec_devel
tcgen05.alloc TMem usage
Lint #20210: Pull request #3795 synchronize by zasdfgbnm
January 29, 2025 21:46 17m 22s alloc-tmem
Split Hopper MMA by warp-tile before instruction tile
Lint #20209: Pull request #3642 synchronize by jacobhinkle
January 29, 2025 21:46 7m 33s hopper_warptile_split
[WIP] ExprEvalExecutor Speedup
Lint #20208: Pull request #3796 synchronize by csarofeen
January 29, 2025 21:37 8m 10s expr_eval_exec_devel
[WIP] ExprEvalExecutor Speedup
Lint #20207: Pull request #3796 opened by csarofeen
January 29, 2025 21:36 1m 0s expr_eval_exec_devel
tcgen05.alloc TMem usage
Lint #20206: Pull request #3795 opened by zasdfgbnm
January 29, 2025 21:35 11m 33s alloc-tmem
Expose MatmulParams::cluster_dims in python frontend (#3789)
Lint #20204: Commit bface75 pushed by jacobhinkle
January 29, 2025 20:36 4m 40s main
Tensor memory support 1
Lint #20203: Pull request #3755 synchronize by zasdfgbnm
January 29, 2025 20:10 13m 25s tmem-copy-kernel
Tensor memory support 1
Lint #20202: Pull request #3755 synchronize by zasdfgbnm
January 29, 2025 20:04 6m 28s tmem-copy-kernel
Update cuda runtime files for TMA Multicast and cluster (#3793)
Lint #20201: Commit 9eb6121 pushed by rdspring1
January 29, 2025 19:33 4m 49s main