[QST][CuTeDSL] How to set __launch_bounds__ / .maxntid in CuTe DSL? #2698

@monellz

Is there an equivalent of __launch_bounds__ or .maxntid in CuTe DSL?

For a kernel compiled with CuTe DSL, even if I use cuda.arch.warpgroup_reg_alloc/dealloc, the register count reported by CUfunction_attribute.CU_FUNC_ATTRIBUTE_NUM_REGS is still not fixed at the value I expect. So I guess the internal compiler in CuTe DSL doesn't pass a thread-count hint that would help it derive the per-thread register count.
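To illustrate why the thread-count hint matters, here is a rough sketch of the per-thread register budget a compiler could assume once it knows the maximum block size. The limits below (64K registers per block, 255 per thread, rounding to multiples of 8) are assumptions typical of recent NVIDIA GPUs, not values taken from CuTe DSL, and `reg_budget_per_thread` is a hypothetical helper written only for this example.

```python
# Assumed hardware limits; check the actual values for your device.
REGS_PER_BLOCK = 64 * 1024   # 32-bit registers available to one thread block
MAX_REGS_PER_THREAD = 255    # per-thread cap
ALLOC_GRANULARITY = 8        # assumed rounding unit for per-thread allocation


def reg_budget_per_thread(max_threads_per_block: int) -> int:
    """Hypothetical helper: upper bound on registers per thread
    implied by a maximum block size (what .maxntid would convey)."""
    if max_threads_per_block <= 0:
        raise ValueError("max_threads_per_block must be positive")
    raw = REGS_PER_BLOCK // max_threads_per_block
    # Round down to the assumed allocation granularity.
    rounded = (raw // ALLOC_GRANULARITY) * ALLOC_GRANULARITY
    return min(MAX_REGS_PER_THREAD, rounded)


if __name__ == "__main__":
    for n in (128, 256, 384, 512, 1024):
        print(f"max threads {n:4d} -> at most {reg_budget_per_thread(n)} regs/thread")
```

Without such a bound the compiler has to pick a register count that works for any block size, which may be why the reported value differs from what I expect.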
