Are there equivalents of `__launch_bounds__` or `.maxntid` in CuTe DSL?
For a kernel compiled with CuTe DSL, even if I use `cuda.arch.warpgroup_reg_alloc`/`dealloc`, the register count I get from `CUfunction_attribute.CU_FUNC_ATTRIBUTE_NUM_REGS` is still not fixed as expected. So I guess the compiler inside CuTe DSL doesn't pass a thread-count hint (the maximum number of threads per block) that it could use to derive the register count.