Fix: print subbyte<T> compilation error #2783

chrisHuxi · 2025-11-19T15:48:17Z

Bug Fix

Which component has the problem?

CuTe C++

Describe the bug

Printing an element of a tensor of the subbyte type cutlass::float_e2m1_t created by make_fragment_like fails to compile.

Steps/Code to reproduce bug

With repo at commit: a2439551c765c5393aebe557ee75d3a0412d2211

#include <cuda.h>
#include <cute/tensor.hpp>
#include <cutlass/numeric_types.h>  // cutlass::float_e2m1_t

using namespace cute;

__global__ void print_nvfp4_kernel(const __half *Aptr) {

  auto A_tensor = make_tensor(make_gmem_ptr((__half *)Aptr), make_shape(Int<16>{}, Int<16>{}), make_stride(16, Int<1>{}));

  Tensor glm_A_tensor_fp4 = make_tensor(make_gmem_ptr((cutlass::float_e2m1_t *)Aptr), make_shape(Int<16>{}, Int<16>{}), make_stride(16, Int<1>{}));

  Tensor A_tensor_fp4 = make_fragment_like<cutlass::float_e2m1_t>(A_tensor);
  
  if (cute::thread0()) {
    print("nvfp4: \n");
    print(glm_A_tensor_fp4(0)); print("\n"); // pass, result correct
    print(A_tensor_fp4(0)); print("\n"); // compilation error
  }
}

Outputs

Compilation error:

error: more than one instance of overloaded function "cuda_kernel::print" matches the argument list:
function template "void cute::print(const cute::subbyte_reference<T> &)" (declared at line 370 of ../third_party/cutlass/include/cute/container/array_subbyte.hpp)
function template "void cute::print(cute::subbyte_reference<T>)" (declared at line 198 of ../third_party/cutlass/include/cute/container/array_subbyte.hpp)
argument types are: (cute::subbyte_reference<cutlass::float_e2m1_t>)
print(A_tensor_fp4(0)); print("\n");

Expected behavior

Compilation succeeds and the printed result is correct.

Environment details

Compiler: g++ (Debian 12.2.0-14+deb12u1) 12.2.0
CUDA: Cuda compilation tools, release 12.8, V12.8.93
Build: cuda_12.8.r12.8/compiler.35583870_0

Additional context

Overload resolution cannot rank a pass-by-value parameter against a pass-by-const-reference parameter of the same type, so a call that matches both overloads is ambiguous. Changing the by-value overload to accept only rvalues (&&) removes the ambiguity: an lvalue argument selects the const& overload, and an rvalue argument selects the && overload.
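
The overload-resolution behavior described above can be reproduced outside CuTe. The following is a minimal host-only sketch, not the actual patch: ProxyRef, which, by_value, make_ref, and demo are hypothetical names standing in for cute::subbyte_reference<T> and cute::print, and the ambiguous pair is left commented out so that the file compiles.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical stand-in for a proxy reference type such as
// cute::subbyte_reference<T>; it is cheap to copy.
struct ProxyRef {
  int value;
};

// Ambiguous pair (commented out): for a ProxyRef argument neither overload
// is a better match than the other, so any call show(ref) fails to compile.
// void show(ProxyRef);
// void show(ProxyRef const&);

// Disambiguated pair: lvalues can only bind to const&, rvalues prefer &&.
inline std::string which(ProxyRef const&) { return "const&"; }
inline std::string which(ProxyRef&&)      { return "&&"; }

// Alternative: a single by-value overload accepts lvalues and rvalues alike.
inline std::string by_value(ProxyRef r) {
  return "value:" + std::to_string(r.value);
}

// Returns a temporary (prvalue), as indexing a proxy-based container would.
inline ProxyRef make_ref(int v) { return ProxyRef{v}; }

// Prints which overload each kind of argument selects.
inline void demo() {
  ProxyRef x{7};
  std::printf("lvalue    -> %s\n", which(x).c_str());
  std::printf("temporary -> %s\n", which(make_ref(3)).c_str());
  std::printf("by value  -> %s\n", by_value(make_ref(3)).c_str());
}
```

Calling demo() from main shows the lvalue going to the const& overload and the temporary going to the && overload. Keeping a single by-value overload is another way to avoid the ambiguity, which is reasonable when the proxy type is as small as the one assumed here.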

With the rvalue overload added, the code now compiles successfully and produces the expected results.
Could you please take a look and let me know your thoughts?

Thanks!

@ccecka @thakkarV

include/cute/container/array_subbyte.hpp

chrisHuxi · 2025-11-28T03:23:58Z

Hi @hwu36, could you help merge this? Or does it need further changes?

chrisHuxi · 2025-12-02T03:19:28Z

@fengxie

Hi, could you please help review and merge this PR?

Thanks

fengxie · 2025-12-02T10:17:32Z

@Junkai-Wu @hwu36, could you check whether this can be merged?

Fix print subbyte<T> compilation error

5f248f6

ccecka reviewed Nov 19, 2025

include/cute/container/array_subbyte.hpp (outdated, resolved)

ccecka approved these changes Nov 19, 2025

delete overload for const& input

07d0803

ccecka approved these changes Nov 23, 2025
