[Feature]: Implement a Batch Attention Kernel

### Suggestion Description

flashinfer-ai/flashinfer provides an implementation of a batched attention kernel: https://github.com/flashinfer-ai/flashinfer/blob/main/csrc/batch_attention.cu

For background, see https://github.com/flashinfer-ai/flashinfer/pull/1137. It is worth exploring whether such a feature is viable on ROCm and whether it would benefit end users.
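If this gets explored, a slow reference implementation could serve as a correctness baseline for any ROCm kernel. Below is a minimal sketch, not taken from FlashInfer: it assumes a single attention head, a ragged batch where request `i` owns query rows `qo_indptr[i]:qo_indptr[i+1]` and key/value rows `kv_indptr[i]:kv_indptr[i+1]`, and causal masking with queries aligned to the end of each request's KV range (so every request needs at least as many KV rows as query rows). The function name and the `qo_indptr`/`kv_indptr` parameter names are hypothetical and chosen only for illustration.

```python
import math


def reference_batch_attention(q, k, v, qo_indptr, kv_indptr, causal=True):
    """Single-head attention over a ragged batch (hypothetical reference).

    q: list of query vectors for all requests, concatenated.
    k, v: lists of key / value vectors for all requests, concatenated.
    qo_indptr, kv_indptr: row offsets delimiting each request's slice.
    Returns one output vector per query row.
    """
    head_dim = len(q[0])
    scale = 1.0 / math.sqrt(head_dim)
    out = []
    for i in range(len(qo_indptr) - 1):
        qs, qe = qo_indptr[i], qo_indptr[i + 1]
        ks, ke = kv_indptr[i], kv_indptr[i + 1]
        # Queries are aligned to the end of the KV range, so a decode step
        # (one query, many keys) sees every key of its request.
        offset = (ke - ks) - (qe - qs)
        for j in range(qs, qe):
            visible = (j - qs + offset + 1) if causal else (ke - ks)
            scores = [
                scale * sum(a * b for a, b in zip(q[j], k[ks + t]))
                for t in range(visible)
            ]
            # Numerically stable softmax over the visible keys.
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]
            total = sum(weights)
            out.append([
                sum(weights[t] * v[ks + t][d] for t in range(visible)) / total
                for d in range(head_dim)
            ])
    return out


# One decode request (1 query, 5 keys) and one prefill request (3 queries,
# 3 keys) processed in the same call.
q = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
k = [[0.1 * t, -0.05 * t] for t in range(8)]
v = [[float(t), float(-t)] for t in range(8)]
out = reference_batch_attention(q, k, v, qo_indptr=[0, 1, 4], kv_indptr=[0, 5, 8])
```

Mixing a decode-shaped and a prefill-shaped request in one call is deliberate here: it is the kind of heterogeneous batch a batched kernel would need to handle, so a port could be checked against this output on the same inputs.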

### Operating System

_No response_

### GPU

_No response_

### ROCm Component

_No response_

[Feature]: Implement a Batch Attention Kernel #101

Suggestion Description

Operating System

GPU

ROCm Component

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Feature]: Implement a Batch Attention Kernel #101

Description

Suggestion Description

Operating System

GPU

ROCm Component

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions