The two CTAs are indeed cooperating to produce the full MxN tile.

However, the PTX documentation does not indicate that the 2-SM MMA computation (tcgen05.mma.cta_group::2) shares matrices A and B between the CTAs. In other words, each CTA can use only the data present in its own shared memory during the MMA operation.

Where is the documentation incorrect? Your link does not point to a specific line.

Answer selected by XieXiating