Thread group control through implementation of Call group system #72

@bmillsNV

Proposal: Traditionally, compute shader dispatch follows a two-tier 3D grid model (ignoring clusters for now): a 3D grid of thread groups, each of which is a 3D grid of threads. We’re all familiar with this structure, as illustrated in this diagram: https://learn.microsoft.com/en-us/windows/win32/api/d3d11/images/threadgroupids.png
I propose that, just as the traditional compute pipeline defines a thread-group, we introduce a corresponding concept: a call-group.
A call-group is an ND-shaped structure where all call-IDs within a call-group are guaranteed to execute within the same hardware thread-group.
This could be implemented quite simply by:

  • Setting the thread group size to [1,1,full-call-group-size]
  • Calculating the call group id from the dispatch group id
  • Calculating the call group thread id from the dispatch group thread id
  • Calculating the call id by combining the two

This approach integrates thread-groups pretty seamlessly with our new ND execution model. In traditional compute you can split your dispatch into 1D, 2D or 3D cells. With call groups, a 5D call can be split into 5D cells, and group-shared memory exists per 5D cell.
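
To make the index arithmetic in the steps above concrete, here is a minimal Python sketch that emulates it on the CPU. It is not the actual implementation: the function names (`unravel`, `linear_thread_group_size`, `call_id`), the row-major ordering (last dimension varies fastest), and the rounding-up of the group count for shapes that do not divide evenly are all assumptions made for illustration.

```python
from math import prod


def unravel(linear, shape):
    """Convert a linear index into ND coordinates (last dimension fastest).

    The row-major ordering is an assumption; the proposal does not fix one.
    """
    coords = []
    for extent in reversed(shape):
        coords.append(linear % extent)
        linear //= extent
    return tuple(reversed(coords))


def linear_thread_group_size(call_group_shape):
    """Thread group size [1, 1, full-call-group-size] from the first step."""
    return (1, 1, prod(call_group_shape))


def call_id(group_linear, thread_linear, call_shape, call_group_shape):
    """Combine a linear dispatch group id and a linear group thread id.

    Returns (call id, in_bounds). in_bounds is False for padding threads
    when call_shape is not a multiple of call_group_shape (hypothetical
    handling, not specified in the proposal).
    """
    # Number of call groups along each dimension, rounded up.
    grid_shape = tuple(-(-c // g) for c, g in zip(call_shape, call_group_shape))
    call_group_id = unravel(group_linear, grid_shape)
    call_group_thread_id = unravel(thread_linear, call_group_shape)
    cid = tuple(
        g * size + t
        for g, size, t in zip(call_group_id, call_group_shape, call_group_thread_id)
    )
    in_bounds = all(i < n for i, n in zip(cid, call_shape))
    return cid, in_bounds
```

For a 5D call of shape `(2, 2, 2, 2, 4)` with call groups of shape `(1, 1, 2, 2, 2)`, this sketch would dispatch 8 groups of 8 threads each, and every call id is produced by exactly one (group, thread) pair.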

We could also optimize where appropriate (or when hinted to by the user) and map call groups directly to thread groups where possible, which would simply reduce the need for arithmetic operations. For example, if I specified a 3D call group size of [3,4,5], we can map that directly to a thread group size of [3,4,5] without needing to explicitly convert the linear representation to a grid one.
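
The choice between the direct mapping and the linear fallback could look roughly like the sketch below. The function name `choose_thread_group_size` is hypothetical, and two details are assumptions not stated in the proposal: that "where possible" means a call group of rank 3 or lower, and that lower-rank shapes are padded with leading 1s. Real hardware limits on thread group dimensions are not checked here.

```python
from math import prod


def choose_thread_group_size(call_group_shape):
    """Return (thread_group_size, is_direct) for a call group shape.

    Assumption: a call group of rank <= 3 can be used as the thread group
    shape directly, so no linear-to-grid conversion is needed in the shader.
    """
    rank = len(call_group_shape)
    if 1 <= rank <= 3:
        # Pad with leading 1s so a 1D shape [N] becomes [1, 1, N]
        # (padding side is an assumption for illustration).
        padded = (1,) * (3 - rank) + tuple(call_group_shape)
        return padded, True
    # Fallback: flatten the whole call group into one linear dimension.
    return (1, 1, prod(call_group_shape)), False
```

With this sketch, a call group of `[3,4,5]` keeps its shape as the thread group size, while a 5D call group such as `[2,2,2,2,2]` falls back to `[1,1,32]`.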
