Open
Description
OpenMP enables operations on different GPUs to run in parallel in the context of SNMG ANN. However, nested parallelism is not enabled by default, which prevents the underlying index construction from using per-GPU parallelism.
#1526 aimed at fixing this issue by:
- Enabling nested parallelism with `omp_set_nested(1)`
- Limiting the outer loop to `num_ranks` threads (one per GPU)
- Inside each rank thread, allocating `threads_per_rank` for internal parallelism
- Restoring the original thread count after the parallel region
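The steps listed for #1526 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `build_on_rank` and `build_all_ranks` are hypothetical names, the per-rank "index build" is replaced by a simple sum, and `threads_per_rank` is assumed to be the available thread count divided by `num_ranks`. The sketch calls `omp_set_max_active_levels(2)` in place of `omp_set_nested(1)`, since the latter is deprecated as of OpenMP 5.0.

```cpp
#include <algorithm>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds without -fopenmp.
inline int omp_get_max_threads() { return 1; }
inline int omp_get_max_active_levels() { return 1; }
inline void omp_set_max_active_levels(int) {}
inline void omp_set_num_threads(int) {}
#endif

// Hypothetical stand-in for a per-GPU index build: sums (i + rank) for
// i in [0, n) inside an inner parallel region.
long long build_on_rank(int rank, int n) {
  long long acc = 0;
  #pragma omp parallel for reduction(+ : acc)
  for (int i = 0; i < n; ++i) acc += i + rank;
  return acc;
}

// Hypothetical driver: one outer thread per rank, nested threads inside.
std::vector<long long> build_all_ranks(int num_ranks, int n) {
  // Save the settings that are modified below.
  const int saved_threads = omp_get_max_threads();
  const int saved_levels  = omp_get_max_active_levels();

  // Assumption: split the available threads evenly across ranks.
  const int threads_per_rank = std::max(1, saved_threads / num_ranks);

  omp_set_max_active_levels(2);    // allow one level of nesting
  omp_set_num_threads(num_ranks);  // outer loop: one thread per GPU

  std::vector<long long> results(num_ranks);
  #pragma omp parallel for schedule(static, 1)
  for (int rank = 0; rank < num_ranks; ++rank) {
    // Applies to parallel regions started from this rank thread.
    omp_set_num_threads(threads_per_rank);
    results[rank] = build_on_rank(rank, n);
  }

  // Restore the original settings after the parallel region.
  omp_set_num_threads(saved_threads);
  omp_set_max_active_levels(saved_levels);
  return results;
}
```

For example, `build_all_ranks(4, 1000)` would run four outer iterations, each of which is free to open its own inner team. The result does not depend on how many threads the runtime actually grants.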
However, other, more elegant solutions might be considered in the future, such as:
- Using the concept of "teams" available in modern versions of OpenMP
- Using environment variables to configure nested loops (such as a comma-separated `OMP_NUM_THREADS` environment variable with a different number of threads for each loop in the nested hierarchy)
- Other solutions
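As a rough illustration of the environment-variable option: the OpenMP specification lets `OMP_NUM_THREADS` hold a comma-separated list, one entry per nesting level, and `OMP_MAX_ACTIVE_LEVELS` control how many nested levels may be active. The sketch below sets no thread counts in code and only reports the team sizes the runtime chose. `observe_nested_team_sizes` is a hypothetical helper, and mapping the first list entry to `num_ranks` and the second to `threads_per_rank` is an assumption about how this could apply to SNMG.

```cpp
#include <utility>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds without -fopenmp.
inline int omp_get_thread_num() { return 0; }
inline int omp_get_num_threads() { return 1; }
#endif

// Returns {outer team size, inner team size} as observed at run time.
// Thread counts come entirely from the environment.
std::pair<int, int> observe_nested_team_sizes() {
  int outer = 1;
  int inner = 1;
  #pragma omp parallel
  {
    const int outer_tid = omp_get_thread_num();
    if (outer_tid == 0) outer = omp_get_num_threads();

    #pragma omp parallel
    {
      // Only one thread of one inner team records the size.
      if (outer_tid == 0 && omp_get_thread_num() == 0) {
        inner = omp_get_num_threads();
      }
    }
  }
  return {outer, inner};
}
```

Launching a program that calls this helper with, for instance, `OMP_MAX_ACTIVE_LEVELS=2 OMP_NUM_THREADS=4,8` requests four outer threads with eight inner threads each. The sizes actually granted depend on the OpenMP implementation and its thread limits.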