Long term improvements of OpenMP use in the context of SNMG ANN #1559

@viclafargue

Description

In the context of SNMG ANN, OpenMP lets operations on different GPUs run in parallel. However, nested parallelism is not enabled by default, which prevents the underlying index construction from using multiple threads within each GPU's thread.

#1526 aimed to fix this issue by:

  • Enabling nested parallelism with omp_set_nested(1)
  • Limiting outer loop to num_ranks threads (one per GPU)
  • Inside each rank thread, allocating threads_per_rank for internal parallelism
  • Restoring original thread count after parallel region
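
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from #1526: `build_on_rank` and `build_all_ranks` are hypothetical names, the per-rank work is a placeholder sum rather than real index construction, and `threads_per_rank` is assumed to be the available thread count divided by the number of ranks. The sketch uses `omp_set_max_active_levels(2)` instead of `omp_set_nested(1)`, since the latter is deprecated as of OpenMP 5.0, and it uses `num_threads` clauses so that only the active-levels setting needs to be saved and restored.

```cpp
#include <algorithm>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for per-GPU index construction: an inner parallel
// loop that uses `threads_per_rank` threads when nesting is active.
long build_on_rank(int rank, int n, int threads_per_rank) {
  long sum = 0;
#pragma omp parallel for num_threads(threads_per_rank) reduction(+ : sum)
  for (int i = 0; i < n; ++i) { sum += i + rank; }
  return sum;
}

// Outer loop: one thread per rank (GPU); each rank thread then opens its
// own inner parallel region inside build_on_rank.
std::vector<long> build_all_ranks(int num_ranks, int n) {
  std::vector<long> results(num_ranks, 0);
#ifdef _OPENMP
  const int prev_levels = omp_get_max_active_levels();
  // Assumption: split the available threads evenly across ranks.
  const int threads_per_rank = std::max(1, omp_get_max_threads() / num_ranks);
  omp_set_max_active_levels(2);  // allow two active nested levels
#else
  const int threads_per_rank = 1;  // serial fallback when built without OpenMP
#endif

#pragma omp parallel for num_threads(num_ranks)
  for (int rank = 0; rank < num_ranks; ++rank) {
    results[rank] = build_on_rank(rank, n, threads_per_rank);
  }

#ifdef _OPENMP
  omp_set_max_active_levels(prev_levels);  // restore the caller's setting
#endif
  return results;
}
```

Calling `build_all_ranks(num_ranks, n)` returns one result per rank. The result does not depend on how many threads the runtime actually grants, so the same code also works when nesting is unavailable.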

However, more elegant solutions might be considered in the future, such as:

  • Using the concept of "teams" available in modern versions of OpenMP
  • Using environment variables to configure the nested loops (for example, a comma-separated OMP_NUM_THREADS value that gives a different number of threads for each level of the nested hierarchy)
  • Other solutions
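
As a rough illustration of the environment-variable option, the sketch below opens two nested parallel regions without any `num_threads` clause and reports the team size observed at each level, so that the sizes are left entirely to the runtime configuration. The function name `nested_team_sizes` is hypothetical, and the sketch assumes an OpenMP runtime that accepts a comma-separated `OMP_NUM_THREADS` list.

```cpp
#include <utility>
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns {outer team size, inner team size} as observed at run time.
// No num_threads clauses are used, so both sizes come from the runtime
// settings (for example OMP_NUM_THREADS and OMP_MAX_ACTIVE_LEVELS).
std::pair<int, int> nested_team_sizes() {
  int outer = 1;
  int inner = 1;
#ifdef _OPENMP
#pragma omp parallel
  {
    // Only outer thread 0 records the outer team size.
    if (omp_get_thread_num() == 0) { outer = omp_get_num_threads(); }

#pragma omp parallel
    {
      // Only thread 0 of the inner team spawned by outer thread 0 writes.
      if (omp_get_ancestor_thread_num(1) == 0 && omp_get_thread_num() == 0) {
        inner = omp_get_num_threads();
      }
    }
  }
#endif
  return {outer, inner};
}
```

Running a program that prints this result with, for instance, `OMP_MAX_ACTIVE_LEVELS=2 OMP_NUM_THREADS=2,4` set in the environment would be expected to request 2 threads for the outer level and 4 for each inner team, although the runtime may grant fewer.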