Open
Description
`MPIPatternP2P` has a bug in one of its `MPI_Isend` calls (dft-efe/src/utils/MPIPatternP2P.t.cpp, line 1131 in c5060be):

    err = MPIIsend<MemorySpace::HOST>(&numGhostIndicesInProc,

`numGhostIndicesInProc` is local to the for-loop and goes out of scope before the `MPI_Isend` is guaranteed to have completed. MPI requires the send buffer to stay valid until the request is completed (here, by the later `MPI_Waitall`), so the pending send can end up referencing a dead stack variable. We lucked out in most cases because only a single integer is sent, so the handshake with the target processor probably finished before `numGhostIndicesInProc` went out of scope. In some corner cases, however, the `MPI_Isend` is unable to complete the handshake, which shows up as an `MPI_Waitall` failure with the run stuck indefinitely.

The fix is easy: replace `numGhostIndicesInProc` with `d_numGhostIndicesInGhostProcs[iGhostProc]`, whose storage outlives the loop iteration. This bug may also be behind some of the crashes we observed in DFT-FE.
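
To make the lifetime problem concrete, here is a minimal MPI-free sketch. `MockComm`, `MockRequest`, `postSend`, `waitAll`, and `sendGhostCounts` are hypothetical names invented for this illustration and are not part of dft-efe or MPI. The mock only imitates one property of a nonblocking send: the buffer address is recorded when the send is posted, and the value is read later, at completion. The actual code in `MPIPatternP2P.t.cpp` may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pending nonblocking send. Like MPI_Isend, it
// stores only the ADDRESS of the send buffer, not a copy of the value.
struct MockRequest
{
  const int * buffer; // address handed to the "send"
  std::size_t dest;   // destination rank
};

// Hypothetical stand-in for a communicator. waitAll() plays the role of
// MPI_Waitall: only at that point are the send buffers actually read.
struct MockComm
{
  std::vector<MockRequest> pending;
  std::vector<int>         delivered; // delivered[dest] = value read at completion

  explicit MockComm(std::size_t numProcs)
    : delivered(numProcs, -1)
  {}

  void
  postSend(const int *buffer, std::size_t dest)
  {
    assert(buffer != nullptr);
    pending.push_back({buffer, dest});
  }

  void
  waitAll()
  {
    for (const MockRequest &r : pending)
      delivered[r.dest] = *r.buffer; // buffer must still be alive here
    pending.clear();
  }
};

// Fixed pattern: each pending send points into a vector that stays alive
// (and is not resized) until waitAll() returns.
std::vector<int>
sendGhostCounts(const std::vector<int> &numGhostIndicesInGhostProcs)
{
  MockComm comm(numGhostIndicesInGhostProcs.size());
  for (std::size_t iGhostProc = 0;
       iGhostProc < numGhostIndicesInGhostProcs.size();
       ++iGhostProc)
    {
      // Buggy variant, shown only as a comment because it is undefined
      // behavior: the local dies at the end of each iteration, before
      // waitAll() reads it.
      //   const int numGhostIndicesInProc =
      //     numGhostIndicesInGhostProcs[iGhostProc];
      //   comm.postSend(&numGhostIndicesInProc, iGhostProc);
      comm.postSend(&numGhostIndicesInGhostProcs[iGhostProc], iGhostProc);
    }
  comm.waitAll();
  return comm.delivered;
}
```

Calling `sendGhostCounts` with a vector of per-rank counts returns the same counts, because every buffer is still valid when `waitAll()` reads it. The same reasoning applies to the real code, under the assumption that `d_numGhostIndicesInGhostProcs` is not resized or reallocated between the `MPI_Isend` calls and the matching `MPI_Waitall`.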