Closed
Description
Version of Singularity:
$ singularity --version
SingularityPRO version 3.11-5.el8
Actual behavior
When running the following Slurm batch script:
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
#SBATCH --nodes=2
module load openmpi/4.1.6--nvhpc--24.3
mpirun -np 8 singularity exec fall3d_opeacc.sif Fall3d.x
I get the following error:
[lrdn2911:1723502:0:1723502] cma_ep.c:88 process_vm_readv(pid=1723503 {0x14745e5ac800,61928}-->{0x150dba573e00,61928}) returned -1: Bad address
[lrdn2912:2435814:0:2435814] cma_ep.c:88 process_vm_readv(pid=2435813 {0x14d1065ac800,61928}-->{0x15453a573e00,61928}) returned -1: Bad address
[lrdn2911:1723498:0:1723498] cma_ep.c:88 process_vm_readv(pid=1723500 {0x1505545ac800,61928}-->{0x1490a2573e00,61928}) returned -1: Bad address
[lrdn2912:2435816:0:2435816] cma_ep.c:88 process_vm_readv(pid=2435815 {0x149e885ac800,61928}-->{0x154f4a573e00,61928}) returned -1: Bad address
==== backtrace (tid:1723498) ====
0 0x0000000000003803 uct_cma_ep_tx_error() /build-result/src/hpcx-v2.20-gcc-inbox-redhat8-cuda12-x86_64/ucx-39c8f9b/src/uct/sm/scopy/cma/cma_ep.c:85
...
CMA (Cross-Memory Attach) is enabled inside UCX/Open MPI, which relies on the process_vm_readv()/process_vm_writev() system calls for zero-copy shared-memory transfers between processes on the same node. Here those calls fail with "Bad address".
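A possible workaround, not confirmed for this setup, is to tell UCX not to use the CMA transport so that intra-node transfers fall back to another shared-memory path. The sketch below assumes that the UCX build in use honors `UCX_TLS` with a leading `^` to exclude a transport, and that Singularity passes the host environment into the container (its default unless `--cleanenv` is given). The `run` helper and the `DRY_RUN` variable are hypothetical names added only so the launch line can be inspected before it is executed.

```shell
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
#SBATCH --nodes=2

# Assumption: this UCX build accepts '^' in UCX_TLS to exclude a transport.
# Excluding cma avoids process_vm_readv()/process_vm_writev() entirely.
export UCX_TLS='^cma'

# Hypothetical helper: prints the command when DRY_RUN=1 (the default here),
# runs it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -x UCX_TLS asks Open MPI's mpirun to export the variable to all ranks.
run mpirun -np 8 -x UCX_TLS singularity exec fall3d_opeacc.sif Fall3d.x
```

Submitting with `DRY_RUN=0` in the environment (and the `module load` line from the original script added back) would actually launch the job. Disabling CMA trades zero-copy transfers for an extra copy through shared memory, so some intra-node bandwidth loss is to be expected.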