@vchuravy vchuravy commented Nov 6, 2025

@wsmoses on 1.10 I am seeing:

; Function Attrs: nofree norecurse nosync nounwind willreturn
declare i32 @PMPI_Wait(i64, i64) local_unnamed_addr #13

; Function Attrs: nofree norecurse nosync nounwind willreturn
declare i32 @MPI_Comm_rank(i32, i64) local_unnamed_addr #16

; Function Attrs: nofree norecurse nosync nounwind willreturn
declare i32 @MPI_Comm_size(i32, i64) local_unnamed_addr #17

; Function Attrs: nofree norecurse nosync nounwind willreturn
declare i32 @MPI_Irecv(i64, i32, i32, i32, i32, i32, i64) local_unnamed_addr #18

; Function Attrs: nofree norecurse nosync nounwind willreturn
declare i32 @MPI_Isend(i64, i32, i32, i32, i32, i32, i64) local_unnamed_addr #1

So when the PMPI_Wait handling goes looking for MPI_Isend, it looks up the PMPI_-prefixed name and misses the MPI_Isend declaration that is actually in the module.

If this looks kosher to you, I can also go and fix all the other users of getRenamedPerCallingConv.

llvm::StringRef callee) {
// The function could exist in two forms in the module - with PMPI_ or MPI_
// prefix. Check for both.
assert(startsWith(callee, "MPI"));
Member

No, this is wrong; we should create the function declaration if it does not exist.

Member Author

This only checks that the user side is passing in MPI_Gather. (Used to be in getRenamedPerCallingConv)

Member

sorry comment applied to three lines later

wsmoses commented Nov 6, 2025

This is the wrong fix overall. getRenamedPerCallingConv("PMPI_...", "MPI_x") should give PMPI_x, which ought to resolve?

vchuravy commented Nov 6, 2025

> this is the wrong fix overall getRenamedPerCallingConv("PMPI_...", "MPI_x") should give PMPI_x, which ought resolve?

No, that's the crux of the issue. A module may contain a mix of PMPI and MPI names, likely due to Enzyme.jl trying to backsolve the name and the symbol either being PMPI or MPI.

The full module is here. https://gist.github.com/vchuravy/6f5edc18764db407a019294eba7f39e5

We are processing PMPI_Wait and looking for MPI_Isend, which we normalize to PMPI_Isend and then fail to find in the module.

julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/Utils.cpp:1804: llvm::Function* getOrInsertDifferentialMPI_Wait(llvm::Module&, llvm::ArrayRef<llvm::Type*>, llvm::Type*, llvm::StringRef): Assertion `isendfn' failed.

[1123] signal (6.-6): Aborted
in expression starting at /__w/Enzyme.jl/Enzyme.jl/test/integration/MPI/nonblocking_halo.jl:50
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7ee131fcb81a)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
getOrInsertDifferentialMPI_Wait at /workspace/srcdir/Enzyme/enzyme/Enzyme/Utils.cpp:1804
handleMPI at /workspace/srcdir/Enzyme/enzyme/Enzyme/CallDerivatives.cpp:429
handleKnownCallDerivatives at /workspace/srcdir/Enzyme/enzyme/Enzyme/CallDerivatives.cpp:2254
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:6405
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:111 [inlined]
CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4505
EnzymeCreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/CApi.cpp:688
EnzymeCreatePrimalAndGradient at /__w/Enzyme.jl/Enzyme.jl/src/api.jl:270
jfptr_EnzymeCreatePrimalAndGradient_24158 at /root/.julia/compiled/v1.10/Enzyme/G1p5n_64aGk.so (unknown line)
_jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
macro expansion at /__w/Enzyme.jl/Enzyme.jl/src/compiler.jl:2639 [inlined]
macro expansion at /root/.julia/packages/LLVM/iza6e/src/base.jl:97 [inlined]
enzyme! at /__w/Enzyme.jl/Enzyme.jl/src/compiler.jl:2512

EnzymeAD/Enzyme.jl#518 (comment)
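
To illustrate the lookup problem described above, here is a minimal sketch of a helper that checks both spellings instead of only the one derived from the caller. It is a toy model, not Enzyme's code: `SymbolTable` stands in for the function names declared in the module, and `findMPIOrPMPI` is a hypothetical name. It returns an empty string when neither spelling is declared.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Toy stand-in (assumption) for the set of function names declared in a module.
using SymbolTable = std::unordered_set<std::string>;

// Hypothetical helper: `callee` is spelled with the "MPI_" prefix. Returns
// whichever of "MPI_x" / "PMPI_x" is declared, preferring the spelling that
// matches the caller's prefix. Returns "" if neither is declared.
std::string findMPIOrPMPI(const SymbolTable &symbols, const std::string &caller,
                          const std::string &callee) {
  assert(callee.rfind("MPI_", 0) == 0 && "callee must be spelled MPI_...");
  const std::string plain = callee;          // e.g. "MPI_Isend"
  const std::string profiled = "P" + callee; // e.g. "PMPI_Isend"
  const bool callerIsPMPI = caller.rfind("PMPI_", 0) == 0;
  const std::string &first = callerIsPMPI ? profiled : plain;
  const std::string &second = callerIsPMPI ? plain : profiled;
  if (symbols.count(first))
    return first;
  if (symbols.count(second))
    return second;
  return std::string();
}
```

With a table holding `PMPI_Wait` and `MPI_Isend`, as in the IR dump above, a lookup from `PMPI_Wait` for `MPI_Isend` falls back to the `MPI_Isend` spelling rather than failing.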

wsmoses commented Nov 6, 2025

I mean the bigger issue is that we should never have a mix of mpi/pmpi?

since I assume julia never gives us a mix. So we should never generate a mix?

wsmoses commented Nov 6, 2025

the issue imo is that this

auto isendfn = M.getFunction(getRenamedPerCallingConv(caller, "MPI_Isend"));
needs to become a getOrInsertFunction, not getFunction

wsmoses commented Nov 6, 2025

We have the args, so we ought to be able to form the FunctionType.
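
The get-or-insert idea suggested above can be sketched with a toy model. LLVM's `Module::getOrInsertFunction` is the real API being referred to; everything in the block below (`ToyModule`, `ToyFunctionType`, `getOrInsert`) is a hypothetical stand-in so the example stays self-contained, not Enzyme's or LLVM's code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins (assumptions): a "function type" is a list of argument type
// names, and a "module" maps declared function names to their types.
using ToyFunctionType = std::vector<std::string>;
using ToyModule = std::map<std::string, ToyFunctionType>;

// Returns the existing declaration for `name`, or adds one with `type` when
// none exists. An existing declaration is never overwritten.
const ToyFunctionType &getOrInsert(ToyModule &m, const std::string &name,
                                   const ToyFunctionType &type) {
  // std::map::insert leaves the map unchanged if the key is already present
  // and returns an iterator to the existing entry in that case.
  return m.insert({name, type}).first->second;
}
```

In this model the lookup can no longer come back empty, so an assertion like the one on `isendfn` would not trip. Note that if the module already declares the other spelling, this approach adds a second declaration under the normalized name rather than reusing the existing one.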

vchuravy commented Nov 7, 2025

> since I assume julia never gives us a mix. So we should never generate a mix?

Enzyme.jl is to blame: it tries to invert a pointer from Julia, and which name it ends up with is 50/50.

> since I assume julia never gives us a mix. So we should never generate a mix?

See the module: Julia gives us precisely a mix, and that is the issue I am trying to fix.

wsmoses commented Nov 7, 2025

ah bleh
