@michel2323
Collaborator

Let me know if you want to keep the trailing whitespace.

@michel2323 michel2323 requested a review from vchuravy October 13, 2022 17:39
@michel2323
Collaborator Author

#585 @vchuravy This should be merged I guess. Just checking again.

@michel2323 michel2323 force-pushed the ms/mpi branch 3 times, most recently from 409d930 to b9e0668 Compare January 31, 2023 22:27
@vchuravy
Member

Can you rebase once more?

@michel2323
Collaborator Author

@vchuravy @wsmoses The MPI test crashes now. With Julia 1.9 we hit the finalizer issue; for 1.8, the logs and Manifest.toml are attached.
log.tar.gz

@michel2323 michel2323 force-pushed the ms/mpi branch 2 times, most recently from 0f2cbdb to 6754c67 Compare November 7, 2023 22:52
@wsmoses
Member

wsmoses commented Nov 8, 2023

Can you open an issue with the error message?

@michel2323 michel2323 mentioned this pull request Nov 8, 2023
@codecov-commenter

codecov-commenter commented Nov 8, 2023

Codecov Report

❌ Patch coverage is 42.85714% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.00%. Comparing base (c009fa5) to head (66d0c3d).

Files with missing lines    Patch %    Lines
src/compiler/orcv2.jl       42.85%     4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #518      +/-   ##
==========================================
- Coverage   70.01%   70.00%   -0.02%     
==========================================
  Files          58       58              
  Lines       19295    19295              
==========================================
- Hits        13510    13507       -3     
- Misses       5785     5788       +3     

@michel2323
Collaborator Author

michel2323 commented Nov 8, 2023

@wsmoses @vchuravy I marked the test here as broken so we can merge the MPI tests and mark it unbroken once the fix lands. Opened issue #1138.
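
For readers unfamiliar with the mechanism, marking a test as broken can be sketched with the `Test` standard library as follows. This is a minimal illustration, not the test from this PR: `failing_check` is a hypothetical stand-in that simply throws.

```julia
using Test

# Hypothetical stand-in for the crashing MPI check; it always throws,
# the way a failed differentiation call would surface as an error.
failing_check() = error("simulated failure")

# @test_broken records a Broken result when the expression throws or is false,
# so the rest of the suite keeps running.
result = @test_broken failing_check()
```

Once the underlying fix lands and the expression starts passing, `@test_broken` reports an error for the unexpected pass, which is the cue to switch it back to `@test`.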

@wsmoses wsmoses force-pushed the ms/mpi branch 2 times, most recently from 0e829b9 to a1745a5 Compare November 20, 2023 02:37
@wsmoses
Member

wsmoses commented Nov 20, 2023

@michel2323 I rebased this PR with the jll that includes your fix; if it passes, let's merge this!

@wsmoses
Member

wsmoses commented Nov 20, 2023

@vchuravy This still fails, presumably because it needs #669.

From the CI log:

2023-11-20T03:26:35.5348040Z error: /home/runner/.julia/packages/MPI/hhI6i/src/api/generated_api.jl:2151:0: in function preprocess_julia_Isend_50847 {} addrspace(10)* ({} addrspace(10)*, i64, {} addrspace(10)*): Enzyme: failed to deduce type of copy   %65 = call i32 @MPI_Isend(i64 %58, i32 %value_phi4, i32 %61, i32 %52, i32 0, i32 %62, i64 %64) #34 [ "jl_roots"({} addrspace(10)* %2, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 139848936934304 to {}*) to {} addrspace(10)*), {} addrspace(10)* %value_phi5, {} addrspace(10)* %value_phi3) ], !dbg !107

2023-11-20T03:26:35.5120729Z   %value_phi5 = phi {} addrspace(10)* [ null, %pass ], [ addrspacecast ({}* inttoptr (i64 139848936932704 to {}*) to {} addrspace(10)*), %L54 ]: {[-1]:Pointer}, intvals: {0,}
2023-11-20T03:26:35.4979683Z   %59 = bitcast {} addrspace(10)* %value_phi5 to i32 addrspace(10)*, !dbg !123: {[-1]:Pointer}, intvals: {0,}
2023-11-20T03:26:35.4980592Z   %60 = addrspacecast i32 addrspace(10)* %59 to i32 addrspace(11)*, !dbg !123: {[-1]:Pointer}, intvals: {0,}
2023-11-20T03:26:35.4981343Z   %61 = load i32, i32 addrspace(11)* %60, align 4, !dbg !123, !tbaa !45: {}, intvals: {}

While I do agree custom rules are good (and we can redo that API to be a custom global invariant rule), the fact that MPI.Double is hidden behind a separate Julia-specific global int is hindering the analysis here (and likely would for other libXYZ calls too, for example CUDA).
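
To picture the pattern being described, here is a rough sketch of a datatype handle that lives in a global object and is only read at call time. All names (`FakeDatatype`, `FAKE_DOUBLE`, `init_handles!`, `send_stub`) are hypothetical and the handle value is arbitrary; this is not MPI.jl's actual code.

```julia
# Hypothetical wrapper around an integer datatype handle.
mutable struct FakeDatatype
    val::Cint
end

# Placeholder at load time; the real handle is assumed to be known only at runtime.
const FAKE_DOUBLE = FakeDatatype(Cint(0))

# Assumed init hook that fills in the handle (arbitrary value for illustration).
init_handles!() = (FAKE_DOUBLE.val = Cint(11); nothing)

# Stand-in for a C entry point that receives the raw integer handle.
send_stub(count::Cint, datatype::Cint) = (count, datatype)

# The callee only sees an integer loaded from a global object,
# comparable to the `load i32` of %61 in the log above.
call_send(count) = send_stub(Cint(count), FAKE_DOUBLE.val)

init_handles!()
call_send(4)
```

Because the integer is read from memory rather than appearing as a constant, a static analysis looking at the call cannot tell which datatype it denotes without extra information.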

@vchuravy vchuravy added this to the release-0.12 milestone Apr 1, 2024
@michel2323
Collaborator Author

michel2323 commented Apr 1, 2024

I get this now on the most recent Enzyme build:

ERROR: LoadError: ERROR: LoadError: Enzyme execution failed.
Enzyme: jl_call calling convention not implemented in aug_forward for   %33 = call nonnull {} addrspace(10)* ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32)*, {} addrspace(10)*, ...) @julia.call({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32)* noundef nonnull @jl_f_finalizer, {} addrspace(10)* noundef null, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140298333135136 to {}*) to {} addrspace(10)*), {} addrspace(10)* nofree nonnull align 8 dereferenceable(16) %newstruct15.i) #29, !dbg !149
Stacktrace:
 [1] finalizer
   @ ./gcutils.jl:87
 [2] Request
   @ /disk/mschanen/julia_depot/packages/MPI/z2owj/src/nonblocking.jl:183

This is the request issue @vchuravy mentioned. How do we proceed with MPI in Julia? Should we start an Enzyme extension for MPI, or an MPI extension for Enzyme?
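
As background on the stack trace, the pattern that produces a `finalizer` call inside a constructor can be sketched like this. `FakeRequest` and `release!` are hypothetical stand-ins, not MPI.jl's `Request` implementation.

```julia
# Log of released handles, so the effect of the finalizer is observable.
const RELEASED = Int[]

# Hypothetical request-like handle that needs cleanup when it is collected.
mutable struct FakeRequest
    handle::Int
    function FakeRequest(handle::Int)
        req = new(handle)
        # Registering cleanup in the constructor means every code path that
        # creates a request also contains a call to `finalizer`.
        finalizer(release!, req)
        return req
    end
end

# Cleanup callback invoked by the GC (or by an explicit `finalize`).
release!(req::FakeRequest) = push!(RELEASED, req.handle)

req = FakeRequest(42)
finalize(req)  # run the registered finalizer now instead of waiting for the GC
```

Any function that constructs such an object, as `Isend` does with `Request` in the trace, pulls the `finalizer` call into the code Enzyme has to differentiate.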

@vchuravy
Member

vchuravy commented Nov 3, 2025

I also saw the following locally when executing nonblocking_halo.jl with Julia 1.11:

ERROR: LoadError: LLVM error: Attribute 'readonly' applied to incompatible type!
i32 (i64, i32, i32, i32, i32, i32, i64)* @"ejlptr$MPI_Isend$1"

@vchuravy
Member

vchuravy commented Nov 4, 2025

Having lots of fun rediscovering issues.

With EnzymeAD/Enzyme#2527, nonblocking_halo.jl fails with:

ERROR: LoadError: ERROR: LoadError: Enzyme execution failed.
Enzyme: unhandled augmented forward for jl_f_finalizer

Stacktrace:
  [1] finalizer
    @ ./gcutils.jl:86 [inlined]
  [2] Request
    @ ~/.julia/packages/MPI/hNJm0/src/nonblocking.jl:183 [inlined]
  [3] Isend
    @ ~/.julia/packages/MPI/hNJm0/src/pointtopoint.jl:59 [inlined]
  [4] halo
    @ ~/src/Enzyme/test/integration/MPI/nonblocking_halo.jl:19 [inlined]
  [5] halo
    @ ~/src/Enzyme/test/integration/MPI/nonblocking_halo.jl:0 [inlined]
  [6] diffejulia_halo_1504_inner_13wrap
    @ ~/src/Enzyme/test/integration/MPI/nonblocking_halo.jl:0
  [7] macro expansion
    @ ~/src/Enzyme/src/compiler.jl:5887 [inlined]
  [8] enzyme_call
    @ ~/src/Enzyme/src/compiler.jl:5421 [inlined]
  [9] CombinedAdjointThunk
    @ ~/src/Enzyme/src/compiler.jl:5307 [inlined]
 [10] autodiff
    @ ~/src/Enzyme/src/Enzyme.jl:521 [inlined]
 [11] autodiff
    @ ~/src/Enzyme/src/Enzyme.jl:562 [inlined]
 [12] Enzyme execution failed.
Enzyme: unhandled augmented forward for jl_f_finalizer

Co-authored-by: Valentin Churavy
@vchuravy vchuravy changed the base branch from main to vc/blocking_ring November 5, 2025 17:13
@vchuravy vchuravy changed the title Adding MPI test Test for MPI.Irecv/MPI.Isend/MPI.Wait Nov 5, 2025
@vchuravy vchuravy removed this from the release-0.12 milestone Nov 5, 2025
@vchuravy
Member

vchuravy commented Nov 5, 2025

Depends on #2736

Base automatically changed from vc/blocking_ring to main November 6, 2025 18:11

dx = zeros(nlocal)
fill!(dx, Float64(rank))
autodiff(Reverse, halo, Duplicated(x, dx))
Member

In 1.10 only: https://github.com/EnzymeAD/Enzyme.jl/actions/runs/19148197862/job/54731298414?pr=518#step:8:31

julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/Utils.cpp:1804: llvm::Function* getOrInsertDifferentialMPI_Wait(llvm::Module&, llvm::ArrayRef<llvm::Type*>, llvm::Type*, llvm::StringRef): Assertion `isendfn' failed.

[1123] signal (6.-6): Aborted
in expression starting at /__w/Enzyme.jl/Enzyme.jl/test/integration/MPI/nonblocking_halo.jl:50
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7ee131fcb81a)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
getOrInsertDifferentialMPI_Wait at /workspace/srcdir/Enzyme/enzyme/Enzyme/Utils.cpp:1804
handleMPI at /workspace/srcdir/Enzyme/enzyme/Enzyme/CallDerivatives.cpp:429
handleKnownCallDerivatives at /workspace/srcdir/Enzyme/enzyme/Enzyme/CallDerivatives.cpp:2254
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:6405
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:111 [inlined]
CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:4505
EnzymeCreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/CApi.cpp:688
EnzymeCreatePrimalAndGradient at /__w/Enzyme.jl/Enzyme.jl/src/api.jl:270
jfptr_EnzymeCreatePrimalAndGradient_24158 at /root/.julia/compiled/v1.10/Enzyme/G1p5n_64aGk.so (unknown line)
_jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
macro expansion at /__w/Enzyme.jl/Enzyme.jl/src/compiler.jl:2639 [inlined]
macro expansion at /root/.julia/packages/LLVM/iza6e/src/base.jl:97 [inlined]
enzyme! at /__w/Enzyme.jl/Enzyme.jl/src/compiler.jl:2512

Member

Grrrml


@vchuravy vchuravy merged commit 575f622 into main Nov 6, 2025
22 checks passed
@vchuravy vchuravy deleted the ms/mpi branch November 6, 2025 20:34

6 participants