aa wrapper and KLU migration to PNM for PFs#302
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Replaces the optional AppleAccelerate.jl weak dependency / package extension with an in-tree AccelerateWrapper submodule that binds directly to libSparse.dylib on macOS. The wrapper exposes a cached symbolic+numeric LDLT factorization, in-place dense/sparse-RHS solves, and SpMM/SpMV; PNM's PTDF/LODF/VirtualPTDF code paths are migrated from AAFactorization to the new AAFactorCache. Non-Apple builds get a stub module that errors on use, removing the runtime extension gating.
Changes:
- New
src/AccelerateWrapper/submodule with libSparse ccalls, a symbolic/numeric factor cache, and dense/sparse solve + SpMM bindings (macOS-gated; stubs elsewhere). - Inlined
_calculate_PTDF_matrix_AppleAccelerate/_calculate_LODF_matrix_AppleAccelerateintosrc/, swapped_solve_factorization/_create_factorization/with_solverto dispatch onAAFactorCache, and replaced_has_apple_accelerate_ext()with_has_apple_accelerate_backend() = Sys.isapple(). - Removed
ext/AppleAccelerateExt.jl, theAppleAccelerateweakdep/compat entries, theruntests.jlinstall step, and the corresponding Aqua stale-dep ignore; addedtest/test_accelerate_wrapper.jl.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/AccelerateWrapper/AccelerateWrapper.jl |
Submodule entry; macOS-gated includes plus non-Apple stubs. |
src/AccelerateWrapper/libsparse_bindings.jl |
C struct layouts, enums, and mangled @ccall bindings into libSparse. |
src/AccelerateWrapper/aa_cache.jl |
AAFactorCache lifecycle: lower-triangle pattern, symbolic/numeric (re)factor, finalizer. |
src/AccelerateWrapper/solve_dense.jl |
In-place dense vector/matrix solves and \ overload. |
src/AccelerateWrapper/solve_sparse_rhs.jl |
Block-packed sparse-RHS solver with reusable per-cache scratch. |
src/AccelerateWrapper/spmm.jl |
aa_spmm! / aa_spmv! bindings with per-call CSC→Apple index translation. |
src/PowerNetworkMatrices.jl |
Includes the wrapper, imports its symbols, drops AppleAccelerate forward decls. |
src/linalg_settings.jl |
Drops the extension probe; adds _has_apple_accelerate_backend and updates check_linalg_backend. |
src/solver_dispatch.jl |
Adds with_solver overload specialized on AAFactorCache; updates docstring. |
src/ptdf_calculations.jl |
Inlines PTDF AppleAccelerate path using AccelerateWrapper. |
src/lodf_calculations.jl |
Inlines LODF AppleAccelerate path using AccelerateWrapper. |
src/virtual_ptdf_calculations.jl |
Switches _solve_factorization to typed AAFactorCache overload; updates docs. |
ext/AppleAccelerateExt.jl |
Removed (logic moved into src/). |
Project.toml |
Drops AppleAccelerate weakdep, extension entry, and compat bound. |
test/PowerNetworkMatricesTests.jl |
Removes :AppleAccelerate from Aqua stale-dep ignore list. |
test/runtests.jl |
Drops the Pkg.add("AppleAccelerate") branch and updates the comment. |
test/test_accelerate_wrapper.jl |
New unit tests for AAFactorCache, dispatch, with_solver, and KLU parity. |
test/test_ptdf.jl, test/test_lodf.jl, test/test_virtual_ptdf.jl, test/test_powerflow_matrix_types.jl, test/test_network_modification.jl |
Switch guards to _has_apple_accelerate_backend, update messages, and update expected factor type to AAFactorCache. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """ | ||
| solve!(cache, B) -> B | ||
|
|
||
| Solve `A · X = B` in place. `B::StridedVecOrMat{Float64}` must have | ||
| first-dimension size equal to `cache.n` and unit stride in the first | ||
| dimension. Multiple columns of `B` are handled in a single libSparse call. | ||
| """ | ||
| function solve!(cache::AAFactorCache, B::StridedMatrix{Cdouble}) |
| This solver is only available on macOS. | ||
| Install AppleAccelerate: | ||
| julia> using Pkg; Pkg.add(\"AppleAccelerate\")""" | ||
| _has_apple_accelerate_backend() = Sys.isapple() |
| # Called by libSparse on failure with a null-terminated C string; we surface | ||
| # the message via `@error`. Must be a top-level function so `@cfunction` can | ||
| # resolve it. | ||
| function _libsparse_report_error(msg::Cstring) | ||
| s = unsafe_string(msg) | ||
| @error "libSparse reported an error" message = s | ||
| return nothing | ||
| end | ||
|
|
||
| # `reportError` is fired by libSparse before it returns a failure status — | ||
| # we log the message so it ends up in user output rather than libSparse's | ||
| # own stderr. Passing libc malloc/free explicitly (the C_NULL "use Apple |
There was a problem hiding this comment.
Seconded. Looking at AppleAccelerate.jl, I see instead: @cfunction(text->error(unsafe_string(text)), Cvoid, (Cstring, )). Probably don't need to go that low-level, but @error seems risky.
| @@ -383,7 +410,7 @@ efficient when the prerequisite matrices with factorization are already availabl | |||
| - `BA::BA_Matrix`: The branch susceptance weighted incidence matrix (B * A) | |||
|
|
|||
| # Keyword Arguments | |||
| - `linear_solver::String = "KLU"`: | |||
| - `linear_solver::String = _default_linear_solver()`: | |||
| Linear solver algorithm for matrix computations. Currently only "KLU" is supported | |||
| col_start = pos | ||
| for p in nzrange(A, j) | ||
| if rowval[p] >= j | ||
| pos += 1 | ||
| pos > cache.nnz_tri && return _pattern_mismatch(op) | ||
| cache.rowIndices[pos] == Cint(rowval[p] - 1) || | ||
| return _pattern_mismatch(op) | ||
| end | ||
| end | ||
| cache.columnStarts[j + 1] == Clong(pos) || | ||
| return _pattern_mismatch(op) | ||
| # silence unused-warning on col_start | ||
| col_start === col_start |
Performance ResultsPrecompile Time
Execution Time
|
luke-kiernan
left a comment
There was a problem hiding this comment.
Big picture question: what's the motivation or goal here? Yeah, using the public Julia package has its issues and limitations, but it's also modular and maintained in tandem with the _jll binary. So I'd like to hear your reasoning here.
If there's specific shortcomings of these libraries, I'm willing to go open a PR. I expect it wouldn't get merged for a while (~months), but at least then we wouldn't need to maintain our own low-level bindings indefinitely.
My other question: why is AA no longer an extension?
| factorization_type::SparseFactorization_t = SparseFactorizationLDLT, | ||
| ) | ||
| n = size(A, 1) | ||
| n == size(A, 2) || throw(DimensionMismatch("matrix must be square; got $(size(A))")) |
There was a problem hiding this comment.
If the user tries to apply a symmetric factorization to a non symmetric matrix, this will silently take the lower triangle and factorize a different matrix...potentially hazardous. I'd at least check that the pattern is symmetric, with a flag to bypass.
| Release the libSparse numeric and symbolic handles held by `cache`, leaving | ||
| Julia-side state intact. Idempotent. | ||
| """ | ||
| function _free_handles!(cache::AAFactorCache) |
There was a problem hiding this comment.
And if it isn't SparseStatusOk? Leaving it around seems iffy. Unlikely to cause issues in practice, but could clog up the heap with un-GCed objects in theory.
| # Called by libSparse on failure with a null-terminated C string; we surface | ||
| # the message via `@error`. Must be a top-level function so `@cfunction` can | ||
| # resolve it. | ||
| function _libsparse_report_error(msg::Cstring) | ||
| s = unsafe_string(msg) | ||
| @error "libSparse reported an error" message = s | ||
| return nothing | ||
| end | ||
|
|
||
| # `reportError` is fired by libSparse before it returns a failure status — | ||
| # we log the message so it ends up in user output rather than libSparse's | ||
| # own stderr. Passing libc malloc/free explicitly (the C_NULL "use Apple |
There was a problem hiding this comment.
Seconded. Looking at AppleAccelerate.jl, I see instead: @cfunction(text->error(unsafe_string(text)), Cvoid, (Cstring, )). Probably don't need to go that low-level, but @error seems risky.
| # Status codes and shared error helper | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
| const KLU_OK = 0 |
There was a problem hiding this comment.
nitpick: int backed enum would be more stylistic.
| # Keyword Arguments | ||
| - `linear_solver::String = "KLU"`: | ||
| Linear solver algorithm for matrix computations. Currently only "KLU" is supported | ||
| This constructor is intentionally KLU-only because `ABA.K` is always a |
There was a problem hiding this comment.
Reasoning behind this decision? Which matrices are AA-compatible and why?
| # element-wise row scaling. KLU's BTF short-circuits this internally so the | ||
| # overhead was modest; AA's libSparse and LAPACK's `getrf!`/`getrs!` do | ||
| # not, so the previous code was 3–5× slower on AA and order-of-magnitude | ||
| # slower on DENSE than necessary. Replace both with a direct row scaling. |
There was a problem hiding this comment.
Wait, is what we were doing that simple? And yet we were invoking factorization routines? Just want to confirm that this is real human error corrected by AI, not a hallucination
In short, I am trying reduce the number of external dependencies under which we have no control specially around the linear solvers and expose only the code that we need. I have been frustrated with the speed at which the other libraries respond and also at the exposure of mistakes there like it happened with the solve! call. Another reason specially for the linear solvers is to have cohesive infrastructure for PNM and PFs. |
josephmckinsey
left a comment
There was a problem hiding this comment.
I'm mainly confused as to why we are making so many changes to KLU immediately again. I haven't dug into the Apple Accelerate part yet.
There was a problem hiding this comment.
What feature is this used for? I don't actually see any specific code for KLU, so it seems like it could potentially be used for any LinearAlgebra.Factorization.
There was a problem hiding this comment.
I'm a bit confused why we are wrapping this. KLU's internals weren't very thread-safe, which made it a clearer case until that can be fixed upstream, but I thought AppleAccelerate.jl was a bit better now?
| const KLU_INVALID = -3 | ||
| const KLU_TOO_LARGE = -4 | ||
| # --------------------------------------------------------------------------- | ||
| # Int32 (int) family — mirror of the Int64 set above |
There was a problem hiding this comment.
Why are we implementing Int32? It seemed like the previous PR purposefully left out Int32
Maybe in not the cleanest way I layered the changes needed in KLU here so we can remove the use of KLU.jl in PowerFlows and use our own wrapper. In PFs we use Int32 |
This PR provides usage of the AA framework in the rest of the matrices and changes to KLU to be used later in PowerFlows to drop completely KLU.jl dependency and pass onto PFs the protection against the pointer problem and will close Sienna-Platform/PowerFlows.jl#107