Potential numerical corruption in fallback logic #27

@albertpod

Folks, I know you've been doing quite a bit of work on FastCholesky.jl, but the constant warnings I've been getting during inference lately have become a bit annoying. So I decided to look into the code again, using Gemini 3.0.

It flagged something that looks like a potential bug in the fallback logic. I know AI code analysis can often be "slop," but I ran the reproduction script it suggested, and the numbers actually don't match up.

The potential issue:
Gemini claims that because the initial cholesky!(A) is in-place, it modifies the matrix A (partially factorizing it) even if it fails. When the code then enters the fallback block (if fallback_gmw81), it passes this partially modified/corrupted A to PositiveFactorizations, rather than the original input.
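
To check the first half of that claim in isolation, here is a minimal sketch that uses only LinearAlgebra. It assumes that `cholesky!` accepts `check = false` (so a failure does not throw) and that `issuccess` reports the outcome; the variable names are just for illustration. It does not touch FastCholesky at all, it only shows whether the input buffer survives a failed in-place factorization.

```julia
using LinearAlgebra

A_orig = [1.0 2.0; 2.0 1.0]   # same indefinite matrix as in the script below
A_work = copy(A_orig)

# check = false suppresses the PosDefException so the failure can be inspected
F = cholesky!(Hermitian(A_work); check = false)

println("factorization succeeded: ", issuccess(F))
println("buffer unchanged after failure: ", A_work == A_orig)
display(A_work)
```

If the second line prints `false`, the buffer was modified by the failed attempt, which is the precondition for the fallback seeing dirty data.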

Here is the reproduction script:
I compared the result of FastCholesky against running PositiveFactorizations directly on a clean matrix.

using LinearAlgebra
using FastCholesky
using PositiveFactorizations
using Test

# A = [1 2; 2 1] has determinant -3, so it is indefinite.
# Standard Cholesky fails; PositiveFactorizations modifies diagonal to handle it.
A_orig = [1.0 2.0; 2.0 1.0] 

# 1. Get the "Correct" Answer (GMW81 on clean data)
# We copy A_orig to ensure it's clean.
C_correct = cholesky(PositiveFactorizations.Positive, Hermitian(copy(A_orig)))

# 2. Run fastcholesky!
# If the hypothesis is true, this will try standard Cholesky -> Fail (modifying A) -> Retry on dirty A
A_fast = copy(A_orig)
C_fast = fastcholesky!(A_fast)

println("--- Correct L Factor (GMW81 on clean data) ---")
display(C_correct.L)

println("\n--- FastCholesky L Factor (GMW81 on potentially dirty data) ---")
display(C_fast.L)

println("\n--- Do they match? ---")
println(C_correct.L ≈ C_fast.L)

Output:

--- Correct L Factor (GMW81 on clean data) ---
2×2 LowerTriangular{Float64, Matrix{Float64}}:
 1.0   ⋅ 
 2.0  1.73205

--- FastCholesky L Factor (GMW81 on potentially dirty data) ---
2×2 LowerTriangular{Float64, Matrix{Float64}}:
 1.0   ⋅ 
 2.0  2.64575

--- Do they match? ---
false

Is this a real bug regarding the state of A being carried over to the fallback, or am I missing something about how cholesky! handles failures?
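
If it is real, one possible way around it might look like the sketch below. This is not the actual FastCholesky code: `cholesky_with_clean_fallback!` is a hypothetical name, and the structure is my assumption about what a fix could look like. The idea is to keep a copy of the input, and restore it before handing the matrix to PositiveFactorizations.

```julia
using LinearAlgebra
using PositiveFactorizations

# Hypothetical helper, not part of FastCholesky.jl
function cholesky_with_clean_fallback!(A::Matrix{Float64})
    backup = copy(A)                              # keep the untouched input
    F = cholesky!(Hermitian(A); check = false)    # fast path, may fail
    issuccess(F) && return F
    copyto!(A, backup)                            # undo the partial factorization
    return cholesky(PositiveFactorizations.Positive, Hermitian(A))
end

C = cholesky_with_clean_fallback!([1.0 2.0; 2.0 1.0])
display(C.L)
```

The obvious downside is the extra copy on every call, including the ones where the fast path succeeds, so there may well be a cheaper way to do this.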

Thanks! (sorry if slop)
