Cache dlpath on MacOS because it gets really slow #58409

Closed

IanButterworth
Member

JLL inits can get really slow on macOS when lots of libs are loaded, because jl_pathname_for_handle is expensive there.

This caches the work within dlpath on macOS.
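
As a rough illustration of the idea (not the diff in this PR), a handle-keyed memoization of `Libdl.dlpath` could look something like the sketch below. The names `cached_dlpath`, `PATH_CACHE`, and `PATH_CACHE_LOCK` are made up for this example, and it assumes a handle is never closed and later reused for a different library.

```julia
using Libdl

# Hypothetical names, for illustration only; not the PR's actual implementation.
const PATH_CACHE = Dict{Ptr{Cvoid},String}()
const PATH_CACHE_LOCK = ReentrantLock()

# Look up the path once per handle; later calls are served from the Dict.
# Assumption: a handle is never closed and reused for a different library.
function cached_dlpath(handle::Ptr{Cvoid})
    lock(PATH_CACHE_LOCK) do
        get!(PATH_CACHE, handle) do
            Libdl.dlpath(handle)
        end
    end
end

# Usage: open any library that is already loaded and exists on disk.
path = first(filter(isfile, Libdl.dllist()))
handle = Libdl.dlopen(path)
cached_dlpath(handle)   # first call does the real lookup
cached_dlpath(handle)   # second call hits the cache
```

The lock keeps concurrent callers from mutating the `Dict` at the same time, at the cost of serializing lookups.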

master

julia> using SuiteSparse_jll, Libdl

julia> @time dlpath(SuiteSparse_jll.libamd_handle)
  0.000675 seconds (1 allocation: 96 bytes)
"/Users/ian/Documents/GitHub/alts/julia/usr/lib/libamd.3.3.3.dylib"

julia> @time using Plots
  1.045309 seconds (1.74 M allocations: 95.389 MiB, 11.74% gc time, 8.30% compilation time)

julia> @time dlpath(SuiteSparse_jll.libamd_handle)
  0.042908 seconds (1 allocation: 96 bytes)
"/Users/ian/Documents/GitHub/alts/julia/usr/lib/libamd.3.3.3.dylib"

julia> @time dlpath(SuiteSparse_jll.libamd_handle)
  0.055904 seconds (1 allocation: 96 bytes)
"/Users/ian/Documents/GitHub/alts/julia/usr/lib/libamd.3.3.3.dylib"

julia> @time SuiteSparse_jll.__init__()
  0.452236 seconds (43 allocations: 1.875 KiB)
"/Users/ian/Documents/GitHub/alts/julia/usr"

PR

julia> using SuiteSparse_jll, Libdl

julia> @time dlpath(SuiteSparse_jll.libamd_handle)
  0.000001 seconds
"/Users/ian/Documents/GitHub/julia/usr/lib/libamd.3.3.3.dylib"

julia> @time using Plots
  0.937607 seconds (1.70 M allocations: 92.011 MiB, 3.28% gc time, 9.93% compilation time)

julia> @time dlpath(SuiteSparse_jll.libamd_handle)
  0.000001 seconds
"/Users/ian/Documents/GitHub/julia/usr/lib/libamd.3.3.3.dylib"

julia> @time SuiteSparse_jll.__init__()
  0.002423 seconds (31 allocations: 752 bytes)
"/Users/ian/Documents/GitHub/julia/usr"

num_images = ccall(:_dyld_image_count, Cint, ())
# start at 1 instead of 0 to skip self
for i in 1:num_images-1 # 0-based
    name = unsafe_string(ccall(:_dyld_get_image_name, Cstring, (UInt32,), i))
Member

@Keno Keno May 13, 2025

We're already relying on the structure of the macos handle here, so this whole thing can just be

dlpath_cache[(@ccall _dyld_get_image_header(i::UInt32)::Ptr{Cvoid}) >> 5]

cf https://github.com/apple-opensource/dyld/blob/e3f88907bebb8421f50f0943595f6874de70ebe0/dyld3/APIs.cpp#L889

with a corresponding mask of the bottom bit in the lookup.

Member

(To avoid the extra dlopen)

Member Author

I looked into that first, but the pointer returned by _dyld_get_image_header doesn't seem to be the expected heap-allocated handle. Those pointers are in the 0x1xx range, whereas the handle we have in Julia is in the 0x3xx range. Does that sound right?

I did try a two-stage approach, attempting that first and then falling back to the current way, and it never seemed to hit the fast path.

Member

(Note the shift by 5)

Member Author

Setting aside whether to cache or not, I can't seem to get this right. It always hits the fallthrough error:

const dlpath_cache_lock = Base.ReentrantLock()
const dlpath_cache = Dict{UInt,String}()
function dlpath(handle::Ptr{Cvoid})
    @lock dlpath_cache_lock begin
        key = UInt(handle) & ~UInt(1) # mask the flags bit. see `makeDlHandle`
        path = get(dlpath_cache, key, nothing)
        if path !== nothing
            return path
        else
            num_images = ccall(:_dyld_image_count, Cint, ())
            # start at 1 instead of 0 to skip self
            for i in 1:num_images-1 # 0-based
                h = UInt(@ccall _dyld_get_image_header(i::UInt32)::Ptr{Cvoid}) << 5
                haskey(dlpath_cache, h) && continue
                name = unsafe_string(ccall(:_dyld_get_image_name, Cstring, (UInt32,), i))
                key2 = UInt(h)
                dlpath_cache[key] = name
                if key2 == key
                    return name
                end
            end
        end
    end
    error("dlpath: could not find path for handle $(handle)")
end

Member

You're applying the shift in the wrong direction. You either need to << 5 the key or >> 5 the image header.
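
The shift directions can be checked with plain integer arithmetic. The sketch below assumes the scheme described earlier in this thread (handle = image header shifted right by 5, with a flag in the bottom bit); `make_handle` and `header_key` are hypothetical helpers, and the header value is simply the first printed `key2` from this thread shifted back left by 5. As the later comments note, the dyld actually in use may compute handles differently.

```julia
# Hypothetical helpers mirroring the "(header >> 5) | flag" scheme discussed above.
make_handle(header::UInt64, flag::Bool) = (header >> 5) | UInt64(flag)
header_key(handle::UInt64) = handle & ~UInt64(1)   # mask the bottom flag bit

header = 0x00000001ac17a000          # page-aligned, so the low 5 bits are zero
handle = make_handle(header, true)

# Correct: compare in "handle space" by shifting the header right ...
@assert header_key(handle) == header >> 5
# ... or in "header space" by shifting the masked handle left.
@assert header_key(handle) << 5 == header
# Wrong direction: shifting the header left never matches the handle.
@assert header << 5 != header_key(handle)
```

The left-shift comparison only round-trips because the header's low 5 bits are zero, which holds for page-aligned addresses.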

Member Author

>> 5 on the image header gives quite different values

function dlpath(handle::Ptr{Cvoid})
    @lock dlpath_cache_lock begin
        key = UInt(handle) & ~UInt(1) # mask the flags bit. see `makeDlHandle`
        path = get(dlpath_cache, key, nothing)
        if path !== nothing
            return path
        else
            num_images = ccall(:_dyld_image_count, Cint, ())
            # start at 1 instead of 0 to skip self
            for i in 1:num_images-1 # 0-based
                key2 = UInt(@ccall _dyld_get_image_header(i::UInt32)::Ptr{Cvoid}) >> 5
                haskey(dlpath_cache, key2) && continue
                name = unsafe_string(ccall(:_dyld_get_image_name, Cstring, (UInt32,), i))
                dlpath_cache[key2] = name
                @show key, key2
                if key2 == key
                    return name
                end
            end
        end
    end
    error("dlpath: could not find path for handle $(handle)")
end

(key, key2) = (0x0000000397000010, 0x000000000d60bd00)
(key, key2) = (0x0000000397000010, 0x000000000d73a180)
(key, key2) = (0x0000000397000010, 0x000000000e226680)
(key, key2) = (0x0000000397000010, 0x000000000d72f200)
(key, key2) = (0x0000000397000010, 0x000000000d655300)
(key, key2) = (0x0000000397000010, 0x000000000d73a880)
(key, key2) = (0x0000000397000010, 0x00000000123a3b80)
(key, key2) = (0x0000000397000010, 0x000000000f449880)

Also tried

key2 = (UInt(@ccall _dyld_get_image_header(i::UInt32)::Ptr{Cvoid}) >> 5) & ~UInt(1)

I found myself trying << 5 because the values are quite similar.

Member

Hmm, I don't know. I guess it's possible that dyld changed the way it computes handles - the open version is somewhat old.

Member

@vtjnash vtjnash left a comment

Please remove the cache and implement this inside jl_pathname_for_handle as Keno directed

@Keno
Member

Keno commented May 13, 2025

> Please remove the cache

Isn't avoiding the iteration still part of the point?

@giordano added the system:mac (Affects only macOS) label and removed the JLLs label on May 13, 2025
@vtjnash
Member

vtjnash commented May 14, 2025

I'd recommend just removing dlpath from your code. It seems like a pretty smelly thing to be doing (outside of debug printing), since there isn't really much you should be doing with a filesystem path manually.

@vtjnash
Member

vtjnash commented May 14, 2025

For example, the single use of it recently added in Base would be wrong for anyone who statically links their own binaries using libjulia or renames it. That use could instead call dladdr on a known symbol (we use jl_get_libdir elsewhere for that), or better yet, it looks like it could just let the system dlopen do the right thing on its own: Unix already searches DT_RPATH first, which is $ORIGIN; macOS does so if you specify the path @rpath/; and Windows does this because we specify that to LoadLibraryEx.

julia/base/libdl.jl

Lines 344 to 345 in 58daba4

libname = ifelse(isdebugbuild(), "libjulia-internal-debug", "libjulia-internal")
dirname(dlpath(libname))
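
The `dladdr` route mentioned above could be sketched roughly as follows on Unix-like systems. `DlInfo` and `library_path_for_address` are names invented for this example, and `jl_options` is used only as a convenient exported symbol; that choice is an assumption of this sketch, not the symbol referred to in the comment.

```julia
# Mirrors the C `Dl_info` struct filled in by dladdr(3) (Unix only).
struct DlInfo
    dli_fname::Ptr{UInt8}   # path of the object containing the address
    dli_fbase::Ptr{Cvoid}   # base address of that object
    dli_sname::Ptr{UInt8}   # name of the nearest symbol
    dli_saddr::Ptr{Cvoid}   # address of that symbol
end

# Hypothetical helper: ask the dynamic loader which file contains `addr`.
function library_path_for_address(addr::Ptr{Cvoid})
    info = Ref{DlInfo}()
    ok = ccall(:dladdr, Cint, (Ptr{Cvoid}, Ptr{DlInfo}), addr, info)
    ok == 0 && return nothing
    fname = info[].dli_fname
    return fname == C_NULL ? nothing : unsafe_string(fname)
end

# Assumption: `jl_options` is a global exported by the Julia runtime library.
libpath = library_path_for_address(cglobal(:jl_options))
libdir = libpath === nothing ? nothing : dirname(libpath)
```

Because the lookup starts from an address inside the running library, it does not depend on the library's file name, which is the concern raised about renamed or statically linked builds.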

@vtjnash
Copy link
Member

vtjnash commented May 14, 2025

This will also already be fixed by #58405 without needing this hack

@IanButterworth
Member Author

I had assumed JLLWrappers also used dlpath like the stdlib JLLs, but it doesn't seem to: https://github.com/JuliaPackaging/JLLWrappers.jl/blob/ecaba50a4462209714f0979667c64c1bf28ee892/src/products/library_generators.jl#L63

So yeah, #58405 seems to go in the right direction (but not fully: there are still 20 dlpath calls during stdlib JLL inits there), so I'll close this.

@IanButterworth IanButterworth deleted the ib/dlpath_cache branch May 14, 2025 15:26
@IanButterworth
Member Author

Just FYI @staticfloat

@IanButterworth
Member Author

And just adding that dlpath only has 47 hits in General, most of which appear to be at build time or during tests, so yeah, not a big issue:
https://juliahub.com/ui/Search?type=code&q=dlpath(

Labels: latency, system:mac (Affects only macOS)