Open
Description
This may not be the best approach, but to improve the heuristic that decides whether to reduce with blocks or with threads, I think there should be a way to expose the number of cores.
See https://github.com/JuliaGPU/CUDA.jl/blob/e561e7a106684f8e4be59cad98a51cc304c671d2/src/mapreduce.jl#L163-L167 and JuliaGPU/Metal.jl#626
I guess we would also need a way to access the max threads per block/group. Maybe we could expose an API specifically for reductions that is essentially an interface to what CUDA defines in `big_mapreduce_threshold`?
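For concreteness, here is a rough sketch of what a reduction-specific query could look like. All names in it (`DeviceLimits`, `reduction_threshold`, `reduce_strategy`) are hypothetical and do not exist in any package. It also assumes the threshold is the product of the number of compute units and the max threads per block/group, which is an assumption based on the linked CUDA.jl lines rather than a confirmed design.

```julia
# Hypothetical sketch: none of these names exist in any JuliaGPU package.

# The two device properties a backend would have to report.
struct DeviceLimits
    compute_units::Int          # e.g. multiprocessors on CUDA, GPU cores on Metal
    max_threads_per_group::Int  # max threads per block / threadgroup
end

# Assumed threshold: a rough upper bound on the number of threads the device
# can keep busy at once (compute units times max threads per group).
reduction_threshold(limits::DeviceLimits) =
    limits.compute_units * limits.max_threads_per_group

# Pick a strategy from the number of independent outputs of the reduction.
# With at least `threshold` outputs, one thread per output already fills the
# device, so each thread can loop serially (:threads). Otherwise a whole
# block/group cooperates on each output (:blocks).
function reduce_strategy(limits::DeviceLimits, n_outputs::Integer)
    return n_outputs >= reduction_threshold(limits) ? :threads : :blocks
end
```

A backend would only need to fill in `DeviceLimits` (or implement the threshold function directly), and the generic reduction code could then call `reduce_strategy` without knowing anything device-specific.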
We should probably also update https://discourse.julialang.org/t/how-to-get-the-device-name-and-the-number-of-compute-units-when-using-oneapi-jl-or-amdgpu-jl/128361 once this is resolved.