Skip to content

Conversation

@christiangnrd
Copy link
Member

@christiangnrd christiangnrd commented Mar 20, 2025

I took out the unrelated MPSMatrix and MPSNDArray fixes from #566 and added some code coverage exclusions inspired by JuliaGPU/CUDA.jl#2704

This PR:

  • Fixes the MPSNDArray constructor to provide the offset in bytes instead of elements
  • Defines Base.size for MPSMatrix, adds tests for it, and fixes some copy-paste misses in the pre-existing MPSMatrix tests (hide whitespace in the diff for this one)
  • Moves MPSDataType to its own file as it didn't belong in mps/matrix.jl
  • Many new tests as well as coverage exclusions for device code

@github-actions
Copy link
Contributor

github-actions bot commented Mar 20, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Click here to view the suggested changes.
diff --git a/lib/mps/matrix.jl b/lib/mps/matrix.jl
index 5953377e..314212fd 100644
--- a/lib/mps/matrix.jl
+++ b/lib/mps/matrix.jl
@@ -142,7 +142,8 @@ with any `MtlArray` and it should be accelerated using Metal Performance Shaders
 """
 function matmul!(c::MtlArray{T1,N}, a::MtlArray{T2,N}, b::MtlArray{T3,N},
                  alpha::Number=true, beta::Number=true,
-                 transpose_a=false, transpose_b=false) where {T1, T2, T3, N}
+        transpose_a = false, transpose_b = false
+    ) where {T1, T2, T3, N}
     # NOTE: MPS uses row major, while Julia is col-major. Instead of transposing
     #       the inputs (by passing !transpose_[ab]) and afterwards transposing
     #       the output, we use the property that (AB)ᵀ = BᵀAᵀ
diff --git a/test/mtl/metal.jl b/test/mtl/metal.jl
index 278ce34e..b59857d3 100644
--- a/test/mtl/metal.jl
+++ b/test/mtl/metal.jl
@@ -9,7 +9,7 @@ using .MTL
 devs = devices()
 @test length(devs) > 0
 
-dev = MTLDevice(1)
+            dev = MTLDevice(1)
 @test dev == devs[1]
 
 if length(devs) > 1
@@ -34,35 +34,35 @@ full_str = sprint(io->show(io, MIME"text/plain"(), dev))
 
 @test dev.currentAllocatedSize isa Integer
 
-@test is_m1(dev) isa Bool
-@test is_m2(dev) isa Bool
-@test is_m3(dev) isa Bool
-@test is_m4(dev) isa Bool
+            @test is_m1(dev) isa Bool
+            @test is_m2(dev) isa Bool
+            @test is_m3(dev) isa Bool
+            @test is_m4(dev) isa Bool
 
-@test MTL.MTLCreateSystemDefaultDevice() isa MTLDevice
+            @test MTL.MTLCreateSystemDefaultDevice() isa MTLDevice
 
-end
+        end
+
+        @testset "storage_type" begin
+            @test convert(MTL.MTLStorageMode, MTL.SharedStorage) == MTL.MTLStorageModeShared
+            @test convert(MTL.MTLStorageMode, MTL.ManagedStorage) == MTL.MTLStorageModeManaged
+            @test convert(MTL.MTLStorageMode, MTL.PrivateStorage) == MTL.MTLStorageModePrivate
+            @test convert(MTL.MTLStorageMode, MTL.Memoryless) == MTL.MTLStorageModeMemoryless
+
+            @test convert(MTL.MTLResourceOptions, MTL.SharedStorage) == MTL.MTLResourceStorageModeShared
+            @test convert(MTL.MTLResourceOptions, MTL.ManagedStorage) == MTL.MTLResourceStorageModeManaged
+            @test convert(MTL.MTLResourceOptions, MTL.PrivateStorage) == MTL.MTLResourceStorageModePrivate
+            @test convert(MTL.MTLResourceOptions, MTL.Memoryless) == MTL.MTLResourceStorageModeMemoryless
+
+            @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeShared) == MTL.MTLResourceStorageModeShared
+            @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeManaged) == MTL.MTLResourceStorageModeManaged
+            @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModePrivate) == MTL.MTLResourceStorageModePrivate
+            @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeMemoryless) == MTL.MTLResourceStorageModeMemoryless
 
-@testset "storage_type" begin
-    @test convert(MTL.MTLStorageMode, MTL.SharedStorage) == MTL.MTLStorageModeShared
-    @test convert(MTL.MTLStorageMode, MTL.ManagedStorage) == MTL.MTLStorageModeManaged
-    @test convert(MTL.MTLStorageMode, MTL.PrivateStorage) == MTL.MTLStorageModePrivate
-    @test convert(MTL.MTLStorageMode, MTL.Memoryless) == MTL.MTLStorageModeMemoryless
-
-    @test convert(MTL.MTLResourceOptions, MTL.SharedStorage) == MTL.MTLResourceStorageModeShared
-    @test convert(MTL.MTLResourceOptions, MTL.ManagedStorage) == MTL.MTLResourceStorageModeManaged
-    @test convert(MTL.MTLResourceOptions, MTL.PrivateStorage) == MTL.MTLResourceStorageModePrivate
-    @test convert(MTL.MTLResourceOptions, MTL.Memoryless) == MTL.MTLResourceStorageModeMemoryless
-
-    @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeShared) == MTL.MTLResourceStorageModeShared
-    @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeManaged) == MTL.MTLResourceStorageModeManaged
-    @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModePrivate) == MTL.MTLResourceStorageModePrivate
-    @test convert(MTL.MTLResourceOptions, MTL.MTLStorageModeMemoryless) == MTL.MTLResourceStorageModeMemoryless
-
-    @test MTL.MTLResourceStorageModeShared == MTL.MTLStorageModeShared
-    @test MTL.MTLStorageModeManaged == MTL.MTLResourceStorageModeManaged
-    @test MTL.MTLResourceStorageModePrivate == MTL.MTLStorageModePrivate
-    @test MTL.MTLStorageModeMemoryless == MTL.MTLResourceStorageModeMemoryless
+            @test MTL.MTLResourceStorageModeShared == MTL.MTLStorageModeShared
+            @test MTL.MTLStorageModeManaged == MTL.MTLResourceStorageModeManaged
+            @test MTL.MTLResourceStorageModePrivate == MTL.MTLStorageModePrivate
+            @test MTL.MTLStorageModeMemoryless == MTL.MTLResourceStorageModeMemoryless
 end
 
 @testset "compile options" begin

@codecov
Copy link

codecov bot commented Mar 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.81%. Comparing base (b7606f0) to head (129392c).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #569      +/-   ##
==========================================
+ Coverage   74.92%   81.81%   +6.89%     
==========================================
  Files          57       54       -3     
  Lines        2716     2486     -230     
==========================================
- Hits         2035     2034       -1     
+ Misses        681      452     -229     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metal Benchmarks

Benchmark suite Current: 129392c Previous: b7606f0 Ratio
private array/construct 24354.166666666668 ns 25215.333333333332 ns 0.97
private array/broadcast 460458 ns 459459 ns 1.00
private array/random/randn/Float32 808563 ns 811458 ns 1.00
private array/random/randn!/Float32 627208 ns 649333 ns 0.97
private array/random/rand!/Int64 562125 ns 570959 ns 0.98
private array/random/rand!/Float32 588687.5 ns 613875 ns 0.96
private array/random/rand/Int64 762125 ns 787625 ns 0.97
private array/random/rand/Float32 585750 ns 595292 ns 0.98
private array/copyto!/gpu_to_gpu 457791 ns 675708 ns 0.68
private array/copyto!/cpu_to_gpu 823542 ns 827167 ns 1.00
private array/copyto!/gpu_to_cpu 652292 ns 678792 ns 0.96
private array/accumulate/1d 1352375 ns 1336771 ns 1.01
private array/accumulate/2d 1381042 ns 1378104 ns 1.00
private array/iteration/findall/int 2102625 ns 2062958.5 ns 1.02
private array/iteration/findall/bool 1835750 ns 1834437.5 ns 1.00
private array/iteration/findfirst/int 1700791.5 ns 1702125 ns 1.00
private array/iteration/findfirst/bool 1677167 ns 1669958 ns 1.00
private array/iteration/scalar 3282875 ns 3669312.5 ns 0.89
private array/iteration/logical 3222500 ns 3205125 ns 1.01
private array/iteration/findmin/1d 1777625 ns 1782500 ns 1.00
private array/iteration/findmin/2d 1355396 ns 1351333 ns 1.00
private array/reductions/reduce/1d 1050583.5 ns 1050417 ns 1.00
private array/reductions/reduce/2d 662125 ns 662520.5 ns 1.00
private array/reductions/mapreduce/1d 1044500 ns 1042687 ns 1.00
private array/reductions/mapreduce/2d 662917 ns 664979 ns 1.00
private array/permutedims/4d 2535458 ns 2536854 ns 1.00
private array/permutedims/2d 1020875 ns 1019479 ns 1.00
private array/permutedims/3d 1576562.5 ns 1577291.5 ns 1.00
private array/copy 582042 ns 588959 ns 0.99
latency/precompile 9096227041 ns 9098777875 ns 1.00
latency/ttfp 3670404791 ns 5137594083.5 ns 0.71
latency/import 1244124875 ns 1244565875 ns 1.00
integration/metaldevrt 705792 ns 710917 ns 0.99
integration/byval/slices=1 1549292 ns 1594958.5 ns 0.97
integration/byval/slices=3 8455250 ns 9534438 ns 0.89
integration/byval/reference 1543646 ns 1613834 ns 0.96
integration/byval/slices=2 2634208 ns 2614209 ns 1.01
kernel/indexing 470125 ns 453145.5 ns 1.04
kernel/indexing_checked 469750 ns 452729 ns 1.04
kernel/launch 10062.5 ns 8041 ns 1.25
metal/synchronization/stream 14625 ns 15208 ns 0.96
metal/synchronization/context 14917 ns 15083 ns 0.99
shared array/construct 24336.75 ns 24125 ns 1.01
shared array/broadcast 461000 ns 462709 ns 1.00
shared array/random/randn/Float32 822416 ns 800979.5 ns 1.03
shared array/random/randn!/Float32 627458 ns 643583 ns 0.97
shared array/random/rand!/Int64 566667 ns 575667 ns 0.98
shared array/random/rand!/Float32 588084 ns 613708 ns 0.96
shared array/random/rand/Int64 744083 ns 774333 ns 0.96
shared array/random/rand/Float32 613583 ns 627666.5 ns 0.98
shared array/copyto!/gpu_to_gpu 82750 ns 83208 ns 0.99
shared array/copyto!/cpu_to_gpu 78917 ns 83875 ns 0.94
shared array/copyto!/gpu_to_cpu 80916.5 ns 83084 ns 0.97
shared array/accumulate/1d 1343645.5 ns 1349937.5 ns 1.00
shared array/accumulate/2d 1390812.5 ns 1390958 ns 1.00
shared array/iteration/findall/int 1811750 ns 1825250 ns 0.99
shared array/iteration/findall/bool 1611333.5 ns 1599791.5 ns 1.01
shared array/iteration/findfirst/int 1392167 ns 1384041.5 ns 1.01
shared array/iteration/findfirst/bool 1369291.5 ns 1358041 ns 1.01
shared array/iteration/scalar 156458 ns 160292 ns 0.98
shared array/iteration/logical 2998542 ns 2994750 ns 1.00
shared array/iteration/findmin/1d 1466291 ns 1464291 ns 1.00
shared array/iteration/findmin/2d 1360645.5 ns 1349917 ns 1.01
shared array/reductions/reduce/1d 734208 ns 731917 ns 1.00
shared array/reductions/reduce/2d 665166 ns 666500 ns 1.00
shared array/reductions/mapreduce/1d 746958 ns 749125 ns 1.00
shared array/reductions/mapreduce/2d 658541.5 ns 670625 ns 0.98
shared array/permutedims/4d 2540375 ns 2526584 ns 1.01
shared array/permutedims/2d 1027666 ns 1017500 ns 1.01
shared array/permutedims/3d 1610167 ns 1572687.5 ns 1.02
shared array/copy 252271 ns 245291 ns 1.03

This comment was automatically generated by workflow using github-action-benchmark.

@christiangnrd christiangnrd merged commit 8d4dd5d into JuliaGPU:main Mar 21, 2025
7 checks passed
@christiangnrd christiangnrd deleted the misc branch March 21, 2025 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants