Crash in MVKCommandEncoder::finishQueries() with VK_QUERY_TYPE_TIMESTAMP
Summary
Intermittent segfault (use-after-free) in MVKCommandEncoder::finishQueries() when using VK_QUERY_TYPE_TIMESTAMP query pools. The crash occurs on a dispatch worker thread during Metal command buffer completion. Likely the same class of missing-retain bug as #1178 but in the query path, and possibly related to the race condition in #2378.
Environment
- MoltenVK: 1.4.0 and 1.4.1 (both affected)
- macOS: 26.2 (Tahoe)
- Hardware: Apple M1, Metal 4
- Application: vkdt (Vulkan compute photo editor)
Reproduction
The application creates a double-buffered timestamp query pool (2000 queries each) and writes timestamps via vkCmdWriteTimestamp() around compute dispatches. The pool is reset via vkCmdResetQueryPool() at the start of each command buffer recording, and results are read with vkGetQueryPoolResults(..., VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT) after proper timeline semaphore synchronization.
The crash is intermittent, occurring in roughly 1 in 3 to 5 launches. Setting MVK_CONFIG_SYNCHRONOUS_QUEUE_SUBMITS=1 did not help.
Crash stack (identical across all reports)
0: libMoltenVK.dylib invocation function for block in MVKCommandEncoder::finishQueries()
1: Metal MTLDispatchListApply
2: Metal -[_MTLCommandBuffer didCompleteWithStartTime:endTime:error:]
3: IOGPU -[IOGPUMetalCommandBuffer didCompleteWithStartTime:endTime:error:]
4: Metal -[_MTLCommandQueue commandBufferDidComplete:startTime:completionTime:error:]
5: IOGPU IOGPUNotificationQueueDispatchAvailableCompletionNotifications
6: IOGPU __IOGPUNotificationQueueSetDispatchQueue_block_invoke
7: libdispatch.dylib _dispatch_client_callout4
Faulting addresses vary and are always "not in any region" (e.g. 0x728b0b990cf8, 0xffffe6b668cd52cb, 0xba44753ce7e), consistent with use-after-free.
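For readers unfamiliar with the suspected bug class, here is a minimal plain-C sketch of a completion callback that retains the object it will touch later. It is not MoltenVK's code: `pool_t`, `schedule_finish()`, and `run_completion()` are hypothetical names invented for illustration, and the real fix (if this diagnosis is even correct) may look quite different.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted stand-in for a query pool; not a MoltenVK type. */
typedef struct {
    int refcount;
    int finished_queries;
} pool_t;

static int live_pools = 0; /* number of pools not yet freed */

static pool_t *pool_create(void) {
    pool_t *p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->refcount = 1;
    live_pools++;
    return p;
}

static void pool_retain(pool_t *p) {
    assert(p->refcount > 0);
    p->refcount++;
}

static void pool_release(pool_t *p) {
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        live_pools--;
        free(p);
    }
}

/* Hypothetical deferred work that runs later, e.g. on a completion thread. */
typedef struct {
    pool_t *pool;
    int count;
} completion_t;

/* Retain at scheduling time so the pool outlives the deferred callback.
   Omitting this retain is the "missing-retain" pattern: the callback would
   dereference freed memory if the owner released the pool first. */
static completion_t schedule_finish(pool_t *p, int count) {
    pool_retain(p);
    completion_t c = { p, count };
    return c;
}

/* Runs the deferred work, then drops the reference taken when scheduling. */
static int run_completion(completion_t *c) {
    c->pool->finished_queries += c->count;
    int total = c->pool->finished_queries;
    pool_release(c->pool);
    c->pool = NULL;
    return total;
}
```

In this sketch the owner may call `pool_release()` before `run_completion()` executes. The pool stays alive until the callback drops its own reference.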
Application query usage pattern
// Init (double-buffered, i=0..1):
VkQueryPoolCreateInfo info = {
    .sType      = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
    .queryType  = VK_QUERY_TYPE_TIMESTAMP,
    .queryCount = 2000,
};
vkCreateQueryPool(device, &info, NULL, &query[i].pool);
// Record (each frame):
vkBeginCommandBuffer(cmd, &begin_info);
vkCmdResetQueryPool(cmd, query[buf].pool, 0, query[buf].max);
// ... compute dispatches with vkCmdWriteTimestamp() pairs ...
vkEndCommandBuffer(cmd);
// Read results (after timeline semaphore wait for previous frame):
vkGetQueryPoolResults(device, query[prev_buf].pool, 0, cnt, ...
    VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT);
Workaround
Not creating the VkQueryPool / not issuing any vkCmdWriteTimestamp() calls eliminates the crash.
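One way to keep this workaround switchable without rebuilding is to gate all timestamp-query code behind a runtime flag. The sketch below is only an assumption about how an application might do that: the environment variable `APP_ENABLE_TIMESTAMPS` and both function names are hypothetical, and neither vkdt nor MoltenVK is known to provide such an option.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interprets the value of an opt-in switch.
   Unset, empty, or "0" means timestamps stay disabled (workaround active). */
static bool timestamps_enabled_from(const char *value) {
    if (value == NULL || value[0] == '\0') return false;
    return strcmp(value, "0") != 0;
}

/* Hypothetical helper: reads the assumed environment variable once per call. */
static bool timestamps_enabled(void) {
    return timestamps_enabled_from(getenv("APP_ENABLE_TIMESTAMPS"));
}
```

The application would then skip vkCreateQueryPool(), vkCmdResetQueryPool(), vkCmdWriteTimestamp(), and vkGetQueryPoolResults() whenever `timestamps_enabled()` returns false, so no timestamp query work reaches the driver.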