Add fence delay options at capture time #1155

Open
wants to merge 1 commit into
base: dev
3 changes: 3 additions & 0 deletions USAGE_android.md
@@ -330,6 +330,9 @@ option values.
| Page guard signal handler watcher | debug.gfxrecon.page_guard_signal_handler_watcher | BOOL | When the `page_guard` memory tracking mode is enabled, setting this environment variable to `true` will spawn a thread which will periodically reinstall the `SIGSEGV` handler if it has been replaced by the application being traced. Default is `false` |
| Page guard signal handler watcher max restores | debug.gfxrecon.page_guard_signal_handler_watcher_max_restores | INTEGER | Sets the number of times the watcher will attempt to restore the signal handler. Setting it to a negative value will make the watcher thread run indefinitely. Default is `1` |
| Force FIFO present mode | debug.gfxrecon.force_fifo_present_mode | BOOL | When the `force_fifo_present_mode` is enabled, force all present modes in vkGetPhysicalDeviceSurfacePresentModesKHR to VK_PRESENT_MODE_FIFO_KHR, app present mode is set in vkCreateSwapchain to VK_PRESENT_MODE_FIFO_KHR. Otherwise the original present mode will be used. Default is: `true` |
| Fence Query Delay | debug.gfxrecon.fence_query_delay | INTEGER | Fences queried with `vkGetFenceStatus` and `vkWaitForFences` will not return `VK_SUCCESS` until the configured delay has elapsed (counted in queries by default); until then, these calls return `VK_NOT_READY` and `VK_TIMEOUT` respectively. Default is `0`. |
| Fence Query Delay Unit | debug.gfxrecon.fence_query_delay_unit | STRING | Specifies the unit used by the fence query delay option. If set to `calls`, the fence query delay is the number of calls to `vkGetFenceStatus`/`vkWaitForFences` that will be delayed. If set to `frames`, the fence query delay is the number of frames for which calls will be delayed. Default is `calls`. |
| Fence Query Delay Timeout Threshold | debug.gfxrecon.fence_query_delay_timeout_threshold | INTEGER | Specifies the timeout threshold (in nanoseconds) that determines whether a call to `vkWaitForFences` is treated as a "fence query". A call to `vkWaitForFences` can be either a synchronization step, where the application actually waits for the underlying command to complete and reaching the timeout is a failure, or a "delayed query", where the application polls the fence for a limited time and tries again later if the timeout is reached. This option sets the timeout value that separates these two usages. |
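
As a rough illustration of the `calls` unit described in the table, the sketch below counts status queries per fence and reports success only once the configured delay is used up. `FenceDelayTracker`, `QueryResult`, and the integer fence id are hypothetical names chosen for this example; this is not GFXReconstruct's capture-layer code, and the exact counting (the first N queries are delayed, the next one succeeds) is an assumption.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for VK_SUCCESS and VK_NOT_READY.
enum class QueryResult
{
    Success,
    NotReady
};

// Hypothetical tracker, for illustration only.
class FenceDelayTracker
{
  public:
    explicit FenceDelayTracker(uint32_t delay) : delay_(delay) {}

    // Call for each status query on a fence that is already signaled.
    // Assumption: the first `delay_` queries report NotReady, later ones Success.
    QueryResult Query(uint64_t fence_id)
    {
        uint32_t& seen = queries_seen_[fence_id];
        if (seen < delay_)
        {
            ++seen;
            return QueryResult::NotReady;
        }
        return QueryResult::Success;
    }

    // Forget a fence, for example when it is reset or destroyed.
    void Reset(uint64_t fence_id) { queries_seen_.erase(fence_id); }

  private:
    uint32_t                               delay_;
    std::unordered_map<uint64_t, uint32_t> queries_seen_;
};
```

With a delay of `0` (the documented default), `Query` reports success on the first call, so the option has no effect unless it is set.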

#### Settings File

4 changes: 4 additions & 0 deletions USAGE_desktop_Vulkan.md
@@ -306,6 +306,10 @@ option values.
| Force Command Serialization | GFXRECON_FORCE_COMMAND_SERIALIZATION | BOOL | Sets an exclusive lock (`unique_lock`) for every ApiCall. This can avoid capture issues caused by external multithreading. |
| Queue Zero Only | GFXRECON_QUEUE_ZERO_ONLY | BOOL | Forces the use of only QueueFamilyIndex: 0 and queueCount: 1 during capture to avoid replay errors caused by an unavailable VkQueue. |
| Allow Pipeline Compile Required | GFXRECON_ALLOW_PIPELINE_COMPILE_REQUIRED | BOOL | The default behaviour forces VK_PIPELINE_COMPILE_REQUIRED to be returned from Create*Pipelines calls which have VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT set, and skips dispatching and recording the calls. This forces applications to fall back to recompiling pipelines without caching, the Vulkan calls for which will be captured. Enabling this option causes capture to record the application's calls and implementation's return values unmodified, but the resulting captures are fragile to changes in Vulkan implementations if they use pipeline caching. |
| Fence Query Delay | GFXRECON_FENCE_QUERY_DELAY | INTEGER | Fences queried with `vkGetFenceStatus` and `vkWaitForFences` will not return `VK_SUCCESS` until the configured delay has elapsed (counted in queries by default); until then, these calls return `VK_NOT_READY` and `VK_TIMEOUT` respectively. Default is `0`. |
| Fence Query Delay Unit | GFXRECON_FENCE_QUERY_DELAY_UNIT | STRING | Specifies the unit used by the fence query delay option. If set to `calls`, the fence query delay is the number of calls to `vkGetFenceStatus`/`vkWaitForFences` that will be delayed. If set to `frames`, the fence query delay is the number of frames for which calls will be delayed. Default is `calls`. |
| Fence Query Delay Timeout Threshold | GFXRECON_FENCE_QUERY_DELAY_TIMEOUT_THRESHOLD | INTEGER | Specifies the timeout threshold (in nanoseconds) that determines whether a call to `vkWaitForFences` is treated as a "fence query". A call to `vkWaitForFences` can be either a synchronization step, where the application actually waits for the underlying command to complete and reaching the timeout is a failure, or a "delayed query", where the application polls the fence for a limited time and tries again later if the timeout is reached. This option sets the timeout value that separates these two usages. |
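
The distinction drawn by the timeout threshold can be sketched as a small classification function. `WaitKind` and `ClassifyWait` are hypothetical names, and the direction and inclusiveness of the comparison (a timeout at or below the threshold counts as a "delayed query") are assumptions, since the table does not state them.

```cpp
#include <cstdint>

// Hypothetical labels for the two usages of vkWaitForFences described in the table.
enum class WaitKind
{
    Synchronization, // the caller really waits for completion
    DelayedQuery     // the caller polls briefly and retries later
};

// Assumption: timeouts at or below the threshold are treated as delayed queries.
constexpr WaitKind ClassifyWait(uint64_t timeout_ns, uint64_t threshold_ns)
{
    return (timeout_ns <= threshold_ns) ? WaitKind::DelayedQuery : WaitKind::Synchronization;
}
```

Under these assumptions, a call with a timeout of `0` is always a delayed query, while a call with `UINT64_MAX` is a synchronization step for any smaller threshold.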

#### Memory Tracking Known Issues

### Capture Limitations
79 changes: 43 additions & 36 deletions framework/decode/vulkan_replay_consumer_base.cpp
@@ -3338,23 +3338,6 @@ VkResult VulkanReplayConsumerBase::OverrideWaitForFences(PFN_vkWaitForFences
     const VkFence* modified_fences = nullptr;
     std::vector<VkFence> valid_fences;

-    // Check if the call is in a frame range for being skipped (see --skip-get-fence-ranges, --skip-get-fence-status)
-    bool in_skip_range = options_.skip_get_fence_ranges.empty();
-    const uint32_t current_frame = application_->GetCurrentFrameNumber() + 1;
-    for (const util::UintRange& range : options_.skip_get_fence_ranges)
-    {
-        if (current_frame >= range.first && current_frame <= range.last)
-        {
-            in_skip_range = true;
-            break;
-        }
-    }
-
-    if (in_skip_range && options_.skip_get_fence_status == SkipGetFenceStatus::SkipAll)
-    {
-        return result;
-    }
-
     // Check for fences that need to be removed.
     if (shadow_fences_.empty())
     {
@@ -3386,23 +3369,42 @@ VkResult VulkanReplayConsumerBase::OverrideWaitForFences(PFN_vkWaitForFences
         modified_fences = valid_fences.data();
     }

-    if (original_result == VK_SUCCESS)
+    // If the timeout is 0, then we suppose this "wait for fence" is in fact a "get fence status" and should be skipped
+    // accordingly.
+    bool in_skip_range = false;
+    if (timeout == 0)
     {
-        // Ensure that wait for fences waits until the fences have been signaled (or error occurs) by changing the
-        // timeout to UINT64_MAX.
-        if (modified_fence_count > 0)
+        // Check if the call is in a frame range for being skipped (see --skip-get-fence-ranges,
+        // --skip-get-fence-status)
+        in_skip_range = options_.skip_get_fence_ranges.empty();
+        const uint32_t current_frame = application_->GetCurrentFrameNumber() + 1;
+        for (const util::UintRange& range : options_.skip_get_fence_ranges)
         {
-            result = func(device, modified_fence_count, modified_fences, waitAll, std::numeric_limits<uint64_t>::max());
+            if (current_frame >= range.first && current_frame <= range.last)
+            {
+                in_skip_range = true;
+                break;
+            }
         }
     }
-    else
+
+    if (in_skip_range && options_.skip_get_fence_status == SkipGetFenceStatus::SkipAll)
     {
-        if (in_skip_range && options_.skip_get_fence_status == SkipGetFenceStatus::SkipUnsuccessful)
+        // Nothing.
+    }
+    else if (modified_fence_count > 0)
+    {
+        if (original_result == VK_SUCCESS)
         {
-            return result;
+            // Ensure that wait for fences waits until the fences have been signaled (or error occurs) by changing the
+            // timeout to UINT64_MAX.
+            result = func(device, modified_fence_count, modified_fences, waitAll, std::numeric_limits<uint64_t>::max());
         }
-
-        if (original_result == VK_TIMEOUT)
+        else if (in_skip_range && options_.skip_get_fence_status == SkipGetFenceStatus::SkipUnsuccessful)
+        {
+            // Nothing.
+        }
+        else if (original_result == VK_TIMEOUT)
         {
             // Try to get a timeout result with a 0 timeout.
             result = func(device, modified_fence_count, modified_fences, waitAll, 0);
@@ -3427,6 +3429,11 @@ VkResult VulkanReplayConsumerBase::OverrideGetFenceStatus(PFN_vkGetFenceStatus
     VkDevice device = device_info->handle;
     VkFence fence = fence_info->handle;

+    if (shadow_fences_.find(fence) != shadow_fences_.end())
+    {
+        return result;
+    }
+
     // Check if the call is in a frame range for being skipped (see --skip-get-fence-ranges, --skip-get-fence-status)
     bool in_skip_range = options_.skip_get_fence_ranges.empty();
     const uint32_t current_frame = application_->GetCurrentFrameNumber() + 1;
@@ -3446,17 +3453,17 @@ VkResult VulkanReplayConsumerBase::OverrideGetFenceStatus(PFN_vkGetFenceStatus
         return result;
     }

-    if (shadow_fences_.find(fence) != shadow_fences_.end())
-    {
-        return result;
-    }
+    result = func(device, fence);

-    // If you find this loop to be infinite consider adding a limit in the same way
-    // it is done for GetEventStatus and GetQueryPoolResults.
-    do
+    // We don't want the replay to continue if fence was ready at capture time but is not at replay time because future
+    // calls might use the resources depending on that fence...
+    if (original_result == VK_SUCCESS && result == VK_NOT_READY)
     {
-        result = func(device, fence);
-    } while ((original_result == VK_SUCCESS) && (result == VK_NOT_READY));
+        const encode::VulkanDeviceTable* device_table = GetDeviceTable(device);
+        GFXRECON_ASSERT(device_table != nullptr);
+
+        result = device_table->WaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);
+    }

     return result;
 }
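
The frame-range check appears in both `OverrideWaitForFences` and `OverrideGetFenceStatus`. As a sketch only, it could be expressed as a free function like the one below. `IsFrameInSkipRange` is a hypothetical name, and the local `UintRange` struct is a stand-in for `util::UintRange`, assumed here to expose only the `first` and `last` members used in the diff.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for util::UintRange (only `first` and `last` are assumed).
struct UintRange
{
    uint32_t first;
    uint32_t last;
};

// Returns true when `frame` falls inside any inclusive range.
// An empty list means "every frame", mirroring the
// `in_skip_range = options_.skip_get_fence_ranges.empty()` initialization in the diff.
bool IsFrameInSkipRange(const std::vector<UintRange>& ranges, uint32_t frame)
{
    if (ranges.empty())
    {
        return true;
    }

    for (const UintRange& range : ranges)
    {
        if (frame >= range.first && frame <= range.last)
        {
            return true;
        }
    }

    return false;
}
```

In this PR the `OverrideWaitForFences` path would only consult such a check when `timeout == 0`, while `OverrideGetFenceStatus` would consult it unconditionally.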
2 changes: 1 addition & 1 deletion framework/encode/api_capture_manager.h
@@ -103,7 +103,7 @@ class ApiCaptureManager

     void WriteFrameMarker(format::MarkerType marker_type) { common_manager_->WriteFrameMarker(marker_type); }

-    void EndFrame(std::shared_lock<CommonCaptureManager::ApiCallMutexT>& current_lock)
+    virtual void EndFrame(std::shared_lock<CommonCaptureManager::ApiCallMutexT>& current_lock)
     {
         common_manager_->EndFrame(api_family_, current_lock);
     }