Support task scheduling priorities in PyRosettaCluster #571
This PR adds support for finer control of task execution orchestration in PyRosettaCluster by exposing Dask's task priority API, which controls how the Dask scheduler orders queued work. There are two major task execution patterns the user may wish to follow when setting up a PyRosettaCluster simulation:
- **Breadth-first task execution:** Currently, tasks always run in roughly first-in, first-out (FIFO) order. When Dask worker resources are saturated (a typical scenario), all submitted tasks have equal priority, so execution is front-loaded toward the upstream user-defined PyRosetta protocols, and the downstream protocols are delayed until all tasks have finished the upstream protocols.
- **Depth-first task execution:** This PR enables task chains to run to completion by allowing the user to explicitly raise the priority of tasks submitted to downstream user-defined PyRosetta protocols. When Dask worker resources are saturated, a task that finishes an upstream protocol is submitted to the next downstream protocol with a higher priority than the tasks still queued for the upstream protocols, so task chains may run through all protocols to completion (see the sketch after this list).
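To make the contrast concrete, here is a minimal sketch of the underlying Dask mechanism using plain `dask.distributed` (not PyRosettaCluster's internal code); the protocol stand-ins and cluster sizing are illustrative assumptions:

```python
from dask.distributed import Client, LocalCluster

def protocol_1(i):
    return i  # stand-in for an upstream user-defined PyRosetta protocol

def protocol_2(result):
    return result  # stand-in for a downstream user-defined protocol

if __name__ == "__main__":
    # 10 worker threads total, far fewer than the number of tasks.
    client = Client(LocalCluster(n_workers=1, threads_per_worker=10))

    # Breadth-first (current behavior) would use equal priorities everywhere,
    # so protocol_1 tasks tend to run before any protocol_2 task starts.
    # Depth-first (this PR's pattern): downstream tasks get a higher priority,
    # so a finished protocol_1 result proceeds to protocol_2 ahead of the
    # still-queued protocol_1 tasks.
    upstream = [client.submit(protocol_1, i, priority=0) for i in range(1_000)]
    downstream = [client.submit(protocol_2, f, priority=1) for f in upstream]
    client.gather(downstream)
```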
For example, to run user-defined PyRosetta protocols with depth-first task execution, this PR implements the `priorities` keyword argument, where higher priorities take precedence.
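Below is a minimal sketch of the intended usage, assuming `priorities` is accepted by `PyRosettaCluster.distribute()` alongside the protocols; the exact placement of the keyword, the task dictionaries, and the protocol bodies are illustrative assumptions, not code taken from this PR:

```python
from dask.distributed import Client
from pyrosetta.distributed.cluster import PyRosettaCluster

def protocol_1(packed_pose, **kwargs):
    # Upstream user-defined PyRosetta protocol (stand-in body).
    return packed_pose

def protocol_2(packed_pose, **kwargs):
    # Downstream user-defined PyRosetta protocol (stand-in body).
    return packed_pose

client = Client()  # connect to a running Dask scheduler

PyRosettaCluster(
    tasks=[{"options": "-ex1"} for _ in range(10_000)],  # 10,000 tasks
    client=client,
).distribute(
    protocols=[protocol_1, protocol_2],
    # Assumed placement of the new keyword: one priority per protocol,
    # where higher values take precedence (depth-first execution).
    priorities=[0, 1],
)
```

With `priorities=[0, 1]`, each task that finishes `protocol_1` is promoted ahead of the remaining `protocol_1` queue, which yields the depth-first pattern described above.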
Say the user has 10,000 tasks and only 10 Dask worker threads to run on; then with depth-first task execution, the process is as follows:

1. All 10,000 tasks are queued to run `protocol_1`.
2. The first 10 tasks run `protocol_1` on available Dask worker resources.
3. As tasks finish `protocol_1`, they immediately are scheduled to run `protocol_2` before the other 9,990 tasks queued to run `protocol_1` are scheduled.
4. As tasks finish `protocol_2`, their results are saved to disk, and the next 10 tasks immediately are scheduled to run `protocol_1`.

Note that in distributed cluster scenarios, tasks are scheduled on the remote cluster asynchronously from task submission by the client. Because of normal cluster-specific network latencies, a Dask worker may receive even a higher-priority task after a short delay, so the behavior is slightly nondeterministic in practice; in general, however, the task execution pattern follows the user's priority specifications.