Preallocate Ray Workers #62
Merged
Significant time is spent allocating `StageService`s as the python actor `RayStage` and waiting for them to bind to a listening port.

This changes the semantics of `RayContext` and `RayDataFrame` such that the `RayQuerySupervisor` is now created when the `RayContext` is created and a pool of `RayStage`s is preallocated. When a `RayDataFrame` is created by the context's `sql()` method, stages are calculated and that number of `RayStage` actors is requested from the pool. When the query is finished, instead of tearing down these actors, they are simply returned to the pool.

The pool size is parameterized by min and max values. The pool preallocates the minimum number of workers and can grow up to the maximum. Requesting workers beyond the maximum size raises an exception. The pool is released and Ray resources are torn down when the `RayContext` goes out of scope.

This change makes a significant difference on TPCH benchmarks. Tested at SF100, it improved results by 25% on a machine with a very fast disk, where the overhead of creating and tearing down Ray resources was a large chunk of execution time.
This PR does not handle the pool shrinking back to the minimum size, only growing; let's handle that in a subsequent change.
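A minimal sketch of the pooling behaviour described above, assuming a simplified `WorkerPool` and a placeholder `Worker` actor (the class and method names here are illustrative, not the actual implementation):

```python
import ray


@ray.remote
class Worker:
    """Placeholder for a long-lived stage actor."""

    def __init__(self, name: str):
        self.name = name


class WorkerPool:
    """Preallocates `min_size` actors, grows on demand up to `max_size`,
    and raises once a request would exceed the maximum."""

    def __init__(self, min_size: int, max_size: int):
        self.max_size = max_size
        self.total = min_size
        self.idle = [Worker.remote(f"worker-{i}") for i in range(min_size)]

    def acquire(self, n: int) -> list:
        # Grow the pool if the idle set cannot satisfy the request.
        while len(self.idle) < n and self.total < self.max_size:
            self.idle.append(Worker.remote(f"worker-{self.total}"))
            self.total += 1
        if len(self.idle) < n:
            raise RuntimeError(f"requested {n} workers, pool max is {self.max_size}")
        taken, self.idle = self.idle[:n], self.idle[n:]
        return taken

    def release(self, workers: list) -> None:
        # Finished queries return their actors instead of tearing them down.
        self.idle.extend(workers)

    def shutdown(self) -> None:
        # Called when the owning context goes out of scope.
        for w in self.idle:
            ray.kill(w)
        self.idle.clear()
```

A query would call `acquire()` with the number of stages it needs and `release()` the same actors when it completes.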
The `tpcbench.py` benchmark script and the `tpc.py` script accept `--worker-pool-min`.
As `RayStage` actors are now longer lived, they were updated to accept updated `ExecutionPlan`s to serve. This makes debugging issues with `RayStage`s a little more difficult, as it no longer makes sense to name them after the stage they are hosting, because that can change. As such, they now receive friendly, human-readable unique names, which makes reading debug and info output much easier.
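For illustration, a reusable stage actor could look roughly like the sketch below; `update_plan`, `name`, and the naming scheme are assumptions for the example, not the actual `RayStage` API:

```python
import ray
import uuid


@ray.remote
class Stage:
    """Illustrative long-lived stage actor."""

    def __init__(self):
        # A stable, human-readable identity that survives across queries,
        # independent of whichever plan the actor is currently serving.
        # (Naming scheme is hypothetical.)
        self.friendly_name = f"stage-actor-{uuid.uuid4().hex[:8]}"
        self.plan = None

    def update_plan(self, serialized_plan: bytes) -> None:
        # Because the actor outlives a single query, it accepts a fresh
        # execution plan each time it is checked out of the pool.
        self.plan = serialized_plan

    def name(self) -> str:
        return self.friendly_name
```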