feat: Add support for cloud object storage (e.g. s3) based shuffle #48

andygrove · 2024-11-17T17:44:54Z

Follows on from #47

Closes #46

Description

Shuffle files are still written to local disk in ShuffleWriterExec, but they are then uploaded to object storage.
ShuffleReaderExec then downloads the shuffle files from object storage and deletes them.
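
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `InMemoryObjectStore`, `write_shuffle_partition`, `read_shuffle_partition`, and the key layout are hypothetical names, and a plain dictionary stands in for S3/MinIO. It also assumes that "deletes them" covers both the remote object and the downloaded local copy, which the description does not spell out.

```python
import os
import tempfile


class InMemoryObjectStore:
    """Hypothetical stand-in for a cloud object store such as S3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

    def keys(self):
        return sorted(self._objects)


def write_shuffle_partition(store, local_dir, key, payload):
    """Writer side: write the shuffle file to local disk, then upload it."""
    local_path = os.path.join(local_dir, key.replace("/", "_"))
    with open(local_path, "wb") as f:
        f.write(payload)
    with open(local_path, "rb") as f:
        store.put(key, f.read())
    return local_path


def read_shuffle_partition(store, local_dir, key):
    """Reader side: download the shuffle file, read it, then clean up."""
    local_path = os.path.join(local_dir, "dl_" + key.replace("/", "_"))
    with open(local_path, "wb") as f:
        f.write(store.get(key))
    # Assumption: the reader removes the remote object once it has a copy.
    store.delete(key)
    with open(local_path, "rb") as f:
        data = f.read()
    # Assumption: the downloaded local copy is removed after reading.
    os.remove(local_path)
    return data


def demo():
    store = InMemoryObjectStore()
    with tempfile.TemporaryDirectory() as tmp:
        key = "shuffle/stage_1/part_0"  # hypothetical key layout
        write_shuffle_partition(store, tmp, key, b"rows")
        data = read_shuffle_partition(store, tmp, key)
    return data, store.keys()
```

Calling `demo()` round-trips one partition through the stand-in store and leaves no objects behind. Deleting on read only works if each shuffle partition has exactly one reader, so a real implementation would need to confirm that.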

edmondop · 2024-11-18T14:18:23Z

datafusion_ray/context.py

        final_stage_id = graph.get_final_query_stage().id()
-        partitions = schedule_execution(graph, final_stage_id, True)
+        # serialize the query stages and store in Ray object store
+        query_stages = [


I am a little puzzled: are these the query_stages or the serialized execution plans?

These are the serialized execution plans for each query stage:

graph.get_query_stage(i).get_execution_plan_bytes()

edmondop reviewed Nov 18, 2024


andygrove changed the title ~~feat: Add support for object storage based shuffle~~ feat: Add support for cloud object storage (e.g. s3) based shuffle Nov 18, 2024

Use object store to transfer shuffle files between writers and readers

96495b7

andygrove force-pushed the minio branch from 7d08c06 to 96495b7 Compare November 19, 2024 15:13

andygrove closed this Feb 28, 2025

andygrove deleted the minio branch February 28, 2025 18:52
