[EPIC] A collection of tickets for improved WASM support in DataFusion #13815


Open
3 of 9 tasks
alamb opened this issue Dec 17, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Dec 17, 2024

Is your feature request related to a problem or challenge?

DataFusion can be compiled today to WASM with some care. However it is somewhat tricky and the experience could be improved.

Describe the solution you'd like

This ticket tracks various ways we could make the experience better

Describe alternatives you've considered

Additional context

No response

@alamb alamb added the enhancement New feature or request label Dec 17, 2024
@alamb alamb changed the title from "[EPIC] A collection of tickets for supporting WASM in DataFusion" to "[EPIC] A collection of tickets for improved WASM support in DataFusion" Dec 17, 2024
@savaliyabhargav

Dear @alamb,

I hope you’re doing well.

I’m very interested in contributing to the WASM support for DataFusion project as part of GSoC 2025. Enhancing embeddability and ensuring robust WASM integration aligns with my passion for building reliable and maintainable software. I’d greatly appreciate it if you could share more details on the project’s current challenges and key milestones. Understanding these aspects will help me prepare a thoughtful and well-structured proposal.

Thank you for your time and guidance.

Best regards,
bhargav

@alamb
Contributor Author

alamb commented Mar 17, 2025

> I’m very interested in contributing to the WASM support for DataFusion project as part of GSoC 2025. Enhancing embeddability and ensuring robust WASM integration aligns with my passion for building reliable and maintainable software

Thank you @savaliyabhargav

I think the key challenge is the same as in this ticket's description:

> DataFusion can be compiled today to WASM with some care. However it is somewhat tricky and the experience could be improved.

I suggest some key milestones:

  1. Write a blog post explaining what WASM is, why it is important, and how to use DataFusion with it
  2. Add example / section to the documentation about compiling and building with WASM
  3. Make several example WASM apps with DataFusion -- for example Browser-accessible official DataFusion playground / DataFusion fiddle #13818
  4. Stretch goal -- prototype how we would support WASM user defined functions

@matthewmturner
Contributor

matthewmturner commented Mar 17, 2025

> 4. Stretch goal -- prototype how we would support WASM user defined functions

On this point, I was able to get WASM UDFs working in dft (I'm sure there is room for improvement, but I got both the WASM Native and Arrow IPC formats working), and as part of the next release I was planning to publish that crate so others could use it. @savaliyabhargav I would be happy to collaborate on that if you are interested in it.

@savaliyabhargav

@matthewmturner Yes, sure, I am interested. Can you please give me more details about it?

@matthewmturner
Contributor

The WASM UDFs just need some more real-world testing / benchmarking. To be honest, the other points @alamb mentioned would probably benefit the DataFusion ecosystem more at this stage. But if WASM UDFs are what interests you, then you could open an issue on the dft repo and we could discuss some tangible next steps for that.

@qstommyshu
Contributor

Hi, I'm also interested in working on this project for GSoC 2025. I'm currently a master's student at Georgia Institute of Technology. I gained some experience with WASM and Rust during my undergraduate studies, and I would like to see how that can be incorporated into DataFusion!

Are there any prerequisites for this project? I suppose the points below would be the key points for our proposal, right?

> I suggest some key milestones:
>
>   1. Write a blog post explaining what WASM is, why it is important, and how to use DataFusion with it
>   2. Add example / section to the documentation about compiling and building with WASM
>   3. Make several example WASM apps with DataFusion -- for example #13818
>   4. Stretch goal -- prototype how we would support WASM user defined functions

I'm new to DataFusion, but I've known of it for a long time.

  1. Can you please point me to some relevant materials for this project? Or would working on some "good first issue" tickets be enough for me to get familiar with DataFusion?
  2. Can you please give me some guidance on how to write a precise and appropriate proposal?

Thanks in advance!

@savaliyabhargav

@matthewmturner Please provide me with some information on dft so that I can learn about it and work on your project effectively.


4 participants