Polars: Chapter 14 - User-Defined Functions #57

Merged

peter-gy
Contributor

@peter-gy peter-gy commented Mar 9, 2025

📝 Summary

This chapter demonstrates realistic scenarios where the flexibility of user-defined functions (UDFs) outweighs their performance cost. Along the way, it gives an overview of the Polars expressions relevant to UDFs and makes use of the interactive Marimo environment.

In the "🚀 Higher-performance UDFs" section I introduce concepts like NumPy ufuncs and generalized ufuncs but deliberately keep the detail light, since the course outline (#40) already has a dedicated "NumPy functions" chapter for that.

Chapter outline:

  • ⚖️ The Cost of UDFs
  • 📊 Project Overview
  • 🔂 Element-Wise UDFs
  • 📦 Batch-Wise UDFs
  • ⚙️ Row-Wise UDFs
  • 🚀 Higher-performance UDFs
  • ⏱️ Quantifying the Overhead

You can try this notebook by executing:

uvx marimo run https://github.com/peter-gy/learn/blob/33b7a6233c066885fc3fb127c5e2f2f4d769c195/polars/14_user_defined_functions.py --sandbox

📋 Checklist

  • I have included package dependencies in the notebook file using --sandbox
  • Keep language direct and simple.

@mscolnick
Contributor

wow, this is such a great notebook. great stuff @peter-gy

Collaborator

@Haleshot Haleshot left a comment

Wow, it was a great experience to review the notebook! Really informative, and packed with native UI elements to play around with to see their effect on the relevant plots.

From the opening (why UDFs are the better choice in some cases despite their performance drawbacks) to the closing section on how combining them with Numba can improve efficiency, this notebook was really informative.

Thanks a ton for this high-quality notebook contribution 😃

@Haleshot Haleshot merged commit e338d9a into marimo-team:main Mar 10, 2025
1 check passed
@peter-gy peter-gy mentioned this pull request Mar 12, 2025
3 participants