[spark] Add max_pt function #5088

Open: wants to merge 2 commits into base: master

ulysses-you (Contributor) commented Feb 14, 2025

Purpose

Adds a new Spark function, max_pt. It accepts a string literal (the table name, as in the example below) and returns the max valid toplevel partition value of that table.

  • valid means the partition contains data files
  • toplevel means only the first partition column's value is returned if the table has multiple partition columns

max_pt throws an exception when:

  • the table is not a partitioned table
  • the partitioned table has no partitions
  • none of the partitions contain data files
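
The rules above can be modeled in a few lines. This is a minimal Python sketch of the described semantics only, not the Paimon implementation: the `max_pt_model` function, the `MaxPtError` class, and the in-memory table representation (partition column names plus a mapping from partition-value tuples to data-file counts) are hypothetical. It also assumes partition values compare as plain strings, which the description does not specify.

```python
class MaxPtError(Exception):
    """Raised when no max partition can be determined (hypothetical name)."""


def max_pt_model(partition_keys, partitions):
    """Model of the max_pt rules described in the PR text.

    partition_keys: list of partition column names, e.g. ["pt", "hr"].
    partitions: dict mapping a tuple of partition values (one per key)
                to the number of data files in that partition.
    """
    if not partition_keys:
        raise MaxPtError("table is not partitioned")
    if not partitions:
        raise MaxPtError("table has no partitions")
    # "valid": keep only partitions that contain at least one data file.
    valid = [values for values, file_count in partitions.items() if file_count > 0]
    if not valid:
        raise MaxPtError("no partition contains data files")
    # "toplevel": compare and return only the first partition column's value.
    # Assumption: values are compared as strings.
    return max(values[0] for values in valid)


# Made-up table with two partition columns; the newest partition is empty.
parts = {
    ("20241231", "00"): 3,
    ("20250101", "00"): 1,
    ("20250102", "00"): 0,
}
result = max_pt_model(["pt", "hr"], parts)
```

Here `result` is "20250101": the "20250102" partition is skipped because it has no data files, and only the first partition column's value is returned.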

This PR adds max_pt through the Spark V2 function catalog using a fake (placeholder) scalar function, and then adds a rule that replaces each max_pt call with a literal value during analysis.
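
The replace-during-analysis step can be pictured with a toy expression tree. The sketch below is plain Python and does not use Spark's analyzer or function catalog APIs; `Literal`, `Call`, `replace_max_pt`, and the `resolve` callback are all hypothetical names chosen for illustration, and the case-insensitive name match is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Call:
    name: str
    args: Tuple["Expr", ...]


Expr = Union[Literal, Call]


def replace_max_pt(expr: Expr, resolve: Callable[[str], str]) -> Expr:
    """Rewrite every max_pt('<table>') call into a Literal, bottom-up.

    resolve: hypothetical callback that maps a table name to its
             max partition value (it stands in for the table lookup).
    """
    if isinstance(expr, Literal):
        return expr
    # Rewrite the arguments first so nested calls are handled too.
    args = tuple(replace_max_pt(arg, resolve) for arg in expr.args)
    if expr.name.lower() == "max_pt":
        # The argument must be a single string literal (the table name).
        if len(args) != 1 or not isinstance(args[0], Literal):
            raise ValueError("max_pt expects exactly one string literal")
        return Literal(resolve(args[0].value))
    return Call(expr.name, args)


# concat('pt=', max_pt('t')) with a stubbed lookup for table 't'.
tree = Call("concat", (Literal("pt="), Call("max_pt", (Literal("t"),))))
rewritten = replace_max_pt(tree, lambda table: "20250101")
```

After the rewrite, the placeholder call is gone and `rewritten` is the `concat` call over two literals, so nothing downstream ever has to evaluate max_pt itself.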

Example:

SELECT max_pt('t')
=>
20250101

API and Format

no

Documentation

add sql-functions docs

JingsongLi (Contributor) left a comment

Maybe add a new documentation page: sql-function?

ulysses-you (Contributor, Author) commented

@JingsongLi added docs

2 participants