-
Notifications
You must be signed in to change notification settings - Fork 419
feat: adds format, regex_extract function and more type tests #7107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Dennis Zhuang <[email protected]>
WHERE a IS NOT NULL | ||
ORDER BY a - (SELECT MIN(a) FROM test WHERE a IS NOT NULL); | ||
|
||
Error: 1001(Unsupported), This feature is not implemented: Physical plan does not support logical expression ScalarSubquery(<subquery>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's known issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds enhanced string function support and comprehensive type testing to GreptimeDB. The primary purpose is to implement the FORMAT
function with escape character support, the REGEXP_EXTRACT
function for regex-based text extraction, and expand test coverage for various data types including Unicode strings, floating-point edge cases, and SQL operations.
Key Changes:
- Implementation of
FORMAT
andREGEXP_EXTRACT
string functions with proper escaping and regex support - Addition of comprehensive test suites for string functions, Unicode handling, NaN/infinity arithmetic, and SQL operations
- Upgrade of regex dependency to version 1.12 for improved functionality
Reviewed Changes
Copilot reviewed 53 out of 54 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
src/common/function/src/scalars/string/* |
New string function implementations including format and regex_extract |
tests/cases/standalone/common/function/string/* |
Comprehensive string function test suites |
tests/cases/standalone/common/types/float/* |
Floating-point type tests covering NaN, infinity, and IEEE compliance |
tests/cases/standalone/common/types/string/* |
Unicode and large string handling tests |
tests/cases/standalone/common/order/* |
ORDER BY functionality tests |
tests/cases/standalone/common/sample/* |
Table sampling tests |
Cargo.toml and related |
Regex dependency upgrade to version 1.12 |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Signed-off-by: Dennis Zhuang <[email protected]>
Signed-off-by: Dennis Zhuang <[email protected]>
Signed-off-by: Dennis Zhuang <[email protected]>
if let Some(name) = name_opt | ||
&& needed_names.contains(name) | ||
{ | ||
if i + 1 >= values.len() { | ||
return Err(DataFusionError::Execution(format!( | ||
"FORMAT: named argument '{}' has no corresponding value", | ||
name | ||
))); | ||
} | ||
named_map.insert(name.clone(), &values[i + 1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the positional value has the same value as the needed_names? We may not be able to distinguish that because the needed_names only contains the name, without the position of the format name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! We should require the name parameter to appear at the end of the parameter list, similar to Python or Rust macros.
Let me fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@evenyag I removed the format
function for now, as it’s not the right time to add it before we consider it carefully. Let’s remove it temporarily.
PTAL.
Signed-off-by: Dennis Zhuang <[email protected]>
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
#1168
What's changed and what's your intention?
Main changes:
format(fmt, arg1, arg2...)
andregex_extract
functions.Date64
support fordate_format
.Remaining issues:
TODO:
format(fmt, arg1, arg2...)
andregex_extract
functions to Datafusion and remove our custom implementations.PR Checklist
Please convert it to a draft if some of the following conditions are not met.