DuckDB, Postgres, SQLite: NOT NULL and NOTNULL expressions #1927

Open
wants to merge 9 commits into
base: main

Conversation

Contributor

@ryanschneider ryanschneider commented Jul 5, 2025

Fixes #1920 by adding NOT NULL as a general alias for IS NOT NULL (only in ParserState::Normal), and adds the non-standard NOTNULL keyword supported by DuckDB, SQLite, and Postgres. It also adds a new ParserState::ColumnDefinition to avoid incorrectly parsing NOT NULL as IS NOT NULL in column definitions.

Since this is my first non-trivial PR please provide any feedback! I confirmed all tests pass w/ cargo test and cargo test --all-features but if there's anything else I need to do please let me know, thanks!


/// Returns true if the dialect supports the passed in alias.
/// See [IsNotNullAlias].
fn supports_is_not_null_alias(&self, alias: IsNotNullAlias) -> bool {
Contributor

Would it make sense to make this behavior dialect-agnostic and let the parser always accept any variant that shows up? That would let us skip this dialect method.

Contributor Author

Hmm, the main issue I see w/ that approach is that technically x NOTNULL is shorthand for x AS NOTNULL in dialects that don't support NOTNULL but do support aliases w/o AS, so this could be considered a breaking change if a query happened to be using NOTNULL as a column alias.
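
The ambiguity described above can be sketched with a toy classifier. This is not sqlparser-rs code: `Trailing`, `classify_trailing_word`, and the `supports_notnull_operator` flag are hypothetical names, used only to show how the same input can mean two different things depending on the dialect.

```rust
/// How a bare word following an expression is interpreted (toy model).
#[derive(Debug, PartialEq)]
enum Trailing {
    /// `x NOTNULL` read as the postfix operator `x IS NOT NULL`.
    IsNotNull,
    /// `x NOTNULL` read as `x AS NOTNULL`, i.e. an alias written without `AS`.
    Alias(String),
}

/// Hypothetical helper: decide what `word` means right after an expression.
/// `supports_notnull_operator` stands in for a per-dialect setting.
fn classify_trailing_word(word: &str, supports_notnull_operator: bool) -> Trailing {
    if supports_notnull_operator && word.eq_ignore_ascii_case("NOTNULL") {
        Trailing::IsNotNull
    } else {
        Trailing::Alias(word.to_string())
    }
}
```

Accepting the operator unconditionally would amount to always passing `true` here, which silently changes the meaning of any existing query that uses NOTNULL as an alias.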

Contributor

Ah, makes sense! That behavior would only apply to the x NOTNULL scenario, right? If so, for x NOT NULL we should be able to parse it unambiguously, like most other operators. Thinking if so then we could get away with only introducing a dialect method supports_notnull_operator() and parse the latter unconditionally?

Contributor Author

> Thinking if so then we could get away with only introducing a dialect method supports_notnull_operator() and parse the latter unconditionally?

Sounds good, I like that change; it gets rid of the new enum as well. I'll re-request review when it's ready, thanks!

Contributor Author

Hmm, I made the change, but now some of the CREATE TABLE tests are failing around the IsNotNull column options. I think I broke something w/ precedence after all, so I need to take a closer look at the parsing.

Contributor Author

Ok, I was able to make it work, but it required an ugly special case in parse_subexpr. The good news is with this special case I didn't need to do any precedence changes for NOT NULL, just NOTNULL.

If you have any suggestions on how to better handle this special case, please let me know! The reason I need to handle it there is that get_next_precedence() doesn't know about the current expr. One fix would be to pass expr to get_next_precedence(expr); then the precedence of NOT NULL could be dynamic if the current expr is an Expr::Identifier or Value::Null, and the logic could be moved back into parse_infix.

Alternatively, I could move the special case into a helper function to compartmentalize the changes, but figured I'd get your feedback before going too far. Thanks!

Contributor Author

^ @iffyio (sorry, forgot to mention you in the previous message)

Contributor Author

I forgot to explain what I'm trying to work around: in CREATE TABLE statements, NOT NULL should never be considered an Expr::IsNotNull, but it's getting parsed as such by the parse_expr calls in parse_optional_column_option on column definitions like foo DEFAULT true NOT NULL.

So an alternative to the above special case would be to add a "mode" to parse_expr so it knows when it's parsing column definitions vs. other expressions, or maybe I could replace the calls to parse_expr in parse_optional_column_option with calls to parse_subexpr(some_high_precedence) where some_high_precedence is high enough to prevent it from considering the NOT NULL.

Contributor Author

Hmm, what about adding a ParserState::ColumnDefinition and using the state in get_next_precedence_default() to decide if NOT NULL should have Is precedence? If you're ok with this approach I think it's the cleanest way to handle it.
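
A minimal sketch of that idea, assuming a simplified precedence lookup. The enum, the constants, and `next_precedence` below are hypothetical stand-ins rather than the crate's actual definitions, and the numeric values are arbitrary.

```rust
/// Toy parser states; `ColumnDefinition` mirrors the state proposed above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ParserState {
    Normal,
    ColumnDefinition,
}

/// Illustrative precedence values; the real parser's numbers may differ.
const PREC_UNKNOWN: u8 = 0;
const PREC_IS: u8 = 17;

/// Hypothetical precedence lookup: given the next two upper-cased words,
/// report how tightly they bind as an operator on the current expression.
fn next_precedence(state: ParserState, next: &str, after: &str) -> u8 {
    match (next, after) {
        ("IS", _) => PREC_IS,
        // `NOT NULL` only acts as an operator outside column definitions,
        // where it has to remain available as a column constraint.
        ("NOT", "NULL") if state == ParserState::Normal => PREC_IS,
        _ => PREC_UNKNOWN,
    }
}
```

With this shape, parsing the DEFAULT expression in foo DEFAULT true NOT NULL under `ParserState::ColumnDefinition` would stop at true, leaving NOT NULL for the column-option parser.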

Comment on lines 654 to 657
            Token::Word(w)
                if w.keyword == Keyword::NULL
                    && self.supports_is_not_null_alias(NotSpaceNull) =>
            {
                Ok(p!(Is))
            }
            _ => Ok(self.prec_unknown()),
        },
        Token::Word(w)
            if w.keyword == Keyword::NOTNULL && self.supports_is_not_null_alias(NotNull) =>
        {
            Ok(p!(Is))
        }
Contributor

For the new operators, can we add a test case that covers their precedence behavior? See here for example. If we don't already have coverage, it would also be good to ensure that we aren't inadvertently changing the precedence of the IS NOT NULL operator, so we could include a test case demonstrating that as well.
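
To illustrate the kind of grouping such a precedence test would pin down, here is a toy Pratt parser. It is a sketch under stated assumptions, not the crate's parser or test helpers: `Toy`, `group`, and the precedence constants are hypothetical, keywords must be upper-case, and operands are single identifiers.

```rust
/// Illustrative binding powers; the real parser's values may differ.
const PREC_AND: u8 = 10;
const PREC_IS: u8 = 17;

/// Toy Pratt parser over whitespace-separated tokens. It renders the parse
/// as a fully parenthesized string so grouping is easy to assert on.
struct Toy<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> Toy<'a> {
    fn new(sql: &'a str) -> Self {
        Toy { tokens: sql.split_whitespace().collect(), pos: 0 }
    }

    /// Token at `offset` from the cursor, or "" past the end of input.
    fn peek(&self, offset: usize) -> &'a str {
        self.tokens.get(self.pos + offset).copied().unwrap_or("")
    }

    /// Number of tokens in a null-test operator at the cursor, if any.
    fn null_test_len(&self) -> Option<usize> {
        match (self.peek(0), self.peek(1), self.peek(2)) {
            ("IS", "NOT", "NULL") => Some(3),
            ("NOT", "NULL", _) => Some(2),
            ("NOTNULL", _, _) => Some(1),
            _ => None,
        }
    }

    /// Parse an expression whose operators all bind tighter than `min_prec`.
    fn parse(&mut self, min_prec: u8) -> String {
        let mut lhs = self.peek(0).to_string(); // operand: a single identifier
        self.pos += 1;
        loop {
            if let Some(len) = self.null_test_len() {
                if PREC_IS <= min_prec {
                    break;
                }
                self.pos += len;
                // All three spellings normalize to the canonical form.
                lhs = format!("({} IS NOT NULL)", lhs);
            } else if self.peek(0) == "AND" && PREC_AND > min_prec {
                self.pos += 1;
                let rhs = self.parse(PREC_AND);
                lhs = format!("({} AND {})", lhs, rhs);
            } else {
                break;
            }
        }
        lhs
    }
}

/// Parse `sql` and return its fully parenthesized grouping.
fn group(sql: &str) -> String {
    Toy::new(sql).parse(0)
}
```

In this toy, all three spellings bind tighter than AND and produce the same grouping, which is the property a real test would assert for NOT NULL, NOTNULL, and the existing IS NOT NULL.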

Contributor Author

Added in 637fb69

// All other dialects consider `x NOTNULL` like `x AS NOTNULL` and thus
// consider `NOTNULL` an alias for x.
let sql = r#"WITH t AS (SELECT NULL AS x) SELECT x NOTNULL FROM t"#;
let canonical = r#"WITH t AS (SELECT NULL AS x) SELECT x AS NOTNULL FROM t"#;
Contributor

I think we can simplify this test. Since we're only interested in testing the expression, doing so via a CTE makes the test verbose and harder to maintain going forward without providing additional coverage. (Note that we also have verified_expr() helper functions to make expression testing easier.)

Contributor Author

Thanks! I simplified the tests in 431fca3. Thanks for the pointers; they are much more readable now!

@ryanschneider ryanschneider requested a review from iffyio July 8, 2025 20:33
@ryanschneider ryanschneider force-pushed the not-null-and-notnull-support branch from 307c465 to ab6607b Compare July 11, 2025 00:21
@ryanschneider
Contributor Author

Rebased against main to resolve conflicts.

@ryanschneider
Contributor Author

@iffyio I went with the new ParserState::ColumnDefinition idea mentioned here: #1927 (comment). I think it leads to the cleanest code changes; let me know what you think! Also merged in main and resolved conflicts.

Development

Successfully merging this pull request may close these issues.

Parser: Cannot parse COUNT(CASE WHEN x NOT NULL THEN 1 END)
2 participants