Skip to content

Conversation

Weijun-H
Copy link
Member

@Weijun-H Weijun-H commented Oct 11, 2025

Which issue does this PR close?

Rationale for this change

DataFusion CLI v50.1.0
> SET TIME ZONE = '+08:00';
0 row(s) fetched. 
Elapsed 0.011 seconds.

> SELECT arrow_typeof(now());
+---------------------------------------+
| arrow_typeof(now())                   |
+---------------------------------------+
| Timestamp(Nanosecond, Some("+08:00")) |
+---------------------------------------+
1 row(s) fetched. 
Elapsed 0.015 seconds.

> SELECT count(1) result FROM (SELECT now() as n) a WHERE n > '2000-01-01'::date;
+--------+
| result |
+--------+
| 1      |
+--------+
1 row(s) fetched. 
Elapsed 0.029 seconds.

What changes are included in this PR?

When the timezone changes, re-register now() function

Are these changes tested?

Are there any user-facing changes?

@Weijun-H Weijun-H changed the title fix: Use dynamic timezone in now() function for accurate timestamp WIP fix: Use dynamic timezone in now() function for accurate timestamp Oct 11, 2025
@github-actions github-actions bot added the functions Changes to functions implementation label Oct 11, 2025
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch from 8db266f to 33732c5 Compare October 11, 2025 12:25
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Oct 11, 2025
@Weijun-H Weijun-H changed the title WIP fix: Use dynamic timezone in now() function for accurate timestamp fix: Use dynamic timezone in now() function for accurate timestamp Oct 11, 2025
@Weijun-H Weijun-H marked this pull request as draft October 11, 2025 12:28
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch 2 times, most recently from 70ed548 to b81ebf6 Compare October 12, 2025 10:03
@github-actions github-actions bot added the core Core DataFusion crate label Oct 13, 2025
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch from f7ba4e5 to 9f1b55f Compare October 13, 2025 19:00
@Weijun-H Weijun-H requested a review from Copilot October 13, 2025 19:01
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the now() function to return timestamps with the correct timezone set in the session configuration, instead of always using the default "+00:00" timezone.

Key changes:

  • Updated now() function to use dynamic timezone from session configuration
  • Added configuration-aware UDF creation macro for functions that depend on session state
  • Added test coverage for timezone-aware now() function behavior

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
datafusion/functions/src/macros.rs Added new macro for creating UDFs with configuration parameters
datafusion/functions/src/datetime/now.rs Modified NowFunc to accept and use timezone from configuration instead of hardcoded "+00:00"
datafusion/core/src/execution/context/mod.rs Added logic to re-register now() UDF when timezone configuration changes
datafusion/sqllogictest/test_files/timestamps.slt Added test to verify now() returns correct timezone type
datafusion/sqllogictest/test_files/dates.slt Updated expected error message to reflect new default timezone format

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@Weijun-H Weijun-H marked this pull request as ready for review October 13, 2025 19:05
Comment on lines 1078 to 1089

// Register UDFs that return values based on session configuration
// e.g. now() which depends on the time_zone configuration option
if variable == "datafusion.execution.time_zone" {
let config_options = self.state.read().config().options().clone();
let now_udf = {
// recreate the function so it captures the new time zone
make_udf_function_with_config!(NowFunc, now, &ConfigOptions);
now(&config_options)
};
self.state.write().register_udf(now_udf)?;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the above approach will handle all future udf changes that follow this pattern

@Weijun-H Weijun-H marked this pull request as draft October 13, 2025 20:00
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch from 34ad88f to e7a6ab9 Compare October 13, 2025 20:11
@Weijun-H Weijun-H marked this pull request as ready for review October 13, 2025 20:11
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch from 8a1de2c to ed73c86 Compare October 13, 2025 21:30
@Weijun-H Weijun-H requested a review from Omega359 October 14, 2025 07:25
@kosiew
Copy link
Contributor

kosiew commented Oct 14, 2025

Re-registering now() inside set_config_option covers interactive SET commands, but sessions that are constructed differently eg with a pre-configured SessionConfig (e.g. via builder APIs or server defaults) still get the default NowFunc::new() that ignores the configured timezone.

eg.

❯ DATAFUSION_EXECUTION_TIME_ZONE='+05:45' cargo run --bin datafusion-cli -- -c "SELECT arrow_typeof(now());"

DataFusion CLI v50.1.0
+------------------------------------+
| arrow_typeof(now())                |
+------------------------------------+
| Timestamp(Nanosecond, Some("+00")) |
+------------------------------------+
1 row(s) fetched.
Elapsed 0.092 seconds.
❯ cargo run --bin datafusion-cli -- -c "SET datafusion.execution.time_zone = '+05:45'; SELECT arrow_typeof(now());"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/datafusion-cli -c 'SET datafusion.execution.time_zone = '\''+05:45'\''; SELECT arrow_typeof(now());'`
DataFusion CLI v50.1.0
0 row(s) fetched.
Elapsed 0.034 seconds.

+---------------------------------------+
| arrow_typeof(now())                   |
+---------------------------------------+
| Timestamp(Nanosecond, Some("+05:45")) |
+---------------------------------------+
1 row(s) fetched.
Elapsed 0.040 seconds.

@github-actions github-actions bot added the logical-expr Logical plan and expressions label Oct 14, 2025
@Weijun-H
Copy link
Member Author

Thanks @Omega359 and @kosiew for reviewing. I think this pr is ready for review.

╰─ DATAFUSION_EXECUTION_TIME_ZONE='+05:45' cargo run --bin datafusion-cli -- -c "SELECT arrow_typeof(now());"                                           ─╯
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.53s
     Running `target/debug/datafusion-cli -c 'SELECT arrow_typeof(now());'`
DataFusion CLI v50.1.0
+---------------------------------------+
| arrow_typeof(now())                   |
+---------------------------------------+
| Timestamp(Nanosecond, Some("+05:45")) |
+---------------------------------------+
1 row(s) fetched. 
Elapsed 0.011 seconds.

@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch 2 times, most recently from 0903031 to 1310d0f Compare October 14, 2025 20:31
@Weijun-H Weijun-H force-pushed the 17993-make-now-aware-time-zone branch from 1310d0f to 179607a Compare October 14, 2025 20:52
@Jefffrey
Copy link
Contributor

Dropping a reminder here to uncomment relevant tests that were disabled by #18065 if this PR fixes some of them

Copy link
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a question to clarify my understanding.

Self {
signature: Signature::nullary(Volatility::Stable),
aliases: vec!["current_timestamp".to_string()],
timezone: Some(Arc::from("+00")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NowFunc::new() (and therefore Default) now hardcodes the timezone as "+00".
Everywhere else in the codebase — including the default ConfigOptions and existing sqllogictests — expects the canonical "+00:00" form.
Won't this mean that any downstream caller that still constructs the UDF via NowFunc::new()/Default (for example, when registering their own function registry) now get a different, non-canonical offset that will no longer compare equal to the rest of the system?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could remove new(), because now relies on config for the initialization

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone explain the implications of this change on downstream crates? Does it mean that that now() will always uses the current timezone from the ConfigOptions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: Yes, this is the implication

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb With this implementation, yes. To me the logical solution is to allow the time_zone config option to be an Option (and default to None) and fix any existing usage to handle that ... or to detect when the time_zone config option is '' and handle that in the same manner ('' -> None tz)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Weijun-H -- this definitely seems like a step in the right direction.

I think we should try and make sure to minimize downstream effects of this change (perhaps we should put a note in the upgrade guide that now() will respect the current configuration settings, for example)

Self {
signature: Signature::nullary(Volatility::Stable),
aliases: vec!["current_timestamp".to_string()],
timezone: Some(Arc::from("+00")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone explain the implications of this change on downstream crates? Does it mean that that now() will always uses the current timezone from the ConfigOptions?

/// # Returns
///
/// * `Some(ScalarUDF)` - A new instance of this function configured with the new settings
/// * `None` - If this function does not support runtime reconfiguration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// * `None` - If this function does not support runtime reconfiguration
/// * `None` - If this function does not change with new configuration settings (the default)

// Re-initialize any UDFs that depend on configuration
// This allows both built-in and custom functions to respond to configuration changes
let config_options = state.config().options();
let udf_names: Vec<_> = state.scalar_functions().keys().cloned().collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to "clone()" the keys (all the function names)? It seems like only the &name is used below

Self {
signature: Signature::nullary(Volatility::Stable),
aliases: vec!["current_timestamp".to_string()],
timezone: Some(Arc::from("+00")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: Yes, this is the implication

@Omega359
Copy link
Contributor

This is looking good. I'd like to see an addition to the upgrade guide as this currently will be either a slight change in behaviour (timezone of None previously vs now it'll be Some("+00:00") with default config) or with a different tz set in the config it'll result in a different value.

We may want to update the doc for ExecutionOptions::time_zone slightly as well as currently it's very targetted at 'Extract'.

As a followup PR I think we could allow for when no tz is set in the config_options to use None vs Some('').

@Weijun-H
Copy link
Member Author

This is looking good. I'd like to see an addition to the upgrade guide as this currently will be either a slight change in behaviour (timezone of None previously vs now it'll be Some("+00:00") with default config) or with a different tz set in the config it'll result in a different value.

We may want to update the doc for ExecutionOptions::time_zone slightly as well as currently it's very targetted at 'Extract'.

As a followup PR I think we could allow for when no tz is set in the config_options to use None vs Some('').

tracked by #18081

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make now aware of execution timezone

5 participants