Commit 7484a07

res-life and Chong Gao authored
Filter out timezones not supported by Python (NVIDIA#13289)
Fixes NVIDIA#13285

### Description

The root cause is an inconsistency between Python and the JVM: the JVM supports 'America/Coyhaique', but in Python `tz.gettz(time_zone)` returns `None` for it. As a result, `TimestampGen` uses the `TimestampNTZType` type, which is not expected, and that causes the error: `TypeError: can't subtract offset-naive and offset-aware datetimes`

### Fix

Filter out timezones not supported by Python.

### Checklists

This PR has:

- [ ] added documentation for new or modified features or behaviors.
- [x] updated the license in the source code files when it is required.
- [x] added new tests or modified existing tests to cover new code paths. (Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)

Please select one of the following options:

- [ ] Performance testing has been performed and its results are added in the PR description.
- [ ] An issue is filed for performance testing and its link is added in the PR description. (Select this if performance testing will not be completed before the PR is submitted.)

---------

Signed-off-by: Chong Gao <[email protected]>
Co-authored-by: Chong Gao <[email protected]>
1 parent 462756f commit 7484a07
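
The `TypeError` quoted in the description can be reproduced with the standard library alone. The snippet below is a minimal sketch, not the actual `TimestampGen` code: the helper `subtract_or_error` and the variable names are hypothetical, and it only assumes that a failed zone lookup leaves a value with `tzinfo=None` that is later subtracted from a timezone-aware datetime.

```python
from datetime import datetime, timezone


def subtract_or_error(a, b):
    """Return a - b, or the TypeError message if the operands cannot be mixed."""
    try:
        return a - b
    except TypeError as exc:
        return str(exc)


# A timezone-aware reference point.
aware_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption for illustration: a zone lookup returned None, so the value is naive.
naive_value = datetime(2020, 1, 1, tzinfo=None)

message = subtract_or_error(naive_value, aware_epoch)
print(message)
```

Python refuses to mix naive and aware datetimes in arithmetic, which is why a single unresolved timezone name is enough to fail a test.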

1 file changed: +4 -2 lines changed

integration_tests/src/main/python/timezones.py

Lines changed: 4 additions & 2 deletions
@@ -13,6 +13,7 @@
 # limitations under the License.
 
 from conftest import spark_jvm
+from dateutil import tz
 
 fixed_offset_timezones = ["Asia/Shanghai", "UTC", "UTC+0", "UTC-0", "GMT", "GMT+0", "GMT-0", "EST", "MST", "VST"]
 variable_offset_timezones = ["PST", "NST", "AST", "America/Los_Angeles", "America/New_York", "America/Chicago"]
@@ -21,5 +22,6 @@
 
 # Dynamically get supported timezones from JVM.
 # Different JVMs can have different timezones, should not use a constant list here.
-# Note: excludes `America/Coyhaique`, refer to bug: https://github.com/NVIDIA/spark-rapids/issues/13285
-all_timezones = [tz for tz in spark_jvm().java.time.ZoneId.getAvailableZoneIds() if tz != 'America/Coyhaique']
+# Also different Python versions can have different timezones, so we also filter out timezones not supported by Python.
+all_timezones = [zone_name for zone_name in spark_jvm().java.time.ZoneId.getAvailableZoneIds()
+                 if tz.gettz(zone_name) is not None]
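
The filter in this change can be tried without a running JVM. The following is a rough sketch under stated assumptions: `jvm_zone_ids` is a hypothetical stand-in for the result of `spark_jvm().java.time.ZoneId.getAvailableZoneIds()`, `Not/A_Real_Zone` is a made-up name used to play the role of a zone the JVM knows but Python does not, and `python_supported` is an illustrative helper that does not exist in the repository.

```python
from dateutil import tz

# Hypothetical stand-in for the zone IDs reported by the JVM.
jvm_zone_ids = ["UTC", "America/New_York", "Not/A_Real_Zone"]


def python_supported(zone_ids):
    """Keep only the zone names that dateutil can resolve to a tzinfo object."""
    return [zone_name for zone_name in zone_ids if tz.gettz(zone_name) is not None]


supported = python_supported(jvm_zone_ids)

# Names the JVM listed but Python could not resolve.
dropped = [zone_name for zone_name in jvm_zone_ids if zone_name not in supported]
print(dropped)
```

Note that the loop variable is named `zone_name` rather than `tz`, as in the diff: once the `tz` module is imported, reusing `tz` as the comprehension variable would shadow the module inside the comprehension and make `tz.gettz(...)` fail.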
