-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix timezone parsing to support 3-component IANA names and hyphens #5280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit fixes a parsing error in the MySQL SET statement parser that
occurred when processing `SET time_zone` statements with:
1. Three-component IANA timezone names (e.g., America/Argentina/Buenos_Aires)
2. Timezone names containing hyphens (e.g., America/Port-au-Prince)
Previously, the regex pattern `(?:\w+/\w+)` only matched 2-component
timezone names and did not support hyphens. This caused parsing errors
logged as:
"[ERROR] Unable to parse query. If correct, report it as a bug:
SET time_zone=\"America/Argentina/Buenos_Aires\";"
When multiplexing is enabled, this bug causes timestamps to be incorrectly
written to the database.
Changes:
- Updated timezone regex from `(?:\w+/\w+)` to `(?:[\w-]+(?:/[\w-]+){1,2})`
- Supports 2-3 components: Area/Location or Area/Country/Location
- Supports hyphens in component names (e.g., Port-au-Prince)
- Added comprehensive Doxygen documentation for timezone parsing
- Extended TAP test cases with new timezone formats
Note: Bare words like 'SYSTEM' and 'UTC' were already supported via
other patterns in the parser (vp2 pattern for word matching).
Fixes: #4993
Related: gemini-code-assist review comments
Summary of ChangesHello @renecannao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical parsing bug in the MySQL SET time_zone statement, which previously failed to recognize IANA timezone names containing three components or hyphens. The updated regex pattern now correctly processes these formats, preventing data inconsistencies when multiplexing is active. The changes are supported by new, comprehensive Doxygen documentation explaining the parsing logic and expanded test cases to ensure robustness. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is an excellent pull request. The fix to the timezone parsing regex is correct and effectively resolves the issue with 3-component IANA names and names containing hyphens. The addition of comprehensive Doxygen documentation is a fantastic improvement for maintainability, and the new test cases are thorough, ensuring the fix is well-covered. I have one minor suggestion to improve the clarity of one of the new test cases.
| // Timezone names with hyphens (additional fix) | ||
| { "SET time_zone = 'America/Port-au-Prince'", { Expected("time_zone", {"America/Port-au-Prince"}) } }, | ||
| { "SET time_zone = 'America/Blanc-Sablon'", { Expected("time_zone", {"America/Blanc-Sablon"}) } }, | ||
| { "SET time_zone = \"Atlantic/Canary\"", { Expected("time_zone", {"Atlantic/Canary"}) } }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case for Atlantic/Canary is valid, but it doesn't test the new hyphen support as the comment on line 156 suggests, since the name contains no hyphens and would have been matched by the old regex. To make this test more specific to the hyphen-support feature, consider replacing it with a name that does contain a hyphen, such as US/East-Indiana.
| { "SET time_zone = \"Atlantic/Canary\"", { Expected("time_zone", {"Atlantic/Canary"}) } }, | |
| { "SET time_zone = \"US/East-Indiana\"", { Expected("time_zone", {"US/East-Indiana"}) } }, |
Replace 'Atlantic/Canary' with 'US/East-Indiana' in the hyphen test cases. The previous test case (Atlantic/Canary) did not contain a hyphen and would have been matched by the old regex, so it did not actually test the new hyphen support. The new test case (US/East-Indiana) contains a hyphen and properly validates the hyphen support feature. Thanks to @gemini-code-assist for the excellent review feedback!
|
@gemini-code-assist Thank you for the excellent review! I've addressed your feedback by replacing You were absolutely right - |
|
You're very welcome, @renecannao! I appreciate you taking the time to update the test cases. Replacing |
|
|
retest this please |



Summary
Fixes a parsing error in the MySQL SET statement parser that occurred when processing
SET time_zonestatements with:America/Argentina/Buenos_Aires,America/Indiana/Indianapolis)America/Port-au-Prince,America/Blanc-Sablon)Previously, the regex pattern
(?:\w+/\w+)only matched 2-component timezone names and did not support hyphens. This caused parsing errors in the logs:When multiplexing is enabled, this bug causes timestamps to be incorrectly written to the database.
Changes
Core Fix
lib/MySQL_Set_Stmt_Parser.cpp:298(?:\w+/\w+)- matches only 2 components, no hyphens(?:[\w-]+(?:/[\w-]+){1,2})- matches 2-3 components with hyphensDocumentation
+08:00) and IANA timezone name formatTests
time_zonetest array intest/tap/tests/setparser_test_common.hExamples of Supported Timezones
+08:00,-05:30,+00:00Europe/London,America/New_York,Asia/TokyoAmerica/Argentina/Buenos_Aires,America/Indiana/IndianapolisAmerica/Port-au-Prince,America/Blanc-SablonSYSTEM,UTCAddresses Review Feedback
This PR incorporates feedback from gemini-code-assist on the original #4993:
America/Port-au-Prince)Limitations
The regex pattern limits matching to 2-3 components (e.g.,
Area/LocationorArea/Country/Location). While IANA timezone names with 4+ components are theoretically possible, they are extremely rare and not currently supported. This is documented in the code.Related
Credit to @pbrydzinski for the original PR