table, executor, lightning: fix generated column TIMESTAMP index inconsistency across session time zones#66771
table, executor, lightning: fix generated column TIMESTAMP index inconsistency across session time zones#66771Desel72 wants to merge 24 commits intopingcap:masterfrom
Conversation
|
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @Desel72. Thanks for your PR. I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Welcome @Desel72! |
|
Hi @Desel72. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds UTC-aware evaluation for generated columns that depend on TIMESTAMP across insert, update, import, decode, and virtual-column paths by converting TIMESTAMP inputs to UTC, using a UTC expression-eval context for generation/casting, then restoring the original session location and post-converting TIMESTAMP results back to the session zone. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Executor
participant TableInfo
participant EvalCtx
participant MutRow
Client->>Executor: INSERT / UPDATE / IMPORT / DECODE
Executor->>TableInfo: HasGeneratedColumnDependingOnTimestamp(tbl)?
alt timestamp-dependent
TableInfo-->>Executor: true
Executor->>MutRow: ConvertTimestampDatumsToUTC / ConvertTimestampMutRowToUTC
MutRow-->>Executor: restoreFunc
Executor->>EvalCtx: CtxWithUTCLocation(originalCtx) -> utcCtx
Executor->>EvalCtx: Evaluate generated expressions with utcCtx
EvalCtx-->>Executor: Generated values (UTC)
Executor->>MutRow: If gen col is TIMESTAMP, convert value from UTC -> session loc
Executor->>MutRow: restoreFunc()
else not timestamp-dependent
TableInfo-->>Executor: false
Executor->>EvalCtx: Evaluate generated columns with session location
end
Executor-->>Client: Persist row / return result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can use TruffleHog to scan for secrets in your code with verification capabilities.Add a TruffleHog config file (e.g. trufflehog-config.yml, trufflehog.yml) to your project to customize detectors and scanning behavior. The tool runs only when a config file is present. |
|
Hi @bb7133 |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/executor/test/writetest/write_test.go (1)
555-611: Add an indirect-dependency regression case.This test only covers a generated column that directly references
ts. It won't catch a chain likeg1 DATETIME AS (ts)andg2 TIMESTAMP AS (g1), where the virtual-column backfill/check path still makes a different UTC decision. Please add one mixed-time_zonecase withADMIN CHECKfor that shape so the fix is locked in end-to-end.As per coding guidelines:
**/*_test.go: MUST add a regression test and verify it fails before fix and passes after fix for bug fixes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/executor/test/writetest/write_test.go` around lines 555 - 611, Add a regression case to TestGeneratedColumnTimestampTZConsistency that creates a chained generated-column table (e.g., table name t_chain) with columns ts TIMESTAMP, g1 DATETIME GENERATED ALWAYS AS (ts) (virtual or stored as appropriate), and g2 TIMESTAMP GENERATED ALWAYS AS (g1) with an index on g2 (e.g., INDEX idx_g2); insert a row with time_zone = '+00:00', change session time_zone to '-08:00' and perform an UPDATE to ts, then switch back to '+00:00' and run ADMIN CHECK TABLE t_chain and ADMIN CHECK INDEX t_chain idx_g2 and assert the reported g2 value is identical when read under both time zones (similar to how gen_hour is checked), so the test exercises the indirect dependency path (g1 -> g2) end-to-end.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/table/column.go`:
- Around line 870-925: The UTC switch must be decided at table-level, not
per-output-column: instead of looping only over virtualColumnIndex to set
needUTC, check all generated columns (scan colInfos for any column where
GeneratedColumnDependsOnTimestamp(...) is true) and set needUTC if any such
column exists; keep using tsColIndices for timestamp column positions and keep
the per-column isUTCCol check simple (use needUTC &&
GeneratedColumnDependsOnTimestamp(colInfos[idx], colInfos) if you still want to
restrict conversion to virtuals that are timestamp-derived), but ensure needUTC
is true whenever any generated column in the table transitively depends on a
TIMESTAMP so the whole FillVirtualColumnValue evaluation runs in UTC like other
evaluators.
---
Nitpick comments:
In `@pkg/executor/test/writetest/write_test.go`:
- Around line 555-611: Add a regression case to
TestGeneratedColumnTimestampTZConsistency that creates a chained
generated-column table (e.g., table name t_chain) with columns ts TIMESTAMP, g1
DATETIME GENERATED ALWAYS AS (ts) (virtual or stored as appropriate), and g2
TIMESTAMP GENERATED ALWAYS AS (g1) with an index on g2 (e.g., INDEX idx_g2);
insert a row with time_zone = '+00:00', change session time_zone to '-08:00' and
perform an UPDATE to ts, then switch back to '+00:00' and run ADMIN CHECK TABLE
t_chain and ADMIN CHECK INDEX t_chain idx_g2 and assert the reported g2 value is
identical when read under both time zones (similar to how gen_hour is checked),
so the test exercises the indirect dependency path (g1 -> g2) end-to-end.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ac22ac58-594a-4377-a4f0-a7e42d1a7e82
📒 Files selected for processing (7)
pkg/executor/insert_common.gopkg/executor/test/writetest/write_test.gopkg/executor/write.gopkg/expression/exprctx/context.gopkg/lightning/backend/kv/base.gopkg/table/column.gopkg/util/rowDecoder/decoder.go
…MP fix Widen needUTC check in FillVirtualColumnValue to scan all generated columns (not just virtualColumnIndex) so transitive TIMESTAMP dependencies through stored columns are detected. Add chained generated-column test case to cover indirect TIMESTAMP dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/table/column.go (1)
905-927:⚠️ Potential issue | 🟠 MajorRun the whole virtual-column pass under UTC once
needUTCis true.
isUTCColstill decides bothEvalVirtualColumnandCastColumnValueper output column, so a chain likeg1 DATETIME AS (ts)andg2 TIMESTAMP AS (g1)will still evaluate/castg2in the session timezone here. That keepsFillVirtualColumnValueout of sync with the other generated-column paths in this PR and can still produce divergent virtual values during reads orADMIN CHECK.🛠️ Suggested direction
- isUTCCol := needUTC && len(tsColIndices) > 0 && GeneratedColumnDependsOnTimestamp(colInfos[idx], colInfos) + useUTCCtx := needUTC && origLoc != time.UTC + needTSRewrite := useUTCCtx && len(tsColIndices) > 0 && GeneratedColumnDependsOnTimestamp(colInfos[idx], colInfos) for row := iter.Begin(); row != iter.End(); row = iter.Next() { var datum types.Datum var err error - if isUTCCol { + if needTSRewrite { // Extract datums, convert TIMESTAMP columns from session TZ to UTC, // evaluate in UTC context to get consistent results. datums := make([]types.Datum, nCols) for ci := 0; ci < nCols; ci++ { datums[ci] = row.GetDatum(ci, &colInfos[ci].FieldType) } for _, ci := range tsColIndices { if !datums[ci].IsNull() && datums[ci].Kind() == types.KindMysqlTime { t := datums[ci].GetMysqlTime() if convErr := t.ConvertTimeZone(origLoc, time.UTC); convErr == nil { datums[ci].SetMysqlTime(t) } } } mutRow := chunk.MutRowFromDatums(datums) datum, err = expCols[idx].EvalVirtualColumn(utcEvalCtx, mutRow.ToRow()) + } else if useUTCCtx { + datum, err = expCols[idx].EvalVirtualColumn(utcEvalCtx, row) } else { datum, err = expCols[idx].EvalVirtualColumn(evalCtx, row) } if err != nil { return err } @@ // Because the expression might return different type from // the generated column, we should wrap a CAST on the result. castCtx := ectx - if isUTCCol { + if useUTCCtx { castCtx = utcCastCtx } castDatum, err := CastColumnValue(castCtx, datum, colInfos[idx], false, true)Based on learnings: Correctness is first priority in TiDB changes; seemingly small changes can alter SQL semantics, consistency, or cluster behavior.
Also applies to: 934-938
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/table/column.go` around lines 905 - 927, The current code uses isUTCCol to decide per-output-column whether to use utcEvalCtx for EvalVirtualColumn and CastColumnValue, causing chains like g1 DATETIME AS (ts); g2 TIMESTAMP AS (g1) to be evaluated/cast in session TZ and produce inconsistent virtual values; change the virtual-column evaluation pass so that when needUTC is true (and tsColIndices non-empty / GeneratedColumnDependsOnTimestamp indicates dependence) you run the entire virtual-column pass under utcEvalCtx (use utcEvalCtx for all expCols[idx].EvalVirtualColumn and CastColumnValue calls) instead of switching per column by isUTCCol, ensuring expCols, FillVirtualColumnValue, CastColumnValue, and any dependent evaluation use utcEvalCtx consistently; keep GeneratedColumnDependsOnTimestamp, tsColIndices, expCols, utcEvalCtx and evalCtx references to locate the affected code paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/table/column.go`:
- Around line 905-927: The current code uses isUTCCol to decide
per-output-column whether to use utcEvalCtx for EvalVirtualColumn and
CastColumnValue, causing chains like g1 DATETIME AS (ts); g2 TIMESTAMP AS (g1)
to be evaluated/cast in session TZ and produce inconsistent virtual values;
change the virtual-column evaluation pass so that when needUTC is true (and
tsColIndices non-empty / GeneratedColumnDependsOnTimestamp indicates dependence)
you run the entire virtual-column pass under utcEvalCtx (use utcEvalCtx for all
expCols[idx].EvalVirtualColumn and CastColumnValue calls) instead of switching
per column by isUTCCol, ensuring expCols, FillVirtualColumnValue,
CastColumnValue, and any dependent evaluation use utcEvalCtx consistently; keep
GeneratedColumnDependsOnTimestamp, tsColIndices, expCols, utcEvalCtx and evalCtx
references to locate the affected code paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cd87f872-0077-4079-9d52-841ec8356ee0
📒 Files selected for processing (2)
pkg/executor/test/writetest/write_test.gopkg/table/column.go
|
/ok-to-test |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #66771 +/- ##
================================================
+ Coverage 77.7189% 78.1526% +0.4337%
================================================
Files 2016 1945 -71
Lines 552594 539659 -12935
================================================
- Hits 429470 421758 -7712
+ Misses 121382 117474 -3908
+ Partials 1742 427 -1315
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
|
/retest |
|
Hi @lance6716 I've updated this PR. Please review this. |
|
@wjhuang2016 @xhebox can you take a look? Thanks. |
|
Hi @wjhuang2016 @xhebox @qw4990 @winoros @guo-shaoge |
|
Hi @MyonKeminta @D3Hunter @lance6716 |
|
Hi @lance6716 Could you please review this? I want to contribute to this project. I think this should be merged asap. Thanks |
|
Hi @Desel72 I am not familiar with this part. Please wait other reviewers. |
|
@lance6716 Thanks for reply. Could you please let me know when this can be merged? |
Need 2 review's LGTM + code OWNER's approve. You can check comment of ti-chi-bot |
| genCols []GeneratedCol) (errCol *model.ColumnInfo, err error) { | ||
| genCols []GeneratedCol, tblInfo *model.TableInfo) (errCol *model.ColumnInfo, err error) { | ||
| mutRow := chunk.MutRowFromDatums(record) | ||
| evalCtx := se.GetExprCtx().GetEvalCtx() |
There was a problem hiding this comment.
please build a case for lightning if it's also affected, and check it passes after the change
Add TestEncodeGeneratedTimestampTZConsistency to verify that Lightning's evalGeneratedColumns produces identical KV pairs for stored generated columns depending on TIMESTAMP regardless of session time_zone. Ref pingcap#66753 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ad004fb to
1ae4faa
Compare
|
/retest |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… columns In multi-table UPDATE statements (e.g. UPDATE t1, t2 SET ...), assign.Col.Index is the evalBuffer index spanning all tables in the join. Accessing tblMeta.Columns[assign.Col.Index] causes an index-out-of-range panic when the assignment refers to a column in another table. Fix by using the table-local index (assign.Col.Index - offset) and bounds-checking before accessing tblMeta.Columns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/retest |
1 similar comment
|
/retest |
|
@bb7133 @expxiaoli @qw4990 @AilinKid @joechenrh @GMHDBJD @D3Hunter @hawkingrei @zimulala @winoros |
|
[FORMAT CHECKER NOTIFICATION] Notice: To remove the For example:
📖 For more info, you can check the "Contribute Code" section in the development guide. |
|
@Desel72: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/retest |
What problem does this PR solve?
Issue Number: close #66753
Problem Summary:
Generated column expressions referencing TIMESTAMP columns produce different values depending on session timezone, causing index-record inconsistency. For example,
HOUR(ts_col)returns different values in UTC vs UTC-8 because TIMESTAMP values are stored in UTC but converted to session timezone on read. This leads to[tikv:8141] assertion failederrors andADMIN CHECK TABLEfailures.What changed and how does it work?
Force UTC timezone when evaluating generated column expressions that depend on TIMESTAMP columns. This ensures deterministic, consistent results regardless of session timezone.
Key changes:
CtxWithUTCLocation()helper to override eval context timezone to UTCGeneratedColumnDependsOnTimestamp()to detect TIMESTAMP dependenciesConvertTimestampDatumsToUTC()/ConvertTimestampMutRowToUTC()to convert TIMESTAMP values before expression evaluationinsert_common.go), UPDATE (write.go), ADMIN CHECK / DDL reorg (decoder.go), Lightning import (base.go), and virtual column filling (FillVirtualColumnValue)Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Bug Fixes
Tests