[spark] Support V2 UPDATE for data evolution tables by kerwin-zk · Pull Request #8214 · apache/paimon

kerwin-zk · 2026-06-12T04:24:00Z

Purpose

Support V2 UPDATE for data evolution tables.

Tests

CI

JingsongLi · 2026-06-12T06:51:05Z

How to support delete row ids? cc @leaves12138

kerwin-zk · 2026-06-12T07:26:17Z

> How to support delete row ids? cc @leaves12138

@JingsongLi Deleted row ids are simply retired: they become holes in the row-id space. Surviving rows always keep their original row ids, and no physical _ROW_ID column is ever written, so row ids stay derived from firstRowId + position, exactly like the files produced by DataEvolutionDeleteRewriter in #8182.
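
To make the row-id arithmetic concrete, here is a minimal Scala sketch. It assumes that, with no stored _ROW_ID column and ids derived as firstRowId + position, survivors can only keep their ids if each rewritten file covers a contiguous run of original ids. `FileRange`, `RowIdSketch`, and `survivingRanges` are hypothetical names used for illustration only; they are not taken from this PR or from DataEvolutionDeleteRewriter.

```scala
// Hypothetical model: a data file covers row ids [firstRowId, firstRowId + rowCount).
final case class FileRange(firstRowId: Long, rowCount: Long) {
  // Row id is derived, never stored: firstRowId + position within the file.
  def rowId(position: Long): Long = {
    require(position >= 0 && position < rowCount, s"position $position out of range")
    firstRowId + position
  }
}

object RowIdSketch {
  // Assumption: after deleting some positions, survivors are grouped into
  // contiguous runs so that firstRowId + position still yields the original id.
  def survivingRanges(file: FileRange, deletedPositions: Set[Long]): List[FileRange] = {
    val survivorIds = (0L until file.rowCount).filterNot(deletedPositions.contains).map(file.rowId)
    survivorIds.foldLeft(List.empty[FileRange]) {
      // Extend the current run when the next id is adjacent to it.
      case (FileRange(first, count) :: rest, id) if first + count == id =>
        FileRange(first, count + 1) :: rest
      // Otherwise a deleted id left a hole, so start a new run.
      case (acc, id) => FileRange(id, 1) :: acc
    }.reverse
  }
}
```

For a file with firstRowId 100 and 5 rows, deleting position 2 retires id 102 and leaves `List(FileRange(100, 2), FileRange(103, 2))`: ids 100, 101, 103, and 104 are unchanged, and 102 is never reused.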

JingsongLi

+1 from my side, no blocking issues found

JingsongLi · 2026-06-17T08:38:58Z

> How to support delete row ids? cc @leaves12138
>
> @JingsongLi Deleted row ids are simply retired: they become holes in the row-id space. Surviving rows always keep their original row ids, and no physical _ROW_ID column is ever written, so row ids stay derived from firstRowId + position, exactly like the files produced by DataEvolutionDeleteRewriter in #8182

@kerwin-zk It's too troublesome to maintain the index after deletion. Maybe the correct solution is deletion-vector. Can you just support update in this PR?

JingsongLi · 2026-06-20T02:17:26Z

+      .withIgnorePreviousFiles(true)
+      .getWrite
+      .asInstanceOf[AbstractFileStoreWrite[PaimonInternalRow]]
+      .createWriter(partition, 0)


This path skips the IOManager setup that the normal V2 writer does before creating append writers. If a data-evolution table has write-buffer-for-append=true (with the default spillable buffer), AppendOnlyWriter builds an ExternalBuffer with a null IOManager; once the buffer spills, it will fail at ioManager.createChannel(). Please pass an IOManager into this TableWriteImpl (and close it with the writer), or otherwise disable the buffered append path here.
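
The failure mode and the requested fix can be illustrated with a small self-contained Scala model. This is only a sketch: `SketchIOManager`, `SpillableBuffer`, `SketchWriter`, and `IOManagerSketch` are hypothetical stand-ins, not Paimon's IOManager, ExternalBuffer, or AppendOnlyWriter, and the two-row spill threshold is arbitrary.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for an IO manager: owns a temp directory and hands out spill files.
final class SketchIOManager(tempDir: Path) extends AutoCloseable {
  private var counter = 0
  def createChannel(): Path = {
    counter += 1
    Files.createFile(tempDir.resolve(s"spill-$counter.tmp"))
  }
  // Remove spill files and the directory itself.
  override def close(): Unit = {
    Option(tempDir.toFile.listFiles()).foreach(_.foreach(_.delete()))
    Files.deleteIfExists(tempDir)
  }
}

// Hypothetical spillable buffer: flushes to a spill file once it holds maxInMemoryRows rows.
final class SpillableBuffer(ioManager: SketchIOManager, maxInMemoryRows: Int) {
  private val inMemory = ArrayBuffer.empty[String]
  private val spilled = ArrayBuffer.empty[Path]
  def add(row: String): Unit = {
    inMemory += row
    if (inMemory.size >= maxInMemoryRows) spill()
  }
  private def spill(): Unit = {
    // With a null ioManager this line throws, but only once a spill actually happens.
    val channel = ioManager.createChannel()
    Files.write(channel, inMemory.mkString("\n").getBytes(StandardCharsets.UTF_8))
    spilled += channel
    inMemory.clear()
  }
  def spillCount: Int = spilled.size
}

// Hypothetical writer that owns the IO manager and closes it together with itself.
final class SketchWriter(ioManager: SketchIOManager, maxInMemoryRows: Int) extends AutoCloseable {
  private val buffer = new SpillableBuffer(ioManager, maxInMemoryRows)
  def write(row: String): Unit = buffer.add(row)
  def spillCount: Int = buffer.spillCount
  override def close(): Unit = if (ioManager != null) ioManager.close()
}

object IOManagerSketch {
  // Writes all rows and returns how many times the buffer spilled.
  def writeAll(rows: Seq[String], withIOManager: Boolean): Int = {
    val io =
      if (withIOManager) new SketchIOManager(Files.createTempDirectory("sketch-io")) else null
    val writer = new SketchWriter(io, maxInMemoryRows = 2)
    try {
      rows.foreach(writer.write)
      writer.spillCount
    } finally writer.close()
  }
}
```

In this model, small writes succeed without an IO manager and the error only surfaces once enough rows arrive to trigger a spill, which is why such a bug can go unnoticed in tests with little data.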

JingsongLi reviewed Jun 13, 2026

kerwin-zk force-pushed the spark-v2-dml-data-evolution branch from c5d12ff to bcbbfb0 Compare June 17, 2026 13:34

kerwin-zk changed the title ~~[spark] Support V2 DELETE and UPDATE for data evolution tables~~ [spark] Support V2 UPDATE for data evolution tables Jun 17, 2026

kerwin-zk force-pushed the spark-v2-dml-data-evolution branch from bcbbfb0 to 55f0db4 Compare June 17, 2026 14:02

[spark] Support V2 UPDATE for data evolution tables

1927c17

kerwin-zk force-pushed the spark-v2-dml-data-evolution branch from 55f0db4 to 1927c17 Compare June 18, 2026 04:32

JingsongLi reviewed Jun 20, 2026

kerwin-zk wants to merge 1 commit into apache:master from kerwin-zk:spark-v2-dml-data-evolution
