[spark] Support V2 UPDATE for data evolution tables #8214
How should deleted row ids be supported? cc @leaves12138
@JingsongLi Deleted row ids are simply retired — they become holes in the row-id space. Surviving rows always keep their original row ids, and no physical |
JingsongLi left a comment
+1 from my side, no blocking issues found
@kerwin-zk Maintaining the index after deletion is too troublesome. The correct solution is probably deletion vectors. Can you support only UPDATE in this PR?
c5d12ff to bcbbfb0
bcbbfb0 to 55f0db4
55f0db4 to 1927c17
    .withIgnorePreviousFiles(true)
    .getWrite
    .asInstanceOf[AbstractFileStoreWrite[PaimonInternalRow]]
    .createWriter(partition, 0)
This path skips the IOManager setup that the normal V2 writer does before creating append writers. If a data-evolution table has write-buffer-for-append=true (with the default spillable buffer), AppendOnlyWriter builds an ExternalBuffer with a null IOManager; once the buffer spills, it will fail at ioManager.createChannel(). Please pass an IOManager into this TableWriteImpl (and close it with the writer), or otherwise disable the buffered append path here.
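To make the failure mode concrete, here is a minimal, self-contained Java sketch. It does not use Paimon's real classes: `ToyIOManager`, `ToyBuffer`, and `writeRows` are hypothetical stand-ins that only assume what the comment above describes, namely that the buffer touches the IO manager for the first time when it spills. That is why small writes would pass and only larger ones would fail.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a spillable write buffer. All names here are hypothetical, not Paimon APIs. */
final class SpillSketch {

    /** Stand-in for an IO manager that hands out spill channels. */
    static final class ToyIOManager implements AutoCloseable {
        private int channels = 0;
        private boolean closed = false;

        String createChannel() {
            if (closed) {
                throw new IllegalStateException("IO manager is closed");
            }
            return "spill-" + (channels++);
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Stand-in for a buffer that spills once it holds `capacity` rows. */
    static final class ToyBuffer {
        private final ToyIOManager ioManager; // may be null, as in the reviewed code path
        private final int capacity;
        private final List<String> inMemory = new ArrayList<>();
        private final List<String> spilled = new ArrayList<>();

        ToyBuffer(ToyIOManager ioManager, int capacity) {
            this.ioManager = ioManager;
            this.capacity = capacity;
        }

        void add(String row) {
            if (inMemory.size() == capacity) {
                // First dereference of the IO manager: throws NullPointerException if it is null.
                spilled.add(ioManager.createChannel());
                inMemory.clear();
            }
            inMemory.add(row);
        }
    }

    /** Writes `rows` rows; returns false if the write hit the null IO manager. */
    static boolean writeRows(ToyIOManager ioManager, int rows, int capacity) {
        ToyBuffer buffer = new ToyBuffer(ioManager, capacity);
        try {
            for (int i = 0; i < rows; i++) {
                buffer.add("row-" + i);
            }
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // No IO manager, but the buffer never fills: the bug stays hidden.
        System.out.println(writeRows(null, 2, 4));
        // No IO manager and the buffer spills: the write fails.
        System.out.println(writeRows(null, 10, 4));
        // IO manager supplied and closed together with the writer's scope.
        try (ToyIOManager io = new ToyIOManager()) {
            System.out.println(writeRows(io, 10, 4));
        }
    }
}
```

In this toy model the try-with-resources block plays the role of "close it with the writer". The alternative suggested above, disabling the buffered append path, would correspond to never constructing the spillable buffer at all.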
Purpose
Support V2 UPDATE for data evolution tables.
Tests
CI