[Bug] When CDC Action is not actively configured with checkpoint and is allowed to force snapshots to be submitted, Flink's failover does not take effect. #4998

Open
2 tasks done
huyuanfeng2018 opened this issue Jan 25, 2025 · 0 comments · May be fixed by #5011
Labels
bug Something isn't working

Comments

@huyuanfeng2018
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.8+

Compute Engine

Flink

Minimal reproduce step

  1. Run a CDC action without configuring a checkpoint interval. (Paimon then sets a default checkpoint interval itself; see "[CDC] Refactor cdc to reduce redundant code and set default checkpoint interval", #2461.)
  2. Set commit.force-create-snapshot = true.
  3. Kill a TaskManager while the job is running to trigger a Flink failover.

What doesn't meet your expectations?

Exception

In most cases, the job keeps restarting and never recovers.

How this led to this Exception

  • With commit.force-create-snapshot enabled, a commit is triggered to create a Paimon snapshot even if no data was written.

  • Flink's failover resumes from a checkpoint. Because Paimon uses a two-phase commit, resuming from the checkpoint may trigger a commit of the manifest file.

  • Whether a commit with no new data files is kept depends on the streamingCheckpointEnabled variable in CommitterOperator. Because we did not explicitly configure checkpointing, streamingCheckpointEnabled = false, so the manifest file commit is filtered out and ignored (no new data file was generated). After the next job restart the same sequence repeats, and the job is stuck in an endless loop.
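
The filtering described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Paimon's actual code: the class CommitFilterSketch, the methods isCommitKept and replayRestarts, and the exact way the two flags combine are hypothetical and based only on the description in this issue. Only the name streamingCheckpointEnabled comes from the report. The sketch assumes that nothing changes between restarts, so the same recovered commit gets the same decision every time.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the behaviour described in this issue. NOT Paimon source code.
final class CommitFilterSketch {

    // Assumed rule (from the issue text): a commit without new data files
    // is only kept when streamingCheckpointEnabled is true.
    static boolean isCommitKept(boolean hasNewDataFiles, boolean streamingCheckpointEnabled) {
        return hasNewDataFiles || streamingCheckpointEnabled;
    }

    // Replays the same recovered commit across several restarts and records
    // whether it was kept each time. Assumes no state changes between restarts.
    static List<Boolean> replayRestarts(
            int restarts, boolean hasNewDataFiles, boolean streamingCheckpointEnabled) {
        List<Boolean> decisions = new ArrayList<>();
        for (int i = 0; i < restarts; i++) {
            decisions.add(isCommitKept(hasNewDataFiles, streamingCheckpointEnabled));
        }
        return decisions;
    }

    public static void main(String[] args) {
        // Forced snapshot with no data, checkpoint not explicitly configured:
        // the recovered commit is dropped on every restart.
        System.out.println(replayRestarts(3, false, false)); // [false, false, false]

        // Same commit when the flag is true: it is kept.
        System.out.println(replayRestarts(3, false, true)); // [true, true, true]
    }
}
```

In this model the only input that changes the outcome for an empty commit is the flag itself, which matches the report's observation that the problem appears when the checkpoint interval is not set explicitly.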

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!