fix: detect concurrent MarkDirty during patrol slice to prevent stale ANN #8
Merged
… ANN

If a merge/split fires MarkDirty while a patrol slice is already running, the slice would finish and overwrite CursorTargetID with lastTarget.ID, silently discarding the cursor reset (to 0) that MarkDirty had set. The next "empty-targets" slice would then see cursor > all IDs and mark dirty=false, so the re-scan from scratch (with a fresh ANN built after the concurrent write) never happened.

Add DirtyGeneration to personMergeSuggestionState, incremented on every MarkDirty call. At the end of each slice, only advance CursorTargetID if the generation is unchanged; otherwise keep cursor=0 so the next run starts a fresh re-scan with an up-to-date ANN index.

Observed symptom: person 264884 (84.5% similarity with 271495/牛牛) was missing from merge suggestions because 271495 was merged at 15:33 CST while the patrol ANN was being built, and the resulting MarkDirty was overwritten by the slice completion at 15:41 CST.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
davidhoo added a commit that referenced this pull request on May 19, 2026
…#9)

The hourly stale re-run adds little value: all person-modifying operations (merge, split, move_faces, category change, detection, recluster) already call MarkDirty immediately, so re-running on unchanged data produces identical suggestions. The DirtyGeneration fix (PR #8) also closed the race condition where a concurrent MarkDirty could be silently lost.

24h is a more appropriate fallback interval: a pure safety net for edge cases such as a missed MarkDirty trigger, not a primary discovery mechanism.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
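- Add a DirtyGeneration uint64 field to personMergeSuggestionState, incremented on every MarkDirty call.
- RunBackgroundSlice captures the generation when the slice starts and advances CursorTargetID at the end only if the generation is unchanged. If the generation has changed (a concurrent MarkDirty occurred while the slice was running), it keeps cursor=0 so that the next run rescans from the start and rebuilds the ANN.

Root cause

While a patrol slice is running, a user-initiated merge triggers MarkDirty, which resets the cursor to 0. When the slice completes, it unconditionally writes CursorTargetID = lastTarget.ID, overwriting the cursor=0 set by MarkDirty. The next slice sees a cursor greater than every target ID, concludes that the patrol round is complete, and sets dirty=false. The "rescan from the start" signal raised by MarkDirty is lost.

Observed symptom: the merge of 271495 (牛牛) completed at 15:33 CST, but the ANN build had started at 15:30 and used the old pre-merge prototype. MarkDirty reset cursor=0 at 15:33, but the slice that completed at 15:41 overwrote it, so 264884 (84.5% similarity) never appeared in the merge suggestions.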
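The generation check can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from this PR: `suggestionState` is a hypothetical stand-in for personMergeSuggestionState, `runSlice` stands in for RunBackgroundSlice, the mutex is an assumed synchronization mechanism, and the `during` callback simulates whatever happens while the slice is scanning.

```go
package main

import (
	"fmt"
	"sync"
)

// suggestionState is a hypothetical stand-in for personMergeSuggestionState.
type suggestionState struct {
	mu              sync.Mutex
	Dirty           bool
	CursorTargetID  int64
	DirtyGeneration uint64
}

// MarkDirty requests a rescan from the start and bumps the generation so that
// an in-flight slice can detect the request.
func (s *suggestionState) MarkDirty() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.Dirty = true
	s.CursorTargetID = 0
	s.DirtyGeneration++
}

// runSlice processes one batch and reports whether the cursor was advanced.
// during simulates the ANN build and scan, which run without holding the lock.
func (s *suggestionState) runSlice(lastTargetID int64, during func()) bool {
	s.mu.Lock()
	startGen := s.DirtyGeneration // capture the generation at slice start
	s.mu.Unlock()

	during()

	s.mu.Lock()
	defer s.mu.Unlock()
	if s.DirtyGeneration != startGen {
		// A MarkDirty arrived mid-slice: keep cursor=0 for a fresh rescan.
		s.CursorTargetID = 0
		return false
	}
	s.CursorTargetID = lastTargetID
	return true
}

func main() {
	s := &suggestionState{}
	s.MarkDirty()

	// A merge fires MarkDirty while the slice is running.
	advanced := s.runSlice(500, func() { s.MarkDirty() })
	fmt.Println(advanced, s.CursorTargetID, s.Dirty) // false 0 true

	// No concurrent write: the cursor advances as usual.
	advanced = s.runSlice(500, func() {})
	fmt.Println(advanced, s.CursorTargetID) // true 500
}
```

Without the comparison against `startGen`, the first call would write cursor=500 and the reset requested by the mid-slice MarkDirty would be lost, which is the failure described above.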
Test plan
🤖 Generated with Claude Code