Commit ebd36df

Fix issue hash collision for missing-patch-file antipatterns
CRITICAL BUG: Multiple patch files with same CVE got same issue_hash

PROBLEM:
Patch descriptions like:
- 'CVE-2025-11111.patch' → extracted CVE-2025-11111 → hash: nginx-CVE-2025-11111-missing-patch-file
- 'CVE-2025-11111-and-CVE-2025-22222.patch' → extracted CVE-2025-11111 (first match) → SAME hash!

Both antipatterns got the SAME issue_hash, so challenging one marked BOTH as challenged.

IMPACT:
User challenged 1 antipattern out of 10, but Azure Function saw:
- issue_lifecycle had 10 entries
- But 2 shared same hash (hash collision)
- Challenging nginx-CVE-2025-11111-missing-patch-file marked BOTH as challenged
- If there were multiple collisions, could mark many as challenged
- total=10, challenged=10 → unchallenged=0 → Removed 'radar-issues-detected' label ❌

ROOT CAUSE:
_extract_key_identifier() prioritized CVE extraction over patch filename.
For missing-patch-file, the patch FILENAME is the unique identifier, not the CVE.

FIX:
Check antipattern.id == 'missing-patch-file' FIRST, extract full patch filename.
This ensures each patch file gets a unique hash:
- 'CVE-2025-11111.patch' → nginx-CVE-2025-11111-missing-patch-file
- 'CVE-2025-11111-and-CVE-2025-22222.patch' → nginx-CVE-2025-11111-and-CVE-2025-22222-missing-patch-file
- 'security-fix.patch' → nginx-security-fix-missing-patch-file

Now each patch file has a unique hash, preventing false positive challenges.
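
The counting problem described under IMPACT can be sketched in a few lines. This is only an illustrative model: the commit does not show the Azure Function or the real shape of issue_lifecycle, so `unchallenged_count` and the placeholder hashes for the eight unrelated entries are hypothetical. It assumes challenges are recorded per issue_hash, so two entries sharing a hash cannot be told apart.

```python
def unchallenged_count(issue_hashes, challenged_hashes):
    """Count lifecycle entries whose hash has not been challenged.

    Hypothetical helper: models a check that looks up each entry by hash.
    """
    return sum(1 for h in issue_hashes if h not in challenged_hashes)


# Eight unrelated entries with made-up, distinct hashes (assumption).
others = [f"nginx-example-{i}-other-antipattern" for i in range(8)]

# Before the fix: both patch files map to the same hash.
colliding = ["nginx-CVE-2025-11111-missing-patch-file"] * 2 + others

# After the fix: each patch file has its own hash.
unique = [
    "nginx-CVE-2025-11111-missing-patch-file",
    "nginx-CVE-2025-11111-and-CVE-2025-22222-missing-patch-file",
] + others

# The user challenges exactly one antipattern.
challenged = {"nginx-CVE-2025-11111-missing-patch-file"}

print(unchallenged_count(colliding, challenged))  # 8: one challenge hid two entries
print(unchallenged_count(unique, challenged))     # 9: only the challenged entry is hidden
```

With more collisions the unchallenged count drops further, which is how a single challenge could end up removing the label.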
1 parent 55c4534 commit ebd36df

1 file changed: +10 -3 lines changed

.pipelines/prchecks/CveSpecFilePRCheck/AntiPatternDetector.py

Lines changed: 10 additions & 3 deletions
@@ -149,15 +149,22 @@ def _extract_key_identifier(self, antipattern: 'AntiPattern') -> str:
         Returns:
             Stable identifier string
         """
-        # Try to extract CVE number first (most common)
+        # For missing-patch-file, use patch filename as identifier (most specific)
+        # This prevents hash collisions when multiple patches reference same CVE
+        if antipattern.id == "missing-patch-file":
+            patch_match = re.search(r"(?:Patch file |')([A-Za-z0-9_.-]+\.patch)", antipattern.description)
+            if patch_match:
+                return patch_match.group(1).replace('.patch', '')  # e.g., "CVE-2085-88888" from "CVE-2085-88888.patch"
+
+        # Try to extract CVE number (for CVE-related antipatterns)
         cve_match = re.search(r'CVE-\d{4}-\d+', antipattern.description)
         if cve_match:
             return cve_match.group(0)  # e.g., "CVE-2085-88888"
 
-        # Extract patch filename
+        # Extract patch filename (fallback for other patch-related issues)
         patch_match = re.search(r"(?:Patch file |')([A-Za-z0-9_.-]+\.patch)", antipattern.description)
         if patch_match:
-            return patch_match.group(1)  # e.g., "CVE-2085-88888.patch"
+            return patch_match.group(1).replace('.patch', '')  # e.g., "CVE-2085-88888"
 
         # For changelog entries, try to extract meaningful text
         entry_match = re.search(r"entry '([^']+)'", antipattern.description)
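
The effect of the reordering in the diff can be reproduced with a small standalone script. It is a sketch, not the project's code: the two regular expressions are copied from the diff, but the `AntiPattern` dataclass, the description wording, the `return ap.id` fallback, and the `{package}-{key}-{id}` hash format (inferred from the examples in the commit message) are all assumptions.

```python
import re
from dataclasses import dataclass

# Patterns copied from the diff.
PATCH_RE = re.compile(r"(?:Patch file |')([A-Za-z0-9_.-]+\.patch)")
CVE_RE = re.compile(r"CVE-\d{4}-\d+")


@dataclass
class AntiPattern:
    """Hypothetical stand-in for the real AntiPattern class."""
    id: str
    description: str


def key_before(ap: AntiPattern) -> str:
    """Old order: the CVE number wins over the patch filename."""
    cve = CVE_RE.search(ap.description)
    if cve:
        return cve.group(0)
    patch = PATCH_RE.search(ap.description)
    if patch:
        return patch.group(1)
    return ap.id  # assumed fallback; later branches are not shown in the diff


def key_after(ap: AntiPattern) -> str:
    """New order: missing-patch-file uses the full patch filename first."""
    if ap.id == "missing-patch-file":
        patch = PATCH_RE.search(ap.description)
        if patch:
            return patch.group(1).replace(".patch", "")
    cve = CVE_RE.search(ap.description)
    if cve:
        return cve.group(0)
    patch = PATCH_RE.search(ap.description)
    if patch:
        return patch.group(1).replace(".patch", "")
    return ap.id  # assumed fallback


def issue_hash(package: str, ap: AntiPattern, key_fn) -> str:
    """Assumed hash format, inferred from the commit message examples."""
    return f"{package}-{key_fn(ap)}-{ap.id}"


# Description wording is an assumption; only the quoted filename matters here.
names = [
    "CVE-2025-11111.patch",
    "CVE-2025-11111-and-CVE-2025-22222.patch",
    "security-fix.patch",
]
patterns = [
    AntiPattern("missing-patch-file", f"Patch file '{n}' is referenced but missing")
    for n in names
]

before = [issue_hash("nginx", ap, key_before) for ap in patterns]
after = [issue_hash("nginx", ap, key_after) for ap in patterns]

print(len(set(before)))  # 2: the first two descriptions collide
print(len(set(after)))   # 3: every patch file gets its own hash
```

One detail worth noting: `str.replace('.patch', '')` removes every occurrence of `.patch`, not only the trailing one. If filenames with `.patch` in the middle are possible, `str.removesuffix('.patch')` (Python 3.9+) might be a safer choice.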
