-
Couldn't load subscription status.
- Fork 2
chore(report): add PR diff report generator #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit adds a Python script that generates much more verbose PR Diff.
| # Fallback if exact match fails | ||
| file_path = lines[0].split()[2][2:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an expected scenario? Or should we log something / panic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right - we should panic if it doesn't match.
|
|
||
| class Hunk: | ||
| def __init__(self, text: str): | ||
| self.text = text.strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I followed everything correctly, text here begins with a hunk header like: @@ -5,3 +5,6 @@. What happens if two diffs are identical but (for example due do the PRs being based on different branches) are applied on different lines of the files? Would that mistakenly count as a different diff then? Or have I missed something?
Other than that, using the md5 checksum should work fine as a strategy to compare hunks, as long as the two diffs are generated with the same number of lines of context around the changes, which appears to be 3 (non configurable) for gh pr diff.
I'm wondering if we should limit the comparison to the actual changes, similarly to what was happening before:
Lines 79 to 80 in 3785a2d
| pr1NoCtxDiff=$(echo "$pr1Diff" | grep -v '^[^+-]') | |
| pr2NoCtxDiff=$(echo "$pr2Diff" | grep -v '^[^+-]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct - however I'm not sure if we could reliably tell if given diff is the same or not. There certainly are the cases where "left" hunk is the same as the "right" hunk but only line numbers differ - but can we be sure in that case that these should be aligned? I think there are two scenarios:
- Some lines were added at the top of "right" hunk so the diff is the same but line numbers differ.
- The "right" hunk appears in a completely different place in the file and results in different execution flow.
I'm not sure how to distinguish between these two 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either at the moment. But this seems to happen quite often, especially when comparing backports, so we really need to find a solution. Otherwise, many diffs will be marked as different even when they (logically) aren't.
This change makes it easier to identify identical diffs (100% similarity cases). However, I'm worried that false positives would create confusion, since the similarity score would appear "wrong".
The current code might be vulnerable to 2 if I remember correctly. I think - maybe - that if the purpose is to validate that nothing was missed/forgotten between two PR, it could be good enough to check that there are no missing hunks, and trust the developers that they were fit in the right place. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - we can tackle the 2 a bit better. I'm experimenting with the algorithm. I've added hunk similarity comparison based on + / - lines. Here's an example:
Kong/kong#14756 (comment)
What do you think?
We might also add sort of fuzzy matching but I'm not sure if that won't hinder fidelity of the score 🤔
This commit adds a Python script that generates
much more verbose PR Diff.
The results of these reports are presented in Kong's repository: