Tracking: match git diff slider algorithm more closely in gix-diff #2308

@cruessler

Description

This is a tracking issue related to what’s known as the “diff slider problem”. When calculating the diff between two versions of a file, a hunk can be ambiguous with respect to where it starts and ends. Consider the following example:

 fn test_a_different_thing() {
 }
 
+#[test]
+fn test_something() {
+}
+
 #[test]
 fn test_something_else() {
 } 

The following diff is, technically speaking, equally valid, even though human readers most likely would prefer the above diff.

 fn test_a_different_thing() {
 }
 
 #[test]
+fn test_something() {
+}
+
+#[test]
 fn test_something_else() {
 } 

gix-diff relies on imara-diff 0.1 under the hood for its diffing algorithm. In a significant number of cases, though, imara-diff 0.1 and git diff yield different results because they employ different heuristics for moving ambiguous diff hunks up or down.

There is an experimental feature flag in gix-diff (blob-experimental) that uses imara-diff 0.2. imara-diff 0.2 comes closer to what git diff returns, though it still doesn’t match its output in all cases.

We assume that most people would expect gix-diff to yield the same results as git diff. Based on this assumption, we want to track progress towards closing the gap between gix-diff and git diff with respect to slider placement.

Tasks

  • Extract a set of baseline tests for use in gix-diff/tests
  • Document how closely imara-diff matches git diff
  • Port git diff postprocessing heuristic for use with imara-diff 0.2

Sources

https://github.com/mhagger/diff-slider-tools

Related

Labels: C-tracking-issue (an issue to track the progress of multiple PRs or issues)