Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A few questions :D #2

Open
lukesandberg opened this issue Apr 15, 2024 · 0 comments
Open

A few questions :D #2

lukesandberg opened this issue Apr 15, 2024 · 0 comments

Comments

@lukesandberg
Copy link

Thanks for the blog post and repository, they were some interesting food for thought and i appreciated the unique contributions (circular buffer and restricting the k range)

I strongly suspect that the reason these optimizations are not very effective is simply because they only trigger under 'worst case' conditions. e.g. -(D - 2*max(0, D-M) only evaluates to something other than -D when D>M but the fundamental hypothsis of software like diff is that D is very small. When D is close to M (or N), this means that the meyers algorithm is quadratic. Software like git diff simply avoids ever exploring that muche of the edit space and switches to a different heuristic, or splits on a suboptimal but 'good enough diagonal chain (which they confusingly call a snake). diffutils apparently has a similar trick.

The modular arithmetic will add a non trivial overhead and possibly interfere with prefetching logic leading to more cache misses. Furthermore this would only tend to trigger when D > delta, such a condition isn't suprising, but allocating a large array that you don't always access (e.g. when D is small) is also very efficient on modern hardware and operating systems due to virtual memory.

I would hazard a guess that this is the reason the MaxRSS doesn't behave as you expected in your implementation, you end up pulling in more pages due to the wraparound.

Thanks again for the article and repo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant