Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

entrymode canonicalization #1260

Merged
merged 2 commits into from
Jan 22, 2024
Merged

entrymode canonicalization #1260

merged 2 commits into from
Jan 22, 2024

Conversation

Byron
Copy link
Member

@Byron Byron commented Jan 22, 2024

Fixes #1259

This means that mode.kind() and mode.is_* will also work correctly if the underlying mode wasn't canonicalized, which could happen when looking at trees that were stored with very old versions of Git.

This means that `mode.kind()` and `mode.is_*` will also work correctly
if the underlying mode wasn't canonicalized, which could happen when
looking at trees that were stored with very old versions of Git.
@Byron
Copy link
Member Author

Byron commented Jan 22, 2024

@xmo-odoo Please take a look at the fix, do you think that works for #1259?

This helps testing for one, but might also be useful in other contexts.
After all, this is a plumbing crate.
@xmo-odoo
Copy link
Contributor

@Byron it seems to reproduce the semantics of canon_mode. I think I have more notes on that morass on my personal machine so if I don't forget I can check that when I clock out to be sure I'm not forgetting some bits.

@Byron
Copy link
Member Author

Byron commented Jan 22, 2024

Alright, I will hold this PR for now while it awaits your review. Thanks for your help with this.

@xmo-odoo
Copy link
Contributor

xmo-odoo commented Jan 22, 2024

Sure thing, thanks as always for the quick turnaround.

FWIW a few months back when digging around this I found a bunch of neutered checks in git's mode handling (https://lore.kernel.org/all/[email protected]/T/), I believe these have since been restored but they're likely opt-in so as not to block existing non-canonical objects. I don't know if you'd already considered / found these issues (the kinda textual nature of git makes for a lot of lurking bugs and inconsistencies, it's pretty maddening, I also discovered recently that at one point git had an experimental binary loose object header patterned after pack entry headers, but that ended up being dropped).

Some of the issues could occur for legacy reasons although the only one I remember was around permissions handling (a very old version of git could generate a third blob permission set1), the entry issues I'm not sure were ever generated by git but third party implementation could have run with them as git did not complain / fault.

Footnotes

  1. ah it's listed in the mailing list message and documented for fsck --strict: some very old blobs apparently have the permissions 664/775 (g+w set), something along those lines

@Byron
Copy link
Member Author

Byron commented Jan 22, 2024

I also discovered recently that at one point git had an experimental binary loose object header patterned after pack entry headers, but that ended up being dropped).

Oh that's interesting! I never saw it, and hope that was never released to avoid having to deal with this new loose object format 😅.

[..] the kinda textual nature of git makes for a lot of lurking bugs and inconsistencies, it's pretty maddening, [..]

Generally, gix has to be just as compatible, even if it's unpleasant. There are options around having a 'strict' mode though if there is demand, of course, but even if there was right now that would have to be postponed to the future.

@xmo-odoo
Copy link
Contributor

I also discovered recently that at one point git had an experimental binary loose object header patterned after pack entry headers, but that ended up being dropped).

Oh that's interesting! I never saw it, and hope that was never released to avoid having to deal with this new loose object format 😅.

I only know of it because libgit2 has a codepath to parse them so I'd assume it did leak in the wild at one point. However the commit originally adding it is not exactly informative. Seems reasonable to ignore it I think.

@Byron
Copy link
Member Author

Byron commented Jan 22, 2024

Actually, it fits my workflow better to merge this, and instead while reviewing you could create a new PR with changes should you find anything.

Thanks for your understanding.

@Byron Byron merged commit 35d2083 into main Jan 22, 2024
18 checks passed
@Byron
Copy link
Member Author

Byron commented Jan 22, 2024

I also discovered recently that at one point git had an experimental binary loose object header patterned after pack entry headers, but that ended up being dropped).

Oh that's interesting! I never saw it, and hope that was never released to avoid having to deal with this new loose object format 😅.

I only know of it because libgit2 has a codepath to parse them so I'd assume it did leak in the wild at one point. However the commit originally adding it is not exactly informative. Seems reasonable to ignore it I think.

Thanks for digging this out, very interesting indeed!

And I agree, it's probably fine to just let it be forgotten and assume that no repository that uses these objects is still around. If that changes, that is somebody raises an issue with such an object as example, I'd be happy to implement it though - let's see :D.

@xmo-odoo
Copy link
Contributor

Actually, it fits my workflow better to merge this, and instead while reviewing you could create a new PR with changes should you find anything.

Thanks for your understanding.

Sure thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Discretisation of EntryMode to EntryKind is almost certainly incorrect
2 participants