Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: consider non-files as pruned officially. #1730

Merged
merged 3 commits into from
Dec 20, 2024
Merged

fix!: consider non-files as pruned officially. #1730

merged 3 commits into from
Dec 20, 2024

Conversation

Byron
Copy link
Member

@Byron Byron commented Dec 19, 2024

Fixes #1729.

Tasks

  • experiment with NonFile + Pruned
  • change it to NoneFile + default status handling

I think the learning is that NonFile is a file type downstream has to deal with officially, just pruning these leads to unexpected results.

Git will officially say that it won't diff or read or add non-files, and to do that, such entries must not be pruned, but they must be part of the normal entry handling that should always or most of the time reject non-files.

Previously, these were misclassified as `File`, which can lead to blocking applications
which get stuck reading pipes.

Now the downstream is forced to deal with the possibility that the item at hand isn't a file,
to do application-specific things.
@Byron Byron marked this pull request as ready for review December 20, 2024 10:23
@Byron Byron enabled auto-merge December 20, 2024 10:26
@Byron Byron merged commit 6822689 into main Dec 20, 2024
20 checks passed
@Byron Byron deleted the fix-1729 branch December 20, 2024 14:15
Copy link
Member

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and I've verified that it fixes #1729 in gix clean, skipping over the files changed to pipes with panicking, as well as showing just the typechange in gix status output. To summarize:

$ rm README.md
$ mkfifo README.md
$ cargo run --bin=gix -- status
   Compiling gitoxide v0.39.0 (/home/ek/repos/gitoxide)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 6.57s
     Running `target/debug/gix status`
 08:51:43 status done 2.3K files in 0.02s (116.6K files/s)
  T README.md

head -> index isn't implemented yet
$ cargo run --bin=gix -- clean -n
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.51s
     Running `target/debug/gix clean -n`
Nothing to clean (Skipped 1 expendable entry - show with -x)

I wonder if there is a clearer name than NonFile. I haven't thought of one, though. Also, it seems to me that it may be feasible to commit generated archives that contain named pipes. I've commented below with my thinking about those things.

Comment on lines +29 to +32
/// Something that is not a file, like a named pipe or character device.
///
/// These can only exist in the filesystem.
NonFile,
Copy link
Member

@EliahKagan EliahKagan Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the best thing is to call filesystem entries of kinds that are not representable in Git repositories. In my experience, usually "file" is either construed broadly to include such entries well as directories and symbolic links, or construed narrowly to mean "regular file" and thus exclude directories and symbolic links.

I think it is unintuitive for regular files, directories, and symbolic links all not to be NonFiles, at the same time as FIFOs, sockets, character and block devices, and other more exotic kinds of filesystem entries are NonFiles (though at least one such kind of entry, a whiteout, is very intuitively a non-file).

I don't know if it's possible to name NonFile better--which is why I'm saying this instead of opening a PR to rename it. The documentation comment can probably be expanded to clarify fully what NonFile is everything other than, as well as to list the other common examples of NonFiles, and I do plan to open a PR for that. It may be that documentation in other places can be usefully expanded too; I'll look into that.

This would not be the first technical usage of "non-" with a non-literal meaning. For example, in git commit -am message, only commit is said to be a "non-option argument," even though the operand message is unambiguously an argument that is not an option argument. Sometimes this is hard to avoid, or maybe not worth avoiding.

However, this does seem to be an area where confusion is possible, or at least possible for me. For example, when I read the description and commit message in #1629, which quoted from a comment in an early version of Git a few weeks before Git began to track symbolic links, I took "non-regular files" to encompass symbolic links and was worried that the code there might wrongly remove support for them.

In fact, the code there deliberately preserved support for them (as did #1727, which also used that phrase). But the same version of the comment that Git developers used early on to describe a situation that encompassed the absence of symlink support was taken to encompass the presence of symlink support when cited.

Another possible source of confusion is that some kinds of NonFiles are distinguished from related entities by being files. For example, a FIFO (named pipe) differs from an anonymous pipe by being a file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to make this point of ambiguity so clear.

I'd definitely welcome a PR which changes the variant name and/or improves the documentation - right now it's definitely very brief.

Other also comes to mind as catch-all for everything that isn't the other variants. Technically that's a good fit as the logic literally does it like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #1734 for this. For now, I have not attempted to rename NonFile.

@@ -1,3 +1,4 @@
status_unchanged.tar
status_changed.tar
symlink_stack.tar
status_nonfile.tar
Copy link
Member

@EliahKagan EliahKagan Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could generated archives be committed for this?

I think all tests using fixtures this covers are run only only on Unix-like systems (so the Windows differences discussed in #1727 (review) would not be a problem).

With the tar command, a FIFO can be included in a .tar archive and is recreated when that archive is extracted. I'm not sure about other approaches to taring and untaring, but gix-testtools uses the tar crate. The tar crate seems to support FIFOs, which are listed here, whereas kinds of filesystem entries that don't tar, like sockets, are omitted from that enum.

(A test of the tar crate with FIFOs is run only on Linux, but I understand the comment there to mean that this is solely due to a CI limitation and not to any decreased ability to archive and extract them on other Unix-like systems.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, I ran into this locally and on MacOS it definitely didn't properly retain the pipe status of the non-files, or it couldn't recreate it. They ended up as normal files after extracting them from the tar archive.

Since the script doesn't do much, I think it's also fine to not have it cached.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a commit was pushed, that might still be accessible, that has the .tar file, then I could look at it to see if the FIFO didn't end up in it, or if the issue was extraction. (But if not, or if it's not convenient to find, then I could still possibly reproduce this, next time I have access to a macOS system.)

@Byron
Copy link
Member Author

Byron commented Dec 20, 2024

That's great to hear, thanks for checking!

I wonder if there is a clearer name than NonFile.

I was wondering the same thing, but also believe to remember that name in Git somewhere.

EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Dec 22, 2024
Discussed in:
GitoxideLabs#1730 (comment)

At least for now, they remain called `NonFile`s (and sometimes
referred to as "non-files" in text), but more specifically defined.
@EliahKagan
Copy link
Member

I was wondering the same thing, but also believe to remember that name in Git somewhere.

I didn't find that, or not yet. I have a bit more detail in the #1734 description, though that PR does not change the name.

@Byron
Copy link
Member Author

Byron commented Dec 22, 2024

It's true, I also couldn't find that NonFile name anymore. However, I found the alternative of Untrackable so fitting as to go with that instead.

EliahKagan

This comment was marked as off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gix clean panics if a tracked regular file is replaced with a FIFO
2 participants