-
-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When adding +x, get and set mode through the file descriptor #1803
Conversation
When executable permissions were set on a file being checked out, and when it was done after the file was already created, this was done on its path, using `chmod` (via a higher level abstraction). But we have an open `File` object for it already, at that point. This uses `fchmod` on the existing open file descriptor (obtained from the `File` object), instead of `chmod` on the path. (Since GitoxideLabs#1764, the file mode has also been read first, to figure out what new mode to set it to for +x. That uses `lstat` (via a higher level abstraction). The change here is therefore really only half of what should be done. To complete it, and also to avoid new problems due to a new inconsistency, that should be done with `fstat` on the file descriptor instead of `lstat` on the path. That will be done in a directly fortcoming commit -- after portability issues in the type of the `st_mode` field, which is not actually `u32` on all systems, are examined.)
The previous conversion worked on Linux but failed on macOS where modes, as represented in system data structures, are 16-bit rather than 32-bit. The previous commit broke builds on macOS (and possibly other OSes not currently tested on CI), which this should fix. This commit also does some other refactorings: - Simplify the way conversions are represented. - Express with `expect` that, by this point, there are no unknown bits, rather than doing something that would silently preserve or silently discard them. There would have to be a bug (and probably in `gix-worktree-state` itself) for that not to be the case. - Clarify the TODO comment, and also weaken it since there might be some reason for `set_mode_executable` to keep using `Permissions` in some way.
While replacing `into()` with `rustix::fs::Mode::from_bits(...)` helped in expressing (and enforcing) the desired semantics for bits that fit in `mode_t` but whose meaning is unrecognized, it did not actually make the code build on macOS, and probably was not enough to improve portability to any other systems where `mode_t` is not `u32`. This commit adds an explicit checked conversion of the value we get from calling `mode()` on a `Permissions` object, which is aways `u32`, into whatever type `mode_t` is defined as, which is `u32` on most systems but `u16` on macOS and possibly others.
- Make the conditional `unused_variable` suppression specific to the `set_executable_after_creation` parameter, which is all that needs it. This also makes it more immediately clear that nothing along the lines of `chmod +x` is done (or relevant) on Windows. - Allow `useless_conversion` for the conversion from `u32` to `rustix::fs::RawMode`. The `RawMode` type is what `rustix::fs` calls `mode_t` as used for the `st_mode` field of a `stat` structure. On most operating systems, it is `u32`, but this is not guaranteed by POSIX (which does not even guarantee that it is unsigned). It is `u16` at least on macOS and possibly other systems, and I am unsure if there are systems this code can run on where it is some other type besides `u16` or `u32`. For now, this does not attempt to make that suppression conditional, even though it is only needed when `RawMode` is already `u32`. I don't if there is a good way to express the condition under which it should apply. If there is, then it could be made conditional or, if it is truly reliable, then the conversion itself could be made conditional (though I suspect it is omitted when not needed in release builds, by optimization).
This uses `fstat` rather than a method that delegates to `lstat`, to get the current mode bits of the still-open file using the open file descriptor (obtained from the `File` object) rather than opening the path, which could be a different file in the case of concurrent modification by code outside gitoxide. This is half of the change to using the file descriptor rather than the path for the operations that adjust the mode by adding executable bitsi. (As before, they are added for whoever already had read bits, and we remove setuid, setgid, and sticky bits, so that +x does not cause a security problem or other potentially unexpected effects.) The other half of this change was already done, appearing in a recent `change:` commit. It consisted of setting the mode using `fchmod` instead of a method that delegated to `chmod`. With only that change, the overall effect would be worse, since possible race conditions could introduce even greater inconsistency. With both changes, the effect is consistent. Race conditions can still occur, but their effect is probably less severe than before either of the changes were made. One of the ways this may be less subject to race conditions is that, because neither obtaining the old mode nor setting the new one uses a path, but only the open file descriptor, an unexpected typechange will not occur: having opened a regular file, the open file will not turn into a directory, symlink, or anything else, *as seen by* the open file descriptor. The file may be moved or replaced, but the open file descriptor "follows" the move and does not alias the separate file that now has the original path. Because we only intend to add executable bits to regular files, it is no longer necessary to handle this as a race condition and keep going. Attempting to add +x to any other type of thing should now only be possible if there is a bug in the caller. So the check for that, and the associated Option<...> logic used to keep things testable, is not needed. This still does the check, but panics if the file descriptor does not represent a regular file. The tests of this are mostly removed, but two tests that it does panic for the kinds of non-regular files we are most likely to encounter are added. (It may eventually make sense check this and panic only in debug builds, or not at all.) There are now two `fstat` calls: the one that is added to check the `st_mode` to figure out what to pass to `fchmod`, and the preexisting higher level invocation (that delegates to `fstat`) done just before closing the file, to update `entry.stat`. Using the same call for both and accounting for any effect of `fchmod` might be more complicated, but more importantly, it does not seem like it would be correct. If attempting to set the permissions reports success but has an unexpected effect, perhaps due to restrictions of the underlying filesystem or how it is mounted (which we may not be guarnateed to have detected in a probe to decide if +x is supported), we would want the real stat. Because this no longer uses `Permissions` and `PermissionsExt` facilities to check the mode before adding executable bits or to add the executable bits, it also includes some refactoring that is possible now that they are not being used.
This extracts the logic that calls `fstat`, transforms the mode with `let_readers_execute`, and writes it back through the same file descriptor with `fchmod`, introducing a new helper function, `set_executable`, for it. The reason to do this is that it makes the issue of what kind of `Error` appears in the returned `Result` clearer. This may be slightly clearer to humans. It's significnatly clearer to the Rust compiler, allowing both `map_err(std::io::Error::from)` calls to be eliminated without having to write anything confusing or inelegant. (This is separate from the reason the `let_readers_execute` function, which does not access the filesystem, is its own function. That is so it's easy to test the mode transformation logic.)
In `finalize_entry` and its callers, `set_executable_after_creation` was previously an `Option<&Path>`, because `chmod` was called (via a higher level abstraction) on the path to make the file executable. From GitoxideLabs#1764, `lstat` (via a higher level abstraction) was likewise called on the path to find out what its old mode was, to figure out what mode to set to achieve +x. Now that the `lstat` is replaced with `fstat`, and the `chmod` is replaced with `fchmod`, figuring out how to set +x and setting it are done entirely on the file descriptor, with the path unused. It turns out that this was the only use of that path in that context. Accordingly, this makes `set_executable_after_creation` a `bool` in both `finalize_entry` and its callers, and removes code that was only used for handling that path. This includes eliminating the `rela_path_as_path` variable in `checkout::chunk::process_delayed_filter_results`, which was apparently only used for this purpose. (All writes and reads on it were for preparing and passing a path to `finalize_entry`. The `finalize_entry` function had only used it to set the file executable, and had documented that this was its purpose.)
5372911
to
ee7b10c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for the fix, this is great work!
There is one other area I suggest giving special attention when reviewing this PR, though I very much doubt it is a problem: I was surprised that the
rela_path_as_path
variable introduced near the top ofset_executable_after_creation
was actually only used for arranging for executable permissions to be added. Given the way the code was written, I wonder if it was intended to use it for more than that.
I think this was just me struggling to do what I wanted as I was missing the rustix
functions to work on an open file handle.
/// | ||
/// This returns only mode bits, not file type. The return value can be passed to chmod or fchmod. | ||
#[cfg(unix)] | ||
fn let_readers_execute(mut raw_mode: rustix::fs::RawMode) -> rustix::fs::Mode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a great idea to factor this into its own, documented function!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This let_readers_execute
function is actually not really new in this PR, though its documentation comment is. This function was renamed (in de939de) from set_mode_executable
, and also changed to operate on different types. I had introduced set_mode_executable
in bf33d2b (#1764) to be able to test this logic without actually doing any operations on the filesystem.
In contrast, the related set_executable
function (which calls let_readers_execute
between its fstat
and fchmod
calls) is new in this PR. I introduced it in fc82e78, originally to make ?
seamlessly work, rather than having to write .map_err(std::io::Error::from)
, but my feeling is that having it separate from finalize_entry
may improve readability slightly even beyond that.
I wasn't sure what the best name would be for set_executable
. I also considered set_file_executable
, but it seemed to me that would still not be clearer with respect to the important distinction between a file descriptor and its path, and I went with set_executable
.
This replaces explicit `fstat` and `fchmod` calls (via `rustix`) with `File::metadata` and `File::set_permissions`, respectively. In practice, on Unix-like systems: - `File::metadata` either: * calls `fstat`, or * calls `statx` in a way that causes it to operate similarly to `fstat` (this is used on Linux, in versions with `statx`). This is not explicitly documented, though calling `stat` or `lstat`, or calling `statx` in a way that would cause it to behave like `stat` or `lstat` rather than `fstat`, would not preserve the documented behavior of `File::metadata`, which operates on a `File` and does not take a path. - `File::set_permissions` calls `fchmod`. This is explicitly documented and `fchmod` is listed as an alias for documentation purposes. But it is noted as an implementation detail that may change without warning. However, calling `chmod` or `lchmod` would not preserve the documented behavior of `File::set_permissions`, which (like `File::metadata`) operates on a file and does not take a path. While these details can, in principle, change without warning, per https://doc.rust-lang.org/stable/std/io/index.html#platform-specific-behavior, to preserve the documented semantics of operating on the file referenced by the file descriptor, it seems like the behavior would still have to be correct for our use. In our use here, what matters is that we use the file descriptor and not the path. The change in this commit intends to maintain the effect of GitoxideLabs#1803 without requiring `gix-worktree-state` to depend directly on `rustix`, and to benefit from the standard library's slightly higher level interface where modes are treated as `u32` even on operating systems where they are different (e.g. `u16` on macOS). Importantly, this continues to differ from the behavior before GitoxideLabs#1803 of using functions that operated on paths. The current general correspondence between Rust `std`, POSIX functions, and `statx`, where the rightmost three columns describe `statx` calls, is: | std::fs::* | POSIX | fd | path | flags | |------------------|-------|----------|------|---------------------| | metadata | stat | AT_FDCWD | path | | | symlink_metadata | lstat | AT_FDCWD | path | AT_SYMLINK_NOFOLLOW | | File::metadata | fstat | file | "" | AT_EMPTY_PATH | It may be that some uses of `rustix::fs` can be similarly replaced in `gix_fs`, but this commit does not include any such changes.
This replaces explicit `fstat` and `fchmod` calls (via `rustix`) with `File::metadata` and `File::set_permissions`, respectively. The change here is confined to `gix-worktree-state` and, more specifically, to the operation of checking the mode of an open file and setting a new mode based on it with some executable bits added. In practice, currently, on Unix-like systems: - `File::metadata` either: * calls `fstat`, or * calls `statx` in a way that causes it to operate similarly to `fstat` (this is used on Linux, in versions with `statx`). This is not explicitly documented, though calling `stat` or `lstat`, or calling `statx` in a way that would cause it to behave like `stat` or `lstat` rather than `fstat`, would not preserve the documented behavior of `File::metadata`, which operates on a `File` and does not take a path. - `File::set_permissions` calls `fchmod`. This is explicitly documented and `fchmod` is listed as an alias for documentation purposes. But it is noted as an implementation detail that may change without warning. However, calling `chmod` or `lchmod` would not preserve the documented behavior of `File::set_permissions`, which (like `File::metadata`) operates on a `File` and does not take a path. While these details can, in principle, change without warning, per https://doc.rust-lang.org/stable/std/io/index.html#platform-specific-behavior, it seems that, in preserving the documented semantics of operating on the file referenced by the file descriptor (rather than a path), the behavior would still have to be correct for our use here. The change in this commit intends to maintain the effect of GitoxideLabs#1803 with two minor improvements for maintainability and clarity: - No longer require `gix-worktree-state` to depend directly on `rustix`. (This is minor because it still depends transitively on it through `gix-fs`, though some uses of `rustix::fs` in `gix-fs` might likewise be possible to replace in the future.) - Gain the standard library's slightly higher level interface where modes are treated as `u32` even on operating systems where they are different (e.g. `u16` on macOS). This removes operations that do not correspond to what the code is conceptually doing. It also lets the function that computes the new mode from the old mode no longer depend on a type that differs across Unix-like targets. Importantly, this continues to differ from the pre-GitoxideLabs#1803 behavior of using functions that operated on paths. For reading metadata on a file on Unix-like systems, the current general correspondence between Rust `std`, POSIX functions, and `statx`, where the rightmost three columns pertain to how `statx` is called, is: | std::fs::* | POSIX | fd | path | *distinctive* flags | |------------------|-------|----------|------|---------------------| | metadata | stat | AT_FDCWD | path | (none) | | symlink_metadata | lstat | AT_FDCWD | path | AT_SYMLINK_NOFOLLOW | | File::metadata | fstat | file | "" | AT_EMPTY_PATH | For writing metadata on a file on Unix-like systems, the current correspondence between Rust `std` and POSIX functions is: | std::fs::* | POSIX | |-----------------------| -------| | set_permissions | chmod | | (none) | lchmod | | File::set_permissions | fchmod | It may be that some uses of `rustix::fs` facilities can be similarly replaced in `gix_fs`, but this commit does not include any such changes.
This replaces explicit `fstat` and `fchmod` calls (via `rustix`) with `File::metadata` and `File::set_permissions`, respectively. The change here is confined to `gix-worktree-state` and, more specifically, to the operation of checking the mode of an open file and setting a new mode based on it with some executable bits added. In practice, currently, on Unix-like systems: - `File::metadata` either: * calls `fstat`, or * calls `statx` in a way that causes it to operate similarly to `fstat` (this is used on Linux, in versions with `statx`). This is not explicitly documented, though calling `stat` or `lstat`, or calling `statx` in a way that would cause it to behave like `stat` or `lstat` rather than `fstat`, would not preserve the documented behavior of `File::metadata`, which operates on a `File` and does not take a path. - `File::set_permissions` calls `fchmod`. This is explicitly documented and `fchmod` is listed as an alias for documentation purposes. But it is noted as an implementation detail that may change without warning. However, calling `chmod` or `lchmod` would not preserve the documented behavior of `File::set_permissions`, which (like `File::metadata`) operates on a `File` and does not take a path. While these details can, in principle, change without warning, per https://doc.rust-lang.org/stable/std/io/index.html#platform-specific-behavior, it seems that, in preserving the documented semantics of operating on the file referenced by the file descriptor (rather than a path), the behavior would still have to be correct for our use here. The change in this commit intends to maintain the effect of GitoxideLabs#1803 with two minor improvements for maintainability and clarity: - No longer require `gix-worktree-state` to depend directly on `rustix`. (This is minor because it still depends transitively on it through `gix-fs`, though some uses of `rustix::fs` in `gix-fs` might likewise be possible to replace in the future.) - Use the standard library's slightly higher level interface where modes are treated as `u32` even on operating systems where they are different (e.g. `u16` on macOS). This removes operations that do not correspond to what the code is conceptually doing. It also lets the function that computes the new mode from the old mode no longer depend on a type that differs across Unix-like targets. Importantly, this continues to differ from the pre-GitoxideLabs#1803 behavior of using functions that operated on paths. For reading metadata on a file on Unix-like systems, the current general correspondence between Rust `std`, POSIX functions, and `statx`, where the rightmost three columns pertain to how `statx` is called, is: | std::fs::* | POSIX | fd | path | *distinctive* flags | |------------------|-------|----------|------|---------------------| | metadata | stat | AT_FDCWD | path | (none) | | symlink_metadata | lstat | AT_FDCWD | path | AT_SYMLINK_NOFOLLOW | | File::metadata | fstat | file | "" | AT_EMPTY_PATH | For writing metadata on a file on Unix-like systems, the current correspondence between Rust `std` and POSIX functions is: | std::fs::* | POSIX | |-----------------------| -------| | set_permissions | chmod | | (none) | lchmod | | File::set_permissions | fchmod | It may be that some uses of `rustix::fs` facilities can be similarly replaced in `gix_fs`, but this commit does not include any such changes.
Fixes #1786
This changes
finalize_entry
so it uses the open file descriptor, rather than the path, to read the old mode and write the new mode. (This is the PR mentioned in #1786 (comment).)How it worked before
Previously:
The old mode was read using
lstat
, called indirectly through the higher-level functionstd::fs::symlink_metadata
.gitoxide/gix-worktree-state/src/checkout/entry.rs
Line 288 in 02efddd
The new mode was set using
chmod
, called indirectly through the higher-level functionstd::fs::set_permissions
.gitoxide/gix-worktree-state/src/checkout/entry.rs
Line 290 in 02efddd
It is documented to use
chmod
, which may not be ideal, since if we are to continue to operate on a path here,lchmod
is probably what we want.How it would work now
With the changes in this PR,
gix-worktree-state
takes a direct dependency onrustix
. I think this may be okay sincegix-worktree-state
already depends ongix-index
, which depends onrustix
. (I used the same version range and features as were used there.)It uses the
fstat
andfchmod
functions thatrustix
lightly wraps--as well as theRawMode
andMode
types to avoid having to handle conversions and casts on systems like macOS wheremode_t
is 16-bit--to implement the new behavior of operating on the open file descriptor instead of the path:The old mode is read using
fstat
:gitoxide/gix-worktree-state/src/checkout/entry.rs
Line 301 in 5372911
The new is set using
fchmod
:gitoxide/gix-worktree-state/src/checkout/entry.rs
Line 303 in 5372911
Caveat 1 (but maybe benefit?): We must not dereference symlinks
Because
fchmod
operates on a file descriptor, it changes the permissions of whatever file is open. If we opened this file through a symlink, then this may be the wrong thing to do. I believe that this will not happen, at least on a Unix-like system. We have, typically, opened this file to write to it. So we try hard not to open a symlink, and I think avoid this in a way that completely robust, at least on systems where we are setting executable permissions. But this is an area I recommend be considered and examined when reviewing this PR.See #1786 (comment) for more information on this and related issues, including an area where I think this PR may actually improve resilience to wrongly following symlinks.
Caveat 2 (minor): Was the path intended to be used for more?
There is one other area I suggest giving special attention when reviewing this PR, though I very much doubt it is a problem: I was surprised that the
rela_path_as_path
variable introduced near the top ofset_executable_after_creation
was actually only used for arranging for executable permissions to be added. Given the way the code was written, I wonder if it was intended to use it for more than that.There is some more information in the commit messages. In particular, 588e6b0 presents the curious
rela_path_as_path
situation.