-
-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
various improvements #1800
Merged
Merged
various improvements #1800
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this cause code targeting the previous API and calling
with_shell
without actually requiring a shell to be used to continue to build and appear to work, but to work differently by using a shell in circumstances where it wasn't before? If so, I wonder if that might introduce bugs, or worsen existing bugs, in code that usesgix-command
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was aware of this (and agree that it could cause unwanted effects downstream) and probably didn't even mark it as
fix:
which is what I thought about when doing so.with_shell()
now actually does what it says, as opposed to before when it surprisingly only used a shell when the command seemed to need it. Especially on Windows this matters so I really thought of it as a fix for that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is true that it could fix bugs where people assumed
use_shell
had a more intuitive meaning. An alternative could be to abandon the method nameuse_shell
altogether and call all the methods something different. That would break anything that should be broken. It might also break things that need not break.On the other hand, I recognize it's possible to take that too far and accumulate lots of names that one does use in the future even when they would be the best description available. So I don't know what's best.
In this project itself, I think the changed methods may be useful for future bug investigation. I suspect we have some invocations that treat the command name as a shell script when, to match the expectations set by Git's behavior and the guarantees in Git's documentation, it would be necessary to treat the command as a shell script when it was obtained from some sources, while being necessary to treat it not as a shell script (i.e. just as a path) when it was obtained from other sources.
The specific case of this that I most strongly suspect the SSH client. When obtained from the environment
GIT_SSH
, it must not be run or parsed as a shell script (regardless of its value). When obtained fromGIT_SSH_COMMAND
, or the value of thecore.sshCommand
configuration variable, it is a shell script and must either be run as one or (in sufficiently simple cases) parsed as one. These requirements are explicated in git(1).I have not checked recently--and maybe not at all, in terms of running experiments to verify it--that we are doing the wrong thing here. When I have a chance to test it, I'll open an issue if we are. I recall it having seemed so when I was working on #1342, and that it seemed like a contributing factor to the bug reported in #1432 and discussed further in connection with some related topics in #1585 (comment). This was the suspected aspect of #1432 highlighted in #1432 (comment).
If we don't yet have an issue for it, then I'll try to open one once I get around to checking the effect in a way that facilitates direct comparison to Git behavior. Some invocations of the SSH client occur only under some circumstances; for example, it is only invoked to figure out what kind of SSH client it is if that cannot be inferred from its name (and is not overridden by configuration), and I think that is one of the places where
GIT_SSH
may be being wrongly treated likeGIT_SSH_COMMAND
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bringing this up!
I was aware of the difference between
GIT_SSH
andGIT_SSH_COMMAND
at the time and should have implemented it correctly. Due to me renamingwith_shell()
to its new name, in this codebase the semantics are preserved, too, so I'd think there is no issue, still.However, there I support and value an investigation to assure there aren't any subtle bugs around that in the hopes that this also helps the API of
gix-command
to mature. After all, it's vital to make it as bullet-proof and unambiguous to use as possible.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much agree that this PR introduces no new incorrect behavior--related to the distinction between
GIT_SSH
andGIT_SSH_COMMAND
, or otherwise. But I think there is probably a preexisting issue, and if so, it remains intact. (One of the ways this PR is significant to that is that it calls attention to such possible things, which is a good thing about this PR, not a bad thing.)On Windows,
\
is the primary directory separator, so it often appears in paths that are otherwise unremarkable, likeC:\Windows\System32\OpenSSH\ssh.exe
. But\
is also taken to mean that we must run a command in a shell if we are allowed to do so (since\
is a shell metacharacter). Specifically:The method now called
command_may_be_shell_script
is meant to conditionally refrain from planning to run unremarkable paths using a shell, but the presence of\
causes it to go ahead and plan that by setting theuse_shell
field totrue
:gitoxide/gix-command/src/lib.rs
Lines 108 to 110 in f3dc83b
When the
use_shell
field is set totrue
, using a shell is planned, but it may still be averted if theallow_manual_arg_splitting
field is set totrue
, which is its default value on Windows. However,\
is also one of the characters that suppresses the effect ofallow_manual_arg_splitting
:gitoxide/gix-command/src/lib.rs
Lines 222 to 234 in f3dc83b
Where, matching on
split_args
:gitoxide/gix-command/src/lib.rs
Lines 243 to 244 in f3dc83b
Therefore, if there are places where the function now called
command_may_be_shell_script
is called when it shouldn't be called, on Windows this will often have the effect of running a command in a shell that should not be run that way--this effect is more likely on Windows, due to\
being a common character in paths.To be clear, this behavior is not in any way exacerbated by the changes in this PR. My thinking was that, as far as this sort of thing goes, the changes here are a boon, because they make it easier to reason about what is going on.
However, I just now realize that there is actually a sense in which the changes in this PR exacerbate this. Before, because the method had the name
with_shell
, it conveyed the impression that a shell would be used in more circumstances than it was intended to convey. But as detailed above, it seems that a shell was also used, on Windows, in more circumstances than it was intended to be used. These partially "cancelled out": the impression it gave was, on Windows, closer to the truth than it probably is now.I don't suggest that as a reason to undo anything from this PR. Instead, the specific effect of
\
in a command on those two methods on Windows, in light of the special significance of\
on Windows, can modified, documented, or both; and any code in this project that callscommand_may_be_shell_script
when it shouldn't can be identified and fixed.As far as I can tell, in the main code path used to decide how to invoke an SSH client to actually connect to a remote, this is implemented perfectly. In
gix::repository::config::ssh_connect_options
, thefallback_active
flag keeps track of where the SSH client command is configured from:gitoxide/gix/src/repository/config/mod.rs
Lines 63 to 70 in f3dc83b
This is used to decide, I believe correctly, whether to disallow the use of a shell:
gitoxide/gix/src/repository/config/mod.rs
Lines 71 to 79 in f3dc83b
These options are ultimately passed to
gix_transport::client::blocking_io::ssh::connect
. In most cases, thisconnect
function arranges a scenario in which the SSH client will run once. This happens through the last thing thisconnect
function does, which to unconditionally callgix_transport::client::blocking_io::file::SpawnProcessOnDemand::new_ssh
. That call relates to actually using the SSH client to make a connection to a server and transfer data. It properly passes along thedisallow_shell
flag:gitoxide/gix-transport/src/client/blocking_io/ssh/mod.rs
Lines 136 to 144 in f3dc83b
However, before this
connect
function does that, it figures out how the SSH client will be invoked to do that. It checks if this is configured, and otherwise tries to ascertain what SSH implementation it is or behaves like. Under some conditions, it runs the SSH client in order to do this. I am unclear at this time on how often or exactly under what circumstances that happens. The logic is:gitoxide/gix-transport/src/client/blocking_io/ssh/mod.rs
Lines 112 to 113 in f3dc83b
But when it does it, it runs a command that it sets up like this:
gitoxide/gix-transport/src/client/blocking_io/ssh/mod.rs
Lines 115 to 125 in f3dc83b
(In #1342, I modified part of this, but not the part that is relevant to the use of a shell. I wondered why it was calling what was, back then, named
with_shell
, but this was not part of what was necessary to modify to achieve the goals of that PR.)That code unconditionally calls the
command_may_be_shell_script
method, and does not subsequently do anything to reverse its effects. It runs the command immediately. Specifically, the above expression is passed to:gitoxide/gix-transport/src/client/blocking_io/ssh/mod.rs
Line 114 in f3dc83b
And the subsequent code is:
gitoxide/gix-transport/src/client/blocking_io/ssh/mod.rs
Lines 127 to 132 in f3dc83b
Consequently, when
gix-transport
pre-runs an SSH client to figure out what kind of SSH client it is, it seems to me that it runs it in a way that allows a shell to be used and, combined with the aforementioned effect of\
suppressing the conditional shell-avoidance behaviors of bothcommand_may_be_shell_script
andallow_manual_arg_splitting
, this will often run it in a (Bourne-style) shell on Windows, where it will typically fail or otherwise malfunction because the shell treats\
as the escape character.(Separately from that, I am unclear on why
gix_transport::client::blocking_io::file::SpawnProcessOnDemand::new_local
always passes afalse
value ofssh_disallow_shell
. But since, if I understand correctly, that is not using SSH, my guess is that it is okay.)Edit: I've opened #1814, which would fix the second part of what I've described here--that is, the place in
gix-transport
wheredisallow_shell
should be followed but is not, when running the SSH client to figure out what implementation it is. I'm not sure how good shape that PR is in yet, though (as described there).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for sharing! I see now how this PR made it easier to realize that on Windows, the shell would actually run more often than it would on Unix.
Memory does not serve, unfortunately, but given the importance of not messing this up (i.e. URLs could be crafted to use the shell to execute arbitrary commands) I'd hope that was intentional. However, I'd think it's better to document why this is
false
given that shells ideally are avoided entirely.