`sh` quoting of program names on an opt-in basis

### Summary 💡

Possible tasks:

- [x] Document that `gix_command::Prepare` with `use_shell` does not quote (unless it will) ([#1800](https://github.com/GitoxideLabs/gitoxide/pull/1800))
- [x] Document which data in a `Prepare` instance are subject to shell expansions if unquoted ([#1800](https://github.com/GitoxideLabs/gitoxide/pull/1800))
- [x] Clarify the semantics of [`args`](https://github.com/GitoxideLabs/gitoxide/blob/97f85f78f4026203dca2b62ffe2174e2f028941e/gix-command/src/lib.rs#L25-L26) ([#1813](https://github.com/GitoxideLabs/gitoxide/pull/1813))
- [x] *Maybe:* When `use_shell` is `true`, let caller opt into POSIX shell style quoting ([#1801](https://github.com/GitoxideLabs/gitoxide/pull/1801))

It might make sense to split this into multiple issues, in that the first three items could be done by themselves, though they become more important and possibly also different if fourth is done.

#### When using a shell, the *how* varies based on the *why*

By default, `gix_command::Prepare` rightly does not use a shell. But it supports using a shell if requested. This shell is generally `sh` and assumed to have POSIX shell semantics. There are at least three reasons one might want to use a shell:

1. To run a command in the same way `git` would run it, where `git` is guaranteed to use a shell.
2. To run a command that is expressed as a single string, with arguments split how `sh` splits them.
3. To get a different environment that a shell is hoped to confer.

#### Reasons (1) and (2) are the most important

In the absence of (1), a shell must be used very carefully, and basically must never be used for running a command `git` documents not to use a shell or that can otherwise be assumed not to be run with a shell when `git` runs it. For example, if a command that should not run in a shell has a `\` in its name due that being a path separator on Windows, and it is run in a shell anyway, then that will turn into an escape character.

When (2) is the only reason to use a shell, and neither (1) nor (3) apply, or they apply but can be "optimized away," then shell-like argument splitting can be done without using a shell. `Prepare` currently does this on Windows, if it is it is allowed but not required to use a shell, i.e. if `use_shell` and `allow_manual_arg_splitting` were both set to `true`.

The feature suggested here is about making (3) easier and slightly more likely to be done correctly when done at all. But I mention (1) and (2) because I think they are the main uses of `use_shell`, and I recommend only pursuing this feature idea for (3) if it will not interfere with, complicate, or confuse (1) and (2).

#### Reason (3) is hard to get right

I think it is very hard to do (3) effectively and achieve the predicted effect, as discussed in #1752 and #1758. The `sh.exe` in Git for Windows probably does provide an environment that is different from the caller's in some ways that could help find or use some tools. But:

- Path interpretation could sometimes differ unexpectedly, since `sh.exe`/`bash.exe` support Cygwin-style paths. As one but not the only case of this, some Windows paths that `std::process::Command` supports (it supports nearly all Windows paths, including `\\?\`, `\\.\`, and even `\??\` paths) can be rejected or misinterpreted.
- It probably also differs in other ways that make things work *less* reliably. For example, if it and gitoxide are built for different architectures and one of them is 32-bit, then they may resolve the same paths to different locations when running some programs, due to the [file system redirector](https://learn.microsoft.com/en-us/windows/win32/winprog64/file-system-redirector), which I suspect may have been a factor in [#1432](https://github.com/GitoxideLabs/gitoxide/issues/1432) (which I do plan to revisit; I have not forgotten about [#1585 (comment)](https://github.com/GitoxideLabs/gitoxide/pull/1585#discussion_r1764341739)).
- Running something with `sh.exe`, or even with `bash.exe`, is not equivalent to using the interactive environment in a Git Bash environment. This is (mostly) because some environment customization is done through startup scripts that mainly run when a shell is used interactively.

Outside of Windows, I would expect nearly all of the difference between an environment using and not using a shell as an intermediary would be due to startup scripts that are unlikely to run, and probably should not be run, when using the shell in an automated, non-interactive way. This is because, unlike a Unix-style shell on Windows, a Unix-style on a Unix-like system is not doing anything extra to take an environment that is presumed to be non-Unix-like and make it Unix-like.

#### But we may be able to make (3) easier to get right on the occasion it is needed

If I understand correctly, the way `gix_command::Prepare` currently works when it uses an external shell process is that it assumes the shell is `sh`-like, invokes it with a `-c` argument followed by the value of the `command` field and, if `args` is nonempty and `$@` does not already appear in the `command` field, adds a space followed by `"$@"` to the command and passes the arguments as separate argument to the shell, so that they will be treated as positional parameters and accessible through `$1`, `$2`, `$3`, etc., and with `$@` and `$*`, with their usual semantics.

This assumes the command to be run should be subject to shell expansions, including expansions that might turn it into a command followed by some of its arguments, and then explicit `args` should not be supplied literally as positional parameters that the command can access. This seems generally reasonable when the goal of using a shell is (1) or (2), provided that the caller has considered all the effects of using a shell and understands exactly what is being passed to it. The first two proposed tasks ("Document ...") would help authors of code that use `gix_command`, or that use other code that uses `gix_command` and documents that it does so, so know what to expect.

But this may not be the right semantics for (3), where no expansions are wanted at all. Suppose (3) applies and (1) and (2) do not. Suppose further that using a shell has been confirmed in the specific use case to produce the desired environment (so the solution is not simply to abandon (3) as a goal or technique). Then the value of `command` should be quoted.

#### This seems feasible

`command` can only be quoted safely if enough is known about the shell language and if its quoting rules are simple enough.

Fortunately, POSIX-compatible `sh`, and most if not all modern Bourne-style shells, satisfy this requirement. For any string, so long as the string actually *can* be included in a script or `-c` operand (i.e., barring an encoding mismatch), the string can be quoted by replacing any preexisting occurrence of `'` with the literal sequence `'\''`, prepending a `'`, and appending a `'`.

This works for causing it to be passed as an argument, so long as it is properly separated from the command name and other arguments with whitespace. It also works for using it as the command name. For example, `C:\Users\O'Shaughnessy\foo bar.exe` can be quoted as `'C:\Users\O'\''Shaughnessy\foo bar.exe'`.

I don't think this should be done automatically. One reason it should not is that the caller may be quoting it already. But it might be useful as a feature that can be opted into.

Because--and so long as--`args` are not passed to the shell as part of text the shell considers a shell script, they should not be quoted and no option to quote them should be presented. But the semantics when doing this should be made clear, especially if this feature I am proposing is to be implemented. Furthermore, if this feature is implemented, then the circumstances under which quoting is added should be clearly distinguished from those under which it is not.

#### The process starts with the arguments, right?

It may be that the current summary of `args` will be understood correctly:

https://github.com/GitoxideLabs/gitoxide/blob/97f85f78f4026203dca2b62ffe2174e2f028941e/gix-command/src/lib.rs#L25-L26

However, I was momentarily tripped up by it, reading it to mean that the process has been spawned and then receives the arguments. This may be possible to fix while revising the docstring, or other revisions might make it clear automatically by describing exactly how the `args` are passed, or maybe this specific point about the wording being ambiguous was just an initial misreading on my part.

### Motivation 🔦

The commented discussions in #1752 and #1758 reveal (3) as a goal. Part of the view I expressed there is that whether, and how, this should be done, is probably higher level than gitoxide.

But even though I think that is largely true, and even though I think gitoxide itself should be especially reluctant to automatically use a shell for any operations when neither (1) nor (2) decisively applies, there does seem to be a fragment of *how* to do (3) that belongs in `gix_command`: helping the caller cause the correct string to be used as `command` depending on whether quoting is needed.

Another motivation is that I noticed what looked like a possible use case for this in https://github.com/gitbutlerapp/gitbutler/pull/6028. Note, however, that the only non-documentation change proposed in this feature request is opt-in quoting for `command` when `command` is interpreted as a single path. The quotes would be added and then, if `args` are present, a space and `"$@"` appended.

#### What this doesn't propose

This does not currently propose even opt-in quoting for arguments, because `"$@"` is already in place and seems sufficient to neutralize anything in them that would be treated specially by a shell if it appeared in text the shell considers a script or command.

This does not propose anything to add a `--` argument before `"$@"`, because whether it is needed or not is independent of whether a shell is used. More broadly, if present, it's just a literal `"--"`, so it can be included explicitly in `args` at least as clearly and accurately as if specified in any other way.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

`sh` quoting of program names on an opt-in basis #1799

Summary 💡

When using a shell, the how varies based on the why

Reasons (1) and (2) are the most important

Reason (3) is hard to get right

But we may be able to make (3) easier to get right on the occasion it is needed

This seems feasible

The process starts with the arguments, right?

Motivation 🔦

What this doesn't propose

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	/// The arguments to pass to the spawned process.
	pub args: Vec<OsString>,

Uh oh!

sh quoting of program names on an opt-in basis #1799

Description

Summary 💡

When using a shell, the how varies based on the why

Reasons (1) and (2) are the most important

Reason (3) is hard to get right

But we may be able to make (3) easier to get right on the occasion it is needed

This seems feasible

The process starts with the arguments, right?

Motivation 🔦

What this doesn't propose

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`sh` quoting of program names on an opt-in basis #1799