Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove TODOs about using path-slash to handle escapes #1841

Merged
merged 1 commit into from
Feb 16, 2025

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Feb 15, 2025

I considered opening an issue or discussion question about this, but I figured it would be more efficient to use a PR, which can be merged or not according to whether I'm correct to think the change these TODOs recommend should not be attempted.

Why I think those TODOs shouldn't be planned

The to_unix_separators and to_windows_separators functions in gix_path::convert had TODO comments saying they should use the path-slash crate "to handle escapes". These comments were added as part of e4f4c4b (#397) but the context there and in the broader related issue #301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and it seems like there is not a way to do it without substantially changing the interface of these conversion functions in gix-path, which currently take a single argument representing a path and return a single string-like value also representing a path. If escape sequences appear in the input to such a path conversion function, it would not have a way to know if they are meant literally or as escape sequences. (An analogous concern applies if a function is to add escape sequences in its return value; it would have no way to know if the caller expects them.)

Furthermore, while path-slash can convert some \ paths to use / instead, it does not appear to do anything related to handling escape sequences or distinguishing which occurrences of \ or any other character may be intended as part of an escape sequence. Its documentation does prominently mention that \ in escape sequences should not be converted to /:

On Unix-like OS, the path separator is /. So any conversion is not necessary. But on Windows, the file path separator is \, and needs to be replaced with / for converting the paths to "slash paths". Of course, \s used for escaping characters should not be replaced.

But it looks like the part about \ characters used for escaping is meant as advice on how and when to use path-slash, rather than meaning that path-slash would itself be able to distinguish between \ characters meant as directory separators and \ characters that perform quoting/escaping.

An alternative to removing the TODOs

One possible alternative to removing these TODOs would be to change them to express a plan to switch to path-slash, but for a different reason. If distinguishing \ is something these conversion functions should not do, and that path-slash does not do, then that would still support using path-slash, just for the reason of convenience rather than extra functionality.

However, I think we should probably also not switch to path-slash for this particular use at this time, for the reasons detailed in the part of #1840 (reply in thread) that describes some limitations of path-slash.

The `to_unix_separators` and `to_windows_separators` functions in
`gix_path::convert` had TODO comments saying they should use the
`path-slash` crate "to handle escapes". These comments were added
as part of e4f4c4b (GitoxideLabs#397) but the context there and in the broader
related issue GitoxideLabs#301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and
it seems like there is not a way to do it without substantially
changing the interface of these conversion functions in `gix-path`,
which currently take a single argument representing a path and
return a single string-like value also representing a path. If
escape sequences appear in the input to such a path conversion
function, it would not have a way to know if they are meant
literally or as escape sequences. (An analogous concern applies if
a function is to add escape sequences in its return value; it would
have no way to know if the caller expects them.)

Furthermore, while `path-slash` can convert some `\` paths to use
`/` instead, it does not appear to do anything related to handling
escape sequences or distinguishing which occurrences of `\` or any
other character may be intended as part of an escape sequence. Its
documentation (https://docs.rs/path-slash/latest/path_slash/) does
prominently mention that `\` in escape sequences should not be
converted to `/`:

> On Unix-like OS, the path separator is `/`. So any conversion is
> not necessary. But on Windows, the file path separator is `\`,
> and needs to be replaced with `/` for converting the paths to
> "slash paths". Of course, `\`s used for escaping characters
> should not be replaced.

But it looks like the part about `\` characters used for escaping
is meant as advice on how and when to use `path-slash`, rather than
meaning that `path-slash` would itself be able to distinguish
between `\` characters meant as directory separators and `\`
characters that perform quoting/escaping.
@Byron
Copy link
Member

Byron commented Feb 16, 2025

Thanks a lot!

Yes, let's remove the TODOs and wait for a concrete case that shows that a simple find/replace operation won't actually do it.

I considered opening an issue or discussion question about this, but I figured it would be more efficient to use a PR, which can be merged or not according to whether I'm correct to think the change these TODOs recommend should not be attempted.

I love that! In the worst case, a discussion becomes an issue becomes a PR, each time with context being repeated for the sake of completeness of each of these independent units. Instead, if a change is desired, it's usually easier to have a conversation around the change even if it's not fully fleshed out yet or might not even be merged in the end.

@Byron Byron merged commit ad7a94e into GitoxideLabs:main Feb 16, 2025
21 checks passed
@EliahKagan EliahKagan deleted the no-esc-todo branch February 16, 2025 14:25
@EliahKagan
Copy link
Member Author

Thanks. I sometimes struggle with that because a lot of discussions or issues I open could be PRs that might eventually make it into good shape to merge but probably would not--I could imagine a lot of PRs open at a time. I also worry about long-standing PRs dissuading other potential contributors from working on something. (I fear that may have happened in gitpython-developers/GitPython#1955.) But I also agree, at least for the recent #1840/#1842, which should've started out at least as an issue, and maybe even as a PR, since even if it were decided to keep --, a clarifying comment would be in order.

@Byron
Copy link
Member

Byron commented Feb 17, 2025

That's true, I didn't consider the potential long-livedness of such PRs which I am not a huge fan of at least while working more intensively on Gitoxide than I do now. What remains is the consideration if a change is intended, then it seems like a possibility to start with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants