Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new path API for os2 #4954

Merged
merged 8 commits into from
Mar 24, 2025
Merged

Add new path API for os2 #4954

merged 8 commits into from
Mar 24, 2025

Conversation

Feoramund
Copy link
Contributor

In order for os2 to replace os, filepath has to be decoupled from the package. This PR takes care of that by re-writing most of the main part of the core:path/filepath API directly into os2. (The glob-related API will come in a future PR that I want to spend more time testing.)

  • I completely rewrote how clean works to not depend on the old Lazy_Buffer structure and procs.
  • All of the procs now use system-specific internal procedures where applicable instead of when statements to keep things cleaner/more modular.
  • Documentation has been added.
  • The splitting API has been simplified.
    • Instead of split, base, short_stem, ext, long_ext, and dir, we now have split_path -> (dir, filename) and split_filename -> (base, ext). There is also split_filename_all to treat the first . extension separator as the splitting point.
  • The splitting API will no longer include a path separator in the directory when a filename is present. This means /usr/share will split into /usr and share, instead of /usr/ and share as was done with the filepath API.
  • rel has also been rewritten. Currently, it is invalid on case-sensitive filesystems (POSIX, for example), as it uses strings.equal_fold. The rewritten version no longer automatically runs clean on the arguments for the sake of performance, and this is documented as an argument requirement.

Here is the new API summary. Given that it lives directly inside os2, the procedure names have been made verbose enough to outline their path-specific functionality, following in line with the rest of os2's procedures being fluently readable.

  • are_paths_identical compares two paths for exactness (but not equivalency). This is to handle case sensitivity on different platforms.
  • clean_path replaces filepath.clean.
  • is_absolute_path replaces filepath.is_abs.
  • get_absolute_path replaces filepath.abs.
  • get_relative_path replaces filepath.rel.
  • split_path replaces filepath.split, filepath.dir, and filepath.base.
  • join_path replaces filepath.join.
  • split_filename, when used with the filename result of split_path, replaces stem and ext.
  • split_filename_all, when used with the filename result of split_path, replaces short_stem and long_ext.
  • join_filename complements split_filename_*, as it made sense to me to have a join if we have split.
  • split_path_list replaces filepath.split_list.

Note: split_filename_*, when given ".foo" will return ".foo", "" instead of "", "foo".

I reduced the various splitting mechanisms down to just a couple for the sake of clarity and performance, because if someone needs both the name and the extension split, it's easier to just index the separator once and return both sides of the split. You can discard the other if you don't need it.

I'm interested in hearing feedback about this API: if it's sensible and matches expectations, particularly with regard to trimming final separators on split directories and treating dot-prefixed filenames as entire names. It's a convention in the POSIX world that dot-prefixed files are treated as hidden (so it feels more natural to me that it is part of the filename), but I'm not sure how Windows users feel about this particular handling.

}
case:
// Copy the path element verbatim and add a separator.
intrinsics.mem_copy_non_overlapping(raw_data(buffer[buffer_i:]), raw_data(elem), len(elem))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this style isn't used? I know mem_copy_non_overlapping is going to be faster in this case, but it is a little more noiser.

copy(buffer[buffer_i:], elem)

See all other cases of this too.

Copy link
Member

@gingerBill gingerBill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the benefit of intrinsics.mem_copy_non_overlapping vs copy. However, everything else looks really really good!

@gingerBill gingerBill merged commit 4a595f9 into odin-lang:master Mar 24, 2025
7 checks passed
@Feoramund
Copy link
Contributor Author

Feoramund commented Mar 24, 2025

I am not sure about the benefit of intrinsics.mem_copy_non_overlapping vs copy. However, everything else looks really really good!

Initially I did this for speed, because I was under the assumption that memmove was creating an actual secondary buffer. I did some digging after reading your comments here, and I realized I mis-remembered a key point of the C standard which is that memmove works as if it's copying to a temporary buffer, but it doesn't have to. Sane implementations just check if the runs would overlap and copy backwards or forwards depending on that factor.

copy is completely fine and more readable.

@laytan laytan mentioned this pull request Feb 28, 2025
61 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants