gix-status improvements #1155

Merged
merged 4 commits into from
Dec 30, 2023

Byron
Member

@Byron Byron commented Dec 6, 2023

Based on #1106


diff-correctness → gix-status → gix reset


Improve gix status to the point where it's suitable for use in reset functionality.
This leads to a proper worktree reset implementation, and eventually to a high-level reset similar to the one git supports.

Architecture

The reason this PR deals quite a bit with gix status is that for a safe implementation of reset() we need to be sure that the files we would want to touch don't carry modifications and aren't untracked files. To know what would need to be done, we have to diff the current index with the target index. The resulting set of files to touch can then be used to look up information provided by git-status, like worktree modifications, index modifications, and untracked files, to know whether we can proceed. This is also where the reset modes would affect the outcome, i.e. what to change and how.

This is a very modular approach which facilitates testing and understanding of what would otherwise be a very complex algorithm. Having a set of changes as output also makes it possible to parallelize applying these changes one day.
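A minimal sketch of the flow described above, assuming both indices are simplified to maps from path to object id, and that status results are available as sets of modified and untracked paths. All names here (`Change`, `diff_indices`, `blocking_paths`) are hypothetical illustrations and not part of the gix API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// What a reset would have to do to a single path (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Change {
    Add,
    Remove,
    Update,
}

/// Diff two simplified indices (path -> object id) to learn which paths a reset would touch.
pub fn diff_indices(
    current: &BTreeMap<String, String>,
    target: &BTreeMap<String, String>,
) -> BTreeMap<String, Change> {
    let mut changes = BTreeMap::new();
    for (path, id) in target {
        match current.get(path) {
            // Only in the target: the file has to be created.
            None => {
                changes.insert(path.clone(), Change::Add);
            }
            // In both, but with different content: the file has to be rewritten.
            Some(current_id) if current_id != id => {
                changes.insert(path.clone(), Change::Update);
            }
            // Identical in both: nothing to do.
            Some(_) => {}
        }
    }
    for path in current.keys() {
        // Only in the current index: the file has to be removed.
        if !target.contains_key(path) {
            changes.insert(path.clone(), Change::Remove);
        }
    }
    changes
}

/// Paths the reset would touch that status reports as modified or untracked.
/// A safe reset would stop, or skip these entries, if this list is non-empty.
pub fn blocking_paths(
    changes: &BTreeMap<String, Change>,
    modified: &BTreeSet<String>,
    untracked: &BTreeSet<String>,
) -> Vec<String> {
    changes
        .keys()
        .filter(|path| modified.contains(*path) || untracked.contains(*path))
        .cloned()
        .collect()
}
```

Because the diff yields a plain set of changes, the safety check and the application of changes stay separate steps that can be tested on their own. A reset mode could then decide whether a non-empty result of `blocking_paths` aborts the operation or merely skips those entries.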

This leaves us in a situation where the current checkout() implementation wants to become a fastpath for situations where the reset involves an empty tree as source (i.e. create everything and overwrite local changes).

On the way to reset(), it's a valid choice to warm up to the matter by improving the current gix status implementation and assuring the correctness of what's there, which currently doesn't seem to be the case when compared with git status. Further, implementing gix status similarly to git status should be made possible.

Gix Status

  • gix-index with case-insensitive lookup (also: check if directory is present and what kind of dir it is)
  • sketch parsing and basic API for precious files (see technical document)
  • a way to obtain untracked files to learn if changes can be made.
    • Integrate with the untracked-file extension and provide a way to update it.
    • It looks like this wants to be directory traversal with index, untracked and ignored support, as well as empty-dir and .git dir/file handling
    • Should also include pathspec-based filtering, and looks like this is also useful for gix add then
    • ignore index untracked-cache and name-hash - these are optimizations. At the current time, we don't even write the untracked-cache back so it's OK to ignore.
    • special attention needs to be paid to case-insensitive file systems (and looking up their path in the index)
    • needs to deal with precompose-unicode settings
    • let's have statistics right away
    • let's do a custom traversal for most control (i.e. not an iterator for now, need correctness, can easily create threaded iter from it if needed)
  • Integrate untracked and ignored files with rename tracking of index-worktree diffs
  • status in gix crate with index-worktree
  • diff index with index to learn what we would want to do in the worktree, or alternatively,
    diff tree with index (with reverse-diff functionality to simulate diff of index with tree), for better performance as it
    would avoid having to allocate a whole index even though we are only interested in a diff.
    • Must include rename tracking.
  • how to make diff results available from status with all transformations applied, to allow user to obtain diffs of any kind?

Next PR: Reset

  • reset() that checks whether a worktree modification is allowed, or whether an entry should be skipped. That way we can postpone safety checks like --hard

Postponed

What follows is important for resets, but won't be needed for cargo worktree resets.

  • a way to expand sparse dirs (but figure out if this is truly always necessary) - probably not, unless sparse dirs can be empty, but even then no expansion is needed
    • wire it up in gix index entries to optionally expand sparse entries
  • gix status with actual submodule support - needs status in gix (crate) effectively
  • gix status with actual conflict support

Research

  • Ignored files are considered expendable and can be overwritten on reset
  • How to integrate submodules - probably easy to answer once gix status can deal a little better with submodules. Even though in this case a lot of submodule-related information is needed for a complete reset, probably only doable by a higher-level caller which orchestrates it.
  • How to deal with various modes like merge and keep? How to control refresh? Maybe partial (only the files we touch), and full, to also update the files we don't touch as part of status? Maybe it's part of status if that is run before.
  • It's worthwhile to make explicit the difference between git reset and git checkout in terms of HEAD modifications, with the former changing HEAD's referent and the latter changing HEAD itself.
  • figure out how this relates to the current checkout() method as technically that's a reset --hard with optional overwrite check. Could it be rolled into one, with pathspec support added?
    • just keep them separate until it's clear that reset() performs just as well, which is unlikely as there is more overhead. But maybe it's not worth maintaining two versions because of that. If so, one should probably rename it.
  • for git status: what about rename tracking? It's available for tree-diffs and quite complex on its own. Probably only needs HEAD-vs-index rename tracking. No, also can have worktree rename tracking, even though it's hard to imagine how this can be fast unless it's tightly integrated with untracked-files handling. This screams for a generalization of the tracking code though as the testing and implementation is complex, but should be generalisable.
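
The distinction between git reset and git checkout noted in the research list can be sketched with a toy model. This assumes a repository reduced to a `HEAD` value and a map of branch names to commit ids; `Repo`, `Head`, `reset_to` and `checkout_branch` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// HEAD either points at a branch by name, or directly at a commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Head {
    Symbolic(String),
    Detached(String),
}

/// A toy repository: HEAD plus branch name -> commit id.
pub struct Repo {
    pub head: Head,
    pub refs: HashMap<String, String>,
}

impl Repo {
    /// Reset-like: HEAD keeps pointing at the same branch, but that branch moves.
    pub fn reset_to(&mut self, commit: &str) {
        match &mut self.head {
            Head::Symbolic(branch) => {
                self.refs.insert(branch.clone(), commit.to_string());
            }
            // With a detached HEAD there is no referent to move, so HEAD itself changes.
            Head::Detached(id) => *id = commit.to_string(),
        }
    }

    /// Checkout-like: HEAD itself is pointed at another branch, no branch moves.
    /// Returns false if the branch does not exist.
    pub fn checkout_branch(&mut self, branch: &str) -> bool {
        if self.refs.contains_key(branch) {
            self.head = Head::Symbolic(branch.to_string());
            true
        } else {
            false
        }
    }
}
```

In this model, `reset_to` on a symbolic HEAD leaves `head` untouched and only rewrites an entry in `refs`, while `checkout_branch` does the opposite.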

@Byron Byron mentioned this pull request Dec 6, 2023
9 tasks
@Byron Byron force-pushed the gix-status branch 5 times, most recently from 1029830 to f4875f3 Compare December 12, 2023 09:18
Byron referenced this pull request in abathur/lilgit Dec 12, 2023
@Byron Byron force-pushed the gix-status branch 3 times, most recently from 74fcbe3 to ff776eb Compare December 22, 2023 06:28
@Byron Byron force-pushed the gix-status branch 2 times, most recently from eecaea6 to 342858b Compare December 30, 2023 07:45
This is needed to be able to refer from an entry on disk to the index,
and figure out if the index already has such entry.

New methods are:

* File::entry_by_path_icase
* File::prefixed_entry_range_icase
* File::entry_by_path_and_stage_icase
* File::directory_kind_by_path_icase
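
To illustrate the idea behind a case-insensitive lookup, here is a minimal sketch that assumes a toy index made of sorted paths plus a case-folded lookup table. `ToyIndex` is hypothetical; its method names only mirror the ones listed above and say nothing about the real signatures or implementation in gix-index. Using `to_lowercase()` for folding is a simplification as well.

```rust
use std::collections::HashMap;

/// A toy index: paths sorted for exact lookup, plus a case-folded table.
pub struct ToyIndex {
    paths: Vec<String>,
    icase: HashMap<String, usize>,
}

impl ToyIndex {
    pub fn new(mut paths: Vec<String>) -> Self {
        paths.sort();
        // Map the folded path to its position. If two paths differ only
        // by case, the later one wins in this sketch.
        let icase = paths
            .iter()
            .enumerate()
            .map(|(idx, path)| (path.to_lowercase(), idx))
            .collect();
        ToyIndex { paths, icase }
    }

    /// Exact, case-sensitive lookup using binary search over the sorted paths.
    pub fn entry_by_path(&self, path: &str) -> Option<&str> {
        self.paths
            .binary_search_by(|candidate| candidate.as_str().cmp(path))
            .ok()
            .map(|idx| self.paths[idx].as_str())
    }

    /// Case-insensitive lookup, returning the path as it is stored in the index.
    pub fn entry_by_path_icase(&self, path: &str) -> Option<&str> {
        self.icase
            .get(&path.to_lowercase())
            .map(|&idx| self.paths[idx].as_str())
    }
}
```

On a case-insensitive file system, a directory walk may report `readme.md` for a file the index tracks as `README.md`. The folded table lets the walk map the on-disk name back to the tracked entry instead of reporting it as untracked.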
*Precious* files are ignored files that are not expendable.
By default, all ignored files are expendable, but now it's possible to
declare ignored files as precious, meaning they will not be removed
just like untracked files.

See [the technical document][1] for details.

[1]: newren/git@0e6e3a6?diff=unified&w=0
That way one can tell whether the excluded item is precious or not.
Its main purpose is to hold the traversal algorithm which is used
to determine which files are tracked, untracked, ignored or precious.

It's also automatically ignoring anything called '.git'.
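
The classification performed during traversal can be sketched as follows. This assumes exclude rules reduced to exact file-name matches tagged as expendable or precious, and a set of tracked paths standing in for the index. `EntryStatus`, `ExcludeKind` and `classify` are hypothetical names, and real ignore patterns are considerably richer than this.

```rust
use std::collections::BTreeSet;

/// The four classes a path can fall into.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryStatus {
    Tracked,
    Untracked,
    Ignored,
    Precious,
}

/// Whether an exclude rule marks its matches as expendable or precious.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExcludeKind {
    Expendable,
    Precious,
}

/// Classify a slash-separated path, or return `None` if it should be skipped entirely.
pub fn classify(
    path: &str,
    tracked: &BTreeSet<&str>,
    excludes: &[(&str, ExcludeKind)],
) -> Option<EntryStatus> {
    // Anything called `.git` is skipped, no matter where it appears.
    if path.split('/').any(|component| component == ".git") {
        return None;
    }
    if tracked.contains(path) {
        return Some(EntryStatus::Tracked);
    }
    let name = path.rsplit('/').next().unwrap_or(path);
    // As with ignore files, the last matching rule wins.
    for (pattern, kind) in excludes.iter().rev() {
        if *pattern == name {
            return Some(match kind {
                ExcludeKind::Expendable => EntryStatus::Ignored,
                ExcludeKind::Precious => EntryStatus::Precious,
            });
        }
    }
    Some(EntryStatus::Untracked)
}
```

A caller that wants to remove or overwrite files would treat `Precious` like `Untracked`, that is, as something not to be touched, while `Ignored` entries remain expendable.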
@Byron Byron merged commit c3983c6 into main Dec 30, 2023
18 checks passed
@Byron Byron mentioned this pull request Dec 30, 2023
16 tasks