Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitIndex fails to update after multiple successful GitIndex::update calls #181

Open
paolobarbolini opened this issue Jan 6, 2025 · 4 comments

Comments

@paolobarbolini
Copy link
Contributor

The following code will fail after running for for less than 1 hour. It either fails with thread 'main' panicked at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/gix-odb-0.66.0/src/store_impls/dynamic/load_index.rs:374:13: if the generation changed, the slot index must have changed for sure or it fails with "gix" crate failed. If problems persist, consider deleting `~/.cargo/registry/index/github.com-1ecc6299db9ec823/`: Object de01e95cf873164afae15e080903b6a7b22d6329 as referred to by "refs/remotes/origin/master" could not be found:

use std::{thread, time::Duration};

use crates_index::GitIndex;

fn main() {
    let mut index = GitIndex::new_cargo_default().unwrap();

    loop {
        let _ = dbg!(index.update());

        thread::sleep(Duration::from_secs(20));
    }
}

I've created a repro repo at https://github.com/paolobarbolini/crates-index-update-repro, with https://github.com/paolobarbolini/crates-index-update-repro/tree/96bdcac26902a88c114f7b3326d1e4643a3335ea even running on a patched gitoxide version that more clearly shows the problem paolobarbolini/gitoxide@main...paolobarbolini:gitoxide:try-to-fix-depsrs. You can find the output of the run until the failure at the end in out.patched.log

deps.rs has long had this problem and we never found the time to debug it, but running a few tests locally clearly shows the issue with some patience for the symptoms to show themselves.

@Byron
Copy link
Collaborator

Byron commented Jan 7, 2025

Thanks a lot for reporting, and for debugging this!

This… is my nightmare bug, as it shows that an assertion error in highly parallel code which is outside of the support of the Rust compiler due to it.
Further, we see that an object can't be found which might be a side-effect of a race condition which prevents additional packs from being loaded and/or discovered by all threads. It might be that if it doesn't run into the assertion error anymore that everything else is magically fixed, of course.

As for the reproduction, do you think this can be adjusted to something like the following to show the issue more quickly?

  • initial state: two repos, source and clone, which was cloned from source

Then

  • create a new commit in source with the same empty tree, so this literally adds a single object to the object database.
  • fetch the new object from source - this create a new pack
  • access the newly fetched object probably by means of the remote tracking branch
  • repeat

This should simulate what rust-crates-index is doing well enough.

Something I may note here is that it's probably not even a multi-threading issue, but maybe an issue with too many small packs not being handled correctly? After all, we only have a single thread here so there is no contention.

Thus I'd hope that the issue is very fixable and mostly related to having a long-running instance of a Repository that has to deal with an ever-growing set of small packs.

I will pick this up as soon as possible, it's my top-priority now after concluding getting gix status into starship.

@Byron
Copy link
Collaborator

Byron commented Jan 9, 2025

I did reproduce the issue with objects not being found 🎉 (GitoxideLabs/gitoxide#1750) and I am pretty sure I know what that is.
The other error, the more dangerous one, is still elusive though.

@Byron
Copy link
Collaborator

Byron commented Jan 13, 2025

I believe to have improved the situation by fixing the underlying bug to allow it to be able to properly bail out to indicate it doesn't have enough slots to load the additional packs.
This is a distinct error so the update() implementation could check for it, and re-initialize its repository if that happens. This way, it gets a slot-map that is twice as large as it needs to be to hold the current amount of packs.

This comes at the cost of probably receiving the same or similar pack again as the local tracking refs won't have been updated, but at least it should be able to recover.
Eventually one will have to run git gc though as it's quite likely that it won't be able to load all the needed packs to perform the desired operations as it may genuinely run out of file handles.

I will try to implement this recovery eventually.

@paolobarbolini
Copy link
Contributor Author

Thank you for getting this out so quickly. I'll try this out and see how it behaves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants