Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locked index error #62

Open
epage opened this issue Aug 27, 2021 · 12 comments
Open

Locked index error #62

epage opened this issue Aug 27, 2021 · 12 comments

Comments

@epage
Copy link
Contributor

epage commented Aug 27, 2021

I was swapping out git fetch with crates-index in cargo-edit and got test failures due to locked index.

While its suspicious this was being updating during tests, it highlights that callers need to handle concurrent access. This is not obvious from the API.

@epage
Copy link
Contributor Author

epage commented Aug 27, 2021

Range of fixes

  • Offer a retry (how will this interact with UI though?)
  • Document it in the API
  • Provide an easier way to check for index errors

epage added a commit to epage/cargo-edit that referenced this issue Aug 27, 2021
This was inspired by running into errors where `cargo add inquire` was
not adding the latest version.  We've seen similar problems during the
development of `crates-index`, so hoping what they've done fixes this.

Even if it doesn't solve this problem, hopefully by nature of using a
shared crate means it will get more use and more eyes to make it more
reliable in the long term.

Handling retrying the index is annoying.  I opened
frewsxcv/rust-crates-index#62

Note: in the next version of crates-index
- Opening an index will implicitly create it, so we won't be able to
  differentiate in our message whether we are initializing or updating.
  See frewsxcv/rust-crates-index#61
- We will be able to open an Index by URL, which will allow us to remove
  more logic
epage added a commit to epage/cargo-edit that referenced this issue Aug 30, 2021
This was inspired by running into errors where `cargo add inquire` was
not adding the latest version.  We've seen similar problems during the
development of `crates-index`, so hoping what they've done fixes this.

Even if it doesn't solve this problem, hopefully by nature of using a
shared crate means it will get more use and more eyes to make it more
reliable in the long term.

Handling retrying the index is annoying.  I opened
frewsxcv/rust-crates-index#62

Note: in the next version of crates-index
- Opening an index will implicitly create it, so we won't be able to
  differentiate in our message whether we are initializing or updating.
  See frewsxcv/rust-crates-index#61
- We will be able to open an Index by URL, which will allow us to remove
  more logic
epage added a commit to epage/cargo-edit that referenced this issue Aug 31, 2021
This was inspired by running into errors where `cargo add inquire` was
not adding the latest version.  We've seen similar problems during the
development of `crates-index`, so hoping what they've done fixes this.

Even if it doesn't solve this problem, hopefully by nature of using a
shared crate means it will get more use and more eyes to make it more
reliable in the long term.

Handling retrying the index is annoying.  I opened
frewsxcv/rust-crates-index#62

Note: in the next version of crates-index
- Opening an index will implicitly create it, so we won't be able to
  differentiate in our message whether we are initializing or updating.
  See frewsxcv/rust-crates-index#61
- We will be able to open an Index by URL, which will allow us to remove
  more logic
ordian pushed a commit to killercup/cargo-edit that referenced this issue Aug 31, 2021
This was inspired by running into errors where `cargo add inquire` was
not adding the latest version.  We've seen similar problems during the
development of `crates-index`, so hoping what they've done fixes this.

Even if it doesn't solve this problem, hopefully by nature of using a
shared crate means it will get more use and more eyes to make it more
reliable in the long term.

Handling retrying the index is annoying.  I opened
frewsxcv/rust-crates-index#62

Note: in the next version of crates-index
- Opening an index will implicitly create it, so we won't be able to
  differentiate in our message whether we are initializing or updating.
  See frewsxcv/rust-crates-index#61
- We will be able to open an Index by URL, which will allow us to remove
  more logic
@savente93
Copy link

savente93 commented Jan 29, 2022

just adding this for future context. I did a bit of digging and i think this could arise from the fact that the underlying git2-rs crate uses unsafe code to do the opening. (source: https://github.com/rust-lang/git2-rs/blob/c55bd6dbdba52f90788150180ef124ef6c90daa6/src/repo.rs#L145-L154) I am not familiar at all with unsafe code, so I"m not sure if the handling of this issue would change because of it, but I think it was a useful bit of context to add.

I have no idea if this would work but perhaps this could also partially be solved by creating some kind of Mutex or RwLock around the repo in the Index to provide more access control once we manage to open the index? It's a bit hard for me to judge whether this would be a good solution without more information about the error mentioned. It's just an idea, would love to hear your thoughts

@kornelski
Copy link
Collaborator

I assume the lock is on the filesystem level. I could add a Mutex to stop multiple instances of crates-index from using the same repo at the same time, but that would work only within a process. Is that useful? Apart from tests, it seems unusual. Is there a risk of deadlocks?

@savente93
Copy link

Tried to look into this a bit. Again as I said this is a bit beyond me but here is what I understand of it:

Firstly, about the file locking. What I understand is that there are two kinds of file locking. Advisory and Mandatory. While an advisory lock is acquired, other processes are still free to read or write the file if they are unaware of locks. However, every bit of documentation I found on Mandatory locking advises to not use it as it has to be enforced by the kernel, so I'm going to guess that any file locking done in these contexts is advisory. Getting a definite answer on this would mean digging into the actual C implementation of the underlying git library and I am not really comfortable doing that, so unless someone more knowledgable than me can comment on that I'm going to run with this idea. However, from what I have seen, file locks are assumed to be process specific anyway, so if we can guarantee consistence within out crate-index process that should fix the problem.

About it occurring only in testing, that is hard for me to judge accurately since I didn't actually see the errors myself, but I did find this bug referenced in a bunch of downstream libraries such as rustsec and cargo-edit so I do think it would be worth fixing.

Finally about the deadlocking. Mutex I'm not super familiar with since I usually tend to work with RwLock which does mention that it can become poisoned but only if a thread panics while having acquired a write lock, and for as far as I understand we only try to acquire read locks? So I'm thinking this should be okay, but again, it's hard for me to say without knowing more about how the underlying C implementation works. Does that help at all?

@Byron Byron mentioned this issue Jul 18, 2023
9 tasks
@Byron
Copy link
Collaborator

Byron commented Jul 29, 2023

I think this should be fixed by now and I'd appreciated if the issue could be re-evaluated against v2.0. Thank you.

@epage
Copy link
Contributor Author

epage commented Jul 31, 2023

I validated the previously parallel clones would fail due to a lock that git2 creates, and now we use gix file locks to protect against concurrent access and block instead. The test-suite also makes use of this capability for better performance.

So gix blocks instead of errors, right?

Also, it sounds like these locks are library specific, so a mixture of gix and git2 code can cause concurrent access failures? How would a new cargo-edit working alongside a git2-based git-index cargo behave?

@Byron
Copy link
Collaborator

Byron commented Aug 1, 2023

So gix blocks instead of errors, right?

No, gix doesn't, but this library now uses locks to protect itself from… itself. I might have geared that specifically to make the test-suite run out of the box, even though one might rightfully argue that this is an issue specific to the test-suite.

As you point out, the solution isn't universal, and can't ever be universal, so probably crates-index should rather fail and leave all protections to the caller.

If this assessment is correct, two actions could follow:

  • I roll back the blocking locking when opening the index and rather lock when running the test-suite
  • This issue is closed as not actionable, as a library can't reasonably protect itself against all forms of concurrency issues

What do you think?

@epage
Copy link
Contributor Author

epage commented Aug 1, 2023

My priorities are

  • crates-index does not lead to data corruption for files on disk
    • This includes correctly interoperating with cargo without data corruption
  • there is a clear path for callers to work with this (e.g. in the old code, documenting the lock retry behavior and ideally making it easier)

For myself, I care less about locking for maintaining a consistent view of the data

@Byron
Copy link
Collaborator

Byron commented Aug 2, 2023

Thank you!

I went with option 2: "Document it in the API" which now makes clear that opening an index isn't safe in concurrent contexts due to the implicit auto-cloning.
This also made the addition of GitIndex::try_new…() variants necessary which are truly read-only, and now provide more options to deal with concurrency.

These improvements will be released within the hour.

Further, I think this issue is resolved with v2.1 of crates-index, as with today's cargo it's even less likely to have issues with concurrent clones due to the sparse index being the default now.

@Byron Byron closed this as completed Aug 2, 2023
@epage
Copy link
Contributor Author

epage commented Aug 2, 2023

I went with option 2: "Document it in the API" which now makes clear that opening an index isn't safe in concurrent contexts due to the implicit auto-cloning.

The "document in the API" solution was specifically for documenting the expectation for retrying and how to do so.

Yes, concurrent access is less likely to happen but I don't think this is something that should be punted to the caller. We have multiple process that could potentially be accessing a shared git repo. Putting the burden for avoiding corrupting the files on disk onto the user seems like a recipe for disaster as most won't notice and everyone will do it differently, meaning there will effectively be no protection.

Whether we get implicit blocking or a documented error for "try again", we should be safe from corrupting data.

@Byron
Copy link
Collaborator

Byron commented Aug 2, 2023

I see!

I just tried to see what kind of errors I see if there is a race just across crates-index itself, and it's two different ones already:

---- git::with_https::reads_replaced_source stdout ----
thread 'git::with_https::reads_replaced_source' panicked at 'clone works and there is no racing: Git(PrepareClone(Init(Init(IoOpen { source: Os { code: 17, kind: AlreadyExists, message: "File exists" }, path: "tests/fixtures/git-registry" }))))', tests/git/mod.rs:192:50

---- git::with_https::changes stdout ----
thread 'git::with_https::changes' panicked at 'clone works and there is no racing: Git(PrepareClone(Init(Init(IoOpen { source: Os { code: 17, kind: AlreadyExists, message: "File exists" }, path: "tests/fixtures/git-registry" }))))', tests/git/mod.rs:192:50

---- git::with_https::opens_bare_index_and_can_update_it stdout ----
thread 'git::with_https::opens_bare_index_and_can_update_it' panicked at 'clone works and there is no racing: Git(PrepareClone(Init(Init(IoOpen { source: Os { code: 17, kind: AlreadyExists, message: "File exists" }, path: "tests/fixtures/git-registry" }))))', tests/git/mod.rs:192:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- git::with_https::crates stdout ----
thread 'git::with_https::crates' panicked at 'clone works and there is no racing: Git(PrepareClone(Init(Init(IoOpen { source: Os { code: 17, kind: AlreadyExists, message: "File exists" }, path: "tests/fixtures/git-registry" }))))', tests/git/mod.rs:192:50

---- git::with_https::can_update_index_explicitly stdout ----
thread 'git::with_https::can_update_index_explicitly' panicked at 'clone works and there is no racing: MissingHead { refs_tried: ["FETCH_HEAD", "origin/HEAD", "origin/master"], refs_available: [], repo_path: "tests/fixtures/git-registry" }', tests/git/mod.rs:192:50

cargo solves this by having a gix-maintained error that knows which of each possible variants means what to determine if the issue is spurious or there is something corrupt (even if temporarily), and this crate could do the same. That would allow the caller to get a good idea if retries should be done. It's still shifting the burden, but it's far more doable with an API.

@Byron Byron reopened this Aug 2, 2023
@Byron Byron removed the question label Aug 2, 2023
@epage
Copy link
Contributor Author

epage commented Aug 2, 2023

That would work for me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants